[Editor’s note: This post is part 4 of a series of posts discussing Log File Management. For more on this topic, be sure to read Tyler’s other posts.]
It’s not uncommon for a large Web site to generate a few hundred megabytes of log files per day. Compressed with a good archiving program (we use gzip), the files can be reduced to somewhere between 30 and 50 megabytes. All in all, a site getting around one million visits per month can store one full year of log files in less than 20 gigabytes of hard drive space.
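To make the arithmetic behind that estimate explicit, here is a minimal Python sketch. It assumes the 30 to 50 megabytes per day of compressed logs quoted above and uses decimal gigabytes; the function name `yearly_storage_gb` is just an illustration, so plug in your own site's numbers.

```python
DAYS_PER_YEAR = 365


def yearly_storage_gb(compressed_mb_per_day: float, days: int = DAYS_PER_YEAR) -> float:
    """Return the disk space, in gigabytes, needed for one year of compressed logs."""
    # Decimal units are assumed here: 1 GB = 1000 MB.
    return compressed_mb_per_day * days / 1000


# Daily compressed sizes taken from the range quoted in the post.
low = yearly_storage_gb(30)
high = yearly_storage_gb(50)

print(f"One year of logs:   {low:.1f} to {high:.1f} GB")
print(f"Five years of logs: {low * 5:.1f} to {high * 5:.1f} GB")
```

With these inputs, one year works out to roughly 11 to 18 gigabytes, which is where the "less than 20 gigabytes" figure comes from.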
If you consider the amount of knowledge and evidence-based data that can be extracted from logs, keeping a few years of historical logs on a few dozen gigabytes of disk space is well worth it. In today’s fast-paced computer market, terabytes of disk space can be purchased for these purposes for hundreds of dollars, not thousands.
Just as the Canada Revenue Agency expects individuals and businesses to keep documents for tax purposes for at least 7 years, I would recommend a similar rule of thumb for logs. Web sites evolve drastically every few years; therefore, keeping logs for 10 years may not be worth it. However, benchmarking performance, growth, and improvements over the last 5 years is. Without the logs, this would not be possible. Sure, you may have reports for each month over the last several years, but once you decide to create one larger report with the same filter set for consistency, it’s virtually impossible without all the logs.
Now, let’s consider this from an individual’s role within the organization. If you are in a marketing role and your boss or the president of your company asks you for data comparing last year’s traffic for a sub-section of your site to the current year’s, could you provide it? Maybe. If you are in an Information Technology role and someone from the business side asks you for some data, could you provide it? Maybe. Regardless of your position, if you establish a process with your IT team to warehouse logs for 5 years, these hypothetical situations, which are not all that hypothetical in today’s tough economic times, would play out differently. The answer would simply be ‘yes’. Even if you don’t have the knowledge or expertise to answer the questions being asked, with the logs you’d have the option to outsource the work to someone who does. Without the logs, you’re left in a tough situation and will likely be unable to answer the request.
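For the IT side of that process, here is a rough Python sketch of what a five-year warehousing job might look like: compress each raw log with gzip, then delete archives that have aged out of the retention window. The file naming scheme (`access-YYYY-MM-DD.log`), the archive directory, and the function names are all assumptions made for illustration, not a description of any particular server setup.

```python
import gzip
import shutil
from datetime import date, timedelta
from pathlib import Path
from typing import List

RETENTION_YEARS = 5  # the retention window recommended in this post


def compress_log(log_path: Path, archive_dir: Path) -> Path:
    """Gzip one raw log file into archive_dir and return the archive path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / (log_path.name + ".gz")
    with open(log_path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def is_expired(log_date: date, today: date, years: int = RETENTION_YEARS) -> bool:
    """Return True when a log dated log_date falls outside the retention window."""
    cutoff = today - timedelta(days=365 * years)  # approximate; ignores leap days
    return log_date < cutoff


def prune_archives(archive_dir: Path, today: date) -> List[Path]:
    """Delete expired archives named like access-YYYY-MM-DD.log.gz (assumed scheme)."""
    removed = []
    for archive in sorted(archive_dir.glob("access-*.log.gz")):
        stamp = archive.name[len("access-"):-len(".log.gz")]
        try:
            log_date = date.fromisoformat(stamp)
        except ValueError:
            continue  # skip files that do not match the assumed naming scheme
        if is_expired(log_date, today):
            archive.unlink()
            removed.append(archive)
    return removed
```

A job like this would typically run once a day, calling `compress_log` on the previous day's log and then `prune_archives(archive_dir, date.today())`. Note that `compress_log` leaves the original file in place; whether to delete the raw log after compressing it is a decision for your IT team.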
It’s simple: keep five running years of logs and you’ll always be able to answer questions regarding traffic to your Web site!
Director, Products and Operations