[Editor’s note: This post introduces a series of posts discussing Log File Management. For more on this topic, be sure to read Tyler’s other posts listed below.]
In the world of technology today, changes take place so quickly that it is important to keep a record of every transaction that takes place; that is, to keep a ‘log’. This holds true for the Internet and Web sites as well. Of course, one of the top concerns for Web site owners is ‘how much traffic is the Web site getting?’ Currently there are two main methodologies for gathering statistics for Web sites: Log File Analysis and Page Tag Analysis. While each of these methods has its advantages and disadvantages, they both have one core aspect in common: log files.
Over the course of the next few months, I will be writing this blog to discuss some of the best practices I subscribe to with respect to Web Log File Management – Best Practices for Web Analytics. While the blog is not intended to discuss Web Analytics directly, the best practices will reflect optimization of Web log file gathering and storage for the purposes of Web Analytics. Much of the information I will present comes from over 5 years of experience working with log files for both my business and our clients.
With so many topics to cover, the core topics will be presented in a nine-part series:
1. Log Files are Intellectual Property!
The data contained in Web log files can be enormously valuable to an organization. While they are not intellectual property in the strictest sense, they should be treated as critical data and maintained accordingly.
2. Log File Formatting, Naming, and Compression – Single-server Environment
Log files can contain numerous fields of information; structure and sequence matter. Servers can be configured to customize the name of the log files as well as the time period each log file covers.
3. Log File Formatting, Naming, and Compression – Multi-server Environment
In a multi-server environment for large Web sites, multiple log files exist, and managing them adds further levels of complexity.
4. Storage is Not an Issue
Large files? Disk drives are cheap! Compressed log files for highly visited Web sites can consume a lot of disk space, but terabytes of storage are relatively cheap to buy and easy to maintain.
5. Remote Hosted Sites and ISP Policies
For Web sites hosted through an Internet Service Provider (ISP), the provider’s policies may affect your ability to get accurate Web Analytics data.
6. Retrieving Log Files for Remote Hosted Web Sites
A number of best practices exist when retrieving log files from ISPs to ensure data integrity and prevent loss of data.
7. Security – Log Files Need it Too
Because the naming convention and structure of log files are critical to accurate Web Analytics data, access controls must be in place to prevent log file modifications.
8. Backups and Hardware Redundancy
With all the potential data that can be gathered from log files, it is important to keep quality backups. Hardware redundancy on the web server (or log file storage server) is also a way to ensure no loss of log data.
9. Log Files are not JUST for Analytics
Surprisingly, log files are not only for Web Analytics. There are a variety of other useful ways to capitalize on log data.
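As a small illustration of why field sequence and file naming (topics 2 and 3) matter, here is a minimal Python sketch. It assumes log lines in the widely used Common Log Format; the `parse_clf_line` and `daily_log_name` helpers, the sample line, and the site_server_date naming scheme are all hypothetical and are shown only to make the idea concrete, not as a recommended standard.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
# The pattern depends on the fields appearing in exactly this order.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # A "-" in the size field means no bytes were returned.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


def daily_log_name(site, server, day):
    """Build a file name from site, server, and date (hypothetical convention)."""
    return f"{site}_{server}_{day:%Y%m%d}.log.gz"


# Made-up sample line using a documentation-reserved IP address.
sample = '192.0.2.10 - - [10/Oct/2008:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
record = parse_clf_line(sample)
print(record["host"], record["status"], record["size"])
print(daily_log_name("example", "web01", record["time"]))
```

If a server were reconfigured to write the fields in a different order, a parser like this would either reject the lines or misread them, which is one reason to settle on a format and a naming scheme early and keep them stable.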
With this introduction to the issue and the topics associated with it, I invite you to visit this blog regularly to learn more about Best Practices for Web Log File Management.