[Editor’s note: This post is part 3 of a series of posts discussing Log File Management. For more on this topic, be sure to read Tyler’s other posts.]
The only difference between this entry and the previous one (“Log File Formatting, Naming, and Compression – Single-server Environment”) is the number of servers in the Web hosting environment. When an organization uses more than one server to host a single Web site, log file management becomes a bit more complex, and so does Web analytics.
All the principles behind log formatting and compression remain the same. The difference lies in how the files are named and stored. More specifically, a folder structure adds a layer of complexity. In a single-server setup, the log files can be stored with an appropriate name in a single folder. When more than one server is involved, each server's logs must use the same file names and be stored in a separate folder per server.
For example, if we have a Web site that is load balanced across two servers, we could store the logs on a networked drive as follows:
Server 1 Log:
Server 2 Log:
When running a log file analyzer against a multi-server site, it is common to require the same file name for each day's log on every server. The log analyzer uses the name of each log, along with the data inside it, to ‘stitch’ the data together. When the Server 1 and Server 2 log files follow the same naming convention, the analyzer can parse the logs alphabetically, which is also chronological order by date when a proper naming convention is used. It is important to understand that if the names are not exactly the same for each server's log file, the import process in an analytics tool may simply import all the logs chronologically from one server, then from the next. If this happens, the analytics engine cannot ‘stitch’ together a visit that includes pages served by both servers, so it counts that visit more than once. The result is a higher, inaccurate ‘visit’ count. Page views remain the same, however, because a page view is a page view regardless of which server recorded it.
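To make the inflated visit count concrete, here is a minimal sketch of a streaming visit counter. It is not how any particular analytics product works. The 30-minute timeout, the visitor ID "A", the timestamps, and the function names are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed inactivity timeout; real analyzers let you configure this.
TIMEOUT = timedelta(minutes=30)

def count_visits(hits):
    """Count visits in the order the hits are fed in.

    A hit starts a new visit when the visitor is new or when the time
    gap to that visitor's previous hit exceeds TIMEOUT.
    """
    last_seen = {}
    visits = 0
    for visitor, ts in hits:
        prev = last_seen.get(visitor)
        if prev is None or abs(ts - prev) > TIMEOUT:
            visits += 1
        last_seen[visitor] = ts
    return visits

def t(text):
    """Parse a hypothetical timestamp such as '2024-01-01 10:00'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")

# One visitor whose requests were load balanced across two servers.
server1 = [("A", t("2024-01-01 10:00")),
           ("A", t("2024-01-01 10:10")),
           ("A", t("2024-01-02 09:00"))]
server2 = [("A", t("2024-01-01 10:05")),
           ("A", t("2024-01-02 09:05"))]

# Import everything from server 1, then everything from server 2.
one_server_at_a_time = server1 + server2
# Import both servers' hits interleaved in time order.
interleaved = sorted(server1 + server2, key=lambda hit: hit[1])

print("page views:", len(one_server_at_a_time), len(interleaved))
print("visits, one server at a time:", count_visits(one_server_at_a_time))
print("visits, interleaved:", count_visits(interleaved))
```

In this toy data the page view count is 5 either way, but the visit count doubles from 2 to 4 when the servers are imported one after the other.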
Clearly, two log files with the same name cannot be stored in the same folder, hence the need for the folder structure!
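As a rough illustration of why identical names in separate folders help, the sketch below orders log paths by file name first and folder second, so each day's files from every server sit next to each other. The folder names (`server1`, `server2`) and the `access-YYYY-MM-DD.log` naming pattern are hypothetical, chosen only for this example.

```python
from pathlib import PurePosixPath

def import_order(paths):
    """Sort log paths by file name, then by folder.

    With a date-based naming convention, sorting by file name is also
    chronological, and each day's logs from all servers end up adjacent.
    """
    def key(path):
        p = PurePosixPath(path)
        return (p.name, str(p.parent))
    return sorted(paths, key=key)

# Hypothetical layout: one folder per server, same file name per day.
logs = [
    "logs/server1/access-2024-01-01.log",
    "logs/server1/access-2024-01-02.log",
    "logs/server2/access-2024-01-01.log",
    "logs/server2/access-2024-01-02.log",
]

for path in import_order(logs):
    print(path)
```

A plain alphabetical sort of the full paths would instead list every server1 file before any server2 file, which is the one-server-then-the-next import order that breaks visit stitching.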
In conclusion, if your organization is growing and develops a need for load balancing, keep this in mind, because analytics is not always treated as a priority when a Web site's technology expands. It's not complicated, and when done from day one it can save a lot of lost data and headaches in the future.