This post is intended for two groups:
- the public at large who fear cookies to be invasive threats to our privacy; and
- the web analysts and marketers who fear tightening of privacy rules, cookie use and visitor tracking will make their jobs tougher and some tasks impossible.
Those who read my post "Tracking Website Visitors is democratic" know that I think tracking is a good thing for all parties, but my view of cookies is that some should be outlawed.
But first the calm before the rant: Here’s a quick, non-geek cookie primer for regular surfers and activists alike.
First, Cookies Rules
Many think that cookies are:
- small text files stored by our browsers on our hard drives;
- harmful or malicious;
- containers of personally identifiable information;
- read by our browsers and sent “somewhere”, possibly to every site we visit.
Reality is somewhat different:
- “small” does not by itself mean innocuous, but cookies are inherently “good” since they themselves can do no harm. In fact, unlike viruses, they cannot do anything;
- the text they store could be any information, including the most personally identifying. However, if it is personal information it had to have been previously entered by the user into a browser preferences dialog or via a web page;
- typically, they store unique, randomly generated identifiers, most often referring to a browsing session or anonymously to the unique combination of user + browser + machine + web site;
- web servers cannot set cookies on our machines; they can only request our browsers to set cookies;
- browsers are limited in what they can do with cookies.
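For the curious, the "request" in the list above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the cookie name visitor_id and both function names are made up, and a real browser applies many more rules than this. The point it shows is that the server merely sends a Set-Cookie line, and the browser decides whether to store anything.

```python
import secrets
from http.cookies import SimpleCookie


def server_response_header() -> str:
    """Server side: build a Set-Cookie line carrying a random, anonymous ID."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = secrets.token_hex(8)  # hypothetical cookie name
    return cookie.output()  # a single "Set-Cookie: visitor_id=..." line


def browser_handles(header_line: str, accept_cookies: bool) -> dict:
    """Browser side: honour the request only if the user allows cookies."""
    if not accept_cookies:
        return {}  # the request is simply ignored
    _, _, value = header_line.partition(": ")
    parsed = SimpleCookie()
    parsed.load(value)
    return {name: morsel.value for name, morsel in parsed.items()}
```

Note that the identifier is random: nothing in it says who you are.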
1st party cookies and 3rd party cookies at the same party
A browser can only send cookies back to the domain for which the browser set them. So wikipedia.org cannot get cookies set for vkistudios.com.
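As a rough sketch of that rule, here is a toy cookie jar in Python. The class and method names are hypothetical, and real browsers also check paths, the Secure flag and a list of public suffixes, all of which I am leaving out.

```python
class TinyCookieJar:
    """A toy browser cookie jar that only matches on domain."""

    def __init__(self):
        self._cookies = {}  # domain -> {cookie name: value}

    def set_cookie(self, domain: str, name: str, value: str) -> None:
        self._cookies.setdefault(domain.lower(), {})[name] = value

    def cookies_for(self, host: str) -> dict:
        """Return only the cookies set for this host's own domain."""
        host = host.lower()
        matched = {}
        for domain, cookies in self._cookies.items():
            # Send a cookie only to its own domain or a subdomain of it.
            if host == domain or host.endswith("." + domain):
                matched.update(cookies)
        return matched
```

With a cookie stored for vkistudios.com, asking the jar what to send to wikipedia.org returns nothing at all.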
A web page is served to our browser from a single domain (e.g., the page http://en.wikipedia.org/wiki/Magic_cookie is served from wikipedia.org).
However, not everything on a web page comes only from a single domain. Banners, images and other content on a page can be served from domains other than the web page’s domain. We can refer to the web page’s domain as the 1st party domain and those other domains as 3rd party domains. When our browser requests an image from a domain, the image can be returned together with a request to set a cookie from that domain. Cookies set for 1st party or 3rd party domains are called 1st party cookies and 3rd party cookies, respectively.
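The distinction can be expressed as a small Python check. It is a simplification: I am assuming that the last two labels of a host name identify the site, which is wrong for domains like example.co.uk, and the image and banner URLs used with it are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Crude site name: the last two labels of the host (an assumption)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def cookie_party(page_url: str, resource_url: str) -> str:
    """Classify a cookie set while fetching resource_url for page_url."""
    if site_of(page_url) == site_of(resource_url):
        return "1st party"
    return "3rd party"
```

For the Wikipedia page above, an image from a hypothetical images.wikipedia.org would count as 1st party, while a banner from a hypothetical ads.example.com would count as 3rd party.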
Most sites set cookies for their own use and cannot function well without them. This is because every time you click a link, submit a form, etc., your browser connects to the web server as if for the first time. Once the request is dealt with, the connection is terminated. The server does not remember your visit from one request to the next.
However, actions like “logging in” are not possible without the server being reminded who you are.
To create that “logged in” state, the server asks your browser to set a cookie for the domain it is serving. Every time your browser requests anything from that domain, your cookie goes with it, reminding the server of your “logged in” session.
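Here is a minimal sketch of that reminder on the server side, again in Python. The in-memory SESSIONS table, the cookie name session_id and the function names are my own placeholders, and real sites add expiry, encryption in transit and much more.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session ID -> user name, kept on the server only


def log_in(username: str) -> str:
    """Create a session and return the cookie the browser is asked to keep."""
    session_id = secrets.token_urlsafe(16)
    SESSIONS[session_id] = username
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    return cookie["session_id"].OutputString()  # "session_id=<random>"


def who_is_asking(cookie_header: str):
    """Look up the user from the Cookie header sent with a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return SESSIONS.get(morsel.value) if morsel else None
```

The cookie itself holds only a random ID. The link between that ID and your login lives on the server, and a request without a known ID is treated as coming from a stranger.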
I’m not suggesting that cookies are the only way to achieve this. The server and browser could pass a “reminder ID” back and forth (and many sites work that way) but that will not allow you to return later and have it remember your login details. When you log in on a subsequent visit, you are telling the server who you are (to the extent that a “throw-away” or even a genuine email address can) so storing a cookie that does the same thing does not compromise privacy any further and is more secure.
The Good, the Bad and the Ugly
Something that cannot do anything cannot be harmful.
Within the above limitations the over-zealous, “to-serve-you-better” folks will encroach on our privacy. The malicious will go even further and serve themselves superbly. However, the latter scenario is primarily a security violation requiring security solutions, rather than a privacy issue as is, say, theft of a passport.
Cookies are, however, also used for less vital purposes. Typical uses include identifying a single sitting (aka a “visit”) and, anonymously, a visitor. That enables tracking which pages were visited together in a single visit and which in multiple visits by the same visitor, but without knowing who that visitor is.
If I know that 100 people visited this page and, in their same single sitting, 50 of them visited another post I wrote, it implies that the content of the first post might be helpful enough to cause visitors to read another. The implications are different if those 150 page views occurred during 150 visits (each page view is by a different visitor, so visitors are reading only one post and leaving) or occurred during 100 visits by 30 visitors. (Other metrics like average time on page are also relevant, but that’s a different post.)
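To make that arithmetic concrete, here is a sketch in Python of how such counts fall out of anonymous IDs. The record layout (visitor ID, visit ID, page) and the page names are assumptions of mine, not the format of any particular analytics tool.

```python
def summarize(hits):
    """Count page views, visits and visitors from (visitor, visit, page) hits."""
    return {
        "page_views": len(hits),
        "visits": len({visit for _, visit, _ in hits}),
        "visitors": len({visitor for visitor, _, _ in hits}),
    }


def visits_containing(hits, page):
    """Return the set of visit IDs in which the given page was viewed."""
    return {visit for _, visit, viewed in hits if viewed == page}


# The first scenario above: 100 single-sitting visits to this post,
# 50 of which also include another post.
hits = []
for i in range(100):
    hits.append((f"visitor{i}", f"visit{i}", "this-post"))
    if i < 50:
        hits.append((f"visitor{i}", f"visit{i}", "other-post"))

both = visits_containing(hits, "this-post") & visits_containing(hits, "other-post")
print(summarize(hits))  # {'page_views': 150, 'visits': 100, 'visitors': 100}
print(len(both))        # 50
```

Nothing in these records says who a visitor is. The IDs only say which page views belong together.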
Such cookies do not tell me who visited, and I have neither the time nor the need to know. Nor do the millions of others poring over their web analytics data.
Those cookies enable you to tell the owners of the websites you visit what you think of them, in much the same way as ballots do. Tracking Website Visitors is democratic.
I submit that these cookies, too, are good. They may not be part of a website’s life support but are vital to its extended life.
It would be an interesting study to determine what percentage of sites that are not being measured with analytics tools are comatose, that is, have not been updated for more than, say, 6 months.
The Ugly – Good Cookies, bad ingredients
However, some analytics tools and websites that use 1st party cookies do identify visitors.
I will discuss identifying the identifying cookies in my next post on Bad Cookies.