Tag Management: data layer/DOM-scraping pros & cons

Data Layer

Is the data layer the be-all and end-all of tag management? In this article, we explore the benefits and caveats of using a data layer versus DOM-scraping techniques.

As Justin Cutroni puts it “…a data layer is a JavaScript variable or object that holds all the information you want to collect in some other tool, like a web analytics tool. You add the data layer to every page on your site and then have the container pull data from the data layer… This insulates the data collection tools by separating the data from the page structure. No matter what happens to the structure of the HTML page the data layer will always be the same.”
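
To make this concrete, here is a minimal sketch of a data layer on a product page, using the array-of-pushes convention popularized by Google Tag Manager. The variable name `dataLayer` is GTM's default; the keys shown are illustrative, not part of any standard:

```javascript
// Declared in the page source, before the TMS container snippet loads.
// The keys below (pageCategory, visitorType, productId) are illustrative;
// your own specification defines the actual names and allowed values.
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
  pageCategory: 'product',
  pageName: 'Blue Widget - Product Detail',
  visitorType: 'returning',
  productId: 'SKU-12345'
});
```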

This is brilliant and easy enough! But is there something looming in the darkest corner of your perfect implementation? You bet there is!

“The formulation of the problem is often more essential than its solution, which may be merely a matter of mathematical or experimental skill.” – Albert Einstein

Data Layer aficionados

Pros

  • Your data layer specification is an integral part of the implementation strategy and, as such, formally describes the key-value pairs and the acceptable domain of values for each (see the validation sketch after this list).
  • Formalizing the data structure makes for a cleaner, more consistent data collection process regardless of the presentation layer.
  • The resulting implementation is simpler, since there is no need to do DOM traversal to get the element you want.
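
As a hypothetical illustration of that first point, a specification can even be enforced during QA with a few lines of JavaScript. The keys and value domains below are made up for the example:

```javascript
// Hypothetical spec: each key maps to its acceptable domain of values.
var spec = {
  pageCategory: ['home', 'product', 'checkout', 'content'],
  visitorType: ['new', 'returning']
};

// Returns a list of violations for a given data layer object.
function validateDataLayer(obj) {
  var errors = [];
  Object.keys(spec).forEach(function (key) {
    if (!(key in obj)) {
      errors.push('Missing key: ' + key);
    } else if (spec[key].indexOf(obj[key]) === -1) {
      errors.push('Unexpected value for ' + key + ': ' + obj[key]);
    }
  });
  return errors;
}

validateDataLayer({ pageCategory: 'product', visitorType: 'guest' });
// -> ['Unexpected value for visitorType: guest']
```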

Cons

  • “Standardization” and “independence” are fallacies often put forward by paid TMS vendors. Presumably, once you have defined the data layer you could reuse it with any TMS. The reality is that each vendor implements the data layer concept in a slightly different way (contrasted in the snippet after this list). So until the W3C Customer Experience Digital Data proposed standard is adopted, your data layer will be unique to you and the TMS you use.
  • What makes a good implementation is the rigour of the process, not the format of its output – be it a data layer or HTML5 data attributes. Since you have to define a taxonomy anyway, you might as well define HTML5-compliant data attributes. Most likely, when your presentation layer changes, the data points you want to collect will change too.
  • Extensive use of the data layer requires you to go back to your web development team and ask them to populate exactly what is required. After those updates, if you find anything wrong, you have to go back and ask for another round of changes. There’s also the likelihood that something will eventually change, someone will “forget” about the data layer and break it, and you’re back to square one – dependent on IT.
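
To see how little “standardization” there is in practice, compare the same page-level data expressed in Google Tag Manager’s convention and in the W3C Customer Experience Digital Data (CEDDL) proposal; migrating between vendors means translating one shape into the other:

```javascript
// Google Tag Manager convention: an array you push flat objects onto.
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({ pageCategory: 'product' });

// W3C CEDDL proposal: a single nested object named digitalData.
window.digitalData = {
  page: {
    category: { primaryCategory: 'product' }
  }
};
```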

DOM-scraping masters

Pros

  • If we consider a web page to be a structured document, we can “scrape” any of its elements using standard JavaScript and knowledge of the DOM (Document Object Model). In fact, all tag management solutions include features to retrieve specific elements based on their tag, class, id or attributes (see the sketch after this list).
  • You gain a lot of flexibility and agility – if it’s on the page, you can track it – without waiting on your web development team to expose it through the data layer.
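
A minimal sketch of that scraping approach, assuming the template already renders an HTML5 data attribute (the attribute name is illustrative):

```javascript
// Markup assumed on the page:
//   <article data-content-category="analytics">...</article>

// Standard DOM APIs; every TMS exposes an equivalent variable/selector feature.
var el = document.querySelector('article[data-content-category]');
var contentCategory = el ? el.dataset.contentCategory : undefined;

// `contentCategory` can now be mapped to a TMS variable and sent to
// the analytics tool.
```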

Cons

  • This is a somewhat dirtier approach that puts more responsibility and control on the tag management side, but alleviates most dependencies on IT. I say “most” because if a data element isn’t on the page, you still have to find a way to get it there (i.e. talk to your friend in IT).
  • This approach requires better knowledge of JavaScript and DOM concepts – front-end skills, rather than the back-end web development skills typically required to populate the data layer. Marketing usually has more control over the front-end and far less over the back-end.

An optimal and realistic approach

A data layer has the benefit of decoupling the presentation layer from the data collection needs, but it also makes those data elements more “disconnected” from their context. For aspects of the website that deal with IT systems (such as a purchase confirmation), a rigorous data layer approach should be favoured. For general front-end tracking, such as content category, a data attribute approach might be easier and more flexible.
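
A sketch of this hybrid split, with illustrative names on both sides:

```javascript
// Transactional data: populated server-side into the data layer,
// because only the back end knows the authoritative order details.
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
  event: 'purchase',          // illustrative event/key names
  transactionId: 'T-1001',
  transactionTotal: 79.90
});

// Presentation data: scraped from an attribute the template already
// renders, e.g. <body data-page-category="content">.
var pageCategory = document.body.dataset.pageCategory;
```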

In the end, I don’t think there’s a clear-cut case for one approach over the other. My philosophy, akin to “user experience comes first”, is “instrumentation adapts to the web, not the other way around”, so I often use a DOM-scraping approach. Once the TMS bootstrap/container is deployed (and, through some techniques, even when it’s not), it allows me to speed up the implementation process and retain complete control and agility to make adjustments without bugging our friends in web development.

What’s your experience? Have you thought about the differences in those approaches?

Comments
  • http://eduardo.cereto.net/ Eduardo Cereto Carvalho

    For me the biggest con that made me give up on DOM scraping is maintainability. The DOM will change one time or another, and when the developer is changing it, it’s very hard to tell whether the changes are breaking some DOM-scraping code or not. So things stop working unexpectedly and you have to redo your work.

    With the dataLayer, changes are less likely to break measurement because it’s clear and separated. I’ve become a dataLayer aficionado and I’m not looking back. Better to do it right than to do it fast.

  • Jay

    Hi Stephane, thanks for a great article. I’d have to say that a hybrid approach (if possible with your vendor) can provide both agility and maintainability. A solid data layer provides sustainability, consistency and a long-term foundation for the 90%+ of the data that is most critical and consistent to an organization. Bringing order to that data chaos is the biggest win coming from the TMS industry, in my mind. DOM scraping can then be used to evolve a company’s data strategy without involving additional IT resources. We’ve seen some pretty nasty things happen to companies attempting a DOM-scraping-only strategy: data disappears, and when you are relying on your TMS to manage your entire digital marketing stack, that can be nothing short of disastrous. Once a solid data layer is in place you’ve got it made in the shade :-) Just my 2c.

  • https://www.linkedin.com/profile/view?id=5146216 Chris Brinkworth – ensighten

    Much needed piece Stéphane. There’s not enough focus on this, and I’d advise people to look specifically at page 30 of the W3C document, “Industry Standards”. It is becoming clear that each sector needs to define its common definitions as sub-chapters.

  • Stephane Hamel

    Thanks @all for your comments – the resulting discussion on G+ was very interesting. If there’s any conclusion to draw, it is that DOM scraping is useful for testing and quick wins, but a dataLayer strategy is better in the long run, especially for critical areas of the site such as the checkout flow or other online goals (while simple DOM scraping might still be acceptable for presentation-layer-only elements).
