The Blog

Segmenting Social Media Traffic Using Regular Expressions

I’ve been working on a project for a client, and one thing they wanted to know was “What percentage of my overall traffic is from social media?” This led to a deep discussion of reach on the channels they were engaging on, and got me looking at creating some Advanced Segments to isolate traffic down to only referrals from particular social media sites.

Depending on which analytics tool you use, your regular expression might differ, but because I spend 95% of my time in Google Analytics, I’m dedicating this post to it.

Like many things in life, there is more than one option for creating the same segment. John Doherty from Distilled wrote a great post on segmenting social traffic in Google Analytics one way, but I’m here to show you another way using regular expressions (also referred to as REGEX). I would recommend reading a beginner’s guide on regular expressions first. You may also wish to download and print a copy of the REGEX cheat sheet for reference in case you plan to write your own advanced segment later on.

To include .com, or not to include .com? That is the question!

Without .com = facebook|twitter

With .com = facebook.com|twitter.com

Difference = 276

Do you see why? Ahh, other sites are using “facebook” or “twitter” in their own URLs!

Although the above difference is not large in this sample, the number can slowly creep up on you if you are segmenting more channels whose names others have used in their URLs, whether in a sub-domain or within the domain itself.
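To make the difference concrete, here is a minimal sketch in Python rather than in Google Analytics itself. The list of sources is invented for illustration, and it assumes the segment matches anywhere in the source string, which `re.search` mimics. Note that the dots are escaped here (`\.`) so they match a literal dot; that escaping is an addition to the patterns shown above, where an unescaped dot matches any character.

```python
import re

# Hypothetical referral sources, invented for illustration only.
sources = [
    "facebook.com",
    "m.facebook.com",
    "twitter.com",
    "facebook-tips.example.org",  # unrelated site with "facebook" in its domain
    "twitter.example.net",        # unrelated site with "twitter" as a sub-domain
    "google.com",
]

# "Without .com" pattern from the post.
broad = re.compile(r"facebook|twitter")
# "With .com" pattern, dots escaped so they match a literal "." only.
narrow = re.compile(r"facebook\.com|twitter\.com")

broad_hits = [s for s in sources if broad.search(s)]
narrow_hits = [s for s in sources if narrow.search(s)]

# Sources the broad pattern picks up that the narrow one leaves out.
extra = [s for s in broad_hits if s not in narrow_hits]
print(extra)
```

In this made-up sample the broad pattern picks up the two look-alike domains, which is the same kind of gap as the difference shown above.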

**TIP: If you’re working with a huge chunk of data and you run this Advanced Segment, you will most likely run it in “fast access mode”. To avoid this, go into your Traffic Sources report, paste your REGEX into the inline filter at the bottom (top for V5), and voila! You’ve escaped the fast access creeper. (Note: the inline filter for V5 is not set to REGEX initially; you need to click the Advanced link next to the text box to select it.)

I hope this helps clean the data up a bit! Let me know if you would write the regular expressions differently; I’d be curious to know.

  • http://twitter.com/lauw010 Laurens Geleedst

    Aren’t you missing out on all the URL shorteners with this segment (bit.ly et al.)? In particular the new t.co shortening service?

    • http://jacksonlo.com Jackson Lo

      Hi Laurens,

      Glad you brought this up! The above was to illustrate the use of regular expressions and the importance of including “.com” in your filter. You are correct, and t.co should be added to the regular expression for Twitter traffic… this is how I would write it: Include source match regex = twitter.com|twitterfeed.com|hootsuite.com|^t.co$. The t.co links are generated once a link is shared on Twitter, and then 301 redirect to the page itself. The characters following t.co/ show up under your referral path in GA, which bit.ly never supplied.

      Another thought I had is that if you are tagging your links using source = twitter or facebook, you may want to add extra items to your regular expression: ^facebook$ and ^twitter$. This will prevent additional characters or URL look-alikes from showing up in your reports.
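A quick sketch of why the ^ and $ anchors matter for something as short as t.co, again in Python rather than in GA and with sample sources chosen for illustration. The dots are escaped here, which is an addition to the pattern in the reply above; the unanchored `t\.co` pattern is included only as a counter-example.

```python
import re

# Pattern from the reply above, with dots escaped (the escaping is an addition).
twitter_sources = re.compile(r"twitter\.com|twitterfeed\.com|hootsuite\.com|^t\.co$")

# Anchored patterns for hand-tagged campaign sources.
tagged = re.compile(r"^facebook$|^twitter$")

# Counter-example: the same short name without anchors.
unanchored_tco = re.compile(r"t\.co")

checks = {
    "anchored t.co matches t.co": bool(twitter_sources.search("t.co")),
    "anchored t.co matches reddit.com": bool(twitter_sources.search("reddit.com")),
    "unanchored t.co matches reddit.com": bool(unanchored_tco.search("reddit.com")),
    "tagged matches twitter": bool(tagged.search("twitter")),
    "tagged matches twitter-clone": bool(tagged.search("twitter-clone")),
}
for label, result in checks.items():
    print(label, "->", result)
```

Without the anchors, "t.co" is a substring of any source ending in "t.com", so a domain such as reddit.com would slip into the Twitter segment.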

  • http://twitter.com/rchrisborden Chris Borden

    Like the post…LOVE the comment! I’m kicking myself for not thinking of that but so glad you reminded me Laurens!

  • Peter O’Neill

    Nice post Jackson. 

    Companies should also be using campaign tagging to track social media links they post themselves or that others generate through their social sharing buttons.  This post lists the three types of social media traffic and you should actually end up with a medium for each (an alternative to using segments or inline filters) – http://bit.ly/rOJ0kl. 

    I have since realised there is also a fourth group for campaigns run on social media networks, e.g. Facebook Ads – need to update that blog post.  Readers can also find a useful Excel tool for generating GA campaigns at http://bit.ly/uUBDER.

    Cheers

    Peter

    • http://jacksonlo.com Jackson Lo

      Those are excellent thoughts, Peter! And I like the tools you’ve built in Excel. They come in handy when you have a number of people working on spreading the word; things can get messy, but this will solve that issue :)

      As I was reading through your recommended articles I came across a Twitter attribution piece I found very interesting, which I will explore more later. May be worth another post!

  • http://twitter.com/LesFaber Les

    Hey Jackson,

    Just came across this post (excellent BTW). I also read Fabian’s post: http://www.seomoz.org/blog/advanced-google-analytics-tips-and-tricks.

    It would be great if one of you would compile the suggestions given in each post, along with any additions garnered through comments to both.

    I had implemented Social Media tracking via Advanced Segments in the past, but had missed your handy t.co recommendation. Thank you for that. 

    These days, measuring one’s Social Media Traction is truly a moving target!

    • http://jacksonlo.com Jackson Lo

      Les, I agree! I found that the best approach is to write an inventory of social media sites that you want to segment and analyze, then apply it as an advanced segment using regex in Google Analytics. Over time, you’ll need to build on it as things like t.co begin rolling out.
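One way to keep such an inventory manageable is to hold it as a plain list and generate the regex from it. This is only a sketch: the list names (`social_domains`, `exact_sources`) and their contents are hypothetical, and the resulting string would still be pasted into the segment by hand.

```python
import re

# Hypothetical inventory; extend it as new referrers appear.
social_domains = ["facebook.com", "twitter.com", "twitterfeed.com", "hootsuite.com"]
# Short names that should match only as the whole source, so they get ^...$ anchors.
exact_sources = ["t.co"]

# re.escape turns "." into "\." so each dot matches a literal dot.
parts = [re.escape(d) for d in social_domains]
parts += [f"^{re.escape(s)}$" for s in exact_sources]

pattern = "|".join(parts)
print(pattern)
```

Adding a new site then means adding one entry to the list rather than editing a long pattern by hand.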

  • Sahet

    Wow, thanks so much, hadn’t realized how powerful this could be.


Cardinal Path Training

Copyright © 2014, All Rights Reserved. Privacy and Copyright Policies.