Coding and Dismantling Stuff

Don't thank me, it's what I do.

About the author

Russell is a .NET developer based in Lancashire in the UK. His day job is as a C# developer for the UK's largest online white-goods retailer, DRL Limited.

His weekend job entails alternately demolishing and constructing various bits of his home, much to the distress of his fiancée Kelly, 3-year-old daughter Amelie, and menagerie of pets.


Google Analytics and the Bogus Account Number

I've recently hit a snag with the Google Analytics API that's really put a dint in my day. The suggested script comes with a caveat: it should be included in the header of the page. But this goes against current performance guidelines, which say that all of your scripts should go at the bottom of the page, particularly if they're loading in third-party scripts.

So why does that script need to go at the top of the page? Is it anything to do with page load timings? Nope, the APIs used to time pages are built into the browser, or on some versions of IE they're built into the Google toolbar where it's present. So it ain't that. Could it be because if you put anything into the global "_gaq" array before the account number, those instructions go in with a bogus made-up account number?

We don't get a nice clear JS error telling us something's wrong. The analytics JS file doesn't check for the account number before sending the rest of the traffic, and it doesn't refuse to send the commands (which would at least make the bug more obvious if you're monitoring HTTP traffic using Fiddler, Firebug, etc). Instead, it sends the commands with a bogus account number. Ouch.

I propose a minor tweak to the Analytics code block that fixes this problem, but first some background.

The Existing Tracking Code Explained

So what does the existing tracking code look like?

var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']);
_gaq.push(['_trackPageview']);

(function() {
  var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
  ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
  var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();

This code does four things:

  1. Check for an existing _gaq object, which could be an array or, if you've already hit the Analytics script elsewhere on the page, an AJAX-enabled Analytics object. If it doesn't exist let's create a dumb array.
  2. Push some data onto that tracking object, in this case an account number followed by a page view tag
  3. In a scoped block, create an async script element that points at the "ga.js" script
  4. Execute that script by adding it to the DOM.

The general principle is actually quite smart. We have a global _gaq object that initially behaves like a plain ol' array. When the "ga.js" script gets invoked, it pushes all the data in that array to Google's servers and replaces the array with a new _gaq object. The new object still supports the "push" syntax, so it still looks like an array, but it's now actually using AJAX under the hood to send the data in each push command as you append to it.
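
To make that hand-over concrete, here's a rough sketch of the pattern. It is not Google's actual code: the "send" and "installQueue" functions are made-up names, and "send" just records each command rather than talking to a server.

```javascript
// Hypothetical stand-in for the network call: it only records what would be sent.
var sent = [];
function send(command) {
  sent.push(command.join(' '));
}

// Before the library loads, _gaq is a plain array that buffers commands.
var _gaq = [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']);
_gaq.push(['_trackPageview']);

// What a loader in the style of ga.js might do once it arrives (assumed):
// drain the buffered commands in order, then hand back an object whose
// push() dispatches each new command immediately.
function installQueue(buffered) {
  for (var i = 0; i < buffered.length; i++) {
    send(buffered[i]);
  }
  return {
    push: function (command) {
      send(command);
    }
  };
}

_gaq = installQueue(_gaq);

// Same syntax as before, but this command is dispatched straight away.
_gaq.push(['_trackEvent', 'Videos', 'Play']);
```

The calling code never needs to know which version of _gaq it's talking to, which is what makes the pattern so handy for scripts that load asynchronously.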

Our Non-Standard Use Pattern Goes Boom

Being the performance-conscious web devs that we are, we've decided to move this script right to the bottom of our pages; we don't want to risk that call out to Google delaying the rest of our page load. We already know that the important stuff like page load timing is not sensitive to this: we did our testing, and all was well.

But hang on - if this script is at the bottom of our page, how do we fire some of the page-specific tags, for example the commerce tags? We decided to centralise our own interactions with Google into a reusable JavaScript file that looks something like this:

var _gaq = _gaq || [];

var Tagging = {
  search: function(searchTerm) {
    _gaq.push(['data', 'goes', 'here']);
  }
};

This is great - we've got nothing in here apart from the bare _gaq variable, and a nice wrapper to separate the GA particulars from how we want tagging to look.

Except that if we call that "Tagging.search" method in a page before we've loaded the generic GA script we looked at above, we've got no account number specified yet. So when Google sends our search tag plus its own two tags later on, the first tag goes across with an account number of something like "UA-XXXXX-X". This isn't good.
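
Here's a toy model of that failure. It's only a sketch of the behaviour we saw, not the real ga.js internals: "processQueue" is a made-up function, the placeholder default is an assumption based on the account number we observed, and the search event values are invented.

```javascript
// Hypothetical model of a tracker: it starts with a placeholder account and
// stamps every hit with whatever account is set when that command runs.
function processQueue(queue) {
  var account = 'UA-XXXXX-X'; // placeholder default (assumed for illustration)
  var hits = [];
  for (var i = 0; i < queue.length; i++) {
    var command = queue[i];
    if (command[0] === '_setAccount') {
      account = command[1];
    } else {
      hits.push({ account: account, command: command[0] });
    }
  }
  return hits;
}

// A page-specific tag pushed early by our tagging wrapper...
var queue = [];
queue.push(['_trackEvent', 'Search', 'kettle']);
// ...followed by the standard snippet at the bottom of the page.
queue.push(['_setAccount', 'UA-XXXXX-Y']);
queue.push(['_trackPageview']);

var hits = processQueue(queue);
// hits[0] carries the placeholder account, hits[1] carries the real one.
```

Nothing throws and nothing gets dropped. The early tag simply goes to the wrong place, which is why this one is so easy to miss.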

So what to do? Of course we could put our account number in our own tagging script, but that doesn't feel like a good idea. It makes it hard to reuse that script anywhere else if the account number needs to change, and it means the account number lives in two places. So what about this...

A Possible Solution

What if the GA script, rather than blindly appending the account number onto the end of _gaq, checked whether it was an array, and if so took the extra pain to make sure the account number was the first command in the array? Wouldn't that make sense? Let's take a look at how that looks. The original problem code:

var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']);

And the new code: 

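var _gaq = _gaq || [];
if (Object.prototype.toString.call(_gaq) === '[object Array]') {
    // Still a plain array: make the account number the first command
    _gaq.splice(0, 0, ['_setAccount', 'UA-XXXXX-Y']);
} else {
    // ga.js has already replaced _gaq: a normal push is fine
    _gaq.push(['_setAccount', 'UA-XXXXX-Y']);
}

To cut a long story short, this is exactly what we did, and for now all's well.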
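
If you'd rather not repeat that check everywhere, it could be wrapped in a small helper along these lines. This is a sketch rather than what we actually deployed: "setAccountFirst" is a made-up name, and the search event is just an invented example of a tag pushed earlier in the page.

```javascript
// Hypothetical helper: make sure _setAccount is the first command while the
// queue is still a plain array; otherwise fall back to a normal push.
function setAccountFirst(queue, accountId) {
  var command = ['_setAccount', accountId];
  if (Object.prototype.toString.call(queue) === '[object Array]') {
    queue.unshift(command); // same effect as splice(0, 0, command)
  } else {
    queue.push(command);
  }
  return queue;
}

// A tag pushed earlier in the page, before the account number was known.
var _gaq = [['_trackEvent', 'Search', 'kettle']];

setAccountFirst(_gaq, 'UA-XXXXX-Y');
_gaq.push(['_trackPageview']);
```

After this runs, the account command sits at the front of the array, ahead of the search tag that was pushed before it.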

Categories: Hacking