has_many :codes
A Google Analytics proxy to game ad blockers

If you have a website, you are likely using some kind of analytics to track visits. According to several reports like this one, Google Analytics is currently used by up to 85% of all the websites on the Internet that use a known analytics service or software. Let that sink in: 85%. It's a massive market share that simply blows any competition out of the water.

Personally, I have been thinking of ditching all things Google for a while due to privacy concerns, like many other people. However, I have given up on this idea for now, because I rely on too many Google products that simply don't have alternatives that are as good.

Privacy concerns aside, these days Google Analytics isn't great from an accuracy point of view. An increasing number of users use ad blockers, or browsers with ad blockers built in, and despite the name these block various types of trackers besides ads, including Google Analytics. This means that Google Analytics tends to show much lower figures for metrics such as users and page views than analytics solutions that let you use a custom domain, which ad blockers don't recognize and thus ignore.

As for privacy concerns, there are many analytics options that claim to be privacy friendly, in that they do not collect any personally identifiable information and therefore are also GDPR compliant.

I have tried some of these services, such as Fathom and Plausible, but I then switched to self-hosted Umami, a young project that replicates the same features, including privacy friendliness, but for free if you have a server where you can install it, which is very quick and easy. One thing I like about these solutions is that they show all the most important metrics on a single page, so you don't need a PhD to use them. No countless menus and reports. Just one page with top visited pages, traffic sources, countries and little more.

However, I have never stopped using Google Analytics alongside them, because some of its many features can be very handy for SEO purposes, either on their own or through integration with other tools (I am currently trying to learn more on the subject), and because of the integration with Google Ads, which for many companies is a necessary evil.

So, given that we want or have to stick with Google Analytics for one reason or another, what can we do to improve accuracy?

Today I came across this project on GitHub that basically allows you to use a Node-based application to proxy all the requests to some Google properties (as well as a few non-Google ones), including Google Analytics, using your custom domain. As I mentioned earlier, ad blockers ignore domains that they don't recognize as belonging to ad networks or other trackers.

Once you install the app - I use the Docker version in Kubernetes - you can just change your Google Analytics code snippet so that it looks like the following:

<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://<your custom domain>/*(d3d3Lmdvb2dsZXRhZ21hbmFnZXIuY29t)*/*(Z3RhZw)*/*(anM)*?id=G-7C6FWHXCTH"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', '<your analytics property ID>');
</script>

That's it! It's as simple as that. You just need to replace the URL of the Google Analytics/Tag Manager script with a URL that points to the custom domain you use for the proxy and has the www.googletagmanager.com/gtag/js part "masked" as its path (you can see the simple npm command to mask the URL in the repo's README), and you're done. So instead of https://www.googletagmanager.com/gtag/js you would have a URL that looks like the one in the snippet above. Now all the requests to Google Analytics will be proxied through your custom domain and will be ignored by ad blockers altogether.
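
To make the "masked" path less mysterious, here is a minimal sketch in Node of what the masking appears to do, judging only from the snippet above: each part of www.googletagmanager.com/gtag/js looks like it is Base64 encoded, stripped of its "=" padding and wrapped in *( )*. The function names maskPath and unmaskPath are made up for illustration, and the encoding is my inference from that one example rather than the project's documented algorithm, so use the npm command from the README for anything real.

```javascript
// Hypothetical helpers: names and encoding rules are inferred from the
// snippet in this post, NOT taken from the proxy's source code.

// Base64-encode one path segment and drop the trailing "=" padding.
function encodeSegment(segment) {
  return Buffer.from(segment, 'utf8').toString('base64').replace(/=+$/, '');
}

// Turn "host/path/parts" into a sequence of "/*(<base64>)*" groups.
function maskPath(hostAndPath) {
  return hostAndPath
    .split('/')
    .filter((segment) => segment.length > 0)
    .map((segment) => `/*(${encodeSegment(segment)})*`)
    .join('');
}

// Reverse direction: decode every *(...)* group back to plain text.
// Node's Base64 decoder accepts input without padding.
function unmaskPath(masked) {
  return masked.replace(/\*\(([^)]*)\)\*/g, (match, encoded) =>
    Buffer.from(encoded, 'base64').toString('utf8'));
}

const masked = maskPath('www.googletagmanager.com/gtag/js');
console.log(masked);             // same path as in the snippet above
console.log(unmaskPath(masked)); // /www.googletagmanager.com/gtag/js
```

Note that plain Base64 can produce "/" and "+" characters for some inputs, which this sketch does not handle; how the real proxy deals with that is something I have not checked.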

One downside of this approach is that, out of the box, any filters you have configured in Google Analytics to exclude visits from specific IP addresses will stop working. The proxy sends a parameter to Google that specifies the real IP of the visitor to your website; otherwise Google would see the IP of the proxy and would be unable to determine the location of the visitor. The filters stop working because Google anonymizes the IPs specified by the proxy before comparing them with the IPs in the filters, so the IPs don't match and Google tracks the visits as usual.

Luckily there is an easy workaround for this: just change your filters in Google Analytics, replacing the last octet of each IP you want to exclude from tracking with a zero, because that's what Google does to anonymize the IPs. So if the IP of the visitor is 123.123.123.123, it becomes 123.123.123.0 in the filter. I have been testing this for a few hours and the filters seem to work again.
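
If you have more than a couple of IPs to convert, the rewrite described above is easy to script. This is just a small sketch for IPv4 addresses; zeroLastOctet is a name I made up, and it only mirrors the "replace the last octet with zero" rule from this post, not any official Google tool or API.

```javascript
// Hypothetical helper, IPv4 only: replace the last octet with 0 so the
// value matches what the filter will see after anonymization.
function zeroLastOctet(ip) {
  const octets = ip.split('.');
  const valid = octets.length === 4 &&
    octets.every((octet) => /^\d{1,3}$/.test(octet) && Number(octet) <= 255);
  if (!valid) {
    throw new Error(`Not an IPv4 address: ${ip}`);
  }
  octets[3] = '0';
  return octets.join('.');
}

console.log(zeroLastOctet('123.123.123.123')); // 123.123.123.0
```

Keep in mind that a filter on 123.123.123.0 will then match every visitor whose IP starts with 123.123.123, not just yours.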

I will keep an eye on the stats from Google Analytics over the next few days and compare them to those of Umami, but I can already see a difference in the visits tracked: the figures from GA and Umami seem to be the same now.

We'll see, but this seems to be a solid and simple solution to improve the accuracy of Google Analytics. Also, if you prefer a hosted solution over self-hosting, the author of the project is working on a SaaS for it at https://dataunlocker.com/ which is reasonably priced and would make things even easier. I might purchase the service later. For now I am happy with the self-hosted option.

© Vito Botta