Referral Spam Solved: Give Your Analytics an Enema

Adam Steele
Sep 21, 2015

Ghost referrer spam is a new twist on marketing that directly targets webmasters. It exploits Analytics data by displaying URLs or brands in your reports so that traffic can be driven to the spammer’s sites. In the process it litters your analytics data. This type of spam was especially frustrating because it had to be removed manually, until now.

If you’re anything like us, evaluating your own or your clients’ website traffic became a lot more difficult during the course of this year. At first we were all like “Yeah, natural inbound links from button sites!!!”

Two seconds of research left us hoping this trend would die a quick death.

Two months later, we knew we had a problem on our hands and started taking steps to block bogus referrers from our analytics account.

[Image: Google Analytics before and after applying the segment]

I get that it’s hard to accept a graph with less data as an improvement, but it’s a far more honest evaluation for both you and your clients.

I appreciate the information, but let me just…

Clean Spam for Free

Understanding Analytics Spam

[Screenshot: Referral-Spam-Solved]

When I mentioned earlier that this spam is targeted at you, the webmaster or site owner, I wasn’t kidding. In our case, the spam has lots to do with web development stuff … buttons, traffic, servers, SEO, videos, rank tracking, porn (yeah, webmasters need love too). I suspect you would see a lot of the same stuff, but maybe a little more targeted to what you do.

You can see what I’m talking about on the screenshot. This is just a random sample I grabbed from our database.

This makes total sense since it’s 99% webmasters that concern themselves with analytics data. It’s how we measure our manhood against other nerds. It’s also the metric we use to show our clients that they keep us around for a reason.

IT Manager: “Times are tough, I just laid off two people and now I’m feeling the pressure to justify keeping your internet marketing firm around … what is it you do again?”

My SEO Firm: “As you can see from our metrics, your year-over-year growth is +30% and your traffic increased 35% this month. Conversions have doubled, blah, blah, blah. I have to challenge you to find another employee who has brought you anywhere close to this much business.”

IT Manager: “Yeah, I see that you guys carry your own weight … what’s this stuff about buttons & donkey shows? How are we getting traffic from them?”

My SEO Firm: “err, ummm”

My Google Analytics Account Needs an Enema

By now you’ve probably gone into your analytics account to look around, and you’re probably starting to realize that it’s sympathy you have for our problem and not empathy. I know this because we’re not special. Our clients aren’t the biggest in the world. We’re not targeted due to political position or race.


Ghost spammers are like frogs in heat—they rely on the spray-and-pray reproductive strategy.

It’s likely not simply random UA-XXXXXXXX-X IDs like many bloggers are saying. That would be incredibly inefficient and no hack in their right mind would think that’s going to be effective. At best, it’s a waste of CPU cycles. More likely, their system is a spider that they’ve built and pointed at a set of SERPs to find Universal Analytics (UA) footprints for their target audience.
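If you’re curious what that footprint looks like on your own pages, here’s a minimal Python sketch. It just pulls anything shaped like a UA property ID out of a chunk of HTML. The names (`UA_ID_PATTERN`, `find_ua_ids`, `sample_page`) and the digit ranges in the pattern are our own assumptions for illustration, not anything we’ve seen in a spammer’s code.

```python
import re

# Universal Analytics property IDs look like UA-<account>-<property>.
# The digit ranges below are an assumption, chosen to be permissive.
UA_ID_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def find_ua_ids(html: str) -> list[str]:
    """Return the distinct UA property IDs found in a page, in first-seen order."""
    seen: dict[str, None] = {}
    for match in UA_ID_PATTERN.findall(html):
        seen.setdefault(match, None)
    return list(seen)


# Hypothetical page snippet with a made-up property ID.
sample_page = """
<script>
  ga('create', 'UA-12345678-1', 'auto');
  ga('send', 'pageview');
</script>
"""
print(find_ua_ids(sample_page))
```

Run it against the source of your own landing pages and you’ll see how little effort it takes to read the ID straight off the page.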

Once UA footprints are found, the IDs are saved to a database and then the ghost spamming scripts just nail Analytics directly, over and over, until they feel like they’ve got enough records in your GA account to show up at the top of your referrer list. It’s the same principle that applies to “above the fold” content for your landing pages, and why your site is only relevant if it shows up on page 1 of the SERPs: exposure. Enough exposure to get you interested enough to go Google wtf they’re all about. That traffic leads to conversions and ROI … meaning they won’t be stopping anytime soon.

A Solution That Really Works

There are 3 main schools of thought on addressing referrer spam.

  1. Block in .htaccess: Completely ineffective! If this were 2014, it would work well as long as you could keep up with the avalanche of spammers heading to your site. In this article I’m referring to the far more deceptive ghost method of referrer advertising. The bots never hit your site, so blocking them on your server will do you no good at all.
  2. Block in analytics via filters: This is exactly how we started off. Originally we used Simo’s tool powered by Loan Goat’s blacklist. Immediately we noticed that the list is outdated and, while the solution worked, it was only about 15% effective. Worse yet, even if it were 100% effective, it can’t do anything about yesterday’s traffic … it’s only good for future referral spammers.
  3. Block in analytics via segments with a custom blacklist: Segments are built into analytics so that you can filter your past visitors depending on certain criteria (time of day, geolocation of IP, etc.). We decided to piggy-back on that technology to fight ghost referrers and holy hell did we have a winner on our hands.
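To make option 3 a bit more concrete, here’s a rough Python sketch of turning a blacklist into regex patterns you could paste into a segment condition. We’re assuming the segment excludes sessions whose source matches a regex, and that the field has a character limit (`max_len`, with 255 as a guess you should check against your own account). The function names and the `.example` domains are made up for illustration.

```python
import re


def build_segment_patterns(domains: list[str], max_len: int = 255) -> list[str]:
    """Turn a blacklist of domains into one or more regex alternations.

    max_len is an assumed per-field character limit; adjust it to
    whatever your analytics UI actually accepts.
    """
    patterns: list[str] = []
    current = ""
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    for domain in cleaned:
        escaped = re.escape(domain)  # so "." matches a literal dot only
        candidate = escaped if not current else f"{current}|{escaped}"
        if current and len(candidate) > max_len:
            # Current pattern is full: close it and start a new one.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


def is_spam(source: str, patterns: list[str]) -> bool:
    """Local sanity check: does a referrer source match any pattern?"""
    return any(re.search(p, source.lower()) for p in patterns)


# Hypothetical blacklist entries.
blacklist = ["spam.example", "buttons.example", "spam.example"]
patterns = build_segment_patterns(blacklist)
print(patterns)
print(is_spam("buttons.example", patterns), is_spam("google.com", patterns))
```

Splitting into several patterns is just a hedge against the length limit. Each chunk would become its own exclude condition in the segment.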

How We’re Doing It

As you can imagine, going through analytics and copy/pasting referral spammers was a complete waste of time. Even if we got 100% of everything, new marketers litter our clients’ analytics daily. It’s like when black hats burn a domain and then just move on to another $10 TLD with a clean history to keep cranking out the spam. Ghost spam domains are cheap and fungible. A $10 registration fee is just a cost of doing business and they lease out a whole group of domains weekly (if not daily).

Our solution, after much trial and error, was to hook up the Google Analytics API to all of our clients’ sites and scrape ALL incoming referrers for the entire network every few minutes. With the help of a little programming & database voodoo magic, we’re now sending new potential referrer spammers directly to a Slack channel so we can immediately update our system and create a new segment to use. We all work in Slack every day so it was stupid easy to crowdsource the work of blacklisting internally.
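Here’s a stripped-down Python sketch of the “spot new referrers and ping Slack” part. This is not our production code. It assumes something else has already pulled `(source, sessions)` rows from the analytics API, and that you have a Slack incoming webhook URL to post to. The function names, the allowlist idea, and the `.example` domains are all made up for illustration.

```python
import json
import urllib.request


def find_new_referrers(rows, blacklist, allowlist):
    """Return referrer sources that are in neither list, biggest first.

    rows is assumed to be an iterable of (source, sessions) pairs that
    other code has already fetched from the analytics API.
    """
    known = {d.lower() for d in blacklist} | {d.lower() for d in allowlist}
    totals: dict[str, int] = {}
    for source, sessions in rows:
        key = source.strip().lower()
        if key and key not in known:
            totals[key] = totals.get(key, 0) + int(sessions)
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def slack_payload(candidates):
    """Build the JSON body for a Slack incoming webhook message."""
    lines = [f"{source}: {sessions} sessions" for source, sessions in candidates]
    return {"text": "New referrers to review:\n" + "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict) -> None:
    """POST the payload to a Slack incoming webhook (URL supplied by the caller)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


# Hypothetical rows; in practice these would come from the API on a schedule.
rows = [("google.com", 120), ("free-buttons.example", 40), ("new-seo-offer.example", 55)]
candidates = find_new_referrers(
    rows, blacklist=["free-buttons.example"], allowlist=["google.com"]
)
print(slack_payload(candidates)["text"])
```

Anything that lands in the channel still gets a human look before it goes on the blacklist, since a brand-new referrer can just as easily be a legit link.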

To the Victor Go the Spoils


Yeah, that’s how we feel right about now. We’re pretty happy with the way this effort turned out.

Our analytics data is clean and organized.

Our clients are happy with our reporting.

Our team loves the system because it takes only a minute or so every day to keep up-to-date.

That’s a win if I’ve ever seen one.

Here’s where you can share in the spoils. Since the finished product of our ghost referrer blacklisting system is a simple shared link from Google Analytics, you can easily take advantage of all of our efforts just by sending our segment into your analytics account.

Yeah, we’ll require you to join our mailing list but there’s a reason for that and it’s not so we can replace ghost spam with Loganix spam. As I mentioned before, our blacklist gets updated daily. With our mailing list, we’re able to send you updated filters weekly so staying up to date takes just the click of a link.

Remove My Referral Spam

*cleans historical & future spam