

A search for a place to rest my head near to Exeter Airport (I had an early flight to catch yesterday) led me to perhaps the closest airport hotel, ever! Situated between the two runways at Exeter, this was going to be ultra, or even über-convenient!

Alas, Google Maps had failed me: a quick look at the hotel's postcode, and indeed at the image, which showed no nearby buildings, quickly aroused my suspicions. The Trelawney Hotel is actually in Torquay.

Google Maps, well and truly, failed!

Google Analytics and sub-domain tracking


I am always championing Google Analytics as my analytics package of choice, not only because it’s free, but because I feel the functionality you get out of it is second to none when compared to the rest of the free or nearly free products out there. Out of the box, GA works with minimal setup or configuration, and works really bloody well.

Tracking sub-domains with Google Analytics also works with minimal configuration, by simply amending your GA code (this is the most up-to-date version, by the way):

_gaq.push(['_setAccount', 'UA-XXXXXXX-1']);
_gaq.push(['_trackPageview']);

To read:

_gaq.push(['_setAccount', 'UA-XXXXXXX-1']);
_gaq.push(['_setDomainName', '.yourdomain.com']);
_gaq.push(['_trackPageview']);

Simple enough? Well, here is a nice tidbit of information: if you have two pages, one on your primary domain and one on a sub-domain, that share the same page name (e.g. index.html), then Google Analytics will combine the statistics for these two pages, artificially inflating the statistics for index.html on the primary domain. As my good friend Rob put it: “I wonder who decided that this was a sensible default?” Sensible default indeed!
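To make the problem a bit more concrete, here is a rough sketch of what that merging looks like. This is not how Google Analytics works internally, just an illustration of the effect: the hostnames, the hit list and the `countPageviews` helper are all made up for the example.

```javascript
// Hypothetical pageview hits: hostnames and counts are invented for illustration.
const hits = [
  { hostname: "www.yourdomain.com", path: "/index.html" },
  { hostname: "www.yourdomain.com", path: "/index.html" },
  { hostname: "blog.yourdomain.com", path: "/index.html" },
];

// Count pageviews, grouping hits by whatever key the caller derives from each hit.
function countPageviews(hitList, keyOf) {
  const counts = {};
  for (const hit of hitList) {
    const key = keyOf(hit);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Keyed on the path alone: both hostnames collapse into a single row.
const byPath = countPageviews(hits, (hit) => hit.path);

// Keyed on hostname + path: each hostname keeps its own row.
const byHostAndPath = countPageviews(hits, (hit) => hit.hostname + hit.path);

console.log(byPath);
console.log(byHostAndPath);
```

Grouped by path alone, the sub-domain's visit gets lumped in with the primary domain's index.html. Grouped by hostname plus path, the two pages stay apart, which is the outcome you would want from the reports.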

More endless searching led me to information on setting up a filter, which will supposedly split your data and track traffic on both sub-domains and the primary domain:

Unfortunately, this didn’t work out too well either in my tests:

So, calling on the Twitterverse, David Whitehouse from Bronco volunteered to help out and see if we could get this fixed, and came across the following filter configuration from Brian Clifton’s book:

This has now been set up, and I’m running a few more tests on some new pages and will report on results as soon as they’re in…

Update

So it’s the morning after the afternoon before, and I can’t say the results are overly brilliant, especially when I spot the following duplication in my analytics overview:

Furthermore, the subdomain I have been testing is being treated like a subdirectory in the reporting:

So when you hover over the “Visit this page” link in GA, you get the following in your status bar:

Not what I would call perfect, by any means.

Robots.txt is case sensitive!


What are robots?

Robots, crawlers, spiders or agents are programs which are used to traverse the wobbly world wide web automatically, taking note of which content is where. Search engines use these programs to index content for their indices; spammers use them to scrape content for their own sites, or even to crawl the web for email addresses so they can spam you even more. Understanding how search robots work is an intrinsic part of SEO!

What is robots.txt?

Robots.txt is a file which, when placed in the root of a publicly available webserver, tells search engine robots and agents which content they should and shouldn’t access. It can be used to block access to members-only directories, for example, or to pages which nobody should really be finding through search engines, or, as I was trying to do earlier today, to block search engines from indexing pages which are part of your affiliate system or associated with an affiliate ID.

What is an affiliate system?

Many e-businesses run affiliate systems, which can be described as a semi-automated process by which the e-business takes referrals from partner websites, and remunerates them for any transactions that result from the referral. Having worked in the online gaming industry for over 10 years now, I’m quite familiar with many of them, as they are an integral part of just about every online gambling business model.

Most affiliate systems use standard HTML links which contain affiliate ID parameters in order to track referrals from their partner or affiliated sites. Anyone who clicks a link containing the affiliate parameters will be associated with that affiliate’s account. If they go on to purchase something from the site, the affiliate will take their share of that revenue.
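As a minimal sketch of how such a link carries its affiliate ID, the snippet below pulls the parameter out of a URL. The parameter name `affid` is the one used in the robots.txt rule later in this post; the example.com domain, the ID value and the `getAffiliateId` helper are assumptions for illustration only, not anyone's real affiliate system.

```javascript
// Extract an affiliate ID from a link.
// Assumption: the ID travels in a query parameter called "affid";
// the domain and ID values used below are hypothetical.
function getAffiliateId(link, paramName = "affid") {
  const url = new URL(link);
  // URLSearchParams.get returns null when the parameter is absent,
  // and parameter names are compared case-sensitively.
  return url.searchParams.get(paramName);
}

console.log(getAffiliateId("https://www.example.com/?affid=12345")); // "12345"
console.log(getAffiliateId("https://www.example.com/"));             // null
console.log(getAffiliateId("https://www.example.com/?affId=12345")); // null
```

Note the last line: to this code, a link tagged with `affId` is simply a different parameter from one tagged with `affid`.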

Search engine optimisation and brand protection

This is leading somewhere, I promise!

Many companies, when taking on an SEO consultant, agency or in-house employee, go straight for the proverbial jugular. They want to target the big, juicy keywords which will drive mountains of good, converting, valuable traffic. Because of this, they usually overlook the basics: ensuring that you dominate the search results for your own brand names. Imagine, if you will, the panic in the office this morning when, after no more than a few hours, I spot a discrepancy in our brand term search results: an affiliate-tracked URL is ranking in second spot, taking a nice bounty per referral as well as a share of any future revenue from any clients that came through that link!

SEO, brand protection and affiliate URLs

Whoever was in here before me had not taken the time to ensure that search engines, and especially Google (ye olde search dominator), were not allowed to index URLs tagged with affiliate IDs. Due to the high volume of traffic this affiliate was sending through his affiliate ID, that URL got indexed for our brand name, and is still ranking in second place. I immediately submitted a removal request, and asked for a change to the robots.txt to ensure that affiliate parameters were blocked from this point forward… the reply: “what robots.txt?”

Robots.txt to block affiliate IDs

Given that the content/web developer people here hadn’t implemented one, and given the urgency required to get this resolved, I quickly typed up a robots.txt file for them to upload… and ran it through the nicely-provided-by-Google testing utility:

Allowed by line 5: Disallow:

But what about line 6, you stupid test tool? The one that says:

Disallow: /?affid=*

No amount of fiddling would get it to work! I tried and tried and tried. And then I tested a second URL, one I typed up myself, and not copied and pasted:

Blocked by line 6: Disallow: /?affid=*

And that is when it struck me… affid and affId are two completely and utterly different things according to robots.txt… why? Because ROBOTS.TXT IS BLOODY CASE SENSITIVE!
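For the curious, here is a simplified sketch of why the two URLs behave differently. It is not Google's actual matcher: it only handles Disallow rules, the `*` wildcard and a trailing `$`, and it ignores Allow lines and rule precedence. The `patternToRegExp` and `isBlocked` helpers and the `123` ID values are hypothetical.

```javascript
// Turn a robots.txt path pattern into a case-sensitive regular expression.
// Simplified: supports "*" (any run of characters) and a trailing "$" (end of URL).
function patternToRegExp(pattern) {
  const anchored = pattern.endsWith("$");
  const body = anchored ? pattern.slice(0, -1) : pattern;
  const escaped = body
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  // No "i" flag: the comparison is deliberately case-sensitive.
  return new RegExp("^" + escaped + (anchored ? "$" : ""));
}

// True when any Disallow pattern matches the path (query string included).
function isBlocked(pathAndQuery, disallowPatterns) {
  return disallowPatterns
    .filter((pattern) => pattern !== "") // an empty "Disallow:" blocks nothing
    .some((pattern) => patternToRegExp(pattern).test(pathAndQuery));
}

// The empty "Disallow:" and the affid rule quoted in this post.
const rules = ["", "/?affid=*"];
console.log(isBlocked("/?affid=123", rules)); // true
console.log(isBlocked("/?affId=123", rules)); // false

// Listing both spellings covers both URLs.
const bothCases = [...rules, "/?affId=*"];
console.log(isBlocked("/?affId=123", bothCases)); // true
```

In this sketch, the fix is simply one extra Disallow line for each capitalisation of the parameter that actually turns up in your URLs.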

Andy Blackburn – SEO Consultant

A collection of SEO, tech and other thoughts…