It’s been a few weeks since Google announced support for canonical links using HTTP headers. Aside from the obligatory articles announcing the news, the SEO community has been pretty quiet about this new development — and I’m a little surprised.
This is a really big deal that could positively impact thousands of sites, yet I haven’t seen any tutorials cropping up on *how* to set the header.
I suspect the silence is largely due to people’s uncertainty about when or how to use canonical HTTP headers, so I’ll take some time to explain both when to use them and how to use them (scroll to the bottom of the post for a code sample if that’s what you’re after).
When to use Canonical HTTP Headers
In a nutshell, you should use canonical HTTP headers whenever you need to set a canonical tag on a non-HTML document.
If you’re still not sure when to use them, consider this example:
A lot of businesses like to publish case studies on their sites, and they often publish two versions: an HTML version and a PDF version. This is fine — it’s nice to offer your users multiple formats so they can choose the one that’s best for them.
The only problem is that PDF versions often outrank the HTML versions, which is OK, but it’s less than ideal for many sites. PDF downloads won’t show up in your analytics (unless you’re looking at your web server logs, which few site owners do), and they don’t contain lead forms or the other nice things that our websites use to generate income. In short, you’re potentially missing out on a lot if your HTML content is getting outranked by a PDF. Canonical HTTP headers will help with that.
Tutorial: How to Use HTTP Header Canonicals with htaccess
For a lot of people, HTTP headers are something of a mystery: they just sort of magically happen. The good news is that they’re really easy to understand and even easier to control. When it comes to setting the canonical on non-HTML documents, the easiest way for most people to control the headers is through the htaccess file.
For the sake of this example, let’s say that I have a PDF named my-file.pdf in the root of my site — and for some reason, I want to set the canonical to my homepage (not a good idea, but it makes the example simpler).
All I would need to do to make this happen is add the following code to my htaccess file:
[plain]
# The Header directive requires mod_headers to be enabled
<FilesMatch "^my-file\.pdf$">
Header set Link '<http://makeitrank.com>; rel="canonical"'
</FilesMatch>
[/plain]
Once you’ve added that to your htaccess file, you’ll want to test your header to make sure that it’s set correctly.
To test it, pop open the Net tab in Firebug or use a specialized plugin like Live HTTP Headers, and then open your PDF (or other document) in your web browser.
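If you’d rather check the header from a script than from a browser, here is a minimal Python sketch. The function names `extract_canonical` and `fetch_canonical` are my own, and the parsing is deliberately simple: it assumes no commas inside quoted parameters, so treat it as a quick sanity check rather than a full Link header parser.

```python
import re
import urllib.request


def extract_canonical(link_header):
    """Return the URL marked rel="canonical" in a Link header value, or None."""
    if not link_header:
        return None
    # A Link header can hold several comma-separated entries, e.g.
    #   <url-1>; rel="next", <url-2>; rel="canonical"
    # Simplification: parameters are assumed not to contain commas.
    for match in re.finditer(r'<([^>]*)>([^,]*)', link_header):
        url, params = match.group(1), match.group(2)
        if re.search(r'\brel\s*=\s*"?[^";]*\bcanonical\b', params, re.IGNORECASE):
            return url
    return None


def fetch_canonical(url):
    """Send a HEAD request and return the canonical URL from the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        # get_all collects every Link header in case the server sends several
        values = response.headers.get_all("Link") or []
    return extract_canonical(", ".join(values))
```

Calling `fetch_canonical` with the address of your PDF should return the canonical URL you set in htaccess, or `None` if the header isn’t being sent.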
This is what the canonical header looks like for my example (yours should follow the same format):
Link: <http://makeitrank.com>; rel="canonical"

That’s it! I hope that you see the power in controlling canonicals this way — it’s a fantastic way for us SEOs to control the weight of all documents in a way that was never before possible.
Note: This method is fine if you want to set only a few headers (if you have a lot, your htaccess file can get very big very quickly). I have not included an example of handling many canonicals with a single command, because it will need to be algorithmic in nature and the code will be largely dependent on the file structure of your individual site.
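If you do end up with more than a handful of files, one way to keep things manageable (short of a true pattern-based rule) is to generate the per-file blocks from a list instead of typing them by hand. The sketch below is only an illustration: the `canonical_rules` function and the mapping you pass it are hypothetical, and you would still paste the output into your htaccess file yourself.

```python
import re


def canonical_rules(mapping):
    """Build htaccess text with one FilesMatch block per file in mapping.

    mapping: dict of file name -> canonical URL for that file.
    """
    blocks = []
    for filename, canonical_url in sorted(mapping.items()):
        # FilesMatch takes a regular expression, so escape the file name
        # and anchor it to avoid matching similarly named files.
        pattern = "^" + re.escape(filename) + "$"
        blocks.append(
            f'<FilesMatch "{pattern}">\n'
            f"Header set Link '<{canonical_url}>; rel=\"canonical\"'\n"
            "</FilesMatch>"
        )
    return "\n".join(blocks) + "\n"
```

Keep in mind that FilesMatch looks at the file name rather than the full path, so two PDFs with the same name in different folders would get the same canonical with this approach.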


