I just received a memo (dated Feb 29) from our pals at Mountain View. Apparently they want us to start using another tag to specify canonical URLs.
As someone who builds a fair number of websites, I was obviously intrigued, and I'm curious to know whether anybody else has started using it. Is anybody planning to?
It’s interesting for Wikipedia, which intends to have all alternative URLs indexed, e.g.
http://en.wikipedia.org/wiki/Python_language has canonical URL
It might also be useful for paged or sorted content, e.g.
/foobar?sorby=date with canonical
Personally, I prefer to exclude all non-preferred versions of pages from the index by adding
robots.txt rules (for Google this is simple, because it accepts the * and $ wildcards there, though not full regular expressions) and
a robots <meta> tag. It saves me bandwidth, and the result is probably the same.
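To make the robots.txt approach concrete, here is a minimal sketch of how a Google-style wildcard rule could be matched against a URL path: `*` stands for any run of characters and a trailing `$` anchors the end of the path. The function name `robots_rule_matches` and the sample rules are assumptions for illustration; this is a simplification, not a full robots.txt parser.

```python
import re


def robots_rule_matches(pattern: str, path: str) -> bool:
    """Check a path against one robots.txt path pattern (simplified).

    '*' matches any sequence of characters; a trailing '$' means the
    pattern must match up to the end of the path. Otherwise the pattern
    only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Hypothetical rule: "Disallow: /*?sortby=" blocks every sorted variant.
print(robots_rule_matches("/*?sortby=", "/foobar?sortby=date"))
print(robots_rule_matches("/*?sortby=", "/foobar"))
```

The first call reports a match and the second does not, so only the sorted variant would be excluded.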
It’s useful if you have a case-insensitive web server, since links to http://my.site.com/foo.html and http://my.site.com/FOO.html would be treated as two separate documents by search engines (thus diluting that document’s PageRank).
Of course, you can get around this with a 301 redirect to the lower-cased version, but using rel=canonical means your users have one fewer HTTP request to make.
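As a rough sketch of the rel=canonical alternative, the snippet below lower-cases the host and path of a requested URL and builds the matching `<link>` tag for the page head. The helper names `canonical_url` and `canonical_link_tag` are made up for this example, and it assumes lower-case is the preferred form on your server.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its host and path lower-cased.

    The query string and fragment are left untouched, since their
    case may be significant to the application.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(),
         parts.query, parts.fragment)
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a requested URL."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("http://my.site.com/FOO.html"))
# prints: <link rel="canonical" href="http://my.site.com/foo.html">
```

Both /foo.html and /FOO.html would then carry the same tag, with no redirect involved.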
I’m already using the Content-Location header correctly. I won’t be in any hurry to implement Google’s new “standard”.
If you have multiple versions of the same content (maybe a print view, or different sort orders) then it seems to be very handy. I’ve used it on a couple of pages, but I haven’t got any data to show its impact yet.
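For the print-view and sort-order case, one possible approach is to derive the canonical URL by dropping the query parameters that only change presentation. This is a sketch under assumptions: the parameter names in `PRESENTATION_PARAMS` and the function `strip_presentation_params` are hypothetical, and you would substitute whatever your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that change presentation, not content.
PRESENTATION_PARAMS = {"sortby", "order", "print"}


def strip_presentation_params(url: str) -> str:
    """Return the URL without presentation-only query parameters.

    Remaining parameters keep their original order; the fragment is
    dropped because it is never sent to the server.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(strip_presentation_params("http://example.com/foobar?sortby=date&page=2"))
# prints: http://example.com/foobar?page=2
```

The result can be used as the href of the canonical link on every sorted or printable variant of the page.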