Websites sometimes serve identical pages at two or more URLs, which leaves search engines unsure which page to prioritize in rankings. Resolving that is the purpose of a canonical tag: rel="canonical". Placed on the duplicates and pointing to the original, it tells Google, Bing, and others which identical (or near-identical) page to rank.
Yet the tag is often misunderstood and misused. What follows are dos and don’ts for deploying canonical tags.
Not Definitive
A canonical tag is only a hint to Google. It’s not definitive and shouldn’t be the first choice when correcting duplicate content. Google uses many signals to pick the representative URL. The content owner’s instruction is only one of them.
Others include:
- Internal links. The duplicate page receiving the most internal links is presumably the most important.
- XML sitemaps. Duplicate pages listed in a sitemap typically take priority over versions not in the sitemap.
- Encryption. Google usually chooses the https version over http.
- Amount and quality of a page’s content. Google refers to a page’s primary content as the “centerpiece.” When the centerpiece is similar or identical to other pages, Google tries to determine which is more useful and selects that page in search results.
Google may use a combination of the above signals. And pointing a rel="canonical" tag (e.g., <link rel="canonical" href="https://www.xyz.com/heres-an-article">) from one page to another is likely pointless (to Google) if the site structure suggests otherwise.
If Google overrides your rel="canonical" tag, Search Console will list the affected URLs at Indexing > Pages, under “Duplicate, Google chose different canonical than user,” and explain why.
Google’s overriding of canonical tags is common and does not necessarily indicate a serious problem. The exceptions are when Google chooses the wrong URL, or your site has material architectural flaws, such as linking to lesser internal URLs. Still, check the report frequently and fix duplication glitches.
In a recent LinkedIn post, Google’s Gary Illyes posed a hypothetical canonicalization conflict:
You have a rel=canonical pointing from A to B, but A is HTTPS, it’s in your hreflang clusters [assigning a language version to a specific region], all your links are pointing to A, and A is included in your sitemaps instead of B. Which one should search engines pick as canonical, A or B?
If you just change the URLs from A to B in your sitemaps and hreflang clusters, combined with that rel=canonical it might already be enough to tip over canonicalization to B. Change the links also, and you have an even greater chance to convince search engines about your canonical preference.
In other words, the more signals Google receives for a canonicalization preference, the better the chances it picks the intended page. Still, Google may ignore the signals and choose what it thinks is the best option. For example, if multiple signals prioritize a page’s desktop version, Google may still serve a mobile version to a mobile user.
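To make the idea of agreeing signals concrete, here is a minimal Python sketch that tallies which URL each signal points to and lists the ones that disagree with the canonical tag. The signal names, the example.com URLs, and the simple counting are all assumptions for illustration. Google does not publish how it weighs these signals, so this is a consistency check for a site owner, not a model of Google’s decision.

```python
from collections import Counter

# Hypothetical signals for the scenario above: only rel=canonical favors B.
signals = {
    "rel_canonical": "https://www.example.com/b",
    "sitemap": "https://www.example.com/a",
    "hreflang": "https://www.example.com/a",
    "internal_links": "https://www.example.com/a",
}


def tally_signals(signals):
    """Count how many signals point at each candidate URL."""
    return Counter(signals.values())


def conflicting(signals, preferred):
    """Return the names of signals that disagree with the preferred URL."""
    return sorted(name for name, url in signals.items() if url != preferred)


preferred = signals["rel_canonical"]
print(tally_signals(signals).most_common())
print(conflicting(signals, preferred))
```

Each name in the second list is a place to update so that it matches the canonical tag, which is the fix Illyes describes.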
Illyes has also stated that canonical tags should use absolute URLs to be recognized by search engines.
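One way to audit this is to pull the canonical href out of a page’s HTML and check that it is an absolute URL. The sketch below uses only Python’s standard library. The class and function names are made up for illustration, and the sample markup reuses the placeholder URL from earlier in this article.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collect href values from <link rel="canonical"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonicals(html):
    """Return every canonical href found in the HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs


def is_absolute(url):
    """True when the URL has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


good = '<head><link rel="canonical" href="https://www.xyz.com/heres-an-article"></head>'
bad = '<head><link rel="canonical" href="/heres-an-article"></head>'

for page in (good, bad):
    for href in find_canonicals(page):
        print(href, "absolute" if is_absolute(href) else "relative")
```

A page that returns more than one canonical href is also worth a look, since the tags would be sending mixed instructions.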
Duplicate Content Only
Another frequent mistake of website owners is attempting to direct signals using rel="canonical" even when there’s no duplicate content. For example, I’ve seen owners build external links to an on-site infographic and then use canonical tags to redirect that link equity to a lead generation page.
Google would treat the infographic and lead-generation pages differently because there’s no duplicate content.
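A rough self-check before adding a canonical tag is to compare the main text of the two pages. The sketch below uses Python’s difflib for a word-level similarity ratio. The 0.9 threshold and the sample text are assumptions chosen for illustration, and this is not how Google measures duplication.

```python
from difflib import SequenceMatcher

# Assumed cutoff for "near-identical"; tune it for your own content.
NEAR_DUPLICATE_THRESHOLD = 0.9


def similarity(text_a, text_b):
    """Return a 0-to-1 ratio of how alike two blocks of text are, word by word."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def canonical_makes_sense(text_a, text_b):
    """True when the two texts are similar enough to count as duplicates."""
    return similarity(text_a, text_b) >= NEAR_DUPLICATE_THRESHOLD


# Hypothetical page copy for the infographic and lead-generation example.
infographic = "Infographic: how shipping costs changed over ten years"
lead_gen = "Request a demo and talk to our sales team today"

print(canonical_makes_sense(infographic, infographic))
print(canonical_makes_sense(infographic, lead_gen))
```

If the ratio falls well below the threshold, the pages are different content, and a canonical tag between them is the misuse described above.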
Bottom line, Google knows canonicalization. A website owner can nonetheless ensure Google’s choices are correct with the proper use of canonical tags. But a better practice is avoiding duplicate content altogether.