SEO: HTML, XML Sitemaps Explained

There are two types of sitemaps: HTML and XML. HTML sitemaps guide visitors, mostly. XML sitemaps guide search engine bots, to ensure they find a site’s URLs to index. Understanding the strengths and weaknesses of each will help with your search engine optimization.

XML Sitemaps

XML makes information machine-readable. XML sitemaps provide search engines with an efficient list of the URLs on a site.

XML sitemaps are just text files marked up with tags that identify types of data. The URL for an XML sitemap is typically at the root of a domain — for example, www.example.com/sitemap.xml — ready for bots to access.
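To make "text files marked up with tags" concrete, here is a minimal Python sketch that writes a bare-bones sitemap using only the standard library. The function name `build_sitemap` and the example.com URLs are illustrative assumptions; a real site would normally rely on its platform's own sitemap generator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    # Serialize the sitemap namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
print(build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/products/",
]))
```

Each page becomes one `<url>` element with a required `<loc>` child. The optional fields are added as further children of `<url>`.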

Consider the screenshot below. It’s the product XML sitemap for Tiffany & Co. It contains 81,266 rows of data for 4,829 product URLs.

The product XML sitemap for Tiffany & Co., showing the data for two products.

The Tiffany sitemap has four types of data for each product URL. For example, for the URL https://www.tiffany.com/jewelry/rings/tiffany-diamond-wedding-band-GRP00001/ (a diamond wedding band), we see:

  • lastmod. When the URL’s content was last updated.
    <lastmod>2019-03-19</lastmod>
  • changefreq. How often the content is typically changed.
    <changefreq>monthly</changefreq>
  • priority. Assigns a numerical value from 0 to 1 that represents the importance of that content. The highest value is 1. It is typically reserved for the home page and top landing pages. Setting every page to 1 will cause the search engines to ignore the field entirely.
    <priority>0.4</priority>
  • hreflang. Identifies the URLs targeting other languages. This is an optional attribute that is usually found on a web page, but can also be included in XML sitemaps.
    <xhtml:link rel="alternate" hreflang="en-gb" href="https://www.tiffany.co.uk/jewelry/rings/tiffany-diamond-wedding-band-GRP00001/" />
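The four fields above can be read programmatically. The Python sketch below assembles them into a single `<url>` entry and parses it with the standard library. The way the tags are nested in `SAMPLE` is a reconstruction based on the sitemaps.org format, not a copy of Tiffany's actual file, and `parse_entries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Reconstructed entry using the values quoted in the article.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.tiffany.com/jewelry/rings/tiffany-diamond-wedding-band-GRP00001/</loc>
    <lastmod>2019-03-19</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.4</priority>
    <xhtml:link rel="alternate" hreflang="en-gb" href="https://www.tiffany.co.uk/jewelry/rings/tiffany-diamond-wedding-band-GRP00001/" />
  </url>
</urlset>"""


def parse_entries(xml_text):
    """Return one dict per <url> entry in the sitemap."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        # Map each hreflang code to its alternate URL.
        alternates = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NS)
            if link.get("rel") == "alternate"
        }
        priority = url.findtext("sm:priority", default=None, namespaces=NS)
        entries.append({
            "loc": url.findtext("sm:loc", default="", namespaces=NS).strip(),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
            "priority": float(priority) if priority is not None else None,
            "alternates": alternates,
        })
    return entries


for entry in parse_entries(SAMPLE):
    print(entry)
```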

When a bot visits a site, it first accesses the robots.txt file, which is a list of instructions, including the URLs to crawl or ignore. The robots.txt file should reference your XML sitemap, which in turn sends the bot off to crawl the list of URLs.
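As a rough illustration of that handoff, the sketch below feeds a made-up robots.txt to Python's standard `urllib.robotparser` module and reads back the sitemap reference and the crawl rules. The file contents and example.com URLs are assumptions for the example, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawl rule plus a sitemap reference.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap URLs declared in the file (None if there are none).
print(parser.site_maps())

# Whether a generic bot may crawl each URL under the rules above.
print(parser.can_fetch("*", "https://www.example.com/checkout/cart"))
print(parser.can_fetch("*", "https://www.example.com/jewelry/rings/"))
```

The `Sitemap:` line is independent of the `User-agent` groups, so it can appear anywhere in the file.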

XML sitemaps follow precise markup rules. Ideally, the XML sitemap is generated automatically, without human intervention. Check it regularly for errors, though, because outdated, inaccurate, and duplicate URLs creep in with surprising frequency.
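A regular check can be partly automated. The following Python sketch flags two of the problems mentioned above: URLs listed more than once and URLs that point at an unexpected host. The helper name `audit_sitemap` and the sample data are hypothetical, and spotting outdated URLs would additionally require requesting each page, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap containing one duplicate and one off-site URL.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/rings/</loc></url>
  <url><loc>https://www.example.com/rings/</loc></url>
  <url><loc>https://staging.example.com/necklaces/</loc></url>
</urlset>"""


def audit_sitemap(xml_text, expected_host):
    """Report duplicate URLs and URLs whose host is not expected_host."""
    root = ET.fromstring(xml_text)
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)]
    counts = Counter(locs)
    duplicates = sorted(url for url, n in counts.items() if n > 1)
    wrong_host = sorted(
        {url for url in locs if urlsplit(url).hostname != expected_host}
    )
    return {"duplicates": duplicates, "wrong_host": wrong_host}


print(audit_sitemap(SAMPLE, "www.example.com"))
```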

XML sitemaps have limits, including:

  • No guarantee of indexation. XML sitemaps merely recommend the URLs you want search engines to crawl and index. Search engines make it clear that they may not index every page or even crawl it.
  • No link authority passed. Unlike HTML links, the URLs in XML sitemaps do not pass link authority. Search engines are unlikely to rank a URL that they encounter only in the XML sitemap.

HTML Sitemaps

In contrast to XML sitemaps, HTML sitemaps are collections of formatted links, usually at the bottom of a web page, that show readers what’s on a site. HTML sitemaps typically have limited SEO value.

Before the rise of header-based navigational rollovers, which provide visitors with deep access into a site, HTML sitemaps were helpful. They offered bots shortcut links to pages, which passed link authority and thus boosted rankings.

Today, many HTML sitemaps simply replicate links already available in the header or footer. A few sites, to be sure, still use HTML sitemaps for primary navigation.

In the screenshot below, Tiffany uses its HTML sitemap to link to pages that drive revenue but aren’t accessible via its header and footer navigation. Doing so adds a small amount of organic search visibility to those pages.

In its HTML sitemap, Tiffany links to high-value pages that lack a place in the site’s header navigation.

HTML sitemaps can add SEO value in limited circumstances, including:

  • Primary site navigation does not link to all pages.
  • Navigation, or a section of the site, is inaccessible to search engines.
  • The pages in the HTML sitemap are important but would otherwise be buried deeply in the navigational structure. Examples include individual FAQ pages, support pages, holiday or event pages, and articles.
  • Analytics data shows that visitors are using the HTML sitemap. (If so, investigate what is forcing them to bypass the site’s primary navigation.)
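One way to test the first circumstance is to compare a list of the site's URLs (for instance, those in the XML sitemap) against the links in the navigation markup. The Python sketch below does that with the standard library. The class and function names, the navigation HTML, and the example.com URLs are all illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href> in a chunk of HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the site's base URL.
                self.links.add(urljoin(self.base_url, href))


def pages_missing_from_nav(nav_html, site_urls, base_url):
    """Return the site URLs that the navigation HTML does not link to."""
    collector = LinkCollector(base_url)
    collector.feed(nav_html)
    return sorted(set(site_urls) - collector.links)


# Hypothetical navigation markup and URL list.
NAV_HTML = '<nav><a href="/jewelry/">Jewelry</a><a href="/gifts/">Gifts</a></nav>'
SITE_URLS = [
    "https://www.example.com/jewelry/",
    "https://www.example.com/gifts/",
    "https://www.example.com/faq/returns/",
]

print(pages_missing_from_nav(NAV_HTML, SITE_URLS, "https://www.example.com/"))
```

Pages that show up in the result are candidates for an HTML sitemap or, better, for a place in the primary navigation.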

There’s no harm to SEO in having an HTML sitemap. It’s another form of helpful internal linking. But beware of placing too much priority on an HTML sitemap. If you want a page to drive natural search traffic and conversions, link to it in the site’s primary navigation.

Traffic Driver?

In short, HTML and XML sitemaps serve their purposes. But neither will drive your organic search traffic to new heights. For that, optimize your navigation. It will help visitors — and search bots.

Jill Kocher Brown