SEO: HTML, XML Sitemaps Explained
Sitemaps come in two flavors: HTML and XML. Each has different uses and values for search engine optimization.
HTML sitemaps are primarily designed to help guide shoppers. XML sitemaps are used solely to help search engine crawlers discover and index the URLs on a site. Each sitemap has unique strengths and weaknesses when it comes to SEO, so it’s important to understand their roles when mapping out your SEO plans.
XML Sitemaps and SEO
Because XML sitemaps are more straightforward and typically less understood in the marketing world, I’ll start there. XML stands for Extensible Markup Language. It’s similar to HTML and defined by the same governing body. But XML is used primarily to make information machine readable, while HTML is used primarily to mark up text files with the formatting and linking tags that form the basis of web pages. In SEO, XML is typically used for lists of URLs and the data associated with them.
An XML sitemap is a type of list marked up with XML so that search engines can easily consume information about the URLs that make up a site. This is what an XML sitemap looks like.
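As a rough illustration, the Python sketch below builds a minimal sitemap using the urlset, url, loc, and lastmod elements of the sitemaps.org protocol. The URLs, dates, and the build_sitemap helper are hypothetical and exist only for this example; a real platform or plugin would normally generate the file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for (url, last_modified) pairs.

    build_sitemap is a hypothetical helper written for this example.
    last_modified may be None, in which case <lastmod> is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs and dates, for illustration only.
example_entries = [
    ("https://www.example.com/", "2024-01-15"),
    ("https://www.example.com/paint/", None),
]

sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml)
```

Each url element carries one page address in loc. The optional lastmod value is one example of the data associated with a URL.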
Search engines and other crawlers are the only consumers of XML sitemaps. For SEO, an XML sitemap is an invitation to crawl the URLs listed. It’s a way of asking the search engines to crawl and index the pages listed.
There are some important limitations to XML sitemaps.
- XML sitemaps do not guarantee indexation. They merely recommend the URLs you would like the search engines to crawl and index.
- XML sitemaps do not convey authority. The URLs listed do not pass link authority, like an HTML link on your web site would.
XML sitemaps are not a strong asset in improving rankings. If the only place a search engine encounters a URL is the XML sitemap, it’s highly unlikely that that URL will rank. It may get indexed, but it will not have the authority that HTML links pass to a page. In essence, the page will still be orphaned — unlinked — in the site and will not perform well.
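One way to spot pages in this position is to compare the URLs in the XML sitemap against the URLs that receive at least one internal HTML link. The sketch below assumes you already have that second set, for example from a site crawl. The sample sitemap, the URLs, and the sitemap_only_urls helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    }


def sitemap_only_urls(xml_text, internally_linked):
    """Return sitemap URLs that no internal HTML link points to.

    internally_linked is assumed to come from a crawl of the site.
    """
    return sitemap_urls(xml_text) - set(internally_linked)


# Hypothetical sitemap and crawl data, for illustration only.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/paint/</loc></url>
  <url><loc>https://www.example.com/clearance-2019/</loc></url>
</urlset>"""

linked = {"https://www.example.com/", "https://www.example.com/paint/"}

orphans = sitemap_only_urls(sample, linked)
print(sorted(orphans))
```

Any URL this reports appears in the XML sitemap but is still orphaned in the site, so it is a candidate for an HTML link from navigation or related pages.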
XML sitemaps follow very precise markup rules and are typically produced by developers. Ideally, the XML sitemap is generated and pushed live automatically on a weekly basis without any human intervention. This functionality would be enabled at the platform level via a built-in feature, a plugin, or some other piece of third-party software. When XML sitemaps require manual effort to generate, update, or post, they tend to become low priorities or get forgotten.
To learn more about how to generate XML sitemaps for SEO, see the Google Search Console help file “Learn about sitemaps.”
HTML Sitemaps and SEO
Conversely, HTML sitemaps are the ones you’re likely used to seeing as a standard part of the site. They tend to be linked from the footer and are usually included more as a nod to legacy website practices than anything else.
Before the rise of rollovers in navigation, which enabled many more navigation options right from the header on every page, HTML sitemaps were a necessary way of quickly navigating deeper into the site. Those links also helped the deeper pages perform more strongly for SEO. Today, an HTML sitemap is usually nothing more than a regurgitation of the links in the header and footer.
An HTML sitemap is also limited in its ability to pass link authority because it is just one page. If the HTML sitemap is linked to in the footer, as it typically is, then every page on the site passes a little bit of its own authority to that HTML sitemap. In turn, the HTML sitemap passes a little of its own link authority to every page that it links to.
Modern rollover navigation typically links directly to the pages that would be found on an HTML sitemap. The image below shows Menards’ HTML sitemap overlaid with the header navigation rollover for the “Paint” category. Note how the same links are shown in the HTML sitemap and the header rollover.
As a result, every single page on the site is linking directly to the pages that in the past would only have been linked to from that single HTML sitemap page. In the process, the links in the header navigation pass much more link authority than a single HTML sitemap would.
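A toy comparison makes the difference concrete. The sketch below counts how many distinct pages link to a deep page in two hypothetical scenarios: linked only from the HTML sitemap, and linked from header navigation on every page. The page paths and the inbound_counts helper are invented for this example, and a raw inbound link count is only a rough stand-in for link authority, not the way search engines calculate it.

```python
from collections import Counter


def inbound_counts(links_by_page):
    """Count how many distinct pages link to each target page.

    links_by_page maps a page path to the paths it links to.
    Self-links and duplicate links on one page are ignored.
    """
    counts = Counter()
    for source, targets in links_by_page.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


# Hypothetical four-page site, for illustration only.
pages = ["/", "/paint/", "/tools/", "/sitemap/"]
deep_page = "/paint/interior/"

# Scenario 1: every page links to the HTML sitemap in its footer,
# and only the HTML sitemap links to the deep page.
sitemap_only = {page: ["/sitemap/"] for page in pages}
sitemap_only["/sitemap/"] = [deep_page]

# Scenario 2: header navigation on every page links to the deep page.
with_nav = {page: ["/sitemap/", deep_page] for page in pages}

sitemap_only_inbound = inbound_counts(sitemap_only)[deep_page]
with_nav_inbound = inbound_counts(with_nav)[deep_page]
print(sitemap_only_inbound, with_nav_inbound)
```

On a real site with thousands of pages, the gap between one linking page and every linking page is correspondingly larger.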
That’s not to say that HTML sitemaps have no value. They can be valuable if:
- Your current navigation is limited in the number of pages to which it can link;
- Your current navigation or a section of the site is inaccessible to search engines based on the way it has been developed;
- The pages linked to are important enough to merit a more visible link higher up in the site but would otherwise be buried deeply in the navigational structure. FAQ pages, support pages, and articles are the most common beneficiaries;
- You have clear evidence from your analytics or testing showing that visitors are using the HTML sitemap. However, if they are using it, you may also want to investigate what isn’t working on the site that forces them to use the HTML sitemap to navigate.
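Several of these conditions come down to whether the HTML sitemap links to pages that the navigation does not. A minimal way to check, assuming you have the HTML of the navigation and of the sitemap page, is to extract the links from each and compare them. The markup, the paths, and the LinkCollector class below are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML fragment."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


# Hypothetical markup, for illustration only.
nav_html = '<nav><a href="/paint/">Paint</a><a href="/tools/">Tools</a></nav>'
sitemap_html = (
    '<ul><li><a href="/paint/">Paint</a></li>'
    '<li><a href="/tools/">Tools</a></li>'
    '<li><a href="/support/faq/">FAQ</a></li></ul>'
)

only_in_sitemap = extract_links(sitemap_html) - extract_links(nav_html)
print(sorted(only_in_sitemap))
```

If the result is empty, the HTML sitemap only repeats the navigation. Any paths it does return are the pages for which the sitemap is the sole internal link.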
There’s no harm to SEO in having an HTML sitemap. Because it’s another form of internal linking, it will do some small amount of good. Just beware of placing too much priority on optimizing that HTML sitemap. If you really want a page to drive natural search traffic and conversions, it needs to be linked to in the site’s navigation — as well as included in the HTML sitemap.
In summary, both HTML and XML sitemaps serve their purposes. But neither is the single tactic that drives your SEO performance to new heights. Sitemaps alone will not drive traffic and conversions. Navigational links are required to pass the relevance and authority signals required for SEO visibility. Understanding the differences and the amount of value each type of sitemap brings will help you prioritize limited resources.