We rely on search engines to drive shoppers to our sites to purchase our products. Before the shoppers can come, however, we have to let the bots in.
It sounds more like science fiction than marketing, but everything in natural search depends upon search engines’ algorithms and their bots’ ability to collect information to feed into those algorithms.
Think of a bot as a friendly little spider — one of the other names we commonly give bots — that comes to your website and catalogs everything about it. That bot starts on one page, saves the code, identifies every link within that code, and sends the code home to its datacenter. Then it does the same for all of the pages that the first page linked to: it saves the code for each of those pages, identifies every page that each of them links to, moves on, and so on.
That’s a bot in its most basic form. All it does is collect information about a website, and every bot is capable of crawling basic HTML.
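To make that fetch-and-follow loop concrete, here is a minimal sketch of a crawler in Python, using only the standard library. The starting URL and page limit are hypothetical, and a real search engine bot layers politeness rules, robots.txt checks, and far more sophistication on top of this skeleton.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, max_pages=50):
    """Breadth-first crawl: fetch a page, save its code, queue its links."""
    seen = set()
    queue = deque([start_url])
    saved = {}  # url -> raw HTML, the "code sent home to the datacenter"

    while queue and len(saved) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue  # unreachable page; a real bot would retry later
        saved[url] = html

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            # Stay on the same site; this sketch only maps one store.
            if urlparse(absolute).netloc == urlparse(start_url).netloc:
                queue.append(absolute)
    return saved


# Hypothetical starting point, for illustration only.
# pages = crawl("https://www.example-store.com/")
```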
Some bots can crawl and find links within more complex forms of code like navigational links embedded in some forms of JavaScript. Others can analyze a page as a browser renders it to identify areas to crawl or elements that could be spam for further review. New bots are in development every day to crawl more content, faster and better.
But all bots have their limits. If your content falls outside those limits, it does not get collected and cannot be eligible for rankings. And if your content is not collected for analysis by the search algorithms, you will not receive natural search shoppers to that content.
Bots have to be able to collect something for it to appear in search rankings.
Content that can only be seen after a form is filled out will not get crawled. Don’t think you have form entry on your site? The navigation in some ecommerce sites is coded as a form: each link clicked is actually a form entry selected, like checking a box or choosing a radio button. Depending on how it was coded, it may or may not actually be crawlable.
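To see why that matters, here is a rough illustration of how a basic link extractor, the kind of parsing a simple bot does, views two hypothetical navigation snippets: one built from plain anchor links, one built as a form. The HTML is invented for the example.

```python
from html.parser import HTMLParser

# Two hypothetical navigation snippets: one uses plain anchor links,
# the other submits a form (a <select> of category values).
ANCHOR_NAV = """
<nav>
  <a href="/category/shoes">Shoes</a>
  <a href="/category/boots">Boots</a>
</nav>
"""

FORM_NAV = """
<form action="/navigate" method="post">
  <select name="category">
    <option value="shoes">Shoes</option>
    <option value="boots">Boots</option>
  </select>
  <input type="submit" value="Go">
</form>
"""


class LinkExtractor(HTMLParser):
    """A basic bot's view of a page: it only follows <a href> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += [value for name, value in attrs if name == "href"]


for label, snippet in [("anchor nav", ANCHOR_NAV), ("form nav", FORM_NAV)]:
    parser = LinkExtractor()
    parser.feed(snippet)
    print(label, "->", parser.links)

# anchor nav -> ['/category/shoes', '/category/boots']
# form nav   -> []   (nothing for a basic crawler to follow)
```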
Controlling Bots
We sometimes place barriers in web content deliberately. We love to try to control the bots — go here but not here; see this, don’t look at that; when you crawl here, the page you really want is here.
“Good” bots, such as the major search engines’ crawlers, respect something called the robots exclusion protocol. The exclusions you might hear about (such as disallows in the robots.txt file and the meta robots noindex tag) fall into this category. Some exclusions are necessary: we wouldn’t want the bots in password-protected areas, and we don’t want the duplicate content that nearly every ecommerce site has to hurt SEO performance.
But we can get carried away with exclusions and end up keeping the bots out of content that we actually need to have crawled, such as products that shoppers are searching for.
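One quick way to catch an overreaching exclusion is to test a handful of representative URLs against your robots.txt rules. The sketch below uses Python’s standard robotparser; the rules and URLs are hypothetical, chosen to show how a disallow meant for one area can accidentally block every product page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the second rule was meant to hide an internal
# catalog admin tool but accidentally blocks every /catalog/ product URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /catalog/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

urls = [
    "https://www.example-store.com/checkout/cart",
    "https://www.example-store.com/catalog/red-running-shoes",
]
for url in urls:
    allowed = parser.can_fetch("Googlebot", url)
    print("crawlable" if allowed else "BLOCKED", url)

# BLOCKED https://www.example-store.com/checkout/cart                (intentional)
# BLOCKED https://www.example-store.com/catalog/red-running-shoes    (accidental: products can't rank)
```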
So how do you know whether you’re excluding the bots on your site? The answer, uncomfortably, is that unless you really know what you’re looking for in the code of the page, and you have the experience to determine how the bots have treated code like that in the past, you really don’t know. But you can tell for certain when you don’t have a problem, and that’s a good place to start.
Head to the natural-search entry-page report in your web analytics. Look for whole types of URLs or page names that are absent. Are you getting natural search traffic to your category pages? How about the faceted navigation pages? Products? If you’re getting natural search traffic to multiple pages within a type of page, then you (almost certainly) don’t have a crawling or accidental robots exclusion issue there.
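If your analytics tool lets you export that entry-page report, a few lines of Python can tally visits by page type so the gaps stand out. The file name and column names below are assumptions; your export will differ.

```python
import csv
from collections import Counter
from urllib.parse import urlparse


def entry_pages_by_type(csv_path):
    """Tally natural-search entry visits by the first URL path segment.

    Assumes a CSV export with "entry_page" and "visits" columns,
    which are hypothetical names; real exports vary by analytics tool.
    """
    totals = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            path = urlparse(row["entry_page"]).path
            segment = path.strip("/").split("/")[0] or "(home)"
            totals[segment] += int(row["visits"])
    return totals


# totals = entry_pages_by_type("natural_search_entry_pages.csv")
# for page_type, visits in totals.most_common():
#     print(f"{page_type:20} {visits}")
# A page type that never appears in the tally (e.g., "catalog") is a red flag.
```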
If you are missing natural search traffic to an entire segment of pages, you have a technical issue of some kind. Diagnosing that issue starts with the bots: assess whether they can access those pages.
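A rough first pass at that assessment, sketched below, is to fetch a few sample URLs from the missing segment and check the two most common accidental blockers: a non-200 response and a meta robots noindex tag. The sample URL is hypothetical.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.request import urlopen


class MetaRobotsFinder(HTMLParser):
    """Pulls the content of any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.directives.append(a.get("content") or "")


def check_page(url):
    """Return (HTTP status, whether the page carries a noindex directive)."""
    try:
        response = urlopen(url, timeout=10)
    except HTTPError as err:
        return err.code, False  # 4xx/5xx: the bot can't collect this page at all
    html = response.read().decode("utf-8", "replace")
    finder = MetaRobotsFinder()
    finder.feed(html)
    noindexed = any("noindex" in d.lower() for d in finder.directives)
    return response.status, noindexed


# Hypothetical URL from the page type that gets no natural search traffic.
# status, noindexed = check_page("https://www.example-store.com/catalog/red-running-shoes")
# print("status:", status, "noindex:", noindexed)
```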
No Bots, No Rank
SEO is conceptually simple: Performance is based on the central concepts of contextual relevance (what your words say and mean) and authority (how many important sites link to your site, signaling its importance). For more about relevance and authority, read my recent article, “To Improve SEO, Understand How It Works.” And always remember this: If the bots can’t crawl a site completely to feed its content into the algorithms, then that site can’t possibly rank well. In fact, crawl access is one of the first places to look when a site has a massive, widespread SEO performance issue.
In short, be aware of the abilities of the bots and the limitations that our own sites can accidentally put on them. That way we can open the floodgates to let bots in to collect the relevance and authority signals they need to send us shoppers.