
SEO: 3 Paths to Crawlable JavaScript

When complex site technology blocks search engines’ crawl paths, it’s also blocking natural search revenue. But there are ways to make sure your site welcomes search engines rather than locking them out.

Last week’s primer, “SEO: No, Google Does Not Support Newer JavaScript,” described in layman’s terms some of the reasons that Googlebot, in particular, has trouble with modern JavaScript techniques and frameworks such as AJAX and Angular.

This follow-on article addresses some of the solutions that compensate for bots’ limitations and help drive more natural search traffic and revenue. Many ecommerce sites count on natural search traffic. Thus, ensuring that the bots can access the content and URLs on your site should be a critical concern.

To be sure, halting innovation in ecommerce is not an option. Instead, discuss these workarounds with your developers so that bots can index your innovative site.

Anchors and HREFs

When is a link not a link to a search engine? When it’s coded in JavaScript without an href that pairs a URL with the visible anchor text identifying where the link goes.

This is the biggest issue I come across with ecommerce sites and JavaScript. It might look like a link, and when you click on it you might go somewhere different, but that doesn’t make it a link that search engines can crawl.

If you want to be certain, right-click on the link and select “inspect.” If you don’t see an anchor tag with an href containing an actual URL wrapped around the link text, it isn’t a crawlable link. If you don’t have an option to inspect, you might need to enable developer tools in your browser’s settings or try a free plug-in such as Firebug.
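
The same check can be roughed out in code. The sketch below is only an illustration, not a description of how any search engine evaluates links: the function names isCrawlableHref and crawlableLinks and the sample markup are hypothetical, and a simple regular expression stands in for a real HTML parser. It treats a link as crawlable only when an anchor tag carries an href with an actual URL, rather than an empty value, a bare “#,” or a “javascript:” call.

```typescript
// Hypothetical helper: does this href give a crawler a real URL to follow?
function isCrawlableHref(href: string | null | undefined): boolean {
  if (href == null) return false; // no href attribute at all
  const value = href.trim().toLowerCase();
  if (value === "" || value.startsWith("#")) return false; // empty or fragment-only
  if (value.startsWith("javascript:")) return false; // script call, not a URL
  return true;
}

// Hypothetical helper: collect the crawlable hrefs from a chunk of markup.
// A regex is used for brevity; a production audit would use a real HTML parser.
function crawlableLinks(html: string): string[] {
  const links: string[] = [];
  const anchorPattern = /<a\b([^>]*)>/gi;
  let match: RegExpExecArray | null;
  while ((match = anchorPattern.exec(html)) !== null) {
    const hrefMatch = /(?:^|\s)href\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(match[1]);
    const href = hrefMatch ? (hrefMatch[1] ?? hrefMatch[2]) : null;
    if (isCrawlableHref(href)) links.push(href as string);
  }
  return links;
}

// Sample markup (made up for illustration): only the first element is a crawlable link.
const sampleMarkup = `
  <a href="/dresses/black">Black dresses</a>
  <a href="#" onclick="loadDresses('black')">Black dresses</a>
  <a href="javascript:void(0)">Sale</a>
  <span onclick="go('/sale')">Sale</span>
  <a onclick="go('/shoes')">Shoes</a>
`;

// Only "/dresses/black" passes the check.
console.log(crawlableLinks(sampleMarkup));
```

All five elements in the sample could look and behave like links to a shopper, but only the first one exposes a URL in an href.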

To rank your site, search engines must crawl links to pages on your site. No crawl means no indexation, which in turn means no ranking, no natural search-referred traffic, and no revenue from what could be your largest channel. Focus first on the “crawl” piece of the equation. For search engine optimization, nothing else matters unless the bots can crawl your pages and index them.

Crawlable with pushState()

If the page that is being linked to isn’t even a “page” to a search engine, it won’t crawl the link. Many ecommerce sites use AJAX to load increasingly specific product sets for each filter combination. It’s a compelling user experience, but one that can keep search engines from indexing pages of products that consumers want to buy.

For example, someone searching Google for a black dress won’t likely find one on The Gap because black dresses are not crawlable as a distinct page of content. Macy’s, however, does have a crawlable black dress page.

One easy way to tell if a page is generated with AJAX is to look for a hash mark (#), sometimes called a hashtag, in the URL. Google has stated that it will not crawl and index URLs that contain one.

Regardless, AJAX URLs with and without hash marks can be made crawlable using a technology called pushState(). Don’t let the funky capitalization and parentheses put you off. It’s just a JavaScript function with a single purpose: It uses the HTML5 History API to load a crawlable URL into the browser’s address bar for users while making that URL indexable for search engines.
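
As a rough sketch of what that looks like in practice, the code below builds a plain URL for a filter combination and hands it to pushState(). The function names buildFilterUrl and applyFilters, the “/dresses” path, and the query parameter names are all assumptions made for illustration. The sketch also assumes that the server returns the same filtered content when that URL is requested directly, since pushState() changes only the address the browser shows.

```typescript
// Minimal shape of the browser's History API that this sketch needs.
// In a browser, window.history satisfies it.
interface HistoryLike {
  pushState(data: unknown, unused: string, url?: string | null): void;
}

// Hypothetical helper: turn a filter combination into a hash-free URL.
// Keys are sorted so the same filters always produce the same URL.
function buildFilterUrl(basePath: string, filters: Record<string, string>): string {
  const query = Object.keys(filters)
    .sort()
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(filters[key])}`)
    .join("&");
  return query === "" ? basePath : `${basePath}?${query}`;
}

// Hypothetical helper: after the AJAX call has loaded the filtered products,
// record the matching URL in the address bar without reloading the page.
function applyFilters(
  history: HistoryLike,
  basePath: string,
  filters: Record<string, string>
): string {
  const url = buildFilterUrl(basePath, filters);
  history.pushState({ filters }, "", url);
  return url;
}

// In a browser this might be called as:
//   applyFilters(window.history, "/dresses", { color: "black" });
```

Sorting the filter keys is a design choice in this sketch: it keeps one filter combination from producing several different URLs for the same set of products.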

Prerendering Content

Faster page loads mean higher conversion rates. To deliver faceted content more quickly, many ecommerce sites have switched to client-side rendering techniques that limit the number of trips back and forth to the server to load a page of content. But client-side rendering can slow indexation by months for an ecommerce site, as described in last week’s article.

That delay can hurt revenue. Ensure that search engines can index all of your content, and in a faster timeframe, by “prerendering” client-side content.

Prerendering is especially critical when a site uses a framework such as Angular or React. Yes, Google is behind Angular’s development. But that doesn’t mean that Google can efficiently index Angular sites — quite the opposite in my experience.

Open source solutions for prerendering espoused by Google search engineers include Puppeteer and Rendertron. I’ve also run across as a frequent player for ecommerce.

Some of these technologies allow you to block certain user agents, such as popular browsers or Googlebot, from using the prerendered version. The goal is to allow consumers to use the client-side version of the site while delivering an identical prerendered version to bots and to users with JavaScript disabled. Don’t block Googlebot.
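
To illustrate the routing decision, here is a minimal sketch that picks a version of the page based on the user agent string. It is not taken from Puppeteer, Rendertron, or any other tool: the function name chooseVariant and the short list of bot patterns are hypothetical, and a real setup would follow its prerendering tool’s own configuration. Users with JavaScript disabled cannot be identified from the user agent, so this sketch covers only the bot half of the goal.

```typescript
// Hypothetical, deliberately short list of crawler patterns for illustration.
const BOT_PATTERNS: RegExp[] = [/googlebot/i, /bingbot/i];

type Variant = "prerendered" | "client-side";

// Hypothetical helper: bots get the prerendered HTML, everyone else gets
// the client-side version. Googlebot is on the list, not blocked from it.
function chooseVariant(userAgent: string | undefined): Variant {
  if (!userAgent) return "client-side";
  const isBot = BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
  return isBot ? "prerendered" : "client-side";
}

const googlebotAgent =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
const chromeAgent =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36";

console.log(chooseVariant(googlebotAgent)); // prerendered
console.log(chooseVariant(chromeAgent)); // client-side
```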

Two Google representatives — John Mueller, webmaster trends analyst, and Tom Greenaway, partner developer advocate for indexing of progressive web applications — spoke about search-friendly, JavaScript-powered websites at Google’s annual I/O developer conference last month. Watch the video of their refreshingly forthcoming presentation for a deeper discussion on these topics.

Jill Kocher Brown