Backlinks are the cornerstone of Google’s algorithm. Google’s original name was BackRub, referring to the way the algorithm counted inbound links as votes. The process returned better results than competitors’. People noticed, and Google became the world’s leading search engine.
To rank highly in Google’s search results, websites need links. But it can be hard to gain new links when you’re an online store. Search engine optimizers often say, “Products are not linkable assets.” Not many consumers will link to the products of most brands.
But there is a little-used opportunity for longtime retailers to gain links. If your ecommerce site has changed platforms in its lifetime, you might have editorial links — legitimate, not spam — waiting for you to reclaim.
Links Break with Age
A replatform almost always creates new URLs. This is especially the case for older replatforms from, say, the mid-2000s. Search engine considerations were new, and many retailers did not follow best practices.
So, if a website acquired backlinks throughout the 2000s and a migration then changed all of its URLs, that website should have 301 redirected every old page to its new address. If it skipped this step, it likely broke all of those backlinks. I have seen high-value websites link to retailers’ 404 pages due to missing 301 redirects.
Consider this example. The Wayback Machine tells us that in 2005 Gap.com used a URL structure under secure.gap.com. But none of the now-defunct pages with that structure were redirected. So a user-generated link from Glamour.de no longer resolves. In 2008, Gap’s URL convention changed again. This time 302 redirect chains (instead of 301) were put in place, which may not have passed all the link equity from heavy hitters such as Esquire.com, Polyvore.com, and Askmen.com. These are the opportunities to reclaim.
Broken Link Matrix
There are tools that can go back in time and validate old URLs. I’ve broken this process into two steps.
Step 1: Collecting old URLs. The first step is to collect the URLs for review. Make a long list. There’s no reason to exclude URLs. If there’s any chance the URL you find is not properly redirected, add it to the list.
The Wayback Machine is an excellent resource for finding old URL structures. The Wayback Machine can show more than just a home page. Click through the website links; Wayback has likely captured many pages. Even if it didn’t save an entire page, you’ll be able to see the URL structure through navigation blocks.
Pick a few time periods and crawl the Wayback Machine with a tool such as Screaming Frog or Sitebulb. You can extract thousands of URLs that look something like this:
https://web.archive.org/web/<14-digit timestamp>/http://www.example.com/<legacy path>
Notice the second http? That’s where the legacy URL starts. Using Excel’s Text to Columns tool, separate and delete the first part of the URL, leaving a list of clean legacy URLs for step 2.
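If you would rather script this than use Excel, the same cleanup can be sketched in a few lines of Python. This is a minimal sketch, not a definitive tool: it assumes the crawl export contains standard Wayback capture URLs (a web.archive.org/web/ prefix, a timestamp, then the original address), and the function names and example.com addresses are hypothetical.

```python
import re

# A Wayback capture URL embeds the original address after a timestamp
# (up to 14 digits), optionally followed by a modifier such as "id_".
WAYBACK_PREFIX = re.compile(
    r"^https?://web\.archive\.org/web/\d{1,14}[a-z_]*/", re.IGNORECASE
)


def legacy_url(wayback_url):
    """Return the original URL embedded in a Wayback capture URL, or None."""
    stripped, count = WAYBACK_PREFIX.subn("", wayback_url.strip())
    # Skip rows that are not captures, or whose embedded address has no scheme.
    if count == 0 or not stripped.lower().startswith(("http://", "https://")):
        return None
    return stripped


def extract_legacy_urls(crawl_urls):
    """Strip the Wayback prefix from every row and drop duplicates, keeping order."""
    seen = set()
    result = []
    for row in crawl_urls:
        url = legacy_url(row)
        if url is not None and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

Rows that do not match the assumed pattern return None, so they can be set aside and checked by hand rather than silently mangled.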
Backlink tools such as Ahrefs, Majestic, and Moz’s Link Explorer can also be helpful. These tools crawl the web and publish the links that they find. They often find old links. A quick look through Ahrefs (my tool of choice) shows an Entrepreneur magazine article linking to a now-defunct Gap page.
Another source for finding old link conventions is your site’s analytics. You likely have loads of history there. Pull some old reports from the years where you believe another platform may have been in use. Try to find outdated URLs to add to your list.
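With URLs coming from the Wayback crawl, backlink tools, and analytics exports, the combined list tends to contain many near-duplicates. Here is a small sketch of merging the sources before validation. The normalization rules are my assumptions (lowercase the scheme and host, drop the #fragment, leave the path and query untouched because they can be case-sensitive), and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase the scheme and host and drop the fragment; keep path and query."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def merge_url_lists(*sources):
    """Combine several URL lists into one, removing duplicates and keeping order."""
    seen = set()
    merged = []
    for source in sources:
        for url in source:
            if not url.strip():
                continue  # skip blank rows from spreadsheet exports
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged
```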
Step 2: Validating old URLs. At this point, you may have a lengthy list of URLs. Some may resolve to 404s; some may already redirect properly. Don’t let the volume scare you.
Screaming Frog to the rescue, again. Change Screaming Frog’s crawler to list mode. This allows you to upload (or paste) your entire list of URLs.
Once you click start, Screaming Frog will run through each of the URLs and tell you whether they are properly resolving. If, instead, the URLs resolve to 404 pages, consider 301 redirects.
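For readers who want to see what such a check involves, here is a rough Python sketch of the same idea. It is not how Screaming Frog works internally; it is a minimal illustration using only the standard library, and every function name is hypothetical. It requests each URL without following redirects automatically, records every hop, and flags 404s as well as chains that contain a temporary (302, 303, or 307) redirect.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}
PERMANENT_CODES = {301, 308}


def fetch_status(url, timeout=10):
    """Send one HEAD request, without following redirects; return (status, location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Assumption: the server answers HEAD the same way it answers GET.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=fetch_status, max_hops=10):
    """Follow redirects one hop at a time; return a list of (url, status) pairs."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # Location may be a relative path
        else:
            break
    return hops


def verdict(hops):
    """Summarize a traced chain as ok, broken, or needing review."""
    final_status = hops[-1][1]
    redirect_statuses = [status for _, status in hops[:-1]]
    if final_status in (404, 410):
        return "broken: candidate for a 301 redirect"
    if final_status in REDIRECT_CODES:
        return "review: redirect chain did not resolve"
    if final_status != 200:
        return f"review: final status {final_status}"
    if not redirect_statuses:
        return "ok: resolves directly"
    if all(status in PERMANENT_CODES for status in redirect_statuses):
        return "ok: permanently redirected"
    return "review: temporary redirect in chain"
```

Calling verdict(trace(url)) for each URL on the list gives a first-pass triage. The fetch parameter exists so the tracing logic can be tried against canned responses before it is pointed at a live site.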
Collecting outdated URLs does not mean third-party sites link to them. But keep in mind that backlink tools (Ahrefs, Majestic, Link Explorer) don’t index the entire web. There may be backlinks pointing to these 404s that Google alone tracks. Either way, there’s no harm in redirecting 404s that have no links. In the long run, it might help your website clean the 404s out of Google’s index.
Setting up redirect rules for a lot of links is not always easy. Google wants each redirect to be relevant. Do not redirect URLs in bulk to the home page. Redirect to the same or similar pages. Sometimes a mapping process of matching variables in the old and new URLs can help. But in my experience, manual work is always necessary.
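The variable-matching idea can be sketched as a small set of pattern rules. Everything specific below is hypothetical: the legacy patterns (/browse/product.do?pid=...), the new paths (/products/...), and the function names are invented for illustration, and the sketch assumes product and category IDs survived the migration unchanged. Anything the rules cannot match goes onto a manual-review list instead of being sent to the home page.

```python
import re

# Hypothetical rules: each pairs a legacy path pattern with a new path template.
RULES = [
    (re.compile(r"^/browse/product\.do\?pid=(?P<id>\d+)$"), "/products/{id}"),
    (re.compile(r"^/browse/category\.do\?cid=(?P<id>\d+)$"), "/categories/{id}"),
]


def map_legacy_path(path):
    """Return the new path for a legacy path, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            return template.format(**match.groupdict())
    return None


def build_redirect_map(paths):
    """Split legacy paths into rule-mapped redirects and a manual-review list."""
    mapped = {}
    manual = []
    for path in paths:
        target = map_legacy_path(path)
        if target is None:
            manual.append(path)  # needs a human to pick a relevant target
        else:
            mapped[path] = target
    return mapped, manual
```

The mapped dictionary can then be turned into 301 rules in whatever format your platform or web server expects, while the manual list is the part that, in my experience, never goes away.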
Lastly, reclaiming old links is not foolproof. Some links lose equity over time. A link from an outdated and irrelevant blog post, for example, may not have much link power. Only Google knows. But in SEO, even the smallest signals can make a big ranking difference.