Technical SEO

How to Remove a Web Page from Google

There are multiple reasons for removing a page from Google’s index. Examples include pages with confidential, premium, or outdated info.

Here are options for removing a web page from Google.

Options for Deindexing a Page

Remove the page from your site

To make a page disappear altogether, delete it from your web server. Returning an HTTP status code of 410 (Gone) instead of 404 (Not Found) tells Google the removal is deliberate and permanent. Google also discourages using redirects to remove spammy pages, as a redirect would pass the poor signals on to the surviving destination page.

Google Search Console's Removals tool hides a URL from search results only temporarily (about six months), so it does not replace deleting the page. Once the page is removed, no further action is required. Allow a few days for Google to recrawl the site, discover the 410 code, and remove the page from its index.
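To make the 410-versus-404 distinction concrete, here is a minimal sketch using Python's standard library. The paths in REMOVED_PATHS and the names status_for, RemovedPageHandler, and run are hypothetical, chosen only for illustration. A real site would normally configure this in its web server or CMS instead.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths that were deliberately deleted from the site.
REMOVED_PATHS = {"/old-pricing", "/retired-product"}


def status_for(path: str) -> HTTPStatus:
    """Return 410 for deliberately removed pages and 404 for any other path."""
    # Ignore the query string and a trailing slash when matching.
    clean = path.split("?", 1)[0].rstrip("/") or "/"
    return HTTPStatus.GONE if clean in REMOVED_PATHS else HTTPStatus.NOT_FOUND


class RemovedPageHandler(BaseHTTPRequestHandler):
    """Toy handler: it serves no real pages, only the two error statuses."""

    def do_GET(self):
        status = status_for(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(status.phrase.encode("utf-8"))


def run(port: int = 8000) -> None:
    """Start the toy server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), RemovedPageHandler).serve_forever()
```

Calling run() and requesting /old-pricing should return a 410 response, while any other path returns 404.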

As an aside, Google does offer a form to remove personal info from search results.

Add the noindex tag

Search engines nearly always honor the noindex meta tag. The search bots will crawl the page (especially if it’s linked or in sitemaps) but will not include it in search results.

In my experience, Google will immediately recognize a noindex tag once it crawls the page. Adding the noarchive tag instructs Google to also delete its saved cache of the page.
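The tag itself is a single line in the page's head, for example <meta name="robots" content="noindex, noarchive">. As a rough way to confirm a page actually carries it, the sketch below parses an HTML string and collects the robots directives it finds. The names RobotsMetaParser and robots_directives are hypothetical, and the sample page is made up.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            # The content attribute is a comma-separated list of directives.
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Made-up sample page for illustration.
SAMPLE_PAGE = (
    "<html><head>"
    '<meta name="robots" content="noindex, noarchive">'
    "</head><body>Members only</body></html>"
)
print("noindex" in robots_directives(SAMPLE_PAGE))
```

This only checks the meta tag in the HTML. It does not look at HTTP response headers.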

Password-protect the page

Consider adding a password to retain the page without it being publicly accessible. Google cannot crawl pages requiring passwords or user names.

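Adding a password will not by itself remove an already-indexed page from Google’s index. To exclude the page from search results, add the noindex tag and let Google recrawl the page before the password goes up, since Google cannot read the tag once the page is locked.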
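Most sites add password protection through their web server or CMS settings. Purely as an illustration of what happens underneath, here is a minimal sketch of an HTTP Basic authentication check. The function names and the "editor" / "s3cret" credentials are hypothetical placeholders.

```python
import base64
import binascii
import hmac


def is_authorized(auth_header, username, password):
    """Return True if an HTTP Basic Authorization header matches the credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(auth_header[6:], validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return False
    expected = f"{username}:{password}"
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


def status_for_request(auth_header, username="editor", password="s3cret"):
    """Return 200 for valid credentials, otherwise 401 (credentials required)."""
    return 200 if is_authorized(auth_header, username, password) else 401
```

A crawler sends no credentials, so it would receive the 401 response and never see the page content.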

Remove internal links

Remove all internal links to non-public pages you want deindexed. Moreover, internal links to password-protected or deleted pages hurt the user experience and interrupt buying journeys. Always focus on human visitors, not just search engines.
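Finding those leftover links by hand is tedious. One possible approach, sketched below, scans a page's HTML for links pointing at paths you have removed. The helper names, the example.com URLs, and the removed paths are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_removed(html, page_url, removed_paths):
    """Return hrefs on this page that point at removed paths on the same host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    hits = []
    for href in collector.hrefs:
        # Resolve relative links against the page's own URL.
        target = urlparse(urljoin(page_url, href))
        path = target.path.rstrip("/") or "/"
        if target.netloc == host and path in removed_paths:
            hits.append(href)
    return hits
```

Running this over each page of a site, for example from a crawl export, would give a list of links to clean up.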

Robots.txt Dos and Don’ts

Many people attempt to use the robots.txt file to remove pages from Google’s index. But robots.txt only prevents Google from crawling a page (or category). It does not remove the page from the index.

Pages blocked via the robots.txt file could still be indexed (and ranked). Furthermore, since it cannot access those pages, Google will not encounter noindex or noarchive tags.

Include URLs in the robots.txt file to instruct web crawlers to ignore certain pages or sections (e.g., logins, personal archives, or pages resulting from unique sorting and filtering) and spend their crawl time on the parts you want to rank.
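As a sketch of what such rules look like and how to sanity-check them, the example below feeds a made-up robots.txt to Python's built-in urllib.robotparser. The paths and the example.com URLs are assumptions for illustration. The standard-library parser does simple prefix matching and may not mirror every detail of how Googlebot interprets a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of logins, private archives,
# and internal search/filter result pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login/
Disallow: /archive/private/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def can_crawl(parser: RobotFileParser, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules allow the given agent to crawl the URL."""
    return parser.can_fetch(agent, url)


rules = build_parser(ROBOTS_TXT)
print(can_crawl(rules, "https://www.example.com/login/"))
print(can_crawl(rules, "https://www.example.com/products/shoes"))
```

A False result only means the URL is blocked from crawling. As noted above, a blocked page can still appear in the index.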

Ann Smarty