What Is an Orphan Page?
Do you have pages that have the potential for ranking and organic search traffic but aren’t part of your site structure? Or pages that aren’t supposed to be in your site structure but are discovered by Google anyway?
The answer is most likely yes. At least, it is for the majority of websites and e-commerce sites!
These are known as orphan pages, and re-associating the good ones with your website structure allows you to fully utilize their potential (as does blocking search engine bots from your low-value ones!).
Crawlers can only find a page, and pass link equity to it, if other pages link to it. Think of your site as an actual web for a spider to crawl on: if parts of it are broken, the spider will have difficulty moving from one location to another. Visitors are affected too. When they land on an orphan page, they tend not to stay and instead leave the site entirely. Unlinked pages should be avoided wherever possible.
This article explains what orphan pages are, how to find them on your site, whether they are bad for SEO, and how to fix them.
Let’s get started.
What Is an Orphan Page?
Orphan pages are pages that exist on a website but are not linked to by any other page. Users cannot find them during their website journey because there is not a single link pointing to that page on the entire website.
A dead-end page, on the other hand, is a webpage that does not have any links to other internal or external web pages, resulting in a “dead end.”
Because nothing links to an orphan page, a crawler that follows links cannot reach it, so the page is unlikely to be crawled and indexed. Orphan pages represent lost opportunities to acquire and engage customers, and they can negatively impact your bounce rate.
Orphan pages can cost you page traffic, retention, and revenue, and they can jeopardize your SEO success, so we recommend dealing with them. Crawlers need a path to each of your pages in order to find it.
Search engines, such as Google, typically discover new pages in one of two ways:
- A link from another page is followed by the crawler.
- The crawler discovers the URL specified in your XML sitemap.
So, in order for Google to crawl and index your page, it must first be able to find it.
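To make the second discovery route concrete, here is a minimal sketch of reading the URLs out of an XML sitemap, roughly as a crawler would before visiting them. It assumes a standard sitemaps.org `urlset` file; the function name `urls_from_sitemap` and the sample sitemap content are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/guides/orphan-pages</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE_SITEMAP):
        print(url)
```

A page that appears in this list but has no internal links pointing to it can still be discovered by Google, which is one way an orphan page ends up being crawled.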
How to Find Orphan Pages on Your Site
Several orphan page tools are available to help you identify pages that are floating outside your site structure during a site audit. Let’s have a look at a few of them:
Google Analytics
See which pages are being visited from sources other than the organic channel. You can connect to the Google Analytics API during a crawl and pull analytics data for a specific account, property, view, or segment. Remember to select the ‘Organic Traffic’ segment to find orphan pages that receive organic search traffic. By some estimates, Google updates its ranking algorithm around 9 times per day.
Screaming Frog
To use Screaming Frog for this, make sure it is connected to both Google Search Console and Google Analytics. In the API settings, check the “Crawl New URLs Discovered in Google Analytics” checkbox on the “General” tab. When that’s done, go to “Spider,” check “Crawl Linked XML Sitemaps,” tick “Crawl These Sitemaps,” and then enter the URL of your sitemap. When the crawl and its analysis are finished, you will have a list of all the orphan URLs on your site.
Google Search Console
This tool can show us pages that are generating impressions but no SEO traffic. It’s an intriguing way to discover pages that, despite not being linked from anywhere, are occasionally displayed in the results, especially since Google is commonly said to consider around 200 distinct factors when determining who should rank where.
Two other sources can supply URLs:
- XML sitemap file: useful in cases where your content management system (CMS) can generate an XML sitemap file listing all of the CMS’s pages.
- Server logs: the requests made to your server yield a list of all the HTML URLs recorded in the server log.
This way, we’ll get a list of all known URLs, including those that are bringing in traffic or receiving authority from other domains. If you collect all of these URLs and compare them to a list obtained from a full crawl of your website, you will be able to identify orphan pages, since they appear in the full list but not in the crawl.
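The comparison described above is a set difference: every URL you know about, minus the URLs a link-following crawl reached. Below is a minimal sketch, assuming both lists have already been exported as plain URL strings. The `normalize` step (trimming whitespace and trailing slashes) is an assumption added here so that trivially different spellings of the same URL still match; the sample URLs are hypothetical.

```python
def normalize(url: str) -> str:
    """Trim whitespace and trailing slashes so near-identical URLs compare equal."""
    return url.strip().rstrip("/")


def find_orphans(known_urls, crawled_urls):
    """Return known URLs that the link-following crawl never reached."""
    crawled = {normalize(u) for u in crawled_urls}
    known = {normalize(u) for u in known_urls}
    return sorted(known - crawled)


# Hypothetical data: URLs from sitemaps, logs, and analytics vs. a site crawl.
known = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/old-landing-page/",
]
crawled = [
    "https://example.com",
    "https://example.com/pricing/",
]

if __name__ == "__main__":
    print(find_orphans(known, crawled))
    # prints ['https://example.com/old-landing-page']
```

Real sites may need stricter normalization (for example, handling query strings or mixed-case hosts), which this sketch deliberately leaves out.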
Importing URLs with external links
Pages linked from other domains are worth importing whether or not they contribute traffic. We can use Google Search Console or other tools to analyze these, such as Raven, Semrush, Majestic, Moz Link Explorer, or Ahrefs.
If you prefer to examine a log file, you can use the Screaming Frog SEO Log File Analyzer to find your orphan pages. You can crawl your website in the same way that Googlebot does, and export a list of all the URLs that are discovered.
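If you would rather look at raw logs yourself, the idea can be sketched in a few lines. The example below assumes access logs in the common "combined" format and identifies Googlebot purely by the user-agent string, which is a simplification because user agents can be spoofed. The function name `googlebot_paths` and the sample log lines are hypothetical.

```python
import re

# Matches the request and status part of a combined-format access log line.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def googlebot_paths(log_lines):
    """Collect the paths requested by clients identifying as Googlebot."""
    paths = set()
    for line in log_lines:
        # Assumption: the user-agent string alone identifies Googlebot.
        if "Googlebot" not in line:
            continue
        match = REQUEST_RE.search(line)
        if match:
            paths.add(match.group("path"))
    return paths


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2021:13:55:36 +0000] "GET /old-landing-page HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2021:13:56:02 +0000] "GET /pricing HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

if __name__ == "__main__":
    print(googlebot_paths(SAMPLE_LOG))
```

The resulting paths can then be compared against the URLs found by your own crawl to see which ones Google requests even though your site does not link to them.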
Using Google Sheets: We can use a =VLOOKUP formula in Google Sheets to identify URLs that were not found during our crawl, which in this case would be our orphan URLs.
Although this is a relatively simple task that you can complete in an Excel spreadsheet, it is quite laborious. Crawlers like Screaming Frog and Ryte can be of great assistance here.
Orphan Page FAQ
Are orphan pages bad for SEO?
Orphan pages cause two major search engine optimization (SEO) issues:
- Low traffic and rankings: Even if they have great content, orphan pages rarely rank well in SERPs or receive a lot of organic search traffic.
- Crawl Waste: Low-value orphan pages (such as duplicate pages) can divert the crawl budget away from important pages.
When orphan pages account for a sizable portion of the pages Google explores on your website, for example more than 70%, you get a good idea of how serious the problem is.
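The share of orphan pages among the pages Google explores is simple to compute once you have both URL sets. This is a minimal sketch with made-up numbers; the function name `orphan_share` and the sample paths are hypothetical.

```python
def orphan_share(google_crawled: set, in_structure: set) -> float:
    """Fraction of Google-crawled URLs that are not part of the site structure."""
    if not google_crawled:
        return 0.0
    orphans = google_crawled - in_structure
    return len(orphans) / len(google_crawled)


# Hypothetical data: Google requested 10 URLs, only 3 are linked on the site.
google_crawled = {f"/p{i}" for i in range(10)}
in_structure = {"/p0", "/p1", "/p2"}

if __name__ == "__main__":
    print(f"{orphan_share(google_crawled, in_structure):.0%}")
    # prints 70%
```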
Why are orphan pages bad?
Orphan pages are inconvenient for both users and crawlers.
Users cannot reach those pages through your site’s natural structure, for example from your home page or a landing page, so any important or useful information on those pages is lost.
This can lead to a perplexing user experience.
There is no authority passed to the pages when there is no internal linking, and search engines have no semantic or structural context in which to evaluate the page.
It is more difficult for a search engine to determine which queries a page is relevant for when it cannot tell where the page fits into your site as a whole.
How do I fix an orphaned page?
Orphan pages are classified into two types:
- The expected orphan pages you shouldn’t be concerned about.
- The unexpected orphan pages about which you should be concerned.
The path you take to fix your orphan pages will be determined by their type. So, when we see a high volume of orphan pages, the first thing we do is look at what they look like and whether they are expected or not.
Expected orphan pages: usually not a cause for concern
After running a site crawl and comparing it to your server log files to find pages that Google is finding but that aren’t in your site structure, you can view a list of all your orphan pages by clicking on “found by Google.”
Many of these orphan pages will be generated by:
- Pages that return non-200 status codes. Google may continue to crawl pages that return 4xx status codes even after they have been corrected on your site.
How to resolve: Google will eventually stop crawling these pages. There is nothing to worry about.
- Pages that do not currently exist on your site but are linked to by another site. It’s not uncommon to receive an external link (backlink) to a page that you then remove or redirect. Google will still find the old link because it still exists on that other website.
How to resolve: Because you have no control over the links on other websites, the only way to fix this type of orphan page is to contact the site owner and request that they update the link to point to the correct new location.
- Pages that have expired. This is common on websites with a large number of short-lived pages, such as classified ads that expire quickly.
How to resolve: We should only be concerned about expired pages discovered by Google if they have been orphaned for an extended period of time. Otherwise, the number of orphan pages is merely indicative of the website’s page rotation rate and should be regarded as food for thought.
Unexpected orphan pages: a possible source of concern
- Syntax errors made while creating sitemaps: These generate erroneous URLs, which can still return content, duplicates, or HTTP errors.
How to resolve: If you discover erroneous URLs caused by a syntax error, work with your development team to find a solution.
- Pages omitted during a previous site migration: These are web pages that have not been redirected, so the old page content may still be available.
How to resolve: If your new website has similar content, you should redirect these old URLs to it. If it doesn’t, these outdated/omitted pages should return a 404 or 410 status code.
- Expired pages that continue to return content. Some websites simply stop linking to expired content (such as products removed from the catalog) and fail to return a status code (such as HTTP 404 or 410) indicating that the content is no longer available. As a result, the previous page is still accessible.
How to resolve: In addition to removing links to expired content, you should ensure that the expired page is updated with the correct status code. Make sure to 404 or 410 the content if it is no longer available.
- Syntax errors made while creating canonical URL tags: These result in erroneous URLs, which may return a 200 OK status code or an error code.
How to resolve: If you discover erroneous URLs caused by a syntax error, work with your development team to find a solution.
- Important, high-quality pages that aren’t linked in your website structure: Some websites employ navigation pages (content lists such as category pages or internal search result pages) that are only linked when one or more criteria are met. Sub-categories, for example, will appear in a menu only if the list is not empty or reaches a certain number of items. There are numerous instances in which we may fail to link to high-value pages, whether due to an error in automation or not.
How to resolve: The correct approach is to determine when a page no longer meets business criteria for organic traffic, and then remove it once and for all: remove links to it and return HTTP 404 or 410. Until that time, it should be linked from somewhere on the website.
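The decision process in the two lists above can be summarized as a small triage function. This is a deliberately simplified sketch, not a complete rule set: it assumes you already know each orphan URL's HTTP status code, whether the page is still wanted, and whether an equivalent page exists on the current site. The function name `triage` and its parameters are hypothetical.

```python
from typing import Optional


def triage(status_code: int, still_wanted: bool,
           replacement_url: Optional[str] = None) -> str:
    """Suggest an action for an orphan URL, following the rules described above."""
    if status_code != 200:
        # Expected orphan: Google should stop crawling it over time.
        return "expected: no action needed"
    if replacement_url:
        # Content moved (for example, after a migration): redirect the old URL.
        return f"unexpected: redirect to {replacement_url}"
    if still_wanted:
        # Valuable page missing from the structure: link to it.
        return "unexpected: add internal links"
    # Expired or outdated content that still returns a page.
    return "unexpected: return 404 or 410"


if __name__ == "__main__":
    print(triage(404, still_wanted=False))
    print(triage(200, still_wanted=False, replacement_url="https://example.com/new"))
    print(triage(200, still_wanted=True))
    print(triage(200, still_wanted=False))
```

Erroneous URLs produced by sitemap or canonical tag syntax errors do not fit this function neatly, since the fix there is to correct the generator with your development team rather than to act on each URL.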
Fix Your Orphan Pages
Hopefully, this article has given you a better understanding of what an orphan page is.
Remember that while orphan pages may not always be a major issue, you should not be complacent about them. Keeping your pages linked helps you provide a great user experience to your site visitors and allows search engines to crawl the most important pages on your website.
Orphan pages are an SEO issue, but the good news is that they can be identified and resolved. You can find orphan pages using orphan page checkers and site audit tools, or you can rely on the expertise of Loganix since we specialize in dealing with orphan pages and technical SEO errors and blockers.
Contact us today to learn more about how we can help your business manage orphan pages and improve its search engine ranking.
Written by Jake Sheridan on October 10, 2021
Founder of Sheets for Marketers, I nerd out on automating parts of my work using Google Sheets. At Loganix I build products and work on content marketing. There’s nothing like a well-deserved drink after a busy day spreadsheeting.