The Technical SEO Audit Guide

How to Find Noindex Pages Blocking Your Rankings

Accidental noindex tags silently remove pages from Google. Here's how to find every noindex directive on your site — and tell the intentional ones from the mistakes.

Published April 15, 2026
7 min read


A noindex directive is the most powerful and most dangerous indexing signal you can set. It does exactly what it says: it tells search engines not to include the page in their index. When intentional, it keeps utility pages, staging content, and duplicate pages out of search results. When accidental, it silently removes pages from Google — and you won't notice until the traffic disappears.

The dangerous part is "silently." Google doesn't send you an alert when it stops indexing a page you care about. The page still loads fine in the browser. It still appears in your CMS. It just stops showing up in search results, and the traffic graph slides downward with no obvious explanation.

This guide shows you how to find every noindex directive on your site, distinguish the intentional ones from the accidental ones, and identify the pages that need immediate attention.

The three ways a page gets noindexed

Before you can audit noindex directives, you need to know where they live. There are three implementation methods, and a site might use all three simultaneously.

1. The meta robots tag

The most common method. A <meta> tag in the page's <head>:

<meta name="robots" content="noindex" />

Or combined with other directives:

<meta name="robots" content="noindex, nofollow" />

This tag is visible in the page source and in the rendered HTML. Most crawlers detect it automatically.
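If you already have a page's rendered HTML (say, from a crawl), detecting the tag is straightforward. Here's a minimal sketch using Python's standard library — the function name is my own, not any particular tool's API:

```python
# Sketch: detect a noindex directive in a meta robots tag.
# Assumes the page's HTML is already available as a string.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> (and "googlebot") tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Names are case-insensitive; Google also honors <meta name="googlebot">.
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]

def has_meta_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

sample = '<html><head><meta name="robots" content="noindex, nofollow" /></head></html>'
print(has_meta_noindex(sample))  # True
```

Note that this checks the HTML you feed it — on JavaScript-heavy sites, make sure you're parsing the rendered HTML, not just the initial server response.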

2. The X-Robots-Tag HTTP header

Less common but equally effective. A response header sent by the server:

X-Robots-Tag: noindex

This is invisible in the page source — you need to inspect HTTP headers to find it. It's often set at the server or CDN level, which makes it harder to trace.

Common places the X-Robots-Tag appears:

  • Nginx or Apache server configuration
  • CDN rules (Cloudflare, Vercel, Netlify)
  • Application middleware (Next.js middleware, Express middleware)
  • CMS settings that apply headers to specific content types
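Because the header is invisible in the page source, it's worth checking programmatically. A minimal sketch, assuming you've already captured the response headers (for example from `urllib.request.urlopen(url).headers`):

```python
# Sketch: check captured response headers for an X-Robots-Tag noindex directive.
def header_noindex(headers: dict) -> bool:
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # The value may combine directives ("noindex, nofollow") or target a
        # specific bot ("googlebot: noindex"); split on both separators.
        directives = [d.strip().lower() for d in value.replace(":", ",").split(",")]
        if "noindex" in directives:
            return True
    return False

print(header_noindex({"Content-Type": "text/html", "X-Robots-Tag": "noindex"}))  # True
```

Run this against a sample of URLs from each template type — the header is usually set per-path or per-content-type, so one URL per section is enough to find a pattern.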

3. Robots.txt disallow (different, but often confused)

Disallow in robots.txt is not the same as noindex. A Disallow directive tells crawlers not to crawl the page. A noindex directive tells crawlers not to index the page. The distinction matters:

  • A page blocked by robots.txt may still be indexed if other pages link to it — Google can index a URL it hasn't crawled, based on anchor text and link context
  • A page with a noindex tag is crawled (so Google sees the tag) but then removed from the index
  • Using both simultaneously can create a problem: if robots.txt blocks the page, Google can't crawl it, so it never sees the noindex tag, and may index the page anyway based on external signals

The audit implication: when looking for indexability issues, check noindex directives and robots.txt separately. They're different tools with different behaviors.
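Python's standard library makes the crawl-versus-index distinction concrete: `robotparser` answers only "may this be crawled?" — it says nothing about indexing, which is why noindex needs its own check on the pages that are crawlable:

```python
# Sketch: Disallow is a crawl block, not an index block. robotparser tells you
# what a crawler may fetch; noindex must be checked separately on the pages
# it CAN fetch (a blocked page's noindex tag is never seen).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /cart/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("*", "https://example.com/cart/"))  # False: crawl blocked
print(rp.can_fetch("*", "https://example.com/blog/"))  # True: crawlable, so a
                                                       # noindex tag here would be seen
```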

How to find noindex pages

Method 1: Search Console coverage report

Google Search Console's "Pages" report (formerly "Coverage") shows pages excluded from the index and the reason. Filter to "Excluded by 'noindex' tag" to see every page Google found with a noindex directive.

Strengths: This is Google's own data — it tells you exactly what Google sees. It catches both meta tag and HTTP header noindex directives.

Limitations: Only shows pages Google has attempted to crawl. If a page is blocked by robots.txt, Google won't report its noindex status (because it couldn't crawl the page to see the tag). The data also has a delay — it reflects Google's last crawl, not the current state of the page.

Method 2: Site crawl

A full site crawl examines every page and checks for noindex directives in both the meta tag and HTTP headers. This gives you the current state, not Google's delayed version.

Most crawl tools flag noindex pages automatically. The output is a list of every URL with its indexability status: indexable, noindex (meta tag), noindex (HTTP header), canonicalized to another URL, or blocked by robots.txt.
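The classification logic behind that output can be sketched as a priority check over the signals a crawl collects. The field names below are my own, not any specific tool's schema:

```python
# Sketch: derive one indexability status per URL from collected crawl signals.
# Field names are assumptions for illustration, not a real crawler's schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PageSignals:
    url: str
    robots_blocked: bool      # disallowed by robots.txt
    meta_noindex: bool        # <meta name="robots" content="noindex">
    header_noindex: bool      # X-Robots-Tag: noindex
    canonical: Optional[str]  # href of rel=canonical, if any

def indexability(p: PageSignals) -> str:
    if p.robots_blocked:
        # The crawler can't fetch the page, so any noindex on it goes unseen.
        return "blocked by robots.txt"
    if p.meta_noindex:
        return "noindex (meta tag)"
    if p.header_noindex:
        return "noindex (HTTP header)"
    if p.canonical and p.canonical != p.url:
        return "canonicalized to another URL"
    return "indexable"

page = PageSignals("https://example.com/cart/", False, True, False, None)
print(indexability(page))  # noindex (meta tag)
```

The ordering matters: a robots.txt block is reported first because it hides every other signal from the crawler, which is exactly the conflict described in the previous section.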

Method 3: Spot-checking with browser tools

For quick checks on specific pages, use the browser:

  1. View page source and search for noindex to find the meta tag
  2. Open DevTools → Network tab → select the HTML document → Headers tab and look for X-Robots-Tag to find the HTTP header
  3. Use Google's URL Inspection tool in Search Console to see Google's view of a specific URL's indexability

This works for investigating a specific page. It doesn't scale to a full site audit.

How to tell accidental from intentional noindex

Finding noindex pages is the easy part. The harder question is: which of these should actually be noindexed?

Pages that should have noindex

  • Admin and utility pages: Login, account settings, password reset, shopping cart
  • Staging and preview content: Draft pages, preview URLs, staging subdomain pages
  • Paginated archives beyond page 1: The first page of a category listing should be indexed; pages 2, 3, and beyond often shouldn't
  • Thank-you and confirmation pages: Post-form-submission pages that provide no value in search
  • Internal search results pages: These are thin, duplicative, and rarely useful in search
  • Filtered or faceted navigation pages: Especially on e-commerce sites with hundreds of filter combinations
  • Duplicate content that can't be canonicalized: When canonical tags don't apply (different domains, different structures), noindex is the correct tool

Pages that should NOT have noindex (red flags)

  • Pages receiving organic traffic. This is the biggest red flag. If a noindex page is still receiving impressions or clicks in Search Console, it means Google knew about the page, users were finding it, and you're now telling Google to stop showing it. Cross-reference your noindex list with GSC impression data.
  • Pages in the XML sitemap. A page that's both in the sitemap (telling Google to index it) and noindexed (telling Google not to) is sending contradictory signals. One of them is wrong.
  • Pages with inbound links. If other pages on your site link to a noindex page, you're passing link equity to a dead end. Either remove the noindex or redirect the internal links elsewhere.
  • Core content pages. Blog posts, product pages, service pages, and landing pages should almost never be noindexed. If they are, it's almost certainly a mistake.
  • Recently deployed pages. A common source of accidental noindex: a developer sets noindex during development or staging, and the directive survives into production. Check recently deployed pages against your noindex list.
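The two biggest red flags above — sitemap contradictions and noindexed pages with traffic — both reduce to set intersections once you've exported the data. A minimal sketch with placeholder URL sets standing in for your own exports:

```python
# Sketch: cross-reference a crawl's noindex list against the sitemap and
# Search Console data. The URL sets are placeholders for your exported data.
noindexed = {"/cart/", "/blog/old-post/", "/pricing/"}
in_sitemap = {"/", "/pricing/", "/blog/old-post/"}
has_impressions = {"/", "/pricing/"}  # URLs with GSC impressions, last 90 days

# Contradictory signals: listed in the sitemap AND noindexed.
sitemap_conflicts = sorted(noindexed & in_sitemap)

# Fix immediately: noindexed pages that were getting search impressions.
losing_traffic = sorted(noindexed & has_impressions)

print(sitemap_conflicts)  # ['/blog/old-post/', '/pricing/']
print(losing_traffic)     # ['/pricing/']
```

Normalize URLs before intersecting (trailing slashes, protocol, host casing), or the sets won't match even when the pages do.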

Common causes of accidental noindex

Knowing the common causes helps you trace the problem:

CMS default settings. Some CMS platforms default new content types or sections to noindex. WordPress's "discourage search engines" setting (Settings → Reading) is the most famous example, but headless CMSs can have similar per-content-type defaults.

Environment-based configuration. Sites that use environment variables to control noindex (common in Next.js, Nuxt, and other JavaScript frameworks) sometimes deploy staging configuration to production. If NEXT_PUBLIC_NOINDEX=true leaks into the production build, every page gets noindexed.

Middleware and CDN rules. Server middleware or CDN rules that add X-Robots-Tag: noindex to specific paths or response types. These are invisible in the application code and easy to forget about.

Template inheritance. A layout template sets noindex for a section, and a new page added to that section inherits the directive without the author realizing it.

Migration artifacts. During a site migration, staging pages are noindexed to prevent them from being indexed before launch. After launch, the noindex directives aren't removed from all pages.

A practical audit workflow

Step 1: Crawl the site. Run a full crawl and filter to pages with noindex directives (both meta tag and HTTP header).

Step 2: Cross-reference with Search Console data. Check which noindex pages have received impressions or clicks in the past 90 days. Any page with traffic that's now noindexed needs immediate investigation.

Step 3: Check the sitemap. Identify pages that appear in both the sitemap and the noindex list. These contradictions should be resolved — either remove the page from the sitemap or remove the noindex directive.

Step 4: Categorize. For each noindex page, classify it as intentional or accidental. Use the criteria from the section above.

Step 5: Fix the accidentals. For accidentally noindexed pages, remove the noindex directive at the source — whether that's a meta tag, HTTP header, CMS setting, or environment variable. Then request re-indexing via Search Console's URL Inspection tool.

Step 6: Monitor. After fixes, re-crawl the site to verify the directives are removed. Check Search Console over the following weeks to confirm the pages are being re-indexed.

How Evergreen surfaces indexability issues

Evergreen's content audit table includes an indexability column for every page. Filter to indexable = false to see every page the crawler identified as noindexed, regardless of whether the directive comes from a meta tag, HTTP header, or canonical tag pointing elsewhere.

When GA4 and Google Search Console are connected, you can cross-reference indexability with traffic data directly in the audit table. The most dangerous finding — a noindex page that's receiving search impressions — is visible as a filtered view: indexable = false AND impressions > 0. That's your "fix immediately" list.

The visual sitemap highlights non-indexable pages so you can see which sections of your site have the most indexability restrictions. If an entire branch is noindexed, it's usually a template-level or CMS-level setting that needs to be changed in one place.

See your indexability status → Start free

Frequently asked questions

How long does it take for Google to re-index a page after removing noindex?

It varies. Requesting indexing via Search Console's URL Inspection tool can speed things up — Google typically re-crawls within hours to days. Full re-indexing (appearing in search results) can take one to four weeks, depending on how frequently Google crawls your site and the page's overall authority.

Does a noindex page still pass link equity?

A noindex tag tells Google not to show the page in search results, but it doesn't prevent the page from passing link equity through its outbound links — unless you also add nofollow. A noindex page with outbound links still passes PageRank to those destinations. However, noindex pages may eventually be crawled less frequently, reducing their effectiveness as link equity conduits over time.

What's the difference between noindex and deleting a page?

A noindex page still exists and loads in the browser — it's just excluded from search results. Deleting a page (returning a 404 or 410) removes it from both the site and search results. Use noindex when you want the page to exist for users (internal tools, gated content) but not in search. Use deletion or redirection when the content is truly obsolete.

Should I use noindex or canonical to handle duplicate content?

Use canonical when you have multiple versions of the same content and want Google to pick a preferred version (the other versions still exist and are accessible). Use noindex when you want a page completely excluded from search results. Canonical is a suggestion — Google may ignore it. Noindex is a directive — Google follows it.

Your next step: check your site's indexability in 60 seconds → Create free account

Related Topics in The Technical SEO Audit Guide

The Technical SEO Checklist for 2026

A practical technical SEO checklist covering crawlability, indexation, Core Web Vitals, structured data, JavaScript rendering, and AI search visibility — updated for 2026.

How to Find and Fix All Broken Links on Your Site

A practical guide to finding, prioritizing, and fixing broken links across your website to improve user experience and SEO performance.

The Complete Website Audit Checklist for Agencies (2026)

A 25-point website audit checklist built for agencies managing multiple client sites. Covers structure, content, performance, and reporting workflows.

How to Run a Bulk Lighthouse Test on Your Entire Site

Stop testing one page at a time. Run Lighthouse across your entire site to find the pages dragging down performance — and fix them systematically.

Next.js SEO Audit Checklist for 2026

An auditor's checklist for Next.js 14+ sites built on the App Router. Covers metadata, rendering strategies, dynamic routes, and the technical pitfalls that don't show up in generic SEO guides.

Technical SEO Audit Guide for Headless Websites

Headless websites separate content from presentation, and that separation introduces SEO audit challenges that monolithic sites don't have. This guide covers the methodology for auditing any headless stack.

The Comprehensive Astro SEO Checklist

Astro ships fast HTML by default, but fast isn't the same as optimized. This checklist covers every SEO consideration specific to Astro 4.x+ — from Islands to View Transitions to content collections.

Lighthouse Score for Your Entire Site: Tools and Methods

Lighthouse tests one page at a time. Here are five ways to get scores for every page on your site — from free CLI tools to SaaS dashboards — and when each approach makes sense.

Automated SEO Monitoring: Set Up Daily Site Audits

One-off audits find problems after they've already cost you traffic. Continuous monitoring finds them as they happen. Here's how to set up daily automated SEO monitoring that catches regressions before rankings suffer.

Shareable SEO Reports: How to Send Audits Clients Actually Read

Most SEO reports are PDFs that clients download, glance at, and forget. Shareable URL-based reports stay current, require no login, and get acted on. Here's why and how.

JavaScript Rendering Audit Checklist

A checklist for auditing JavaScript-rendered pages: crawl accessibility, metadata after render, lazy-loaded content, and the tools to verify what Google actually sees.