The Technical SEO Audit Guide

The Technical SEO Checklist for 2026

A practical technical SEO checklist covering crawlability, indexation, Core Web Vitals, structured data, JavaScript rendering, and AI search visibility — updated for 2026.

Published April 14, 2026
Updated April 15, 2026
12 min read

The best content on the internet is invisible if search engines cannot crawl, render, and index it. Technical SEO is the infrastructure layer — the plumbing that determines whether your pages are even eligible to rank before content quality, backlinks, or any other signal enters the equation.

This checklist covers every technical element that matters in 2026, organized by priority. It reflects the current state of Google's crawling and indexing systems, the INP metric that replaced FID in March 2024, the growing influence of AI search surfaces, and the rendering complexity introduced by modern JavaScript frameworks like Next.js and Astro.

If you've been running the same technical SEO checklist since 2023, several items here will be new. If you're auditing a site for the first time, start at the top and work down — the sections are ordered by how foundational they are.

Crawlability

Search engines need to discover and access your pages before they can rank them. Crawlability issues are the most foundational technical problems — a page that can't be crawled can't be indexed, and a page that can't be indexed can't rank, regardless of how good it is.

robots.txt

  • Confirm robots.txt exists at /robots.txt and is accessible (returns a 200, not a 404 or 5xx)
  • Verify it is not blocking important pages, sections, or entire subdirectories
  • Ensure CSS and JavaScript files are not blocked — Google needs these to render JavaScript-heavy pages
  • Include a Sitemap: directive pointing to your XML sitemap URL
  • Test with the robots.txt report in Search Console
  • If you use multiple subdomains, each one needs its own robots.txt
  • Check for environment-specific leaks: staging robots.txt files sometimes contain Disallow: / and survive into production deployments
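
Python's standard-library robots.txt parser makes these checks scriptable. A minimal sketch against a hypothetical example.com robots.txt (swap in your real file contents or fetch the live URL):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for example.com
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Spot-check the URLs that matter before (and after) every deployment
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))   # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/login")) # False
print(parser.site_maps())  # ['https://example.com/sitemap.xml']
```

Running this in CI against the deployed robots.txt is a cheap guard against the staging `Disallow: /` leak described above.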

XML sitemaps

  • Generate and submit an XML sitemap to Google Search Console
  • Include only indexable, canonical URLs — every URL in the sitemap should return 200, not be noindexed, and self-canonicalize
  • Keep individual sitemap files under 50,000 URLs and 50MB
  • Use a sitemap index file for larger sites, referencing multiple child sitemaps
  • Ensure lastmod dates are accurate and update when content actually changes — Google ignores lastmod if it's unreliable
  • Remove URLs that return 404, redirect, or have a canonical pointing elsewhere
  • For sites using ISR or on-demand generation, verify that dynamically created pages get added to the sitemap
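
For custom or dynamically generated sites, emitting the sitemap from the same data source that creates the pages keeps lastmod honest. A minimal sketch in Python (URLs and dates are placeholders):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: (loc, lastmod) pairs; pass lastmod=None unless it is accurate."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # omit lastmod rather than emit an unreliable date
            ET.SubElement(url, "lastmod").text = lastmod
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(
        urlset, encoding="unicode")

sitemap = build_sitemap([
    ("https://example.com/", "2026-04-15"),
    ("https://example.com/blog/post", None),
])
```

Split the output into multiple files behind a sitemap index once you approach the 50,000-URL or 50MB per-file limits.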

Internal linking

  • Every important page should be reachable within three clicks from the homepage
  • Check for orphan pages — pages with zero inbound internal links are invisible to crawlers and users
  • Use descriptive anchor text that communicates what the reader will find at the destination
  • Fix broken internal links (404 responses) — these waste crawl budget and create dead ends
  • Ensure navigation is built with standard <a> tags, not JavaScript-only event handlers that crawlers can't follow
  • Audit link distribution: avoid concentrating all internal links on the homepage while deeper pages link to nothing

Redirects and status codes
  • Crawl the site and identify all URLs returning 4xx or 5xx status codes
  • Fix or redirect broken URLs that still receive inbound links (internal or external)
  • Flatten redirect chains — any chain longer than one hop wastes crawl budget and dilutes link equity
  • Check for redirect loops
  • Convert temporary redirects (302) to permanent (301) when the move is permanent
  • After any site restructure, verify that old URLs redirect to the correct new destinations
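
Given a redirect map from a crawl, chains and loops can be flagged mechanically. A sketch, assuming you have already collected `{source: target}` pairs from your crawler (the data below is hypothetical):

```python
def resolve_redirect(redirects, url, max_hops=10):
    """Follow a {source: target} redirect map to its final URL.

    Returns (final_url, status) where status is "ok" (0 or 1 hops),
    "chain" (should be flattened), "loop", or "too-many-hops".
    """
    seen = [url]
    while url in redirects:
        url = redirects[url]
        if url in seen:
            return url, "loop"
        seen.append(url)
        if len(seen) - 1 > max_hops:
            return url, "too-many-hops"
    return url, ("ok" if len(seen) <= 2 else "chain")

redirects = {"/old": "/interim", "/interim": "/new", "/a": "/b", "/b": "/a"}
print(resolve_redirect(redirects, "/old"))  # ('/new', 'chain'): flatten /old straight to /new
print(resolve_redirect(redirects, "/a"))    # ('/a', 'loop')
```

Every "chain" result is a redirect to rewrite so the original URL points one hop to the final destination.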

Indexation

A page can be crawled without being indexed. Indexation issues are subtler than crawlability problems — the page loads fine in a browser, but Google silently excludes it from search results.

Canonical tags

  • Every indexable page should have a self-referencing canonical tag
  • Duplicate pages should canonicalize to the preferred version
  • Canonical URLs must be absolute (include the full domain), not relative
  • Check for conflicting signals — a page with both a canonical to another URL and a noindex tag is sending contradictory instructions
  • Watch for trailing slash mismatches between the canonical URL and the actual URL
  • Verify that paginated pages canonicalize to themselves, not to page 1 (a common misconfiguration)
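
A self-referencing canonical check is easy to automate with the standard-library HTML parser. A simplified sketch (it assumes `rel` contains only "canonical"); note how it also surfaces the trailing-slash mismatch described above:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

def check_canonical(page_url, html):
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return "missing"
    return "self-referencing" if finder.canonical == page_url else "points-elsewhere"

html = '<head><link rel="canonical" href="https://example.com/page/"></head>'
print(check_canonical("https://example.com/page/", html))  # self-referencing
print(check_canonical("https://example.com/page", html))   # points-elsewhere (trailing slash)
```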

Meta robots and noindex

  • Verify no important pages have noindex directives — check both the <meta name="robots"> tag and the X-Robots-Tag HTTP header
  • Cross-reference your noindex pages with Search Console impression data — any noindexed page still receiving impressions is a red flag
  • Use noindex intentionally for thin content, utility pages, internal search results, and paginated archives beyond page 1
  • Confirm nofollow is not applied to important internal links — it prevents link equity from flowing to those pages
  • Check environment variables and middleware: Next.js and similar frameworks sometimes apply noindex via server configuration that survives into production
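
Both noindex delivery mechanisms can be checked in one pass. A simplified regex sketch (it assumes the `name` attribute precedes `content`, so treat it as a starting point rather than a robust parser):

```python
import re

META_ROBOTS = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)

def is_noindexed(html, headers):
    """headers: HTTP response headers with lowercased keys."""
    # The X-Robots-Tag header wins even when the HTML looks fine in a browser
    if "noindex" in headers.get("x-robots-tag", "").lower():
        return True
    m = META_ROBOTS.search(html)
    return bool(m and "noindex" in m.group(1).lower())

print(is_noindexed('<meta name="robots" content="noindex, follow">', {}))  # True
print(is_noindexed("<h1>Fine</h1>", {"x-robots-tag": "noindex"}))          # True
print(is_noindexed('<meta name="robots" content="index, follow">', {}))    # False
```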

Indexing coverage

  • Review the Pages report in Google Search Console for pages Google has discovered but chosen not to index
  • Check for "Crawled — currently not indexed" entries — these indicate quality or relevance issues
  • Check for "Discovered — currently not indexed" entries — these suggest Google has deprioritized crawling these URLs
  • Verify that important pages appear in the "Indexed" category
  • Use the URL Inspection tool to test individual pages and request indexing for high-priority new content

Core Web Vitals and performance

Page speed is a confirmed ranking factor and directly affects user experience. Google evaluates three Core Web Vitals metrics using real-user data from the Chrome User Experience Report (CrUX). As of March 2024, INP officially replaced FID as the responsiveness metric.

Largest Contentful Paint (LCP)

Target: under 2.5 seconds. LCP measures how long it takes for the largest visible element (typically a hero image or heading) to render.

  • Optimize and compress images — use next-gen formats (WebP, AVIF) and serve responsive sizes via srcset
  • Preload the LCP element when it's predictable (e.g., a hero image that appears on every page load)
  • Minimize server response time (TTFB) — consider CDN caching, edge rendering, or static generation
  • Remove or defer render-blocking CSS and JavaScript
  • Avoid lazy-loading above-the-fold images — this delays the LCP element
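
The preload and sizing points above look like this in practice; a sketch with a hypothetical hero image path:

```html
<head>
  <!-- Fetch the hero image early; fetchpriority="high" hints the browser
       to prioritize it over other resources -->
  <link rel="preload" as="image" href="/images/hero.avif" fetchpriority="high">
</head>
<body>
  <!-- Explicit dimensions also prevent layout shift; no loading="lazy" here,
       since this image is above the fold -->
  <img src="/images/hero.avif" width="1200" height="600" alt="Hero" fetchpriority="high">
</body>
```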

Interaction to Next Paint (INP)

Target: under 200 milliseconds. INP replaced First Input Delay (FID) in March 2024. It measures responsiveness across all user interactions during a page visit, not just the first one.

  • Break up long JavaScript tasks (over 50ms) using requestIdleCallback or scheduler.yield()
  • Use web workers for heavy computation that doesn't need DOM access
  • Minimize main thread work — audit with Chrome DevTools Performance panel
  • Defer non-critical JavaScript: analytics, chat widgets, and A/B testing scripts should load after the page is interactive
  • Optimize event handlers — expensive operations triggered on every scroll, resize, or input event tank INP scores

Cumulative Layout Shift (CLS)

Target: under 0.1. CLS measures visual stability — how much the page layout shifts unexpectedly during loading.

  • Set explicit width and height attributes on images, videos, and iframes
  • Reserve space for ads, embeds, and dynamically injected content with CSS aspect-ratio or min-height
  • Avoid inserting content above existing content after the initial render
  • Use CSS contain property where appropriate to limit layout recalculations
  • Test with throttled network conditions — CLS often looks fine on fast connections but fails on slow ones
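
Reserving space for late-arriving content looks like this; a sketch assuming a hypothetical ad slot and video embed:

```html
<style>
  /* Reserve the ad slot's footprint before its script injects content,
     so nothing below it shifts when the ad arrives */
  .ad-slot { min-height: 250px; }

  /* Reserve a 16:9 box for an embed that loads late */
  .video-embed { aspect-ratio: 16 / 9; width: 100%; }
</style>

<div class="ad-slot"><!-- ad script injects here --></div>
<div class="video-embed"><!-- player hydrates here --></div>
```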

Site-wide performance testing

Don't test only the homepage. Performance varies dramatically across page templates — the homepage might score 95 while blog posts score 40.

  • Run Lighthouse across the entire site to see the full distribution
  • Compare lab data (Lighthouse) with field data (CrUX in Search Console) — discrepancies indicate real-world conditions your lab tests don't capture
  • Identify template-level patterns: if every product page is slow, fix the template rather than individual pages
  • Monitor performance continuously — a single deployment can regress CWV across the entire site
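
Once you have per-page Lighthouse JSON (for example from the CLI with `--output=json`), surfacing the worst pages is a few lines. A sketch over hypothetical report data:

```python
def rank_by_performance(reports):
    """reports: {url: parsed Lighthouse JSON}. Worst performance score first.

    Lighthouse stores each category score as a 0-1 float under
    categories.<category>.score.
    """
    scores = {
        url: report["categories"]["performance"]["score"]
        for url, report in reports.items()
    }
    return sorted(scores, key=scores.get)

# Hypothetical results: the homepage looks great, the blog template does not
reports = {
    "/": {"categories": {"performance": {"score": 0.95}}},
    "/blog/a": {"categories": {"performance": {"score": 0.41}}},
    "/blog/b": {"categories": {"performance": {"score": 0.44}}},
}
print(rank_by_performance(reports))  # ['/blog/a', '/blog/b', '/']
```

When the bottom of the list clusters on one template, as here, fix the template rather than the individual pages.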

Structured data

Schema markup helps search engines understand content semantically and can generate rich results (star ratings, FAQ dropdowns, recipe cards, breadcrumb trails) in search.

  • Implement JSON-LD structured data for relevant schema types: Article, Product, FAQ, HowTo, BreadcrumbList, Organization, WebSite
  • Validate all markup with Google's Rich Results Test — invalid schema is worse than no schema
  • Ensure structured data matches visible page content — Google penalizes markup that describes content not present on the page
  • Monitor the Enhancements reports in Search Console for schema errors and warnings
  • Render structured data server-side — JSON-LD injected via client-side JavaScript may not be processed reliably
  • For sites using Next.js or similar frameworks, verify that <script type="application/ld+json"> tags appear in the initial HTML response, not only after JavaScript execution
  • Implement BreadcrumbList schema on every page — it's low effort, high reliability, and directly controls breadcrumb display in SERPs
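
BreadcrumbList is simple enough to generate server-side from the page's position in the site hierarchy. A sketch that builds the JSON-LD and wraps it in the script tag (the crumb names and URLs are placeholders):

```python
import json

def breadcrumb_jsonld(crumbs):
    """crumbs: ordered (name, url) pairs from the homepage down."""
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(crumbs, start=1)
        ],
    }
    # Render this into the initial HTML response, not via client-side JS
    return f'<script type="application/ld+json">{json.dumps(data)}</script>'

tag = breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
])
```

Validate the output with the Rich Results Test before shipping; the structure above mirrors what you would hand-write, but a typo in a `@type` value silently disqualifies the markup.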

JavaScript rendering

Sites built with React, Vue, Next.js, Astro, or any JavaScript framework face specific indexing challenges. Google renders JavaScript, but it does so on a delayed schedule and with resource limits — content that depends on client-side rendering is at a disadvantage.

  • Test how Google sees your pages using the URL Inspection tool in Search Console — click "Test Live URL" and compare the rendered output to what you see in the browser
  • Compare the initial HTML source (View Page Source) with the fully rendered DOM — if critical content (titles, headings, body text) is missing from the source, it's being rendered client-side
  • Server-render all indexable content: use SSR, SSG, or ISR instead of client-side-only rendering for pages that need to rank
  • Check that internal links are standard <a href> tags, not JavaScript-only navigation (onClick handlers, router.push without a fallback link)
  • For Next.js App Router sites, audit 'use client' boundaries — components marked 'use client' render in the browser, not on the server. See the full Next.js SEO audit checklist for framework-specific guidance
  • Watch for hydration mismatches — content that differs between server render and client render can cause Google to index the wrong version
  • Test with JavaScript disabled as a quick sanity check: if the page is blank or missing content, client-side rendering is doing too much
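
The source-vs-rendered comparison can be scripted once you have both HTML strings, one from a plain HTTP fetch and one from a headless browser. A minimal sketch with placeholder content:

```python
def missing_from_source(raw_html, rendered_html, critical_strings):
    """Return the critical strings that only exist after JavaScript runs.

    Anything in this list depends on client-side rendering and is at a
    disadvantage for indexing.
    """
    return [
        s for s in critical_strings
        if s in rendered_html and s not in raw_html
    ]

raw = "<html><body><div id='root'></div></body></html>"
rendered = "<html><body><div id='root'><h1>Pricing Guide</h1></div></body></html>"
print(missing_from_source(raw, rendered, ["Pricing Guide", "root"]))  # ['Pricing Guide']
```

Feed it the page title, H1, and a sentence of body text per template; an empty result means the content Google needs is in the initial HTML.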

HTTPS and security

  • All pages should be served over HTTPS — HTTPS is a confirmed (if lightweight) ranking signal, and browsers flag HTTP pages as "Not secure"
  • HTTP URLs should 301 redirect to their HTTPS equivalents
  • Check for mixed content warnings — HTTP resources (images, scripts, fonts) loaded on HTTPS pages trigger browser warnings and erode trust
  • Ensure your SSL/TLS certificate is valid, not expired, and covers all subdomains in use
  • Implement HSTS (HTTP Strict Transport Security) headers to prevent protocol downgrade attacks
  • Verify that your CDN serves HTTPS correctly — some CDN configurations terminate SSL at the edge but fetch from the origin over HTTP
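
A coarse mixed-content scan just looks for http:// resource URLs in the delivered HTML. A simplified regex sketch (it checks `src` attributes only; stylesheet and font URLs need their own pass):

```python
import re

HTTP_SRC = re.compile(r'src=["\'](http://[^"\']+)["\']', re.IGNORECASE)

def mixed_content(html):
    """Return HTTP-only resource URLs embedded in an HTTPS page's HTML."""
    return HTTP_SRC.findall(html)

page = (
    '<img src="http://cdn.example.com/logo.png">'
    '<script src="https://cdn.example.com/app.js"></script>'
)
print(mixed_content(page))  # ['http://cdn.example.com/logo.png']
```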

Mobile and responsive design

Google uses mobile-first indexing for all sites: the mobile version of your pages is what gets crawled, indexed, and ranked. If the mobile experience is broken, your rankings suffer even if the desktop version is fine.

  • Verify the site renders correctly on mobile devices — test across multiple screen widths, not just one breakpoint
  • Check for content parity between mobile and desktop — content hidden behind "read more" toggles or accordion elements on mobile may be deprioritized
  • Ensure tap targets (buttons, links) are large enough and spaced far enough apart for touch interaction
  • Test for horizontal scrolling — pages that overflow the viewport on mobile are flagged in Search Console
  • Check that the viewport meta tag is set correctly: <meta name="viewport" content="width=device-width, initial-scale=1">
  • Verify that interstitials and pop-ups don't cover primary content on mobile — Google's intrusive interstitial penalty applies specifically to the mobile experience

AI search and answer engine visibility

As of 2026, Google's AI Overviews, Bing Copilot, Perplexity, and ChatGPT search are pulling answers from web content. Technical SEO now includes making your content accessible to these systems.

  • Ensure content is crawlable by AI search bots — check robots.txt for blocks on GPTBot, ClaudeBot, PerplexityBot, Bytespider, or CCBot and decide deliberately whether to allow or block each one
  • Structure content with clear headings, concise answers near the top of each section, and semantic HTML — AI systems favor content that's easy to extract structured answers from
  • Implement comprehensive structured data — FAQ, HowTo, and Article schemas are particularly useful for AI answer extraction
  • Provide direct, specific answers to questions within the first 50–80 words of relevant sections — this increases the likelihood of being selected for AI-generated answers
  • Monitor AI search traffic in your analytics — these visits often arrive as referral or direct traffic rather than organic, so they are easy to undercount
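
Whatever you decide per bot, verify that robots.txt actually encodes the decision. A sketch using the standard-library parser against a hypothetical policy:

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Bytespider", "CCBot"]

def ai_bot_access(robots_txt, test_url="https://example.com/"):
    """Report which AI crawlers the given robots.txt allows to fetch test_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, test_url) for bot in AI_BOTS}

# Hypothetical policy: block Bytespider, allow everything else
robots = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Disallow:
"""
print(ai_bot_access(robots))
# {'GPTBot': True, 'ClaudeBot': True, 'PerplexityBot': True, 'Bytespider': False, 'CCBot': True}
```

Note that robots.txt is advisory; well-behaved crawlers honor it, but it is a policy statement, not an enforcement mechanism.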

How to make this a continuous practice

Running this checklist once produces a snapshot. Running it continuously catches regressions before they cost traffic.

Technical issues accumulate silently. A deployment adds an accidental noindex. A CMS update changes the canonical tag format. A new third-party script tanks INP. Without continuous monitoring, you discover these problems when traffic drops — weeks or months after the damage started.

The most effective approach: automate crawling, Core Web Vitals monitoring, and indexability checks on a daily or weekly cadence. Reserve manual review for the items that require human judgment — structured data accuracy, content quality, and internal link strategy.

Evergreen runs this kind of continuous audit automatically. The crawler checks every page for the indexability, metadata, and performance issues on this checklist. The content audit table lets you filter to specific problems — missing meta descriptions, noindex pages, broken links, thin content — and sort by traffic impact. The visual sitemap shows structural issues (orphan pages, depth problems, link distribution) that are hard to spot in a flat list. And site-wide Lighthouse testing surfaces the performance distribution across every page, not just the homepage.

On the free plan, you can audit one site with up to 500 pages. That's enough to run this entire checklist on most sites.

Your next step: crawl your site in 60 seconds → Create free account

Frequently asked questions

How often should I run a technical SEO audit?

Continuously, with monthly reviews. Automated crawling and monitoring should run on a daily or weekly cadence to catch regressions as they happen. A full manual review of the checklist — including the items that require human judgment — should happen monthly for actively maintained sites, or quarterly for stable sites with infrequent changes.

What's the most common technical SEO issue?

Accidental noindex directives and broken internal links are the most frequently overlooked. Both are invisible to casual inspection — the pages look fine in a browser — and both silently degrade search performance. A comprehensive crawl catches both immediately.

Does this checklist apply to headless CMS and JavaScript framework sites?

Yes, and those sites often have additional technical SEO complexity around rendering strategy (SSR vs SSG vs client-side), metadata management, and sitemap generation. This checklist covers the universal fundamentals. For framework-specific guidance, see the Next.js SEO audit checklist or the technical SEO audit guide for headless websites.

Should I block AI crawlers in robots.txt?

It depends on your goals. Allowing AI crawlers (GPTBot, ClaudeBot, PerplexityBot) means your content can appear in AI-generated answers, which is a growing traffic source. Blocking them protects your content from being used to train models or generate answers without attribution. Most sites benefit from allowing AI search crawlers while monitoring the traffic impact. Make a deliberate decision rather than leaving the default.

Related Topics in The Technical SEO Audit Guide

How to Find and Fix All Broken Links on Your Site

A practical guide to finding, prioritizing, and fixing broken links across your website to improve user experience and SEO performance.

The Complete Website Audit Checklist for Agencies (2026)

A 25-point website audit checklist built for agencies managing multiple client sites. Covers structure, content, performance, and reporting workflows.

How to Run a Bulk Lighthouse Test on Your Entire Site

Stop testing one page at a time. Run Lighthouse across your entire site to find the pages dragging down performance — and fix them systematically.

Next.js SEO Audit Checklist for 2026

An auditor's checklist for Next.js 14+ sites built on the App Router. Covers metadata, rendering strategies, dynamic routes, and the technical pitfalls that don't show up in generic SEO guides.

How to Find Noindex Pages Blocking Your Rankings

Accidental noindex tags silently remove pages from Google. Here's how to find every noindex directive on your site — and tell the intentional ones from the mistakes.

Technical SEO Audit Guide for Headless Websites

Headless websites separate content from presentation, and that separation introduces SEO audit challenges that monolithic sites don't have. This guide covers the methodology for auditing any headless stack.

The Comprehensive Astro SEO Checklist

Astro ships fast HTML by default, but fast isn't the same as optimized. This checklist covers every SEO consideration specific to Astro 4.x+ — from Islands to View Transitions to content collections.

Lighthouse Score for Your Entire Site: Tools and Methods

Lighthouse tests one page at a time. Here are five ways to get scores for every page on your site — from free CLI tools to SaaS dashboards — and when each approach makes sense.

Automated SEO Monitoring: Set Up Daily Site Audits

One-off audits find problems after they've already cost you traffic. Continuous monitoring finds them as they happen. Here's how to set up daily automated SEO monitoring that catches regressions before rankings suffer.

Shareable SEO Reports: How to Send Audits Clients Actually Read

Most SEO reports are PDFs that clients download, glance at, and forget. Shareable URL-based reports stay current, require no login, and get acted on. Here's why and how.

JavaScript Rendering Audit Checklist

A checklist for auditing JavaScript-rendered pages: crawl accessibility, metadata after render, lazy-loaded content, and the tools to verify what Google actually sees.