Crawl Efficiency

Crawl Efficiency joins Googlebot’s last crawl date for each URL with its 28-day click and impression data, then classifies every page into one of four tiers. The goal is to answer two questions at once: “Where is Googlebot wasting its crawl budget on worthless pages?” and “Which important pages has Google neglected to refresh?”

The four crawl-efficiency tiers

Wasted crawl

Recently crawled, but little or no traffic (fewer than 10 clicks in the last 28 days). Googlebot spent crawl budget on pages that currently return no measurable search value. These are candidates for noindex, canonical consolidation, robots.txt exclusion, or removal with a 410.

Under-crawled

High clicks but a stale crawl date. These are money pages that Google has not recrawled in more than 30 days, so content changes or schema updates may not yet be reflected in the index. Request a recrawl via URL Inspection or add internal links to signal freshness.

Healthy

Recently crawled and earning traffic. The expected state for your important pages. Monitor to ensure this tier grows as a share of your indexed pages.

Stale tail

Low traffic and a stale crawl date. These are often old blog posts, paginated archive pages, or thin category pages. Evaluate each one: refresh the best, then consolidate or prune the rest.

How we compute it

  1. We pull the last crawl date for each tracked URL from the URL Inspection API (lastCrawlTime).
  2. We join each URL to its 28-day clicks and impressions from Search Console.
  3. We define recently crawled as a last-crawl date within the last 30 days. Stale means more than 30 days ago or no crawl record.
  4. We define high traffic as ≥10 clicks in the last 28 days. Low traffic means fewer than 10 clicks in the same window.
  5. Tier assignment:
    • Recently crawled + low traffic = Wasted crawl
    • Stale + high traffic = Under-crawled
    • Recently crawled + high traffic = Healthy
    • Stale + low traffic = Stale tail
  6. Within each tier, rows are sorted by clicks descending.
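
The classification in steps 3 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the report's actual implementation: the `PageRow` record, the function names, and the choice to treat a crawl exactly 30 days old as recent are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, List, Optional

RECENT_WINDOW_DAYS = 30   # step 3: crawled within the last 30 days
HIGH_TRAFFIC_CLICKS = 10  # step 4: at least 10 clicks in 28 days
TIERS = ("Wasted crawl", "Under-crawled", "Healthy", "Stale tail")


@dataclass
class PageRow:
    """Hypothetical record: one tracked URL after the join in step 2."""
    url: str
    last_crawl: Optional[date]  # None when no crawl record exists
    clicks_28d: int
    impressions_28d: int


def assign_tier(row: PageRow, today: date) -> str:
    """Apply the step 5 tier rules to a single row."""
    # Assumption: a crawl exactly 30 days ago still counts as recent.
    recent = (
        row.last_crawl is not None
        and today - row.last_crawl <= timedelta(days=RECENT_WINDOW_DAYS)
    )
    high = row.clicks_28d >= HIGH_TRAFFIC_CLICKS
    if recent:
        return "Healthy" if high else "Wasted crawl"
    return "Under-crawled" if high else "Stale tail"


def build_report(rows: Iterable[PageRow], today: date) -> Dict[str, List[PageRow]]:
    """Group rows by tier, then sort each tier by clicks descending (step 6)."""
    report: Dict[str, List[PageRow]] = {tier: [] for tier in TIERS}
    for row in rows:
        report[assign_tier(row, today)].append(row)
    for tier_rows in report.values():
        tier_rows.sort(key=lambda r: r.clicks_28d, reverse=True)
    return report
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the classification reproducible, which helps when comparing month-over-month snapshots.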

Table columns

  • URL — links to the live page.
  • Tier — colour-coded badge.
  • Last crawl — the date Googlebot last fetched this URL. “Never” means no crawl record was returned by the URL Inspection API.
  • Clicks (28d) / Impressions (28d) — GSC totals for the last 28 days.
  • Index verdict — the URL Inspection indexing verdict (Indexed, Not indexed, etc.).
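
To illustrate where the Last crawl and Index verdict columns could come from, the sketch below reads the two fields out of a URL Inspection API response body. It assumes the documented response shape (`inspectionResult.indexStatusResult` with `lastCrawlTime` and `verdict`). The function names are hypothetical, and how the raw verdict is mapped to labels such as "Indexed" is not specified here, so the raw value is returned unchanged.

```python
from datetime import date
from typing import Optional, Tuple


def parse_index_status(response: dict) -> Tuple[Optional[date], str]:
    """Extract (last crawl date, raw verdict) from a URL Inspection response."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    # lastCrawlTime is an RFC 3339 timestamp such as "2024-06-01T08:30:00Z";
    # the first 10 characters are the UTC calendar date.
    raw = status.get("lastCrawlTime")
    last_crawl = date.fromisoformat(raw[:10]) if raw else None
    verdict = status.get("verdict", "VERDICT_UNSPECIFIED")
    return last_crawl, verdict


def format_last_crawl(last_crawl: Optional[date]) -> str:
    """Render the Last crawl cell, using "Never" when no crawl record exists."""
    return last_crawl.isoformat() if last_crawl else "Never"
```

For example, a response with no `lastCrawlTime` field yields `None` from `parse_index_status`, which `format_last_crawl` renders as "Never".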

What to do with it

  1. Under-crawled pages are the biggest win. For each, verify that the page has fresh content, is in the sitemap, and has internal links from high-authority pages. Then request a recrawl through URL Inspection.
  2. Wasted-crawl pages are your robots/canonical backlog. For pages with zero traffic and no path to meaningful rankings, pick one fix: add noindex, set a canonical to a better version, exclude the path in robots.txt, or remove the page with a 410.
  3. A high ratio of Wasted crawl pages to Healthy pages suggests your site is larger than it needs to be from Google’s perspective. A pruning project may improve rankings for the pages that remain.
  4. Re-check the tier distribution monthly. A growing Healthy share is a useful proxy for overall crawl-budget health.
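
The monthly check in steps 3 and 4 comes down to simple arithmetic over the Tier column. The sketch below is one way to compute it, assuming the tiers are exported as a list of strings. The function names are hypothetical, and the report does not define a threshold for what counts as a "high" ratio.

```python
from collections import Counter
from typing import Dict, List, Optional

TIERS = ("Wasted crawl", "Under-crawled", "Healthy", "Stale tail")


def tier_shares(tiers: List[str]) -> Dict[str, float]:
    """Return each tier's share of all classified URLs (0.0 to 1.0)."""
    counts = Counter(tiers)
    total = len(tiers)
    if total == 0:
        return {tier: 0.0 for tier in TIERS}
    return {tier: counts[tier] / total for tier in TIERS}


def wasted_to_healthy_ratio(tiers: List[str]) -> Optional[float]:
    """Return Wasted crawl count divided by Healthy count, or None if no Healthy pages."""
    counts = Counter(tiers)
    if counts["Healthy"] == 0:
        return None  # ratio is undefined without any Healthy pages
    return counts["Wasted crawl"] / counts["Healthy"]
```

Storing the output of `tier_shares` each month gives a simple series to compare, so a rising Healthy share or a falling ratio is visible without rereading the full table.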

Related reports