
AI Readiness

AI Readiness (AEO/GEO) measures how well the page is optimised for AI-powered search — ChatGPT, Perplexity, Google AI Overviews, Claude, and voice assistants. 21 signals are evaluated. Weight: 10%.


What is AEO/GEO

Answer Engine Optimisation (AEO) and Generative Engine Optimisation (GEO) are emerging disciplines focused on making content discoverable and citable by AI systems. As AI Overviews and AI chatbots replace traditional blue-link search for some queries, being cited in AI-generated answers can become an important source of traffic.

The AI Readiness score = (number of signals present / 21) × 100. Each signal is binary (present or not).
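As a rough illustration of the formula above, the score can be computed from a mapping of signal names to booleans. This is a minimal sketch, not the audit's actual code; the function name and the placeholder signal names are assumptions made for the example.

```python
TOTAL_SIGNALS = 21  # the number of binary signals stated in this section


def ai_readiness_score(signals: dict[str, bool]) -> float:
    """Return (signals present / 21) * 100 for a full set of 21 signals."""
    if len(signals) != TOTAL_SIGNALS:
        raise ValueError(f"expected {TOTAL_SIGNALS} signals, got {len(signals)}")
    present = sum(1 for is_present in signals.values() if is_present)
    return present * 100 / TOTAL_SIGNALS


# Hypothetical page on which the first 14 of 21 signals are present.
example = {f"signal_{i}": i <= 14 for i in range(1, TOTAL_SIGNALS + 1)}
print(round(ai_readiness_score(example), 1))
```

Because every signal is binary, each one contributes the same share of the score (100 / 21, roughly 4.8 points).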

Entity clarity and Schema (3 signals)

AI systems extract entities (people, organisations, products, places) from web pages.

1. Entity Schema — Person, Organization, Product, etc. in JSON-LD makes entities machine-readable

2. sameAs links — connections to Wikidata, Wikipedia, LinkedIn, Crunchbase strengthen entity recognition

3. BreadcrumbList Schema — helps AI understand page position in site hierarchy

Missing Entity Schema or sameAs links are warning-level issues (−10 pts).
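For illustration, the sketch below builds an Organization entity with sameAs links as JSON-LD and checks for the two signals. The company name, URLs, and function names are placeholders invented for this example, and the check is a simplification rather than the audit's real detection logic.

```python
import json

ENTITY_TYPES = {"Person", "Organization", "Product"}  # types named in this section


def organization_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Build a schema.org Organization object with sameAs links."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def entity_signals(doc: dict) -> tuple[bool, bool]:
    """Return (has entity schema, has sameAs links) for one JSON-LD object."""
    return doc.get("@type") in ENTITY_TYPES, bool(doc.get("sameAs"))


# Placeholder values, not real profiles.
doc = organization_jsonld(
    "Example Corp",
    "https://www.example.com/",
    ["https://www.linkedin.com/company/example-corp"],
)
print('<script type="application/ld+json">')
print(json.dumps(doc, indent=2))
print("</script>")
```

The printed block is what would be embedded in the page's HTML so that crawlers can read the entity without parsing the visible text.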

FAQ and Q&A content (3 signals)

FAQ-style content is frequently cited by AI systems because it is already structured as question-answer pairs.

4. FAQPage Schema — JSON-LD FAQPage markup. Missing = warning

5. FAQ HTML patterns — question-phrased headings (ending with '?' or starting with how/what/why/when) detected independently of schema

6. Q&A ratio — percentage of headings (H2+H3) phrased as questions. ≥10% indicates good Q&A content structure
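The heading checks in signals 5 and 6 can be sketched as follows. This assumes the Q&A ratio uses the same question test described for signal 5 (ends with '?' or starts with how/what/why/when), which the text does not state explicitly; the function names are illustrative.

```python
QUESTION_STARTERS = ("how", "what", "why", "when")


def is_question_heading(text: str) -> bool:
    """True if the heading ends with '?' or starts with a question word."""
    cleaned = text.strip().lower()
    if not cleaned:
        return False
    if cleaned.endswith("?"):
        return True
    return cleaned.split()[0] in QUESTION_STARTERS


def qa_ratio(headings: list[str]) -> float:
    """Percentage of headings phrased as questions (0.0 for no headings)."""
    if not headings:
        return 0.0
    questions = sum(1 for h in headings if is_question_heading(h))
    return questions * 100 / len(headings)


# Hypothetical H2/H3 headings from one page.
headings = ["What is AEO?", "Pricing", "How to add FAQ schema", "Changelog", "Contact"]
ratio = qa_ratio(headings)
print(ratio, ratio >= 10)  # signal 6 is present at 10% or more
```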

Speakable and HowTo Schema (2 signals)

7. Speakable Schema — marks sections suitable for text-to-speech (Google Assistant, smart speakers)

8. HowTo Schema — step-by-step instructions in JSON-LD, used by AI for procedural queries
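As an example of signal 8, the sketch below generates HowTo JSON-LD from a list of step texts. The function name and the sample steps are made up for illustration; real markup would usually carry more properties than shown here.

```python
import json


def howto_jsonld(name: str, steps: list[str]) -> dict:
    """Build a schema.org HowTo object with one HowToStep per step text."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {"@type": "HowToStep", "position": position, "text": text}
            for position, text in enumerate(steps, start=1)
        ],
    }


# Illustrative content only.
howto = howto_jsonld(
    "How to add FAQ schema",
    ["Write the question-answer pairs.", "Add the JSON-LD to the page.", "Validate the markup."],
)
print(json.dumps(howto, indent=2))
```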

Content structure (3 signals)

Well-structured content is easier for AI systems to parse and cite.

9. Clear content sections — at least 3 distinct sections with H2/H3 headings and body text

10. Definitions — 'X is Y' patterns or explicit definition sentences that AI can extract as direct answers

11. Named entities — 3+ multi-word proper nouns (people, companies, places) detected via NLP patterns
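A very rough approximation of the named-entity signal is shown below. It assumes a simple pattern of two or more consecutive capitalised words; the audit's actual NLP patterns are not documented here, so the regex and function names are only a stand-in.

```python
import re

# Two or more consecutive capitalised words, e.g. "Ada Lovelace".
# Sentence-initial words can produce false positives such as "The Analytical Engine".
PROPER_NOUN_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def named_entities(text: str) -> set[str]:
    """Return the distinct multi-word capitalised runs found in the text."""
    return set(PROPER_NOUN_RUN.findall(text))


def has_named_entities(text: str, minimum: int = 3) -> bool:
    """Signal 11: at least three multi-word proper nouns."""
    return len(named_entities(text)) >= minimum


sample = (
    "Ada Lovelace worked with Charles Babbage in London. "
    "Later, Alan Turing cited the notes."
)
print(sorted(named_entities(sample)))
```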

Machine-readable formats (2 signals)

12. llms.txt — a proposed standard (similar to robots.txt) at /llms.txt that provides a structured overview of the site for LLMs and AI crawlers

13. Markdown for agents — whether the server returns Content-Type: text/markdown when an AI agent sends Accept: text/markdown. This lets AI agents receive clean, parseable text instead of HTML.
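The content negotiation in signal 13 could be probed along these lines, using only the Python standard library. The function names are hypothetical, and the exact request the audit sends is not documented; this sketch only mirrors the Accept and Content-Type headers named above.

```python
import urllib.request
from typing import Optional


def build_markdown_request(url: str) -> urllib.request.Request:
    """Create a request that asks the server for Markdown."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def is_markdown_response(content_type: Optional[str]) -> bool:
    """True if a Content-Type header value has the media type text/markdown."""
    if not content_type:
        return False
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"


def check_markdown_for_agents(url: str, timeout: float = 10.0) -> bool:
    """Send the request and inspect the response header (performs network I/O)."""
    request = build_markdown_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_markdown_response(response.headers.get("Content-Type"))
```

Calling `check_markdown_for_agents("https://www.example.com/")` would return True only if the server honours the Accept header; parameters such as `charset=utf-8` after the media type are ignored.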

Citable content patterns (4 signals)

14. Citation passages — 2+ paragraphs containing statistics, percentages, dollar amounts, years, or large numbers. AI systems prefer citing content with concrete data

15. Comparison tables — HTML tables with ≥2 rows and ≥2 columns. Frequently extracted by AI for feature/product comparisons

16. TL;DR pattern — ≥30% of sections have a short opening paragraph (≤3 sentences, ≤80 words) that works as a summary

17. Authority backlinks — sameAs links pointing to Wikipedia, Reddit, Quora, or Crunchbase (signals credibility to AI)
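To make the TL;DR thresholds in signal 16 concrete, here is a minimal sketch that takes each section as a list of paragraph strings. The sentence splitter is deliberately naive and the function names are assumptions; the audit may count sentences and words differently.

```python
import re


def is_short_summary(paragraph: str, max_sentences: int = 3, max_words: int = 80) -> bool:
    """True if the paragraph has at most 3 sentences and at most 80 words."""
    words = paragraph.split()
    # Naive split: sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    return 0 < len(words) <= max_words and len(sentences) <= max_sentences


def has_tldr_pattern(sections: list[list[str]], min_share: float = 0.30) -> bool:
    """True if at least 30% of sections open with a short summary paragraph."""
    if not sections:
        return False
    hits = sum(1 for paragraphs in sections if paragraphs and is_short_summary(paragraphs[0]))
    return hits / len(sections) >= min_share


# Hypothetical page with three sections; only the first opens with a short summary.
sections = [
    ["Short summary. Two sentences.", "Longer body text follows here."],
    ["One. Two. Three. Four."],
    ["word " * 100],
]
print(has_tldr_pattern(sections))
```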

Search and heading patterns (2 signals)

18. Long-tail headings — 2+ headings (H2/H3/H4) with ≥4 words and a modifier word (e.g., 'How to fix a broken canonical tag'). These match the specific queries AI systems handle

19. Conversational headings — 2+ headings with ≥5 words using natural language patterns. Improves voice search and AI assistant responses
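The long-tail check in signal 18 might look something like the sketch below. The list of modifier words is purely hypothetical, since the section gives only one example heading and does not enumerate the modifiers the audit recognises.

```python
import re

# Hypothetical modifier words; the audit's real list is not documented here.
MODIFIER_WORDS = frozenset({"how", "best", "fix", "guide", "vs", "without"})


def is_long_tail(heading: str, min_words: int = 4) -> bool:
    """True if the heading has at least 4 words and contains a modifier word."""
    words = re.findall(r"[a-z0-9']+", heading.lower())
    return len(words) >= min_words and any(word in MODIFIER_WORDS for word in words)


def has_long_tail_headings(headings: list[str], min_count: int = 2) -> bool:
    """Signal 18: at least two long-tail headings."""
    return sum(1 for h in headings if is_long_tail(h)) >= min_count


headings = [
    "How to fix a broken canonical tag",
    "Pricing",
    "Best schema types for product pages",
]
print(has_long_tail_headings(headings))
```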

Technical AI integration (2 signals)

20. AI plugin manifest — whether /.well-known/ai-plugin.json exists with a valid schema_version (ChatGPT Actions / OpenAI plugin integration)

21. max-snippet unlimited — whether meta robots allows unlimited snippet length (no max-snippet restriction, or max-snippet:-1). AI systems need full-length snippets to generate accurate answers
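Signal 21 can be illustrated with a small parser for the content attribute of a meta robots tag. The function name is an assumption, and the sketch covers only the two passing cases named above (no max-snippet directive, or max-snippet:-1).

```python
from typing import Optional


def allows_unlimited_snippet(robots_content: Optional[str]) -> bool:
    """True if the meta robots value places no limit on snippet length.

    Pass None when the page has no meta robots tag at all.
    """
    if robots_content is None:
        return True  # no tag means no max-snippet restriction
    for directive in robots_content.split(","):
        name, _, value = directive.strip().lower().partition(":")
        if name.strip() == "max-snippet":
            return value.strip() == "-1"
    return True  # tag present, but without a max-snippet directive


for value in (None, "index, follow", "max-snippet:-1, max-image-preview:large", "max-snippet:50"):
    print(repr(value), allows_unlimited_snippet(value))
```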

AI bot crawlability in robots.txt

Separately from the 21 signals, the audit checks robots.txt access for 6 major AI bots:

• GPTBot — OpenAI's crawler; OpenAI documents it as collecting content that may be used to train its models

• ChatGPT-User — ChatGPT's real-time browsing agent

• ClaudeBot — Anthropic's crawler for Claude

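• Google-Extended — Google's robots.txt token controlling whether content is used for AI training (Gemini); it is a control token rather than a separate crawler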

• PerplexityBot — Perplexity AI's search crawler

• CCBot — Common Crawl bot used by many AI systems

If ALL bots are blocked simultaneously, this is a critical issue (−20 pts): none of the six checked AI crawlers can read the site. If only some bots are blocked, each blocked bot is a warning (−10 pts).

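To unblock, remove or comment out the Disallow rules for these user-agents in robots.txt. If you only want to block AI training but allow AI search, block CCBot and Google-Extended while allowing ChatGPT-User and PerplexityBot. GPTBot and ClaudeBot are documented by their operators as collecting content that may be used for training, so whether to allow them depends on how strictly you want to exclude training.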
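The robots.txt check described above can be approximated with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the function names are invented, the sample robots.txt is illustrative, and how the audit combines several warnings into a total is not stated, so issues are returned individually.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "Google-Extended", "PerplexityBot", "CCBot")


def blocked_ai_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the checked AI user-agents that may not fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


def crawlability_issues(blocked: list[str]) -> list[tuple[str, int, str]]:
    """Map blocked bots to (severity, points, detail) using the rules above."""
    if len(blocked) == len(AI_BOTS):
        return [("critical", -20, "all AI bots blocked")]
    return [("warning", -10, bot) for bot in blocked]


# Illustrative robots.txt that blocks CCBot and Google-Extended only.
SAMPLE_ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_ai_bots(SAMPLE_ROBOTS_TXT)
print(blocked)
print(crawlability_issues(blocked))
```

A robots.txt containing only `User-agent: *` with `Disallow: /` blocks all six user-agents and therefore yields the single critical issue.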

Cloudflare AI bot blocking

Some sites use Cloudflare's 'Block AI Scrapers' feature or custom WAF rules to keep AI crawlers out. While this may be intentional, it can also inadvertently block crawlers such as GPTBot or PerplexityBot, preventing the page from being cited by AI systems.

Metrics

| Metric | Description |
| --- | --- |
| AI Readiness score | Percentage of 21 signals present (0–100). |
| Entity Schema | Whether Person, Organization, Product, or similar entity schema is present. |
| FAQ Schema | Whether FAQPage JSON-LD is present. |
| FAQ HTML patterns | Whether question-phrased headings are detected in HTML. |
| Speakable | Whether Speakable schema is present. |
| HowTo Schema | Whether HowTo JSON-LD is present. |
| BreadcrumbList Schema | Whether BreadcrumbList JSON-LD is present. |
| sameAs links | Number of sameAs links to authoritative sources. |
| Content sections | Number of distinct content sections detected (≥3 = signal present). |
| Definitions | Whether 'X is Y' definition patterns are detected in body text. |
| llms.txt | Whether /llms.txt exists and returns HTTP 200. |
| llms.txt quality | Whether llms.txt contains structured sections (not just a stub). |
| Markdown for agents | Whether the server returns Content-Type: text/markdown for Accept: text/markdown requests. |
| Cloudflare | Whether Cloudflare is detected (may affect AI crawler access). |
| Q&A ratio | Percentage of headings phrased as questions (≥10% = signal present). |
| Named entities | Number of multi-word proper nouns detected (≥3 = signal present). |
| Citation passages | Number of paragraphs with statistics, dates, or data (≥2 = signal present). |
| Comparison tables | Whether HTML tables with ≥2 rows and ≥2 columns exist. |
| TL;DR pattern | Whether ≥30% of sections start with a short summary paragraph. |
| Long-tail headings | Whether ≥2 headings use ≥4-word specific queries with modifier words. |
| Conversational headings | Whether ≥2 headings use ≥5-word natural language patterns. |
| Authority backlinks | Whether sameAs links point to Wikipedia, Reddit, Quora, or Crunchbase. |
| AI plugin manifest | Whether /.well-known/ai-plugin.json exists with valid schema_version. |
| max-snippet unlimited | Whether meta robots allows unlimited snippet length. |
| AI bots blocked | Which AI crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) are blocked in robots.txt. |
