
The AI Discoverability Stack: How We Got AI Search Engines to Cite Us
We built 4 features that turn a website from a scraped source into a primary citation target for AI search engines. Here is the exact implementation.

Gemini 2.5 reads YouTube video natively, not just transcripts. The category for auditing YouTube AI visibility barely exists. Here is the 3-layer stack, the honest data, and what to do before competitors notice.

First findings from the Radar Brand Index. 50 named brands audited live with Radar. Stripe and BetterUp lead at 88/100. Hims and Hers hit 4/100. Three brands are completely invisible to AI. Here is what the data says.
How Radar gathers AI response data through live queries, calculates your AI Readiness Score, handles LLM response variance, and detects hallucinations. Every metric is transparent and reproducible.
Each dimension measures a specific aspect of your AI visibility infrastructure.
Measures: Whether 13 AI and search bot user-agents can successfully access your site.
Why it matters: If AI crawlers are blocked, nothing else matters. This is the foundation of AI visibility.
Measures: Deep parse of robots.txt rules for 16 known AI and search bots.
Why it matters: Misconfigured robots.txt is the most common reason sites are invisible to AI.
Measures: Whether your llms.txt file exists, is valid, and contains useful structured content.
Why it matters: llms.txt is the emerging standard for telling LLMs what your site is about.
Measures: 5-category audit across technical, content, structured data, authority, and discoverability.
Why it matters: A holistic assessment of whether your site is optimized for AI citation.
Measures: Whether ChatGPT, Claude, Gemini, and Perplexity mention and cite your brand.
Why it matters: The ultimate measure: are AI models actually talking about you?
Measures: Brand mentions across Reddit, including detection of seeded or artificial content.
Why it matters: Reddit is a major training data source for LLMs. Your Reddit presence shapes AI outputs.
Measures: Answer Engine Optimization readiness across 6 categories for a specific page.
Why it matters: Page-level optimization determines whether AI can extract quotable answers.
Measures: Whether AI platforms cite your specific page when asked a relevant question.
Why it matters: Tests the direct pathway from question to your content.
Measures: Which domains are shaping AI narratives in your category.
Why it matters: Understanding the competitive landscape tells you what you are up against.
Measures: Your share of voice in AI recommendations for your category.
Why it matters: Market share in AI-generated answers is the new SEO metric.
Measures: Validates JSON-LD structured data across 10 schema types.
Why it matters: Structured data is how AI models understand entity relationships on your site.
Measures: Detects factual errors in what AI models claim about your brand.
Why it matters: AI hallucinations can damage your brand. You need to know what is being said wrong.
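The robots.txt dimension above can be illustrated with a minimal sketch using Python's standard `urllib.robotparser`. The bot list here is a small illustrative subset and an assumption; the 16 bots Radar actually checks are not listed in this document, and `blocked_bots` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumption: an illustrative subset of AI crawler user-agent tokens,
# not Radar's actual 16-bot list.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def blocked_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the bots in AI_BOTS that the given robots.txt disallows for path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# A robots.txt that blocks one AI crawler and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(sample))
```

Parsing the file text directly, rather than fetching it, keeps the check repeatable and makes it easy to test a proposed robots.txt before deploying it.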
The AI Readiness Score is a weighted average of all 12 dimensions, normalized to 0-100.
85-100: Excellent
70-84: Good
50-69: Average
30-49: Poor
0-29: Critical
Calibration: The scoring model was calibrated using Pixelmojo's own 516-commit journey from 0/4 LLM citations to 4/4 citations over 6 months. Each dimension's weight reflects its observed impact on AI citation outcomes.
What "good" looks like: A score of 70+ (Grade B) means your site has the technical foundations for consistent AI citation. A score of 85+ (Grade A) puts you in the top tier across all domains audited by Radar.
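As a rough illustration of the weighted average and the score bands, here is a minimal sketch. The dimension keys and weights are hypothetical, since the calibrated weights are not given here, and the example uses three dimensions instead of twelve for brevity. Only the band thresholds come from the text above.

```python
# Band floors taken from the score table above, highest first.
BANDS = [(85, "Excellent"), (70, "Good"), (50, "Average"), (30, "Poor"), (0, "Critical")]


def readiness_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, each already on a 0-100 scale."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def band(score: float) -> str:
    """Map a 0-100 score to its label."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise ValueError("score must be between 0 and 100")


# Hypothetical dimension keys and weights, for illustration only.
example_scores = {"crawl_check": 90, "robots_txt": 80, "llm_citations": 40}
example_weights = {"crawl_check": 1, "robots_txt": 1, "llm_citations": 2}

total = readiness_score(example_scores, example_weights)
print(total, band(total))
```

Dividing by the sum of the weights keeps the result on the same 0-100 scale regardless of how the weights are chosen.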
The Crawl Integrity Score is a composite metric across AI Crawl Check, Robots.txt Analysis, and llms.txt Validation. This is the technical foundation that no other tool measures.
The Crawl Integrity Score averages the scores from bot access testing (13 user-agents), robots.txt rule analysis (16 bots), and llms.txt validation. A high Crawl Integrity Score means AI models can technically reach, parse, and understand your content. A low score means you are invisible regardless of content quality.
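The averaging described above, together with a simple structural check for llms.txt, might be sketched as follows. The four checks follow the general shape of the llms.txt proposal (an H1 title, a blockquote summary, H2 sections, and Markdown link lists). The function names and the pass-rate scoring are assumptions, not Radar's validator.

```python
import re

# Matches a Markdown list item that starts with a link, e.g. "- [Docs](https://...)".
LINK_ITEM = re.compile(r"^\s*[-*]\s*\[[^\]]+\]\((https?://[^)\s]+)\)")


def check_llms_txt(text: str) -> dict[str, bool]:
    """Run basic structural checks on the contents of an llms.txt file."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    return {
        "has_h1_title": bool(lines) and lines[0].startswith("# "),
        "has_summary": any(ln.startswith("> ") for ln in lines),
        "has_h2_section": any(ln.startswith("## ") for ln in lines),
        "has_links": any(LINK_ITEM.match(ln) is not None for ln in lines),
    }


def llms_txt_score(text: str) -> float:
    """Assumed scoring: percentage of structural checks passed."""
    checks = check_llms_txt(text)
    return 100 * sum(checks.values()) / len(checks)


def crawl_integrity(bot_access: float, robots: float, llms_txt: float) -> float:
    """Plain average of the three component scores (each 0-100)."""
    return (bot_access + robots + llms_txt) / 3


sample_llms_txt = """\
# Example Co
> Example Co builds widgets.

## Docs
- [Getting started](https://example.com/start): setup guide
"""

print(crawl_integrity(80, 60, llms_txt_score(sample_llms_txt)))
```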
LLM responses are non-deterministic. The same query can produce different results on different runs. Radar addresses this openly.
Methodology
Each citation score comes from a live query to each LLM, and Radar displays the result as returned. When comparing across runs (LLM Answer Diff), Radar highlights changes explicitly so users can distinguish genuine shifts from model variance.
Why this matters
Every other tool in the AI visibility space ignores non-determinism. Radar is the only tool that acknowledges and handles LLM variance explicitly, turning a vulnerability into a trust signal.
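One way to picture a run-to-run comparison is a set difference over the domains each answer cites, plus a citation rate over repeated runs to separate a stable citation from an occasional one. This is a minimal sketch under those assumptions. `answer_diff` and `citation_rate` are hypothetical names and do not describe how LLM Answer Diff is implemented.

```python
def answer_diff(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Compare the domains cited in two runs of the same prompt."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
        "unchanged": sorted(previous & current),
    }


def citation_rate(runs: list[set[str]], domain: str) -> float:
    """Fraction of runs in which the domain was cited."""
    if not runs:
        raise ValueError("at least one run is required")
    return sum(domain in run for run in runs) / len(runs)


# Hypothetical cited-domain sets from repeated runs of one prompt.
run_a = {"example.com", "docs.example.org"}
run_b = {"example.com", "blog.example.net"}

print(answer_diff(run_a, run_b))
print(citation_rate([run_a, run_b, run_a, run_b], "example.com"))
```

A domain that appears in every run is a stable citation. A domain that appears in only some runs is more likely model variance than a genuine shift.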
How Radar gathers AI response data, and why this matters for accuracy.
Live querying beats snapshot databases when AI outputs change hourly. Radar runs fresh queries against ChatGPT, Claude, Gemini, and Perplexity every time you audit a domain. The results reflect what those models say right now, not what they said weeks ago.
Snapshot approach
Tools build a static prompt library, often hundreds of millions of search-derived prompts, run them in batched cycles, and serve the cached results. Refresh windows can stretch from weekly to monthly per prompt.
Strength: scale of queries. Weakness: staleness, plus invisibility to AI sub-queries the platform never sees.
Radar live-query approach
Radar issues fresh queries to each LLM every audit. Results reflect the current model state, the most recent training cutoff, and live-search behavior at audit time.
Strength: zero staleness, captures current behavior. Trade-off: smaller per-audit query volume, addressed by repeat audits and explicit variance handling.
Most AI retrieval traffic is invisible to prompt-library tools.
When a user asks an AI assistant a question, the model often decomposes that question into multiple internal sub-queries before generating an answer. These sub-queries have no Google search volume, no public footprint, and never appear in keyword-research-based prompt libraries. Industry estimates put this dark-query share at roughly 88% of total AI retrieval traffic.
Radar handles this by querying LLMs directly with category and brand-specific prompts that mirror how AI assistants actually retrieve answers, not how Google users phrase searches.
Live querying is not always the right answer.
Snapshot databases earn their keep when you need historical trend lines, fixed comparison surfaces across thousands of brands, or coverage of branded search-volume data that only traditional search engines can supply. Radar focuses on technical AI readiness and live citation behavior, and we recommend pairing live audits with traditional search-volume tools for the full picture. Use the right instrument for the question you are answering.
How Radar identifies and scores factual errors in AI-generated content about your brand.
Ground truth extraction
Radar extracts verifiable claims from your site (meta tags, schema markup, published content) and uses these as the baseline for accuracy comparison.
Claim verification
AI model responses are compared against ground truth. Each discrepancy is flagged with a severity tier: Critical (wrong facts that could cause harm), Major (significant misrepresentations), and Minor (imprecise but not harmful).
Severity scoring
The hallucination score reflects both the number and severity of detected inaccuracies. A domain with zero hallucinations scores 100. Each critical flag reduces the score significantly; minor flags have smaller impact.
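The three steps above can be sketched as a comparison of AI claims against ground truth, followed by a severity-weighted deduction from 100. The field names, the field-to-severity mapping, and the penalty sizes are all hypothetical. The text states only that critical flags cost significantly more than minor ones.

```python
# Assumption: penalty sizes are illustrative, not Radar's actual values.
PENALTY = {"critical": 40, "major": 15, "minor": 5}

# Assumption: a hypothetical mapping from fact type to severity tier.
FIELD_SEVERITY = {"pricing": "critical", "headquarters": "major", "founded": "minor"}


def find_discrepancies(ground_truth: dict[str, str], ai_claims: dict[str, str]) -> list[tuple[str, str]]:
    """Return (field, severity) for each AI claim that contradicts ground truth."""
    flags = []
    for field, claimed in ai_claims.items():
        expected = ground_truth.get(field)
        # Only verifiable fields are compared; unknown fields are skipped.
        if expected is not None and claimed.strip().lower() != expected.strip().lower():
            flags.append((field, FIELD_SEVERITY.get(field, "minor")))
    return flags


def hallucination_score(flags: list[tuple[str, str]]) -> int:
    """Start at 100 and subtract a penalty per flag, never dropping below 0."""
    return max(0, 100 - sum(PENALTY[severity] for _, severity in flags))


# Hypothetical ground truth from the site and claims extracted from an AI answer.
truth = {"pricing": "free tier available", "founded": "2019", "headquarters": "Austin"}
claims = {"pricing": "paid only", "founded": "2018", "headquarters": "austin"}

flags = find_discrepancies(truth, claims)
print(flags, hallucination_score(flags))
```

Exact string comparison is a deliberate simplification. Real claims would need normalization or semantic matching before comparison.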
Radar's scoring was validated against a real-world case study.
Pixelmojo went from zero AI citations to being cited by all four major LLMs (ChatGPT, Claude, Gemini, Perplexity) in 6 months. Every fix was tracked with git commits. The scoring model was calibrated against this journey.
Read the full origin story →
Tools outside the 12-dimension Radar audit that complement it.
YouTube Brand Monitor
YouTube is a major training corpus for every AI assistant, but it is not part of Radar's 12-dimension parallel audit. The standalone YouTube Brand Monitor tool scores your YouTube footprint across mention volume, channel diversity, sentiment, reach, and recency using live YouTube Data API queries. Use it alongside Radar for a fuller picture of off-site brand visibility.
Run a YouTube audit →