Methodology

How Radar gathers AI response data through live queries, calculates your AI Readiness Score, handles LLM response variance, and detects hallucinations. Every metric is transparent and traceable: we store the measured inputs, timestamps, and provider coverage behind each score. LLM responses are non-deterministic, so identical reruns are not guaranteed (see LLM Variance Handling below).

The 13 Audit Dimensions

Each dimension measures a specific aspect of your AI visibility infrastructure.

AI Crawl Check (Weight: 9%)

Measures: Whether 13 AI and search bot user-agents can successfully access your site.

Why it matters: If AI crawlers are blocked, nothing else matters. This is the foundation of AI visibility.

Robots.txt Analysis (Weight: 8%)

Measures: Deep parse of robots.txt rules for 16 known AI and search bots.

Why it matters: Misconfigured robots.txt is the most common reason sites are invisible to AI.

llms.txt Validation (Weight: 8%)

Measures: Whether your llms.txt file exists, is valid, and contains useful structured content.

Why it matters: llms.txt is the emerging standard for telling LLMs what your site is about.

AI Readiness Score (Weight: 8%)

Measures: 5-category audit across technical, content, structured data, authority, and discoverability.

Why it matters: A holistic assessment of whether your site is optimized for AI citation.

Citation Tracker (Weight: 9%)

Measures: Whether the ChatGPT, Claude, Gemini, and Perplexity APIs mention and cite your brand.

Why it matters: The ultimate measure: are AI models actually talking about you?

Reddit Monitor (Weight: 5%)

Measures: Brand mentions across Reddit, including detection of seeded or artificial content.

Why it matters: Reddit is a major training data source for LLMs. Your Reddit presence shapes AI outputs.

AEO Page Auditor (Weight: 8%)

Measures: Answer Engine Optimization readiness across 6 categories for a specific page.

Why it matters: Page-level optimization determines whether AI can extract quotable answers.

Citation Tester (Weight: 8%)

Measures: Whether AI platforms cite your specific page when asked a relevant question.

Why it matters: Tests the direct pathway from question to your content.

Source Influence (Weight: 7%)

Measures: Which domains are shaping AI narratives in your category.

Why it matters: Understanding the competitive landscape tells you what you are up against.

Prompt SOV (Weight: 7%)

Measures: Your share of voice in AI recommendations for your category.

Why it matters: Market share in AI-generated answers is the new SEO metric.

Schema Audit (Weight: 7%)

Measures: Validity of JSON-LD structured data across 10 schema types.

Why it matters: Structured data is how AI models understand entity relationships on your site.

Hallucination Check (Weight: 8%)

Measures: Factual errors in what AI models claim about your brand.

Why it matters: AI hallucinations can damage your brand. You need to know what is being said wrong.

Brand Disambiguation (Weight: 8%)

Measures: Whether AI engines link your brand name to the right entity or confuse it with a same-named company, person, or product.

Why it matters: If AI describes a different entity that shares your name, buyers get the wrong company. Entity accuracy protects your identity in AI answers.

Scoring Model

The AI Readiness Score averages the dimensions that completed for your domain, normalized to 0-100. Today all completed dimensions are weighted equally; the per-dimension weights listed with each dimension above are our importance ranking and the basis for the calibrated weighting on our roadmap.

85-100: Excellent
70-84: Good
50-69: Average
30-49: Poor
0-29: Critical
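The averaging and banding described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Radar's implementation: the function names, the dictionary input, and the use of None for a dimension that did not complete are all hypothetical.

```python
from statistics import mean
from typing import Optional

# Score bands from this section as (lower bound, label), highest first.
BANDS = [(85, "Excellent"), (70, "Good"), (50, "Average"), (30, "Poor"), (0, "Critical")]


def readiness_score(dimension_scores: dict[str, Optional[float]]) -> float:
    """Equal-weight average of the dimensions that completed (each 0-100).

    Assumption: a dimension that did not complete is stored as None.
    """
    completed = [s for s in dimension_scores.values() if s is not None]
    if not completed:
        raise ValueError("no dimension completed; the score is undefined")
    return mean(completed)


def band(score: float) -> str:
    """Map a 0-100 score to its band label using the lower bounds above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the last band starts at 0")
```

For example, with scores of 90 and 60 and one dimension that did not complete, readiness_score returns 75, which band maps to Good.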

Calibration: The dimension importance ranking is informed by Pixelmojo's own 516-commit journey from 0/4 to 4/4 LLM citations over 6 months. That journey is a directional prior from a single case, not a statistically calibrated model. Calibration is ongoing: calibrated weight sets are built and evaluated against real audit outcomes before being promoted into the live score.

What "good" looks like: A score of 70 or higher (Good, Grade B) means your site has the technical foundations for consistent AI citation. A score of 85 or higher (Excellent, Grade A) puts you in the top tier across all domains audited by Radar.

Crawl Integrity Score

A composite metric across AI Crawl Check, Robots.txt Analysis, and llms.txt Validation. This is a technical foundation that most monitoring tools don't measure.

The Crawl Integrity Score averages the scores from bot access testing (13 user-agents), robots.txt rule analysis (16 bots), and llms.txt validation. A high Crawl Integrity Score means AI models can technically reach, parse, and understand your content. A low score means you are invisible regardless of content quality.
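As a rough sketch of the robots.txt part of this composite, the standard-library urllib.robotparser module can report which bots a rule set allows. The three user-agent tokens below are an illustrative subset, not Radar's full list of 16, and the function names and percentage scoring are assumptions. The bot access and llms.txt sub-scores are taken as given inputs here.

```python
from statistics import mean
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI crawler user-agent tokens (an assumption,
# not the full list Radar checks).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def robots_score(robots_txt: str, url: str, bots=AI_BOTS) -> float:
    """Percentage of the given bots that robots.txt allows to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = [parser.can_fetch(bot, url) for bot in bots]
    return 100 * sum(allowed) / len(bots)


def crawl_integrity(bot_access: float, robots: float, llms_txt: float) -> float:
    """Equal-weight average of the three sub-scores (each 0-100)."""
    return mean([bot_access, robots, llms_txt])
```

With a robots.txt that disallows GPTBot and allows everything else, robots_score returns about 66.7 for the three sample bots. Note that a robots.txt rule only expresses permission; the bot access test described above checks whether requests actually succeed.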

LLM Variance Handling

LLM responses are non-deterministic. The same query can produce different results on different runs. Radar addresses this openly.

Methodology

For citation scores, Radar queries each LLM and displays the result of that run. When comparing across runs (LLM Answer Diff), Radar highlights changes explicitly so users can distinguish genuine shifts from model variance.
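A run-to-run comparison of this kind can be sketched with Python's standard difflib module. This is an illustration only, not how LLM Answer Diff is built: the sentence-level granularity, the naive splitter, and the function names are assumptions.

```python
import difflib
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter; adequate for a sketch, not for production."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def answer_diff(previous: str, current: str) -> dict[str, list[str]]:
    """Return the sentences removed from and added to an answer between two runs."""
    old, new = split_sentences(previous), split_sentences(current)
    removed, added = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new[j1:j2])
    return {"removed": removed, "added": added}
```

A single diff shows what changed but not why. Telling a genuine shift from model variance generally requires several runs, since a change that appears in only one run is more likely to be noise.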

Why this matters

Most tools in the AI visibility space don't surface non-determinism. Radar acknowledges and handles LLM variance explicitly, turning a vulnerability into a trust signal.

Live Queries vs Snapshot Databases

How Radar gathers AI response data, and why this matters for accuracy.

Live querying beats snapshot databases when AI outputs change hourly. Radar runs fresh queries against the ChatGPT, Claude, Gemini, and Perplexity provider APIs every time you audit a domain. The results reflect what the measured models return right now, not what they said weeks ago.

What Radar queries

Radar measures responses from the provider APIs (OpenAI gpt-4o-mini, Anthropic Claude Haiku, Perplexity Sonar, and Google Gemini 2.5 Flash). Consumer apps like chatgpt.com or gemini.google.com may differ: they can use different models, memory, personalization, and live web search. Results are a snapshot of the measured API responses at scan time; provider coverage and model versions are shown where available.
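Once a response has been collected from each provider API, checking for brand mentions and citations reduces to text matching. The sketch below assumes the responses are already fetched and passed in as plain strings; the provider labels, the CitationResult type, and the matching rules (whole-word brand name, domain substring) are hypothetical simplifications rather than Radar's detection logic.

```python
import re
from dataclasses import dataclass


@dataclass
class CitationResult:
    provider: str
    mentioned: bool  # brand name appears in the answer text
    cited: bool      # brand domain appears in the answer text


def check_citations(responses: dict[str, str], brand: str, domain: str) -> list[CitationResult]:
    """Check each provider's answer for the brand name and the brand domain."""
    name = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    results = []
    for provider, text in responses.items():
        results.append(CitationResult(
            provider=provider,
            mentioned=bool(name.search(text)),
            cited=domain.lower() in text.lower(),
        ))
    return results


def coverage(results: list[CitationResult]) -> str:
    """Summarize as 'hits/providers', counting a mention or a citation as a hit."""
    hits = sum(r.mentioned or r.cited for r in results)
    return f"{hits}/{len(results)}"
```

Simple string matching of this kind will miss paraphrased references and can confuse same-named entities, which is why brand disambiguation is treated as a separate dimension.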

Snapshot approach

Tools build a static prompt library, often hundreds of millions of search-derived prompts, run them in batched cycles, and serve the cached results. Refresh windows can stretch from weekly to monthly per prompt.

Strength: scale of queries. Weakness: staleness, plus invisibility to AI sub-queries the platform never sees.

Radar live-query approach

Radar issues fresh queries to each LLM every audit. Results reflect the current model state, the most recent training cutoff, and live-search behavior at audit time.

Strength: zero staleness, captures current behavior. Trade-off: smaller per-audit query volume, addressed by repeat audits and explicit variance handling.

97.6%: underreporting rate found in independent benchmark testing of a leading snapshot-based AI visibility tool against live ChatGPT mentions (source: independent industry benchmark, 2026).

The Dark Query Blind Spot

Most AI retrieval traffic is invisible to prompt-library tools.

When a user asks an AI assistant a question, the model often decomposes that question into multiple internal sub-queries before generating an answer. These sub-queries have no Google search volume, no public footprint, and never appear in keyword-research-based prompt libraries. Industry estimates put this dark-query share at roughly 88% of total AI retrieval traffic.

~88%: estimated share of AI retrieval traffic generated by internally decomposed dark queries with zero search volume (source: industry analysis, 2026).

Radar handles this by querying LLMs directly with category and brand-specific prompts that mirror how AI assistants actually retrieve answers, not how Google users phrase searches.

When Snapshots Are Useful

Live querying is not always the right answer.

Snapshot databases earn their keep when you need historical trend lines, fixed comparison surfaces across thousands of brands, or coverage of branded search-volume data that only traditional search engines can supply. Radar focuses on technical AI readiness and live citation behavior, and we recommend pairing live audits with traditional search-volume tools for the full picture. Use the right instrument for the question you are answering.

Hallucination Detection Framework

How Radar identifies and scores factual errors in AI-generated content about your brand.

Ground truth extraction

Radar extracts verifiable claims from your site (meta tags, schema markup, published content) and uses these as the baseline for accuracy comparison.

Claim verification

AI model responses are compared against ground truth. Each discrepancy is flagged with a severity tier: Critical (wrong facts that could cause harm), Major (significant misrepresentations), and Minor (imprecise but not harmful).

Severity scoring

The hallucination score reflects both the number and severity of detected inaccuracies. A domain with zero hallucinations scores 100. Each critical flag reduces the score significantly; minor flags have smaller impact.
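One simple way to combine count and severity is a penalty per flag, subtracted from 100. The sketch below is illustrative only: the penalty values are hypothetical, since this page states only that critical flags cost significantly more than minor ones, and the function name and floor at 0 are assumptions.

```python
from collections import Counter

# HYPOTHETICAL penalties per flag. The source gives the three tiers and their
# relative weight, not the actual values.
PENALTIES = {"critical": 25, "major": 10, "minor": 3}


def hallucination_score(flags: list[str]) -> int:
    """Start at 100 and subtract a penalty per flagged discrepancy, floored at 0."""
    counts = Counter(flags)
    unknown = set(counts) - set(PENALTIES)
    if unknown:
        raise ValueError(f"unknown severity tier(s): {sorted(unknown)}")
    total = sum(PENALTIES[tier] * n for tier, n in counts.items())
    return max(0, 100 - total)
```

With these example penalties, a domain with no flags scores 100, and one critical plus one minor flag scores 72.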

Case Validation

Radar's scoring is illustrated by a real-world case study: a single documented journey, offered as early validation rather than a controlled study.

0/4 (Oct 2025) → 516 commits → 4/4 (Mar 2026)

Pixelmojo went from zero AI citations to being cited by all four major LLMs (ChatGPT, Claude, Gemini, Perplexity) in 6 months. Every fix was tracked with git commits. The scoring model's dimension ranking is informed by this journey, which is one case tracked end to end.

Related Standalone Tools

Tools outside the 13-dimension Radar audit that complement it.

YouTube Brand Monitor

YouTube is a major training corpus for every AI assistant, but it is not part of Radar's 13-dimension parallel audit. The standalone YouTube Brand Monitor tool scores your YouTube footprint across mention volume, channel diversity, sentiment, reach, and recency using live YouTube Data API queries. Use it alongside Radar for a fuller picture of off-site brand visibility.
