What is a knowledge graph for LLM visibility?

A knowledge graph for LLM visibility is a structured data layer that defines entities (people, organizations, methodologies, products) and their relationships using JSON-LD schema markup. Unlike traditional SEO schema that describes pages, a knowledge graph describes the things themselves, giving LLMs concrete entities to resolve during retrieval-augmented generation.

How do knowledge graphs help with AI search citations?

Knowledge graphs help AI search engines by providing structured entity data that LLMs use during the entity resolution and context assembly phases of RAG pipelines. When an LLM can match your structured entities against a user query, it is more likely to treat your content as an authoritative source and cite it in generated responses.

What is the difference between a knowledge graph and regular schema markup?

Regular schema markup describes individual pages (Article, FAQPage, HowTo). A knowledge graph uses the JSON-LD @graph structure to define entities (DefinedTerm, Organization, Person, SoftwareApplication) and link them through properties like about, mentions, sameAs, and relatedLink. This creates a connected web of meaning rather than isolated page descriptions.

Can a knowledge graph work across multiple domains?

Yes. Cross-site knowledge graphs use sameAs URIs and shared @id references to connect entities across domains. For example, a Person entity on a personal site can point to the company's Organization through worksFor, while Article schemas on the company site reference that same Person as author. This creates multi-source authority signals that LLMs can verify across their index.

What schema types should I use for a knowledge graph?

Start with Organization and Person as your foundation. Add DefinedTerm for proprietary methodologies or frameworks, SoftwareApplication for products, and Service for offerings. Use the @graph array to include all entities on every page, and use about/mentions on Article schemas to connect blog posts to relevant entities.

How long does it take for a knowledge graph to affect LLM citations?

There is no guaranteed timeline. LLM indexes update on their own schedules, and citation decisions depend on many factors beyond structured data. In our case, we saw referral traffic and key event anomalies within one week of deployment, but we also made other improvements simultaneously. Treat knowledge graphs as infrastructure that compounds over time, not a quick win.

Do I need llms.txt if I have a knowledge graph?

They serve different purposes. A knowledge graph provides structured entity data embedded in your HTML that gets indexed during crawling. llms.txt provides a machine-readable summary document that AI crawlers can consume in a single request. Using both is the strongest approach: the knowledge graph enriches every page, while llms.txt gives crawlers a quick overview of your entire site.

What is the LLM Visibility Stack?

The LLM Visibility Stack is a five-layer framework for optimizing content for AI search engine citations. From bottom to top: (1) Entity Foundation with DefinedTerm and Organization schemas, (2) Knowledge Graph with @graph enrichment and relationship mapping, (3) Content Signals with authoritative citations and topical depth, (4) Machine-Readable Context with llms.txt and FAQ schema, and (5) Multi-Source Authority with cross-site entity linking.

Published: February 25, 2026 • 15 min read

How We Built a Knowledge Graph That LLMs Actually Cite (With Real Data)

We built a cross-site knowledge graph connecting two domains via JSON-LD entity linking. Here is the architecture, the code patterns, and the real analytics from the first week.

by Lloyd Pilapil

Figure: Knowledge graph visualization showing interconnected entity nodes across two domains with JSON-LD connections

+18.1%

active users after deploying a knowledge graph for LLM visibility, with referral traffic above forecast and key events 3x above GA4's predicted range

Source: Pixelmojo analytics

LLMs Find You Through Entities, Not Keywords

When a user asks ChatGPT "what agency specializes in AI product development," the model does not run a keyword search. It performs entity resolution: matching structured data from its index against the query to find authoritative sources.

Most businesses optimize for keywords. We optimized for entities. We built a knowledge graph that defines 18 entities across two domains, connects them through JSON-LD @graph enrichment, and gives LLMs something concrete to cite. Here is exactly what we built, why we built it this way, and what happened in the first week.

TL;DR

LLMs use entity resolution (not keyword matching) to decide which sources to cite in generated responses
We built a knowledge graph with 18 entities across 2 domains (pixelmojo.io + lloydpilapil.com) connected via sameAs URIs
The LLM Visibility Stack has 5 layers: Entity Foundation, Knowledge Graph, Content Signals, Machine-Readable Context, Multi-Source Authority
Real analytics: +18.1% active users, +21.8% new users, referral traffic above GA4 forecasts, key events 3x above predicted range
Correlation is not causation. We made other improvements simultaneously. But the referral and key event spikes align specifically with the knowledge graph deployment
You can build this with a TypeScript entity registry, a schema engine, and auto-mapping from blog post tags. No external tools required

The Problem: Why Traditional SEO Alone Falls Short for LLMs

We have covered the tactical side of generative engine optimization across our 5-part GEO series. That series walks through how AI search is shifting traffic patterns, the technical playbook for getting cited, what actually changed our own AI search results, dynamic llms.txt implementation, and building brands that AI search engines recommend.

But all of those posts focus on content and delivery mechanisms. None of them address the foundational layer: how LLMs actually resolve entities and decide what is authoritative.

Here is the gap we identified:

robots.txt and llms.txt tell crawlers what to index and summarize. They are access control and context delivery, not identity.
FAQ schema and Article markup describe individual pages. They do not define the entities those pages are about.
Topical authority through content clusters signals expertise, but it is implicit. LLMs have to infer your authority from content patterns rather than reading it directly from structured data.

What was missing was the infrastructure layer: a knowledge graph that explicitly defines who we are, what we do, and how everything connects. Not for Google (though it helps there too), but specifically for the retrieval and entity resolution phases of LLM pipelines.

How LLMs Actually Discover Sources

To understand why knowledge graphs matter for LLM citations, you need to understand how retrieval-augmented generation (RAG) actually works. Most explanations oversimplify this, so let us walk through the pipeline step by step.

HOW LLMS DISCOVER SOURCES

Simplified RAG pipeline showing where structured data matters most

1. User Query: "What agency does AI product development?"
2. Retrieval: LLM searches its index for relevant documents
3. Entity Resolution (YOUR DATA MATTERS HERE): Matches structured data to resolve who/what entities are
4. Context Assembly (YOUR DATA MATTERS HERE): Ranks and combines sources by authority signals
5. Response Generation: Synthesizes answer with citations from top-ranked sources

Steps 3 and 4 are where knowledge graphs win.

Without structured entity data, the LLM treats your site like every other page. With it, you give the model something concrete to resolve against: named entities, defined relationships, and explicit authority signals. This is the difference between "possibly relevant" and "authoritative source."

The critical insight is that steps 3 and 4 are where structured data creates separation. During entity resolution, the LLM is trying to match query concepts against its index. If your site has explicit DefinedTerm schemas for "Thread-Based Engineering" or "Generative Engine Optimization," the model can resolve those entities directly instead of inferring them from unstructured content.

During context assembly, the LLM ranks sources by authority signals. A site with a connected @graph of Organization, Person, Service, Product, and DefinedTerm entities provides stronger signals than a site with just Article schema on each page.

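There is published evidence for the content side of this. The Princeton GEO study found that adding citations, quotations, and statistics to content improved visibility in generative engines by 30-40%. That study measured content-level changes rather than JSON-LD, so the next step is our working hypothesis: knowledge graphs are how you make those signals machine-readable at scale.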

The LLM Visibility Stack

After building our knowledge graph and analyzing what moved the needle across our GEO work, we identified five layers that determine whether AI search engines cite your content. We call this the LLM Visibility Stack.

THE LLM VISIBILITY STACK

Five layers that determine whether AI search engines cite your content

Layer 5 (Multi-Source Authority): Cross-site entity linking, sameAs connections, consistent signals across domains
Layer 4 (Machine-Readable Context): llms.txt, robots.txt AI directives, FAQ schema, structured data signals
Layer 3 (Content Signals): Authoritative citations, statistical claims, quotable passages, topical depth
Layer 2 (Knowledge Graph): Entity definitions, @graph enrichment, topic clusters, relationship mapping
Layer 1 (Entity Foundation): DefinedTerm schemas, Organization identity, Person profiles, sameAs URIs

Each layer amplifies the ones below it.

Most teams jump to Layer 4 (llms.txt) without Layers 1-2. That is like building a house starting from the roof. The entity foundation and knowledge graph are what make everything else meaningful to LLMs.

Layer 1: Entity Foundation

This is where most teams need to start, yet most skip straight to Layer 4 instead. The entity foundation defines the core things your site is about using schema.org types:

Organization with complete identity (name, description, url, logo, sameAs to social profiles)
Person entities for key authors with knowsAbout, jobTitle, and worksFor connections
DefinedTerm for proprietary methodologies or frameworks you have created
SoftwareApplication for products
Service for service offerings

Each entity gets a stable @id URI (like https://www.pixelmojo.io/#thread-based-engineering) that can be referenced from anywhere in your schema.
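A minimal sketch of how stable @id URIs might be minted and referenced, following the #entity-id pattern described in this post. The helper names entityId and ref, the slug normalization, and the Person identifier are assumptions for illustration, not our production code.

```typescript
// Base URL for identifiers; matches the pattern shown in this post.
const SITE = "https://www.pixelmojo.io/";

// Hypothetical helper: turns a name or slug into a stable fragment identifier.
function entityId(slug: string, site: string = SITE): string {
  // Normalize so the same entity always maps to the same URI.
  const clean = slug
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (clean.length === 0) {
    throw new Error("entity slug must contain at least one letter or digit");
  }
  return `${site}#${clean}`;
}

// A reference is an object carrying only the @id of a node defined elsewhere.
function ref(slug: string): { "@id": string } {
  return { "@id": entityId(slug) };
}

// Illustrative nodes: the Person points at the Organization by @id only.
const organization = {
  "@type": "Organization",
  "@id": entityId("organization"),
  name: "Pixelmojo",
  url: SITE,
};

const founder = {
  "@type": "Person",
  "@id": entityId("lloyd-pilapil"), // assumed identifier, not confirmed by the post
  name: "Lloyd Pilapil",
  worksFor: ref("organization"),
};
```

Because every reference goes through the same helper, renaming a display name never changes the identifier as long as the slug stays fixed.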

Layer 2: Knowledge Graph

This layer connects the entities from Layer 1 into a web of meaning. It is not enough to define entities in isolation. You need to express:

Which blog posts are about which entities (and which entities each post covers)
Which entities mention other entities
Which entities are related to each other
How entities across different domains connect via sameAs

This is what transforms isolated schema markup into a knowledge graph. The @graph array on each page includes all entity definitions, and each Article schema gets enriched with about and mentions references to relevant entities.
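To make the shape concrete, here is a small sketch of a page-level @graph that combines shared entity nodes with one enriched Article. The function names (buildGraph, toScriptTag), the generative-engine-optimization slug, and the example article URL are hypothetical; only the @graph, about, and mentions usage comes from the approach described above.

```typescript
// Minimal JSON-LD node shape for this sketch; real schemas carry more properties.
type Ref = { "@id": string };
type GraphNode = { "@type": string; "@id": string; [key: string]: unknown };

const BASE = "https://www.pixelmojo.io/";
const idOf = (slug: string): string => `${BASE}#${slug}`;

// Entity nodes that would be included on every page.
const entityNodes: GraphNode[] = [
  { "@type": "Organization", "@id": idOf("organization"), name: "Pixelmojo" },
  { "@type": "DefinedTerm", "@id": idOf("thread-based-engineering"), name: "Thread-Based Engineering" },
  { "@type": "DefinedTerm", "@id": idOf("generative-engine-optimization"), name: "Generative Engine Optimization" },
];

// Builds the page-level document: shared entities plus one enriched Article.
function buildGraph(articleUrl: string, headline: string, about: string[], mentions: string[]) {
  const toRef = (slug: string): Ref => ({ "@id": idOf(slug) });
  const article: GraphNode = {
    "@type": "Article",
    "@id": `${articleUrl}#article`,
    headline,
    publisher: toRef("organization"),
    about: about.map(toRef),
    mentions: mentions.map(toRef),
  };
  return { "@context": "https://schema.org", "@graph": [...entityNodes, article] };
}

// Serializes the document for embedding in a page's <head>.
function toScriptTag(doc: object): string {
  // Escape "<" so a stray "</script>" inside a string value cannot close the tag early.
  const json = JSON.stringify(doc).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

const doc = buildGraph(
  "https://example.com/blog/example-post", // placeholder URL
  "Example post",
  ["thread-based-engineering"],
  ["generative-engine-optimization"],
);
```

The Article carries only @id references, so the entity definitions live in one place and every page points at the same nodes.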

Layers 3-5: Content, Machine-Readable Context, Multi-Source Authority

These layers build on the entity foundation. We covered them in detail across our GEO series:

Layer 3 (Content Signals): Authoritative citations, statistical claims, and topical depth. See our GEO playbook.
Layer 4 (Machine-Readable Context): llms.txt, AI crawler directives, and FAQ schema. See our llms.txt implementation guide.
Layer 5 (Multi-Source Authority): Cross-site entity linking, which we will cover in the architecture section below.

What We Built: Architecture Walkthrough

Our knowledge graph connects two domains: pixelmojo.io (the agency) and lloydpilapil.com (the founder's personal site). Here is how the pieces fit together.

CROSS-SITE ENTITY LINKING

Two domains, one knowledge graph, connected via sameAs and shared @id URIs

pixelmojo.io: Organization, Services (6), Products (2), Methodologies (3)

lloydpilapil.com: Person, KnowsAbout (7), WorksFor, SameAs URIs

Shared entities: Thread-Based Engineering (@type: DefinedTerm), AX Design (@type: DefinedTerm), GEO (@type: DefinedTerm), Lakbay AI (@type: SoftwareApplication)

The sameAs bridge is bidirectional. When an LLM encounters "Lloyd Pilapil" on either site, JSON-LD connects it to the same Person entity. The organization, methodologies, and products all resolve to the same @id URIs regardless of which domain the LLM indexed first.

The Entity Registry

The core of the system is a TypeScript file (knowledge-graph.ts) that defines every entity as a structured object:

export const entities: Record<string, Entity> = {
  'thread-based-engineering': {
    id: 'thread-based-engineering',
    name: 'Thread-Based Engineering',
    type: 'Methodology',
    schemaType: 'DefinedTerm',
    description: 'Productivity and governance framework...',
    relatedEntities: ['ai-technical-debt', 'claude-code-development'],
    primaryPosts: ['thread-based-engineering-scaling-ai-development'],
    mentionedInPosts: ['vibe-coding-technical-debt-crisis-2026-2027'],
    keywords: ['thread-based-engineering', 'ai-governance'],
  },
  // ... 17 more entities
}

Each entity has:

A stable id that becomes the fragment of its @id URI in JSON-LD
primaryPosts and mentionedInPosts for manual relationship overrides
keywords that enable automatic tag-based matching (more on this below)
relatedEntities for the relationship graph
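The Entity type itself is not shown in the excerpt. Below is one guess at its shape, plus a small consistency check we find useful in this kind of registry. The union members, the findRegistryProblems name, and the second sample entity are assumptions for illustration.

```typescript
// Assumed shape of the Entity type; field names mirror the excerpt above,
// but the union members are guesses.
interface Entity {
  id: string;
  name: string;
  type: "Methodology" | "Product" | "Service" | "Organization" | "Person";
  schemaType: "DefinedTerm" | "SoftwareApplication" | "Service" | "Organization" | "Person";
  description: string;
  relatedEntities: string[];
  primaryPosts: string[];
  mentionedInPosts: string[];
  keywords: string[];
}

// Returns one message per problem: a registry key that differs from its
// entity id, or a relatedEntities entry that is missing from the registry.
function findRegistryProblems(registry: Record<string, Entity>): string[] {
  const problems: string[] = [];
  const has = (key: string): boolean => Object.prototype.hasOwnProperty.call(registry, key);
  for (const key of Object.keys(registry)) {
    const entity = registry[key];
    if (!entity) continue;
    if (key !== entity.id) problems.push(`${key}: key does not match id "${entity.id}"`);
    for (const related of entity.relatedEntities) {
      if (!has(related)) problems.push(`${key}: unknown related entity "${related}"`);
    }
  }
  return problems;
}

// Two-entity sample: the second related id below is deliberately left undefined.
const sample: Record<string, Entity> = {
  "thread-based-engineering": {
    id: "thread-based-engineering",
    name: "Thread-Based Engineering",
    type: "Methodology",
    schemaType: "DefinedTerm",
    description: "Productivity and governance framework...",
    relatedEntities: ["ai-technical-debt", "claude-code-development"],
    primaryPosts: ["thread-based-engineering-scaling-ai-development"],
    mentionedInPosts: ["vibe-coding-technical-debt-crisis-2026-2027"],
    keywords: ["thread-based-engineering", "ai-governance"],
  },
  "ai-technical-debt": {
    id: "ai-technical-debt",
    name: "AI Technical Debt", // hypothetical display name
    type: "Methodology",
    schemaType: "DefinedTerm",
    description: "Placeholder description.",
    relatedEntities: ["thread-based-engineering"],
    primaryPosts: [],
    mentionedInPosts: [],
    keywords: ["ai-technical-debt"],
  },
};

const problems = findRegistryProblems(sample);
```

Running a check like this at build time catches dangling relationships before they reach the generated JSON-LD.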

The Schema Engine

A resolver (schema-engine.ts) takes the entity registry and converts it into JSON-LD output. It does three things:

Generates DefinedTerm fragments for the global @graph (included on every page)
Auto-matches blog posts to entities based on tag overlap (2+ keyword matches = about, 1 match = mentions)
Enriches Article schemas with about and mentions references

The auto-matching is the key feature that keeps the system maintainable. When we write a new blog post, we just include relevant tags. The schema engine automatically connects the post to the right entities. No manual editing of the knowledge graph file required for routine posts.
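The matching rule can be sketched in a few lines. This is an illustration of the 2+ = about, 1 = mentions rule, not our actual schema-engine.ts; the matchPost name, the case-insensitive comparison, and the keywords for the second and third entities are assumptions.

```typescript
type Relation = "about" | "mentions";

interface KeywordedEntity {
  id: string;
  keywords: string[];
}

// Classifies a post against each entity by counting tag/keyword overlap:
// two or more shared keywords means "about", exactly one means "mentions".
function matchPost(tags: string[], entities: KeywordedEntity[]): Record<Relation, string[]> {
  const normalize = (s: string): string => s.trim().toLowerCase();
  const tagSet = new Set(tags.map(normalize));
  const result: Record<Relation, string[]> = { about: [], mentions: [] };

  for (const entity of entities) {
    // De-duplicate keywords so a repeated keyword cannot count twice.
    const keywords = new Set(entity.keywords.map(normalize));
    let overlap = 0;
    keywords.forEach((k) => {
      if (tagSet.has(k)) overlap += 1;
    });

    if (overlap >= 2) result.about.push(entity.id);
    else if (overlap === 1) result.mentions.push(entity.id);
  }
  return result;
}

const matches = matchPost(
  ["thread-based-engineering", "ai-governance", "geo"],
  [
    { id: "thread-based-engineering", keywords: ["thread-based-engineering", "ai-governance"] },
    { id: "generative-engine-optimization", keywords: ["geo", "ai-search"] },
    { id: "ax-design", keywords: ["ax-design", "agentic-experience"] },
  ],
);
// matches.about    -> ["thread-based-engineering"]
// matches.mentions -> ["generative-engine-optimization"]
```

Manual primaryPosts and mentionedInPosts entries can then be merged on top of this result as overrides.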

Cross-Site Linking

The personal site (lloydpilapil.com) has its own structured data with a Person entity that includes:

{
  "@type": "Person",
  "sameAs": [
    "https://www.linkedin.com/in/lloydpilapil",
    "https://www.pixelmojo.io/author/lloyd-pilapil"
  ],
  "worksFor": {
    "@type": "Organization",
    "@id": "https://www.pixelmojo.io/#organization"
  },
  "knowsAbout": [
    "Thread-Based Engineering",
    "Generative Engine Optimization",
    "AX Design"
  ]
}

The sameAs and worksFor properties create the bridge. When an LLM encounters "Lloyd Pilapil" on either domain, it can resolve both references to the same entity. The knowsAbout array names the same DefinedTerm entities defined in the pixelmojo.io knowledge graph; because these are plain strings rather than @id references, the match relies on the names being identical.

This is bidirectional: the pixelmojo.io Article schemas include an author reference that links back to the same Person @id. Two domains, one entity graph.
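One way to make the two-way link explicit is to give the Person a canonical @id and reference DefinedTerm nodes by @id instead of by name. The sketch below shows that variant; the PERSON_ID value, the ax-design slug, and the sharesAuthor helper are hypothetical and are not taken from the deployed markup.

```typescript
type IdRef = { "@id": string };

const ORG_SITE = "https://www.pixelmojo.io/";
// Hypothetical canonical Person identifier; the post does not state the real one.
const PERSON_ID = "https://lloydpilapil.com/#person";

const termRef = (slug: string): IdRef => ({ "@id": `${ORG_SITE}#${slug}` });

// Person node as it might appear on the personal site.
const personNode = {
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": PERSON_ID,
  name: "Lloyd Pilapil",
  sameAs: [
    "https://www.linkedin.com/in/lloydpilapil",
    "https://www.pixelmojo.io/author/lloyd-pilapil",
  ],
  worksFor: { "@type": "Organization", "@id": `${ORG_SITE}#organization` },
  // References by @id rather than by display name.
  knowsAbout: [termRef("thread-based-engineering"), termRef("ax-design")],
};

// Article node as it might appear on the agency site, pointing back at the Person.
const articleNode = {
  "@type": "Article",
  headline: "Example post",
  author: { "@id": PERSON_ID } as IdRef,
};

// True when both sites resolve the author to the same identifier.
function sharesAuthor(person: IdRef, article: { author: IdRef }): boolean {
  return person["@id"] === article.author["@id"];
}

const linked = sharesAuthor(personNode, articleNode); // true
```

A check like sharesAuthor can run in CI on both sites to catch an identifier drifting on one side.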

Dynamic llms.txt Integration

Our dynamic llms.txt consumes the knowledge graph at build time. The "Entity Context" section of llms.txt is generated directly from the entity registry, giving AI crawlers a plain-text summary of every entity, its relationships, and its primary content. When we add a new entity to the knowledge graph, llms.txt updates automatically on the next build.
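As a rough illustration of "one data source, multiple outputs", the sketch below renders a plain-text entity section from a registry. The renderEntityContext name, the heading text, the line layout, the placeholder blog URL, and the second sample entity are all assumptions; our real generator differs in detail.

```typescript
interface RegistryEntity {
  id: string;
  name: string;
  description: string;
  relatedEntities: string[];
  primaryPosts: string[];
}

// Renders an "Entity Context" section for llms.txt from the registry.
function renderEntityContext(registry: Record<string, RegistryEntity>, blogBase: string): string {
  const lines: string[] = ["## Entity Context", ""];
  // Sort by id so the output is stable between builds.
  for (const id of Object.keys(registry).sort()) {
    const e = registry[id];
    if (!e) continue;
    lines.push(`### ${e.name}`);
    lines.push(e.description);
    // Resolve related ids to display names, skipping ids missing from the registry.
    const related: string[] = [];
    for (const r of e.relatedEntities) {
      const target = registry[r];
      if (target) related.push(target.name);
    }
    if (related.length > 0) lines.push(`Related: ${related.join(", ")}`);
    for (const slug of e.primaryPosts) lines.push(`Primary post: ${blogBase}${slug}`);
    lines.push("");
  }
  return lines.join("\n");
}

const llmsSection = renderEntityContext(
  {
    "thread-based-engineering": {
      id: "thread-based-engineering",
      name: "Thread-Based Engineering",
      description: "Productivity and governance framework...",
      relatedEntities: ["ai-technical-debt", "claude-code-development"],
      primaryPosts: ["thread-based-engineering-scaling-ai-development"],
    },
    "ai-technical-debt": {
      id: "ai-technical-debt",
      name: "AI Technical Debt", // hypothetical display name
      description: "Placeholder description.",
      relatedEntities: ["thread-based-engineering"],
      primaryPosts: [],
    },
  },
  "https://example.com/blog/", // placeholder base URL
);
```

Calling this from the same build step that emits JSON-LD keeps the two outputs from drifting apart.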

The Results: Real Analytics

We deployed the cross-site knowledge graph on February 12, 2026. Here is what GA4 showed in the first seven days.

WHAT THE DATA SHOWS

GA4 analytics for pixelmojo.io, 7 days after deploying the knowledge graph

Active Users: 248 (+18.1%), last 7 days vs previous period

New Users: 229 (+21.8%), last 7 days vs previous period

Referral Traffic: 6 (above forecast). GA4 predicted 1-5 users; actual exceeded range

Key Events (Direct): 19 (3x predicted). GA4 predicted 0-6 events; actual was 19

Honest caveat: These numbers correlate with deploying the knowledge graph, but correlation is not causation. We also published new content and made technical SEO improvements during the same period. The referral and key event anomalies specifically align with the knowledge graph deployment timeline, which is why we highlight them separately.

Let us be specific about what these numbers mean:

What went up:

Active users increased 18.1% over the previous 7-day period (248 vs 210)
New users increased 21.8% (229 vs 188)
Referral traffic exceeded GA4's forecasted range by one user: GA4 predicted 1-5 referral users, and we got 6
Key events through the Direct channel hit 19, where GA4 predicted 0-6 (over 3x the upper bound of the forecast)
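For anyone checking the arithmetic, the figures above can be reproduced with two small helpers. The function names are ours for this sketch; the inputs are the GA4 numbers quoted in this section.

```typescript
// Percentage change from a previous value to a current value, rounded to one decimal.
function percentChange(current: number, previous: number): number {
  if (previous === 0) throw new Error("previous period must be non-zero");
  return Math.round(((current - previous) / previous) * 1000) / 10;
}

// How many times the actual value exceeds the upper bound of a forecast range.
function timesUpperBound(actual: number, forecastHigh: number): number {
  if (forecastHigh <= 0) throw new Error("forecast upper bound must be positive");
  return actual / forecastHigh;
}

const activeUsers = percentChange(248, 210); // 18.1
const newUsers = percentChange(229, 188); // 21.8
const keyEvents = timesUpperBound(19, 6); // about 3.17
```

With absolute counts this small, a handful of extra sessions moves the percentages noticeably, which is part of why we treat these as early signals.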

What we cannot claim:

We cannot isolate the knowledge graph's impact from other changes we made during the same period
We published new content and made technical SEO updates simultaneously
Referral and key event anomalies align with the deployment timeline, but that is correlation, not proof

What we think is happening: The referral traffic anomaly is the most interesting signal. Referral traffic means users coming from other sites that link to us. We did not build new backlinks during this period. The increase could indicate AI-assisted tools or platforms starting to surface our content, which gets counted as referral traffic in GA4. The key event spike through Direct traffic could indicate users arriving via AI chat interfaces (which often show as Direct in analytics).

We will continue monitoring and will update this post as the data matures.


To measure whether your own knowledge graph and entity definitions are making a difference, see our complete guide to free AI visibility tools. The AI Citation Tracker and llms.txt Validator are especially useful for tracking the impact of entity-level optimizations.

What Did Not Work

“We spent two days trying to get Google's Rich Results Test to validate our DefinedTerm entities before realizing Google does not render DefinedTerm in search features. The entities still work for LLMs that crawl and parse JSON-LD, but Google's tooling does not surface them. We also tried adding aggregateRating to our Organization schema before catching ourselves: we have no reviews to aggregate, and fabricating data would undermine the credibility we are trying to build.”

Build journal, Feb 2026

Other things that did not work as expected:

Over-connecting entities. Our first version had every entity related to every other entity. This diluted the signal. We trimmed relationships to only meaningful connections (5-6 per entity, not 12).
Manual post mapping. We started by manually adding every blog slug to entity primaryPosts arrays. This became stale within a week. The auto-matching system based on tag keywords was the fix.
Expecting immediate indexing. LLMs do not re-index on your schedule. Some of our entities may not be in any LLM's index yet. This is infrastructure that compounds, not a launch-day win.
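The over-connection problem is easy to guard against with a small lint over the registry. This is a sketch under our own assumptions: the overConnected name is hypothetical, and the default cap of 6 simply follows the "5-6 per entity" figure above.

```typescript
interface Linked {
  id: string;
  relatedEntities: string[];
}

// Returns the ids of entities whose distinct related entities exceed the cap.
function overConnected(entities: Linked[], maxRelated: number = 6): string[] {
  return entities
    .filter((e) => new Set(e.relatedEntities).size > maxRelated)
    .map((e) => e.id);
}

const flagged = overConnected([
  { id: "focused", relatedEntities: ["a", "b", "c"] },
  { id: "sprawling", relatedEntities: ["a", "b", "c", "d", "e", "f", "g"] },
]);
// flagged -> ["sprawling"]
```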

The Playbook: How to Build Your Own Knowledge Graph

If you want to replicate this approach, here is the sequence that worked for us:

1. Audit your entities. List every methodology, product, service, and project that your business is known for. If it has a name and could be a Wikipedia article, it is probably an entity.
2. Choose your schema types. Organization and Person are mandatory. Add DefinedTerm for frameworks, SoftwareApplication for products, Service for offerings. Avoid types that require data you do not have (aggregateRating without real reviews, for example).
3. Create stable @id URIs. Each entity needs a permanent identifier. We use the pattern https://www.pixelmojo.io/#entity-id. These URIs do not need to resolve to a page. They are identifiers, not URLs.
4. Define relationships. Map which entities relate to which. Keep it honest: 3-6 related entities per item is more useful than connecting everything to everything.
5. Build the auto-matcher. Define keywords for each entity. When a blog post's tags match 2+ keywords, the post automatically gets connected as about that entity. One match = mentions. This keeps the system alive without manual maintenance.
6. Inject into every page. The @graph array with all entity DefinedTerm schemas should appear on every page, not just the homepage. Article pages additionally get about and mentions properties linking to relevant entities.
7. Connect your personal brand. If a founder or key person has their own domain, add sameAs and worksFor links that bridge the two sites. This creates the multi-source authority signal from Layer 5 of the stack.
8. Wire into llms.txt. If you have a dynamic llms.txt (and you should), generate the entity section from the same source of truth. One data source, multiple outputs.

Where This Goes Next

The knowledge graph is infrastructure, not a finished product. Here is what we are building on top of it:

Analytics correlation tracking. We are building dashboards to track which entities appear in AI search citations over time, and correlating that with the GA4 anomaly data.
Automated entity expansion. When we launch new products or define new methodologies, the knowledge graph should grow automatically from the same TypeScript source of truth.
Cross-platform verification. Testing whether the same entities get cited differently across ChatGPT, Perplexity, Claude, and Gemini, and adjusting the schema based on what each model responds to.

If you are serious about AI search visibility, start with the entities. Build the foundation. The content and delivery layers (llms.txt, FAQ schema, topical depth) all work better when they have something real to stand on.

Want to see the knowledge graph in action? Check out our Vector lead qualification engine and Hive multi-agent platform, both of which are defined as entities in the graph. Or explore the full GEO series for the tactical playbook that sits on top of this infrastructure layer.


About the Author

Lloyd Pilapil

Founder & AI Product Architect at Pixelmojo

Lloyd Pilapil is the founder of Pixelmojo and a former Salesforce engineer who builds production AI systems for B2B companies. He writes about agentic AI, multi-agent orchestration, AX (Agentic Experience) design, GEO, and Thread-Based Engineering. His work focuses on shipping AI products that generate revenue, not prototypes.

Expertise

Agentic AI Systems, Multi-Agent Orchestration, AX Design, GEO & AI Search, Thread-Based Engineering, AI Product Development, Growth Marketing, UI/UX Design

LLMs Find You Through Entities, Not Keywords

TL;DR

LLMs use entity resolution (not keyword matching) to decide which sources to cite in generated responses
We built a knowledge graph with 18 entities across 2 domains (pixelmojo.io + lloydpilapil.com) connected via sameAs URIs
The LLM Visibility Stack has 5 layers: Entity Foundation, Knowledge Graph, Content Signals, Machine-Readable Context, Multi-Source Authority
Real analytics: +18.1% active users, +21.8% new users, referral traffic above GA4 forecasts, key events 3x above predicted range
Correlation is not causation. We made other improvements simultaneously. But the referral and key event spikes align specifically with the knowledge graph deployment
You can build this with a TypeScript entity registry, a schema engine, and auto-mapping from blog post tags. No external tools required

The Problem: Why Traditional SEO Alone Falls Short for LLMs

But all of those posts focus on content and delivery mechanisms. None of them address the foundational layer: how LLMs actually resolve entities and decide what is authoritative.

Here is the gap we identified:

robots.txt and llms.txt tell crawlers what to index and summarize. They are access control and context delivery, not identity.
FAQ schema and Article markup describe individual pages. They do not define the entities those pages are about.
Topical authority through content clusters signals expertise, but it is implicit. LLMs have to infer your authority from content patterns rather than reading it directly from structured data.

How LLMs Actually Discover Sources

HOW LLMS DISCOVER SOURCES

Simplified RAG pipeline showing where structured data matters most

User Query

"What agency does AI product development?"

Retrieval

LLM searches its index for relevant documents

Entity Resolution

YOUR DATA MATTERS HERE

Matches structured data to resolve who/what entities are

Context Assembly

YOUR DATA MATTERS HERE

Ranks and combines sources by authority signals

Response Generation

Synthesizes answer with citations from top-ranked sources

Steps 3 and 4 are where knowledge graphs win.

The LLM Visibility Stack

THE LLM VISIBILITY STACK

Five layers that determine whether AI search engines cite your content

L5Multi-Source Authority

Cross-site entity linking, sameAs connections, consistent signals across domains

L4Machine-Readable Context

llms.txt, robots.txt AI directives, FAQ schema, structured data signals

L3Content Signals

Authoritative citations, statistical claims, quotable passages, topical depth

L2Knowledge Graph

Entity definitions, @graph enrichment, topic clusters, relationship mapping

L1Entity Foundation

DefinedTerm schemas, Organization identity, Person profiles, sameAs URIs

Each layer amplifies the ones below it.

Layer 1: Entity Foundation

This is where most teams need to start and where most teams skip to Layer 4 instead. The entity foundation defines the core things your site is about using schema.org types:

Organization with complete identity (name, description, url, logo, sameAs to social profiles)
Person entities for key authors with knowsAbout, jobTitle, and worksFor connections
DefinedTerm for proprietary methodologies or frameworks you have created
SoftwareApplication for products
Service for service offerings

Each entity gets a stable @id URI (like https://www.pixelmojo.io/#thread-based-engineering) that can be referenced from anywhere in your schema.

Layer 2: Knowledge Graph

This layer connects the entities from Layer 1 into a web of meaning. It is not enough to define entities in isolation. You need to express:

Which entities are about which blog posts (and vice versa)
Which entities mention other entities
Which entities are related to each other
How entities across different domains connect via sameAs

Layers 3-5: Content, Machine-Readable Context, Multi-Source Authority

These layers build on the entity foundation. We covered them in detail across our GEO series:

Layer 3 (Content Signals): Authoritative citations, statistical claims, and topical depth. See our GEO playbook.
Layer 4 (Machine-Readable Context): llms.txt, AI crawler directives, and FAQ schema. See our llms.txt implementation guide.
Layer 5 (Multi-Source Authority): Cross-site entity linking, which we will cover in the architecture section below.

What We Built: Architecture Walkthrough

Our knowledge graph connects two domains: pixelmojo.io (the agency) and lloydpilapil.com (the founder's personal site). Here is how the pieces fit together.

CROSS-SITE ENTITY LINKING

Two domains, one knowledge graph, connected via sameAs and shared @id URIs

pixelmojo.io

Organization

Services (6)

Products (2)

Methodologies (3)

lloydpilapil.com

Person

KnowsAbout (7)

WorksFor

SameAs URIs

Shared Entities

Thread-Based Engineering

@type: DefinedTerm

AX Design

@type: DefinedTerm

GEO

@type: DefinedTerm

Lakbay AI

@type: SoftwareApplication

The Entity Registry

The core of the system is a TypeScript file (knowledge-graph.ts) that defines every entity as a structured object:

export const entities: Record<string, Entity> = {
  'thread-based-engineering': {
    id: 'thread-based-engineering',
    name: 'Thread-Based Engineering',
    type: 'Methodology',
    schemaType: 'DefinedTerm',
    description: 'Productivity and governance framework...',
    relatedEntities: ['ai-technical-debt', 'claude-code-development'],
    primaryPosts: ['thread-based-engineering-scaling-ai-development'],
    mentionedInPosts: ['vibe-coding-technical-debt-crisis-2026-2027'],
    keywords: ['thread-based-engineering', 'ai-governance'],
  },
  // ... 17 more entities
}

Each entity has:

A stable @id that becomes its URI in JSON-LD
primaryPosts and mentionedInPosts for manual relationship overrides
keywords that enable automatic tag-based matching (more on this below)
relatedEntities for the relationship graph

The Schema Engine

A resolver (schema-engine.ts) takes the entity registry and converts it into JSON-LD output. It does three things:

Generates DefinedTerm fragments for the global @graph (included on every page)
Auto-matches blog posts to entities based on tag overlap (2+ keyword matches = about, 1 match = mentions)
Enriches Article schemas with about and mentions references

Cross-Site Linking

The personal site (lloydpilapil.com) has its own structured data with a Person entity that includes:

{
  "@type": "Person",
  "sameAs": [
    "https://www.linkedin.com/in/lloydpilapil",
    "https://www.pixelmojo.io/author/lloyd-pilapil"
  ],
  "worksFor": {
    "@type": "Organization",
    "@id": "https://www.pixelmojo.io/#organization"
  },
  "knowsAbout": [
    "Thread-Based Engineering",
    "Generative Engine Optimization",
    "AX Design"
  ]
}

This is bidirectional: the pixelmojo.io Article schemas include an author reference that links back to the same Person @id. Two domains, one entity graph.

Dynamic llms.txt Integration

The Results: Real Analytics

We deployed the cross-site knowledge graph on February 12, 2026. Here is what GA4 showed in the first seven days.

WHAT THE DATA SHOWS

GA4 analytics for pixelmojo.io, 7 days after deploying the knowledge graph

Active Users

248+18.1%

Last 7 days vs previous period

New Users

229+21.8%

Last 7 days vs previous period

Referral Traffic

6Above forecast

GA4 predicted 1-5 users; actual exceeded range

Key Events (Direct)

193x predicted

GA4 predicted 0-6 events; actual was 19

Let us be specific about what these numbers mean:

What went up:

Active users increased 18.1% over the previous 7-day period (248 vs 210)
New users increased 21.8% (229 vs 188)
Referral traffic spiked above GA4's forecasted range: GA4 predicted 1-5 referral users, we got 6
Key events through the Direct channel hit 19, where GA4 predicted 0-6 (over 3x the upper bound of the forecast)

What we cannot claim:

We cannot isolate the knowledge graph's impact from other changes we made during the same period
We published new content and made technical SEO updates simultaneously
Referral and key event anomalies align with the deployment timeline, but that is correlation, not proof

We will continue monitoring and will update this post as the data matures.

Free Tool

Can AI Bots Find Your Content?

Test how GPTBot, Claude, Perplexity, and 11 other bots see your website. Checks robots.txt, structured data, llms.txt, and content accessibility.

Try the AI Crawl Checker

What Did Not Work

Build journal, Feb 2026

Other things that did not work as expected:

Over-connecting entities. Our first version had every entity related to every other entity. This diluted the signal. We trimmed relationships to only meaningful connections (5-6 per entity, not 12).
Manual post mapping. We started by manually adding every blog slug to entity primaryPosts arrays. This became stale within a week. The auto-matching system based on tag keywords was the fix.
Expecting immediate indexing. LLMs do not re-index on your schedule. Some of our entities may not be in any LLM's index yet. This is infrastructure that compounds, not a launch-day win.

The Playbook: How to Build Your Own Knowledge Graph

If you want to replicate this approach, here is the sequence that worked for us:

Audit your entities. List every methodology, product, service, and project that your business is known for. If it has a name and could be a Wikipedia article, it is probably an entity.
Choose your schema types. Organization and Person are mandatory. Add DefinedTerm for frameworks, SoftwareApplication for products, Service for offerings. Avoid types that require data you do not have (aggregateRating without real reviews, for example).
Create stable @id URIs. Each entity needs a permanent identifier. We use the pattern https://www.pixelmojo.io/#entity-id. These URIs do not need to resolve to their own page. They serve as identifiers that other schema nodes reference, not as links to fetch.
Define relationships. Map which entities relate to which. Keep it honest: 3-6 related entities per item is more useful than connecting everything to everything.
Build the auto-matcher. Define keywords for each entity. When a blog post's tags match two or more of an entity's keywords, the post is automatically linked to that entity with the about property. A single match links it with mentions instead. This keeps the system alive without manual maintenance.
Inject into every page. The @graph array with all entity DefinedTerm schemas should appear on every page, not just the homepage. Article pages additionally get about and mentions properties linking to relevant entities.
Connect your personal brand. If a founder or key person has their own domain, add sameAs and worksFor links that bridge the two sites. This creates the multi-source authority signal from Layer 5 of the stack.
Wire into llms.txt. If you have a dynamic llms.txt (and you should), generate the entity section from the same source of truth. One data source, multiple outputs.
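
The last step (one data source, multiple outputs) can be sketched as a small renderer that turns registry entries into a text section. This is a minimal sketch under assumptions: the EntitySummary shape, the renderEntitySection name, the "## Entities" heading, and the line layout are all illustrative, not the format our generator actually emits, and the second description is a placeholder.

```typescript
// Hypothetical subset of the entity registry fields needed for text output.
interface EntitySummary {
  name: string;
  schemaType: string;
  description: string;
}

// Render one markdown-style list line per entity under a single heading.
function renderEntitySection(entities: EntitySummary[]): string {
  const lines = entities.map(
    (e) => `- ${e.name} (${e.schemaType}): ${e.description}`
  );
  return ["## Entities", ""].concat(lines).join("\n");
}

// Sample data: names and types come from the article, descriptions are placeholders.
const sampleEntities: EntitySummary[] = [
  {
    name: "Thread-Based Engineering",
    schemaType: "DefinedTerm",
    description: "Productivity and governance framework...",
  },
  {
    name: "Lakbay AI",
    schemaType: "SoftwareApplication",
    description: "(placeholder description)",
  },
];

const entitySection = renderEntitySection(sampleEntities);
console.log(entitySection);
```

Because the same objects can also feed the JSON-LD output, adding an entity once updates both the schema markup and the llms.txt section.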

Where This Goes Next

The knowledge graph is infrastructure, not a finished product. Here is what we are building on top of it:

Analytics correlation tracking. We are building dashboards to track which entities appear in AI search citations over time, and to correlate that with the GA4 anomaly data.
Automated entity expansion. When we launch new products or define new methodologies, the knowledge graph should grow automatically from the same TypeScript source of truth.
Cross-platform verification. We plan to test whether the same entities get cited differently across ChatGPT, Perplexity, Claude, and Gemini, and to adjust the schema based on what each model responds to.

Continue the AI Search Playbook

SEO vs GEO vs AEO: The Complete Guide

Understand the three disciplines competing for AI search visibility

The GEO Playbook: Get Cited by ChatGPT, Perplexity, and Claude

Tactical guide to the content and technical signals that drive AI citations

Your llms.txt Is Already Stale. Here's How to Fix It.

Build a dynamic llms.txt that updates automatically with your content

About the Author

Lloyd Pilapil

Founder & AI Product Architect at Pixelmojo

Expertise

Agentic AI Systems, Multi-Agent Orchestration, AX Design, GEO & AI Search, Thread-Based Engineering, AI Product Development, Growth Marketing, UI/UX Design

LLMs Find You Through Entities, Not Keywords

TL;DR

LLMs use entity resolution (not keyword matching) to decide which sources to cite in generated responses
We built a knowledge graph with 18 entities across 2 domains (pixelmojo.io + lloydpilapil.com) connected via sameAs URIs
The LLM Visibility Stack has 5 layers: Entity Foundation, Knowledge Graph, Content Signals, Machine-Readable Context, Multi-Source Authority
Real analytics: +18.1% active users, +21.8% new users, referral traffic above GA4's forecast range, key events over 3x the upper bound of GA4's predicted range
Correlation is not causation. We made other improvements simultaneously, but the referral and key event anomalies line up with the knowledge graph deployment
You can build this with a TypeScript entity registry, a schema engine, and auto-mapping from blog post tags. No external tools required

The Problem: Why Traditional SEO Alone Falls Short for LLMs

But all of those posts focus on content and delivery mechanisms. None of them address the foundational layer: how LLMs actually resolve entities and decide what is authoritative.

Here is the gap we identified:

robots.txt and llms.txt tell crawlers what to index and summarize. They are access control and context delivery, not identity.
FAQ schema and Article markup describe individual pages. They do not define the entities those pages are about.
Topical authority through content clusters signals expertise, but it is implicit. LLMs have to infer your authority from content patterns rather than reading it directly from structured data.

How LLMs Actually Discover Sources

HOW LLMS DISCOVER SOURCES

Simplified RAG pipeline showing where structured data matters most

1. User Query: "What agency does AI product development?"
2. Retrieval: LLM searches its index for relevant documents
3. Entity Resolution (your data matters here): Matches structured data to resolve who/what entities are
4. Context Assembly (your data matters here): Ranks and combines sources by authority signals
5. Response Generation: Synthesizes answer with citations from top-ranked sources

Steps 3 and 4 are where knowledge graphs win.

The LLM Visibility Stack

THE LLM VISIBILITY STACK

Five layers that determine whether AI search engines cite your content

L5 Multi-Source Authority: Cross-site entity linking, sameAs connections, consistent signals across domains
L4 Machine-Readable Context: llms.txt, robots.txt AI directives, FAQ schema, structured data signals
L3 Content Signals: Authoritative citations, statistical claims, quotable passages, topical depth
L2 Knowledge Graph: Entity definitions, @graph enrichment, topic clusters, relationship mapping
L1 Entity Foundation: DefinedTerm schemas, Organization identity, Person profiles, sameAs URIs

Each layer amplifies the ones below it.

Layer 1: Entity Foundation

This is where most teams need to start, yet most skip straight to Layer 4 instead. The entity foundation defines the core things your site is about using schema.org types:

Organization with complete identity (name, description, url, logo, sameAs to social profiles)
Person entities for key authors with knowsAbout, jobTitle, and worksFor connections
DefinedTerm for proprietary methodologies or frameworks you have created
SoftwareApplication for products
Service for service offerings

Each entity gets a stable @id URI (like https://www.pixelmojo.io/#thread-based-engineering) that can be referenced from anywhere in your schema.
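
To make the @id pattern concrete, here is a minimal sketch of a Layer 1 @graph in which nodes point at each other by @id instead of repeating data. The entityId helper, the JsonLdNode type, and the "lloyd-pilapil" slug are illustrative assumptions; only the organization and thread-based-engineering fragments appear elsewhere in this post.

```typescript
const SITE = "https://www.pixelmojo.io";

// Build a stable fragment identifier from an entity slug.
function entityId(slug: string): string {
  return `${SITE}/#${slug}`;
}

// Loose node type: every node needs @type and @id, other properties vary.
type JsonLdNode = { "@type": string; "@id": string; [key: string]: unknown };

const organization: JsonLdNode = {
  "@type": "Organization",
  "@id": entityId("organization"),
  name: "Pixelmojo",
  url: SITE,
};

const founder: JsonLdNode = {
  "@type": "Person",
  "@id": entityId("lloyd-pilapil"), // hypothetical slug, for illustration only
  name: "Lloyd Pilapil",
  jobTitle: "Founder & AI Product Architect",
  worksFor: { "@id": organization["@id"] }, // a reference, not a copy of the node
  knowsAbout: ["Thread-Based Engineering", "AX Design"],
};

const methodology: JsonLdNode = {
  "@type": "DefinedTerm",
  "@id": entityId("thread-based-engineering"),
  name: "Thread-Based Engineering",
};

// One document, many nodes: the @graph array holds every entity.
const graphDocument = {
  "@context": "https://schema.org",
  "@graph": [organization, founder, methodology],
};

// This string is what would go inside a <script type="application/ld+json"> tag.
const jsonLd = JSON.stringify(graphDocument, null, 2);
console.log(jsonLd);
```

Referencing by @id keeps each entity defined exactly once, so a change to the Organization node propagates to everything that points at it.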

Layer 2: Knowledge Graph

This layer connects the entities from Layer 1 into a web of meaning. It is not enough to define entities in isolation. You need to express:

Which blog posts are about which entities (and, in reverse, which posts cover each entity)
Which blog posts mention which entities
Which entities are related to each other
How entities across different domains connect via sameAs
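
The "in both directions" part can be sketched as an inversion: given each post's about and mentions links, derive the list of posts per entity. The ArticleLinks and EntityCoverage types and the invertLinks function are hypothetical names for illustration; the slugs are the ones used in the registry example in this post.

```typescript
// Links as they would appear on an Article: entity ids the post is about or mentions.
interface ArticleLinks {
  slug: string;
  about: string[];
  mentions: string[];
}

// The reverse view: posts grouped per entity.
interface EntityCoverage {
  primaryPosts: string[];
  mentionedInPosts: string[];
}

function invertLinks(articles: ArticleLinks[]): Record<string, EntityCoverage> {
  const coverage: Record<string, EntityCoverage> = {};

  // Fetch the coverage record for an entity, creating it on first use.
  const entryFor = (id: string): EntityCoverage => {
    if (!Object.prototype.hasOwnProperty.call(coverage, id)) {
      coverage[id] = { primaryPosts: [], mentionedInPosts: [] };
    }
    return coverage[id];
  };

  for (const article of articles) {
    for (const id of article.about) entryFor(id).primaryPosts.push(article.slug);
    for (const id of article.mentions) entryFor(id).mentionedInPosts.push(article.slug);
  }
  return coverage;
}

const coverageByEntity = invertLinks([
  {
    slug: "thread-based-engineering-scaling-ai-development",
    about: ["thread-based-engineering"],
    mentions: [],
  },
  {
    slug: "vibe-coding-technical-debt-crisis-2026-2027",
    about: [],
    mentions: ["thread-based-engineering"],
  },
]);

console.log(coverageByEntity["thread-based-engineering"]);
```

In schema.org terms, about and mentions live on the Article, and subjectOf is the inverse of about if you also want to express the link from the entity side.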

Layers 3-5: Content, Machine-Readable Context, Multi-Source Authority

These layers build on the entity foundation. We covered them in detail across our GEO series:

Layer 3 (Content Signals): Authoritative citations, statistical claims, and topical depth. See our GEO playbook.
Layer 4 (Machine-Readable Context): llms.txt, AI crawler directives, and FAQ schema. See our llms.txt implementation guide.
Layer 5 (Multi-Source Authority): Cross-site entity linking, which we will cover in the architecture section below.

What We Built: Architecture Walkthrough

Our knowledge graph connects two domains: pixelmojo.io (the agency) and lloydpilapil.com (the founder's personal site). Here is how the pieces fit together.

CROSS-SITE ENTITY LINKING

Two domains, one knowledge graph, connected via sameAs and shared @id URIs

pixelmojo.io: Organization, Services (6), Products (2), Methodologies (3)

lloydpilapil.com: Person, KnowsAbout (7), WorksFor, SameAs URIs

Shared Entities: Thread-Based Engineering (@type: DefinedTerm), AX Design (@type: DefinedTerm), GEO (@type: DefinedTerm), Lakbay AI (@type: SoftwareApplication)

The Entity Registry

The core of the system is a TypeScript file (knowledge-graph.ts) that defines every entity as a structured object:

export const entities: Record<string, Entity> = {
  'thread-based-engineering': {
    id: 'thread-based-engineering',
    name: 'Thread-Based Engineering',
    type: 'Methodology',
    schemaType: 'DefinedTerm',
    description: 'Productivity and governance framework...',
    relatedEntities: ['ai-technical-debt', 'claude-code-development'],
    primaryPosts: ['thread-based-engineering-scaling-ai-development'],
    mentionedInPosts: ['vibe-coding-technical-debt-crisis-2026-2027'],
    keywords: ['thread-based-engineering', 'ai-governance'],
  },
  // ... 17 more entities
}

Each entity has:

A stable id that becomes the fragment of its @id URI in JSON-LD
primaryPosts and mentionedInPosts for manual relationship overrides
keywords that enable automatic tag-based matching (more on this below)
relatedEntities for the relationship graph
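
The registry excerpt does not show the Entity type itself. A plausible shape, inferred only from the fields in the excerpt, together with a small integrity check, might look like the sketch below. The SchemaType union, validateRegistry, the default cap of 6 related entities, and the second sample entry's name and description are assumptions for illustration.

```typescript
type SchemaType = "DefinedTerm" | "SoftwareApplication" | "Service";

// Field names mirror the registry excerpt; the types are inferred, not confirmed.
interface Entity {
  id: string;
  name: string;
  type: string;
  schemaType: SchemaType;
  description: string;
  relatedEntities: string[];
  primaryPosts: string[];
  mentionedInPosts: string[];
  keywords: string[];
}

// Report ids that disagree with their key, over-connected entities,
// and related ids that are not defined in the registry.
function validateRegistry(
  registry: Record<string, Entity>,
  maxRelated: number = 6
): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(registry)) {
    const entity = registry[key];
    if (entity.id !== key) {
      problems.push(`${key}: id "${entity.id}" does not match its registry key`);
    }
    if (entity.relatedEntities.length > maxRelated) {
      problems.push(`${key}: more than ${maxRelated} related entities`);
    }
    for (const related of entity.relatedEntities) {
      if (!Object.prototype.hasOwnProperty.call(registry, related)) {
        problems.push(`${key}: unknown related entity "${related}"`);
      }
    }
  }
  return problems;
}

const sampleRegistry: Record<string, Entity> = {
  "thread-based-engineering": {
    id: "thread-based-engineering",
    name: "Thread-Based Engineering",
    type: "Methodology",
    schemaType: "DefinedTerm",
    description: "Productivity and governance framework...",
    relatedEntities: ["ai-technical-debt", "claude-code-development"],
    primaryPosts: ["thread-based-engineering-scaling-ai-development"],
    mentionedInPosts: ["vibe-coding-technical-debt-crisis-2026-2027"],
    keywords: ["thread-based-engineering", "ai-governance"],
  },
  "ai-technical-debt": {
    id: "ai-technical-debt",
    name: "AI Technical Debt", // placeholder name
    type: "Methodology",
    schemaType: "DefinedTerm",
    description: "(placeholder description)",
    relatedEntities: ["thread-based-engineering"],
    primaryPosts: [],
    mentionedInPosts: [],
    keywords: [],
  },
};

const registryProblems = validateRegistry(sampleRegistry);
console.log(registryProblems);
```

With this two-entry sample the check flags claude-code-development as undefined, which is the kind of drift that is easy to miss when entities are added by hand.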

The Schema Engine

A resolver (schema-engine.ts) takes the entity registry and converts it into JSON-LD output. It does three things:

Generates DefinedTerm fragments for the global @graph (included on every page)
Auto-matches blog posts to entities based on tag overlap (2+ keyword matches = about, 1 match = mentions)
Enriches Article schemas with about and mentions references
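
The matching rule (2+ keyword matches = about, 1 match = mentions) can be sketched as follows. This is not our schema-engine.ts; matchPost, enrichArticle, the types, and the choice to let manual overrides win over keyword counts are assumptions, and the second entity's keywords and the "example-post" slug are made up for the example.

```typescript
interface MatchableEntity {
  id: string;
  keywords: string[];
  primaryPosts: string[];
  mentionedInPosts: string[];
}

interface PostLinks {
  about: string[];
  mentions: string[];
}

interface Ref {
  "@id": string;
}

interface EnrichedArticle {
  "@type": "Article";
  headline: string;
  about: Ref[];
  mentions: Ref[];
}

const BASE = "https://www.pixelmojo.io";

// Classify each entity for one post: manual overrides first, then tag overlap.
function matchPost(slug: string, tags: string[], entities: MatchableEntity[]): PostLinks {
  const tagList = tags.map((t) => t.toLowerCase());
  const links: PostLinks = { about: [], mentions: [] };
  for (const entity of entities) {
    const hits = entity.keywords.filter(
      (k) => tagList.indexOf(k.toLowerCase()) !== -1
    ).length;
    const isPrimary = entity.primaryPosts.indexOf(slug) !== -1;
    const isMentioned = entity.mentionedInPosts.indexOf(slug) !== -1;
    if (isPrimary || hits >= 2) links.about.push(entity.id);
    else if (isMentioned || hits === 1) links.mentions.push(entity.id);
  }
  return links;
}

// Turn matched entity ids into @id references on an Article node.
function enrichArticle(headline: string, links: PostLinks): EnrichedArticle {
  const ref = (id: string): Ref => ({ "@id": `${BASE}/#${id}` });
  return {
    "@type": "Article",
    headline,
    about: links.about.map(ref),
    mentions: links.mentions.map(ref),
  };
}

const matchableEntities: MatchableEntity[] = [
  {
    id: "thread-based-engineering",
    keywords: ["thread-based-engineering", "ai-governance"],
    primaryPosts: ["thread-based-engineering-scaling-ai-development"],
    mentionedInPosts: ["vibe-coding-technical-debt-crisis-2026-2027"],
  },
  {
    id: "ai-technical-debt",
    keywords: ["technical-debt", "vibe-coding"], // hypothetical keywords
    primaryPosts: [],
    mentionedInPosts: [],
  },
];

const taggedLinks = matchPost(
  "example-post", // hypothetical slug
  ["Thread-Based-Engineering", "ai-governance", "technical-debt"],
  matchableEntities
);
const enriched = enrichArticle("Example post", taggedLinks);
console.log(JSON.stringify(enriched, null, 2));
```

Here the post is about Thread-Based Engineering (two tag hits) and only mentions AI technical debt (one hit). A post with no matching tags can still be linked through the manual primaryPosts or mentionedInPosts arrays.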

Cross-Site Linking

The personal site (lloydpilapil.com) has its own structured data with a Person entity that includes:

{
  "@type": "Person",
  "sameAs": [
    "https://www.linkedin.com/in/lloydpilapil",
    "https://www.pixelmojo.io/author/lloyd-pilapil"
  ],
  "worksFor": {
    "@type": "Organization",
    "@id": "https://www.pixelmojo.io/#organization"
  },
  "knowsAbout": [
    "Thread-Based Engineering",
    "Generative Engine Optimization",
    "AX Design"
  ]
}

This is bidirectional: the pixelmojo.io Article schemas include an author reference that links back to the same Person @id. Two domains, one entity graph.
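
Because the link only works if both sites use exactly the same identifiers, a small consistency check is worth having. The sketch below is one way to do it under assumptions: the node interfaces, findLinkProblems, and the Person @id value are hypothetical (the excerpt above does not show the Person @id), while the organization @id and sameAs URLs are taken from it.

```typescript
interface Ref {
  "@id": string;
}

// Person node as published on the personal site.
interface PersonNode {
  "@type": "Person";
  "@id": string;
  sameAs: string[];
  worksFor: Ref;
}

// Article node as published on the agency site.
interface ArticleNode {
  "@type": "Article";
  headline: string;
  author: Ref;
}

// Report every place where the two sites disagree on an identifier.
function findLinkProblems(
  person: PersonNode,
  organizationId: string,
  articles: ArticleNode[]
): string[] {
  const problems: string[] = [];
  if (person.worksFor["@id"] !== organizationId) {
    problems.push("Person worksFor does not reference the Organization @id");
  }
  for (const article of articles) {
    if (article.author["@id"] !== person["@id"]) {
      problems.push(`"${article.headline}": author does not reference the Person @id`);
    }
  }
  return problems;
}

const ORGANIZATION_ID = "https://www.pixelmojo.io/#organization";

const personNode: PersonNode = {
  "@type": "Person",
  "@id": "https://lloydpilapil.com/#person", // hypothetical identifier
  sameAs: [
    "https://www.linkedin.com/in/lloydpilapil",
    "https://www.pixelmojo.io/author/lloyd-pilapil",
  ],
  worksFor: { "@id": ORGANIZATION_ID },
};

const articleNodes: ArticleNode[] = [
  { "@type": "Article", headline: "Consistent post", author: { "@id": personNode["@id"] } },
  { "@type": "Article", headline: "Drifted post", author: { "@id": "https://www.pixelmojo.io/#author" } },
];

const linkProblems = findLinkProblems(personNode, ORGANIZATION_ID, articleNodes);
console.log(linkProblems);
```

Running a check like this in a build step would catch the case where one site renames an identifier and the cross-site link silently breaks.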

Dynamic llms.txt Integration

The Results: Real Analytics

We deployed the cross-site knowledge graph on February 12, 2026. Here is what GA4 showed in the first seven days.

