
We Took Our Own Advice. Here Is Everything That Happened.
Throughout this series, we presented the data (Part 1), the framework (Part 2), and the tactical playbook (Part 3). Now we are showing our work.
This is not a theoretical exercise. We implemented every GEO tactic we recommended on our own site, pixelmojo.io. 39 blog posts. 15 service and product pages. A custom AI referrer tracking system. A 262-line llms.txt file. FAQPage schema on 37 posts.
This post documents exactly what we built, what we changed, and the lessons we learned doing it. No cherry-picked metrics. No fabricated case studies. Just the honest implementation story of a small AI product studio trying to get cited by the same platforms we build products on.
The Audit: Where We Started vs Where We Ended
Before implementing GEO, we ran a full audit of our site against the checklist from Part 3. The results were humbling. We had been publishing content for over a year without any AI-specific optimization.
| Area | Before | After |
|---|---|---|
| robots.txt | Basic robots.txt, no AI-specific rules | 6 AI bots explicitly allowed, training bots blocked, key pages specified |
| llms.txt | Did not exist | 262-line Markdown file with products, services, portfolio, URLs, use policy |
| AI capabilities factsheet | Did not exist | Explicit corrections for common AI misinterpretations of our services |
| FAQ schema | 0 posts with FAQPage schema | 37 of 39 posts with BlogFAQ components and FAQPage JSON-LD |
| Article schema | Basic metadata only | Full Article JSON-LD on every post (author, publisher, dates, images) |
| AI traffic tracking | No AI traffic tracking | Custom tracker detecting ChatGPT, Perplexity, Claude, Gemini, Copilot referrals |
| Content format | Mixed quality, no consistent format | Answer-first, TLDR boxes, comparison tables, 3,000-5,000 words per post |
| Content architecture | Standalone posts | 4 multi-part series (4+4+3+2 = 13 interlinked posts) |
The audit revealed eight categories where we had zero or partial coverage. None of this is unusual for a site that grew organically. But it meant that AI crawlers were either being blocked, or finding our content without the structural signals they need to select it as a citation source.
Change 1: Rebuilding robots.txt for the AI Crawler Ecosystem
Our original robots.txt was basic. It allowed Googlebot, disallowed admin pages, and that was about it. We had no AI-specific rules.
The rebuild took less than a day but required careful decisions. The key distinction we initially made was simple: allow search bots, block training bots. That position evolved in May 2026 into three rules: allow search bots, selectively allow the training bots that feed AI engines we want citing us, and block data brokers.
Here is the current logic.
Browsing bots, always allowed (these power AI search results):
- GPTBot and ChatGPT-User (OpenAI search)
- ClaudeBot and Claude-Web (Anthropic)
- PerplexityBot (Perplexity search)
- GoogleOther (Google AI features)
Training bots we now allow (these feed AI engines we want citing us):
- CCBot (Common Crawl, used by most LLMs as training input)
- Google-Extended (Gemini training)
- anthropic-ai (Anthropic training)
- Applebot-Extended (Apple Intelligence training)
Bots we still block (data brokers and adversarial scrapers):
- cohere-ai, Meta-ExternalAgent, FacebookBot, Bytespider, Diffbot, Omgili
Why the shift: when we first published this, we blocked all training bots to protect IP. As Pixelmojo's citation strategy matured, we recognized brand recognition in future model generations matters more than blanket IP protection for a new brand still earning ground in AI answers. The four allowed training bots feed the AI engines we want to be cited by; the blocked bots either resell scraped data (Diffbot, Omgili) or train models we have no strategic alignment with.
We also explicitly surfaced our most important pages. Instead of just Allow: /, we listed the specific directories that contain our highest-value content: /blogs/, /services/, /projects/, /about/, /pricing/, and our product pages (/vector, /hive).
Lesson learned: We initially missed OAI-SearchBot and Claude-SearchBot (the dedicated search bots), only listing the general-purpose bots. These search-specific bots are arguably the most important ones to allow, since they are the ones that power citation results. Always check OpenAI's crawler documentation and Perplexity's bot guide for the latest bot names.
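A minimal sketch of the allow/block logic described above, written in Python so the rules can be checked with the standard library's `urllib.robotparser`. The bot names and paths come from this post. The two-group layout and the helper names (`build_robots_txt`, `is_allowed`) are assumptions for illustration, not a copy of the file served on pixelmojo.io.

```python
from urllib.robotparser import RobotFileParser

# Bot names as listed in this post, including the two search bots we
# initially missed (OAI-SearchBot, Claude-SearchBot).
ALLOWED_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "Claude-Web", "Claude-SearchBot",
    "PerplexityBot", "GoogleOther",
    "CCBot", "Google-Extended", "anthropic-ai", "Applebot-Extended",
]
BLOCKED_BOTS = [
    "cohere-ai", "Meta-ExternalAgent", "FacebookBot",
    "Bytespider", "Diffbot", "Omgili",
]
# High-value paths surfaced explicitly instead of a bare "Allow: /".
KEY_PATHS = ["/blogs/", "/services/", "/projects/", "/about/",
             "/pricing/", "/vector", "/hive"]


def build_robots_txt() -> str:
    """Render one allow group and one block group (assumed layout)."""
    lines = [f"User-agent: {bot}" for bot in ALLOWED_BOTS]
    lines += [f"Allow: {path}" for path in KEY_PATHS]
    lines.append("")  # a blank line separates rule groups
    lines += [f"User-agent: {bot}" for bot in BLOCKED_BOTS]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt: str, bot: str, path: str) -> bool:
    """Ask the standard-library parser whether `bot` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, path)
```

Running `is_allowed(build_robots_txt(), "PerplexityBot", "/blogs/")` is a quick way to confirm that a bot you care about is not accidentally blocked after an edit. Real crawlers may interpret rules differently from Python's parser, so treat this as a sanity check rather than proof.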
Change 2: Creating llms.txt from Scratch
Before we started, our llms.txt file did not exist. Now it is a 262-line Markdown document that serves as a curated guide for AI systems trying to understand what Pixelmojo does.
The file includes:
- Company overview: What we build, how we work, our architecture model
- Products: Vector (lead qualification) and Hive (AI co-workers) with pricing, features, and URLs
- Services: Sprint packages, retainers, and custom development
- Technology stack: Exact frameworks, databases, AI tools we use
- Portfolio: Six projects with descriptions and links
- Featured content: Our blog series organized by topic
- Use policy: Explicitly stating what is allowed (citing with attribution) and what is not (model training)
The use policy section is worth highlighting. We explicitly told AI systems:
Allowed: Citing content with attribution. Including in AI-assisted answers with source links. Indexing for search with attribution.
Not allowed: Model training or fine-tuning on our content. Verbatim republishing. Commercial redistribution.
This policy is not legally binding, and unlike robots.txt, which well-behaved crawlers honor as a technical convention, nothing enforces it. But it sets a clear expectation for how we want our content used.
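To make the structure concrete, here is a small sketch that renders an llms.txt skeleton from the sections listed above. The section names and policy wording come from this post. The one-line summary, the `https://` URLs, and the `render_llms_txt` helper are assumptions for illustration, and the real 262-line file contains far more detail.

```python
# Section headings mirror the list above; the item text is abbreviated.
SECTIONS = {
    "Company overview": [
        "What we build, how we work, our architecture model",
    ],
    "Products": [
        "[Vector](https://pixelmojo.io/vector): lead qualification",
        "[Hive](https://pixelmojo.io/hive): AI co-workers",
    ],
    "Services": [
        "Sprint packages, retainers, and custom development",
    ],
    "Use policy": [
        "Allowed: citing content with attribution, including in "
        "AI-assisted answers with source links, indexing for search",
        "Not allowed: model training or fine-tuning, verbatim "
        "republishing, commercial redistribution",
    ],
}


def render_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an H1 title, a blockquote summary, and H2 bullet sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # The summary sentence is a placeholder, not our published copy.
    print(render_llms_txt("Pixelmojo", "A small AI product studio.", SECTIONS))
```

Generating the file from a data structure is optional. The point is that llms.txt is plain Markdown: a title, a short summary, and headed lists that an AI system can scan quickly.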
We also created an ai-capabilities-factsheet.txt that directly addresses a problem we noticed: AI systems sometimes describe Pixelmojo as "primarily a marketing/branding agency," which is incomplete. The factsheet explicitly corrects this misinterpretation with structured data about our full software development, AI, and infrastructure capabilities.
Time investment: About 2 hours for llms.txt, 1 hour for the capabilities factsheet.
Change 3: FAQ Schema on (Almost) Every Post
This was the highest-ROI change we made. Adding FAQPage JSON-LD schema to our blog posts took approximately 15 minutes per post, and the research from Part 3 shows pages with FAQ schema are 3.2x more likely to appear in Google AI Overviews.
We added FAQ sections to 37 of 39 blog posts (the two exceptions are very short posts that did not have natural FAQ material). Each FAQ section includes 6 to 10 questions with comprehensive answers, wrapped in both a visual BlogFAQ component for readers and FAQPage JSON-LD in the frontmatter for machines.
Every post also has Article JSON-LD with author, publisher, dates, and image information. This reinforces the E-E-A-T signals that AI platforms use for trust evaluation.
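For readers who have not written this markup before, the sketch below builds FAQPage and Article JSON-LD as Python dictionaries using the standard schema.org vocabulary. The helper names and every sample value (question, answer, author, dates, image URL) are hypothetical placeholders rather than our production data.

```python
import json


def faq_page_jsonld(faqs):
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def article_jsonld(headline, author, publisher, published, modified, image):
    """Build Article JSON-LD with author, publisher, dates, and image."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": published,
        "dateModified": modified,
        "image": image,
    }


if __name__ == "__main__":
    # Placeholder content for illustration only.
    faq = faq_page_jsonld([
        ("What is llms.txt?",
         "A Markdown file that gives AI systems a curated guide to a site."),
    ])
    print(json.dumps(faq, indent=2))
```

The serialized output is what goes inside a `<script type="application/ld+json">` element, or into frontmatter if your template renders that element for you.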
| Schema Type | Coverage | Implementation Time | Impact |
|---|---|---|---|
| FAQPage JSON-LD | 37/39 posts (95%) | ~15 min per post | 3.2x more likely in AI Overviews |
| Article JSON-LD | 39/39 posts (100%) | Built into template | E-E-A-T signal for all AI platforms |
| Organization schema | Service pages | ~30 min total | Brand entity recognition |
The investment math is simple: 37 posts at 15 minutes each = approximately 9 hours of work. If even one additional AI citation per month leads to a qualified lead, the ROI is significant for a B2B service business.
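The arithmetic behind those figures, as a short check. The numbers are the ones quoted in this section; nothing else is assumed.

```python
POSTS_WITH_FAQ = 37
TOTAL_POSTS = 39
MINUTES_PER_POST = 15

total_hours = POSTS_WITH_FAQ * MINUTES_PER_POST / 60
coverage = POSTS_WITH_FAQ / TOTAL_POSTS

print(f"{total_hours:.2f} hours")  # 9.25 hours
print(f"{coverage:.0%} coverage")  # 95% coverage
```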
Change 4: Series-Based Content Architecture
This was the most strategic change and the one we believe has the strongest compound effect.
Instead of publishing standalone blog posts on random topics, we reorganized our content into multi-part series. Each series covers a topic comprehensively across 2 to 4 interlinked posts.
Why series work for GEO: AI platforms treat interlinked content clusters as topical authority signals. Each post reinforces the others, creating compound citation potential. When ChatGPT or Perplexity encounters a 4-part series on AI search optimization (this series), it infers that the publisher has deep expertise on the topic. A single standalone post on the same topic does not carry the same authority weight.
Each series follows a deliberate structure:
- Part 1: Present the problem with data (hooks the reader and the AI)
- Part 2: Framework or comparison (establishes authority through analysis)
- Part 3: Tactical playbook (provides actionable, citable content)
- Part 4: Case study or results (demonstrates real-world application)
This mirrors how AI research workflows function. When a user asks ChatGPT "how do I optimize for AI search?", the AI breaks that into sub-queries: "what is the problem?", "what are the options?", "how do I do it?", "who has done it?". A 4-part series that answers each of these sub-queries has a structural advantage over a single post trying to cover everything.
We currently have 4 series totaling 13 posts, plus 26 standalone posts covering other topics. The series posts consistently perform better for topic-relevant queries.
Change 5: Building Custom AI Referrer Tracking
You cannot optimize what you cannot measure. We built a custom AI referrer tracking system that detects visits from specific AI platforms and categorizes them separately from generic referral traffic.
The tracker matches document.referrer against known AI domains:
| Domain | Source Category | Notes |
|---|---|---|
| chatgpt.com | ChatGPT | Also matches chat.openai.com |
| perplexity.ai | Perplexity | Answer engine traffic |
| claude.ai | Claude | Anthropic assistant traffic |
| gemini.google.com | Gemini | Formerly bard.google.com |
| bing.com/chat | Copilot | Microsoft Copilot chat |
This data feeds into our analytics alongside standard traffic sources. Without it, AI-referred visits would be categorized as generic "Referral" or (worse) "Direct" traffic in GA4, invisible in aggregate reports.
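The matching logic in the table can be sketched as follows. Our tracker reads `document.referrer` in the browser; this Python version applies the same idea to a referrer string, for example in a server-side log pipeline. The function name, the subdomain handling, and the path check for Copilot are assumptions about one reasonable implementation.

```python
from typing import Optional
from urllib.parse import urlparse

# (domain, required path prefix, source category), taken from the table above.
AI_REFERRERS = [
    ("chatgpt.com", "", "ChatGPT"),
    ("chat.openai.com", "", "ChatGPT"),
    ("perplexity.ai", "", "Perplexity"),
    ("claude.ai", "", "Claude"),
    ("gemini.google.com", "", "Gemini"),
    ("bing.com", "/chat", "Copilot"),  # only the chat path counts as Copilot
]


def classify_ai_referrer(referrer: str) -> Optional[str]:
    """Return the AI source category for a referrer URL, or None."""
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    for domain, path_prefix, source in AI_REFERRERS:
        # Match the domain itself or any subdomain (www.perplexity.ai),
        # but not lookalikes such as notchatgpt.com.
        host_matches = host == domain or host.endswith("." + domain)
        if host_matches and parsed.path.startswith(path_prefix):
            return source
    return None
```

Matching on the parsed hostname rather than a substring avoids false positives from lookalike domains, and returning `None` for everything else lets the visit fall through to your normal referral categories.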
Key insight from measurement: True AI influence on your traffic is likely 2-3x what analytics reports, according to Seer Interactive. Mobile app visits from ChatGPT, zero-click AI interactions where the AI summarizes your content without linking, and AI Overviews that do not pass referrer data all create blind spots. What you see in analytics is the floor, not the ceiling.
The Full Technical Stack
Here is the complete picture of what our GEO infrastructure looks like after implementation.
Crawl Layer
- robots.txt: AI search bots allowed, selected training bots allowed, data brokers blocked
- XML sitemap with accurate lastmod dates
- Key pages explicitly surfaced (/blogs, /services, /projects)
Discovery Layer
- llms.txt: 262 lines covering products, services, portfolio
- ai-capabilities-factsheet.txt: Corrects AI misinterpretations
- Structured URLs with descriptive slugs
Schema Layer
- FAQPage JSON-LD on 37/39 blog posts
- Article JSON-LD with author, publisher, dates
- Organization schema on service pages
Measurement Layer
- Custom AI referrer tracker (ChatGPT, Perplexity, Claude, Gemini)
- UTM parameter detection for chatgpt.com sources
- Session-level AI attribution in analytics
Every layer serves a distinct purpose. The crawl layer controls who can access our content. The discovery layer tells AI systems where our best content lives and how to interpret our business. The schema layer provides structured data that AI platforms can extract and present. The measurement layer tells us whether it is working.
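The UTM check in the measurement layer covers visits where the referrer is missing but the landing URL still carries a `utm_source` tag. A minimal sketch, assuming the tag value is `chatgpt.com`; the lookup table, function name, and example URL path are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Known utm_source values mapped to a source category (assumed mapping).
UTM_SOURCES = {"chatgpt.com": "ChatGPT"}


def ai_source_from_utm(landing_url: str) -> Optional[str]:
    """Return the AI source named by utm_source, or None if absent/unknown."""
    query = parse_qs(urlparse(landing_url).query)
    utm_source = query.get("utm_source", [""])[0].lower()
    return UTM_SOURCES.get(utm_source)
```

For example, `ai_source_from_utm("https://pixelmojo.io/blogs/example?utm_source=chatgpt.com")` returns `"ChatGPT"`, while a URL without the parameter returns `None`. Storing the result on the session rather than the pageview keeps the attribution intact as the visitor moves through the site.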
What We Did Not Do (And Why)
Transparency matters. Here is what we deliberately chose not to implement:
We did not buy AI citation monitoring tools. Tools like Otterly.AI ($29/month) and Profound ($499/month) exist, but for a small studio, manual testing across ChatGPT, Perplexity, and Claude once per week gives us sufficient signal. We will invest in tooling when our AI referral volume justifies it.
We did not create content specifically for AI training. Some GEO guides recommend creating "training-optimized content" designed to influence how AI models represent your brand. We think this is premature. We focused on making our existing content excellent, well-structured, and well-sourced. If the content is genuinely useful for humans, it will be useful for AI citations.
We did not chase every AI platform. We focused on ChatGPT, Perplexity, and Google AI Overviews because they account for 90%+ of AI search referral traffic. Claude, Gemini, and Copilot are growing, but the citation mechanics are similar enough that optimizing for the top three covers the rest.
6 Lessons We Learned
After implementing GEO across our entire site, here are the lessons that mattered most.
Series beat standalone posts
AI platforms treat interlinked content clusters as topical authority. 4-part series generate more citations than 4 disconnected posts.
FAQ schema is the highest-ROI change
15 minutes per post to add FAQPage JSON-LD. 3.2x more likely to appear in AI Overviews. We added it to 37 posts.
llms.txt is cheap insurance
Took 2 hours to create. Zero confirmed AI platforms read it yet. But 844,000 sites adopted it, and the downside is zero.
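Separate search bots from training bots
Our robots.txt originally blocked all training bots (CCBot, Google-Extended) while allowing search bots (GPTBot, PerplexityBot). We have since allowed a small set of training bots that feed the AI engines we want citing us, and we still block data brokers.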
Track AI traffic separately from day one
Our custom AI referrer tracker differentiates ChatGPT, Perplexity, Claude, and Gemini traffic. Without this, AI visits disappear into "Direct" or "Referral."
Answer-first writing is a discipline
Putting the conclusion in paragraph 1 feels unnatural. But data-backed content with statistics in the opening gets cited 67% more by AI. We restructured every post.
Lesson 1: Series Beat Standalone Posts
We cannot overstate this. Our 13 series posts consistently outperform our 26 standalone posts for topic-relevant AI queries. The inter-linking between parts creates a topical authority cluster that AI platforms recognize and reward.
If you take one structural change from this post, it should be: reorganize your content into series.
Lesson 2: FAQ Schema Is the Highest-ROI Change
At 15 minutes per post, adding FAQPage JSON-LD to your existing content is the most time-efficient GEO optimization available. The 3.2x increase in AI Overview likelihood is a documented, measured effect. Every blog post you publish without FAQ schema is leaving citations on the table.
Lesson 3: Separate Search Bots from Training Bots
Your robots.txt should make a clear distinction between AI bots that power search results and AI bots that collect training data. You can maintain full visibility in AI search while protecting your content from being used to train competing models.
Lesson 4: Measure AI Traffic from Day One
We made the mistake of not implementing AI referrer tracking immediately. By the time we did, we had months of AI-referred visits categorized as generic referral or direct traffic. The data was not lost, but it was much harder to reconstruct. Set up tracking before you start optimizing.
Lesson 5: Answer-First Writing Is a Discipline
Restructuring every blog post to put the conclusion in the opening paragraph is uncomfortable. As writers, we are trained to build arguments gradually. But the data is clear: answer-first content gets cited 67% more often. Every post in this series starts with the key insight, then supports it with evidence.
Lesson 6: GEO Is Not a One-Time Project
This was our biggest misconception. We initially treated GEO as a project with a start and end date. In reality, it is an ongoing discipline. Content freshness matters for AI citations. New AI crawlers appear regularly. Platform citation mechanics evolve. Your robots.txt, llms.txt, and content need periodic review.
We now review our GEO implementation monthly. The checklist takes 30 minutes.
If you want to replicate this audit process without building custom tracking, our free AI visibility tools cover bot access testing, citation tracking, Reddit monitoring, and llms.txt validation in a single workflow.
The Honest Assessment
We are not going to claim that GEO transformed our business overnight. That would be dishonest. Here is what we can say with confidence:
What we know worked: Our content is now structurally optimized for AI extraction. Every post has FAQ schema, answer-first structure, source citations, and verifiable statistics. Our robots.txt correctly manages AI crawler access. Our llms.txt gives AI systems a curated guide to our best content.
What we are still measuring: The long-term impact on AI referral volume and quality. Attribution is genuinely difficult, and as we noted throughout this series, true AI influence is likely 2-3x what analytics shows. We are tracking trends rather than absolute numbers.
What we would do differently: Start with measurement. We optimized content before we had proper AI traffic tracking in place, which meant we could not cleanly measure the before-and-after impact. If we were starting today, week 1 would be AI referrer tracking, week 2 would be technical infrastructure, and weeks 3-4 would be content optimization.
The case studies from Part 3 show what is possible: Go Fish Digital saw 3x lead growth in 3 months. The Rank Masters saw 8,337% ChatGPT referral growth. Seer Interactive documented 15.9% conversion rates from ChatGPT traffic versus 1.76% from Google organic.
We expect similar directional results as our implementation matures. The fundamentals are the same: make your content genuinely excellent, structurally accessible to AI, and measurable.
The Complete AI Search Playbook
This concludes our 4-part series: the data (Part 1), the framework (Part 2), the tactical playbook (Part 3), and this implementation case study (Part 4).
The shift from traditional SEO to AI search is not coming. It is here. The businesses that implement GEO now will build the topical authority and brand signals that compound over time. The fundamentals are simple: make your content genuinely useful, structurally accessible to AI, and measurable.
