Tue Apr 21 2026 · research-brief · source: deep-research
agent-seo · geo · llmo · aeo · content-strategy · sanity-check · distribution

Agent SEO — State of the Discipline (April 2026)

The question

Founder is rethinking content strategy. Today the engine is X-first: short posts, fast cadence, optimized for human social discovery. He’s considering inverting it to blog-first on raydata.co with explicit optimization for LLM citation — content that becomes the canonical source Claude / ChatGPT / Perplexity / Gemini / Google AI Mode pulls from. Thesis: agents are a meaningful share of the audience, and being cited by an LLM may be more valuable than ranking on Google for a human. He asked: (a) what is the industry calling this discipline, and (b) what techniques are documented as working?

The naming game

Five terms are in active circulation: GEO (generative engine optimization), AEO (answer engine optimization), LLMO (large language model optimization), AIO (AI optimization), and “agent SEO” itself. None has won outright; the field has not consolidated.

Winner-so-far: GEO. It has the academic anchor, the Wikipedia entry, the vendor tooling, and the search volume. AEO is a respectable second and the two are increasingly used as a compound. LLMO and AIO are losing.

The technique catalog

Twelve documented techniques, numbered below and organized by category. Evidence levels: Rigorous (controlled study), Anecdotal (vendor case studies, blog claims), Speculative (proposed but untested at scale), and Established (baseline hygiene that predates the discipline).

Content structure

  1. Answer capsules / front-loaded answer. A question-format H2 followed immediately by a 40–80 word direct answer, then supporting detail (see the sketch after this list). Why: extractable units map cleanly to what LLMs lift into their responses. Anecdotal — strong vendor consensus, no controlled study.
  2. Quotation Addition. Embedding direct quotes from authorities into your content. Princeton GEO measured +41% visibility (Position-Adjusted Word Count) on GEO-bench. Rigorous.
  3. Statistics Addition. Adding concrete numbers, percentages, dated figures. Princeton: +37–40%. Strongest in Law & Government, Debate, Opinion domains. Rigorous.
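
A minimal sketch of pattern 1; the heading and copy are placeholders, though the capsule here happens to describe the pattern itself:

```markdown
## What is an answer capsule?

An answer capsule is a 40-80 word direct answer placed immediately under a
question-format heading, before any supporting detail. The capsule is a
self-contained extractable unit: an LLM can lift it verbatim as the answer
to a matching query, while the rest of the article serves as evidence and
elaboration for readers who continue past it.
```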

Citation density and outbound linking

  4. Cite Sources. Linking out to primary sources. Princeton: +30–40% overall, and a striking +115% for rank-5 search results (vs −30% for rank-1) — meaning citation density helps underdogs disproportionately. Rigorous.
  5. Authoritative tone. Definitive claims, named frameworks, no hedging. Princeton: ~+10%. Smaller lift than expected. Rigorous.

Structured data

  6. Schema.org markup — FAQPage, Article, HowTo, Organization, Person. Vendor claim (geneo, Snezzi, others): pages with FAQPage schema are “3.2x more likely” to appear in AI responses. Anecdotal — the 3.2x figure traces to a single vendor study, not independently replicated.
  7. JSON-LD author entity. Named author with Person schema, linked to ORCID/LinkedIn/About page. Aligns with E-E-A-T signals Google already weights. Anecdotal but well-grounded.
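
A combined sketch of 6 and 7, assuming one JSON-LD block in the page head; the author name, date, ORCID, and every URL are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "headline": "Macro vs Micro Data Quality",
      "datePublished": "2026-04-21",
      "author": { "@id": "#author" },
      "publisher": { "@type": "Organization", "name": "raydata.co" }
    },
    {
      "@type": "Person",
      "@id": "#author",
      "name": "Example Author",
      "url": "https://raydata.co/about",
      "sameAs": [
        "https://www.linkedin.com/in/example-author",
        "https://orcid.org/0000-0000-0000-0000"
      ]
    },
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is macro data quality?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "The 40-80 word answer capsule goes here, verbatim from the page."
          }
        }
      ]
    }
  ]
}
</script>
```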

Direct LLM-control conventions

  8. llms.txt. Markdown file at site root pointing crawlers to canonical URLs (example after this list). Adopted by Anthropic, Stripe, Zapier, Cloudflare, Mintlify, and most dev-tool companies. But — 2025 CDN audits show GPTBot, ClaudeBot, and PerplexityBot do NOT actually fetch llms.txt. Useful primarily for IDE-side agents (Cursor, Aider, Continue) and human-prompted retrieval. Speculative on production-LLM impact; real for dev-tool surfaces.
  9. AI-crawler robots.txt rules. Allow GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended explicitly. Standard hygiene; not optimization. Established.
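
A minimal llms.txt sketch for technique 8, in the proposed format (H1 site name, one-line blockquote summary, link sections); the URLs are hypothetical:

```markdown
# raydata.co

> Opinionated reference content on data quality, AI agents, and the data-platform thesis.

## Reference

- [Macro vs micro data quality](https://raydata.co/blog/macro-vs-micro-data-quality): canonical explainer
- [The agent-deployer thesis](https://raydata.co/blog/agent-deployer-thesis): positioning piece
```

And the matching robots.txt hygiene from 9, using the user-agent tokens named above:

```text
# Explicit allow rules for the major AI crawlers (hygiene, not optimization)
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /
```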

Tone and framing

  10. Named concepts / citation-bait. Coining a term (“answer capsule,” “data observability,” “vibe coding”) so LLMs cite you whenever the term appears in a query. The most underrated lever in the catalog. Anecdotal — but Princeton’s “Quotation Addition” finding is the closest controlled analog.

Distribution

  11. Own-domain canonical, syndicate to high-authority surfaces. LLMs disproportionately cite Reddit, Wikipedia, GitHub, Stack Overflow, YouTube transcripts, and a small set of high-authority publications. Cross-posting summaries with canonical links back to your domain is the consensus play. Anecdotal — supported by SparkToro/Datos clickstream data on AI tool source mix.
  12. Reddit-and-forum participation. Several vendor studies (Profound, Scrunch) report Reddit threads as one of the top three citation sources for ChatGPT. Authentic participation, not spam, is the consensus tactic. Anecdotal but consistent across multiple sources.

What we actually know works (evidence base)

The Princeton/IIT-Delhi paper (Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, Deshpande — KDD 2024) is still the only rigorous, replicable study in the public record. The setup: GEO-bench (10,000 queries, 8K train / 1K validation / 1K test) tested against GPT-3.5-turbo + Google Search top-5 sources, with secondary validation on Perplexity. Headline: up to +40% visibility in generative responses from the right combination of techniques.
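
To make “visibility” concrete: the headline metric is a position-adjusted word count, in which a source earns credit for the words in answer sentences that cite it, discounted the later those sentences appear. A rough Python sketch; the linear decay here is an assumption for illustration, not the paper’s exact weighting:

```python
# Rough sketch of a position-adjusted word-count visibility score.
# Assumption: linear position decay; the GEO paper's exact weighting differs.

def visibility(sentences: list[tuple[str, set[str]]], source: str) -> float:
    """sentences: (text, cited_sources) pairs, in answer order."""
    n = len(sentences)
    total = weighted = 0.0
    for pos, (text, cited) in enumerate(sentences):
        words = len(text.split())
        total += words
        if source in cited:
            weighted += words * (1.0 - pos / n)  # earlier sentences count more
    return weighted / total if total else 0.0

answer = [
    ("Quoting authorities lifted visibility by 41%.", {"raydata.co"}),
    ("Other sources report smaller effects.", {"example.com"}),
]
print(visibility(answer, "raydata.co"))  # ~0.55: position-weighted share of the answer
```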

The honest read: the strongest techniques are surprisingly old-school content craft — quote authorities, cite numbers, link out to primary sources. The weakest are the things vendors love to sell — keyword stuffing (declined), unique-words optimization (no effect). The Authoritative-tone lift was milder than expected (~10%), suggesting “write definitively” matters less than “write with receipts.”

The most interesting finding for RDCO: rank-5 sources gained +115% from Cite Sources while rank-1 sources LOST 30%. GEO is asymmetrically a tool for underdogs. For a domain like raydata.co with no organic authority yet, this is the buy signal — citation-dense, statistic-heavy writing can punch above the domain’s PageRank weight in a way pure-SEO cannot.

Everything else in the catalog is anecdotal: vendor case studies, single-site experiments, before/after screenshots from agencies selling the service. Useful, but not evidence in the Princeton sense. The “FAQ schema = 3.2x lift” claim circulating across the geneo / Snezzi / Pixelmojo blogs all traces to one unreplicated vendor study.

Tooling landscape

Five tools worth knowing: Profound, Peec, and Otterly for monitoring; AthenaHQ and Scrunch for optimization. All paid; nothing meaningfully open-source yet.

Signal vs noise. Monitoring (Profound, Peec, Otterly) is real and useful — you cannot optimize what you cannot measure. “Optimization” tools (AthenaHQ, Scrunch) are largely automated content-rewrite loops; the optimization work itself is still editorial judgment. No open-source option of note yet — a gap.

The credible skeptic position

Two skeptics worth steelmanning.

Rand Fishkin (SparkToro). His clickstream work with Datos suggests perceived AI-search adoption is inflated 10–100x relative to measured usage; traditional Google search shows no measurable decline; and even if AI-tool usage keeps doubling, it reaches Google parity only in 6–10 years. He rejects the GEO/AEO/LLMO acronym sprawl and prefers “Search Everywhere Optimization.” His critique: the discipline is being sold faster than it is being used. If most queries still go to Google, and Google itself increasingly injects LLM answers into its own results, the right move is good content widely distributed, not LLM-citation-optimized content.

Google’s Danny Sullivan and Nick Fox. Their position: optimizing for AI search is the same as optimizing for traditional search. The crawl, the index, the ranking signals, E-E-A-T — same machinery whether the surface is a blue link, an AI Overview, or a ChatGPT citation. Partly self-interest, but empirically defensible: AI Overviews pull from the same retrieval index as classic search, and Perplexity’s web search uses Bing. The GEO techniques that work — cite authorities, add stats, link out, name concepts, structured data — are recognizably 2018-era SEO craft with a new label.

The synthesis: GEO is real, but 80% rebranded SEO with three knobs turned up (citation density, structured data, named-entity authority) and 20% genuinely new (llms.txt, share-of-answer monitoring). A founder who does excellent content with discipline captures most of the GEO upside without buying tooling. Risk of going all-in on “agent SEO as a category”: paying tooling tax and writing for the wrong audience while the actual usage curve is still years from Google parity.

Synthesis for RDCO

Recommendation: do GEO as a constraint on existing content craft — not as a new identity. The “agent SEO” framing is mostly correct as a thesis, mostly oversold as a product category, and the techniques that move the needle (Princeton’s findings) are writing-quality upgrades you should be doing regardless. Don’t reposition raydata.co as “the agent-SEO blog.” Reposition it as opinionated reference content on data quality, AI agents, and the data-platform thesis — and bake the GEO-effective patterns into the editorial standard.

The asymmetry in the Princeton data is the strategic insight: citation-dense writing helps low-authority domains 3–5x more than high-authority ones. raydata.co has effectively zero domain authority today — that’s not a disadvantage, it’s the exact regime where GEO’s underdog effect is largest. The leverage move: ship 15–25 reference pieces (~2,000–3,000 words) that name concepts already developed in the vault — data-quality frameworks, the agent-deployer thesis, macro-vs-micro data observability — with heavy quotation, statistics, and outbound primary citation. Each piece is a coined-concept lure: when an LLM is asked about the concept, raydata.co is the canonical reference because no one else holds that handle.

This also resolves the burnout pattern. X-first cadence punishes you for not posting daily; reference content compounds — a piece written in April 2026 is still cited in October 2027. The vault is raw material for ~30 such pieces; the work is editorial conversion, not net-new generation.

Concrete first three moves:

  1. Ship one keystone prototype. Best candidate: “macro vs micro data quality” or “the agent-deployer positioning.” 2,500 words, GEO pattern (question H2s, 40–80 word answer capsules, ≥3 sourced statistics, ≥5 outbound citations, FAQPage + Article + Person schema); a skeleton follows this list. Add llms.txt + author entity. Test article, not a brand pivot.
  2. Stand up monitoring before optimization. One tool — Profound at $99 or Peec.ai equivalent — seeded with 20–30 brand and concept queries. Can’t optimize without a baseline. Run 60 days before judging.
  3. Cadence: 2 reference pieces/month, not 8. Two production days per month. X stays as distribution surface for those pieces, not the primary channel. If 90 days of monitoring shows zero citation movement, the thesis is wrong for raydata.co’s current authority and the right move is Reddit/HN/podcast guesting until authority exists.
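
A skeleton for move 1, assuming the macro-vs-micro candidate; the headings are illustrative and the thresholds mirror the checklist above:

```markdown
# Macro vs Micro Data Quality

<!-- Article + Person + FAQPage JSON-LD in <head>; entry in /llms.txt -->

## What is the difference between macro and micro data quality?
(40-80 word answer capsule, then supporting detail)

## Why does macro data quality matter for AI agents?
(capsule; at least one sourced statistic)

## How do teams measure macro data quality?
(capsule; outbound citations to primary sources)

<!-- Across the piece: >=3 sourced statistics, >=5 outbound citations -->
```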

The skeptic case is also right enough to honor: don’t buy the agency narrative, don’t buy more than one tool, don’t write for the LLM at the expense of the human. The Princeton-validated techniques are good writing techniques; optimize for both audiences with the same content. GEO-specific surface area is a checklist, not a strategy.
