A practical guide to Answer Engine Optimization

What answer engines extract from a page, what they skip, and how to be the source they cite.

By Salesflyer Editorial · Blog · Apr 28, 2026 · Updated May 2, 2026 · 14 min

On this page

01 What AEO actually is
02 Why this is the work now
03 Schema is not decoration
04 Write the lede for the extractor
05 The full GEO findings
06 FAQ blocks, comparison tables, and freshness
07 AI bot access and machine‑readable files
08 Sins to avoid
09 What we ship by default

What AEO actually is

Answer Engine Optimization is the work of getting your page cited inside an AI‑generated answer, not ranked next to it. The surfaces that matter are Google's AI Overviews, ChatGPT's web search, Perplexity, Gemini, and Copilot. Each one decides what to extract from a page based on how the page is structured, how it's sourced, and whether the answer is sitting near the top or buried under a hero carousel and three feature accordions.

The shorthand we use is that SEO ranks pages and AEO extracts passages. A page that ranks fourth on Google can still be cited inside the AI Overview if its definition block is cleaner than the page that ranks first. The corollary is uncomfortable for teams that worked hard on rank: the rules for what gets cited are not the same as the rules for what gets ranked, and pretending they are is how a competitor with fewer backlinks and a tidier first paragraph edges you out of the answer.

AEO is also not a checklist of one‑time fixes. It's a posture you take across every page you publish. The teams that get cited regularly are the ones whose pages emit JSON‑LD by default, lead with the answer, cite their numbers, and ship a comparison table when the query implies one. The teams that don't are the ones running a quarterly AEO sweep that gets deprioritized the first time a paid campaign needs the calendar slot.

Why this is the work now

Most AEO advice on the open web is vibes. The Princeton GEO researchers ran nine optimization methods against a benchmark of real generative‑answer queries and measured what actually moves a page's visibility. Three of the numbers from that paper are worth pinning to the wall before any of the rest of this guide makes sense.

+115.1%: visibility lift for a site ranked fifth that started citing sources
+34%: visibility lift from adding statistics with sources
−9%: visibility loss from keyword stuffing, the inverse of the old SEO instinct

Source for all three: Aggarwal et al., Princeton GEO (KDD 2024)

The first number is the most useful one in the paper. It says the citation gap between rank one and rank five is not destiny. A fifth‑ranked page that adds sources can more than double its visibility, which is enough to leapfrog a higher‑ranked page that didn't. For any team with real numbers in their decks and no citations on their landing pages, that is the cheapest AEO win on the table.

The second number is the one most teams underweight. Adding statistics with sources lifted visibility by roughly a third on its own, before any other change. Most pricing pages do not include a single hard number with an attribution. Most case studies paraphrase a customer figure instead of quoting it. Both leave a measurable amount of citation lift on the floor.

The third number is the one that surprises old SEO hands. Keyword stuffing was a slow‑decay sin in classic SEO. In AEO it is a fast‑acting one. The model recognizes the pattern and downweights the page. Anything that reads like a writer trying to game a search engine reads to the answer engine the same way, and the answer engine has the receipts.

These numbers also explain why AEO became a category at all. Discovery moved. Close to half of Google's queries now trigger an AI Overview, per BrightEdge's tracker, and ChatGPT crossed 900 million weekly active users in February 2026. When that much attention sits inside generative answers, the rules above stop being a curiosity and start being the playbook.

Schema is not decoration

Structured data is how an answer engine knows what's canonical, what's supporting, and what's narrative. A page without schema isn't penalized in some abstract SEO score. It's harder to extract from, which is now the whole game on AI Overviews and most AI search results. Missing schema is the AEO equivalent of shipping a page with no headings.

The schemas that actually carry weight on landing pages are smaller than most teams think. You don't need every Schema.org type. You need a handful, applied consistently, and you need them on every publish.

The schemas worth shipping

Article or BlogPosting on every editorial page, with a real author, a publish date, and an updated date that actually moves when you edit the page.
FAQPage on any page with a FAQ block, with the questions phrased the way real people ask them and the answers self‑contained enough to extract without context.
HowTo on any step‑by‑step page, with each step labeled separately and a clear outcome at the end.
Product on every product or pricing page, with a price, an offer, and a real availability flag instead of placeholder text.
BreadcrumbList on every page that sits inside a section, so engines know where the page lives in your information architecture.
Organization once, on the home page or root layout, with sameAs links to your real social and review profiles.
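
To make the first item in that list concrete, here is a minimal Python sketch that builds a BlogPosting payload and wraps it in the script tag a page would carry. The property names (headline, author, datePublished, dateModified) are standard Schema.org; the function names, the author name, and the example URL are assumptions for illustration, not a description of how any particular CMS does it.

```python
import json
from datetime import date


def blog_posting_jsonld(headline: str, author: str, url: str,
                        published: date, modified: date) -> dict:
    """Build a Schema.org BlogPosting object as a plain dict."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "mainEntityOfPage": url,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


def as_script_tag(payload: dict) -> str:
    """Serialize the payload into a JSON-LD script element."""
    body = json.dumps(payload, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


post = blog_posting_jsonld(
    headline="A practical guide to Answer Engine Optimization",
    author="Jane Example",                      # hypothetical author
    url="https://example.com/blog/aeo-guide",   # hypothetical URL
    published=date(2026, 4, 28),
    modified=date(2026, 5, 2),
)
print(as_script_tag(post))
```

The date check is the part worth keeping even if everything else changes: an updated date that predates the publish date is exactly the kind of inconsistency that makes structured data hard to trust.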

Where teams break this is in maintenance, not authoring. The first FAQ block ships with FAQPage schema. The second FAQ ships without, because the page was a quick promo. The third one ships with the wrong question text because somebody edited the visible copy and forgot the JSON‑LD. By the end of a quarter the structured data on the site is a patchwork that no engine fully trusts. The fix is to emit schema as a side effect of composition rather than a hand‑maintained block, which is the entire point of treating composition as code instead of as a builder canvas.
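
One way to picture "schema as a side effect of composition" is a FAQ block whose visible markup and FAQPage JSON‑LD are both rendered from the same list of items, so the two cannot drift apart. The sketch below is an assumption about how such a component could look, not Salesflyer's actual implementation; `FaqItem`, `compose_faq_block`, and the HTML structure are hypothetical names and choices.

```python
import json
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class FaqItem:
    question: str
    answer: str


def render_faq_html(items: list[FaqItem]) -> str:
    """Visible copy: one heading/paragraph pair per question."""
    parts = [
        f"<h3>{escape(item.question)}</h3>\n<p>{escape(item.answer)}</p>"
        for item in items
    ]
    return '<section class="faq">\n' + "\n".join(parts) + "\n</section>"


def render_faq_jsonld(items: list[FaqItem]) -> dict:
    """Structured data built from the very same items."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": item.question,
                "acceptedAnswer": {"@type": "Answer", "text": item.answer},
            }
            for item in items
        ],
    }


def compose_faq_block(items: list[FaqItem]) -> str:
    """Return markup plus JSON-LD so neither can ship without the other."""
    payload = json.dumps(render_faq_jsonld(items), indent=2)
    payload = payload.replace("</", "<\\/")  # keep the script element intact
    return (
        render_faq_html(items)
        + '\n<script type="application/ld+json">\n'
        + payload
        + "\n</script>"
    )
```

With this shape, editing a question means editing one `FaqItem`. There is no second copy of the text for somebody to forget.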

Write the lede for the extractor

Extraction windows are short. The model that summarizes your page does not read it the way a human does. It pulls the first defensible chunk of text under the heading that matches the query and uses that. If your first paragraph is a brand statement, that's what gets cited. If your first paragraph defines the term and states the claim, you have a chance.

The rule we follow on Salesflyer is that the first 60 to 80 tokens of any explainer page have to carry the answer. Not the setup. Not the brand voice. The answer. The next paragraph can have the brand voice and the proof and the texture. The first one is for the extractor.
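
A rule like this is easy to lint. The following Python sketch counts how many tokens a lede spends before it defines its term and compares that to a budget. It is a rough heuristic under stated assumptions: tokens are approximated with a word‑and‑punctuation regex rather than any model's real tokenizer, a "definition" is taken to mean the term followed by a verb such as "is" or "means", and all function names are hypothetical.

```python
import re
from typing import Optional

# Verbs that usually introduce a definition; extend to taste.
DEFINING_VERBS = r"(?:is|are|means|refers to)"


def approx_tokens(text: str) -> list[str]:
    """Very rough tokenizer: words (hyphens and apostrophes kept) and punctuation."""
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", text)


def tokens_until_definition(lede: str, term: str) -> Optional[int]:
    """Tokens consumed up to and including '<term> is ...', or None if absent."""
    pattern = re.compile(
        re.escape(term) + r"\s+" + DEFINING_VERBS + r"\b", re.IGNORECASE
    )
    match = pattern.search(lede)
    if match is None:
        return None
    return len(approx_tokens(lede[: match.end()]))


def lede_carries_answer(lede: str, term: str, budget: int = 80) -> bool:
    """True when the definition starts inside the first `budget` tokens."""
    used = tokens_until_definition(lede, term)
    return used is not None and used <= budget


good = (
    "Answer Engine Optimization is the work of getting your page cited "
    "inside an AI-generated answer, not ranked next to it."
)
bad = "We are a passionate team of builders. " * 20 + good

print(lede_carries_answer(good, "Answer Engine Optimization"))
print(lede_carries_answer(bad, "Answer Engine Optimization"))
```

The first lede defines its term immediately. The second buries the same sentence under brand copy and falls outside the budget.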

What an extractable lede looks like

Open with a definition. 'Answer Engine Optimization is the work of getting your page cited inside an AI‑generated answer, not ranked next to it.' That sentence works as a snippet on its own.
State the scope. Which surfaces, which queries, which audience. The extractor wants to know whether this passage is the right fit before it lifts it.
Give one concrete example or proof point. Specifics survive extraction better than adjectives. 'Cited by Perplexity for the query AEO checklist' carries more weight than 'highly trusted source.'
Stop. The hero section is not where you list every feature. It's where you earn the right to be quoted.

The same rule applies inside the body. Every H2 should be answerable in the first paragraph that follows it. If the H2 says 'How does FAQPage schema work,' the next paragraph explains how it works in two sentences before the supporting paragraphs go deeper. This sounds like a writing rule. It's actually an extraction rule. The pages that get quoted are the ones that put the answer where the model looks first.

The full GEO findings

The three headline numbers earlier in this guide are the strongest signals from the Princeton paper, but they are not the only ones. The researchers tested nine optimization methods in total. Six of them are worth knowing as a working AEO writer, because they tell you what to do when you've already added the obvious citations and need the next half‑percent.

| Optimization method | Visibility change | How to apply |
| --- | --- | --- |
| Cite sources | +29% on average; +115.1% for sites ranked fifth | Add authoritative references with links. Highest absolute gain for mid‑ranked pages. |
| Add statistics | +34% | Specific numbers with sources beat adjectives consistently. |
| Add quotations | Positive lift across queries | Expert quotes with name and title. Underused on most landing pages. |
| Authoritative tone | Positive, smaller than the top three | Confident, sourced writing without hedging filler. |
| Fluency optimization | Positive across most query types | Cleaner prose, fewer subordinate clauses, shorter sentences. |
| Keyword stuffing | −9% | Actively reduces visibility. The opposite of the old SEO instinct. |

The middle three rows are the ones to lean on once your sourcing is clean. Adding expert quotations with name and title is underused on landing pages, especially in B2B SaaS, where founder commentary or analyst attribution is sitting in a Slack channel waiting to be quoted. Authoritative tone is what's left after you cut the hedging filler from your prose: 'we believe' and 'one of the leading' and 'in many cases' all signal to the model that you don't trust your own claim. Fluency optimization is unglamorous but quietly powerful, because the engine's underlying summarizer is itself a language model and it picks the cleaner sentence every time.

The honest takeaway from the paper is that the cheap wins are still on the table. Most landing pages we audit do not cite a single source. Most pricing pages do not include a hard number with an attribution. Most FAQs paraphrase a real claim instead of quoting it. A team that fixes those three habits on every publish will outrun a competitor that still treats AEO as a quarterly content audit.

FAQ blocks, comparison tables, and freshness

Three formats over‑perform on AEO surfaces by enough margin to be worth treating as defaults rather than enhancements. They are FAQ blocks with real schema, comparison tables on competitive queries, and visible freshness stamps that the writer can prove are accurate.

FAQ blocks the way they work in 2026

A FAQ block used to be a desperate SEO gambit shoved at the bottom of every page. The pattern still works, but the rules changed. The FAQPage markup tells the engine which sentence answers which question, so a clean answer to a real customer question reads as a self‑contained snippet. Vague answers are skipped. Hedged answers are skipped. Answers that don't actually answer the question are skipped. The FAQs that get cited are the ones that read like they were written for a curious customer who already searched for that exact phrasing.

Use the questions a real customer would type, not the questions a marketer wishes they would type. Pull them from sales calls, support tickets, and Reddit threads in your space.
Answer in two to four sentences. Long enough to be useful, short enough to extract cleanly.
Include the entity name in the answer where it makes sense. The model needs to know which product the answer applies to.
Match the FAQPage JSON‑LD to the visible copy exactly. If you edit one, edit the other.
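
The four rules above can be checked mechanically before a page ships. This Python sketch is one possible lint, with assumptions labeled: sentence counting is a punctuation heuristic that will miscount abbreviations, the entity‑name check is advisory because the rule only applies "where it makes sense", and `lint_faq` is a hypothetical helper name.

```python
import re


def sentence_count(text: str) -> int:
    """Count sentences by terminal punctuation; a heuristic, not a parser."""
    return len(re.findall(r"[.!?]+(?=\s|$)", text.strip()))


def lint_faq(visible: dict[str, str], jsonld: dict, entity: str) -> list[str]:
    """Compare visible question/answer pairs with FAQPage JSON-LD."""
    problems: list[str] = []
    structured = {
        q["name"]: q["acceptedAnswer"]["text"]
        for q in jsonld.get("mainEntity", [])
    }
    for question, answer in visible.items():
        # Rule: the JSON-LD must match the visible copy exactly.
        if structured.get(question) != answer:
            problems.append(f"JSON-LD out of sync: {question!r}")
        # Rule: answer in two to four sentences.
        n = sentence_count(answer)
        if not 2 <= n <= 4:
            problems.append(f"{n} sentence(s), want 2-4: {question!r}")
        # Advisory: name the entity so the answer stands alone.
        if entity.lower() not in answer.lower():
            problems.append(f"advisory, entity not named: {question!r}")
    # Questions that exist only in the structured data.
    for question in structured.keys() - visible.keys():
        problems.append(f"JSON-LD question not on page: {question!r}")
    return problems
```

An empty list means the block passes. Anything else is a string a reviewer can act on, which is the point of running it on every publish rather than once a quarter.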

Comparison tables on competitive queries

Comparison content is the highest‑citation format on AEO surfaces by a wide margin. AI Overviews quote tables. ChatGPT links to comparison pages. Perplexity often picks a vendor‑comparison page over the vendor's own home page when a user asks 'X vs Y.' If you sell into a market with named competitors, the comparison page is not optional. It's the page most likely to be cited for the highest‑intent query you have.

The catch is that the table has to be honest. AI systems penalize obviously biased comparisons. A row that says 'their product is bad and ours is great' gets ignored. A row that says 'Unbounce ships popups; Salesflyer does not' gets quoted, because it tells the reader something specific they can verify.

Freshness stamps that aren't lying

Every page should carry a 'Last updated' date the reader can see and the schema can read. Engines weight recency. They also notice when the date in the schema disagrees with the byline on the page, or when the date moves every week without any actual edit. Move the stamp when the content moves. Keep it still when the content is still. The credibility of the freshness signal lives in the consistency, not in the recency.
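
"Move the stamp when the content moves" can be enforced by tying the date to a fingerprint of the reader‑visible text. The Python sketch below is a minimal illustration under assumptions: the fingerprint is a SHA‑256 of whitespace‑normalized body text, and `PageStamp`, `next_stamp`, and the stamp wording are hypothetical. The visible stamp and the schema date are both read from the same record, so they cannot disagree.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PageStamp:
    content_hash: str
    last_updated: date


def content_fingerprint(body_text: str) -> str:
    """Hash the visible text, ignoring whitespace-only differences."""
    normalized = " ".join(body_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_stamp(previous: Optional[PageStamp], body_text: str,
               today: date) -> PageStamp:
    """Keep the old date unless the content itself changed."""
    fingerprint = content_fingerprint(body_text)
    if previous is not None and previous.content_hash == fingerprint:
        return previous  # redeploy without an edit: the stamp stays still
    return PageStamp(content_hash=fingerprint, last_updated=today)


def visible_stamp(stamp: PageStamp) -> str:
    """Text shown to the reader."""
    return f"Last updated {stamp.last_updated.isoformat()}"


def schema_date_modified(stamp: PageStamp) -> str:
    """Value for dateModified in the page's JSON-LD."""
    return stamp.last_updated.isoformat()
```

A deploy that only reflows whitespace returns the previous stamp untouched. A real edit produces a new hash and, with it, a new date.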

AI bot access and machine‑readable files

If your robots.txt blocks the AI crawlers, those platforms can't cite you. This is the easiest AEO mistake to make and the easiest to fix, and it's worth checking on every site you maintain because the default snippets that ship with most CMS templates are out of date.

The bots to allow on purpose

GPTBot and ChatGPT‑User for OpenAI
PerplexityBot for Perplexity
ClaudeBot and anthropic‑ai for Anthropic
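Google‑Extended for Gemini. It is a robots.txt control token rather than a separate crawler, and it does not govern AI Overviews, which draw on the regular Googlebot crawl.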
Bingbot for Microsoft Copilot

The trade‑off is real. Allowing these bots means your content can be used in training and in retrieval. Blocking them keeps the content out of training and out of citation. If the worry is training, block CCBot (Common Crawl), whose archive is a widely used source of training data, and leave the search‑oriented bots alone. That addresses much of the training concern at none of the citation cost.
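
Here is a small Python sketch of that policy: it writes a robots.txt that allows the bots listed above, blocks CCBot, and then checks the result with the standard library's `urllib.robotparser`. The user‑agent tokens are taken from the list in this section and should be confirmed against each vendor's current crawler documentation; `build_robots_txt` and the example URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Tokens from the list above; verify against each vendor's documentation.
ALLOWED_AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "PerplexityBot",
    "ClaudeBot", "anthropic-ai", "Google-Extended", "Bingbot",
]
BLOCKED_AGENTS = ["CCBot"]  # Common Crawl, per the trade-off described above


def build_robots_txt(allowed: list[str], blocked: list[str]) -> str:
    """One group for allowed agents, one for blocked, then a default group."""
    lines = [f"User-agent: {agent}" for agent in allowed]
    lines += ["Allow: /", ""]
    lines += [f"User-agent: {agent}" for agent in blocked]
    lines += ["Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Ask the stdlib parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots = build_robots_txt(ALLOWED_AI_AGENTS, BLOCKED_AGENTS)
print(robots)
print(can_fetch(robots, "GPTBot", "https://example.com/pricing"))
print(can_fetch(robots, "CCBot", "https://example.com/pricing"))
```

Running the same `can_fetch` check against the robots.txt a site already serves is a quick way to find out whether an old template is blocking a bot nobody meant to block.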

Two files agents look for

AI agents are starting to act as buyers. They evaluate vendors on behalf of users before a human ever lands on the home page. Two small files at the site root make that evaluation work in your favor instead of against you.

/llms.txt is a short context file that tells an LLM what your product does, who it's for, and where the canonical pages live. The spec is at llmstxt.org and the file is plain Markdown.
/pricing.md or /pricing.txt is a structured rendering of your pricing plans, units, limits, and contact paths. An agent that can't parse a JavaScript‑rendered pricing page will skip you. An agent that can read a Markdown file will quote it.

Neither file replaces the marketing page. They run alongside it. The page is for the human reader. The file is for the agent that reads twenty vendor pages in a minute and then hands a shortlist back to a human. If you sell into a buyer who increasingly delegates the early funnel to an assistant, those files are the cheapest competitive moat available right now.
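
Both files are simple enough to generate from config rather than maintain by hand. The Python sketch below renders an llms.txt in the shape the llmstxt.org proposal describes (an H1 title, a blockquote summary, then H2 sections of Markdown links) and a pricing.md as a Markdown table. There is no published spec for pricing.md, so that layout is purely an assumption, as are the class names, the company "ExampleCo", and every plan and price shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    title: str
    url: str
    note: str = ""


@dataclass(frozen=True)
class Plan:
    name: str
    price_per_month_usd: int
    limits: str


def render_llms_txt(name: str, summary: str,
                    sections: dict[str, list[Link]]) -> str:
    """H1 title, blockquote summary, then one H2 link list per section."""
    out = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        out += ["", f"## {heading}", ""]
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            out.append(f"- [{link.title}]({link.url}){suffix}")
    return "\n".join(out) + "\n"


def render_pricing_md(product: str, plans: list[Plan], contact: str) -> str:
    """Hypothetical layout: a plain Markdown table plus a contact line."""
    out = [
        f"# {product} pricing",
        "",
        "| Plan | Price (USD/month) | Limits |",
        "| --- | --- | --- |",
    ]
    for plan in plans:
        out.append(f"| {plan.name} | {plan.price_per_month_usd} | {plan.limits} |")
    out += ["", f"Contact: {contact}"]
    return "\n".join(out) + "\n"


llms = render_llms_txt(
    "ExampleCo",  # hypothetical company
    "Landing pages composed as code.",
    {"Docs": [Link("Pricing", "https://example.com/pricing.md", "plans and limits")]},
)
pricing = render_pricing_md(
    "ExampleCo",
    [Plan("Starter", 49, "5 pages"), Plan("Team", 199, "50 pages")],  # made-up plans
    "sales@example.com",
)
print(llms)
print(pricing)
```

Generating both files from the same config that drives the pricing page is what keeps them current. A hand‑edited copy will eventually quote a price the page no longer shows.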

Sins to avoid

Some AEO failures are common enough to call out by name. None of these are exotic. Most of them are habits left over from an older SEO instinct that doesn't translate.

Repeating your H1 verbatim as the JSON‑LD headline. The JSON‑LD headline is what the engine reads. The H1 is what the human reads. They don't have to be identical, and when they are, the engine often picks a worse summary than it would have written for itself.
Hiding the value proposition behind a hero carousel. The model reads markup order, not visual order. A carousel that animates between three slides ships as three sequential blocks of body copy, and only the first one gets treated as the lede.
Putting numbers that matter inside images. Pricing screenshots, customer counts on a banner, the chart on a results page. Engines don't OCR your hero image. If a number is load‑bearing, it has to be in HTML with a citation next to it.
Shipping a page with more CSS than HTML by weight. A page that's 80 percent layout machinery and 20 percent content is also 80 percent noise from an extraction standpoint. Cut the chrome.
Generic FAQs nobody actually asks. 'What is your product?' is not a real customer question, and answering it gets you no citations. Pull the questions from sales calls, support tickets, and the Reddit threads where your buyers complain.
Fake 'Last updated' stamps that auto‑move on every deploy. The model notices, and so do readers who happen to check the same page twice. Move the stamp when the content moves, and only then.
Gating the most authoritative content. The case study with the best numbers, the methodology page with the real data, the comparison page with the honest table. Those are the pages most likely to be cited. A login wall in front of any of them is a self‑inflicted citation cap.
Treating AEO as a quarterly sweep. Citations decay. Statistics get dated. New schemas matter. A team that runs a once‑a‑quarter audit will lose ground to a team that ships AEO defaults on every publish.
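
Two of the sins above, the carousel lede and the chrome‑heavy page, can be spotted with a crude audit of the served HTML. The Python sketch below uses the standard library's `html.parser` to pull reader‑visible text in markup order, then reports what share of the document's bytes that text represents and which text block comes first. It is a heuristic under assumptions: it only sees inline styles and scripts, not external files, and the function names are hypothetical. No particular threshold is implied by the source.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect reader-visible text, skipping <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(" ".join(data.split()))


def _visible_chunks(html: str) -> list[str]:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


def content_share(html: str) -> float:
    """Visible-text bytes divided by total document bytes."""
    text_bytes = len(" ".join(_visible_chunks(html)).encode("utf-8"))
    total_bytes = len(html.encode("utf-8"))
    return text_bytes / total_bytes if total_bytes else 0.0


def first_text_block(html: str) -> str:
    """The first visible text in markup order, which is what reads as the lede."""
    chunks = _visible_chunks(html)
    return chunks[0] if chunks else ""
```

If `first_text_block` returns a carousel caption instead of your definition, the extractor is probably seeing the same thing.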

What we ship by default

Salesflyer composes pages with AEO defaults baked in. JSON‑LD is emitted alongside the layout, not by a plugin. FAQPage and HowTo schemas are written when the page contains those blocks. The 'Last updated' stamp moves when the content moves and stays still when it doesn't. Robots.txt allows the search‑oriented AI bots out of the box. The /llms.txt and /pricing.md files are generated from the brand kernel and the pricing config so they stay current without a hand edit.

None of those are checkboxes a marketer has to remember. They are properties of the kernel that composes the page. If you want to see what that looks like in production, the demo at /demo walks through a publish and shows the JSON‑LD, the schema scoring, and the AEO posture for the page that just shipped.

Frequently asked

What's the difference between SEO and AEO?

SEO is the practice of getting a page ranked. AEO is the practice of getting a page cited inside an AI answer. The two overlap (good headings and good schema help both) but the rules are not identical. A page can rank fourth on Google and still be the source the AI Overview quotes, if its definition block is cleaner than the page that ranks first. SEO optimizes the link. AEO optimizes the passage.

Does AEO replace SEO?

No. Traditional ranking signals still matter, especially for AI Overviews, which lean heavily on top‑ranked pages. AEO sits on top of SEO, not next to it. The order is: ship the SEO basics (crawlable, fast, well‑linked), then add the AEO defaults (schema, extractable lede, cited statistics, FAQ blocks, freshness stamps, allowed AI bots). Skipping either layer leaves citations on the table.

Which schemas matter most for landing pages?

Article or BlogPosting on editorial pages, FAQPage on any page with a FAQ block, HowTo on step‑by‑step pages, Product on pricing or product pages, BreadcrumbList on any page inside a section, and Organization once on the root layout with sameAs links to your real profiles. You don't need the long tail of Schema.org types. You need those six applied consistently on every publish.

Why do FAQ blocks still work?

Because the FAQPage markup tells the engine exactly which sentence answers which question, and a clean self‑contained answer to a real customer question reads as a usable snippet on its own. The format hasn't changed in five years. The reason it works has. It used to be an SEO trick. Now it's an extraction surface.

Should I block AI bots in robots.txt?

Only if you're willing to pay the citation cost. Blocking GPTBot, PerplexityBot, ClaudeBot, and Google‑Extended keeps your content out of those platforms' training and out of their answers. If your goal is to be cited, leave the search‑oriented bots alone. If the concern is training data specifically, block CCBot (Common Crawl) and leave the rest. Most teams that block AI bots did it by accident, copying an old robots.txt template.

How do I measure whether AEO is working?

Pick 10 to 20 of your highest‑intent queries and check them monthly across Google AI Overviews, ChatGPT with web search, and Perplexity. Record whether you were cited, who else was, and which page got picked. The trend month over month is the metric. Tools like Otterly, Peec AI, and ZipTie automate this if you'd rather not run it by hand. Either way, the metric is share of citation, not share of clicks.
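
If you run the check by hand, the bookkeeping is a few lines of Python. This sketch assumes you log one record per query, per engine, per month, and defines share of citation as the fraction of those checks in which your page was cited. The `Check` record and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    month: str   # e.g. "2026-04"
    engine: str  # e.g. "perplexity"
    query: str
    cited: bool  # was our page among the cited sources?


def share_of_citation(checks: list[Check]) -> dict[str, float]:
    """Fraction of (query, engine) checks per month in which we were cited."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for check in checks:
        totals[check.month] += 1
        hits[check.month] += check.cited  # True counts as 1
    return {month: hits[month] / totals[month] for month in sorted(totals)}


log = [  # made-up observations for illustration
    Check("2026-04", "perplexity", "aeo checklist", True),
    Check("2026-04", "chatgpt", "aeo checklist", False),
    Check("2026-05", "perplexity", "aeo checklist", True),
    Check("2026-05", "chatgpt", "aeo checklist", True),
]
print(share_of_citation(log))
```

The month‑over‑month direction of that number is the thing to watch, not any single reading.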

What content format gets cited most?

Comparison content over‑performs by a wide margin. 'X vs Y' pages, alternative pages, and roundups draw a disproportionate share of citations on AI Overviews and on ChatGPT. Definitive guides and original research come next. Generic blog posts without structure or original numbers do worst. If your team has limited time, build comparison and original‑data pages first.

Where can I see AEO posture in production?

Salesflyer ships JSON‑LD, schema scoring, and AEO defaults on every publish. The brand kernel handles freshness stamps, FAQPage and HowTo schemas, and the /llms.txt plus /pricing.md files agents read on your behalf. Book a walk‑through at /demo and we'll show you the AEO scorecard for a live page.
