AI SEO: how to get cited by ChatGPT, Perplexity, and Google AI
By William Walczak — CEO, Hiilite Creative Group. MBA (UBC), PhD candidate (UBC-Okanagan). Hiilite has helped BC-based SMEs grow since 2014. Author profile · LinkedIn · Google Scholar
TL;DR
LLMs like ChatGPT, Perplexity, and Google’s AI Overviews pull citations from sources they can identify, trust, and quote cleanly. To get cited: be a clear entity (consistent name, authorship, organization across the web), make your claims structured and liftable (TL;DRs, definitions, FAQs), back them with real sources, and show up in the same places where AI systems look for corroboration (Wikidata, authoritative outbound links, consistent name, address, and phone, or NAP). SEO is not dead; it is evolving. The fundamentals still apply, but the reader is now a language model before it is a human.
What “AI SEO” actually means
Traditional SEO asks: will Google rank this page?
AI SEO asks: will a language model cite this page when someone asks a question it answers?
The difference matters. A ranked page sits at position 4 and waits to be clicked. A cited page gets its answer read aloud to the user, attributed to the source, and sometimes linked. The surface is different; the underlying trust signals are largely the same. Both reward authority, clarity, and verifiable claims.
“AI SEO” is sometimes called Answer Engine Optimization (AEO) or Generative Engine Optimization (GEO). Which term sticks matters least. What matters is that the models pulling answers from the web have a specific set of preferences, and those preferences are learnable.
Why LLMs cite what they cite
Large language models do not search the web the way Google does. When a tool like Perplexity or ChatGPT with Browse runs a query, it pulls a set of candidate pages and then reasons over them to produce an answer. The model chooses which passages to lift based on a few factors:
1. Entity clarity. The model needs to know who wrote the claim and whether that person or organization is trustworthy in this domain. If your author has a consistent, resolvable presence across your site, LinkedIn, Google Scholar, and Wikidata, the model can anchor the citation to a real entity. Anonymous pages or pages with vague authorship get skipped.
2. Clean, liftable language. Models prefer short, declarative sentences with one claim per line. Dense paragraphs, marketing hedges (“we believe that perhaps…”), and overly qualified academic writing get passed over. A sentence like “98.2% of Canadian employer businesses were small businesses as of December 2024, according to ISED” is immediately citable. A paragraph that eventually arrives at the same fact is not.
3. Structured content. TL;DRs, definition boxes, numbered steps, FAQ sections, and comparison tables are the formats models pull from most reliably. Structured data (schema markup) makes the structure machine-readable and removes ambiguity about what a block of content is.
4. Cross-web corroboration. If a claim appears on one site with no external references, the model treats it as unverified. If the same claim appears with an outbound link to a primary source (a study, a government stat, a recognized publication), and the same author appears on multiple authoritative platforms, the model’s confidence in the citation goes up.
5. Recency signals. Published and updated dates matter. A page last updated in 2021 on a fast-moving topic like AI tools gets deprioritized in favor of fresher sources.
The on-site playbook
Write for citation, not just for ranking
Every flagship page should open with a TL;DR block that states the main answer in three to five plain sentences. The rest of the page can add depth, nuance, and proof. Models will often pull the TL;DR verbatim.
Use definition-first structure when introducing any term. Lead with the definition, then expand. “Dynamic capability is the firm’s ability to sense, seize, and reconfigure resources in response to market change (Teece, 2007).” That is a citable sentence. “In this article we will explore what dynamic capability means for your business” is not.
Write short sentences. One claim per line. Active voice. If you are hedging, ask yourself whether the hedge is serving the reader or protecting you from being wrong.
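As a hedged illustration, the definition-first sentence above could also be expressed as DefinedTerm markup once the page carries structured data. This is a minimal sketch, not markup taken from any live page: the glossary name and the example.com URL are placeholders.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Dynamic capability",
  "description": "The firm's ability to sense, seize, and reconfigure resources in response to market change (Teece, 2007).",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "name": "Example glossary",
    "url": "https://example.com/glossary/"
  }
}
```

Keeping the description identical to the visible sentence means the markup and the page text make the same claim.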
Add structured data
Every article should carry Article schema with author (a Person with sameAs links to your author profiles), publisher (your Organization), datePublished, and dateModified. Every FAQ block should be wrapped in FAQPage schema. Definition pages should use DefinedTerm inside a DefinedTermSet.
This is not optional decoration. Schema markup is how you tell a language model — unambiguously — “this block is a question and this block is its answer.” Without it, the model has to infer structure from text formatting, which it sometimes gets wrong.
A minimal Person schema block for your author looks like this:
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "William Walczak",
  "url": "https://hiilite.com/team/william-walczak/",
  "sameAs": [
    "https://www.linkedin.com/in/williamwalczak/",
    "https://scholar.google.ca/citations?user=tGCWfnsAAAAJ&hl=en"
  ],
  "jobTitle": "CEO",
  "worksFor": {
    "@type": "Organization",
    "name": "Hiilite Creative Group",
    "url": "https://hiilite.com"
  }
}
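The same Person can be nested as the author of an Article. The sketch below is one possible shape, not the markup this page actually ships: the two dates are placeholder values, and the author is abbreviated to keep the example short.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "AI SEO: how to get cited by ChatGPT, Perplexity, and Google AI",
  "datePublished": "2025-01-01",
  "dateModified": "2025-01-01",
  "author": {
    "@type": "Person",
    "name": "William Walczak",
    "url": "https://hiilite.com/team/william-walczak/"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Hiilite Creative Group",
    "url": "https://hiilite.com"
  }
}
```

On a web page, a block like this usually sits inside a script tag of type application/ld+json. Update dateModified whenever the article changes so the markup matches the visible date.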
Add Organization schema to your homepage with the same sameAs fields pointing to your Google Business Profile, LinkedIn company page, and any authoritative directory listings.
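A minimal sketch of that Organization block might look like the following. The name and url come from the Person example above; the three sameAs entries are placeholder example.com URLs standing in for your real Google Business Profile, LinkedIn company page, and directory listing.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Hiilite Creative Group",
  "url": "https://hiilite.com",
  "sameAs": [
    "https://example.com/your-google-business-profile",
    "https://example.com/your-linkedin-company-page",
    "https://example.com/your-directory-listing"
  ]
}
```

Use the same name and url strings here as in the worksFor field of your author markup, so both blocks point at one organization.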
Include a FAQ section on every major page
FAQ sections do two things. They match conversational query patterns (how LLM users phrase questions) and they give models a clean question-answer pair to lift. Aim for three to five questions per page, each answered in two to four sentences, using the exact phrasing real users ask.
Use FAQPage schema around the block.
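As a sketch, a two-question FAQPage block could be shaped like this. The questions are taken from this article's own FAQ; the answers are shortened here for space, and in practice the text should match the answer visible on the page.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is AI SEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "AI SEO is the practice of optimizing your content and online presence so that AI tools cite your site when answering questions in your domain."
      }
    },
    {
      "@type": "Question",
      "name": "Is traditional SEO still relevant?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. AI SEO adds to the checklist; it does not replace it."
      }
    }
  ]
}
```

Each Question carries exactly one acceptedAnswer, which is what makes the pair unambiguous to a parser.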
Link out to real sources
Models trust pages that cite their claims. Every statistic or assertion you make should carry a link to the primary source. A government stat, a peer-reviewed study, a recognized publication. This is not just good writing practice; it is a citation signal the model reads.
If you cannot find a real source for a claim, do not make the claim. A liftable sentence without a source is worth less than a well-sourced sentence the model can verify.
The off-site playbook
On-site optimization is necessary but not sufficient. The citation advantage goes to entities the model can recognize from multiple independent sources.
Create a Wikidata entry
Wikidata is the open-data backbone that many LLMs query for entity resolution. If your founder, company, or framework exists as a Wikidata entry, the model can resolve who you are without relying solely on your own site’s claims about itself. This is not vanity; it is entity infrastructure.
For a business owner: create a Q item for yourself with your profession, employer, education (UBC, SFU), and links to your Wikipedia article if one exists, your LinkedIn, and your authoritative publications. Cross-link the organization entry to the person entry.
Consistent NAP and cross-web presence
Name, address, and phone (NAP) must be identical across every directory and platform where your business appears. LLMs cross-reference entity information; inconsistencies lower confidence. This includes Google Business Profile, LinkedIn company page, your own site’s footer, and any industry directories.
The same applies to author identity. Your name, bio summary, and credentials should read consistently whether you appear on your own site, a guest article, a podcast interview transcript, or your university profile.
Digital PR and third-party mentions
Being cited in a recognized publication gives the model a corroborating source it can cross-reference against your own site. A single article in an industry journal, a quoted expert contribution to a news piece, or a guest post on a recognized platform can anchor your entity in a way no amount of on-site optimization achieves alone.
This is the same link-building logic as traditional SEO, with one difference: the quality and authority of the linking source matter more than the raw quantity of links.
Publish llms.txt
llms.txt is a proposed convention (loosely analogous to robots.txt) for giving LLMs a short, plain-Markdown guide to your site: a summary of who you are plus a curated list of links to the pages you most want models to read. It does not allow or block crawling; that is still the job of robots.txt. It is not yet universally honored, but it signals awareness and can help AI crawlers that do read it find your best content. Place it at the site root.
How to measure AI citations
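You cannot track AI citations directly in Google Analytics. A citation inside an AI answer generates no pageview unless the user clicks through, and many of those clicks arrive without a referrer that identifies ChatGPT or Perplexity as the source. You have to measure this indirectly.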
Direct query monitoring. Run a weekly set of queries in ChatGPT, Perplexity, and Google’s AI Overviews for your target questions: “what is [your framework],” “how do I [problem you solve],” “best [category] for small business.” Note whether you are cited, and track the frequency over time.
Branded search growth. If AI tools are citing you, users who want to learn more will search your name and your framework name. Monitor branded query volume in Google Search Console (GSC). Lift here correlates with AI citation activity.
Referral traffic from Perplexity. Perplexity does pass referral traffic in many cases, identifiable as perplexity.ai in the referrers report. Track this as a distinct channel.
Direct traffic baseline. A rising direct traffic trend, absent an obvious explanation, often includes AI-referred users who typed your URL after seeing it cited. Not clean measurement, but a useful signal.
A category of purpose-built AI-visibility trackers now exists. Tools like Otterly.AI, Profound, Peec AI, and AthenaHQ run a set of prompts against ChatGPT, Perplexity, and Google’s AI Overviews, then report how often you are named and which of your URLs get cited. They are useful for trend tracking, but the category is young and coverage varies by engine, so treat their numbers as a directional signal rather than a precise count. For most small businesses, manual monitoring plus branded search volume remains the most reliable proxy, with a tracker layered on once AI search is a meaningful channel for you.
The compounding flywheel
The reason AI SEO rewards early investment: it compounds. An entity that exists on Wikidata, has consistent cross-web presence, publishes structured and sourced content, and accumulates third-party mentions becomes progressively easier for models to cite. Each new piece of content adds to the entity’s footprint. Each citation drives branded searches. Each branded search reinforces the entity’s salience.
For Hiilite, this is not just a tactic. It is how the whole content program is built. We are writing the research that explains why AI-driven marketing works, citing the academic foundations (Anthropic’s work on agentic systems, Weng’s framework for LLM-powered agents), and building the author entity in public. The content strategy is the entity strategy. See our research and framework work for the full picture.
This piece is itself an example. It is structured, sourced, schema-ready, and written by a named author with a verifiable profile. If it earns citations, that is the flywheel starting.
AI SEO quick-checklist
On-site
– [ ] TL;DR block at the top of every major page (3-5 plain sentences, one claim per sentence)
– [ ] Definition-first structure for any framework term or concept
– [ ] FAQ section (3-5 questions) with FAQPage schema on every hub page
– [ ] Article schema with author (Person + sameAs), datePublished, dateModified
– [ ] Organization schema on homepage with sameAs links
– [ ] Every stat and assertion linked to its primary source
– [ ] Short sentences, active voice, one idea per line
– [ ] Published and updated dates visible in the HTML
Off-site
– [ ] Wikidata entry for the founder (and organization if warranted)
– [ ] Google Knowledge Panel claimed and verified
– [ ] Consistent NAP across Google Business Profile, LinkedIn, directories
– [ ] Author bio consistent across your site, LinkedIn, and any guest publications
– [ ] llms.txt at site root
– [ ] At least one third-party mention in a recognized publication
Measurement
– [ ] Weekly manual query set in ChatGPT, Perplexity, and AI Overviews
– [ ] Branded search volume tracked weekly in GSC
– [ ] Perplexity.ai referrer tracked as a distinct channel in GA4
Frequently asked questions
What is AI SEO?
AI SEO is the practice of optimizing your content and online presence so that AI tools — ChatGPT, Perplexity, Google’s AI Overviews, and similar systems — cite your site when answering questions in your domain. It extends traditional SEO by adding entity clarity, structured data, and cross-web corroboration to the optimization checklist.
How do I get cited by ChatGPT?
ChatGPT cites sources it can identify and trust. The three most direct levers: (1) make your authorship explicit and consistent across your site and external profiles; (2) write in short, declarative sentences with one claim per line and link every stat to a primary source; (3) build your entity footprint off-site (Wikidata, consistent NAP, third-party mentions). Clean structure and FAQPage schema also improve the model’s ability to lift your answers.
Is traditional SEO still relevant?
Yes. AI systems pull from web-crawled content, so the same authority signals that help you rank in Google — quality content, authoritative backlinks, clear site structure, fast load times — also improve your likelihood of being cited. AI SEO adds to the checklist; it does not replace it.
How long does it take to see AI citation results?
Expect three to six months for the off-site entity work (Wikidata, Knowledge Panel, consistent cross-web presence) to propagate and for models to begin indexing your entity reliably. On-site structural improvements can take effect faster if your pages are already being crawled. The flywheel accelerates once you have multiple corroborating sources in place.
What is the difference between AI SEO and AEO (Answer Engine Optimization)?
The terms describe the same goal: optimizing for AI and answer engines rather than ranked-link search results. AEO was the earlier term; AI SEO and GEO (Generative Engine Optimization) are more commonly used as of 2025-2026. This article uses “AI SEO” because it is the highest-volume search term for the concept, but the practices it describes apply regardless of which label you prefer.
Further reading
- Anthropic, “Building Effective Agents” (2024) — the technical foundation for how agentic AI systems work and make decisions, including how they evaluate sources.
- Weng, Lilian, “LLM-Powered Autonomous Agents” (2023) — a widely-cited technical overview of how LLMs reason over retrieved content, useful for understanding the citation mechanism.
- How AI-driven marketing fits into the broader customer acquisition system — Hiilite’s guide to sustainable, measurable customer recruitment for SMEs.
- The research behind the Growth Mapping framework — the PhD-grounded operating model that this content strategy is built on.
Ready to put this into practice?
Getting cited by AI tools is part of a broader shift: the businesses that will win the next decade are the ones that treat marketing as a measurable system, not a collection of activities. Hiilite builds that system.
Book a discovery call and we will show you exactly where your current setup leaves citations and revenue on the table.
William Walczak is the founder and CEO of Hiilite Creative Group (2014) and a PhD candidate at UBC-Okanagan researching growth hacking and marketing systems. He holds an MBA from UBC and an engineering degree from Simon Fraser University. CEO Monthly named him Marketing Strategy CEO of the Year 2023 (BC). Full profile · LinkedIn · Google Scholar