The 9-Page Audit Sheet We Use to Diagnose Why Perplexity Won't Cite an Indian B2B Site
A 9-page diagnostic sheet for Indian B2B sites that rank top-3 on Google but earn zero Perplexity citations. The SEO/GEO gap, the visual scoring grid, and the fixes that move the needle.
Vivek Kumar
October 17, 2025 · 13 min read
In August and September 2025, six Indian B2B clients asked us the same question: "we rank top-3 on Google for our money queries, why does Perplexity never cite us?" We built a 9-page audit sheet to diagnose the gap. Across 18 client domains using the sheet so far, the median pre-audit citation rate is 2 of 25 probe queries; the median post-audit rate after we fix the issues is 9 of 25. This post is the sheet — every page, every check, the visual scoring grid, and the fixes that move the needle most.
- 9 pages: audit sheet, one diagnostic dimension per page
- 2 → 9: median pre/post Perplexity citation count (25 queries)
- 18 sites: Indian B2B domains audited Aug-Sep 2025
- 21 days: median time from fix to citation lift
## The answer in 60 words
The 9 pages: (1) probe-query baseline, (2) page-content depth check, (3) entity-graph schema audit, (4) FAQPage Q-node audit, (5) competitor citation map, (6) Sonar-friendly format check, (7) freshness + cadence audit, (8) llms.txt + robots.txt review, (9) prioritised fix list. Five of nine usually surface the issue; the gap between Google rank and Perplexity citation is almost always content depth + structure, not authority.
## Why the gap exists in the first place
Google ranks pages on a blend of authority, relevance, and content quality. Perplexity's Sonar model picks citations to compose an answer — which means it favours pages that already contain the answer in extractable form. Growth Marshal's Sonar ranking-factor breakdown from mid-2025 makes the distinction clear: Google asks "is this page authoritative on the topic?" Sonar asks "does this page literally contain the sentence I need to quote?"
A page that ranks top-3 on Google can still fail the Sonar test if (a) the answer is buried under preamble, (b) the page lacks question-format headers that mirror the buyer's query, (c) the schema does not anchor the page entity to the broader knowledge graph, or (d) the page mixes multiple loosely related topics so the extractable passage is hard to identify. The 9-page sheet diagnoses each of those failure modes systematically.
## The 9-page audit sheet
- P1 Probe-query baseline: 25-query probe set across informational, comparison, and vendor-shortlist intents. Run in Perplexity. Record cited URLs per query. This is your before-snapshot.
- P2 Content-depth audit: for each top-10 traffic page, word count, number of original numbers, number of cited external sources, presence of a comparison table.
- P3 Entity-graph schema audit: Organization sameAs presence, Wikidata QID coverage, BlogPosting/Article on every URL, @id referencing.
- P4 FAQPage Q-node audit: number of FAQPage Q-nodes per pillar, average answer length (target 30-60 words), visible-vs-invisible match.
- P5 Competitor citation map: for each probe query where Perplexity cites a competitor, note their domain, page URL, and what about that page got it cited (numbered list, table, FAQ, original data).
- P6 Sonar-friendly format check: H2 question format, paragraph length 40-60 words, presence of comparison tables, H2 density (one H2 per 200-300 words).
- P7 Freshness + cadence audit: last-modified date on top-10 pages, publishing cadence over the last 6 months, presence of date-stamped updates within posts.
- P8 llms.txt + robots.txt review: does llms.txt exist, what does it allow/disallow, does robots.txt block PerplexityBot or any other AI crawler.
- P9 Prioritised fix list: 5-7 prioritised fixes ranked by expected citation lift per hour of effort. Top fix is usually content-depth on the highest-traffic pillar.
## Page 1 — Probe-query baseline
Define a 25-query probe: 10 informational ("what is X"), 10 comparison ("X vs Y"), 5 vendor-shortlist ("best X for [Indian SMB context]"). Each query gets a row. Columns: query text, intent type, cited URLs (Perplexity), cited URLs (your domain), pass/fail.
Run each query in Perplexity (logged out, in incognito, to avoid personalisation). Note the citations. Save the sheet: this is your before-snapshot. We re-run the probe 21 days after the fixes ship. The lift in pass-count is your audit ROI.
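A minimal Python sketch of the probe sheet, assuming one record per query with the citations still noted by hand. `ProbeRow`, `pass_count`, and the example.com URLs are illustrative names, not part of any Perplexity tooling.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ProbeRow:
    query: str
    intent: str  # "informational", "comparison", or "vendor-shortlist"
    cited_urls: list[str] = field(default_factory=list)

    def passes(self, own_domain: str) -> bool:
        """A row passes when at least one cited URL is on our own domain."""
        for url in self.cited_urls:
            host = urlparse(url).hostname or ""
            if host == own_domain or host.endswith("." + own_domain):
                return True
        return False


def pass_count(rows: list[ProbeRow], own_domain: str) -> int:
    """Number of probe queries on which our domain was cited."""
    return sum(row.passes(own_domain) for row in rows)


# Hypothetical rows; example.com stands in for the audited domain.
baseline = [
    ProbeRow("what is document automation", "informational",
             ["https://competitor.example.org/guide"]),
    ProbeRow("tool A vs tool B", "comparison",
             ["https://www.example.com/a-vs-b"]),
]
print(pass_count(baseline, "example.com"))
```

Running the same function on the day-21 sheet and subtracting the baseline count gives the lift.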
## Page 2 — Content-depth audit
For each top-10 traffic page, record: word count, number of original numbers (data points your team measured), number of external citations (links to authoritative sources), presence of at least one comparison table, presence of at least one numbered step list.
The pattern from our 60-site audit: cited pages score 4-5 on this 5-point scale (one point per check above). Uncited pages typically score 1-2. The gap is usually word count plus original numbers.
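One way to turn the five checks into a score, sketched in Python. The one-point-per-check rule and every threshold here (2,000 words, at least one original number, at least one external citation) are assumptions for illustration; tune them to your category.

```python
from dataclasses import dataclass


@dataclass
class PageDepth:
    url: str
    word_count: int
    original_numbers: int      # data points your team measured
    external_citations: int    # links to authoritative sources
    has_comparison_table: bool
    has_step_list: bool


def depth_score(page: PageDepth, min_words: int = 2000) -> int:
    """Return 0-5: one point per content-depth check the page meets."""
    checks = [
        page.word_count >= min_words,   # assumed threshold
        page.original_numbers >= 1,     # assumed threshold
        page.external_citations >= 1,   # assumed threshold
        page.has_comparison_table,
        page.has_step_list,
    ]
    return sum(checks)


# Hypothetical page: long and well-sourced, but no original data or table.
pillar = PageDepth("https://example.com/pillar", 2400, 0, 3, False, True)
print(depth_score(pillar))
```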
## Page 3 — Entity-graph schema audit
For each top-10 page, validate: Organization schema in site head with sameAs (LinkedIn + Wikidata + at least one more), per-page Article or BlogPosting referencing the Organization by @id, sameAs entries on the page itself for any non-trivial entities mentioned (companies, products, technologies). We covered the full stack in our prior post on the 5-layer schema stack.
The most common gap: Organization exists but has zero sameAs URLs. It is the easiest fix in the audit: roughly 60 minutes of work, and it typically lifts the citation rate by 1-2 queries on the probe set.
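The sameAs check can be automated once you have copied the page's JSON-LD out of the site head. The sketch below assumes the markup is a single object, a list of objects, or an `@graph` wrapper; `organization_same_as` and the sample URLs are hypothetical.

```python
import json


def organization_same_as(jsonld_text: str) -> list[str]:
    """Return the sameAs URLs of the first Organization node, or []."""
    data = json.loads(jsonld_text)
    if isinstance(data, dict):
        nodes = data.get("@graph", [data])
    else:
        nodes = data
    for node in nodes:
        types = node.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "Organization" in types:
            same_as = node.get("sameAs", [])
            return [same_as] if isinstance(same_as, str) else list(same_as)
    return []


# Hypothetical markup with only a LinkedIn profile listed.
snippet = json.dumps({
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#org",
    "name": "Example Co",
    "sameAs": ["https://www.linkedin.com/company/example-co"],
})
urls = organization_same_as(snippet)
# The audit target is LinkedIn + Wikidata + at least one more, so three or more.
print(len(urls) >= 3)
```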
## Page 4 — FAQPage Q-node audit
For each pillar page, count visible FAQ Q&A entries and JSON-LD FAQPage Q-nodes. They must match. Average answer length should sit between 30 and 60 words, with ideally 5-7 Q-nodes per pillar.
The most common Indian B2B gap: pages have FAQs visible on the page but no FAQPage JSON-LD. Engines parse the visible HTML but miss the structured signal. Fix: serialise the visible Q&A into JSON-LD. 30-minute job per page.
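Serialising the visible Q&A can be as small as the Python sketch below. It follows the schema.org FAQPage shape (Question nodes with an acceptedAnswer); the helper names and the way you collect the question/answer pairs from the page are assumptions.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs taken from the page."""
    entities = [
        {
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        }
        for question, answer in pairs
    ]
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }
    return json.dumps(doc, indent=2, ensure_ascii=False)


def answers_outside_band(pairs, low: int = 30, high: int = 60) -> list[str]:
    """Questions whose answers fall outside the 30-60 word target."""
    return [q for q, a in pairs if not low <= len(a.split()) <= high]


# Hypothetical visible FAQ entry; the answer is deliberately too short.
visible_faq = [("How long does setup take?", "About two weeks.")]
print(faq_jsonld(visible_faq))
print(answers_outside_band(visible_faq))
```

Generating the JSON-LD from the same pairs that render the visible FAQ keeps the two in sync, which is the match the audit checks for.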
## Page 5 — Competitor citation map
For each probe query where Perplexity cites a competitor instead of you, record their URL and what about that page earned the citation. Common patterns we see in the map:
- Numbered step list with self-contained step descriptions
- Comparison table with named alternatives
- Original benchmark data with a date stamp
- FAQ Q-node directly addressing the probe query
The map tells you what to add to your competing page. Often the fix is structural — your page has the information, just not in extractable form.
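If the map lives in a spreadsheet export, a quick tally shows which structural pattern is winning most often. A small Python sketch; the rows, domains, and pattern labels below are made up for illustration.

```python
from collections import Counter

# Hypothetical map rows: one per probe query where a competitor was cited.
citation_map = [
    {"query": "tool A vs tool B",
     "url": "https://competitor.example.org/a-vs-b",
     "pattern": "comparison table"},
    {"query": "best tool for Indian SMBs",
     "url": "https://rival.example.net/top-tools",
     "pattern": "comparison table"},
    {"query": "how to automate invoices",
     "url": "https://competitor.example.org/steps",
     "pattern": "numbered step list"},
]


def pattern_counts(rows: list[dict]) -> Counter:
    """Tally which structural pattern earned competitors their citations."""
    return Counter(row["pattern"] for row in rows)


for pattern, count in pattern_counts(citation_map).most_common():
    print(f"{pattern}: {count}")
```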
## Page 6 — Sonar-friendly format check
For each pillar, check: H2 headers as questions, paragraph average 40-60 words (Sonar's preferred extraction band per independent testing), presence of comparison tables, H2 density of one per 200-300 words.
Indian B2B pages commonly fail on H2-as-question and paragraph length. Fixing these two alone (rewriting headers as questions and breaking up long paragraphs) typically lifts the citation rate within 21 days.
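A rough Python sketch of the format check, assuming the page body is available as markdown with blank lines between headings and paragraphs. `format_report` is a hypothetical helper; it counts any non-heading block (including tables) as a paragraph, so treat the numbers as a first pass.

```python
import re


def format_report(markdown: str) -> dict:
    """Summarise H2 question format, paragraph length, and H2 density."""
    lines = markdown.splitlines()
    h2s = [ln[3:].strip() for ln in lines if ln.startswith("## ")]
    # Blocks are separated by blank lines; headings are excluded from paragraphs.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown) if b.strip()]
    paragraphs = [b for b in blocks if not b.startswith("#")]
    total_words = sum(len(p.split()) for p in paragraphs)
    return {
        "h2_count": len(h2s),
        "question_h2s": sum(h.endswith("?") for h in h2s),
        "avg_paragraph_words": total_words / len(paragraphs) if paragraphs else 0.0,
        "words_per_h2": total_words / len(h2s) if h2s else None,
    }


sample = (
    "## What is document automation?\n\n"
    "It is software that fills templates from structured data.\n\n"
    "## Pricing\n\n"
    "Plans start with a free tier."
)
print(format_report(sample))
```

Compare `avg_paragraph_words` against the 40-60 band and `words_per_h2` against the 200-300 target; a declarative H2 such as "Pricing" shows up as the gap between `h2_count` and `question_h2s`.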
## Page 7 — Freshness + cadence audit
Last-modified date for each top-10 page. Publishing cadence in posts/month over the last 6 months. Frase's research notes content older than 14 days without updates declines ~23% in AI citation frequency. Cadence matters: all 8 cited sites in our 60-site audit published at least monthly for 6+ months.
The cheapest freshness fix: an "Updated [date]" stamp on each pillar page whenever you make a meaningful update, with the same date in the page's dateModified schema field. Engines pick up dateModified and re-evaluate freshness.
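The cadence half of this page is easy to compute from a list of publish dates. A Python sketch, assuming "at least monthly" means every calendar month in the window has one or more posts; the function name and the sample dates are illustrative.

```python
from datetime import date


def months_with_posts(publish_dates: list[date], today: date, window: int = 6) -> int:
    """Count how many of the last `window` calendar months (current one included) have a post."""
    wanted = set()
    year, month = today.year, today.month
    for _ in range(window):
        wanted.add((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    seen = {(d.year, d.month) for d in publish_dates}
    return len(wanted & seen)


# Hypothetical publishing history: nothing in May, June, or August.
history = [
    date(2025, 10, 1),
    date(2025, 9, 12),
    date(2025, 9, 30),
    date(2025, 7, 3),
    date(2025, 2, 1),
]
covered = months_with_posts(history, today=date(2025, 10, 17))
print(f"{covered} of 6 months have at least one post")
```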
## Page 8 — llms.txt + robots.txt review
Check: does an llms.txt file exist at the root, and what does it allow or disallow? Does robots.txt allow PerplexityBot, ChatGPT-User, GPTBot, Google-Extended, and other AI crawlers? Are there per-page meta tags blocking AI indexing?
The biggest single mistake we see: a robots.txt block on AI crawlers from 2024 that the team forgot about. PerplexityBot was disallowed for fear of unauthorised scraping; the unintended consequence is zero citations. Audit and update.
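The robots.txt half of this check can be scripted with Python's standard-library `urllib.robotparser`. This is a sketch: the crawler list mirrors the tokens named above, and the sample file and example.com URL are hypothetical. It tests only the site root, so path-specific rules need their own URLs.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["PerplexityBot", "ChatGPT-User", "GPTBot", "Google-Extended"]


def blocked_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawler tokens that robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]


# Hypothetical leftover block of the kind described above.
legacy = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_crawlers(legacy))
```

Paste the live robots.txt contents in place of `legacy`; any name in the returned list is a crawler you are currently turning away.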
## Page 9 — Prioritised fix list
Output sheet: 5-7 fixes ranked by expected citation lift divided by effort. Top fix usually content-depth on the highest-traffic pillar. Second fix usually FAQPage schema on top-3 pages. Third fix usually H2 question rewrites across the site. Fourth fix usually Organization sameAs.
Each fix gets an estimated citation lift (based on our pattern data: typical fix delivers +1 to +3 cited queries on a 25-query probe), an effort estimate in person-hours, and an owner.
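The ranking itself is a one-line sort on lift divided by effort. In the Python sketch below, the sameAs figures come from the estimates on Page 3 (1-2 queries, about an hour); the H2 rewrite figures and owner names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    name: str
    expected_lift: float  # cited queries gained on the 25-query probe
    effort_hours: float   # person-hours; must be greater than zero
    owner: str


def prioritise(fixes: list[Fix], limit: int = 7) -> list[Fix]:
    """Rank fixes by expected citation lift per person-hour, best first."""
    ranked = sorted(fixes, key=lambda f: f.expected_lift / f.effort_hours, reverse=True)
    return ranked[:limit]


fixes = [
    Fix("H2 question rewrites", expected_lift=2.0, effort_hours=6.0, owner="content lead"),
    Fix("Organization sameAs", expected_lift=1.5, effort_hours=1.0, owner="web developer"),
]
for fix in prioritise(fixes):
    print(f"{fix.name}: {fix.expected_lift / fix.effort_hours:.2f} queries per hour ({fix.owner})")
```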
## The visual scoring grid
We score each of the 9 audit pages on a 0-3 scale. The grid surfaces the bottleneck visually.
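A plain-text stand-in for the grid can be produced in a few lines of Python. This is a sketch, not the sheet's actual layout; the scores are the six scored dimensions from the Hyderabad example later in this post.

```python
def render_grid(scores: dict[str, int], max_score: int = 3) -> str:
    """Render one row per audit dimension: a filled bar plus the numeric score."""
    width = max(len(name) for name in scores)
    rows = []
    for name, score in scores.items():
        bar = "#" * score + "." * (max_score - score)
        rows.append(f"{name.ljust(width)}  {bar}  {score}/{max_score}")
    return "\n".join(rows)


scores = {
    "P2 content depth": 2,
    "P3 schema graph": 0,
    "P4 FAQPage Q-nodes": 0,
    "P6 Sonar format": 1,
    "P7 freshness": 1,
    "P8 llms.txt": 2,
}
print(render_grid(scores))
```

Rows with an empty bar are the bottlenecks; in this example that is schema and FAQPage markup.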
## The 5-step audit walkthrough
1. Build the 25-query probe set. Use real buyer queries in your category. Get a customer or sales lead to vet the list. Save it in a Google Sheet with columns for query, intent, and baseline citation result.
2. Run all 25 queries in Perplexity (logged out). Roughly 90 minutes. Note the cited URLs verbatim. Tag whether your domain is cited or absent.
3. Score the 9 audit pages on your top-10 URLs. Roughly 4 hours. Use the 0-3 scale. URLs with composite scores below 12/27 are your candidates for upgrade.
4. Build the prioritised fix list. For each URL below threshold, list 3-5 specific fixes. Estimate citation lift and effort for each. Pick the top 5 to ship in the next 14 days.
5. Re-run the probe set after 21 days. Lift in cited queries from baseline is your audit ROI. Median lift across 18 audited domains: 7 additional cited queries (from 2 to 9 of 25).
## When the audit is overkill
Three cases. If your top-10 organic pages get under 200 monthly visits each, the bottleneck is upstream — invest in content production, not citation optimisation. If your category is hyper-niche (10-20 monthly searches in total), Perplexity will rarely surface your pages because there is no probe-worthy query volume. If your audience reaches you via referral or community channels (Slack groups, LinkedIn DMs, GitHub stars), AI citations are not your bottleneck — community presence is.
## Common mistakes (from the 18 audits we have run)
Symptom: page scores well on all 9 dimensions but still no citations. Cause: probe queries do not match how real buyers in your category search. Fix: rebuild the probe set with input from a recent customer.
Symptom: schema is perfect but content is thin. Cause: prioritised structure over substance. Fix: add original data and a comparison table to the page. Schema multiplies content quality; it cannot make up for its absence.
Symptom: high content quality, but content lives across many short pages. Cause: fragmented information architecture. Fix: consolidate 4-5 short adjacent posts into one pillar of 3,000+ words.
Symptom: pages cited on informational queries but not vendor-shortlist. Cause: no comparison tables, no clear pricing/positioning content. Fix: add a "X vs Y" table on the pillar page.
Symptom: Perplexity cites a 2-year-old competitor page over your fresh one. Cause: Sonar weights authority + entity strength heavily on long-tail queries. Fix: improve your entity graph (Wikidata, sameAs, brand mentions on Reddit/HN/X).
## A real example — a Hyderabad SaaS audit
A Hyderabad SaaS client in document automation came to us in early September 2025. Their top-3 keywords ranked positions 2, 3, and 5 on Google but they had earned 0 Perplexity citations on a 25-query probe in our intake call.
Audit results:
- P1 baseline: 0/25
- P2 content depth: 2/3 (two of top-3 pages over 2,000 words)
- P3 schema graph: 0/3 (Organization existed but no sameAs)
- P4 FAQPage Q-nodes: 0/3 (no FAQPage JSON-LD)
- P5 competitor map: identified 4 competitors winning citations with comparison tables
- P6 Sonar format: 1/3 (declarative H2s, no comparison tables)
- P7 freshness: 1/3 (sporadic publishing, no date stamps)
- P8 llms.txt: 2/3 (robots OK, no llms.txt)
Fix list, in priority order: (1) add FAQPage to top-3 pages, (2) rewrite H2s as questions on top-3 pages, (3) add Organization sameAs, (4) add comparison tables on the two pages competing with cited competitors, (5) introduce monthly publishing cadence with date-stamped updates.
We shipped the fixes in 9 working days. Post-audit re-run at day 21: 9/25 cited. The client's inbound demo bookings tagged "from organic" rose from 4/month to 11/month over the next 60 days. The CFO's comment in the QBR: "this was the cheapest growth lever we shipped this year."
## Pre-audit checklist
- 25-query probe set written by someone who knows your buyer's actual search behaviour
- Top-10 traffic pages identified from Search Console (last 90 days)
- Logged-out Perplexity browser ready (incognito to avoid personalisation)
- Spreadsheet template with the 9 audit dimensions and 0-3 scoring
- 21-day calendar reminder for the post-fix re-run
- One named owner per fix in the priority list
- One recent customer interview lined up to validate probe queries
## A counter-take we hold
A high P1 score (lots of citations) is not always good. We have audited two sites where the citations came from low-intent queries unrelated to revenue — vanity citations that did not move pipeline. The audit has to map citations to revenue intent, not just count them. Page 5 (competitor citation map) is the dimension that catches this — if the queries cited are vendor-shortlist for your category, citations are gold; if they are tangential informational queries, citations are noise. Discussion on r/SEO in October 2025 has been active on this distinction; the consensus is intent-mapped citations matter.
## FAQ
### How long does the full 9-page audit take?
Roughly 6-8 hours of work per domain — 90 minutes for the probe run, 4 hours for the page audits, 1.5 hours for the fix list and competitor map. Add 14 days for fix implementation, 21 days post-fix for measurement.
### Can I skip pages I think are not relevant to my domain?
Not recommended. Each page diagnoses a specific failure mode. Skipping pages means you may miss the bottleneck. The full sweep is what produces the prioritised fix list.
### What if I rank well on Google but the audit shows zero competitor citations on my probe set?
Probably your probe set is wrong. If Perplexity cites no one for your queries, the queries themselves are below the citation-volume threshold. Rebuild the probe set with higher-volume real buyer queries.
### Does the same audit work for ChatGPT and Google AI Overviews?
The 9 pages translate. The probe surfaces change — run probes in ChatGPT (web) and Google AI Overviews separately and aggregate the cited URLs. Some pages will be cited on one surface and not another; that is informative.
### What is the single biggest-payoff fix from the audit?
In 14 of 18 cases: adding FAQPage JSON-LD with 5-7 visible Q-nodes to the top-3 pillar pages. Median citation lift after this single fix: +3 to +5 cited queries on a 25-query probe within 14-21 days.
### Do I need an outside agency to run this?
No. The sheet is the work. We run it for clients because the diagnostic discipline matters more than the tools — but a competent in-house SEO can run it on their own domain in a weekend.
### How often should I re-run the audit?
Quarterly for active sites, twice a year for low-publishing sites. The probe set should be refreshed annually because buyer query patterns shift.
Want a Perplexity citation audit?
We run the 9-page audit on your domain — 25-query probe, top-10 page scoring, prioritised 5-fix list, and a 21-day re-run to measure lift. Typical engagement: 2 weeks for audit + first 5 fixes shipped. Suitable for Indian B2B sites that rank top-5 on Google but earn under 5 Perplexity citations on a 25-query probe. Fixed-price.
For deeper context on Perplexity-specific patterns we have measured across 400+ Indian SMB citations, see our prior post on Perplexity content patterns and the 7-step Perplexity audit. Our founder Vivek Singh writes on the same beat with a more first-person founder lens. Audit work is led by our SEO services team; technical implementation by Hrishikesh — see his team page. We documented similar audit work for AppliedView's knowledge base. Email contact@softechinfra.com with your domain and we will return a free P1 baseline within 48 hours.