AEO Audit for crawlproof.com
Target: https://crawlproof.com/
Score: 57 / 100
Generated: 2026-07-01T20:10:59.490Z
Pages crawled: 9
Findings: 64 pass · 86 warn · 2 fail · 0 unknown
1. Crawl Summary
- ✅ Fetched 9 of 9 pages successfully Target: https://crawlproof.com
2. Data Found
| Data Point | Found? | Source | Notes |
|---|---|---|---|
| Pricing | Yes | Pricing page | https://crawlproof.com/pricing |
| Customer logos | No | — | — |
| Social proof | No | — | — |
| Recent launches | Yes | Press/news pages | https://crawlproof.com/blog |
| Blog post activity | Yes | Blog | https://crawlproof.com/blog |
| New hires | No | — | Often only on a /blog/team or LinkedIn page |
| Headline copy | Yes | Homepage | See your site the way AI crawlers do. |
| Positioning | No | — | — |
| Executive team | Yes | About/team page | https://crawlproof.com/about |
| Product/service descriptions | Yes | Homepage | From meta description |
| Case studies or testimonials | No | — | — |
| Contact/demo/signup paths | Yes | Navigation links | — |
3. Homepage Audit
- ⚠️ Long meta description (171 chars) Snippets truncate around 160 chars. Tighten to keep the key sentence visible.
- ✅ Homepage fetched successfully HTTP 200 · 64846 bytes · 99ms
- ✅ Page load time: 0.10s Fast — well within AI crawler budgets.
- ✅ declared
- ✅ Single H1 See your site the way AI crawlers do.
- ✅ `<title>` present (50 chars)
- ✅ Canonical present https://crawlproof.com
- ✅ Open Graph tags complete
- ✅ Twitter Card tags complete
- ✅ Critical content is server-rendered Raw and rendered text are within 10% of each other.
- ✅ Alt text coverage: 100% 1/1 images have alt text.
- ✅ Content volume: 718 words Substantive content — AI models have enough to summarize and recommend.
- ✅ Heading structure: 15 (h1:1, h2:5, h3:9) Multiple headings help AI chunk and outline your page.
- ✅ Internal links: 22 22 internal + 1 external links help crawlers navigate.
- ✅ Favicon declared
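The meta description finding above can be reproduced with a simple length check. This is a minimal sketch, assuming the 50 to 160 character window this report recommends; the function name `check_meta_description` is hypothetical and not part of any CrawlProof tooling.

```python
def check_meta_description(text: str, min_len: int = 50, max_len: int = 160) -> str:
    """Classify a meta description by length.

    The 50/160 limits are the ones quoted in this report, not a standard.
    """
    length = len(text.strip())
    if length == 0:
        return "missing"
    if length < min_len:
        return "short"
    if length > max_len:
        return "long"
    return "ok"
```

A 171-character description, like the one found on the homepage, would be classified as "long".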
4. Content Quality
- ⚠️ No question-style headings found Phrase at least one heading as a user question (e.g. 'How does pricing work?') to match conversational AI queries.
- ⚠️ No date signal found Add `<time datetime>` or `article:published_time` meta. AI ranking weights freshness.
- ⚠️ Text-to-HTML ratio: 7.8% Low text density — most of the response is markup/script.
- ⚠️ No author byline found Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Strengthens E-E-A-T signals.
- ✅ Heading levels are well-ordered 16 headings nested in order.
- ✅ Snippet-ready blocks: 4 (ul:4, ol:0, table:0) Lists and tables are extracted verbatim by AI answer engines.
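The text-to-HTML ratio above can be approximated locally. The sketch below is one plausible definition (visible text length divided by total HTML length, ignoring `<script>` and `<style>` contents); the exact formula behind the 7.8% figure is not stated in this report, so treat the names and the method as assumptions.

```python
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_to_html_ratio(html: str) -> float:
    """Return visible-text characters divided by total HTML characters."""
    if not html:
        return 0.0
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so indentation does not inflate the count.
    text = " ".join(" ".join(collector.chunks).split())
    return len(text) / len(html)
```

Running it before and after moving inline scripts to external files gives a quick read on whether the ratio is improving.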
5. Schema / Structured Data Audit
- ⚠️ Article / BlogPosting JSON-LD not found Add Article / BlogPosting where applicable so AI answer engines can resolve the entity precisely.
- ⚠️ BreadcrumbList JSON-LD not found Add BreadcrumbList where applicable so AI answer engines can resolve the entity precisely.
- ⚠️ LocalBusiness JSON-LD not found Add LocalBusiness where applicable so AI answer engines can resolve the entity precisely.
- ⚠️ Person (author / founder) JSON-LD not found Add Person (author / founder) where applicable so AI answer engines can resolve the entity precisely.
- ⚠️ HowTo JSON-LD not found Add HowTo where applicable so AI answer engines can resolve the entity precisely.
- ⚠️ VideoObject JSON-LD not found Add VideoObject where applicable so AI answer engines can resolve the entity precisely.
- ✅ 6 JSON-LD block(s) found Types: WebSite, Organization, SoftwareApplication, Organization, SoftwareApplication, FAQPage
- ✅ Organization present
- ✅ WebSite present
- ✅ SoftwareApplication present
- ✅ FAQPage JSON-LD present
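For the missing Article / BlogPosting markup, a build step could emit the JSON-LD per post. This is a hedged sketch: the helper name `blog_posting_jsonld` and all argument values are hypothetical, and `mainEntityOfPage` is an optional schema.org property added here for illustration.

```python
import json


def blog_posting_jsonld(headline: str, author_name: str,
                        date_published: str, date_modified: str, url: str) -> str:
    """Return a <script> tag holding BlogPosting JSON-LD for one post."""
    data = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "dateModified": date_modified,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a stray "</script>" inside a value cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"
```

The returned string is meant to be placed in the page's `<head>`; dates are expected in ISO 8601 form such as `2026-05-17`.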
6. Links & Images
- ⚠️ Modern image formats: 0% (0/1 webp/avif) 0 legacy (png/jpg/gif) image(s). Convert hero/above-the-fold images to WebP or AVIF.
- ❌ Explicit dimensions: 0% (0/1) Add `width` and `height` attributes on `<img>` tags to prevent CLS.
- ⚠️ 1 broken link(s) in first 15 · timeout — https://vu1nz.com/
- ✅ External nofollow: 0% (0/1) Healthy mix of follow and nofollow outbound links.
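The explicit-dimensions check above can be sketched with the standard library HTML parser. This is an illustration only; the class and function names are hypothetical and the report does not say how its own check is implemented.

```python
from html.parser import HTMLParser


class ImgDimensionAudit(HTMLParser):
    """Records the src of every <img> lacking a width or height attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        if not (attr.get("width") and attr.get("height")):
            self.missing.append(attr.get("src") or "")


def images_missing_dimensions(html: str) -> list[str]:
    """Return the src values of images that need width/height added."""
    audit = ImgDimensionAudit()
    audit.feed(html)
    audit.close()
    return audit.missing
```

An empty result means every image on the page declares both attributes.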
7. Performance
- ⚠️ 1 render-blocking script(s) in `<head>` Move non-critical scripts to end of `<body>` or add `defer`/`async`.
- ⚠️ No images use loading=lazy Add `loading="lazy"` to off-screen images to defer their fetch until needed.
- ✅ Page size: 63 KB Compact HTML payload — well within AI crawler limits.
- ✅ Resource requests: 15 (scripts:13, css:1, img:1) Reasonable request count.
- ✅ Inline JS+CSS bulk: 41 KB Inline payload is modest.
- ✅ Response time: 99ms Fast first response.
- ✅ Cache-Control set Cache-Control: private, no-cache, no-store, max-age=0, must-revalidate
- ✅ Compression enabled (gzip) Content-Encoding: gzip
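To locate the render-blocking script flagged above, one option is to list external scripts in `<head>` that carry neither `defer` nor `async`. The sketch below assumes that definition; it skips `type="module"` scripts, which browsers defer by default, and ignores inline scripts. All names are hypothetical.

```python
from html.parser import HTMLParser


class HeadScriptAudit(HTMLParser):
    """Finds external <script> tags in <head> without defer or async."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.blocking = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            attr = dict(attrs)
            is_external = "src" in attr
            is_deferred = ("defer" in attr or "async" in attr
                           or attr.get("type") == "module")
            if is_external and not is_deferred:
                self.blocking.append(attr["src"] or "")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def render_blocking_scripts(html: str) -> list[str]:
    """Return the src of each parser-blocking external script in <head>."""
    audit = HeadScriptAudit()
    audit.feed(html)
    audit.close()
    return audit.blocking
```

Each returned URL is a candidate for `defer`, `async`, or a move to the end of `<body>`.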
8. Security
- ⚠️ HSTS missing Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident every page is served over HTTPS.
- ⚠️ Content-Security-Policy missing Define a CSP to limit script sources, which substantially reduces the XSS attack surface.
- ⚠️ X-Frame-Options missing Add `X-Frame-Options: SAMEORIGIN` (or use CSP `frame-ancestors`) to prevent clickjacking.
- ⚠️ X-Content-Type-Options missing Add `X-Content-Type-Options: nosniff` to block MIME-type sniffing.
- ⚠️ Referrer-Policy missing Add `Referrer-Policy: strict-origin-when-cross-origin` for safer referrers.
- ⚠️ Permissions-Policy missing Restrict browser features (camera, mic, geolocation) you don't use.
- ✅ Served over HTTPS
- ✅ No mixed content detected
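The header findings above can be turned into a small checklist. This sketch uses the header values suggested in this report; Content-Security-Policy is left out because its value is site-specific. The names `RECOMMENDED_HEADERS` and `missing_security_headers` are hypothetical.

```python
# Header values as suggested in this report; adjust them to your own policy.
RECOMMENDED_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def missing_security_headers(response_headers: dict) -> dict:
    """Return the recommended headers that the response does not set.

    Header names are compared case-insensitively, as HTTP requires.
    """
    present = {name.lower() for name in response_headers}
    return {
        name: value
        for name, value in RECOMMENDED_HEADERS.items()
        if name.lower() not in present
    }
```

Pass in the headers from any response object (for example a plain dict built from your HTTP client) and add whatever the function returns to the server or CDN configuration.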
9. robots.txt and sitemap.xml Audit
- ✅ robots.txt present 336 chars
- ✅ robots.txt references sitemap(s)
- ✅ sitemap.xml present (55 URLs)
10. LLM / AI Crawler Accessibility
- ⚠️ /.well-known/security.txt missing Publish a /.well-known/security.txt with at least a Contact: line. Crawlers and security researchers expect it; AI systems use it as a trust signal.
- ⚠️ /llms-full.txt missing Add /llms-full.txt with concatenated Markdown of all key pages. Lets LLMs ingest your full site in one request.
- ✅ GPTBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ ClaudeBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ PerplexityBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ Google-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ OAI-SearchBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ Applebot-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ CCBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
- ✅ llms.txt present 1618 chars
- ✅ skill.md present
- ✅ /.well-known/ai-plugin.json present
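Having an explicit User-agent block does not by itself mean a path is allowed. A quick way to confirm is Python's standard `urllib.robotparser`, sketched below with the crawler names listed in this section. The function name `crawler_access` and the sample rules in the usage note are hypothetical, and the standard-library matcher may differ in edge cases from how each crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Applebot-Extended", "CCBot",
]


def crawler_access(robots_txt: str, url: str) -> dict:
    """Map each AI crawler name to True/False for fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network call is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}
```

For example, feeding it the contents of your robots.txt together with `https://crawlproof.com/pricing` shows at a glance which of the seven crawlers may fetch the pricing page.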
11. Positioning Clarity
- ⚠️ Value-prop language not detected Pages with phrases like 'we help X', 'platform for Y', 'built for Z' are easier for LLMs to summarize.
- ✅ About/Team path discoverable
- ✅ H1 communicates value See your site the way AI crawlers do.
- ✅ Pricing path discoverable
- ✅ Contact / signup path discoverable
12. Missing or Hard-to-Find Information
- ⚠️ 5 data point(s) could not be found from public pages · Customer logos · Social proof · New hires · Positioning · Case studies or testimonials
13. Recommended Fixes
⚠️ Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include `sameAs` pointing to authoritative directories:
`{ "@context": "https://schema.org", "@type": "Organization", "name": "Your Brand", "url": "https://yoursite.com", "sameAs": [ "https://en.wikipedia.org/wiki/Your_Brand", "https://www.wikidata.org/wiki/Q12345678", "https://www.linkedin.com/company/your-brand", "https://www.crunchbase.com/organization/your-brand" ] }`
These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.
⚠️ Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.
⚠️ Publish a date signal Add `<time datetime="2026-05-17">` or `<meta property="article:published_time">`. AI ranking heavily weights freshness.
⚠️ Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.
⚠️ Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:
`# Your Brand - Full Content`
`## Getting Started`
`<full markdown content of /docs/start>`
`## API Reference`
`<full markdown content of /docs/api>`
Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
⚠️ Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.
Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.
⚠️ Tighten the meta description to 50–160 chars The current description is 171 chars, so snippets are likely to truncate it. Repeat your core value prop in plain language; this often becomes the AI snippet.
`<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />`
⚠️ Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.
⚠️ Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
⚠️ Fix broken homepage links A HEAD probe of the homepage links flagged 1 broken link: https://vu1nz.com/ timed out. Repair or remove it; broken links erode crawler trust.
⚠️ Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:
`Contact: mailto:security@yourdomain.com`
`Expires: 2027-01-01T00:00:00.000Z`
`Preferred-Languages: en`
⚠️ Eliminate render-blocking head scripts Add `defer` or `async` to any `<script src="…">` in `<head>`, or move it to the end of `<body>`.
⚠️ State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
⚠️ Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
⚠️ Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.
⚠️ Enable HSTS Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident every subdomain is https-ready.
⚠️ Define a Content-Security-Policy Start with `Content-Security-Policy-Report-Only` to learn safe sources, then enforce. Cuts XSS blast radius.
⚠️ Declare an author byline Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Combine with Person JSON-LD for E-E-A-T.
⚠️ Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
⚠️ Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.
⚠️ Add HowTo JSON-LD for step-by-step content For any 'how to' page, wrap the steps in HowTo JSON-LD. AI step-by-step answers cite these heavily.
⚠️ Add VideoObject JSON-LD For embedded videos, include VideoObject with thumbnailUrl, uploadDate, duration. AI engines cite these in multimedia answers.
⚠️ Add X-Frame-Options `X-Frame-Options: SAMEORIGIN` (or CSP `frame-ancestors`) blocks clickjacking via iframe embeds.
⚠️ Add X-Content-Type-Options `X-Content-Type-Options: nosniff` prevents browsers from MIME-sniffing responses.
⚠️ Set a Referrer-Policy `Referrer-Policy: strict-origin-when-cross-origin` is a safe default.
⚠️ Set a Permissions-Policy Restrict browser features you don't use, e.g. `Permissions-Policy: camera=(), microphone=(), geolocation=()`.
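The /llms-full.txt recommendation above describes a build-time concatenation. A minimal sketch of that step follows, assuming you already have each page's title and markdown in memory; the function name `build_llms_full` and the sample inputs are hypothetical, and reading the pages listed in llms.txt is left to your build tooling.

```python
def build_llms_full(site_name: str, pages: list[tuple[str, str]]) -> str:
    """Concatenate (title, markdown) pairs into one llms-full.txt body.

    `pages` should follow the order of the resources listed in llms.txt.
    """
    parts = [f"# {site_name} - Full Content"]
    for title, markdown in pages:
        # One H2 per page, followed by that page's full markdown.
        parts.append(f"## {title}\n\n{markdown.strip()}")
    return "\n\n".join(parts) + "\n"
```

Writing the returned string to `llms-full.txt` in the site root during each build keeps it in sync with the published pages.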
14. Priority To-Do List
P2 - Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include `sameAs` pointing to authoritative directories:
```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Brand",
  "url": "https://yoursite.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Brand",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand"
  ]
}
```
These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.
P3 - Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?'; they map directly to conversational AI queries.
P3 - Publish a date signal Add `<time datetime="2026-05-17">` or `<meta property="article:published_time">`. AI ranking heavily weights freshness.
P3 - Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.
P3 - Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:
```
# Your Brand - Full Content

## Getting Started
<full markdown content of /docs/start>

## API Reference
<full markdown content of /docs/api>
```
Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
P3 - Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.
Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.
P3 - Tighten the meta description to 50–160 chars The current description is 171 chars, so snippets are likely to truncate it. Repeat your core value prop in plain language; this often becomes the AI snippet.
```html
<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
```
P3 - Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.
P3 — Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
P3 - Fix broken homepage links A HEAD probe of the homepage links flagged 1 broken link: https://vu1nz.com/ timed out. Repair or remove it; broken links erode crawler trust.
P3 - Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:
```
Contact: mailto:security@yourdomain.com
Expires: 2027-01-01T00:00:00.000Z
Preferred-Languages: en
```
P3 - Eliminate render-blocking head scripts Add `defer` or `async` to any `<script src="…">` in `<head>`, or move it to the end of `<body>`.
P3 - State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
P3 — Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
P3 — Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.
P3 - Enable HSTS Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident every subdomain is https-ready.
P3 - Define a Content-Security-Policy Start with `Content-Security-Policy-Report-Only` to learn safe sources, then enforce. Cuts XSS blast radius.
P4 - Declare an author byline Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Combine with Person JSON-LD for E-E-A-T.
P4 - Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
P4 — Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.
Report by CrawlProof. Reusable after every major website change.