AEO Audit for crawlproof.com

Target: https://crawlproof.com/
Score: 57 / 100
Generated: 2026-07-01T20:10:59.490Z
Pages crawled: 9
Findings: 64 pass · 86 warn · 2 fail · 0 unknown

1. Crawl Summary

✅ Fetched 9 of 9 pages successfully Target: https://crawlproof.com

2. Data Found

Data Point	Found?	Source	Notes
Pricing	Yes	Pricing page	https://crawlproof.com/pricing
Customer logos	No	—	—
Social proof	No	—	—
Recent launches	Yes	Press/news pages	https://crawlproof.com/blog
Blog post activity	Yes	Blog	https://crawlproof.com/blog
New hires	No	—	Often only on a /blog/team or LinkedIn page
Headline copy	Yes	Homepage	See your site the way AI crawlers do.
Positioning	No	—	—
Executive team	Yes	About/team page	https://crawlproof.com/about
Product/service descriptions	Yes	Homepage	From meta description
Case studies or testimonials	No	—	—
Contact/demo/signup paths	Yes	Navigation links	—

3. Homepage Audit

⚠️ Long meta description (171 chars) Snippets truncate around 160 chars. Tighten to keep the key sentence visible.
✅ Homepage fetched successfully HTTP 200 · 64846 bytes · 99ms
✅ Page load time: 0.10s Fast — well within AI crawler budgets.
✅ <html lang="en"> declared
✅ Single H1 See your site the way AI crawlers do.
✅ <title> present (50 chars)
✅ Canonical present https://crawlproof.com
✅ Open Graph tags complete
✅ Twitter Card tags complete
✅ Critical content is server-rendered Raw and rendered text are within 10% of each other.
✅ Alt text coverage: 100% 1/1 images have alt text.
✅ Content volume: 718 words Substantive content — AI models have enough to summarize and recommend.
✅ Heading structure: 15 (h1:1, h2:5, h3:9) Multiple headings help AI chunk and outline your page.
✅ Internal links: 22 22 internal + 1 external links help crawlers navigate.
✅ Favicon declared

4. Content Quality

⚠️ No question-style headings found Phrase at least one heading as a user question (e.g. 'How does pricing work?') to match conversational AI queries.
⚠️ No date signal found Add <time datetime="…"> or article:published_time meta. AI ranking weights freshness.
⚠️ Text-to-HTML ratio: 7.8% Low text density — most of the response is markup/script.
⚠️ No author byline found Add <meta name="author" content="Name"> or a visible byline with rel="author". Strengthens E-E-A-T signals.
✅ Heading levels are well-ordered 16 headings nested in order.
✅ Snippet-ready blocks: 4 (ul:4, ol:0, table:0) Lists and tables are extracted verbatim by AI answer engines.

5. Schema / Structured Data Audit

⚠️ Article / BlogPosting JSON-LD not found Add Article / BlogPosting where applicable so AI answer engines can resolve the entity precisely.
⚠️ BreadcrumbList JSON-LD not found Add BreadcrumbList where applicable so AI answer engines can resolve the entity precisely.
⚠️ LocalBusiness JSON-LD not found Add LocalBusiness where applicable so AI answer engines can resolve the entity precisely.
⚠️ Person (author / founder) JSON-LD not found Add Person (author / founder) where applicable so AI answer engines can resolve the entity precisely.
⚠️ HowTo JSON-LD not found Add HowTo where applicable so AI answer engines can resolve the entity precisely.
⚠️ VideoObject JSON-LD not found Add VideoObject where applicable so AI answer engines can resolve the entity precisely.
✅ 6 JSON-LD block(s) found Types: WebSite, Organization, SoftwareApplication, Organization, SoftwareApplication, FAQPage
✅ Organization present
✅ WebSite present
✅ SoftwareApplication present
✅ FAQPage JSON-LD present

6. Links & Images

⚠️ Modern image formats: 0% (0/1 webp/avif) 0 legacy (png/jpg/gif) image(s). Convert hero/above-the-fold images to WebP or AVIF.
❌ Explicit dimensions: 0% (0/1) Add width and height attributes on <img> tags to prevent CLS.
⚠️ 1 broken link(s) in first 15 · timeout — https://vu1nz.com/
✅ External nofollow: 0% (0/1) Healthy mix of follow and nofollow outbound links.

7. Performance

⚠️ 1 render-blocking script(s) in <head> Move non-critical scripts to end of <body> or add defer/async.
⚠️ No images use loading=lazy Add loading="lazy" to off-screen images to defer their fetch until needed.
✅ Page size: 63 KB Compact HTML payload — well within AI crawler limits.
✅ Resource requests: 15 (scripts:13, css:1, img:1) Reasonable request count.
✅ Inline JS+CSS bulk: 41 KB Inline payload is modest.
✅ Response time: 99ms Fast first response.
✅ Cache-Control set Cache-Control: private, no-cache, no-store, max-age=0, must-revalidate
✅ Compression enabled (gzip) Content-Encoding: gzip

8. Security

⚠️ HSTS missing Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident in https.
⚠️ Content-Security-Policy missing Define a CSP to limit script sources — large reduction in XSS surface.
⚠️ X-Frame-Options missing Add X-Frame-Options: SAMEORIGIN (or use CSP frame-ancestors) to prevent clickjacking.
⚠️ X-Content-Type-Options missing Add X-Content-Type-Options: nosniff to block MIME-type sniffing.
⚠️ Referrer-Policy missing Add Referrer-Policy: strict-origin-when-cross-origin for safer referrers.
⚠️ Permissions-Policy missing Restrict browser features (camera, mic, geolocation) you don't use.
✅ Served over HTTPS
✅ No mixed content detected

9. robots.txt and sitemap.xml Audit

✅ robots.txt present 336 chars
✅ robots.txt references sitemap(s)
✅ sitemap.xml present (55 URLs)

10. LLM / AI Crawler Accessibility

⚠️ /.well-known/security.txt missing Publish a /.well-known/security.txt with at least a Contact: line. Crawlers and security researchers expect it; AI systems use it as a trust signal.
⚠️ /llms-full.txt missing Add /llms-full.txt with concatenated Markdown of all key pages. Lets LLMs ingest your full site in one request.
✅ GPTBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ ClaudeBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ PerplexityBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ Google-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ OAI-SearchBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ Applebot-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ CCBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
✅ llms.txt present 1618 chars
✅ skill.md present
✅ /.well-known/ai-plugin.json present

11. Positioning Clarity

⚠️ Value-prop language not detected Pages with phrases like 'we help X', 'platform for Y', 'built for Z' are easier for LLMs to summarize.
✅ About/Team path discoverable
✅ H1 communicates value See your site the way AI crawlers do.
✅ Pricing path discoverable
✅ Contact / signup path discoverable

12. Missing or Hard-to-Find Information

⚠️ 5 data point(s) could not be found from public pages · Customer logos · Social proof · New hires · Positioning · Case studies or testimonials

13. Recommended Fixes

⚠️ Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include sameAs pointing to authoritative directories:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Brand",
  "url": "https://yoursite.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Brand",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand"
  ]
}

These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.

⚠️ Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.
⚠️ Publish a date signal Add <time datetime="2026-05-17"> or <meta property="article:published_time">. AI ranking heavily weights freshness.
⚠️ Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.
⚠️ Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:
```
# Your Brand — Full Content

## Getting Started
<full markdown content of /docs/start>

## API Reference
<full markdown content of /docs/api>
```
Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
⚠️ Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.
Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.

⚠️ Shorten the meta description The current description is 171 chars; keep it within 50–160. Repeat your core value prop in plain language; this often becomes the AI snippet.

<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />

⚠️ Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.
⚠️ Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
⚠️ Fix broken homepage links A HEAD probe of the first 15 homepage links found 1 that timed out (https://vu1nz.com/). Repair or remove it; broken links erode crawler trust.
⚠️ Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:
```
Contact: mailto:security@yourdomain.com
Expires: 2027-01-01T00:00:00.000Z
Preferred-Languages: en
```
⚠️ Eliminate render-blocking head scripts Add defer or async to any <script src="…"> in <head>, or move it to the end of <body>.
⚠️ State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
⚠️ Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
⚠️ Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.
⚠️ Enable HSTS Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident every subdomain is https-ready.
⚠️ Define a Content-Security-Policy Start with Content-Security-Policy-Report-Only to learn safe sources, then enforce. Cuts XSS blast radius.
⚠️ Declare an author byline Add <meta name="author" content="Name"> or a visible byline with rel="author". Combine with Person JSON-LD for E-E-A-T.
⚠️ Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
⚠️ Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.
⚠️ Add HowTo JSON-LD for step-by-step content For any 'how to' page, wrap the steps in HowTo JSON-LD. AI step-by-step answers cite these heavily.
⚠️ Add VideoObject JSON-LD For embedded videos, include VideoObject with thumbnailUrl, uploadDate, duration. AI engines cite these in multimedia answers.
⚠️ Add X-Frame-Options X-Frame-Options: SAMEORIGIN (or CSP frame-ancestors) blocks clickjacking via iframe embeds.
⚠️ Add X-Content-Type-Options X-Content-Type-Options: nosniff prevents browsers from MIME-sniffing responses.
⚠️ Set a Referrer-Policy Referrer-Policy: strict-origin-when-cross-origin is a safe default.
⚠️ Set a Permissions-Policy Restrict browser features you don't use, e.g. Permissions-Policy: camera=(), microphone=(), geolocation=().

14. Priority To-Do List

P2 — Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include sameAs pointing to authoritative directories:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Brand",
  "url": "https://yoursite.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Brand",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand"
  ]
}
```

These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.

P3 — Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.
P3 — Publish a date signal Add <time datetime="2026-05-17"> or <meta property="article:published_time">. AI ranking heavily weights freshness.
P3 — Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.

P3 — Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:

```
# Your Brand — Full Content

## Getting Started
<full markdown content of /docs/start>

## API Reference
<full markdown content of /docs/api>
```

Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
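
One way to produce the file at build time is sketched below in Python. The function name `build_llms_full` and the (heading, markdown) input shape are illustrative assumptions, not part of any llms.txt tooling; how each page's Markdown is loaded depends on your build system.

```python
def build_llms_full(title, sections):
    """Concatenate (heading, markdown) pairs into one llms-full.txt document.

    `sections` is assumed to follow the order of the resources in llms.txt.
    """
    parts = [f"# {title}"]
    for heading, markdown in sections:
        # Each resource becomes an H2 followed by its full Markdown body.
        parts.append(f"## {heading}\n{markdown.strip()}")
    return "\n\n".join(parts) + "\n"
```

Writing the returned string to `llms-full.txt` in the site root as a build step keeps it in sync with the pages it mirrors.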

P3 — Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.
Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.

P3 — Shorten the meta description The current description is 171 chars; keep it within 50–160. Repeat your core value prop in plain language; this often becomes the AI snippet.

```html
<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
```

P3 — Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.
P3 — Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
P3 — Fix broken homepage links A HEAD probe of the first 15 homepage links found 1 that timed out (https://vu1nz.com/). Repair or remove it; broken links erode crawler trust.
P3 — Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:
```
Contact: mailto:security@yourdomain.com
Expires: 2027-01-01T00:00:00.000Z
Preferred-Languages: en
```
P3 — Eliminate render-blocking head scripts Add defer or async to any <script src="…"> in <head>, or move it to the end of <body>.
P3 — State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
P3 — Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
P3 — Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.
P3 — Enable HSTS Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident every subdomain is https-ready.
P3 — Define a Content-Security-Policy Start with Content-Security-Policy-Report-Only to learn safe sources, then enforce. Cuts XSS blast radius.
P4 — Declare an author byline Add <meta name="author" content="Name"> or a visible byline with rel="author". Combine with Person JSON-LD for E-E-A-T.
P4 — Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
P4 — Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.

Report by CrawlProof. Reusable after every major website change.

AEO audit

https://crawlproof.com/

Completed 7/1/2026, 8:10:59 PM

57 / 100

1. Crawl Summary

pass

Fetched 9 of 9 pages successfully

Target: https://crawlproof.com

Evidence

{
  "pages": [
    {
      "url": "https://crawlproof.com",
      "bytes": 64846,
      "status": 200,
      "fetchMs": 99
    },
    {
      "url": "https://crawlproof.com/about",
      "bytes": 28033,
      "status": 200,
      "fetchMs": 94
    },
    {
      "url": "https://crawlproof.com/pricing",
      "bytes": 43269,
      "status": 200,
      "fetchMs": 94
    },
    {
      "url": "https://crawlproof.com/blog",
      "bytes": 95901,
      "status": 200,
      "fetchMs": 195
    },
    {
      "url": "https://crawlproof.com/docs/autoblog-webhook",
      "bytes": 55539,
      "status": 200,
      "fetchMs": 96
    },
    {
      "url": "https://crawlproof.com/",
      "bytes": 64846,
      "status": 200,
      "fetchMs": 100
    },
    {
      "url": "https://crawlproof.com/hire",
      "bytes": 28836,
      "status": 200,
      "fetchMs": 93
    },
    {
      "url": "https://crawlproof.com/get-guide",
      "bytes": 34876,
      "status": 200,
      "fetchMs": 92
    },
    {
      "url": "https://crawlproof.com/recent",
      "bytes": 41156,
      "status": 200,
      "fetchMs": 94
    }
  ],
  "origin": "https://crawlproof.com",
  "target": "https://crawlproof.com"
}

2. Data Found

Data Point	Found?	Source	Notes
Pricing	Yes	Pricing page	https://crawlproof.com/pricing
Customer logos	No	—	—
Social proof	No	—	—
Recent launches	Yes	Press/news pages	https://crawlproof.com/blog
Blog post activity	Yes	Blog	https://crawlproof.com/blog
New hires	No	—	Often only on a /blog/team or LinkedIn page
Headline copy	Yes	Homepage	See your site the way AI crawlers do.
Positioning	No	—	—
Executive team	Yes	About/team page	https://crawlproof.com/about
Product/service descriptions	Yes	Homepage	From meta description
Case studies or testimonials	No	—	—
Contact/demo/signup paths	Yes	Navigation links	—

3. Homepage Audit

warn

Long meta description (171 chars)

Snippets truncate around 160 chars. Tighten to keep the key sentence visible.

Evidence

{
  "description": "CrawlProof runs an AEO audit on any URL and reports what LLM crawlers and answer engines can actually find — content, schema, robots rules, AI-bot access, and positioning."
}
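
A length check like the one below could run before publishing to catch this. It is a minimal Python sketch: the 50 and 160 character thresholds come from this report, while the function names and the word-boundary cut are assumptions about how a snippet might be truncated, not how any particular engine does it.

```python
def description_status(description, low=50, high=160):
    """Classify a meta description by length: 'short', 'ok', or 'long'."""
    n = len(description)
    if n < low:
        return "short"
    if n > high:
        return "long"
    return "ok"


def snippet_preview(description, limit=160):
    """Approximate what stays visible if the text is cut near `limit` characters."""
    if len(description) <= limit:
        return description
    # Cut at the last space before the limit so no word is split.
    cut = description.rfind(" ", 0, limit)
    if cut == -1:
        cut = limit
    return description[:cut].rstrip(" ,;:") + "..."
```

Running `snippet_preview` on the current 171-character description shows which closing words fall outside the visible part.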

pass
Homepage fetched successfully
HTTP 200 · 64846 bytes · 99ms
Evidence
```
{
  "bytes": 64846,
  "status": 200,
  "fetchMs": 99
}
```
P5
pass
Page load time: 0.10s
Fast — well within AI crawler budgets.
Evidence
```
{
  "fetchMs": 99
}
```
P5
pass
<html lang="en"> declared
Evidence
```
{
  "lang": "en"
}
```
P5
pass
Single H1
See your site the way AI crawlers do.
Evidence
```
{
  "h1": "See your site the way AI crawlers do."
}
```
P5

pass

`<title>` present (50 chars)

Evidence

{
  "title": "CrawlProof — See your site the way AI crawlers do."
}

pass
Canonical present
https://crawlproof.com
Evidence
```
{
  "canonical": "https://crawlproof.com"
}
```
P5

pass

Open Graph tags complete

Evidence

{
  "image": "https://crawlproof.com/banner.png",
  "title": "CrawlProof — See your site the way AI crawlers do.",
  "description": "CrawlProof runs an AEO audit on any URL and reports what LLM crawlers and answer engines can actually find — content, schema, robots rules, AI-bot access, and positioning."
}

pass

Twitter Card tags complete

Evidence

{
  "card": "summary_large_image",
  "image": "https://crawlproof.com/banner.png",
  "title": "CrawlProof — See your site the way AI crawlers do.",
  "description": "CrawlProof runs an AEO audit on any URL and reports what LLM crawlers and answer engines can actually find — content, schema, robots rules, AI-bot access, and positioning."
}

pass
Critical content is server-rendered
Raw and rendered text are within 10% of each other.
Evidence
```
{
  "ratio": 0.099250698501397,
  "rawTextLen": 47244,
  "renderedTextLen": 4689
}
```
P5
pass
Alt text coverage: 100%
1/1 images have alt text.
P5
pass
Content volume: 718 words
Substantive content — AI models have enough to summarize and recommend.
Evidence
```
{
  "wordCount": 718
}
```
P5
pass
Heading structure: 15 (h1:1, h2:5, h3:9)
Multiple headings help AI chunk and outline your page.
Evidence
```
{
  "h1": 1,
  "h2": 5,
  "h3": 9
}
```
P5
pass
Internal links: 22
22 internal + 1 external links help crawlers navigate.
Evidence
```
{
  "external": 1,
  "internal": 22
}
```
P5
pass
Favicon declared
Evidence
```
{
  "href": "/favicon.svg"
}
```
P5

4. Content Quality

warn
No question-style headings found
Phrase at least one heading as a user question (e.g. 'How does pricing work?') to match conversational AI queries.
P3
warn
No date signal found
Add <time datetime="…"> or article:published_time meta. AI ranking weights freshness.
P3
warn
Text-to-HTML ratio: 7.8%
Low text density — most of the response is markup/script.
Evidence
```
{
  "ratio": 0.07817811012916383,
  "htmlBytes": 64724,
  "textChars": 5060
}
```
P3
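
The ratio in the evidence is textChars divided by htmlBytes (5060 / 64724 ≈ 0.078). The Python sketch below shows one way to reproduce a similar measurement with the standard library. How the audit extracts visible text is not stated in this report, so skipping `<script>`, `<style>`, and `<noscript>` is an assumption and the numbers may differ slightly.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside script, style, and noscript elements."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


def text_to_html_ratio(html):
    """Return visible-text characters divided by the UTF-8 size of the HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # collapse whitespace
    html_bytes = len(html.encode("utf-8"))
    return len(text) / html_bytes if html_bytes else 0.0
```

Comparing the ratio before and after moving inline scripts to external files gives a direct measure of whether the fix helped.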
warn
No author byline found
Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Strengthens E-E-A-T signals.
P4

pass

Heading levels are well-ordered

16 headings nested in order.

Evidence

{
  "levels": [
    1,
    2,
    3,
    3,
    3,
    3,
    3,
    3,
    3,
    3,
    2,
    3,
    4,
    2,
    2,
    2
  ]
}
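
The report does not define "well-ordered". A common reading is that no heading sits more than one level deeper than the heading before it, and the Python sketch below checks the levels from the evidence under that assumption. The function name is illustrative.

```python
def first_skipped_level(levels):
    """Return the index of the first heading that skips a level, or None."""
    previous = 0  # treat the document root as level 0, so the first heading must be h1
    for index, level in enumerate(levels):
        if level > previous + 1:
            return index
        previous = level
    return None


# Heading levels copied from the evidence above.
levels = [1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 4, 2, 2, 2]
print(first_skipped_level(levels))  # None: no level is skipped
```

Moving back up by any number of levels (for example h4 to h2) is allowed under this rule; only downward jumps such as h1 to h3 are flagged.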

pass
Snippet-ready blocks: 4 (ul:4, ol:0, table:0)
Lists and tables are extracted verbatim by AI answer engines.
Evidence
```
{
  "ol": 0,
  "ul": 4,
  "tables": 0
}
```
P5

5. Schema / Structured Data Audit

warn
Article / BlogPosting JSON-LD not found
Add Article / BlogPosting where applicable so AI answer engines can resolve the entity precisely.
P3
warn
BreadcrumbList JSON-LD not found
Add BreadcrumbList where applicable so AI answer engines can resolve the entity precisely.
P3
warn
Person (author / founder) JSON-LD not found
Add Person (author / founder) where applicable so AI answer engines can resolve the entity precisely.
P4
warn
LocalBusiness JSON-LD not found
Add LocalBusiness where applicable so AI answer engines can resolve the entity precisely.
P4
warn
HowTo JSON-LD not found
Add HowTo where applicable so AI answer engines can resolve the entity precisely.
P4
warn
VideoObject JSON-LD not found
Add VideoObject where applicable so AI answer engines can resolve the entity precisely.
P4

pass

6 JSON-LD block(s) found

Types: WebSite, Organization, SoftwareApplication, Organization, SoftwareApplication, FAQPage

Evidence

{
  "types": [
    "WebSite",
    "Organization",
    "SoftwareApplication",
    "Organization",
    "SoftwareApplication",
    "FAQPage"
  ],
  "blocks": 6
}
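
Organization and SoftwareApplication each appear twice in the list above. To see which page or block declares them, you could count `@type` values per page. The following Python sketch uses only the standard library; `JsonLdCollector` and `jsonld_type_counts` are hypothetical names, and nodes nested under `@graph` are not expanded.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks[-1] += data


def jsonld_type_counts(html):
    """Count top-level @type values across all JSON-LD blocks in a page."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter()
    for raw in collector.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail the whole page
        for node in doc if isinstance(doc, list) else [doc]:
            if isinstance(node, dict):
                types = node.get("@type", [])
                counts.update([types] if isinstance(types, str) else types)
    return counts
```

A count above 1 for a type on the same page usually means two templates each emit their own block; merging them avoids conflicting values for the same entity.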

pass
Organization present
P5
pass
WebSite present
P5
pass
SoftwareApplication present
P5
pass
FAQPage JSON-LD present
P5

6. Links & Images

warn

1 broken link(s) in first 15

· timeout — https://vu1nz.com/

Evidence

{
  "broken": 1,
  "sampled": 15,
  "examples": [
    {
      "url": "https://vu1nz.com/",
      "status": 0
    }
  ]
}

warn
Modern image formats: 0% (0/1 webp/avif)
0 legacy (png/jpg/gif) image(s). Convert hero/above-the-fold images to WebP or AVIF.
Evidence
```
{
  "total": 1,
  "legacy": 0,
  "modern": 0
}
```
P3
fail
Explicit dimensions: 0% (0/1)
Add width and height attributes on <img> tags to prevent CLS.
Evidence
```
{
  "total": 1,
  "withDims": 0
}
```
P3
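
A check for this finding can be scripted so it runs on every page, not just the homepage. The Python sketch below counts `<img>` tags that carry both `width` and `height`, and also those with `loading="lazy"`. The result keys `total` and `withDims` mirror the evidence above; `lazy` and the function name are additions for illustration.

```python
from html.parser import HTMLParser


class ImgAudit(HTMLParser):
    """Count <img> tags and how many declare dimensions or lazy loading."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.with_dims = 0
        self.lazy = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        self.total += 1
        if attributes.get("width") and attributes.get("height"):
            self.with_dims += 1
        if (attributes.get("loading") or "").lower() == "lazy":
            self.lazy += 1


def audit_images(html):
    audit = ImgAudit()
    audit.feed(html)
    audit.close()
    return {"total": audit.total, "withDims": audit.with_dims, "lazy": audit.lazy}
```

Dimensions set only through CSS are not detected by this check, since it looks at HTML attributes alone.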
pass
External nofollow: 0% (0/1)
Healthy mix of follow and nofollow outbound links.
P5

7. Performance

warn
1 render-blocking script(s) in <head>
Move non-critical scripts to end of <body> or add `defer`/`async`.
P3
warn
No images use loading=lazy
Add loading="lazy" to off-screen images to defer their fetch until needed.
P3
pass
Page size: 63 KB
Compact HTML payload — well within AI crawler limits.
Evidence
```
{
  "bytes": 64846
}
```
P5

pass

Resource requests: 15 (scripts:13, css:1, img:1)

Reasonable request count.

Evidence

{
  "imgs": 1,
  "styles": 1,
  "scripts": 13,
  "inlineStyles": 0,
  "inlineScripts": 21
}

pass
Inline JS+CSS bulk: 41 KB
Inline payload is modest.
Evidence
```
{
  "inlineJsBytes": 42139,
  "inlineCssBytes": 0
}
```
P5
pass
Response time: 99ms
Fast first response.
Evidence
```
{
  "bytes": 64846,
  "fetchMs": 99
}
```
P5
pass
Cache-Control set
Cache-Control: private, no-cache, no-store, max-age=0, must-revalidate
Evidence
```
{
  "cdn": [],
  "cacheControl": "private, no-cache, no-store, max-age=0, must-revalidate"
}
```
P5
pass
Compression enabled (gzip)
Content-Encoding: gzip
Evidence
```
{
  "encoding": "gzip"
}
```
P5

8. Security

warn
HSTS missing
Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident in https.
P3
warn
Content-Security-Policy missing
Define a CSP to limit script sources — large reduction in XSS surface.
P3
warn
X-Frame-Options missing
Add `X-Frame-Options: SAMEORIGIN` (or use CSP frame-ancestors) to prevent clickjacking.
P4
warn
X-Content-Type-Options missing
Add `X-Content-Type-Options: nosniff` to block MIME-type sniffing.
P4
warn
Referrer-Policy missing
Add `Referrer-Policy: strict-origin-when-cross-origin` for safer referrers.
P4
warn
Permissions-Policy missing
Restrict browser features (camera, mic, geolocation) you don't use.
P4

pass

Served over HTTPS

Evidence

{
  "finalUrl": "https://crawlproof.com/"
}

pass
No mixed content detected
P5
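
The six header findings above can be re-checked after deployment with a small script. This Python sketch takes response headers as a plain dict, so it works with any HTTP client; the suggested values are the ones quoted in this report, and `missing_security_headers` is a hypothetical helper name.

```python
# Header names from the findings above, with the values this report suggests.
RECOMMENDED = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": None,  # site-specific; no universal value
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def missing_security_headers(response_headers):
    """Return recommended header names absent from the response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [name for name in RECOMMENDED if name.lower() not in present]
```

The check only tests presence. A header with a weak value, such as a very short HSTS `max-age`, would still pass and needs a separate review.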

9. robots.txt and sitemap.xml Audit

pass

robots.txt present

336 chars

Evidence

{
  "snippet": "User-Agent: *\nAllow: /\n\nUser-Agent: GPTBot\nAllow: /\n\nUser-Agent: ClaudeBot\nAllow: /\n\nUser-Agent: PerplexityBot\nAllow: /\n\nUser-Agent: Google-Extended\nAllow: /\n\nUser-Agent: OAI-SearchBot\nAllow: /\n\nUser-Agent: Applebot-Extended\nAllow: /\n\nUser-Agent: CCBot\nAllow: /\n\nHost: https://crawlproof.com\nSitemap: https://crawlproof.com/sitemap.xml\n"
}
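
Having an explicit block per bot does not by itself prove a given path is allowed. The sketch below feeds the robots.txt shown in the snippet to Python's standard `urllib.robotparser` and asks whether each AI crawler may fetch a URL. The helper name `allowed_agents` is an assumption, and the standard-library parser may resolve some edge cases differently from the crawlers themselves.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Applebot-Extended", "CCBot",
]

# Rebuilt from the snippet above: one Allow-all block per agent, then Host and Sitemap.
ROBOTS_TXT = (
    "".join(f"User-Agent: {agent}\nAllow: /\n\n" for agent in ["*"] + AI_AGENTS)
    + "Host: https://crawlproof.com\n"
    + "Sitemap: https://crawlproof.com/sitemap.xml\n"
)


def allowed_agents(robots_txt, url, agents=AI_AGENTS):
    """Map each user agent to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


print(allowed_agents(ROBOTS_TXT, "https://crawlproof.com/pricing"))
```

Running the same function against a list of key URLs after each robots.txt change guards against accidentally blocking a crawler.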

pass

robots.txt references sitemap(s)

Evidence

{
  "sitemaps": [
    "https://crawlproof.com/sitemap.xml"
  ]
}

pass

sitemap.xml present (55 URLs)

Evidence

{
  "snippet": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n<url>\n<loc>https://crawlproof.com/</loc>\n<changefreq>weekly</changefreq>\n<priority>1</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/pricing</loc>\n<changefreq>monthly</changefreq>\n<priority>0.9</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/hire</loc>\n<changefreq>monthly</changefreq>\n<priority>0.9</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/get-guide</loc>\n<changefreq>monthly</changefreq>\n<priority>0.9</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/about</loc>\n<changefreq>monthly</changefreq>\n<priority>0.7</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/press</loc>\n<changefreq>monthly</changefreq>\n<priority>0.6</priority>\n</url>\n<url>\n<loc>https://crawlproof.com/blo",
  "urlCount": 55
}
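
The URL count can be verified independently by parsing the sitemap. This Python sketch uses `xml.etree.ElementTree` with the sitemaps.org namespace seen in the snippet above; it handles a plain `urlset` only, not a sitemap index, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <urlset> element in the snippet above.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a urlset sitemap."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

Comparing `len(sitemap_urls(...))` with the number of pages you expect to be indexed highlights URLs that are missing from the sitemap or no longer exist.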

10. LLM / AI Crawler Accessibility

warn
/.well-known/security.txt missing
Publish a /.well-known/security.txt with at least a Contact: line. Crawlers and security researchers expect it; AI systems use it as a trust signal.
P3
warn
/llms-full.txt missing
Add /llms-full.txt with concatenated Markdown of all key pages. Lets LLMs ingest your full site in one request.
P3
pass
GPTBot has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
ClaudeBot has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
PerplexityBot has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
Google-Extended has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
OAI-SearchBot has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
Applebot-Extended has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5
pass
CCBot has explicit rules
An explicit User-agent block exists. Make sure it allows the paths you want indexed.
P5

pass

llms.txt present

1618 chars

Evidence

{
  "snippet": "# CrawlProof\n\n> See your site the way AI crawlers do. CrawlProof runs an AEO audit on any URL and produces a structured report of what LLM crawlers and answer engines can actually find — content, schema, robots rules, AI-bot access, positioning clarity, and recommended fixes.\n\n## Product\n- Free single-URL AEO audit, 10 per day per IP, no signup required.\n- Signed-in users get 20 free credits (1 AI"
}

pass
skill.md present
P5
pass
/.well-known/ai-plugin.json present
P5

11. Generative Engine Optimization (GEO)

warn
Organization schema has no knowledge graph sameAs links
Add `sameAs` pointing to Wikipedia, Wikidata, LinkedIn, or Crunchbase. AI systems use these links to resolve your brand name as a known entity rather than an ambiguous string.
Evidence
```
{
  "existingSameAs": []
}
```
P2
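
If the Organization JSON-LD is generated in code, the links can be merged in programmatically. The following Python sketch adds `sameAs` URLs to an existing node without duplicating entries. `with_same_as` is a hypothetical helper, and any profile URLs you pass must be real pages for your brand; none are assumed to exist here.

```python
def with_same_as(organization, profile_urls):
    """Return a copy of an Organization node with de-duplicated sameAs links."""
    existing = organization.get("sameAs", [])
    if isinstance(existing, str):
        existing = [existing]  # schema.org allows a single string value
    # dict.fromkeys keeps first-seen order while dropping repeats.
    merged = list(dict.fromkeys([*existing, *profile_urls]))
    return {**organization, "sameAs": merged}
```

Serializing the result with `json.dumps` and embedding it in the existing `application/ld+json` block keeps a single Organization node per page.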
warn
No llms-full.txt found
llms-full.txt is an extended version of llms.txt that embeds the full text of every listed resource. Large-context models and RAG pipelines can ingest your entire site in one request — greatly improving citation coverage.
P3
warn
No outbound links to authoritative sources on the homepage
Generative AI systems cite pages that themselves cite credible sources. Link to Wikipedia, .gov, .edu, peer-reviewed research, or major news outlets to signal epistemic trustworthiness.
P3

pass

llms.txt present with meaningful content

5 section(s), 6 linked resource(s), 223 words.

Evidence

{
  "words": 223,
  "snippet": "# CrawlProof\n\n> See your site the way AI crawlers do. CrawlProof runs an AEO audit on any URL and produces a structured report of what LLM crawlers and answer engines can actually find — content, schema, robots rules, AI-bot access, positioning clarity, and recommended fixes.\n\n## Product\n- Free single-URL AEO audit, 10 per day per IP, no signup required.\n- Signed-in users get 20 free credits (1 AI",
  "headings": 5,
  "urlLines": 6
}

pass
AI agent integration files present: ai-plugin.json, skill.md
Generative AI agents can discover and use your site's capabilities via standard integration files.
Evidence
```
{
  "files": [
    "ai-plugin.json",
    "skill.md"
  ]
}
```
P5
pass
Brand entity declared: "CrawlProof"
AI systems can resolve "CrawlProof" as a distinct entity. Ensure this name is used consistently in titles, H1s, and social profiles.
Evidence
```
{
  "schemaName": "CrawlProof"
}
```
P5

12. Positioning Clarity

warn
Value-prop language not detected
Pages with phrases like 'we help X', 'platform for Y', 'built for Z' are easier for LLMs to summarize.
P3
pass
Pricing path discoverable
P5
pass
Contact / signup path discoverable
P5
pass
About/Team path discoverable
P5
pass
H1 communicates value
See your site the way AI crawlers do.
P5

13. Foundations

warn
No feed announced
Add <link rel="alternate" type="application/rss+xml" href="/feed.xml"> so agents and readers can subscribe.
P4
pass
<!doctype html> declared
P5
pass
Meta viewport present
content="width=device-width, initial-scale=1, maximum-scale=5"
Evidence
```
{
  "content": "width=device-width, initial-scale=1, maximum-scale=5"
}
```
P5
pass
Theme color set (#0b0d10)
Evidence
```
{
  "color": "#0b0d10"
}
```
P5

14. Accessibility

warn
No skip navigation link found
Add a visible-on-focus skip link as the first focusable element: <a href="#main">Skip to main content</a>.
P3
fail
Form label coverage: 0% (0/2)
Each form input needs a <label for=…>, aria-label, or aria-labelledby for screen readers.
P3
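To check label coverage locally before re-running the audit, a rough sketch along these lines may help. The counting rules here (which input types are skipped, and that a wrapping `<label>` counts) are assumptions and may not match how the audit computes its 0/2 figure.

```python
from html.parser import HTMLParser


class LabelCoverage(HTMLParser):
    """Count form controls and how many of them have an accessible name."""

    CONTROLS = {"input", "select", "textarea"}
    SKIPPED_TYPES = {"hidden", "submit", "button", "reset", "image"}

    def __init__(self):
        super().__init__()
        self.controls = []          # one attribute dict per counted control
        self.label_targets = set()  # ids referenced by <label for="...">
        self.label_depth = 0        # > 0 while inside a <label> element

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "label":
            self.label_depth += 1
            if a.get("for"):
                self.label_targets.add(a["for"])
        elif tag in self.CONTROLS:
            if tag == "input" and (a.get("type") or "text").lower() in self.SKIPPED_TYPES:
                return
            a["_wrapped"] = self.label_depth > 0
            self.controls.append(a)

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth:
            self.label_depth -= 1

    def coverage(self):
        """Return (labelled, total) after the whole document has been fed."""
        labelled = sum(
            1 for c in self.controls
            if c.get("aria-label") or c.get("aria-labelledby")
            or c["_wrapped"] or c.get("id") in self.label_targets
        )
        return labelled, len(self.controls)


parser = LabelCoverage()
parser.feed(
    '<form><input type="email" name="email">'
    '<label for="u">URL</label><input id="u" name="url"></form>'
)
print(parser.coverage())  # (1, 2): only the second input is labelled
```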
pass
All semantic landmarks present (header/nav/main/footer)
Evidence
```
{
  "nav": 1,
  "main": 1,
  "footer": 1,
  "header": 1
}
```
P5

15. Well-Known URIs

warn
No HTTP Link header
Add a Link response header to advertise sitemap, llms.txt, and api-catalog to crawlers without requiring HTML parsing.
P4
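As an illustration, the header value could be assembled like this. The `link_header` helper is hypothetical, and the relation names chosen for the sitemap and llms.txt are assumptions; of the three, only `api-catalog` is defined for this purpose (by RFC 9727).

```python
def link_header(links):
    """Format (target, rel) pairs as one Link header value (RFC 8288 syntax)."""
    return ", ".join(f'<{target}>; rel="{rel}"' for target, rel in links)


value = link_header([
    ("/sitemap.xml", "sitemap"),           # assumed relation name
    ("/llms.txt", "describedby"),          # assumed relation name
    ("/.well-known/api-catalog", "api-catalog"),
])
print("Link:", value)
```

The resulting string would be sent as the `Link` response header on the homepage, or on every HTML response.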
warn
/.well-known/change-password missing
Redirect /.well-known/change-password to your password-change page so password managers can deep-link directly.
P4
warn
/.well-known/api-catalog missing
Publish a /.well-known/api-catalog (RFC 9727 Linkset) so agents can discover your API endpoints automatically.
P4
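A hedged sketch of generating the catalog document follows. It assumes the Linkset JSON shape (a top-level `linkset` array of objects with an `anchor` plus link relations such as `service-desc` and `service-doc`); the helper name and the example URLs are hypothetical, so check the output against RFC 9727 before publishing.

```python
import json


def api_catalog(apis):
    """Build a Linkset-style dict for /.well-known/api-catalog.

    apis: list of dicts with 'anchor' (the API base URL) and optional
    'service-desc' / 'service-doc' URLs. This input shape is an assumption.
    """
    linkset = []
    for api in apis:
        entry = {"anchor": api["anchor"]}
        for rel in ("service-desc", "service-doc"):
            if rel in api:
                entry[rel] = [{"href": api[rel]}]
        linkset.append(entry)
    return {"linkset": linkset}


catalog = api_catalog([{
    "anchor": "https://example.com/api/v1",                     # placeholder
    "service-desc": "https://example.com/api/v1/openapi.json",  # placeholder
}])
print(json.dumps(catalog, indent=2))
```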
warn
/.well-known/agent-card.json missing
Publish an agent card at /.well-known/agent-card.json to enable agent-to-agent (A2A) discovery.
P4

16. Privacy

warn
1 third-party script(s) loaded
Each external script can read cookies and page data. Audit: https://feedback.profullstack.com/embed/profullstack-feedback.js
Evidence
```
{
  "scripts": [
    "https://feedback.profullstack.com/embed/profullstack-feedback.js"
  ]
}
```
P4
pass
Privacy policy link found
/privacy
Evidence
```
{
  "href": "/privacy"
}
```
P5

17. Resilience

pass
Web App Manifest declared
/manifest.webmanifest
Evidence
```
{
  "href": "/manifest.webmanifest"
}
```
P5

18. Website Specification

No findings.

19. Missing or Hard-to-Find Information

warn
5 data point(s) could not be found from public pages
· Customer logos · Social proof · New hires · Positioning · Case studies or testimonials
Evidence
```
{
  "missing": [
    "Customer logos",
    "Social proof",
    "New hires",
    "Positioning",
    "Case studies or testimonials"
  ]
}
```
P3

20. Recommended Fixes

warn
Add sameAs knowledge graph links to Organization schema
Extend your Organization JSON-LD to include `sameAs` pointing to authoritative directories:
```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Brand",
  "url": "https://yoursite.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Brand",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand"
  ]
}
```
These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.
Evidence
```
{
  "for": "geo.knowledge_graph"
}
```
P2
warn
Phrase a heading as a user question
Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.
Evidence
```
{
  "for": "content.qa_headings"
}
```
P3
warn
Publish a date signal
Add `<time datetime="2026-05-17">` or `<meta property="article:published_time">`. AI ranking heavily weights freshness.
Evidence
```
{
  "for": "content.date_signal"
}
```
P3
warn
Raise your text-to-HTML ratio
Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.
Evidence
```
{
  "for": "content.text_ratio"
}
```
P3
warn
Generate /llms-full.txt for RAG pipelines
llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:
```
# Your Brand — Full Content
## Getting Started
<full markdown content of /docs/start>
## API Reference
<full markdown content of /docs/api>
```
Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
Evidence
```
{
  "for": "geo.llms_full_txt"
}
```
P3
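The build-time step described above could be sketched as follows. The function names, the `(heading, path)` input shape, and the `public/llms-full.txt` output path are assumptions for illustration; adapt them to however your site stores its markdown sources.

```python
from pathlib import Path


def build_llms_full(title, sections):
    """Concatenate markdown sources into one llms-full.txt body.

    sections: list of (heading, markdown_text) pairs, in llms.txt order.
    """
    parts = [f"# {title}"]
    for heading, markdown in sections:
        parts.append(f"## {heading}\n\n{markdown.strip()}")
    return "\n\n".join(parts) + "\n"


def write_llms_full(title, sources, out_path="public/llms-full.txt"):
    """sources: list of (heading, path_to_markdown_file) pairs (hypothetical)."""
    sections = [(h, Path(p).read_text(encoding="utf-8")) for h, p in sources]
    Path(out_path).write_text(build_llms_full(title, sections), encoding="utf-8")
```

Calling `write_llms_full` from the site's build script keeps the file in sync with the pages it summarizes.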
warn
Tighten the meta description
Keep it within 50–160 chars; the homepage description is currently 171 chars, so snippets may truncate it. Repeat your core value prop in plain language; this often becomes the AI snippet.
```html
<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
```
Evidence
```
{
  "for": "homepage.description"
}
```
P3
warn
Use modern image formats
Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as <picture> fallbacks.
Evidence
```
{
  "for": "images.format"
}
```
P3
warn
Set width/height on images
Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
Evidence
```
{
  "for": "images.dimensions"
}
```
P3
warn
Fix broken homepage links
We HEAD-probed the first 20 unique homepage links and found 4xx/5xx responses. Repair or remove them — broken links erode crawler trust.
Evidence
```
{
  "for": "links.broken_sample"
}
```
P3
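To reproduce this check locally, a rough equivalent of the HEAD probe might look like the sketch below. It is an approximation under stated assumptions, not the audit's actual implementation; the function names and the 10-second timeout are invented here.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None on network failure."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises 4xx/5xx responses as exceptions
    except OSError:
        return None      # DNS failure, refused connection, timeout, ...


def find_broken(urls, limit=20, probe=head_status):
    """Probe the first `limit` unique URLs; return {url: status} for failures."""
    unique = list(dict.fromkeys(urls))[:limit]  # de-duplicate, keep order
    results = {url: probe(url) for url in unique}
    return {u: s for u, s in results.items() if s is None or s >= 400}
```

Some servers reject HEAD requests while serving GET normally, so a reported failure is worth confirming with a GET before removing the link.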
warn
Publish /.well-known/security.txt
A security contact builds trust with crawlers and researchers. Minimal example:
```
Contact: mailto:security@yourdomain.com
Expires: 2027-01-01T00:00:00.000Z
Preferred-Languages: en
```
Evidence
```
{
  "for": "security_txt"
}
```
P3
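Because the `Expires` field goes stale, it can help to regenerate the file at deploy time. A minimal sketch, assuming a hypothetical `security_txt` helper and a 180-day validity window chosen for illustration:

```python
from datetime import datetime, timedelta, timezone


def security_txt(contact, now=None, valid_days=180, languages=("en",)):
    """Render a minimal security.txt body with a rolling Expires date."""
    now = now or datetime.now(timezone.utc)
    expires = (now + timedelta(days=valid_days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return "\n".join([
        f"Contact: {contact}",
        f"Expires: {expires}",
        f"Preferred-Languages: {', '.join(languages)}",
    ]) + "\n"


# The contact address is a placeholder; replace it with a monitored inbox.
print(security_txt("mailto:security@yourdomain.com"))
```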
warn
Eliminate render-blocking head scripts
Add `defer` or `async` to any `<script src="…">` in `<head>`, or move it to the end of `<body>`.
Evidence
```
{
  "for": "perf.render_blocking"
}
```
P3
warn
State your audience explicitly
Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
Evidence
```
{
  "for": "positioning.audience"
}
```
P3
warn
Add Article / BlogPosting JSON-LD
On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
Evidence
```
{
  "for": "schema.article"
}
```
P3
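One way to emit this markup from a blog template is sketched below. The helper name and arguments are hypothetical, and the property set is only the four fields named above plus `mainEntityOfPage`, which is an addition made for this example.

```python
import json


def article_jsonld(headline, author, published, modified, url):
    """Return a <script> tag containing BlogPosting JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published,   # ISO 8601 date or datetime string
        "dateModified": modified,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(article_jsonld(
    "Example post title", "Jane Doe",          # placeholder values
    "2026-05-17", "2026-06-01",
    "https://example.com/blog/example-post",
))
```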
warn
Add BreadcrumbList JSON-LD
Helps AI engines understand site hierarchy and improves citation context.
Evidence
```
{
  "for": "schema.breadcrumb"
}
```
P3
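For reference, a BreadcrumbList can be derived from a page's trail of ancestors. This is a sketch with a hypothetical helper and placeholder URLs; the output would be embedded in a `<script type="application/ld+json">` tag.

```python
import json


def breadcrumb_jsonld(trail):
    """trail: list of (name, absolute_url) pairs from the site root down."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }


crumbs = breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog"),
])
print(json.dumps(crumbs, indent=2))
```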
warn
Enable HSTS
Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident every subdomain is https-ready.
Evidence
```
{
  "for": "security.hsts"
}
```
P3
warn
Define a Content-Security-Policy
Start with `Content-Security-Policy-Report-Only` to learn safe sources, then enforce. Cuts XSS blast radius.
Evidence
```
{
  "for": "security.csp"
}
```
P3
warn
Add outbound links to authoritative sources
Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood. Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.
Evidence
```
{
  "for": "geo.citation_signals"
}
```
P3
warn
Declare an author byline
Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Combine with Person JSON-LD for E-E-A-T.
Evidence
```
{
  "for": "content.author"
}
```
P4
warn
Add LocalBusiness JSON-LD (if you have a physical location)
Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
Evidence
```
{
  "for": "schema.local_business"
}
```
P4
warn
Add Person JSON-LD for authors / founders
Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.
Evidence
```
{
  "for": "schema.person"
}
```
P4
warn
Add HowTo JSON-LD for step-by-step content
For any 'how to' page, wrap the steps in HowTo JSON-LD. AI step-by-step answers cite these heavily.
Evidence
```
{
  "for": "schema.howto"
}
```
P4
warn
Add VideoObject JSON-LD
For embedded videos, include VideoObject with thumbnailUrl, uploadDate, duration. AI engines cite these in multimedia answers.
Evidence
```
{
  "for": "schema.video"
}
```
P4
warn
Add X-Frame-Options
`X-Frame-Options: SAMEORIGIN` (or CSP `frame-ancestors`) blocks clickjacking via iframe embeds.
Evidence
```
{
  "for": "security.xfo"
}
```
P4
warn
Add X-Content-Type-Options
`X-Content-Type-Options: nosniff` prevents browsers from MIME-sniffing responses.
Evidence
```
{
  "for": "security.xcto"
}
```
P4
warn
Set a Referrer-Policy
`Referrer-Policy: strict-origin-when-cross-origin` is a safe default.
Evidence
```
{
  "for": "security.referrer"
}
```
P4
warn
Set a Permissions-Policy
Restrict browser features you don't use, e.g. `Permissions-Policy: camera=(), microphone=(), geolocation=()`.
Evidence
```
{
  "for": "security.permissions"
}
```
P4
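The header recommendations above can be applied in one place. The sketch below assumes a Python WSGI application, which may not match how this site is actually served; the middleware name is hypothetical and the CSP value is a placeholder to be tuned in report-only mode first.

```python
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    # Report-only first, as suggested above; this policy is a placeholder.
    "Content-Security-Policy-Report-Only": "default-src 'self'",
}


def with_security_headers(app, extra=SECURITY_HEADERS):
    """Wrap a WSGI app so responses carry any header in `extra` they lack."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            present = {name.lower() for name, _ in headers}
            merged = list(headers) + [
                (k, v) for k, v in extra.items() if k.lower() not in present
            ]
            return start_response(status, merged, exc_info)
        return app(environ, patched_start)
    return wrapped
```

Headers the application already sets are left untouched, so individual routes can still override a default. On other stacks the same values would go in the web server or CDN configuration instead.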

21. Priority To-Do List

warn
[ ] Add sameAs knowledge graph links to Organization schema
Extend your Organization JSON-LD to include `sameAs` pointing to authoritative directories:
```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Brand",
  "url": "https://yoursite.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Brand",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand"
  ]
}
```
These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.
Evidence
```
{
  "priority": 2
}
```
P2
warn
[ ] Phrase a heading as a user question
Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Publish a date signal
Add `<time datetime="2026-05-17">` or `<meta property="article:published_time">`. AI ranking heavily weights freshness.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Raise your text-to-HTML ratio
Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Generate /llms-full.txt for RAG pipelines
llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:
```
# Your Brand — Full Content
## Getting Started
<full markdown content of /docs/start>
## API Reference
<full markdown content of /docs/api>
```
Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Add outbound links to authoritative sources
Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood. Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Tighten the meta description
Keep it within 50–160 chars; the homepage description is currently 171 chars, so snippets may truncate it. Repeat your core value prop in plain language; this often becomes the AI snippet.
```html
<meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
```
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Use modern image formats
Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as <picture> fallbacks.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Set width/height on images
Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Fix broken homepage links
We HEAD-probed the first 20 unique homepage links and found 4xx/5xx responses. Repair or remove them — broken links erode crawler trust.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Publish /.well-known/security.txt
A security contact builds trust with crawlers and researchers. Minimal example:
```
Contact: mailto:security@yourdomain.com
Expires: 2027-01-01T00:00:00.000Z
Preferred-Languages: en
```
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Eliminate render-blocking head scripts
Add `defer` or `async` to any `<script src="…">` in `<head>`, or move it to the end of `<body>`.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] State your audience explicitly
Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Add Article / BlogPosting JSON-LD
On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Add BreadcrumbList JSON-LD
Helps AI engines understand site hierarchy and improves citation context.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Enable HSTS
Add `Strict-Transport-Security: max-age=31536000; includeSubDomains` once you're confident every subdomain is https-ready.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Define a Content-Security-Policy
Start with `Content-Security-Policy-Report-Only` to learn safe sources, then enforce. Cuts XSS blast radius.
Evidence
```
{
  "priority": 3
}
```
P3
warn
[ ] Add LocalBusiness JSON-LD (if you have a physical location)
Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.
Evidence
```
{
  "priority": 4
}
```
P4
warn
[ ] Add Person JSON-LD for authors / founders
Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.
Evidence
```
{
  "priority": 4
}
```
P4
warn
[ ] Declare an author byline
Add `<meta name="author" content="Name">` or a visible byline with `rel="author"`. Combine with Person JSON-LD for E-E-A-T.
Evidence
```
{
  "priority": 4
}
```
P4
