CrawlProof

AEO Audit for crawlproof.com

Target: https://crawlproof.com/
Score: 57 / 100
Generated: 2026-07-01T20:10:59.490Z
Pages crawled: 9
Findings: 64 pass · 86 warn · 2 fail · 0 unknown


1. Crawl Summary

2. Data Found

| Data Point | Found? | Source | Notes |
|---|---|---|---|
| Pricing | Yes | Pricing page | https://crawlproof.com/pricing |
| Customer logos | No | | |
| Social proof | No | | |
| Recent launches | Yes | Press/news pages | https://crawlproof.com/blog |
| Blog post activity | Yes | Blog | https://crawlproof.com/blog |
| New hires | No | | Often only on a /blog/team or LinkedIn page |
| Headline copy | Yes | Homepage | See your site the way AI crawlers do. |
| Positioning | No | | |
| Executive team | Yes | About/team page | https://crawlproof.com/about |
| Product/service descriptions | Yes | Homepage | From meta description |
| Case studies or testimonials | No | | |
| Contact/demo/signup paths | Yes | Navigation links | |

3. Homepage Audit

  • ⚠️ Long meta description (171 chars) Snippets truncate around 160 chars. Tighten to keep the key sentence visible.
  • Homepage fetched successfully HTTP 200 · 64846 bytes · 99ms
  • Page load time: 0.10s Fast — well within AI crawler budgets.
  • declared
  • Single H1 See your site the way AI crawlers do.
  • <title> present (50 chars)
  • Canonical present https://crawlproof.com
  • Open Graph tags complete
  • Twitter Card tags complete
  • Critical content is server-rendered Raw and rendered text are within 10% of each other.
  • Alt text coverage: 100% 1/1 images have alt text.
  • Content volume: 718 words Substantive content — AI models have enough to summarize and recommend.
  • Heading structure: 15 (h1:1, h2:5, h3:9) Multiple headings help AI chunk and outline your page.
  • Internal links: 22 22 internal + 1 external links help crawlers navigate.
  • Favicon declared

4. Content Quality

  • ⚠️ No question-style headings found Phrase at least one heading as a user question (e.g. 'How does pricing work?') to match conversational AI queries.
  • ⚠️ No date signal found Add a <time datetime> element or article:published_time meta. AI ranking weights freshness.
  • ⚠️ Text-to-HTML ratio: 7.8% Low text density — most of the response is markup/script.
  • ⚠️ No author byline found Add <meta name="author" content="Name"> or a visible byline with rel="author". Strengthens E-E-A-T signals.
  • Heading levels are well-ordered 16 headings nested in order.
  • Snippet-ready blocks: 4 (ul:4, ol:0, table:0) Lists and tables are extracted verbatim by AI answer engines.
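
The text-to-HTML ratio flagged above can be approximated locally. The report does not say how it computes the figure, so the definition below (visible text characters divided by total response characters, with script and style content excluded) is an assumption, and `text_to_html_ratio` is a hypothetical helper name. Treat it as a rough sketch for tracking the trend, not a reproduction of the 7.8% value.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that sits outside script, style, and similar tags."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def text_to_html_ratio(html: str) -> float:
    """Return visible-text characters divided by total HTML characters."""
    if not html:
        return 0.0
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not inflate the count.
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) / len(html)


if __name__ == "__main__":
    sample = "<html><head><script>var x = 1;</script></head><body><p>Hello world</p></body></html>"
    print(f"{text_to_html_ratio(sample):.1%}")
```

Running this before and after moving inline bundles to external files gives a quick check that the change actually raised the ratio.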

5. Schema / Structured Data Audit

  • ⚠️ Article / BlogPosting JSON-LD not found Add Article / BlogPosting where applicable so AI answer engines can resolve the entity precisely.
  • ⚠️ BreadcrumbList JSON-LD not found Add BreadcrumbList where applicable so AI answer engines can resolve the entity precisely.
  • ⚠️ LocalBusiness JSON-LD not found Add LocalBusiness where applicable so AI answer engines can resolve the entity precisely.
  • ⚠️ Person (author / founder) JSON-LD not found Add Person (author / founder) where applicable so AI answer engines can resolve the entity precisely.
  • ⚠️ HowTo JSON-LD not found Add HowTo where applicable so AI answer engines can resolve the entity precisely.
  • ⚠️ VideoObject JSON-LD not found Add VideoObject where applicable so AI answer engines can resolve the entity precisely.
  • 6 JSON-LD block(s) found Types: WebSite, Organization, SoftwareApplication, Organization, SoftwareApplication, FAQPage
  • Organization present
  • WebSite present
  • SoftwareApplication present
  • FAQPage JSON-LD present
  • ⚠️ Modern image formats: 0% (0/1 webp/avif) 0 legacy (png/jpg/gif) image(s). Convert hero/above-the-fold images to WebP or AVIF.
  • Explicit dimensions: 0% (0/1) Add width and height attributes on <img> tags to prevent CLS.
  • ⚠️ 1 broken link(s) in first 15 · timeout — https://vu1nz.com/
  • External nofollow: 0% (0/1) Healthy mix of follow and nofollow outbound links.
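
For the Article / BlogPosting warning in the schema findings above, a small generator can emit the markup at build time. This is a minimal sketch, not CrawlProof's own tooling: the function names, the example post, and the author are all hypothetical, and only the fields the report names (headline, author, datePublished, dateModified) plus the page URL are included.

```python
import json
from datetime import date


def blog_posting_jsonld(headline, url, author_name, published, modified=None):
    """Build a schema.org BlogPosting object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "mainEntityOfPage": url,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        # Fall back to the publish date when the post was never revised.
        "dateModified": (modified or published).isoformat(),
    }


def as_script_tag(data):
    """Serialize the dict into a JSON-LD script element for the page head."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical post; replace with real values from your CMS.
    post = blog_posting_jsonld(
        headline="Example post title",
        url="https://example.com/blog/example-post",
        author_name="Jane Doe",
        published=date(2026, 1, 2),
    )
    print(as_script_tag(post))
```

Embedding the author as a Person object also covers part of the Person (author / founder) warning on the same pages.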

7. Performance

  • ⚠️ 1 render-blocking script(s) in <head> Move non-critical scripts to end of <body> or add defer/async.
  • ⚠️ No images use loading=lazy Add loading="lazy" to off-screen images to defer their fetch until needed.
  • Page size: 63 KB Compact HTML payload — well within AI crawler limits.
  • Resource requests: 15 (scripts:13, css:1, img:1) Reasonable request count.
  • Inline JS+CSS bulk: 41 KB Inline payload is modest.
  • Response time: 99ms Fast first response.
  • Cache-Control set Cache-Control: private, no-cache, no-store, max-age=0, must-revalidate
  • Compression enabled (gzip) Content-Encoding: gzip
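
The render-blocking finding above can be reproduced with a short scan of the homepage HTML. The sketch below assumes "render-blocking" means an external script in the head with neither defer nor async; module scripts are skipped because browsers defer them by default. The function name `render_blocking_scripts` is hypothetical, and the report's own detection logic may differ.

```python
from html.parser import HTMLParser


class _HeadScriptScanner(HTMLParser):
    """Records external scripts inside <head> that lack defer/async."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.blocking = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            attr = dict(attrs)  # boolean attributes map to None
            src = attr.get("src")
            is_module = (attr.get("type") or "").lower() == "module"
            # Inline scripts (no src) are ignored; the report's advice
            # targets <script src="..."> elements.
            if src and "defer" not in attr and "async" not in attr and not is_module:
                self.blocking.append(src)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def render_blocking_scripts(html: str) -> list:
    """Return the src of each blocking external script found in <head>."""
    scanner = _HeadScriptScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.blocking


if __name__ == "__main__":
    page = '<head><script src="/a.js"></script><script src="/b.js" defer></script></head>'
    print(render_blocking_scripts(page))
```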

8. Security

  • ⚠️ HSTS missing Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident in https.
  • ⚠️ Content-Security-Policy missing Define a CSP to limit script sources — large reduction in XSS surface.
  • ⚠️ X-Frame-Options missing Add X-Frame-Options: SAMEORIGIN (or use CSP frame-ancestors) to prevent clickjacking.
  • ⚠️ X-Content-Type-Options missing Add X-Content-Type-Options: nosniff to block MIME-type sniffing.
  • ⚠️ Referrer-Policy missing Add Referrer-Policy: strict-origin-when-cross-origin for safer referrers.
  • ⚠️ Permissions-Policy missing Restrict browser features (camera, mic, geolocation) you don't use.
  • Served over HTTPS
  • No mixed content detected

9. robots.txt and sitemap.xml Audit

  • robots.txt present 336 chars
  • robots.txt references sitemap(s)
  • sitemap.xml present (55 URLs)

10. LLM / AI Crawler Accessibility

  • ⚠️ /.well-known/security.txt missing Publish a /.well-known/security.txt with at least a Contact: line. Crawlers and security researchers expect it; AI systems use it as a trust signal.
  • ⚠️ /llms-full.txt missing Add /llms-full.txt with concatenated Markdown of all key pages. Lets LLMs ingest your full site in one request.
  • GPTBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • ClaudeBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • PerplexityBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • Google-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • OAI-SearchBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • Applebot-Extended has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • CCBot has explicit rules An explicit User-agent block exists. Make sure it allows the paths you want indexed.
  • llms.txt present 1618 chars
  • skill.md present
  • /.well-known/ai-plugin.json present
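
Several findings above say to make sure each crawler's User-agent block allows the paths you want indexed. One way to verify that is Python's standard `urllib.robotparser`. The robots.txt content and URLs below are made up for illustration (the report does not include the real file), and `crawler_access` is a hypothetical helper. Python's parser applies rules in file order, which may differ from crawlers that use longest-match precedence, so treat the result as a first check.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the findings above.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Applebot-Extended", "CCBot",
]


def crawler_access(robots_txt, urls, agents=AI_CRAWLERS):
    """Map each agent to {url: allowed} according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: {url: parser.can_fetch(agent, url) for url in urls}
        for agent in agents
    }


if __name__ == "__main__":
    # Hypothetical robots.txt; paste your own file contents here.
    sample = """\
User-agent: GPTBot
Disallow: /private/
Allow: /

User-agent: *
Disallow: /admin/
"""
    urls = ["https://example.com/pricing", "https://example.com/private/notes"]
    for agent, result in crawler_access(sample, urls).items():
        print(agent, result)
```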

11. Positioning Clarity

  • ⚠️ Value-prop language not detected Pages with phrases like 'we help X', 'platform for Y', 'built for Z' are easier for LLMs to summarize.
  • About/Team path discoverable
  • H1 communicates value See your site the way AI crawlers do.
  • Pricing path discoverable
  • Contact / signup path discoverable

12. Missing or Hard-to-Find Information

  • ⚠️ 5 data point(s) could not be found from public pages · Customer logos · Social proof · New hires · Positioning · Case studies or testimonials
  • ⚠️ Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include sameAs pointing to authoritative directories:

    {
      "@context": "https://schema.org",
      "@type": "Organization",
      "name": "Your Brand",
      "url": "https://yoursite.com",
      "sameAs": [
        "https://en.wikipedia.org/wiki/Your_Brand",
        "https://www.wikidata.org/wiki/Q12345678",
        "https://www.linkedin.com/company/your-brand",
        "https://www.crunchbase.com/organization/your-brand"
      ]
    }
    

    These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.

  • ⚠️ Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.

  • ⚠️ Publish a date signal Add <time datetime="2026-05-17"> or <meta property="article:published_time">. AI ranking heavily weights freshness.

  • ⚠️ Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.

  • ⚠️ Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:

    # Your Brand — Full Content
    
    ## Getting Started
    <full markdown content of /docs/start>
    
    ## API Reference
    <full markdown content of /docs/api>
    

    Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.

  • ⚠️ Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.

    Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.

  • ⚠️ Shorten the meta description Keep it to 50–160 chars (the homepage's is currently 171). Repeat your core value prop in plain language; this often becomes the AI snippet.

    <meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
    
  • ⚠️ Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.

  • ⚠️ Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.

  • ⚠️ Fix broken homepage links We HEAD-probed the first 20 unique homepage links and found 4xx/5xx responses. Repair or remove them — broken links erode crawler trust.

  • ⚠️ Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:

    Contact: mailto:security@yourdomain.com
    Expires: 2027-01-01T00:00:00.000Z
    Preferred-Languages: en
    
  • ⚠️ Eliminate render-blocking head scripts Add defer or async to any <script src="…"> in <head>, or move it to the end of <body>.

  • ⚠️ State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.

  • ⚠️ Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.

  • ⚠️ Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.

  • ⚠️ Enable HSTS Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident every subdomain is https-ready.

  • ⚠️ Define a Content-Security-Policy Start with Content-Security-Policy-Report-Only to learn safe sources, then enforce. Cuts XSS blast radius.

  • ⚠️ Declare an author byline Add <meta name="author" content="Name"> or a visible byline with rel="author". Combine with Person JSON-LD for E-E-A-T.

  • ⚠️ Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.

  • ⚠️ Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.

  • ⚠️ Add HowTo JSON-LD for step-by-step content For any 'how to' page, wrap the steps in HowTo JSON-LD. AI step-by-step answers cite these heavily.

  • ⚠️ Add VideoObject JSON-LD For embedded videos, include VideoObject with thumbnailUrl, uploadDate, duration. AI engines cite these in multimedia answers.

  • ⚠️ Add X-Frame-Options X-Frame-Options: SAMEORIGIN (or CSP frame-ancestors) blocks clickjacking via iframe embeds.

  • ⚠️ Add X-Content-Type-Options X-Content-Type-Options: nosniff prevents browsers from MIME-sniffing responses.

  • ⚠️ Set a Referrer-Policy Referrer-Policy: strict-origin-when-cross-origin is a safe default.

  • ⚠️ Set a Permissions-Policy Restrict browser features you don't use, e.g. Permissions-Policy: camera=(), microphone=(), geolocation=().
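
The /llms-full.txt recommendation in this list describes a build-time concatenation, which can be sketched in a few lines. The layout follows the example given with that recommendation (one top-level title, then one second-level heading per resource); the function name and the sample pages are hypothetical, and loading the Markdown for each resource listed in llms.txt is left to your build system.

```python
def build_llms_full(title, pages):
    """Concatenate (heading, markdown_body) pairs into one llms-full.txt string."""
    parts = [f"# {title}"]
    for heading, body in pages:
        # Trim each page so sections are separated by exactly one blank line.
        parts.append(f"## {heading}\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Hypothetical pages; in practice, read the Markdown source of each
    # resource listed in llms.txt.
    pages = [
        ("Getting Started", "Install it.\n"),
        ("API Reference", "Call it."),
    ]
    print(build_llms_full("Example Docs", pages), end="")
```

Writing the returned string to `llms-full.txt` in the site's output directory during the build keeps the file in step with the pages it mirrors.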

14. Priority To-Do List

  • P2 — Add sameAs knowledge graph links to Organization schema Extend your Organization JSON-LD to include sameAs pointing to authoritative directories:

    ```json
    {
      "@context": "https://schema.org",
      "@type": "Organization",
      "name": "Your Brand",
      "url": "https://yoursite.com",
      "sameAs": [
        "https://en.wikipedia.org/wiki/Your_Brand",
        "https://www.wikidata.org/wiki/Q12345678",
        "https://www.linkedin.com/company/your-brand",
        "https://www.crunchbase.com/organization/your-brand"
      ]
    }
    ```
    
    These links anchor your brand as a known entity in AI knowledge graphs, making it far more likely that generative models cite you by name rather than paraphrase.
    
  • P3 — Phrase a heading as a user question Use headings like 'How does pricing work?' or 'Who is this for?' — they map directly to conversational AI queries.

  • P3 — Publish a date signal Add <time datetime="2026-05-17"> or <meta property="article:published_time">. AI ranking heavily weights freshness.

  • P3 — Raise your text-to-HTML ratio Strip unused inline scripts/styles and move large bundles to external files. AI crawlers struggle when most of the response is markup.

  • P3 — Generate /llms-full.txt for RAG pipelines llms-full.txt is a concatenation of the full markdown text of every resource listed in llms.txt. Generate it statically at build time and serve it from your root:

    ```
    # Your Brand — Full Content
    
    ## Getting Started
    <full markdown content of /docs/start>
    
    ## API Reference
    <full markdown content of /docs/api>
    ```
    
    Large-context models can ingest your entire knowledge base in a single request, dramatically improving recall and citation accuracy.
    
  • P3 — Add outbound links to authoritative sources Link to Wikipedia, .gov or .edu resources, peer-reviewed studies, or major news outlets when making factual claims. Generative AI systems treat pages that cite authoritative sources as more trustworthy, which raises citation likelihood.

    Examples: statistics from Statista or Census.gov, definitions from Wikipedia, research from nature.com or pubmed.ncbi.nlm.nih.gov.
    
  • P3 — Shorten the meta description Keep it to 50–160 chars (the homepage's is currently 171). Repeat your core value prop in plain language; this often becomes the AI snippet.

    ```html
    <meta name="description" content="CrawlProof shows you exactly how AI crawlers see your site, then tells you what to fix." />
    ```
    
  • P3 — Use modern image formats Serve WebP or AVIF for hero/above-the-fold images. Keep legacy PNG/JPG only as fallbacks.

  • P3 — Set width/height on images Explicit dimensions prevent Cumulative Layout Shift and help AI extractors reserve space correctly.

  • P3 — Fix broken homepage links We HEAD-probed the first 20 unique homepage links and found 4xx/5xx responses. Repair or remove them — broken links erode crawler trust.

  • P3 — Publish /.well-known/security.txt A security contact builds trust with crawlers and researchers. Minimal example:

    ```
    Contact: mailto:security@yourdomain.com
    Expires: 2027-01-01T00:00:00.000Z
    Preferred-Languages: en
    ```
    
  • P3 — Eliminate render-blocking head scripts Add defer or async to any <script src="…"> in <head>, or move it to the end of <body>.

  • P3 — State your audience explicitly Use phrases like 'Built for B2B SaaS marketing teams' on the homepage and About page.

  • P3 — Add Article / BlogPosting JSON-LD On every blog/article page, include Article JSON-LD with headline, author, datePublished, dateModified. AI engines weight these heavily for freshness and authority.

  • P3 — Add BreadcrumbList JSON-LD Helps AI engines understand site hierarchy and improves citation context.

  • P3 — Enable HSTS Add Strict-Transport-Security: max-age=31536000; includeSubDomains once you're confident every subdomain is https-ready.

  • P3 — Define a Content-Security-Policy Start with Content-Security-Policy-Report-Only to learn safe sources, then enforce. Cuts XSS blast radius.

  • P4 — Declare an author byline Add <meta name="author" content="Name"> or a visible byline with rel="author". Combine with Person JSON-LD for E-E-A-T.

  • P4 — Add LocalBusiness JSON-LD (if you have a physical location) Include address, geo, openingHours, telephone. Required for AI engines to surface you in 'near me' queries.

  • P4 — Add Person JSON-LD for authors / founders Mark up bylines and founder bios with Person schema — name, jobTitle, sameAs (their profiles). Strengthens E-E-A-T.


Report by CrawlProof. Reusable after every major website change.
