CrawlProof

Autoblog webhook

CrawlProof generates one SEO blog post per scheduled slot and POSTs it to your endpoint. The wire shape is a CloudEvents 1.0 envelope, signed per Standard Webhooks. Your endpoint owns the actual publish — turning the payload into a row in your CMS, a file in your repo, an MDX file in S3, whatever.

The easiest receiver is @profullstack/autoblog (v0.3.0, npm): its verifyAndParse helper validates the bearer + signature + envelope in a single call and hands you a normalized Post object. v0.3 also covers two adjacent delivery protocols — W3C Micropub (push posts to a Micropub endpoint on your CMS) and W3C ActivityPub (federate to the fediverse) — so you can pick whichever protocol matches your stack.

Request

Headers

Authorization:      Bearer <secret-key>                  # token from /autoblog/setup
webhook-id:         <event uuid>                        # stable across retries
webhook-timestamp:  <unix seconds>                       # delivery time
webhook-signature:  v1,<base64 HMAC-SHA256>              # signs id.timestamp.body
Content-Type:       application/cloudevents+json
User-Agent:         @profullstack/autoblog/0.3

The bearer is a secret you generate on your receiver site (e.g. a per-source token from your blog's admin page) and paste into /autoblog/setup. CrawlProof stores it verbatim and uses the same value as the HMAC key for the signature header.
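If your receiver has no admin page that mints tokens yet, a secret can be generated with Node's standard library. This is a minimal sketch: the helper name generateWebhookSecret and the 32-byte length are assumptions for illustration, not CrawlProof requirements.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper: 32 random bytes, base64url-encoded.
// The result is 43 characters from [A-Za-z0-9_-], safe to paste into a form or header.
function generateWebhookSecret(): string {
  return randomBytes(32).toString("base64url");
}

console.log(generateWebhookSecret());
```

Store the value wherever your receiver reads CRAWLPROOF_WEBHOOK_SECRET from, then paste the same value into /autoblog/setup.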

Signing details (Standard Webhooks): the signature is HMAC-SHA256(secret, "{id}.{timestamp}.{body}") base64-encoded and prefixed with v1,. Receivers should reject deliveries whose timestamp is more than 5 minutes from now (replay defense). Multiple space-separated signatures are allowed in the header so we can rotate keys without dropping in-flight deliveries.
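To make the rotation behaviour concrete, here is a sketch of both sides using node:crypto. The function names (signV1, signatureHeader, verify) are illustrative and not part of any SDK. The secret is used verbatim as the HMAC key, as described above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// One "v1,<base64>" signature over "{id}.{timestamp}.{body}".
function signV1(secret: string, id: string, ts: string, body: string): string {
  const mac = createHmac("sha256", secret).update(`${id}.${ts}.${body}`).digest("base64");
  return `v1,${mac}`;
}

// Sender side during a rotation: one signature per active secret, space-separated.
function signatureHeader(secrets: string[], id: string, ts: string, body: string): string {
  return secrets.map((s) => signV1(s, id, ts, body)).join(" ");
}

// Receiver side: accept if the timestamp is inside the tolerance window and
// any signature in the header matches the one computed from our own secret.
function verify(
  secret: string,
  id: string,
  ts: string,
  body: string,
  header: string,
  nowSec: number = Math.floor(Date.now() / 1000),
  toleranceSec: number = 5 * 60,
): boolean {
  const t = Number(ts);
  if (!Number.isFinite(t) || Math.abs(nowSec - t) > toleranceSec) return false;
  const expected = Buffer.from(signV1(secret, id, ts, body));
  return header.split(/\s+/).some((candidate) => {
    const got = Buffer.from(candidate);
    // timingSafeEqual throws on unequal lengths, so compare byte lengths first.
    return got.length === expected.length && timingSafeEqual(got, expected);
  });
}
```

With a header produced by signatureHeader(["old", "new"], ...), a receiver holding either "old" or "new" verifies successfully, which is why a delivery already in flight survives a key change.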

Body

CloudEvents 1.0 envelope. The data.post object is the canonical, normalized blog post.

{
  "specversion": "1.0",
  "id":          "0193a8b9-d2c4-7f44-9a31-3f1c2e7b9a01",
  "type":        "com.crawlproof.post.published.v1",
  "source":      "https://crawlproof.com",
  "subject":     "post:<id>",
  "time":        "2026-05-15T09:00:00.000Z",
  "datacontenttype": "application/json",
  "data": {
    "post": {
      "id":            "uuid",
      "url":           "https://your-site/blog/{slug}",
      "canonical_url": "https://your-site/blog/{slug}",
      "title":         "string",
      "slug":          "kebab-case-slug",
      "excerpt":       "≤240-char prose summary" | null,
      "html":          "<p>…</p>",
      "markdown":      "..." | null,
      "status":        "published",
      "published_at":  "ISO-8601",
      "updated_at":    "ISO-8601",
      "author":        null,
      "tags":          ["seo", "ai bots"],
      "categories":    [],
      "featured_image": { "url": "https://...", "alt": "..." } | null
    }
  }
}

meta_description (≤160 chars, SEO copy) is still sent in the legacy fields when present, but the canonical short summary is post.excerpt (≤240 chars). Receivers should prefer excerpt.
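For receivers written in TypeScript, the envelope above can be described with a pair of interfaces. This is a sketch inferred from the example payload only: the type names are made up here, and fields the example shows only as null or [] (author, categories) are typed conservatively.

```typescript
// Field names mirror the example envelope; the type names are illustrative.
interface FeaturedImage {
  url: string;
  alt: string;
}

interface Post {
  id: string;
  url: string;
  canonical_url: string;
  title: string;
  slug: string;
  excerpt: string | null;
  html: string;
  markdown: string | null;
  status: string; // "published" in the example
  published_at: string; // ISO-8601
  updated_at: string; // ISO-8601
  author: unknown; // only null appears in the example
  tags: string[];
  categories: string[]; // assumed element type; the example shows []
  featured_image: FeaturedImage | null;
}

interface PostPublishedEvent {
  specversion: "1.0";
  id: string;
  type: string;
  source: string;
  subject?: string;
  time?: string;
  datacontenttype?: string;
  data: { post: Post };
}

// Minimal structural check; it only inspects the fields a receiver typically needs.
function isPostPublishedEvent(v: unknown): v is PostPublishedEvent {
  if (typeof v !== "object" || v === null) return false;
  const e = v as Record<string, any>;
  const p = e.data?.post;
  return (
    e.specversion === "1.0" &&
    typeof e.id === "string" &&
    typeof e.type === "string" &&
    typeof p === "object" &&
    p !== null &&
    typeof p.id === "string" &&
    typeof p.title === "string" &&
    typeof p.slug === "string" &&
    typeof p.html === "string"
  );
}
```

Run the guard on the parsed body only after the signature has been verified, so untrusted JSON never reaches your storage code.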

Example — send a valid signed request

Drop-in bash that POSTs a real, validly-signed CloudEvents envelope at your receiver. Only curl + openssl + uuidgen — no Node, no Python. Set the two variables at the top and run; this is the same shape CrawlProof puts on the wire for a real delivery.

# Replace these two and run.
URL="https://your-site.example/api/webhooks/crawlproof"
SECRET="<secret-key>"

ID="$(uuidgen | tr 'A-Z' 'a-z')"
TS="$(date +%s)"
NOW="$(date -u +%Y-%m-%dT%H:%M:%S.000Z)"

BODY=$(cat <<JSON
{"specversion":"1.0","id":"$ID","type":"com.crawlproof.post.published.v1","source":"https://crawlproof.com","subject":"post:$ID","time":"$NOW","datacontenttype":"application/json","data":{"post":{"id":"$ID","url":"$URL","canonical_url":"$URL","title":"Local test post","slug":"local-test-post","excerpt":"Verifying the autoblog webhook end-to-end from curl.","html":"<p>Hello from a signed test webhook.</p>","markdown":"Hello from a signed test webhook.","status":"published","published_at":"$NOW","updated_at":"$NOW","author":null,"tags":["test"],"categories":[],"featured_image":null}}}
JSON
)

SIG="v1,$(printf '%s.%s.%s' "$ID" "$TS" "$BODY" \
  | openssl dgst -sha256 -hmac "$SECRET" -binary \
  | openssl base64 -A)"

curl -sS -X POST "$URL" \
  -H "Authorization: Bearer $SECRET" \
  -H "webhook-id: $ID" \
  -H "webhook-timestamp: $TS" \
  -H "webhook-signature: $SIG" \
  -H "Content-Type: application/cloudevents+json" \
  --data-binary "$BODY"

The signing string is {id}.{timestamp}.{body} — exactly the same bytes that go in the headers + body. Edit one without regenerating the signature and the receiver will 401, which is the whole point.

Retry

At-least-once delivery. On 5xx, 408, 429, or a network error we make up to 3 attempts, spaced at 0s / 10s / 60s. On any other 4xx we give up immediately: that's your endpoint asking us to stop. The webhook-id stays stable across retries of the same article, so dedupe on that.
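Because delivery is at-least-once, the same webhook-id can arrive more than once. A minimal in-memory dedupe guard might look like the sketch below. The class name SeenDeliveries and the one-hour default TTL are assumptions. A multi-instance deployment would more likely rely on a unique constraint on webhook-id in its database.

```typescript
// In-memory idempotency guard keyed by webhook-id (illustrative, single-process only).
class SeenDeliveries {
  private readonly seen = new Map<string, number>(); // webhook-id -> expiry (ms since epoch)
  private readonly ttlMs: number;

  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time an id is seen, false for a repeat within the TTL.
  firstTime(id: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.seen.forEach((expiry, key) => {
      if (expiry <= nowMs) this.seen.delete(key);
    });
    if (this.seen.has(id)) return false;
    this.seen.set(id, nowMs + this.ttlMs);
    return true;
  }
}

const deliveries = new SeenDeliveries();
console.log(deliveries.firstTime("0193a8b9-d2c4-7f44-9a31-3f1c2e7b9a01")); // first delivery
console.log(deliveries.firstTime("0193a8b9-d2c4-7f44-9a31-3f1c2e7b9a01")); // retry of the same event
```

When firstTime returns false, answering with a 2xx and skipping the save is a reasonable choice: the post is already stored, and a success status avoids prompting another retry.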

Receiver — recommended (SDK)

The 30-LOC version. The SDK handles bearer + signature + envelope validation; you handle storage.

// npm i @profullstack/autoblog
import { verifyAndParse } from "@profullstack/autoblog";

export async function POST(req: Request) {
  const body = await req.text(); // raw bytes — needed for signature
  const r = verifyAndParse({
    headers: Object.fromEntries(req.headers),
    body,
    opts: { secret: process.env.CRAWLPROOF_WEBHOOK_SECRET! },
  });
  if (!r.ok) return new Response(r.reason, { status: r.status });

  await savePost(r.post); // your CMS / DB
  return new Response(null, { status: 200 });
}

Receiver — recommended (SDK + gate)

If your blog is in a topical network, add the network gate so off-niche or low-quality posts are rejected before they touch your DB. The SDK ships @profullstack/autoblog/quality for this:

import { verifyAndParse } from "@profullstack/autoblog";
import { gatePost } from "@profullstack/autoblog/quality";

export async function POST(req: Request) {
  const body = await req.text();
  const r = verifyAndParse({
    headers: Object.fromEntries(req.headers),
    body,
    opts: { secret: process.env.CRAWLPROOF_WEBHOOK_SECRET! },
  });
  if (!r.ok) return new Response(r.reason, { status: r.status });

  const gated = await gatePost(r.post, {
    allowedNiches: ["security", "ctem", "soc"],
    heuristics: { minWordCount: 500, maxLinkDensity: 1.0 },
    minQualityScore: 6,
    anthropicApiKey: process.env.ANTHROPIC_API_KEY!,
  });
  if (!gated.ok) {
    return new Response(gated.reasons.join("; "), {
      status: gated.stage === "niche" ? 403 : 422,
    });
  }

  await savePost(r.post);
  return new Response(null, { status: 200 });
}

Niche match is loose by default (case-insensitive, partial-word overlap). Empty allowedNiches = accept any niche.
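As an illustration of what a loose match can mean, the sketch below implements one plausible reading: case-insensitive substring overlap in either direction, with an empty allow-list accepting everything. This is not the SDK's implementation. The function name looseNicheMatch and the choice to match against a list of labels (for example post.tags) are assumptions.

```typescript
// Hypothetical loose matcher, NOT the SDK's code.
// labels: strings describing the post (for example post.tags).
// allowedNiches: the allow-list; an empty list accepts any niche.
function looseNicheMatch(labels: string[], allowedNiches: string[]): boolean {
  if (allowedNiches.length === 0) return true;
  const norm = (s: string) => s.trim().toLowerCase();
  const have = labels.map(norm).filter((s) => s.length > 0);
  const want = allowedNiches.map(norm).filter((s) => s.length > 0);
  // Match when either string contains the other, ignoring case.
  return want.some((niche) =>
    have.some((label) => label.includes(niche) || niche.includes(label)),
  );
}

console.log(looseNicheMatch(["Cloud Security", "devops"], ["security", "ctem", "soc"]));
```

A substring rule this permissive also lets "soc" match "social", so a stricter comparison may be worth considering if false positives matter for your network.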

Receiver — from scratch (no SDK)

If you can't add a dependency, the verification is ~40 LOC of standard library. crypto.timingSafeEqual for the bearer, crypto.createHmac for the signature.

import { NextResponse } from "next/server";
import crypto from "node:crypto";

const SECRET = process.env.CRAWLPROOF_WEBHOOK_SECRET!;
const TOLERANCE_SEC = 5 * 60;

export const runtime = "nodejs";

export async function POST(req: Request) {
  const body = await req.text();
  const h = (k: string) => req.headers.get(k) ?? "";

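  // Bearer. Compare byte lengths first: timingSafeEqual throws on unequal-length buffers,
  // and string length can differ from byte length for non-ASCII input.
  const bearer = Buffer.from(h("authorization").replace(/^Bearer\s+/i, ""));
  const secret = Buffer.from(SECRET);
  if (bearer.length !== secret.length || !crypto.timingSafeEqual(bearer, secret)) {
    return NextResponse.json({ ok: false }, { status: 401 });
  }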

  // Standard Webhooks signature.
  const id = h("webhook-id");
  const ts = h("webhook-timestamp");
  const sig = h("webhook-signature");
  const now = Math.floor(Date.now() / 1000);
  if (!id || !ts || !sig) return NextResponse.json({ ok: false }, { status: 401 });
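  // A non-numeric timestamp parses to NaN, and NaN comparisons are always false,
  // so reject it explicitly instead of letting it slip past the window check.
  const tsNum = Number(ts);
  if (!Number.isFinite(tsNum) || Math.abs(now - tsNum) > TOLERANCE_SEC) {
    return NextResponse.json({ ok: false, reason: "stale" }, { status: 401 });
  }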
  const expected = Buffer.from(
    "v1," +
      crypto.createHmac("sha256", SECRET).update(`${id}.${ts}.${body}`).digest("base64"),
  );
  // The header may carry several space-separated signatures; any match is enough.
  const ok = sig.split(/\s+/).some((s) => {
    const got = Buffer.from(s);
    return got.length === expected.length && crypto.timingSafeEqual(got, expected);
  });
  if (!ok) return NextResponse.json({ ok: false, reason: "bad sig" }, { status: 401 });

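  // Envelope. Malformed JSON is a client error (400), not an unhandled 500 that triggers retries.
  let evt: any;
  try {
    evt = JSON.parse(body);
  } catch {
    return NextResponse.json({ ok: false }, { status: 400 });
  }
  if (evt?.specversion !== "1.0" || !evt?.data?.post) {
    return NextResponse.json({ ok: false }, { status: 400 });
  }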

  await savePost(evt.data.post); // your storage
  return NextResponse.json({ ok: true });
}

Local testing

We ship a zero-dependency reference receiver under examples/autoblog-webhook-receiver/ in the CrawlProof repo. Drop your secret into CRAWLPROOF_WEBHOOK_SECRET, run node server.mjs, expose it via ngrok or Cloudflare Tunnel, and paste the public URL into /autoblog/setup. Hit Generate article now on the dashboard to fire an immediate delivery.