Modules

Site Audit (Crawler + LLM Citation Readiness)

Technical SEO + Grounding-Page scan in one crawl — includes Close-the-Loop for API callers.

Site Audit crawls your domain (via BFS or sitemap.xml) and generates an issue report for every URL, covering technical SEO and content hygiene. When the crawl finishes, it automatically dispatches a Grounding-Audit batch over up to 100 pages. Those findings are added to the same issue list with the prefix grounding_*, so one dashboard view covers both technical SEO and LLM citation readiness.

What it does

Crawl modes — bfs (link discovery from start_url, depth-limited) OR sitemap (loads all URLs from sitemap.xml, recursive incl. sitemap-index, ignores crawl_depth). Hard cap: 10,000 seeds, max_pages enforced.
Issue classification — fix-priority sorted, filtered by severity (critical|high|medium|low|notice), issue_type, and status (open|fixed|dismissed). Eager-loaded url per issue.
Auto-Grounding-Bridge — auto-dispatched after crawl completion: Grounding-Audit-Batch over up to 100 pages → findings appear as grounding_*-issues in the same crawl. No extra credits. Idempotent (re-run de-duplicates).
Close-the-Loop — Mark issues as fixed/dismissed via UI or API — individually OR bulk per issue_type. Mandatory after each fix so the delta comparison metric ("fixed since last crawl") stays meaningful.
AI brief (5 credits per brief) — narrative LLM explanation per issue on demand.
Bridge timestamps — bridge_dispatched_at (batch dispatched) and bridge_completed_at (all grounding_* issues final). Deterministic polling condition instead of blind 1-3 min waits.
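
The issue filters listed above can be combined in a single request. A minimal sketch, assuming severity and issue_type are accepted as query parameters in the same way as status and per_page (only those two appear in the API workflow on this page). issues_url is a hypothetical helper name, not part of the platform.

```shell
# Hypothetical helper: build the issue-list URL for one crawl.
# Assumption: severity and issue_type are query parameters like status/per_page.
# $1 = crawl id, $2 = severity (optional), $3 = issue_type (optional)
issues_url() {
    url="$BASE/v1/site-audit/$1/issues?status=open&per_page=100"
    if [ -n "${2:-}" ]; then url="$url&severity=$2"; fi
    if [ -n "${3:-}" ]; then url="$url&issue_type=$3"; fi
    printf '%s\n' "$url"
}

# Usage: list the open critical issues of a crawl.
# curl -sH "Authorization: Bearer $TOKEN" "$(issues_url "$CRAWL" critical)" | jq '.data'
```

Keeping the URL construction in one place makes it easier to verify which filter was actually sent before acting on the result.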

When to use

Site-wide technical SEO inventory.
Check whole-domain LLM citation readiness without 340× single-audit calls.
Pre-relaunch check: find what still needs to be removed or fixed.
Monthly health report for stakeholders.

UI workflow

Start — Form at /site-audit (start_url, max_pages, crawl_depth, crawl_mode, optional tracking_project_id).
Polling — Detail page /site-audit/{crawl} shows live counter (pages_crawled, total_issues). Auto-refresh.
Close-the-Loop — Per issue: 2 buttons ✓ Mark fixed and ⊘ Dismiss. Above the table: a bulk-action toolbar with dropdown "Select issue type" + Mark all as fixed / Dismiss all for mass fixes after layout/template changes.
Re-crawl — After critical+high are fixed, new crawl → trend block shows "fixed since last crawl" / "new since last crawl".

API workflow (skill caller)

Mandatory sequence after each fix:

# 1. Fetch issues
ISSUES=$(curl -sH "Authorization: Bearer $TOKEN" \
    "$BASE/v1/site-audit/$CRAWL/issues?status=open&per_page=100" | jq '.data')

# 2. Per issue: apply fix (layout edit / content update / migration)

# 3. MANDATORY: mark issue as fixed — otherwise the platform doesn't know
#    the work is done and the delta metric becomes useless:
curl -X PATCH \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"status":"fixed"}' \
    "$BASE/v1/site-audit/issues/$ISSUE_ID"

# 4. For layout/template fixes: bulk-mark instead of 50× single PATCH:
curl -X PATCH \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"filter":{"issue_type":"missing_alt_text","status":"open"},"new_status":"fixed"}' \
    "$BASE/v1/site-audit/$CRAWL/issues/bulk"

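Steps 1 to 3 can be chained once the fixes are deployed. A minimal sketch, assuming each element of .data carries an id field (not confirmed on this page). mark_fixed and mark_all_fixed are hypothetical helper names.

```shell
# Hypothetical helper: PATCH one issue to status=fixed (same call as step 3).
# -f makes curl return non-zero on HTTP errors so failures can be counted.
mark_fixed() {
    curl -fsS -X PATCH \
        -H "Authorization: Bearer $TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"status":"fixed"}' \
        "$BASE/v1/site-audit/issues/$1"
}

# Read issue IDs from stdin, one per line, and mark each as fixed.
# Prints the number of IDs that could not be marked and
# returns non-zero if at least one PATCH failed.
mark_all_fixed() {
    failed=0
    while IFS= read -r id; do
        [ -n "$id" ] || continue          # skip blank lines
        mark_fixed "$id" </dev/null >/dev/null || failed=$((failed + 1))
    done
    echo "$failed"
    [ "$failed" -eq 0 ]
}

# Usage (assumes .id exists on each issue object):
# echo "$ISSUES" | jq -r '.[].id' | mark_all_fixed
```

Counting failures instead of aborting on the first one keeps the remaining issues in sync with the platform even if a single PATCH is rejected.
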
Polling pattern (deterministic since 2026-05-14):

while true; do
    R=$(curl -sH "Authorization: Bearer $TOKEN" "$BASE/v1/site-audit/$CRAWL" | jq '.data')
    STATUS=$(echo "$R" | jq -r .status)
    BRIDGE=$(echo "$R" | jq -r .bridge_completed_at)
    [ "$STATUS" = "completed" ] && [ "$BRIDGE" != "null" ] && break
    sleep 30
done
# → ALL issues are now final, including grounding_*
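
The loop above runs indefinitely if the crawl never reaches the final state. A bounded variant, as a sketch: crawl_is_final and wait_for_crawl are hypothetical names, and the defaults of 60 attempts and 30 seconds are arbitrary choices, not platform values.

```shell
# True only when the crawl is completed AND the grounding bridge has finished.
# $1 = status, $2 = bridge_completed_at as printed by `jq -r` ("null" if unset)
crawl_is_final() {
    [ "$1" = "completed" ] && [ -n "$2" ] && [ "$2" != "null" ]
}

# Poll up to $1 times (default 60), sleeping $2 seconds (default 30) in between.
# Returns 0 once the crawl is final, 1 if the attempt limit is reached.
wait_for_crawl() {
    attempts=${1:-60}
    delay=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        R=$(curl -sH "Authorization: Bearer $TOKEN" "$BASE/v1/site-audit/$CRAWL" | jq '.data')
        STATUS=$(echo "$R" | jq -r .status)
        BRIDGE=$(echo "$R" | jq -r .bridge_completed_at)
        if crawl_is_final "$STATUS" "$BRIDGE"; then return 0; fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "crawl $CRAWL not final after $attempts attempts" >&2
    return 1
}

# Usage:
# wait_for_crawl 60 30 || exit 1
```

The empty-string check in crawl_is_final also covers a failed request, where jq prints nothing and the loop should keep waiting instead of treating the crawl as done.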

Anti-patterns

Applying a fix without a follow-up PATCH /v1/site-audit/issues/{id} with status=fixed → the issue stays "open" in the dashboard, and the delta comparison metric of the next crawl becomes useless.
Querying issues right after status=completed without checking bridge_completed_at → grounding_* issues are missing because the bridge is still running.
Bulk-mark without filter.issue_type → marks hundreds of unrelated issues. Server returns meta.applied_filter; caller MUST verify before next step.
AI brief in a loop for 100+ issues — 5 credits per brief. Only for selected critical items.
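
The check required after a bulk-mark can be sketched as follows. This assumes meta.applied_filter mirrors the request filter and contains an issue_type key; the exact response shape is not documented on this page, and verify_applied_filter is a hypothetical helper name.

```shell
# Hypothetical check: succeed only if the bulk response reports the expected issue_type.
# Assumption: meta.applied_filter echoes the request filter, including issue_type.
# $1 = expected issue_type; the response JSON is read from stdin.
verify_applied_filter() {
    # jq -e exits non-zero when the comparison yields false or null
    jq -e --arg t "$1" '.meta.applied_filter.issue_type == $t' >/dev/null
}

# Usage: stop before the next step if the server applied a different filter.
# RESP=$(curl -s -X PATCH \
#     -H "Authorization: Bearer $TOKEN" \
#     -H "Content-Type: application/json" \
#     -d '{"filter":{"issue_type":"missing_alt_text","status":"open"},"new_status":"fixed"}' \
#     "$BASE/v1/site-audit/$CRAWL/issues/bulk")
# echo "$RESP" | verify_applied_filter missing_alt_text || { echo "unexpected filter" >&2; exit 1; }
```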

Related modules

Content Audit (Site Crawl) — content-quality scanner (what pages say, not what pages have). Complementary.
Grounding Audit — single-URL LLM-citation-readiness. Direct call without crawl overhead.
[[modules/page-audit]] — Conversion Audit per URL (Lighthouse + Vision + persona fit). More granular per URL.

Last updated: May 14, 2026