Site Audit (Crawler + LLM Citation Readiness)
Technical SEO + Grounding-Page scan in one crawl — includes Close-the-Loop for API callers.
Site Audit crawls your domain (BFS or via sitemap.xml), generates an issue report for every URL (technical SEO + content hygiene), and then automatically fires a Grounding-Audit batch over up to 100 pages. The findings are added to the same issue list as additional issues with the prefix grounding_*, giving one dashboard view for technical SEO and LLM citation readiness.
What it does
- Crawl modes: bfs (link discovery from start_url, depth-limited) OR sitemap (loads all URLs from sitemap.xml, recursive incl. sitemap-index, ignores crawl_depth). Hard cap: 10,000 seeds, max_pages enforced.
- Issue classification: fix-priority sorted, filtered by severity (critical|high|medium|low|notice), issue_type, and status (open|fixed|dismissed). Eager-loaded url per issue.
- Auto-Grounding-Bridge: auto-dispatched after crawl completion: Grounding-Audit-Batch over up to 100 pages → findings appear as grounding_* issues in the same crawl. No extra credits. Idempotent (re-run de-duplicates).
- Close-the-Loop: mark issues as fixed / dismissed via UI or API, individually OR in bulk per issue_type. Mandatory after each fix so the delta comparison metric ("fixed since last crawl") stays meaningful.
- AI brief (5 credits per brief): narrative LLM explanation per issue on demand.
- Bridge timestamps: bridge_dispatched_at (batch dispatched) and bridge_completed_at (all grounding_* issues final). Deterministic polling condition instead of blind 1-3 min waits.
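The issue filters listed above can be combined into one list request. The following is a minimal sketch, not the platform's reference client: issues_url is a hypothetical helper, and it assumes that severity is accepted as a query parameter under the same name as the field (only status and per_page appear in the documented example call).

```shell
# Sketch: build an issues-list URL and reject filter values that are not
# in the documented enumerations before any request is sent.
# ASSUMPTION: "severity" works as a query parameter like "status" does.
issues_url() {
  local base="$1" crawl="$2" severity="$3" status="${4:-open}"

  case "$severity" in
    critical|high|medium|low|notice) ;;
    *) echo "invalid severity: $severity" >&2; return 1 ;;
  esac

  case "$status" in
    open|fixed|dismissed) ;;
    *) echo "invalid status: $status" >&2; return 1 ;;
  esac

  printf '%s/v1/site-audit/%s/issues?severity=%s&status=%s&per_page=100\n' \
    "$base" "$crawl" "$severity" "$status"
}

# Usage (needs network access and a valid $TOKEN):
# curl -sH "Authorization: Bearer $TOKEN" \
#   "$(issues_url "$BASE" "$CRAWL" critical)" | jq '.data'
```

Validating locally keeps a typo such as severity=urgent from silently returning an unfiltered or empty list, whichever the server does in that case.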
When to use
- Site-wide technical SEO inventory.
- Check whole-domain LLM citation readiness without 340× single-audit calls.
- Pre-relaunch: what still needs to be removed / fixed.
- Monthly health report for stakeholders.
UI workflow
- Start: form at /site-audit (start_url, max_pages, crawl_depth, crawl_mode, optional tracking_project_id).
- Polling: detail page /site-audit/{crawl} shows a live counter (pages_crawled, total_issues). Auto-refresh.
- Close-the-Loop: per issue, 2 buttons: ✓ Mark fixed and ⊘ Dismiss. Above the table: a bulk-action toolbar with dropdown "Select issue type" + Mark all as fixed / Dismiss all for mass fixes after layout/template changes.
- Re-crawl: after critical+high are fixed, start a new crawl → trend block shows "fixed since last crawl" / "new since last crawl".
API workflow (skill caller)
Mandatory sequence after fix:
# 1. Fetch issues
ISSUES=$(curl -sH "Authorization: Bearer $TOKEN" \
  "$BASE/v1/site-audit/$CRAWL/issues?status=open&per_page=100" | jq '.data')
# 2. Per issue: apply fix (layout edit / content update / migration)
# 3. MANDATORY: mark issue as fixed, otherwise the platform doesn't know
#    the work is done and the delta metric becomes useless.
#    ($ISSUE_ID is the ID of the issue just fixed, taken from the entries in $ISSUES.)
curl -X PATCH \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status":"fixed"}' \
  "$BASE/v1/site-audit/issues/$ISSUE_ID"
# 4. For layout/template fixes: bulk-mark instead of 50× single PATCH:
curl -X PATCH \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"filter":{"issue_type":"missing_alt_text","status":"open"},"new_status":"fixed"}' \
  "$BASE/v1/site-audit/$CRAWL/issues/bulk"
Polling pattern (deterministic since 2026-05-14):
while true; do
  R=$(curl -sH "Authorization: Bearer $TOKEN" "$BASE/v1/site-audit/$CRAWL" | jq '.data')
  STATUS=$(echo "$R" | jq -r .status)
  BRIDGE=$(echo "$R" | jq -r .bridge_completed_at)
  [ "$STATUS" = "completed" ] && [ "$BRIDGE" != "null" ] && break
  sleep 30
done
# → ALL issues are now final, including grounding_*
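The loop above never gives up, so a crawl that stalls would block the caller indefinitely. A bounded variant could look like the sketch below. It uses the same endpoint and the same two fields; the function names, the default of 40 attempts, and the 30-second interval are illustrative choices, not platform requirements.

```shell
# Prints "<status> <bridge_completed_at>" for one crawl, e.g. "completed null".
crawl_state() {
  curl -sS -H "Authorization: Bearer $TOKEN" "$BASE/v1/site-audit/$1" |
    jq -r '.data | "\(.status) \(.bridge_completed_at)"'
}

# Succeeds only when the crawl is completed AND the bridge timestamp is set.
is_final() {
  [ "$1" = "completed" ] && [ -n "$2" ] && [ "$2" != "null" ]
}

# wait_for_bridge CRAWL [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Returns 0 once all issues are final, 1 when the attempt budget is used up.
wait_for_bridge() {
  local crawl="$1" max_attempts="${2:-40}" interval="${3:-30}"
  local attempt=1 state status bridge

  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(crawl_state "$crawl")
    # Split the two fields; an empty or one-word reply leaves bridge empty.
    read -r status bridge <<EOF
$state
EOF
    if is_final "$status" "$bridge"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done

  echo "crawl $crawl not final after $max_attempts attempts" >&2
  return 1
}

# Usage: wait_for_bridge "$CRAWL" && echo "all issues final, incl. grounding_*"
```

Keeping the network call in crawl_state and the decision in is_final means the completion condition can be checked without contacting the API.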
Anti-patterns
- Applying a fix without PATCH /issues/{id} with status=fixed → issue stays "open" in dashboard, delta comparison metric of next crawl iteration becomes useless.
- Querying issues right after status='completed' without checking bridge_completed_at → grounding_* issues missing (bridge still running).
- Bulk-mark without filter.issue_type → marks hundreds of unrelated issues. Server returns meta.applied_filter; caller MUST verify before next step.
- AI brief in a loop for 100+ issues: 5 credits per brief. Only for selected critical items.
Related modules
- Content Audit (Site Crawl) — content-quality scanner (what pages say, not what pages have). Complementary.
- Grounding Audit — single-URL LLM-citation-readiness. Direct call without crawl overhead.
- [[modules/page-audit]] — Conversion Audit per URL (Lighthouse + Vision + persona fit). More granular per URL.
Last updated: May 14, 2026