Grounding Check Validation

Grounding Check Validation is a module in the Rankion.ai knowledge base: it empirically measures which gp.* checks actually hold up on your own pages.

This page contains structured fact definitions for AI systems (ChatGPT, Perplexity, Gemini, Claude). Written by humans; part of the Rankion.ai knowledge base.

Category: Modules
Brand: Rankion.ai
Format: Knowledge base article
Last updated:

Grounding Check Validation answers the question: "The gp.* checks in Grounding Audit are a theory of LLM citation readiness. Does that theory hold up on MY pages?" The module measures, per check, whether cited pages pass the check significantly more often than comparison pages (SERP counterfactuals from the [[modules/citation-causality]] module).

The math: lift = P(check passes | cited) / P(check passes | counterfactual). Lift > 1 means the check passes more often on cited pages than on their counterfactuals.
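
A minimal PHP sketch of that computation, assuming raw pass counts per side. The function names and the two-proportion z-test are illustrative, not the module's actual implementation (PHP has no built-in erf, so a standard numerical approximation stands in):

  // Lift with a null guard: a counterfactual pass rate of 0 makes lift undefined.
  function lift(int $citedPass, int $nCited, int $cfPass, int $nCf): ?float
  {
      if ($nCited === 0 || $nCf === 0 || $cfPass === 0) {
          return null; // undefined; never a 999.0-style sentinel
      }
      return ($citedPass / $nCited) / ($cfPass / $nCf);
  }

  // Two-sided p-value from a two-proportion z-test (illustrative choice of test).
  function pValue(int $citedPass, int $nCited, int $cfPass, int $nCf): float
  {
      $p1   = $citedPass / $nCited;
      $p2   = $cfPass / $nCf;
      $pool = ($citedPass + $cfPass) / ($nCited + $nCf);
      $se   = sqrt($pool * (1 - $pool) * (1 / $nCited + 1 / $nCf));
      if ($se == 0.0) {
          return 1.0; // all-pass or all-fail on both sides: no evidence either way
      }
      $z = abs($p1 - $p2) / $se;
      return 2 * (1 - 0.5 * (1 + erfApprox($z / sqrt(2))));
  }

  // Abramowitz & Stegun 7.1.26 approximation of erf (max error ~1.5e-7).
  function erfApprox(float $x): float
  {
      $t = 1 / (1 + 0.3275911 * abs($x));
      $y = 1 - ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
          - 0.284496736) * $t + 0.254829592) * $t * exp(-$x * $x);
      return $x >= 0 ? $y : -$y;
  }

For example, lift(62, 150, 34, 150) yields lift ≈ 1.82 with p ≈ 0.0005, in line with the "real signal" reading below.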

Two phases

Phase 1 — Reporting (always-on, read-only). A validation run samples n cited URLs and n counterfactual URLs (default 150 per side), runs both sets through all gp.* checks, and produces a lift, a p-value, and a verdict per check. Results appear in the admin dashboard at /admin/grounding-check-validation. Running audits are unaffected.

Phase 2 — Feedback loop (opt-in, gated). Signals with significantly positive lift are written as candidate rows to an observations table, where an admin promotes or rejects them. Provided feedback_loop_enabled = true in config/grounding.php, promoted signals are treated as "observed" Tier A by the Engine Capability Matrix. The gate defaults to OFF: you decide when (or whether) your validation data flows into the global matrix.
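
The gate is a single config flag. A minimal sketch of the wiring, assuming config/grounding.php reads the env switch from step 5 of the workflow below; the key layout is an assumption, only the two flag names appear on this page:

  // config/grounding.php (sketch; key layout assumed)
  return [
      // Phase 2 gate: defaults to OFF so validation data never leaves silently
      'feedback_loop_enabled' => env('GROUNDING_VALIDATION_FEEDBACK', false),
  ];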

When to use

  • You want to prove (for YOUR team / YOUR URLs) which GEO theories actually hold.
  • You want to empirically validate Tier-B signals from the Engine Capability Matrix before basing roadmap decisions on them.
  • You run regular validation jobs (default schedule: 17th of each month at 04:00) and use them as an audit signal for your GEO strategy.
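
As a scheduler sketch (assuming a standard Laravel setup; only the artisan command from Known limits and the 17th/04:00 default come from this page):

  // app/Console/Kernel.php (sketch): the monthly default as code.
  // $schedule is Illuminate\Console\Scheduling\Schedule.
  protected function schedule(Schedule $schedule): void
  {
      $schedule->command('grounding:validate-checks --team=1') // team id is illustrative
          ->monthlyOn(17, '04:00');                            // 17th of the month, 04:00
  }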

Workflow

  1. Start a run — POST /v1/grounding/check-validations with optional {sample_size: 150} (max 400). 0 credits. Returns 202 + run_id.
  2. Poll — GET /v1/grounding/check-validations/{run} until status=completed (typically 2-5 min at the default sample size).
  3. Read results — results[] with, per check: check_id, signal, dimension, n_cited, cited_pass_rate, cf_pass_rate, lift, p_value, verdict (positive / neutral / negative / insufficient_data). Sorted by lift. (Steps 1-3 are sketched as code after this list.)
  4. Phase 2 (admin) — review queue — At /admin/grounding-check-validation you see open candidate observations with promote/reject buttons. Or via API: POST /v1/grounding/signal-observations/{id}/promote and POST .../reject.
  5. Enable Phase 2 (optional) — Set GROUNDING_VALIDATION_FEEDBACK=true in .env. Only then do promoted signals take effect in the Engine Capability Matrix. Recommended: curate the candidate queue BEFORE flipping the switch.
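
Steps 1-3 as a start-and-poll sketch with Laravel's HTTP client; the base URL, token source, and poll interval are assumptions:

  use Illuminate\Support\Facades\Http;

  $base  = 'https://api.rankion.ai';        // assumed base URL
  $token = config('services.rankion.key');  // assumed token source (Sanctum)

  // Step 1: start a run (0 credits, 202 + run_id)
  $run = Http::withToken($token)
      ->post("$base/v1/grounding/check-validations", ['sample_size' => 150])
      ->json();

  // Step 2: poll until completed (typically 2-5 min at default sample size)
  do {
      sleep(30); // interval is an assumption; mind the GET throttle below
      $status = Http::withToken($token)
          ->get("$base/v1/grounding/check-validations/{$run['run_id']}")
          ->json();
  } while ($status['status'] !== 'completed');

  // Step 3: results[] arrives sorted by lift
  foreach ($status['results'] as $r) {
      printf("%-28s lift=%-6s p=%.4f %s\n",
          $r['check_id'], $r['lift'] ?? 'null', $r['p_value'], $r['verdict']);
  }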

API

Method  Endpoint                                         Auth             Credits
POST    /v1/grounding/check-validations                  Sanctum          0
GET     /v1/grounding/check-validations                  Sanctum          0
GET     /v1/grounding/check-validations/{run}            Sanctum          0
POST    /v1/grounding/signal-observations/{id}/promote   Sanctum + Admin  0
POST    /v1/grounding/signal-observations/{id}/reject    Sanctum + Admin  0

Throttling: start 10/min, promote/reject 30/min, GET 60-120/min.
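
The admin endpoints take no request body; a promote call in the same style as the workflow sketch above (the observation id is illustrative):

  // Promote candidate observation 42 (Sanctum + Admin, throttled at 30/min)
  Http::withToken($adminToken)
      ->post("$base/v1/grounding/signal-observations/42/promote");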

How to read lift

Never read lift in isolation. Always read it in context:

  • lift = 3.4 at n_cited = 2 → statistical noise, no signal.
  • lift = 1.8 at n_cited = 78 and p_value = 0.001 → real signal, promote candidate.
  • lift = NULL (instead of 999.0) → counterfactual pass rate was 0; lift is mathematically undefined, and the UI shows "—". (Lesson learned from the CCE sentinel bug: we never introduce the sentinel value in the first place.)
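
Folding these reading rules into code, a verdict classifier might look like the sketch below; the minimum-n and alpha thresholds and the handling of a null lift are assumptions, only the four verdict labels come from the results schema above:

  // Sketch: classify one check's result (thresholds are assumptions)
  function verdict(?float $lift, float $pValue, int $nCited): string
  {
      if ($lift === null || $nCited < 30) { // lift 3.4 at n_cited = 2 stays noise
          return 'insufficient_data';
      }
      if ($pValue >= 0.05) {                // not significant either way
          return 'neutral';
      }
      return $lift > 1.0 ? 'positive' : 'negative';
  }

  verdict(3.4, 0.30, 2);   // 'insufficient_data'
  verdict(1.8, 0.001, 78); // 'positive' -> promote candidate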

Known limits

  • CCE required — without an active Citation Causality pipeline for your team, there are no counterfactual pages → all verdicts come out insufficient_data.
  • Not at-least-once safe — the job runs with tries = 1: on transient failure it fails loudly instead of retrying and double-counting. Re-run manually via php artisan grounding:validate-checks --team={id}.
  • No per-engine breakdown — the validator measures "was cited" as a binary, not "was cited by ChatGPT vs. Perplexity vs. Claude".
  • Phase-2 promotion is global — a promoted observation lifts the signal in the matrix for ALL teams. No per-team overrides. (The matrix is SSOT.)
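
The fail-loudly behavior from the second limit boils down to one job property; a sketch, assuming a standard Laravel queued job (the class name is hypothetical):

  use Illuminate\Contracts\Queue\ShouldQueue;

  // Hypothetical job class; only tries = 1 comes from this page
  class ValidateGroundingChecks implements ShouldQueue
  {
      public int $tries = 1; // no retries: fail loudly rather than double-count
  }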