Products validating products.
VANTAGE is the COSMIC diagnostic suite and now the CLARION-powered code auditor. VANTAGE X was already strong. VANTAGE 2.0 rebuilt the auditor around reject-first verification and cleared the internal fixture suite with 100% recall of expected findings.
The auditor beat the auditor.
The first VANTAGE line established a hard internal bar for deterministic code review. VANTAGE 2.0 did not chase a bigger confidence score. It changed the question: what claims about this codebase cannot survive contradiction, missing proof, stale evidence, unsafe execution patterns, or brittle architecture?
That CLARION posture produced a cleaner benchmark profile: every expected finding recovered, no forbidden-hit classes, no severity mismatches, and audit-hashable output that can be rerun rather than trusted.
Internal deterministic code-audit benchmark. This is a product validation receipt, not a public held-out benchmark claim.
BCa (bias-corrected and accelerated) bootstrap confidence intervals are computed per scan from B=2,000 resamples. Full receipts are available on request. The BACCHUS and HELIX codebases will be added to the VANTAGE scan rotation in the coming weeks.
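For readers unfamiliar with the method, here is a minimal standard-library sketch of a BCa interval over per-task scores. It is an illustration of the general technique, not VANTAGE's implementation: the function name `bca_interval`, the choice of the mean as the statistic, the fixed seed, and the sample scores are all assumptions.

```python
import random
from statistics import NormalDist, fmean


def bca_interval(data, stat=fmean, n_resamples=2000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated (BCa) bootstrap interval for stat(data).

    Hypothetical helper: a fixed seed keeps reruns reproducible.
    """
    data = list(data)
    n = len(data)
    rng = random.Random(seed)
    norm = NormalDist()
    theta_hat = stat(data)

    # Bootstrap distribution: resample with replacement, recompute the statistic.
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction z0 from the share of resampled statistics below the estimate.
    # The share is clamped away from 0 and 1 so inv_cdf stays defined.
    share = sum(b < theta_hat for b in boots) / n_resamples
    share = min(max(share, 1 / (n_resamples + 1)), n_resamples / (n_resamples + 1))
    z0 = norm.inv_cdf(share)

    # Acceleration a from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(q):
        # Shift the nominal quantile q using z0 and a.
        z = norm.inv_cdf(q)
        return norm.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    def pick(q):
        # Read the adjusted quantile off the sorted bootstrap distribution.
        index = int(adjusted_quantile(q) * n_resamples)
        return boots[min(n_resamples - 1, max(0, index))]

    return pick(alpha / 2), pick(1 - alpha / 2)


# Illustrative per-task scores, not real scan data.
scores = [0.81, 0.92, 0.77, 0.88, 0.95, 0.69, 0.84, 0.90, 0.86, 0.79]
low, high = bca_interval(scores)
print(f"mean={fmean(scores):.3f}, 95% BCa CI=({low:.3f}, {high:.3f})")
```

Because the random generator is seeded, the same scores give the same interval on every run, which matches the rerunnable posture described above.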
Sealed corpus per domain
Each VANTAGE scan runs against a domain-specific corpus sealed before the scan begins. The corpus SHA is published in the receipt. You can verify the corpus was not modified between scans.
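A minimal sketch of how such a check could work, assuming the corpus is a directory of files and the published SHA is a SHA-256 hex digest. The receipt does not document the actual hashing scheme, so `corpus_sha256`, `corpus_unchanged`, and the path-plus-content layout are hypothetical.

```python
import hashlib
from pathlib import Path


def corpus_sha256(root: Path) -> str:
    """Return one SHA-256 hex digest covering every file under root.

    Assumed scheme: files are visited in sorted relative-path order, and both
    the path and the file contents feed the digest.
    """
    outer = hashlib.sha256()
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    for rel, path in files:
        rel_bytes = rel.encode("utf-8")
        # Length-prefix the path so adjacent fields cannot run together.
        outer.update(len(rel_bytes).to_bytes(8, "big"))
        outer.update(rel_bytes)
        outer.update(hashlib.sha256(path.read_bytes()).digest())
    return outer.hexdigest()


def corpus_unchanged(root: Path, published_sha: str) -> bool:
    """Compare a freshly computed digest with the one printed in a receipt."""
    return corpus_sha256(root) == published_sha
```

Sorting by relative path keeps the digest independent of filesystem listing order, so two parties hashing the same corpus on different machines should arrive at the same value.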
Full COSMIC pipeline execution
NOVA → ECLIPSE → PULSAR → AURORA → LUNA. Every stage runs deterministically, with no LLM calls at runtime. The AURORA gate refuses tasks whose aggregate confidence falls below a threshold; refusal rates are reported, not suppressed.
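The gate-and-report behavior can be sketched as follows. This is an assumption-laden illustration, not the AURORA engine: the names `GateResult`, `gate`, and `refusal_rate` are hypothetical, and taking the mean of upstream stage confidences as the "aggregate" is a guess, since the aggregation rule is not published.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class GateResult:
    task_id: str
    confidence: float
    refused: bool


def gate(task_id: str, stage_confidences: list[float], threshold: float) -> GateResult:
    # Assumption: aggregate confidence is the mean of the upstream stage values.
    aggregate = fmean(stage_confidences)
    # Refuse only when the aggregate is strictly below the threshold.
    return GateResult(task_id, aggregate, refused=aggregate < threshold)


def refusal_rate(results: list[GateResult]) -> float:
    # Refusals are counted and reported rather than dropped from the results.
    if not results:
        return 0.0
    return sum(r.refused for r in results) / len(results)


results = [
    gate("t1", [0.9, 0.8, 0.7], threshold=0.75),
    gate("t2", [0.6, 0.5, 0.7], threshold=0.75),
    gate("t3", [0.75, 0.75], threshold=0.75),
]
print([r.refused for r in results], refusal_rate(results))
```

Keeping refused tasks in the result list, instead of filtering them out, is what allows the refusal rate to appear in a receipt alongside the scores.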
Per-engine confidence attribution
VANTAGE produces a failure-class taxonomy alongside the score. Each failure class is attributed to a specific pipeline stage with a confidence estimate. The taxonomy informs the benchmark checkpoint arc: it is how CITADEL's E.1 and E.2 fixes were identified.
Sealed receipt with BCa CIs
The scan receipt includes: corpus SHA, per-task scores, refusal rate, BCa bootstrap confidence intervals (B=2,000), and a LUNA audit chain head. The receipt is tamper-evident. If you share it with a third party, they can verify it against the public corpus.
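One common way to make a record tamper-evident is a hash chain, where each entry's hash covers the previous hash and the final hash serves as the chain head. The sketch below assumes SHA-256 and canonical JSON. The LUNA chain format is not described here, so `GENESIS`, `entry_hash`, `chain_head`, `verify_receipt`, and the entry fields are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for an empty chain


def entry_hash(prev_hash: str, entry: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal entries hash equally.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def chain_head(entries: list[dict]) -> str:
    # Fold every entry into the running hash; the final value is the head.
    head = GENESIS
    for entry in entries:
        head = entry_hash(head, entry)
    return head


def verify_receipt(entries: list[dict], claimed_head: str) -> bool:
    # Any edited, reordered, added, or dropped entry changes the recomputed head.
    return chain_head(entries) == claimed_head


# Illustrative entries, not a real receipt.
entries = [{"task": "t1", "score": 0.9}, {"task": "t2", "score": 0.4}]
head = chain_head(entries)
print(verify_receipt(entries, head))
```

A third party holding only the entries and the published head can recompute the chain without trusting whoever produced the receipt.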
Engine implementations are proprietary. The VANTAGE diagnostic framework ships without engine source code. To run full diagnostics, request an evaluation API key.
Is claimed: VANTAGE correctly identified CITADEL failure classes A, B, and C with stated confidence levels (95%, 88%, 80%). The E.1 and E.2 fixes were implemented from VANTAGE's diagnosis and produced the documented F1 improvements.
Is not claimed: that VANTAGE scan scores are published benchmarks. They are capability demonstrations. Division scan scores (ATLAS 88.5%, BACCHUS 88.9%, HELIX 81.5%) have no held-out ground truth separate from the scan corpus. The publicly published sealed benchmark results are the CRUCIBLE numbers (SENTINEL 94.0%, ORACLE 51.0%).