
SENTINEL

Deterministic SOC alert triage with HEIMDALL confidence gate and LUNA audit chain.

Held-out accuracy: 94.0%
Unit tests (post-sprint): 210/210
Regressions: 0

What it is

SENTINEL is the COSMIC pipeline applied to Security Operations Center (SOC) alert triage: the classification of incoming security alerts as true positive, false positive, or requiring escalation. SOC analysts spend the majority of their time triaging alerts that turn out to be noise. LLM-based triage introduces hallucination risk in exactly the domain where a false negative (missing a real threat) has catastrophic consequences.

SENTINEL uses a sealed decision ruleset. Every triage decision traces to a specific rule in the pipeline and carries a confidence score. The HEIMDALL gate refuses to triage alerts that fall outside the sealed decision boundary and escalates them to a human analyst rather than guessing. LUNA maintains a SHA-chained audit log: every alert, every decision, and every refusal is recorded immutably, and the chain makes tampering evident.
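The gate-then-log flow described above can be illustrated with a minimal Python sketch. It is not SENTINEL's implementation. The names `gate`, `AuditLog`, and `GENESIS_HASH`, the 0.9 confidence threshold, the choice of SHA-256, and the record layout are all assumptions made for illustration; the source says only that the log is "SHA-chained" and that out-of-boundary alerts are escalated.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Assumed starting value for the chain; the real genesis value is not documented.
GENESIS_HASH = "0" * 64


def gate(rule_id, confidence, threshold=0.9):
    """Hypothetical confidence gate.

    Treats an alert as outside the decision boundary when no rule matched
    or when the confidence is below an assumed threshold, and escalates it.
    """
    if rule_id is None or confidence < threshold:
        return {"action": "escalate_to_human", "rule_id": rule_id, "confidence": confidence}
    return {"action": "triage", "rule_id": rule_id, "confidence": confidence}


@dataclass
class AuditLog:
    """Append-only log in which each entry's hash covers the previous hash."""

    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        # Canonical JSON so the same record always hashes to the same digest.
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self.entries.append({"prev_hash": prev_hash, "record": record, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited record or broken link fails."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

In this sketch a refusal is logged the same way as a decision, so an escalated alert leaves the same kind of chained entry as a triaged one. Editing any stored record afterwards causes `verify()` to return `False`.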

The overnight sprint shipped SOAR adapter stubs for four major vendors (Chronicle, Microsoft Sentinel, Splunk, Elastic), a feedback loop API, and deployment infrastructure as code (Terraform + EKS) that has not yet been applied to a live environment. The 94.0% held-out accuracy figure comes from the HEIMDALL classifier, evaluated on a test set that was sealed before the sprint started.


Sprint summary

Workstream | Area | Scope and notes | Status
WS1 | Deployment Infrastructure | Terraform + EKS + NATS JetStream + Grafana dashboards | Complete
WS2 | SOAR Adapters | Chronicle, MS Sentinel, Splunk, Elastic; auth gaps documented | Complete (stubs)
WS3 | Ingest Connectors | Chronicle native ATT&CK IDs; MS Sentinel OAuth2 needs sandbox | Complete (stubs)
WS4 | Admin API | New routes wired; existing triage routes unaffected | Complete
WS5 | Feedback Loop | FP-rate heuristic >30% → reduce active_ratio 20%; thresholds TBD | Complete (v0 placeholder)

Limitations

SOAR auth unknowns. All four SOAR vendor adapters have documented gap files requiring sign-off before Month 4 sandbox work. The adapters are functional stubs; production auth flows are vendor-specific and not yet validated.

Feedback loop thresholds are v0 placeholders. The FP-rate heuristic (>30% → reduce active_ratio 20%) is implemented, but its thresholds should be validated before the loop is exposed to a design partner.
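The v0 heuristic can be sketched as below to make its behavior concrete. This is an illustrative reading, not the shipped code. The function name and signature are hypothetical. The false-positive rate is assumed to be false positives divided by total triaged alerts, and "reduce active_ratio 20%" is assumed to mean a relative reduction (multiply by 0.8), not a cut of 20 percentage points. Only the 30% and 20% figures come from the source.

```python
def adjust_active_ratio(false_positives, total_triaged, active_ratio,
                        fp_threshold=0.30, reduction=0.20):
    """Hypothetical v0 feedback step.

    Assumes FP rate = false_positives / total_triaged and a relative
    reduction of active_ratio. Both thresholds are placeholders.
    """
    if total_triaged == 0:
        # No feedback yet, so leave the ratio unchanged.
        return active_ratio
    fp_rate = false_positives / total_triaged
    if fp_rate > fp_threshold:
        return active_ratio * (1.0 - reduction)
    return active_ratio
```

Under these assumptions, 40 false positives out of 100 triaged alerts would lower an `active_ratio` of 1.0 to 0.8. A rate of exactly 30% leaves it unchanged, because the comparison is strict.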

No live deployment yet. The infrastructure exists as reviewable Terraform and Kubernetes manifests. Six design questions must be resolved before the first terraform apply.
