SENTINEL · SOC Triage

SENTINEL

Deterministic SOC alert triage with HEIMDALL confidence gate and LUNA audit chain.

Held-out accuracy: 94.0%

Unit tests (post-sprint): 210/210

Regressions: 0

What it is

SENTINEL is the COSMIC pipeline applied to Security Operations Center alert triage — the classification of incoming security alerts as true positive, false positive, or requiring escalation. SOC analysts spend the majority of their time triaging alerts that turn out to be noise. LLM-based triage introduces hallucination risk in exactly the domain where a false negative (missing a real threat) has catastrophic consequences.

SENTINEL uses a sealed decision ruleset. Every triage decision traces to a specific rule in the pipeline with a confidence score. The HEIMDALL gate refuses to triage alerts that fall outside the sealed decision boundary — escalating to a human analyst rather than guessing. LUNA maintains a SHA-chained audit log: every alert, every decision, every refusal is immutably recorded and tamper-evident.
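The gate-then-log behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not SENTINEL's actual code: the names `AuditChain`, `triage`, `GENESIS`, the record fields, and the `0.9` threshold are all hypothetical, and SHA-256 is assumed as the chaining hash.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


@dataclass
class AuditChain:
    """Append-only log in which each entry commits to the previous entry's hash."""
    entries: list = field(default_factory=list)

    @staticmethod
    def _digest(prev_hash: str, record: dict) -> str:
        # Canonical JSON (sorted keys) so the same record always hashes the same way.
        payload = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        digest = self._digest(prev_hash, record)
        self.entries.append({"prev": prev_hash, "record": record, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


def triage(alert_id, rule_id, label, confidence, chain, threshold=0.9):
    """Return the rule's label if it clears the gate; otherwise refuse and escalate.

    rule_id is None when no sealed rule matched the alert (hypothetical convention).
    Decisions and refusals are both written to the audit chain.
    """
    accepted = rule_id is not None and confidence >= threshold
    decision = label if accepted else "escalate_to_human"
    chain.append({
        "alert_id": alert_id,
        "rule_id": rule_id,
        "confidence": confidence,
        "decision": decision,
    })
    return decision
```

Editing any logged record after the fact makes `verify()` return `False`, which is the sense in which such a chain is tamper-evident. Detecting a wholesale rewrite of the log would additionally require a trusted copy of the latest hash stored elsewhere.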

The overnight sprint shipped SOAR adapter stubs for four major vendors (Chronicle, Microsoft Sentinel, Splunk, Elastic), a feedback loop API, and the full deployment infrastructure (Terraform + EKS). The 94.0% held-out accuracy figure comes from the HEIMDALL classifier, evaluated on a test set that was sealed before the sprint started.

Sprint summary

WS1 · Deployment Infrastructure · Complete
Terraform + EKS + NATS JetStream + Grafana dashboards

WS2 · SOAR Adapters · Complete (stubs)
Chronicle, MS Sentinel, Splunk, Elastic; auth gaps documented

WS3 · Ingest Connectors · Complete (stubs)
Chronicle native ATT&CK IDs; MS Sentinel OAuth2 needs sandbox

WS4 · Admin API · Complete
New routes wired; existing triage routes unaffected

WS5 · Feedback Loop · Complete (v0 placeholder)
FP-rate heuristic >30% → reduce active_ratio 20%; thresholds TBD

Limitations

SOAR auth unknowns. All four SOAR vendor adapters have documented gap files requiring sign-off before Month 4 sandbox work. The adapters are functional stubs; production auth flows are vendor-specific and not yet validated.
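One way to picture "functional stubs with unvalidated auth" is a shared adapter interface whose auth step is deliberately left unimplemented. The sketch below is purely illustrative: `SoarAdapter`, `StubAdapter`, `push_decision`, and the vendor keys are hypothetical names, and no real vendor API is called.

```python
from abc import ABC, abstractmethod


class SoarAdapter(ABC):
    """Hypothetical common interface for the four vendor adapters."""

    vendor: str

    @abstractmethod
    def authenticate(self) -> None:
        """Establish a session with the vendor (flow is vendor-specific)."""

    @abstractmethod
    def push_decision(self, alert_id: str, decision: str) -> dict:
        """Send a triage decision to the vendor's SOAR platform."""


class StubAdapter(SoarAdapter):
    """Stub: builds and records payloads locally, never contacts a vendor."""

    def __init__(self, vendor: str):
        self.vendor = vendor
        self.sent = []  # payloads that would have been delivered

    def authenticate(self) -> None:
        # The production auth flow is an open gap, so the stub fails loudly
        # instead of pretending to hold a valid session.
        raise NotImplementedError(f"{self.vendor}: auth flow not yet validated")

    def push_decision(self, alert_id: str, decision: str) -> dict:
        payload = {"vendor": self.vendor, "alert_id": alert_id, "decision": decision}
        self.sent.append(payload)
        return payload


# Assumed registry keys, one per vendor named in the text.
ADAPTERS = {
    name: StubAdapter(name)
    for name in ("chronicle", "ms_sentinel", "splunk", "elastic")
}
```

Failing loudly in `authenticate` keeps a stub from being mistaken for a validated integration while still letting the rest of the pipeline exercise the payload path.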

Feedback loop thresholds are v0 placeholders. The FP-rate heuristic (>30% → reduce active_ratio 20%) is implemented, but its thresholds have not been tuned and should be validated before the loop is exposed to a design partner.
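The heuristic above can be written out as a small function. This is a sketch under one explicit assumption: "reduce active_ratio 20%" is read here as a 20% relative reduction (multiply by 0.8), which the text does not confirm. The function name, its parameters, and the zero-feedback behavior are hypothetical.

```python
FP_RATE_THRESHOLD = 0.30  # from the text: act when the FP rate exceeds 30%
RELATIVE_REDUCTION = 0.20  # ASSUMPTION: "reduce 20%" means multiply by 0.8


def adjust_active_ratio(active_ratio: float, false_positives: int, total_feedback: int) -> float:
    """Return the new active_ratio after one round of analyst feedback."""
    if total_feedback == 0:
        # No feedback yet, so there is no FP rate to act on (assumed behavior).
        return active_ratio
    fp_rate = false_positives / total_feedback
    if fp_rate > FP_RATE_THRESHOLD:
        return active_ratio * (1.0 - RELATIVE_REDUCTION)
    return active_ratio
```

For example, with 4 false positives in 10 feedback items the FP rate is 40%, so an `active_ratio` of 0.5 would drop to 0.4 under this reading. With 2 in 10 it would stay at 0.5.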

No live deployment yet. Infrastructure is reviewable Terraform and Kubernetes manifests. Six design questions must be resolved before first terraform apply.
