JOURDANLABS
SIGNAL · PHARMACOVIGILANCE NLP

SIGNAL

Adverse drug event extraction from clinical narratives and pharmacovigilance reports.

F1: 0.639
Median detection: 24.3 months
Precision: 0.712
What It Is

SIGNAL identifies adverse drug events (ADEs) in unstructured clinical text: spontaneous reports, case narratives, and pharmacovigilance databases. The 24.3-month median detection window is the median lead time between SIGNAL flagging a safety signal and that signal appearing in regulatory action.
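
As a rough illustration of how a median detection window of this kind could be computed, the sketch below takes pairs of dates (first flag, regulatory action) and returns the median gap in months. The function names, the average-month-length conversion, and the example dates are all assumptions made for illustration; they are not SIGNAL's code or data.

```python
from datetime import date
from statistics import median

AVERAGE_DAYS_PER_MONTH = 30.4375  # 365.25 / 12, an approximation


def months_between(earlier: date, later: date) -> float:
    """Approximate number of months from `earlier` to `later`."""
    return (later - earlier).days / AVERAGE_DAYS_PER_MONTH


def median_lead_months(pairs) -> float:
    """Median gap, in months, between first flag and regulatory action."""
    return median(months_between(flag, action) for flag, action in pairs)


# Made-up dates for illustration only; these are not SIGNAL results.
example_pairs = [
    (date(2015, 1, 1), date(2016, 1, 1)),
    (date(2016, 6, 1), date(2018, 6, 1)),
    (date(2017, 3, 1), date(2020, 3, 1)),
]
lead = median_lead_months(example_pairs)
print(f"median lead time: {lead:.1f} months")
```

A median is less sensitive than a mean to a few signals with unusually long or short lead times, which is one plausible reason to report it here.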

Unlike LLM-based extraction pipelines, SIGNAL uses a sealed corpus of data from the FDA Adverse Event Reporting System (FAERS), a public-domain government dataset, combined with a deterministic entity extraction and normalization pipeline. No inference is made about causality or severity unless the corpus explicitly supports it. Ambiguous mentions trigger honest refusal via AURORA.
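
To make the idea of deterministic extraction with refusal concrete, here is a minimal sketch under stated assumptions. The lexicons, the ambiguity cues, and the `extract` function are hypothetical stand-ins; the page does not describe SIGNAL's actual rules or the AURORA interface, so none of this should be read as their implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexicons: surface form -> normalized term (illustrative only).
DRUG_LEXICON = {"warfarin": "warfarin", "coumadin": "warfarin"}
EVENT_LEXICON = {"gi bleed": "gastrointestinal haemorrhage", "nausea": "nausea"}

# Hypothetical cues that mark a mention as negated or uncertain.
AMBIGUITY_CUES = re.compile(r"\b(no|denies|possible|rule out)\b")


@dataclass
class Extraction:
    drug: str
    event: str
    refused: bool  # True when a pair is found but the mention is ambiguous


def _lookup(text: str, lexicon: dict) -> Optional[str]:
    """Return the normalized term for the first lexicon entry found in text."""
    for surface, normalized in lexicon.items():
        if re.search(rf"\b{re.escape(surface)}\b", text):
            return normalized
    return None


def extract(sentence: str) -> Optional[Extraction]:
    """Rule-based extraction: the same input always yields the same output."""
    text = sentence.lower()
    drug = _lookup(text, DRUG_LEXICON)
    event = _lookup(text, EVENT_LEXICON)
    if drug is None or event is None:
        return None  # no drug-event pair in this sentence
    return Extraction(drug, event, refused=bool(AMBIGUITY_CUES.search(text)))
```

In this sketch a refusal still records which pair was seen, so refusals can be counted per class rather than silently dropped.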

Results
System                 F1     Precision  Recall  Refusal Rate
SIGNAL v0.1 (COSMIC)   0.639  0.712      0.580   reported per-class
Keyword baseline       0.550  0.481      0.644   0.000
Dictionary lookup      0.512  0.773      0.384   0.000

Baselines are real implementations — keyword matching against MedDRA and drug-name dictionary lookup — not straw men.
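
For readers who want to check the table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the published precision and recall values. The function name is an assumption. Recomputed values may differ from the table in the last digit, presumably because the published precision and recall are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the Results table.
rows = {
    "SIGNAL v0.1 (COSMIC)": (0.712, 0.580),
    "Keyword baseline": (0.481, 0.644),
    "Dictionary lookup": (0.773, 0.384),
}

for system, (precision, recall) in rows.items():
    print(f"{system}: F1 = {f1_score(precision, recall):.3f}")
```

The table shows the usual trade-off: dictionary lookup has the highest precision but the lowest recall, and the harmonic mean penalizes that imbalance.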

Reproducibility
Corpus source: FDA FAERS (public domain)
Corpus seal: SHA-256 in CHECKPOINT_RESULTS.md
Repo: github.com/jourdanlabs/benchmarks/signal
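
A corpus seal of this kind can be checked by hashing the local copy of the corpus and comparing the digest with the published one. The sketch below shows one way to do that with Python's standard `hashlib`. The function names are assumptions, and the page does not say how the hash is laid out in CHECKPOINT_RESULTS.md or which file it covers, so the expected digest is passed in by the caller.

```python
import hashlib


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_seal(path, expected_hex: str) -> bool:
    """True when the file's digest matches the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch means the local corpus is not byte-identical to the sealed one, so results computed on it are not directly comparable.
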
Limitations

FAERS report quality variance. Incomplete, noisy, or duplicative reports can affect extraction quality and downstream signal timing.

Causality not inferred. SIGNAL does not infer causality or severity unless the corpus explicitly supports it.

Corpus recency. FAERS is updated monthly; detection latency depends on data availability and processing cadence.
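
One common way to reduce the duplicative reports mentioned under report quality variance is to keep only the latest version of each case. The sketch below assumes each report is a dictionary with `caseid` and `caseversion` keys, following the field names in the FAERS quarterly demographic files. It is an illustration of the general technique, not a description of SIGNAL's preprocessing.

```python
def latest_versions(reports):
    """Keep one report per caseid: the one with the highest caseversion."""
    latest = {}
    for report in reports:
        current = latest.get(report["caseid"])
        if current is None or int(report["caseversion"]) > int(current["caseversion"]):
            latest[report["caseid"]] = report
    return list(latest.values())


# Made-up records for illustration only.
sample = [
    {"caseid": "100", "caseversion": "1", "text": "initial report"},
    {"caseid": "100", "caseversion": "2", "text": "follow-up report"},
    {"caseid": "200", "caseversion": "1", "text": "single report"},
]
deduplicated = latest_versions(sample)
```

This handles only follow-up versions of the same case. Reports of the same event filed under different case identifiers need fuzzier matching, which is out of scope for this sketch.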