JOURDANLABS
COMPASS · READING COMPLEXITY

COMPASS

Reading-level calibration.

15/15 within-1-tier · Research papers · Calibrated tier assignment
What It Is

COMPASS tests reading-level complexity calibration on research papers. The system must classify text complexity within one tier of the ground-truth label. Research papers are the hardest category — they combine technical vocabulary, discipline-specific knowledge, and high inferential demand.

The 15/15 within-1-tier result means every research paper in the test set was assigned a complexity tier within one level of its ground-truth classification. Surface metrics (Flesch-Kincaid, Gunning Fog) routinely misclassify research papers; COMPASS places all 15 within one tier.
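The surface baselines named above are closed-form formulas over token counts. A minimal sketch using the published constants; tokenization and syllable counting are omitted, so counts are passed in directly:

```python
# The three surface-metric baselines, from their published formulas.
# Counting words/sentences/syllables/letters is left to the caller.

def flesch_kincaid_grade(words, sentences, syllables):
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words, sentences, complex_words):
    # "Complex" words are those with three or more syllables.
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def coleman_liau(letters, words, sentences):
    L = 100 * letters / words      # average letters per 100 words
    S = 100 * sentences / words    # average sentences per 100 words
    return 0.0588 * L - 0.296 * S - 15.8

# Example counts for a short passage (invented for illustration):
print(round(flesch_kincaid_grade(100, 5, 150), 2))  # → 9.91
print(round(gunning_fog(100, 5, 10), 2))            # → 12.0
print(round(coleman_liau(450, 100, 5), 2))          # → 9.18
```

All three reduce complexity to sentence length and word shape, which is why they miss the domain-specific and inferential dimensions research papers are hard on.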

Methodology
Corpus: Sealed research paper corpus
Baselines: Flesch-Kincaid, Gunning Fog, Coleman-Liau
Pipeline: Multi-dimensional complexity scoring
Metric: Within-1-tier accuracy
Dimensions: Vocabulary, domain-specificity, argument structure, inferential load
Reproducibility: Full instructions in GitHub repo
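A hypothetical sketch of how multi-dimensional scoring can map the four listed dimensions to a tier. Only the dimension names come from the methodology table; the weights, score range, and tier cut-offs are invented for illustration:

```python
# Hypothetical: combine per-dimension scores in [0, 1] into a tier 1-5.
# Weights and cut-offs are NOT from COMPASS; they are placeholders.

TIER_CUTOFFS = [0.2, 0.4, 0.6, 0.8]  # assumed boundaries for tiers 1-5

DIMS = ["vocabulary", "domain_specificity",
        "argument_structure", "inferential_load"]

def assign_tier(scores, weights=None):
    """Weighted sum of dimension scores, thresholded into a tier."""
    weights = weights or {d: 0.25 for d in DIMS}  # equal weights (assumed)
    combined = sum(scores[d] * weights[d] for d in DIMS)
    return 1 + sum(combined >= c for c in TIER_CUTOFFS)

paper = {"vocabulary": 0.9, "domain_specificity": 0.8,
         "argument_structure": 0.7, "inferential_load": 0.85}
print(assign_tier(paper))  # → 5
```

The point of the multi-dimensional shape is that a paper can score high on inferential load even when its sentence-level surface statistics look moderate.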
Reproducibility
Corpus: 15 research papers (sealed)
Metric: Within-1-tier classification
Repo: github.com/jourdanlabs/benchmarks/compass
Limitations

Within-1-tier, not exact. The metric counts within-1-tier matches, not exact matches; exact-match accuracy is lower and is documented in the repo.
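The distinction this limitation draws can be made concrete. A sketch of both accuracies over integer tier labels; the tier data here is invented:

```python
# Within-1-tier (the headline metric) versus exact-match accuracy.

def exact_accuracy(pred, truth):
    return sum(p == g for p, g in zip(pred, truth)) / len(truth)

def within_one_accuracy(pred, truth):
    return sum(abs(p - g) <= 1 for p, g in zip(pred, truth)) / len(truth)

truth = [4, 5, 5, 3, 4]
preds = [4, 4, 5, 3, 5]            # two predictions are off by one tier
print(exact_accuracy(preds, truth))       # → 0.6
print(within_one_accuracy(preds, truth))  # → 1.0
```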

English-only. The corpus and pipeline are English-language only; multilingual calibration is out of scope.

Domain coverage. The tier system was designed around the benchmark corpus's document types; novel document types may degrade calibration.