
LENS

Intent-based semantic search for dense technical corpora.

vs grep (intent queries): 25×
Primary metric: P@5
Version: v0.3 / v0.4

Note on v0.4: LENS v0.4 results were pending at time of publication. The 25× advantage over grep comes from the v0.3 intent-query evaluation and is stated conservatively; v0.4 results may revise it upward.

What it is

LENS answers the question that grep cannot: "find me documents about intent X" rather than "find me documents containing string Y." For dense technical corpora — regulatory filings, clinical documentation, security playbooks, legal contracts — intent-based retrieval is the difference between finding what you need and finding everything that happens to share a keyword.

The LENS pipeline retrieves in multiple stages: query understanding, candidate expansion, semantic relevance scoring, and AURORA gating on retrieval confidence. Results are ranked by a deterministic scoring function that weights document structure, semantic alignment, and source authority.
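The ranking and gating steps described above can be illustrated with a minimal sketch. It is not the LENS implementation: the weights, the confidence threshold, the `Candidate` fields, and the function names are all hypothetical, and the sketch assumes the scoring function is a plain weighted sum and that gating is a simple cutoff on a confidence value, neither of which this page confirms.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold; the real values are not published here.
WEIGHTS = {"structure": 0.2, "semantic": 0.6, "authority": 0.2}
CONFIDENCE_THRESHOLD = 0.5


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    structure: float   # document-structure signal in [0, 1]
    semantic: float    # semantic-alignment signal in [0, 1]
    authority: float   # source-authority signal in [0, 1]
    confidence: float  # retrieval confidence in [0, 1]


def score(c: Candidate) -> float:
    """Deterministic weighted sum of the three ranking signals."""
    return (WEIGHTS["structure"] * c.structure
            + WEIGHTS["semantic"] * c.semantic
            + WEIGHTS["authority"] * c.authority)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Drop candidates below the confidence threshold, then sort by score.

    Ties are broken by doc_id so the output order is fully deterministic.
    """
    gated = [c for c in candidates if c.confidence >= CONFIDENCE_THRESHOLD]
    return sorted(gated, key=lambda c: (-score(c), c.doc_id))
```

In this sketch a candidate with strong signals but low retrieval confidence never reaches the ranked list, which is the intended effect of gating before ranking.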

The 25× advantage over grep is measured on intent queries. On exact-string queries, LENS and grep perform comparably. The benchmark is honest about this: grep wins on exact matches; LENS wins on everything else.
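For readers unfamiliar with the primary metric, P@5 (precision at 5) is the fraction of the top five returned results that are relevant to the query. The sketch below assumes binary relevance judgments and a fixed denominator of k; the function names and document ids are illustrative, and how the benchmark aggregates P@5 into the 25× figure is not specified on this page.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k results that are relevant.

    The denominator is always k, so returning fewer than k results
    is penalized rather than rewarded.
    """
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def mean_precision_at_k(runs, k=5):
    """Average P@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(precision_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Toy data only: two relevant documents appear in the top five.
toy_ranking = ["d1", "d2", "d3", "d4", "d5", "d6"]
toy_relevant = {"d1", "d3", "d6"}
toy_p5 = precision_at_k(toy_ranking, toy_relevant)  # 2 hits out of 5
```

Comparing two systems then means running both over the same intent queries and relevance judgments and comparing their mean P@5.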


Limitations

v0.4 results pending. The v0.4 evaluation was in progress at publication time. The 25× figure is from v0.3. This page will be updated with the sealed v0.4 result when it lands.

Corpus-specific performance. LENS performance varies with corpus density and the distribution of query types. The 25× advantage is measured on the sealed benchmark corpus; performance on novel corpora has not been characterized.

No reranking ablation published. The contribution of each pipeline stage has not been published as a separate ablation. The full methodology is documented in the repo.

GitHub → | Reproducibility guide →