Hybrid search
Hybrid search combines two retrieval methods:
- Dense retrieval — cosine similarity on embedding vectors. Good at finding semantically similar content ("documents about climate change").
- Sparse retrieval — BM25 keyword scoring. Good at finding exact matches ("GPT-4o", "SKU-4421", "RFC 2616", proper nouns, codes, IDs).
The combination usually outperforms either method alone, because each covers the other's blind spot. The merge strategy is Reciprocal Rank Fusion (RRF): a simple formula with a single constant, k, that combines the ranked lists from both retrievers without per-retriever weights to tune.
It lives in the LogicGrid.Memory.Search namespace.
Hybrid search currently only works with InMemoryVectorStore. Support
for external stores is on the roadmap.
Use it
using LogicGrid.Memory.Search;
using LogicGrid.Memory.VectorStores;
var inner = new InMemoryVectorStore();
var hybrid = new HybridVectorStore(inner);
// HybridVectorStore implements IVectorStore, so it is a drop-in replacement.
// Use it like a regular vector store; it maintains the BM25 index automatically.
// `vec` is the embedding of the document text; `embedder` is your configured embedder.
await hybrid.UpsertAsync(new VectorDocument("d1", "...", vec));
// Then call HybridSearchAsync for combined retrieval.
var results = await hybrid.HybridSearchAsync(
    query: "How do I configure SKU-4421?",
    queryVector: await embedder.EmbedAsync("How do I configure SKU-4421?"),
    topK: 5);
foreach (var r in results)
{
    Console.WriteLine(
        $"[RRF:{r.RrfScore:F4} dense:{r.DenseScore:F2} sparse:{r.SparseScore:F2}] " +
        $"{r.Document.Text}");
}
Convenience overload — auto-embed the query
var results = await hybrid.HybridSearchAsync(
    query: "How do I configure SKU-4421?",
    embedder: embedder,
    topK: 5);
With a RagPipeline
var pipeline = new RagPipeline(embedder, hybrid);
await pipeline.IngestAsync("./docs/architecture.md");
var results = await pipeline.HybridSearchAsync(
    "How do I configure Qdrant?",
    topK: 5);
If you build the pipeline with a plain IVectorStore, HybridSearchAsync
falls back to dense-only retrieval. Your code keeps working, but results are ranked by embedding similarity alone, with no BM25 contribution.
Why hybrid wins
Dense embeddings capture meaning but blur exact tokens. BM25 captures exact tokens but doesn't understand meaning. Hybrid catches both:
| Query | Dense alone | BM25 alone | Hybrid |
|---|---|---|---|
| "documents about deployment" | ✅ Good | ❌ Misses synonyms | ✅ Good |
| "RFC 2616" | ❌ May miss | ✅ Finds it | ✅ Finds it |
| "billing question" | ✅ Good | ⚠️ Only literal "billing" | ✅ Best of both |
How RRF merges results
For each document the formula is:
RRF(doc) = 1/(k + rank_dense) + 1/(k + rank_sparse)
where rank_dense and rank_sparse are 1-indexed positions in the
respective result lists. If a document is missing from one list, its rank
there is treated as ∞, so that term contributes 0. The constant k defaults
to 60, the value used in the original RRF paper, which holds up well across
most retrieval scenarios.
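The formula can be sketched in a few lines of plain C#. This is an illustrative stand-alone version, not LogicGrid's implementation: the `RrfSketch` class and `Fuse` method are hypothetical names, and the sketch assumes each input list holds unique document IDs ordered best first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of LogicGrid.
public static class RrfSketch
{
    // Fuses two ranked ID lists (best first) into one list ordered by RRF score.
    public static List<KeyValuePair<string, double>> Fuse(
        IReadOnlyList<string> denseRanked,
        IReadOnlyList<string> sparseRanked,
        int k = 60)
    {
        var scores = new Dictionary<string, double>();

        void Accumulate(IReadOnlyList<string> ranked)
        {
            for (int i = 0; i < ranked.Count; i++)
            {
                int rank = i + 1; // ranks are 1-indexed
                // A missing key leaves `current` at 0, which matches the
                // "rank = infinity contributes 0" rule for absent documents.
                scores.TryGetValue(ranked[i], out double current);
                scores[ranked[i]] = current + 1.0 / (k + rank);
            }
        }

        Accumulate(denseRanked);
        Accumulate(sparseRanked);

        // Highest fused score first; ties broken by ID for a stable order.
        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

With dense ranking `a, b, c` and sparse ranking `c, a, d`, the fused order is `a, c, b, d`: documents that appear in both lists accumulate two terms and rise above those that appear in only one.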
You almost never need to tune k. If you do:
var hybrid = new HybridVectorStore(
    inner,
    new HybridSearchOptions { RrfK = 90 });
Each result includes individual scores
public sealed class HybridSearchResult
{
    public VectorDocument Document { get; }
    public double RrfScore { get; }      // combined score; use for ranking
    public float DenseScore { get; }     // 0 if not in dense results
    public double SparseScore { get; }   // 0 if not in sparse results
    public bool InDenseResults { get; }
    public bool InSparseResults { get; }
}
Useful for debugging: see whether a hit came from dense, sparse, or both.
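One way to make that visible in logs is to turn the two flags into a label. The helper below is a hypothetical sketch (`HitOrigin.Describe` is not a LogicGrid API); it only depends on the two boolean properties shown above.

```csharp
using System;

// Hypothetical helper for illustration; not part of LogicGrid.
public static class HitOrigin
{
    // Maps the two membership flags to a short label for log output.
    public static string Describe(bool inDense, bool inSparse) =>
        (inDense, inSparse) switch
        {
            (true, true) => "both",
            (true, false) => "dense-only",
            (false, true) => "sparse-only",
            _ => "neither",
        };
}
```

Inside the result loop you could then write `HitOrigin.Describe(r.InDenseResults, r.InSparseResults)`. A run of "sparse-only" hits for ID-style queries suggests the embeddings are blurring those tokens and BM25 is doing the work.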
When to use hybrid
✅ Use it when:
- Users search by literal terms ("error 500", "v2.4.1", "PROD-1234").
- You're going to production — the cost is one BM25 index in memory.
❌ Don't bother when:
- Your corpus is tiny and recall is fine with dense alone.
- You're prototyping — InMemoryVectorStore + dense is the fastest start.
Tuning
var hybrid = new HybridVectorStore(
    inner,
    new HybridSearchOptions
    {
        RrfK = 60,                // RRF constant, default 60
        CandidateMultiplier = 3,  // fetch topK*3 candidates from each retriever
        Bm25K1 = 1.5,             // BM25 term saturation
        Bm25B = 0.75,             // BM25 length normalization
    });
The defaults work well for most corpora. Tune only when you have a measurable retrieval-quality issue.
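To see what Bm25K1 and Bm25B actually control, here is a sketch of the textbook BM25 score for a single query term in a single document. It is an assumption that LogicGrid uses this exact variant: the source does not say which IDF form it implements, and `Bm25Sketch.TermScore` is a hypothetical name. The IDF here is the common non-negative form, ln(1 + (N - n + 0.5) / (n + 0.5)).

```csharp
using System;

// Hypothetical helper for illustration; not part of LogicGrid.
public static class Bm25Sketch
{
    // Scores one query term against one document.
    //   k1: term saturation. Higher values let repeated terms keep adding score.
    //   b:  length normalization. 0 ignores document length, 1 normalizes fully.
    public static double TermScore(
        int termFrequency, int docLength, double avgDocLength,
        int docCount, int docsWithTerm,
        double k1 = 1.5, double b = 0.75)
    {
        if (termFrequency <= 0) return 0.0;

        // Rare terms (small docsWithTerm) get a larger IDF weight.
        double idf = Math.Log(
            1.0 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));

        // Documents longer than average are penalized in proportion to b.
        double lengthNorm = 1.0 - b + b * (docLength / avgDocLength);

        return idf * (termFrequency * (k1 + 1.0))
                   / (termFrequency + k1 * lengthNorm);
    }
}
```

Two properties are worth noting. Ten occurrences of a term score higher than one occurrence but far less than ten times as much, which is the saturation that k1 governs. And with b = 0, a 200-token document and a 100-token document with the same term frequency score identically.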