Hybrid search

Name: LogicGrid
Author: LogicGrid

Hybrid search combines two retrieval methods:

Dense retrieval — cosine similarity on embedding vectors. Good at finding semantically similar content ("documents about climate change").
Sparse retrieval — BM25 keyword scoring. Good at finding exact matches ("GPT-4o", "SKU-4421", "RFC 2616", proper nouns, codes, IDs).
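
To make the dense side concrete, here is a minimal sketch of cosine similarity between two embedding vectors. The class and method names are hypothetical and are not part of LogicGrid; the library's own similarity code may differ.

```csharp
using System;

// Hypothetical helper for illustration; not a LogicGrid type.
public static class DenseSimilarity
{
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Returns 0 when either vector has zero length, to avoid dividing by zero.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot   += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Vectors pointing the same way score 1, orthogonal vectors score 0, and opposite vectors score -1, regardless of their magnitudes.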

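The combination typically outperforms either method alone, because each retriever covers the other's blind spots. The merge strategy is Reciprocal Rank Fusion (RRF): a simple formula that combines the ranked lists from both retrievers using only rank positions, so there are no per-retriever weights to tune. Its single constant, k, has a default that rarely needs changing.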

It lives in the LogicGrid.Memory.Search namespace.

Current limitation

Hybrid search currently only works with InMemoryVectorStore. Support for external stores is on the roadmap.

Use it

using LogicGrid.Memory.Search;
using LogicGrid.Memory.VectorStores;

var inner  = new InMemoryVectorStore();
var hybrid = new HybridVectorStore(inner);

// HybridVectorStore implements IVectorStore — drop-in replacement.
// Use it like a regular vector store; it maintains the BM25 index automatically.

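// `vec` is assumed to be the embedding of the document text, produced by the same `embedder` used below.
await hybrid.UpsertAsync(new VectorDocument("d1", "...", vec));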

// Then call HybridSearchAsync for combined retrieval.
var results = await hybrid.HybridSearchAsync(
    query: "How do I configure SKU-4421?",
    queryVector: await embedder.EmbedAsync("How do I configure SKU-4421?"),
    topK: 5);

foreach (var r in results)
{
    Console.WriteLine(
        $"[RRF:{r.RrfScore:F4} dense:{r.DenseScore:F2} sparse:{r.SparseScore:F2}] " +
        $"{r.Document.Text}");
}

Convenience overload — auto-embed the query

var results = await hybrid.HybridSearchAsync(
    query: "How do I configure SKU-4421?",
    embedder: embedder,
    topK: 5);

With a RagPipeline

var pipeline = new RagPipeline(embedder, hybrid);
await pipeline.IngestAsync("./docs/architecture.md");

var results = await pipeline.HybridSearchAsync(
    "How do I configure Qdrant?",
    topK: 5);

If you build the pipeline with a plain IVectorStore, HybridSearchAsync falls back to dense-only retrieval: your code keeps working, but results are ranked by vector similarity alone, with no BM25 component.

Why hybrid wins

Dense embeddings capture meaning but blur exact tokens. BM25 captures exact tokens but doesn't understand meaning. Hybrid catches both:

Query	Dense alone	BM25 alone	Hybrid
"documents about deployment"	✅ Good	❌ Misses synonyms	✅ Good
"RFC 2616"	❌ May miss	✅ Finds it	✅ Finds it
"billing question"	✅ Good	⚠️ Only literal "billing"	✅ Best of both

How RRF merges results

For each document the formula is:

RRF(doc) = 1/(k + rank_dense) + 1/(k + rank_sparse)

where rank_dense and rank_sparse are 1-indexed positions in the respective result lists. If a document is missing from one list, its rank there is treated as ∞, so that term contributes 0. The constant k defaults to 60, the value used in the original RRF paper, which is robust across most retrieval scenarios.
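
The formula can be sketched in a few lines. The following is a minimal illustration, assuming each retriever returns document IDs ordered best first. RrfSketch and Fuse are hypothetical names, not LogicGrid APIs, and the library's internal implementation may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not a LogicGrid type.
public static class RrfSketch
{
    // Fuses two ranked lists of document IDs (best first) into RRF scores.
    // A document missing from a list gets no contribution from that list,
    // which is equivalent to treating its rank there as infinity.
    public static List<KeyValuePair<string, double>> Fuse(
        IReadOnlyList<string> denseRanked,
        IReadOnlyList<string> sparseRanked,
        int k = 60)
    {
        var scores = new Dictionary<string, double>();

        void Accumulate(IReadOnlyList<string> ranked)
        {
            for (int i = 0; i < ranked.Count; i++)
            {
                int rank = i + 1; // ranks are 1-indexed
                scores.TryGetValue(ranked[i], out double current);
                scores[ranked[i]] = current + 1.0 / (k + rank);
            }
        }

        Accumulate(denseRanked);
        Accumulate(sparseRanked);

        // Highest fused score first; ties broken by ID for a stable order.
        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

With dense results a, b, c and sparse results b, d, the fused order is b, a, d, c. Document b appears in both lists and scores 1/62 + 1/61, which beats a's single contribution of 1/61 even though a is ranked first by the dense retriever.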

You almost never need to tune k. If you do:

var hybrid = new HybridVectorStore(
    inner,
    new HybridSearchOptions { RrfK = 90 });

Each result includes individual scores

public sealed class HybridSearchResult
{
    public VectorDocument Document { get; }
    public double RrfScore { get; }       // combined — use for ranking
    public float DenseScore { get; }      // 0 if not in dense results
    public double SparseScore { get; }    // 0 if not in sparse results
    public bool InDenseResults { get; }
    public bool InSparseResults { get; }
}

Useful for debugging: see whether a hit came from dense, sparse, or both.
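
As one way to use the flags while debugging, the sketch below labels each hit by origin and counts hits per label. To stay self-contained it uses a hypothetical stand-in record, Hit, that carries only the two boolean flags shown above. With the real library you would read the same flags from HybridSearchResult.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with only the two flags from HybridSearchResult.
public sealed record Hit(string Id, bool InDenseResults, bool InSparseResults);

// Hypothetical helper for illustration; not a LogicGrid type.
public static class HitOrigin
{
    // Maps the two flags to a readable label.
    public static string Describe(Hit hit) =>
        (hit.InDenseResults, hit.InSparseResults) switch
        {
            (true, true)  => "both",
            (true, false) => "dense only",
            (false, true) => "sparse only",
            _             => "neither",
        };

    // Counts hits per origin, e.g. to spot a retriever that never contributes.
    public static Dictionary<string, int> CountByOrigin(IEnumerable<Hit> hits) =>
        hits.GroupBy(h => Describe(h))
            .ToDictionary(g => g.Key, g => g.Count());
}
```

If the "sparse only" count is always zero for keyword-heavy queries, that is a hint that the BM25 side is not contributing and is worth inspecting.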

When to use hybrid

✅ Use it when:

Users search by literal terms ("error 500", "v2.4.1", "PROD-1234").
You're going to production — the cost is one BM25 index in memory.

❌ Don't bother when:

Your corpus is tiny and recall is fine with dense alone.
You're prototyping — InMemoryVectorStore + dense is the fastest start.

Tuning

var hybrid = new HybridVectorStore(
    inner,
    new HybridSearchOptions
    {
        RrfK = 60,                      // RRF constant, default 60
        CandidateMultiplier = 3,        // fetch topK*3 candidates from each retriever
        Bm25K1 = 1.5,                   // BM25 term saturation
        Bm25B = 0.75,                   // BM25 length normalization
    });

The defaults work well for most corpora. Tune only when you have a measurable retrieval-quality issue.
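
To give a feel for what Bm25K1 and Bm25B control, here is a sketch of the textbook BM25 score for a single query term in a single document. It assumes a common IDF variant, ln(1 + (N - n + 0.5) / (n + 0.5)). The exact formula LogicGrid uses is not stated here, and Bm25Sketch and TermScore are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration; not a LogicGrid type.
public static class Bm25Sketch
{
    // Score contribution of one query term in one document.
    //   k1: term-frequency saturation (higher = repeated terms keep adding score for longer)
    //   b : length normalization (0 = ignore document length, 1 = full normalization)
    public static double TermScore(
        int termFrequency, int docLength, double avgDocLength,
        int totalDocs, int docsWithTerm,
        double k1 = 1.5, double b = 0.75)
    {
        // Assumed IDF variant; rarer terms get a larger weight.
        double idf = Math.Log(1.0 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));

        // Documents longer than average are penalized in proportion to b.
        double lengthNorm = 1.0 - b + b * (docLength / avgDocLength);

        return idf * (termFrequency * (k1 + 1.0)) / (termFrequency + k1 * lengthNorm);
    }
}
```

Two properties follow from the formula: the gain from each extra occurrence of a term shrinks as the term frequency grows, and with b = 0 a long document scores the same as a short one with the same term frequency.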
