When Keyword Search Isn't Enough

Semantic search, hybrid search, and conversation search. Three embedding providers, three search modes, real benchmark results.

Keyword search answers one question: “does this string appear in the code?” That’s powerful. But it misses an entire class of query.

“How does authentication work?” has no single keyword. “What’s the caching strategy?” could match a dozen implementations. “Why did we add rate limiting here?” can’t be answered by any text search at all.

CodeSift’s semantic search answers these queries using embeddings.

Three embedding providers

| Env variable | Provider | Model | Notes |
| --- | --- | --- | --- |
| CODESIFT_VOYAGE_API_KEY | Voyage AI | voyage-code-3 | Best quality for code |
| CODESIFT_OPENAI_API_KEY | OpenAI | text-embedding-3-small | ~$0.02/1M tokens |
| CODESIFT_OLLAMA_URL | Ollama (local) | nomic-embed-text | Free, runs locally |
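
To make the configuration concrete, here is a minimal sketch of how a tool could pick a provider from these environment variables. The `pick_provider` helper and the precedence (table order, first configured variable wins) are assumptions for illustration; CodeSift's actual selection logic may differ.

```python
import os
from typing import Mapping, Optional, Tuple

# (env variable, provider, model), copied from the table above.
PROVIDERS = [
    ("CODESIFT_VOYAGE_API_KEY", "Voyage AI", "voyage-code-3"),
    ("CODESIFT_OPENAI_API_KEY", "OpenAI", "text-embedding-3-small"),
    ("CODESIFT_OLLAMA_URL", "Ollama (local)", "nomic-embed-text"),
]


def pick_provider(env: Mapping[str, str] = os.environ) -> Optional[Tuple[str, str]]:
    """Return (provider, model) for the first configured variable, or None.

    Assumption: table order doubles as precedence. This is hypothetical,
    not documented CodeSift behavior.
    """
    for var, provider, model in PROVIDERS:
        if env.get(var, "").strip():  # treat empty or whitespace-only as unset
            return provider, model
    return None


if __name__ == "__main__":
    print(pick_provider({"CODESIFT_OLLAMA_URL": "http://localhost:11434"}))
```

Passing the environment as a plain mapping keeps the helper easy to test without touching the real process environment.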

Three search modes

Semantic — pure embedding similarity. Best for concept queries.

{ "type": "semantic", "query": "error handling and retry logic", "top_k": 10 }
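
As a rough illustration of what "pure embedding similarity" means, the sketch below ranks pre-computed vectors against a query vector by cosine similarity and keeps the best `top_k`. Cosine similarity, the three-dimensional toy vectors, and the file names are assumptions for the example; real embeddings have hundreds or thousands of dimensions, and CodeSift's internal scoring is not described here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy index: each file is represented by a made-up embedding.
toy_index = {
    "retry.ts": [0.9, 0.1, 0.0],
    "logger.ts": [0.1, 0.9, 0.0],
    "math.ts": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    for doc_id, score in semantic_search([1.0, 0.2, 0.0], toy_index, top_k=2):
        print(f"{doc_id}: {score:.3f}")
```

Because the score depends only on vector direction, a chunk can rank highly without sharing a single keyword with the query.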

Hybrid — semantic + BM25 merged via Reciprocal Rank Fusion (RRF, k=60). Best for most real queries.

{ "type": "hybrid", "query": "caching strategy", "top_k": 10 }
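
Reciprocal Rank Fusion can be sketched in a few lines: each document scores the sum of 1 / (k + rank) over every ranked list it appears in, with ranks starting at 1 and k = 60 as stated above. The function name, the tie-break on document id, and the sample file names are assumptions for illustration, not CodeSift's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Fuse ranked lists of doc ids; score(d) = sum of 1 / (k + rank_i(d))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    fused = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return fused[:top_k]


# Hypothetical result lists from the two retrievers.
semantic_hits = ["cache.ts", "redis.ts", "http.ts"]
bm25_hits = ["redis.ts", "lru.ts", "cache.ts"]

if __name__ == "__main__":
    for doc_id, score in reciprocal_rank_fusion([semantic_hits, bm25_hits]):
        print(f"{doc_id}: {score:.5f}")
    # redis.ts comes out on top because it ranks near the top of both lists.
```

RRF uses only rank positions, so the raw semantic and BM25 scores never have to be put on a common scale before merging.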

Conversation — search past AI session history by concept.

{ "type": "conversation", "query": "why we chose Redis over Postgres cache" }

Benchmark results

On a 4,127-file TypeScript codebase, 10 conceptual questions rated on a 1-10 scale:

  • CodeSift: 7.8/10 average quality
  • Native (grep-based): 6.5/10 average quality
  • Improvement: +1.3 points, or +20% relative to the grep-based baseline (1.3 / 6.5)

When to use semantic vs keyword

| Query type | Best mode |
| --- | --- |
| “Find function named X” | search_symbols (keyword) |
| “Find all TODO comments” | search_text (keyword) |
| “How does authentication work?” | assemble_context + semantic |
| “What’s our caching strategy?” | codebase_retrieval hybrid |
| “Why did we add this middleware?” | search_conversations |
| “Find code similar to this pattern” | codebase_retrieval semantic |