How it works
Three pillars. One MCP server.
CodeSift sits between your AI agent and your codebase. It parses code into an AST index, ranks results with BM25F, and bridges to language servers — delivering 61% fewer tokens while giving agents capabilities grep cannot provide.
Without CodeSift
rg "auth" — raw text matches
rg "middleware" — more raw matches

With CodeSift

assemble_context("auth", level="L1")

The three pillars
AST Index
Tree-sitter parses every file into an abstract syntax tree. Symbols are extracted, scored by centrality (how often imported), and indexed with BM25F for fast, ranked retrieval.
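The ranking idea can be sketched as field-weighted BM25 (BM25F) with a centrality boost. Everything below — field names, weights, the boost formula — is illustrative, not CodeSift's actual internals, and length normalization is omitted for brevity:

```typescript
// Sketch: BM25F-style scoring of symbol records, plus a centrality boost.
interface SymbolDoc {
  name: string;       // symbol name field (weighted highest)
  body: string;       // surrounding code/doc text
  centrality: number; // e.g. how often the symbol is imported
}

const FIELD_WEIGHTS = { name: 3.0, body: 1.0 }; // assumed weights
const K1 = 1.2;

function tokenize(s: string): string[] {
  return s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function score(query: string, docs: SymbolDoc[]) {
  const terms = tokenize(query);
  const N = docs.length;
  // document frequency: how many docs contain the term in any field
  const df = new Map<string, number>();
  for (const t of terms) {
    df.set(t, docs.filter(d => tokenize(d.name + " " + d.body).includes(t)).length);
  }
  return docs
    .map(doc => {
      let s = 0;
      for (const t of terms) {
        // field-weighted term frequency — the "F" in BM25F
        const tf =
          FIELD_WEIGHTS.name * tokenize(doc.name).filter(x => x === t).length +
          FIELD_WEIGHTS.body * tokenize(doc.body).filter(x => x === t).length;
        const idf = Math.log(1 + (N - df.get(t)! + 0.5) / (df.get(t)! + 0.5));
        s += idf * (tf / (tf + K1)); // BM25 saturation
      }
      return { doc, score: s * (1 + 0.1 * doc.centrality) }; // illustrative boost
    })
    .sort((a, b) => b.score - a.score);
}
```

Calling `score("auth", docs)` returns documents ranked so that a symbol literally named for the query outranks one that merely mentions it in its body.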
Semantic Search
Embeddings turn code into vectors. Questions like "how does authentication work?" find relevant code by meaning, not just matching keywords. Hybrid mode merges semantic + BM25 via Reciprocal Rank Fusion.
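Reciprocal Rank Fusion itself is a few lines: each list contributes `1 / (k + rank)` per document, and the sums decide the merged order. A minimal sketch (the symbol names are made up; `k = 60` is the conventional constant):

```typescript
// Reciprocal Rank Fusion: merge ranked lists of ids into one ranking.
function rrfMerge(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // ranks are 1-based; each list contributes 1 / (k + rank)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

const semantic = ["verifyToken", "loginHandler", "hashPassword"];
const bm25 = ["loginHandler", "authRouter", "verifyToken"];
// "loginHandler" ranks high in both lists, so it fuses to the top.
console.log(rrfMerge([semantic, bm25]));
```

The appeal of RRF is that it needs no score calibration: semantic similarity and BM25 scores live on incomparable scales, but ranks are always comparable.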
LSP Bridge
When a language server is available, CodeSift proxies type-aware operations: resolved definitions, hover types, cross-file rename. Lazy start, 5-minute idle kill, zero overhead when unused.
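The lifecycle can be sketched as lazy start plus an idle reaper. This is an assumed shape, not CodeSift's code — `spawnServer`/`killServer` are stand-ins, and time is passed in explicitly to keep the sketch deterministic:

```typescript
// Sketch: lazy-start + 5-minute idle shutdown for a language-server process.
const IDLE_MS = 5 * 60 * 1000;

class LspBridge {
  private running = false;
  private lastUsed = 0;

  constructor(
    private spawnServer: () => void,
    private killServer: () => void,
  ) {}

  // Called per request: the server starts only on first real use.
  request(now: number): void {
    if (!this.running) {
      this.spawnServer();
      this.running = true;
    }
    this.lastUsed = now;
  }

  // Called periodically: kills the server after 5 idle minutes.
  reap(now: number): void {
    if (this.running && now - this.lastUsed >= IDLE_MS) {
      this.killServer();
      this.running = false;
    }
  }

  get isRunning(): boolean {
    return this.running;
  }
}
```

Until `request()` is first called nothing is spawned, which is what "zero overhead when unused" amounts to.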
Two commands. Thirty seconds.
Install globally
$ npm install -g codesift-mcp

One binary. No cloud. No signup. MIT licensed.
Add to your MCP config
{
"mcpServers": {
"codesift": {
"command": "codesift-mcp"
}
}
}

Works with Claude Code, Cursor, Codex, and any MCP client.
Code normally
Your AI agent automatically discovers 64 MCP tools. It uses search_symbols instead of grep,
assemble_context instead of reading files, trace_route instead of guessing at endpoints.
You don't change how you work. The agent works better.
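Under the hood each tool use is an ordinary MCP `tools/call` JSON-RPC request. A sketch of what the client sends — the argument names here are assumptions, not CodeSift's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_symbols",
    "arguments": { "query": "auth" }
  }
}
```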
Measured savings
From real benchmarks across 188 agent sessions and 603 combo flow runs
| What the agent does | Native (grep/read) | CodeSift | Savings |
|---|---|---|---|
| Search for a symbol definition | ~57,000 tok | ~5,700 tok | -90% |
| Understand a feature ("how does auth work?") | ~93,000 tok | ~12,600 tok | -86% |
| Trace an HTTP route end-to-end | ~35,000 tok | ~61 tok | -99% |
| Scan for hardcoded secrets | ~1.6M tok | ~11,500 tok | -99% |
| Find unused exports (dead code) | 21 calls | 1 call | -82% |
| Real-world combo flows (13 sequences) | 4.58M tok | 1.86M tok | -61% |
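The per-row savings figures follow from a single ratio, `1 − (CodeSift tokens / native tokens)`, which can be checked against the table:

```typescript
// Savings column: percentage of tokens avoided versus the native approach.
const savings = (native: number, codesift: number) =>
  Math.round((1 - codesift / native) * 100);

console.log(savings(57_000, 5_700));   // symbol-definition row → 90
console.log(savings(93_000, 12_600));  // feature-understanding row → 86
```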