
search_text

Full-text BM25F search with auto-grouping, relevance-gap filtering, and a 30K-token hard cap. The most-used CodeSift tool across 188 real sessions.

  • Token reduction: −65%
  • Native baseline: rg (ripgrep)
  • Calls (CodeSift vs native): 1 vs 2
  • Tokens (CodeSift vs native): 5,700 vs 16,000

What It Does Differently

search_text is not a replacement for grep. It is a ranked search engine that happens to look at code. The difference matters when an AI agent is the caller, not a human scanning terminal output.

Grep returns every match. An agent working on a 4,000-file codebase that greps for prisma.$transaction gets back every occurrence in every file, ordered by directory traversal. The agent must then read all of that output, decide which results matter, and often re-grep with narrower scope. Two calls, double the token cost, and the agent still might miss the relevant result buried on line 847 of a flat listing.

search_text returns results ranked by BM25F relevance. The most important matches come first. Results are grouped by file with match counts, and a relevance-gap filter automatically cuts off results that score far below the top hits. The output stays under a hard cap of 30K tokens regardless of how many files match.
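To make "ranked by BM25F relevance" concrete, here is a minimal sketch of classic BM25 term scoring. It is illustrative only, not CodeSift's implementation: real BM25F additionally weights separate fields (for example, file path versus file body), which this sketch omits.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against query_terms with classic BM25.

    Illustrative only: BM25F (as described above) additionally applies
    per-field weights, which are omitted here.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores

docs = [
    ["prisma", "transaction", "begin", "commit"],  # matches both terms
    ["transaction", "log"],                        # matches one term
    ["user", "profile", "render"],                 # matches nothing
]
print(bm25_scores(["prisma", "transaction"], docs))
```

The point of ranking for an agent caller: the best match surfaces first, so the agent reads the top of the output instead of scanning an exhaustive flat list.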

The Safety Mechanisms

Three features prevent context window blowout:

Auto-grouping (group_by_file). Results are clustered by file path with surrounding context. Instead of seeing the same file path repeated 40 times in a flat match list, you see it once with all its matches beneath it.

Relevance-gap filtering. When the BM25F score of result N is less than 20% of the top result’s score, the remaining results are cut. This eliminates the long tail of marginally relevant matches that grep faithfully includes.

80K character safety cap. Even with grouping and gap filtering, the raw output is capped at 80,000 characters before any formatting. This prevents pathological queries from consuming the entire context window.
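A rough sketch of how the three mechanisms could compose. The function name and data shapes are hypothetical; only the 20% gap ratio and 80,000-character cap come from the description above.

```python
from itertools import groupby

GAP_RATIO = 0.2     # relevance-gap threshold, per the description above
CHAR_CAP = 80_000   # safety cap, per the description above

def shape_results(matches):
    """matches: list of (path, line_no, text, score), best score first.

    Illustrative sketch (not CodeSift's code): apply relevance-gap
    filtering, group matches under their file path, enforce the cap.
    """
    if not matches:
        return ""
    top = matches[0][3]
    # Relevance-gap filter: drop results scoring far below the top hit.
    kept = [m for m in matches if m[3] >= GAP_RATIO * top]
    out = []
    # Auto-grouping: one header per file, its matches listed beneath it.
    for path, group in groupby(kept, key=lambda m: m[0]):
        group = list(group)
        out.append(f"{path} ({len(group)} matches)")
        out.extend(f"  {ln}: {text}" for _, ln, text, _ in group)
    # Character safety cap on the assembled output.
    return "\n".join(out)[:CHAR_CAP]

matches = [
    ("src/db.ts", 12, "await prisma.$transaction(ops)", 9.1),
    ("src/db.ts", 40, "// wrap writes in $transaction", 8.7),
    ("docs/faq.md", 3, "transactions are atomic", 0.9),  # below the gap
]
print(shape_results(matches))
```

In this toy input, the docs/faq.md hit scores below 20% of the top result, so it is cut; the two src/db.ts hits collapse under a single file header.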

Benchmark Results

Tested across three production codebases with 10 identical search tasks:

Metric                     search_text    rg (ripgrep)
Total tokens (10 tasks)    48,930         72,993
Avg calls per task         1.1            2.9
Avg time per task          7s             9.6s
Tasks won                  5              0
Tasks tied                 5              5

The token reduction comes from two sources: ranked output means fewer irrelevant matches, and auto-grouping means less repetition in file paths and context lines. The call reduction comes from not needing follow-up searches to narrow scope.

Key Parameters

  • query (required) — the search string. Supports regex when prefixed with /pattern/.
  • file_pattern — glob filter (e.g., *.service.ts, src/**/*.tsx). Always pass this when you know the scope. Reduces tokens by 50% or more.
  • top_k — maximum number of results. Defaults to 20. Relevance-gap filtering may return fewer.
  • group_by_file — groups matches under their file path. Enabled by default. Disable it for a flat, one-line-per-match list.
  • context_lines — lines of surrounding code per match. Default 2. Set to 0 for compact output.
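Putting the parameters together, here is a hypothetical request payload. The document does not specify the exact call convention an agent uses, so the dict shape is an assumption; the parameter names and the /pattern/ regex prefix come from the list above.

```python
# Hypothetical search_text request payload (call convention assumed;
# parameter names taken from the list above).
request = {
    "query": "prisma.$transaction",
    "file_pattern": "src/**/*.service.ts",  # pass when the scope is known
    "top_k": 10,             # cap results; gap filter may return fewer
    "group_by_file": True,   # default: one header per file
    "context_lines": 0,      # compact output, no surrounding lines
}

# Regex queries use the /pattern/ prefix form described above.
regex_request = {**request, "query": "/prisma\\.\\$transaction\\(/"}
print(regex_request["query"])
```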

When to Use Something Else

Use search_symbols when looking for function or type definitions specifically. search_text finds string occurrences; search_symbols finds structural definitions with AST awareness.

Use search_patterns when looking for known anti-patterns like empty catch blocks or console.log in production code. The built-in patterns include false-positive exclusions that raw text search cannot match.

Use codebase_retrieval with type: "semantic" when the question is conceptual (“how does authentication work?”) rather than a specific text pattern. BM25F matches keywords; semantic search matches meaning.

Use rg directly when you need exhaustive, guaranteed-complete results. search_text has a top_k limit and relevance-gap cutoff that intentionally skip low-value matches. If you need every single occurrence of a string across the entire codebase with zero omissions, grep is the right tool.

In Combo Workflows

search_text appears in 11 of 13 measured combo flows from real agent usage data. The most common pattern is search_text followed by codebase_retrieval (or vice versa) — text search finds the specific code, batch retrieval fills in context. The st-sp-st pattern (search, scan for anti-patterns, search again) is common during code audit workflows.

Benchmark note

This benchmark compares CodeSift against the closest practical native workflow an agent would use for the same task. For some tools, that baseline is a direct shell equivalent such as rg or find. For AST-aware, graph-aware, and LSP-backed tools, the baseline is a multi-step workflow rather than a strictly identical command. Results should be read as agent-workflow comparisons: token cost, call count, and practical context efficiency.