Combo Flows: The 13 Tool Sequences Real Agents Use

N-gram analysis of 188 real agent sessions reveals 13 common tool sequences, benchmarked over 772 runs across 33 codebases. Aggregate: −61% token reduction, 70% win rate.

Benchmarking individual tools is useful. But AI agents don’t use tools individually — they use them in sequences. The real question isn’t “how efficient is search_text?” It’s “how efficient is the workflow?”

Methodology

We extracted tool call sequences from 188 real Claude Code agent sessions using n-gram analysis on usage logs. From hundreds of unique 2-step, 3-step, and 4-step sequences, we selected the 13 most frequent — each occurring at least 12 times across multiple sessions.
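The extraction step can be sketched as a sliding-window count over per-session tool-call lists. This is an illustrative reconstruction, not the actual pipeline; the log format and the function name are assumptions:

```python
from collections import Counter

def extract_ngrams(sessions, n_range=(2, 3, 4), min_count=12):
    """Count tool-call n-grams across sessions and keep the frequent ones.

    `sessions` is a list of per-session tool-name lists, e.g.
    [["search_text", "search_symbols", ...], ...].
    """
    counts = Counter()
    for calls in sessions:
        for n in n_range:
            # Slide a window of length n over the session's call list.
            for i in range(len(calls) - n + 1):
                counts[tuple(calls[i:i + n])] += 1
    # Keep only sequences that recur often enough to be a real pattern.
    return {seq: c for seq, c in counts.items() if c >= min_count}
```

The 13 sequences in this article are those occurring at least 12 times; a production version would also require each sequence to appear in multiple distinct sessions, not just 12 times in one.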

Each sequence was then benchmarked:

  • 772 total runs across 33 real-world TypeScript/React/NestJS codebases (50 to 4,100+ files each)
  • Both approaches used the same queries and evaluated on token consumption
  • Native baseline: the equivalent Bash workflow (grep, find, read) that an agent without CodeSift would use
  • Win = CodeSift consumed fewer tokens than native for the same task

Results summary

| Metric | Value |
| --- | --- |
| Total native tokens | 5,130,240 |
| Total CodeSift tokens | 1,994,825 |
| Aggregate reduction | −61% |
| Win rate | 542 / 772 (70%) |
| Sequences tested | 13 |
| Codebases used | 33 |
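These aggregates are internally consistent and easy to re-derive from the totals above:

```python
native, codesift = 5_130_240, 1_994_825

# Negative means CodeSift consumed fewer tokens than the native baseline.
reduction = (codesift - native) / native
win_rate = 542 / 772

print(f"{reduction:+.0%}")  # -61%
print(f"{win_rate:.0%}")    # 70%
```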

All 13 sequences

Tier 1: High reduction (−80% to −86%)

These sequences combine structured tools (symbol search, pattern matching) with text search. The structured tool identifies what’s relevant, text search finds how it’s used — without reading full files.

| Sequence | Description | Runs | Reduction | Win rate |
| --- | --- | --- | --- | --- |
| ss→st | Symbol discovery → usage search | 65 | −86% | 63% |
| pat→st→pat→st | Extended pattern investigation | 37 | −86% | 68% |
| pat→st→pat | Pattern-first investigation loop | 39 | −85% | 77% |
| st→ss | Text orient → symbol narrow | 58 | −84% | 67% |
| st→pat→st→pat | Text-first pattern investigation | 35 | −84% | 66% |
| st→ss→st | Text → symbol → text refinement | 27 | −81% | 67% |
| st→pat→st | Pattern sandwich: text bookended by patterns | 40 | −80% | 63% |

Why these win big: Symbol search (search_symbols) with detail_level="compact" returns ~15 tokens per result. Pattern search (search_patterns) uses AST matching, returning only structural matches. Both avoid dumping raw file content into the context window.
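The arithmetic behind that gap can be illustrated with the rough ~4-characters-per-token heuristic. The file paths, symbol name, and result strings below are invented purely for illustration:

```python
def est_tokens(text: str) -> int:
    """Rough token estimate using the common ~4 chars/token heuristic."""
    return max(1, len(text) // 4)

# A compact symbol result carries only location + kind + name...
compact = "src/auth/session.ts:42 fn validateSession"

# ...while grep emits every matching source line in full.
grep_lines = [
    "src/auth/session.ts:42:export async function validateSession(token: string): Promise<Session | null> {",
    "src/auth/guard.ts:17:  const session = await validateSession(req.headers.authorization);",
    "src/auth/guard.spec.ts:88:    expect(await validateSession(expired)).toBeNull();",
]

print(est_tokens(compact))  # 10
print(sum(est_tokens(line) for line in grep_lines))
```

With many usage sites, grep output grows with the length of every matched line, while compact symbol results stay near ~15 tokens each; that multiplier is where the −80% and better reductions come from.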

Tier 2: Good reduction (−68% to −76%)

File tree + search combinations. The agent maps the codebase structure first, then searches within it.

| Sequence | Description | Runs | Reduction | Win rate |
| --- | --- | --- | --- | --- |
| st→tree→st | Search → check structure → search again | 27 | −76% | 89% |
| tree→st | Map the codebase → search within it | 50 | −68% | 86% |

Why these are the most reliable: st→tree→st has the highest win rate of any sequence at 89%. The file tree tool (get_file_tree with compact=true) returns a flat path list with symbol counts, producing roughly a tenth of the output of find -type f. The agent learns the project layout cheaply, then targets its search.

Tier 3: Moderate reduction (−26% to −39%)

Retrieval-heavy sequences. codebase_retrieval batches multiple query types (text, semantic, symbols) into one call — powerful, but the batched output is denser than individual tools.

| Sequence | Description | Runs | Reduction | Win rate |
| --- | --- | --- | --- | --- |
| st→cr | Text search → batch follow-up queries | 91 | −39% | 81% |
| st→cr→st | Investigative loop: search, batch, refine | 41 | −39% | 83% |
| cr→st | Batch query first → targeted follow-up | 81 | −32% | 79% |
| cr→st→cr→st | Exploratory investigation (both expensive) | 12 | −26% | 58% |

Why these save less: codebase_retrieval returns structured results from multiple query types in one response. That’s already more efficient than 3-5 separate calls, but the response size is larger than a single targeted tool. The −26% outlier (cr→st→cr→st) represents truly exploratory investigation where the agent doesn’t know what it’s looking for — both approaches are expensive.
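A toy cost model makes the trade-off concrete. All numbers here are invented for illustration; only the shape of the comparison matters:

```python
def workflow_tokens(calls: int, per_call_overhead: int, payload: int) -> int:
    """Total cost = fixed per-call overhead (framing, tool schema) + returned payload."""
    return calls * (per_call_overhead + payload)

# Four separate targeted calls, each returning a small payload.
separate = workflow_tokens(calls=4, per_call_overhead=30, payload=120)  # 600

# One batched codebase_retrieval call: the overhead is paid once,
# but the combined response is denser than any single targeted result.
batched = workflow_tokens(calls=1, per_call_overhead=30, payload=420)   # 450

print(f"{(batched - separate) / separate:+.0%}")  # -25%
```

Batching still wins, but only on the overhead term, which is why Tier 3 sits in the −26% to −39% band rather than the −80s.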

Real-world examples

Example 1: Symbol discovery workflow (ss→st)

An agent investigating a function in an i18n platform:

  1. search_symbols("getLanguageName", detail_level="compact") — finds the function definition in 3 tokens
  2. search_text("getLanguageName", file_pattern="*.service.ts") — finds all usage sites

Native approach: grep -rn "function getLanguageName" + grep -rn "getLanguageName" --include="*.service.ts" — returns full matching lines with surrounding context.

| Approach | Tokens |
| --- | --- |
| Native (grep) | 6,386 |
| CodeSift | 3 |
| Reduction | −99.9% |

Example 2: File exploration workflow (tree→st)

An agent onboarding to a content management system:

  1. get_file_tree("src/", compact=true) — gets the full directory structure in a flat list
  2. search_text("article-generat") — finds the article generation logic

Native approach: find src/ -type f + grep -rn "article-generat" src/

| Approach | Tokens |
| --- | --- |
| Native (find + grep) | 5,852 |
| CodeSift | 2 |
| Reduction | −99.9% |

Example 3: Anti-pattern hunt (pat→st→pat)

An agent running a code quality check on a survey platform:

  1. search_patterns("toBeDefined") — finds all weak assertions in tests
  2. search_text(".toBeDefined(") — gets surrounding context for each match
  3. search_patterns("toBeDefined") — refined pattern search after context

Native approach: grep -rn "toBeDefined" --include="*.test.ts" × 3 with manual filtering

| Approach | Tokens |
| --- | --- |
| Native (grep) | 928 |
| CodeSift | 0 |
| Reduction | −100% |

(Zero tokens because the pattern search returned empty: there were no matches in this codebase, so CodeSift correctly returned nothing. The native three-pass workflow still produced 928 tokens of intermediate output before reaching the same conclusion.)

Example 4: Investigative workflow (st→cr)

An agent searching for permission logic in an enterprise API:

  1. search_text("CAN_MANAGE_ORGS") — finds the permission constant usage
  2. codebase_retrieval(queries=[{type:"references", symbol_name:"CAN_MANAGE_ORGS"}, {type:"semantic", query:"organization management permissions"}]) — batches follow-up

| Approach | Tokens |
| --- | --- |
| Native (grep + multiple reads) | 8,911 |
| CodeSift | 63 |
| Reduction | −99.3% |

The pattern that emerges

The sequences with the highest reduction share a structure: structured tools bookend text searches. The agent uses symbol search or pattern matching to identify what’s relevant, then text search to find how it’s used. This avoids the biggest token waste in native workflows: reading full file contents to extract a few lines of information.
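As a sketch, that bookend structure looks like the loop below. The `client` object and its return shapes are stand-ins for whatever tool bindings the agent actually has, not CodeSift's real API:

```python
from dataclasses import dataclass

@dataclass
class SymbolHit:
    path: str  # file containing the definition
    name: str  # symbol name

def investigate(client, symbol: str):
    """Bookend pattern: structured lookup first, targeted text search second."""
    # Step 1: the structured tool answers *what* is relevant,
    # at a few tokens per result.
    definitions = client.search_symbols(symbol, detail_level="compact")

    # Step 2: text search scoped to those files answers *how* it is used,
    # without loading full file contents into context.
    usages = []
    for d in definitions:
        usages.extend(client.search_text(symbol, file_pattern=d.path))
    return definitions, usages
```

The key property is that no step ever reads a whole file; every call returns only locations or matching lines.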

| Pattern | Avg reduction | Why it works |
| --- | --- | --- |
| Symbol + text | −83% | Compact symbol lookup (~15 tok/result) replaces grep’s full-line output |
| Pattern + text | −83% | AST matching returns structural hits, not string matches |
| Tree + text | −72% | Flat path list replaces find + wc -l combos |
| Retrieval + text | −34% | Batch queries save round-trips but responses are denser |

The honest outlier

cr→st→cr→st only achieves −26% with a 58% win rate. This pattern represents exploratory investigation — the agent doesn’t yet know what it’s looking for. CodeSift’s advantage is largest when agents have structured intent. When truly exploring, both approaches are expensive.

In the 30% of runs where native won, the margin was small and the task was exploratory. In the 70% where CodeSift won, margins of 80–99% were common.

All benchmark data collected 2026-03-30 from 188 real agent sessions. Scripts available in the CodeSift repository.