Analysis No native equivalent

`ast_query`

Name: CodeSift
Author: CodeSift

Query the AST directly using tree-sitter query syntax. Find code by shape, not text, including patterns that regex cannot express.

What Regex Cannot Do

Regular expressions match character patterns. They cannot understand code structure. A regex for “function with more than three parameters” would need to account for default values, type annotations, destructured parameters, multiline formatting, trailing commas, and generic constraints. The resulting pattern would be fragile, unreadable, and wrong in edge cases.

ast_query searches the parsed AST (abstract syntax tree) directly using tree-sitter query syntax. It matches structural patterns in the code, not character sequences. This makes it possible to express queries that are fundamentally impossible with regex.
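To make the fragility concrete, here is a minimal TypeScript sketch of a text-based parameter counter. The helper `naiveParamCount` and both sample signatures are hypothetical and exist only for illustration. The sketch shows how commas inside a default value or a generic type throw off a character-level count.

```typescript
// Naive approach: take the text between the first "(" and the last ")"
// and count comma-separated pieces. Hypothetical helper, illustration only.
function naiveParamCount(signature: string): number {
  const match = signature.match(/\(([\s\S]*)\)/);
  if (!match || match[1].trim() === "") return 0;
  return match[1].split(",").filter((part) => part.trim() !== "").length;
}

// Simple case: the text-based count happens to be right.
const simple = "function f(a, b, c)";

// Two real parameters, but the commas inside the default value and the
// generic type argument make the text-based count report four.
const tricky = "function g(a = [1, 2], b: Map<string, number>)";

console.log(naiveParamCount(simple)); // 3
console.log(naiveParamCount(tricky)); // 4, although g has only 2 parameters
```

A structural query avoids this class of error because the parser has already decided where each parameter begins and ends.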

Example Queries

Arrow functions that return a Promise:

(arrow_function
  return_type: (type_annotation
    (type_reference name: (identifier) @type))
  (#eq? @type "Promise"))

Methods with 3 or more parameters:

(method_definition
  parameters: (formal_parameters
    (required_parameter) @p1
    (required_parameter) @p2
    (required_parameter) @p3))

useEffect without dependency array:

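; The anchors (.) require the arrow function to be the only argument,
; so calls that also pass a dependency array do not match.
(call_expression
  function: (identifier) @fn
  arguments: (arguments
    . (arrow_function) .)
  (#eq? @fn "useEffect"))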

Catch blocks without logger call:

(catch_clause
  body: (statement_block) @body
  (#not-match? @body "logger"))

These queries express structural intent. They work regardless of formatting, naming conventions, or code style. A method with three parameters is matched whether the parameters are on one line or three, whether they have type annotations or not, whether they use default values or not.

How It Works

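CodeSift maintains tree-sitter parse trees for every indexed file. ast_query runs your query pattern against these trees and returns matching nodes with their source code, file location, and surrounding context. The query language is tree-sitter’s own S-expression pattern syntax, which is well-documented and has the same syntax across all supported languages. Node type names (such as arrow_function or method_definition) come from each language’s grammar, so a query written for one language may need different node names in another.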
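The idea of matching by shape can be sketched with a toy tree in TypeScript. This is not CodeSift's implementation and does not use the tree-sitter library. The `ToyNode` type, the `findAll` walker, and the hand-built tree are all assumptions made for illustration. The predicate mirrors the "methods with 3 or more parameters" query by inspecting node types and children rather than source text.

```typescript
// Toy syntax-tree node. Real tree-sitter nodes carry more (fields, ranges).
interface ToyNode {
  type: string;
  text?: string;
  children: ToyNode[];
}

const node = (type: string, children: ToyNode[] = [], text?: string): ToyNode =>
  ({ type, children, text });

// Walk the whole tree and collect every node that satisfies the predicate.
function findAll(root: ToyNode, pred: (n: ToyNode) => boolean): ToyNode[] {
  const hits: ToyNode[] = pred(root) ? [root] : [];
  return hits.concat(...root.children.map((c) => findAll(c, pred)));
}

// Structural check: a method_definition whose formal_parameters child
// holds at least three required_parameter nodes.
function hasThreeOrMoreParams(n: ToyNode): boolean {
  if (n.type !== "method_definition") return false;
  const params = n.children.find((c) => c.type === "formal_parameters");
  const count = params
    ? params.children.filter((c) => c.type === "required_parameter").length
    : 0;
  return count >= 3;
}

// Hand-built tree standing in for a parsed class body with two methods.
const param = (name: string) => node("required_parameter", [], name);
const tree = node("class_body", [
  node("method_definition", [
    node("property_identifier", [], "save"),
    node("formal_parameters", [param("id"), param("data"), param("opts")]),
  ]),
  node("method_definition", [
    node("property_identifier", [], "load"),
    node("formal_parameters", [param("id")]),
  ]),
]);

const matchedNames = findAll(tree, hasThreeOrMoreParams).map(
  (m) => m.children.find((c) => c.type === "property_identifier")?.text,
);
console.log(matchedNames); // [ 'save' ]
```

Because the check looks only at node types and child counts, source formatting never enters into it.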

What the Output Contains

Each match includes:

File path and line range — where the matched node lives
Matched source code — the actual code that matched the structural pattern
Capture groups — named captures from the query (e.g., @type, @fn) with their values
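
As a rough sketch, a consumer might model one match as follows. The `AstQueryMatch` interface, its field names, and the sample values are hypothetical and are not CodeSift's documented output schema. They only mirror the three items listed above.

```typescript
// Hypothetical shape of a single match; field names are assumptions.
interface AstQueryMatch {
  file: string;                     // file path
  startLine: number;                // first line of the matched node
  endLine: number;                  // last line of the matched node
  source: string;                   // matched source code
  captures: Record<string, string>; // capture name -> captured text
}

// Render a match as "path:start-end @capture=value ...".
function summarize(m: AstQueryMatch): string {
  const caps = Object.entries(m.captures)
    .map(([name, value]) => `@${name}=${value}`)
    .join(" ");
  return `${m.file}:${m.startLine}-${m.endLine} ${caps}`.trim();
}

// Made-up sample for the "arrow functions that return a Promise" query.
const example: AstQueryMatch = {
  file: "src/api/client.ts",
  startLine: 12,
  endLine: 14,
  source: "const fetchUser = async (id: string): Promise<User> => { ... }",
  captures: { type: "Promise" },
};

console.log(summarize(example)); // src/api/client.ts:12-14 @type=Promise
```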

When to Use It

ast_query fills the gap between search_text (character matching) and search_patterns (pre-built pattern library). Use it when:

You need structural queries that search_patterns does not cover. The built-in pattern library covers common anti-patterns, but your codebase may have specific structural conventions you want to enforce. ast_query lets you write custom structural checks.
Regex would be fragile. Any query involving nesting depth, parameter counts, return types, or control flow structure is better expressed as an AST query.
You are building custom linting rules. Before investing in a custom ESLint plugin, prototype the detection logic with ast_query to see if the pattern is even present.
You need language-aware search. ast_query understands the difference between a function named error and a variable named error and a string containing the word “error.” Text search cannot distinguish these.

For known anti-patterns (empty catch, console.log in production, missing error handling), use search_patterns first. Its built-in patterns include false-positive exclusions that raw AST queries do not. ast_query is for custom structural queries that go beyond the built-in library.

Benchmark note

This benchmark compares CodeSift against the closest practical native workflow an agent would use for the same task. For some tools, that baseline is a direct shell equivalent such as rg or find. For AST-aware, graph-aware, and LSP-backed tools, the baseline is a multi-step workflow rather than a strictly identical command. Results should be read as agent-workflow comparisons: token cost, call count, and practical context efficiency.
