get_knowledge_map
Maps the module dependency graph: imports, centrality scores, and circular dependencies. Always use it with the focus parameter.
Architecture As Data
Understanding a codebase’s architecture means understanding its dependency structure: which modules import which, which are central, which are peripheral, and where circular dependencies create coupling.
With grep, mapping dependencies means searching for import statements across every file in every subdirectory, then mentally assembling those into a graph. For a project with 50+ files across 7 directories, that is seven or more grep calls returning raw import lines that you then have to parse and connect.
get_knowledge_map returns the dependency graph as structured data: nodes ranked by centrality, edges with import relationships, and detected circular dependencies.
```python
get_knowledge_map(repo="local/my-project", focus="src/lib")
```
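The exact response schema is not reproduced here; the sketch below shows one plausible shape for illustration only. The field names (nodes, edges, cycles) and sample paths are assumptions, not the documented API.

```python
# Hypothetical shape of a focused response -- field names and values
# are illustrative assumptions, not the documented schema.
knowledge_map = {
    "nodes": [
        {"module": "src/lib/services/auth.ts", "centrality": 0.81},
        {"module": "src/lib/services/db.ts", "centrality": 0.64},
        {"module": "src/lib/utils/format.ts", "centrality": 0.12},
    ],
    "edges": [
        {"from": "src/lib/services/auth.ts",
         "to": "src/lib/services/db.ts",
         "symbols": ["getSession", "query"]},
    ],
    "cycles": [
        ["src/lib/services/auth.ts",
         "src/lib/services/session.ts",
         "src/lib/services/auth.ts"],
    ],
}
```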
Always Use the Focus Parameter
This is not a suggestion. Without focus, get_knowledge_map returns the complete dependency graph for the entire repository. On a medium-sized project, that is 129K+ tokens. On a large project, it can be significantly more.
The focus parameter scopes the graph to a specific directory or module. This reduces the output by 90% or more while giving you the graph you actually need. There is no use case where you want the entire repo graph in a single response.
```python
# Good: scoped to the module you care about
get_knowledge_map(repo="local/my-project", focus="src/lib/services")

# Bad: entire repo graph, 129K+ tokens
get_knowledge_map(repo="local/my-project")
```
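If your agent harness builds these calls programmatically, a small guard can enforce the rule before anything is sent. The sketch below is an assumption about how such a harness might look; call_tool stands in for whatever dispatcher you actually use and is not part of the documented API.

```python
# Hypothetical guard: never issue an unscoped get_knowledge_map call.
# `call_tool` is a stand-in for your agent framework's tool dispatcher.
def focused_knowledge_map(call_tool, repo: str, focus: str) -> dict:
    if not focus or not focus.strip():
        raise ValueError(
            "get_knowledge_map requires a focus path; an unscoped call "
            "can return 129K+ tokens on a medium-sized repo"
        )
    return call_tool("get_knowledge_map", {"repo": repo, "focus": focus})
```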
What the Output Contains
Nodes With Centrality Scores
Every module in the scoped graph receives a centrality score. High-centrality modules are imported by many others. They are the hubs of the architecture, the modules where a breaking change has the widest blast radius.
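As a sketch, using the hypothetical response shape from above, the scores can be sorted to surface the hubs first:

```python
# Rank modules by centrality; the top entries are the architectural hubs.
# knowledge_map is the hypothetical response sketched earlier.
hubs = sorted(knowledge_map["nodes"],
              key=lambda n: n["centrality"], reverse=True)
for node in hubs[:5]:
    print(f'{node["module"]}  centrality={node["centrality"]:.2f}')
```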
Import Edges
Each edge shows which module imports which, with the specific symbols imported. This is more precise than grep output, which shows raw import lines without resolving re-exports or barrel files.
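Continuing with the hypothetical shape above, each edge reads as "this module pulls these symbols from that module":

```python
# Which symbols each module imports from its dependencies.
# Field names are assumptions from the hypothetical shape above.
for edge in knowledge_map["edges"]:
    print(f'{edge["from"]} -> {edge["to"]}: {", ".join(edge["symbols"])}')
```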
Circular Dependencies
The tool detects circular dependency chains and reports them explicitly. Circular dependencies are one of the most common sources of initialization bugs, bundler issues, and architectural drift. Finding them with grep requires tracing import chains across files manually.
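With the hypothetical shape above, the reported chains are ready to print or turn into refactoring tasks:

```python
# Reported circular dependency chains ("cycles" is an assumed field name).
for cycle in knowledge_map["cycles"]:
    print(" -> ".join(cycle))
# src/lib/services/auth.ts -> src/lib/services/session.ts -> src/lib/services/auth.ts
```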
Benchmark
| Approach | Tokens | Calls | Output Quality |
|---|---|---|---|
| grep across 7 subdirectories | 43,700 | 7 | Raw import lines, unranked, no circulars detected |
| get_knowledge_map (focused) | 4,400 | 1 | Ranked nodes, resolved edges, circular detection |
The native approach produces 43K tokens of raw `import { X } from './Y'` lines. Converting those into an understanding of the architecture takes significant mental effort. The structured output from get_knowledge_map is immediately actionable: the highest-centrality modules are the most important, and the circular dependencies are the most urgent problems.
Mermaid Diagrams
The output can be used to generate Mermaid dependency diagrams for documentation or architecture reviews. The ranked node list gives you the most important modules to include in a simplified diagram.
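A minimal sketch of that conversion, again against the hypothetical shape used above: keep only the top-centrality nodes and emit the edges between them as Mermaid syntax.

```python
# Emit a simplified Mermaid diagram from the highest-centrality modules.
# Uses the hypothetical knowledge_map shape sketched earlier.
top = {n["module"] for n in sorted(knowledge_map["nodes"],
                                   key=lambda n: n["centrality"],
                                   reverse=True)[:10]}

def node_id(path: str) -> str:
    # Mermaid node ids cannot contain slashes or dots; keep the
    # readable path in the node label instead.
    return path.replace("/", "_").replace(".", "_").replace("-", "_")

lines = ["graph TD"]
for edge in knowledge_map["edges"]:
    if edge["from"] in top and edge["to"] in top:
        lines.append(f'    {node_id(edge["from"])}["{edge["from"]}"]'
                     f' --> {node_id(edge["to"])}["{edge["to"]}"]')
print("\n".join(lines))
```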
When To Use It
- Architecture review: See how modules connect, identify over-coupled areas
- Refactoring planning: Find circular dependencies to break, identify high-centrality modules that need stable interfaces
- Onboarding: Understand the dependency structure of an unfamiliar codebase
- Pre-PR review: Verify that new code follows established dependency patterns
For questions about natural module boundaries (as opposed to declared dependencies), consider detect_communities first. It uses community detection algorithms to find groupings that may not match folder structure but reflect actual coupling patterns.
Benchmark note
This benchmark compares CodeSift against the closest practical native workflow an agent would use for the same task. For some tools, that baseline is a direct shell equivalent such as rg or find. For AST-aware, graph-aware, and LSP-backed tools, the baseline is a multi-step workflow rather than a strictly identical command. Results should be read as agent-workflow comparisons: token cost, call count, and practical context efficiency.