Context-Engine
Search and retrieve code context from any codebase using hybrid vector search (semantic + lexical) with neural reranking.
Client- or provider-specific wrapper files should stay thin and defer to this document for shared MCP tool-selection and search guidance.
Quickstart
- Start with search for most codebase questions.
- Use symbol_graph first for direct symbol relationships such as callers, definitions, importers, subclasses, and base classes.
- Use graph_query only if that tool is available and you need transitive impact, dependency, or cycle analysis; otherwise combine symbol_graph with targeted search.
- Prefer MCP tools for exploration. Narrow grep/file-open use is still fine for exact literal confirmation, exact file/path confirmation, or opening a file you already identified for editing.
- Use cross_repo_search for multi-repo questions. For public V1 context_search, treat include_memories=true as compatibility-only: it preserves response shape but keeps results code-only and may add memory_note.
Decision Tree: Choosing the Right Tool
What do you need?
|
+-- UNSURE or GENERAL QUERY --> search (RECOMMENDED DEFAULT)
|   |
|   +-- Auto-detects intent and routes to the best tool
|   +-- Handles: code search, Q&A, tests, config, symbols, imports
|   +-- Use this when you don't know which specialized tool to pick
|
+-- Find code locations/implementations
|   |
|   +-- Unsure what tool to use --> search (DEFAULT - routes to repo_search if needed)
|   +-- Speed-critical or complex filters --> repo_search (skip routing overhead)
|   +-- Want LLM explanation --> context_answer
|
+-- Understand how something works
|   |
|   +-- Want LLM explanation --> search OR context_answer
|   +-- Just code snippets --> search OR repo_search with include_snippet=true
|
+-- Find similar code patterns (retry loops, error handling, etc.)
|   |
|   +-- Have code example --> pattern_search with code snippet (if enabled)
|   +-- Describe pattern --> pattern_search with natural language (if enabled)
|
+-- Find specific file types
|   |
|   +-- Test files --> search OR search_tests_for
|   +-- Config files --> search OR search_config_for
|
+-- Find relationships
|   |
|   +-- Direct callers/defs/importers/inheritance --> search OR symbol_graph
|   +-- Multi-hop callers --> symbol_graph (depth=2+)
|   +-- Deep impact/dependencies/cycles --> graph_query (if available) OR symbol_graph + targeted search
|
+-- Git history
|   |
|   +-- Find commits --> search_commits_for
|   +-- Predict co-changing files --> search_commits_for with predict_related=true
|
+-- Store/recall knowledge --> memory_store, memory_find
|
+-- Preserve public search shape while accepting memory flags --> context_search with include_memories=true (compatibility-only in V1)
|
+-- Multiple independent searches at once
    |
    +-- batch_search (runs N repo_search calls in one invocation, ~75% token savings)
Standard Parameters Reference
All SaaS-exposed tools organize parameters into families with consistent naming and behavior.
Family 1: Code Search Tools
Applies to: search, repo_search, code_search, batch_search, info_request, context_search
Standard Parameters:
| Parameter | Type | Required? | Default | Purpose |
|-----------|------|-----------|---------|---------|
| query | string or string[] | YES | — | Single query OR array of queries for fusion |
| language | string | optional | (auto-detect) | Filter by language: "python", "typescript", "go", etc. |
| under | string | optional | (root) | Path prefix filter, e.g., "src/api/" or "tests/" |
| path_glob | string[] | optional | (all) | Include patterns: ["**/*.ts", "lib/**"] |
| not_glob | string[] | optional | (none) | Exclude patterns: ["**/test_*", "**/*_test.*"] |
| symbol | string | optional | (all) | Filter by symbol name (function, class, variable) |
| kind | string | optional | (all) | AST node type: "function", "class", "method", "variable" |
| ext | string | optional | (all) | File extension: "py", "ts", "go" (alias for language) |
| repo | string or string[] | optional | (default) | Repository filter: single repo OR list OR "*" for all |
| limit | int | optional | 10 | Max results to return (1-100) |
| include_snippet | bool | optional | true | Include code snippets in results |
| compact | bool | optional | false | Strip verbose fields from response |
| output_format | string | optional | "json" | "json" (structured) OR "toon" (token-efficient) |
| rerank_enabled | bool | optional | true | Enable neural reranking (default ON) |
| case | string | optional | (insensitive) | "sensitive" for case-sensitive matching |
| context_lines | int | optional | 2 | Lines of context around matches |
| per_path | int | optional | 2 | Max results per file |
Standard Constraints:
- limit max 100 (higher values slow queries)
- query max 400 characters / 50 words
- language must be a valid code language, or auto-detection will fail silently
- path_glob / not_glob support glob patterns (*, **, ?)
- Multiple query terms are fused via Reciprocal Rank Fusion (RRF) for better recall
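The limits above can be checked on the client before a call is sent. The following is a minimal sketch rather than part of Context-Engine: check_search_params is a hypothetical helper, and only the numeric limits (1-100 results, 400 characters / 50 words per query) come from the constraints listed above.

```python
def check_search_params(params: dict) -> list[str]:
    """Return constraint violations for a code-search call (empty list if none)."""
    problems = []
    queries = params.get("query")
    if isinstance(queries, str):
        queries = [queries]  # query may be a single string or a list
    if not queries:
        problems.append("query is required")
    for q in queries or []:
        # Documented limits: 400 characters / 50 words per query.
        if len(q) > 400 or len(q.split()) > 50:
            problems.append(f"query too long: {q[:30]!r}...")
    limit = params.get("limit", 10)
    # Documented range for limit: 1-100.
    if not 1 <= limit <= 100:
        problems.append(f"limit out of range: {limit}")
    return problems
```

An empty list means the arguments satisfy the documented limits; it does not guarantee that the server accepts them.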
Family 2: Symbol Graph Tools
Applies to: symbol_graph, batch_symbol_graph, graph_query, batch_graph_query
Standard Parameters:
| Parameter | Type | Required? | Default | Purpose |
|-----------|------|-----------|---------|---------|
| symbol | string | YES | — | Symbol name to analyze (e.g., "authenticate", "UserService.get_user") |
| query_type | string | optional | "callers" | "callers", "callees", "definition", "importers", "subclasses", "base_classes", "impact", "cycles", "transitive_callers", "transitive_callees", "dependencies" |
| depth | int | optional | 1 | Traversal depth: 1=direct, 2=callers of callers, etc. (symbol_graph: max 3, graph_query: max 5+) |
| language | string | optional | (auto) | Filter by language for multi-language codebases |
| under | string | optional | (all) | Path prefix filter |
| limit | int | optional | 20 | Max results to return |
| include_paths | bool | optional | false | Include full traversal paths (graph_query only) |
| output_format | string | optional | "json" | "json" or "toon" |
| repo | string | optional | (default) | Repository filter |
| collection | string | optional | (session) | Target collection (use session defaults) |
Standard Constraints:
- symbol must be exact match or use fuzzy fallback
- query_type is case-sensitive
- depth > 3 may be slow on large graphs
- Results are auto-hydrated with code snippets
Family 3: Specialized Search Tools
Applies to: search_tests_for, search_config_for, search_callers_for, search_importers_for, search_commits_for
Standard Parameters:
| Parameter | Type | Required? | Default | Purpose |
|-----------|------|-----------|---------|---------|
| query | string | YES | — | Natural language or symbol name |
| limit | int | optional | 10 | Max results to return |
| language | string | optional | (auto) | Filter by language |
| under | string | optional | (all) | Path prefix filter |
Additional Parameters:
| Tool | Extra Parameters |
|------|------------------|
| search_commits_for | path (optional), predict_related (bool, default false) |
| All others | (inherit code search family) |
Family 4: Memory Tools
Applies to: memory_store, memory_find
Standard Parameters:
| Parameter | Type | Required? | Default | Purpose |
|-----------|------|-----------|---------|---------|
| information | string | YES (store) | — | Knowledge to persist (clear, self-contained) |
| query | string | YES (find) | — | Search for stored knowledge by similarity |
| metadata | dict | optional (store) | {} | Structured metadata: kind, topic, priority (1-5), tags, author |
| kind | string | optional (find) | (all) | Filter by kind: "memory", "note", "decision", "convention", "gotcha", "policy" |
| topic | string | optional (find) | (all) | Filter by topic: "auth", "database", "api", "caching", etc. |
| tags | string or string[] | optional (find) | (all) | Filter by tags: ["security", "sql", ...] |
| priority_min | int | optional (find) | 1 | Minimum priority threshold (1-5) |
| limit | int | optional | 10 | Max results to return |
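A small builder can keep stored memories within the documented ranges. This is a sketch under assumptions: build_memory_store_args is a hypothetical helper, the default priority of 3 is arbitrary, and reusing the memory_find kind values to validate memory_store metadata is an assumption rather than documented behavior.

```python
# Kind values listed for the memory_find filter in the table above.
ALLOWED_KINDS = {"memory", "note", "decision", "convention", "gotcha", "policy"}

def build_memory_store_args(information, kind="note", topic=None, priority=3, tags=None):
    """Assemble memory_store arguments, rejecting out-of-range metadata."""
    if not information.strip():
        raise ValueError("information must be non-empty")
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown kind: {kind}")
    if not 1 <= priority <= 5:  # documented priority range is 1-5
        raise ValueError("priority must be between 1 and 5")
    metadata = {"kind": kind, "priority": priority}
    if topic:
        metadata["topic"] = topic
    if tags:
        metadata["tags"] = list(tags)
    return {"information": information, "metadata": metadata}
```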
Family 5: Batch Tools
Applies to: batch_search, batch_symbol_graph, batch_graph_query
Standard Parameters (Shared across all queries):
| Parameter | Type | Purpose |
|-----------|------|---------|
| searches / queries | array | Array of individual search/query specs (max 10 items) |
| collection | string | Shared default collection for all queries |
| language | string | Shared default language filter |
| under | string | Shared default path prefix |
| limit | int | Shared default result limit |
| output_format | string | "json" or "toon" for all results |
Per-Search Overrides:
Each item in searches / queries can override ANY shared parameter.
Example: searches[0] has different limit than searches[1]
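The override rule can be pictured as a dictionary merge in which per-item keys win. The sketch below only illustrates that rule on the client side; resolve_batch is a hypothetical name, and how the server actually resolves defaults is an assumption.

```python
def resolve_batch(shared: dict, searches: list[dict]) -> list[dict]:
    """Apply shared defaults to each search spec; per-search keys take precedence."""
    if len(searches) > 10:  # documented maximum of 10 items per batch
        raise ValueError("a batch accepts at most 10 items")
    return [{**shared, **spec} for spec in searches]
```

With shared = {"limit": 5} and a second item that sets "limit": 2, the first item resolves to limit 5 and the second keeps limit 2.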
Family 6: Cross-Repo & Admin Tools
Applies to: cross_repo_search, qdrant_status, qdrant_list, set_session_defaults
Standard Parameters:
| Tool | Parameters |
|------|------------|
| cross_repo_search | query, collection, target_repos, discover, trace_boundary, boundary_key |
| qdrant_status / qdrant_list | (no parameters) |
| set_session_defaults | collection, language, under, output_format, limit |
Unified Search: search (RECOMMENDED DEFAULT)
Use search as your PRIMARY tool. It auto-detects query intent and routes to the best specialized tool. No need to choose between 15+ tools.
{
"query": "authentication middleware"
}
Returns:
{
"ok": true,
"intent": "search",
"confidence": 0.92,
"tool": "repo_search",
"result": {
"results": [...],
"total": 8
},
"plan": ["detect_intent", "dispatch_repo_search"],
"execution_time_ms": 245
}
What it handles automatically:
- Code search ("find auth middleware") -> routes to repo_search
- Q&A ("how does caching work?") -> routes to context_answer
- Test discovery ("tests for payment") -> routes to search_tests_for
- Config lookup ("database settings") -> routes to search_config_for
- Symbol queries ("who calls authenticate") -> routes to symbol_graph
- Import tracing ("what imports CacheManager") -> routes to search_importers_for
Override parameters (all optional):
{
"query": "error handling patterns",
"limit": 5,
"language": "python",
"under": "src/api/",
"include_snippet": true
}
When to use search:
- You're unsure which specialized tool to use
- You want intent auto-detection (routing to repo_search, context_answer, symbol_graph, tests, etc.)
- Acceptable latency overhead: ~50-100ms for routing + tool execution
- You're doing exploratory queries where routing overhead is negligible
When NOT to use search:
- You know you need raw code results (use repo_search directly)
- Time is critical (<100ms target) and routing overhead matters
- You're in a tight loop doing 10+ sequential searches (use batch_search instead)
Routing Performance:
- Intent detection: ~10-20ms
- Tool dispatch: ~5-10ms
- Total routing overhead: ~20-40ms typical, up to ~100ms worst-case
- For time-critical loops: skip routing with repo_search directly
When to use specialized tools instead:
- Cross-repo search -> cross_repo_search
- Multiple independent searches -> batch_search (N searches in one call, ~75% token savings)
- Memory storage/retrieval -> memory_store, memory_find
- Admin/diagnostics -> qdrant_status, qdrant_list
- Pattern matching (structural) -> pattern_search
When to use repo_search instead of search:
- Full control over filters: You know exactly what you're searching for and want to apply specific language, path, or symbol filters without auto-detection overhead
  - Example: "In a polyglot repo, I need Python code only" → use repo_search with language="python" to avoid search's auto-detected language=javascript
  - Example: "Find only test files matching a pattern" → use repo_search with path_glob=["**/test_*.py"] directly
- Speed-critical queries (<100ms target): You can't afford the ~20-40ms routing overhead
  - Example: Time-sensitive tool loops where each query must complete in <50ms
- Complex filter combinations: You need language + under + not_glob together, not guessed by auto-detection
- Guaranteed exact behavior: You want reproducible results without routing confidence variations (search routing confidence varies 0.6-0.95)
- Known tool type: You already know you need code results (not Q&A, tests, configs, or symbols) so routing is wasted
Example: When search guesses wrong:
SEARCH (auto-routes, may detect wrong intent):
query: "authenticate in FastAPI"
confidence: 0.75
intent: "Q&A - what does authenticate do in FastAPI?"
→ routes to context_answer, returns explanation instead of code
REPO_SEARCH (explicit, predictable):
query: "authenticate"
language: "python"
under: "src/auth/"
→ returns code implementations in src/auth/ only, no routing overhead
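A caller can detect this kind of misroute from the response itself, since the search response shown earlier carries ok, tool, and confidence fields. The check below is a sketch: routed_as_expected is a hypothetical helper, and the 0.8 confidence threshold is an arbitrary assumption, not a documented value.

```python
def routed_as_expected(response: dict, expected_tool: str = "repo_search",
                       min_confidence: float = 0.8) -> bool:
    """True when search dispatched to the expected tool with enough confidence."""
    return (
        response.get("ok") is True
        and response.get("tool") == expected_tool
        and response.get("confidence", 0.0) >= min_confidence
    )
```

When the check fails, re-issuing the query through repo_search with explicit filters gives the predictable behavior described above.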
Routing Overhead: When It Matters
Latency Impact of Using search vs repo_search directly:
| Scenario | search Latency | repo_search Latency | Routing Cost | Use search? |
|----------|-----------------|---------------------|--------------|-------------|
| One exploratory query | ~150-200ms | ~80-100ms | ~70-100ms | YES (worth it for auto-routing) |
| 3 independent queries, sequential | ~450-600ms | ~240-300ms | ~210-300ms | NO (use batch_search instead) |
| Time-critical query (<50ms) | Can miss deadline | ~80-100ms | ❌ Unacceptable | NO (use repo_search) |
| Tight loop (20+ queries) | ~3000-4000ms | ~1600-2000ms | ~1400-2000ms | NO (use batch_search) |
Decision Criteria:
- Use search when: one-off query, exploratory, unsure which tool, latency <200ms is acceptable
- Use repo_search when: speed <100ms required, complex filter combo needed, or you know you need code (not Q&A); for a tight loop of 2+ queries, prefer batch_search
- Use batch_search when: 2+ independent code searches, to reduce routing overhead by 75-85% per batch
Real-world example - Interactive AI assistant loop:
Bad (repeated routing overhead):
for query in user_queries:  # 5 queries
    result = search(query)  # ~70-100ms routing × 5 = 350-500ms wasted
Good (one batch call):
# batch_search takes an array of search specs, one per query
results = batch_search(searches=[{"query": q} for q in user_queries])  # Routing once, ~25ms × 5 = 125ms
# Saves ~300ms+ per iteration
Primary Search: repo_search
Use repo_search (or its alias code_search) for direct code lookups when you need full control. Reranking is ON by default.
{
"query": "database connection handling",
"limit": 10,
"include_snippet": true,
"context_lines": 3
}
Returns:
{
"results": [
{"score": 3.2, "path": "src/db/pool.py", "symbol": "ConnectionPool", "start_line": 45, "end_line": 78, "snippet": "..."}
],
"total": 8,
"used_rerank": true
}
Multi-query for better recall - pass a list to fuse results:
{
"query": ["auth middleware", "authentication handler", "login validation"]
}
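Reciprocal Rank Fusion scores each result by summing 1 / (k + rank) over every ranked list in which it appears, so items ranked well by several queries rise to the top. The sketch below shows the idea only: rrf_fuse is a hypothetical name, and k = 60 is the commonly used constant, assumed here because the server's fusion parameters are not documented in this guide.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of result IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by item name for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))
```

For example, a file ranked first or second by all three query variants outranks a file ranked first by only one of them.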
Apply filters to narrow results:
{
"query": "error handling",
"language": "python",
"under": "src/api/",
"not_glob": ["**/test_*", "**/*_test.*"]
}
Search across repos (same collection):
{
"query": "shared types",
"repo": ["frontend", "backend"]
}
Use repo: "*" to search all indexed repos.
Search across repos (separate collections — use cross_repo_search):
// cross_repo_search
{"query": "shared types", "target_repos": ["frontend", "backend"]}
// With boundary tracing for cross-repo flow discovery
{"query": "login submit", "trace_boundary": true}
Available Filters
- language - Filter by programming language
- under - Path prefix (e.g., "src/api/")
- path_glob - Include patterns (e.g., ["**/*.ts", "lib/**"])
- not_glob - Exclude patterns (e.g., ["**/test_*"])
- symbol - Symbol name match
- kind - AST node type (function, class, etc.)
- ext - File extension
- repo - Repository filter for multi-repo setups
- case - Case-sensitive matching
Batch Search: batch_search
Run N independent repo_search calls in a single MCP tool invocation. Reduces token overhead by ~75-85% compared to sequential calls.
Token Savings & Latency Metrics:
| N Searches | Token Savings | Sequential Latency | Batch Latency | Worth Batching? |
|------------|---------------|--------------------|---------------|-----------------|
| 1 | 0% | ~100ms | N/A | N/A |
| 2 | ~40% | ~180-200ms | ~150-160ms | ✅ YES (save 30-40ms, 40% tokens) |
| 3 | ~55% | ~270-300ms | ~180-200ms | ✅ YES (save 90-100ms, 55% tokens) |
| 5 | ~70% | ~450-500ms | ~220-250ms | ✅ YES (save 250ms, 70% tokens) |
| 10 | ~75% | ~900-1000ms | ~300-350ms | ✅ YES (save 600ms, 75% tokens) |
Decision Rule: Always use batch_search when you have 2+ independent code searches. The latency savings alone (roughly 30ms at N=2, growing to ~600ms at N=10) justify batching, plus you save ~40-75% tokens.
{
"searches": [
{"query": "authentication middleware", "limit": 5},
{"query": "rate limiting implementation", "limit": 5},
{"query": "error handling patterns"}
],
"compact": true,
"output_format": "toon"
}
Returns:
{
"ok": true,
"batch_results": [result_set_0, result_set_1, result_set_2],
"count": 3,
"elapsed_ms": 245
}
Each result_set has the same schema as repo_search output.
Shared parameters (applied to all searches unless overridden per-search):
collection, output_format, compact, limit, language, under, repo, include_snippet, rerank_enabled
Per-search overrides: Each entry in searches can include any repo_search parameter to override the shared defaults.
Limits: Maximum 10 searches per batch.
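When a workload has more than 10 queries, it has to be split across several batch_search calls. The generator below is a sketch of that split; chunk_searches is a hypothetical helper, and it assumes each query needs no per-search overrides beyond the shared parameters passed as keyword arguments.

```python
def chunk_searches(queries, max_per_batch=10, **shared):
    """Yield batch_search argument dicts holding at most max_per_batch searches each."""
    for start in range(0, len(queries), max_per_batch):
        chunk = queries[start:start + max_per_batch]
        # Shared parameters (e.g. compact, output_format) sit next to "searches".
        yield {"searches": [{"query": q} for q in chunk], **shared}
```

For 23 queries this yields three payloads of 10, 10, and 3 searches.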
When to use batch_search vs multiple search calls:
- Use batch_search when you have 2+ independent code searches and want to minimize token usage and round-trips
- Use individual search calls when you need intent routing (Q&A, symbol graph, etc.) or when searches depend on each other's results
Simple Lookup: info_request
Use info_request for natural language queries with minimal parameters:
{
"info_request": "how does user authentication work"
}
Add explanations:
{
"info_request": "database connection pooling",
"include_explanation": true
}
Q&A with Citations: context_answer
Use context_answer when you need an LLM-generated explanation grounded in code:
{
"query": "How does the caching layer invalidate entries?",
"budget_tokens": 2000
}
Returns an answer with file/line citations. Use expand: true to generate query variations for better retrieval.
Pattern Search: pattern_search (Optional)
Note: This tool may not be available in all deployments. If pattern detection is disabled, calls return
{"ok": false, "error": "Pattern search module not available"}.
Find structurally similar code patterns across all languages. Accepts either code examples or natural language descriptions—auto-detects which.
Code example query - find similar control flow:
{
"query": "for i in range(3): try: ... except: time.sleep(2**i)",
"limit": 10,
"include_snippet": true
}
Natural language query - describe the pattern:
{
"query": "retry with exponential backoff",
"limit": 10,
"include_snippet": true
}
Cross-language search - a Go pattern finds Python/Rust/Java equivalents (language is only a hint for parsing the query snippet):
{
"query": "if err != nil { return err }",
"language": "go",
"limit": 10
}
Explicit mode override - force code or description mode:
{
"query": "error handling",
"query_mode": "description",
"limit": 10
}
Key parameters:
- query - Code snippet OR natural language description
- query_mode - "code", "description", or "auto" (default)
- language - Language hint for code examples (python, go, rust, etc.)
- limit - Max results (default 10)
- min_score - Minimum similarity threshold (default 0.3)
- include_snippet - Include code snippets in results
- context_lines - Lines of context around matches
- aroma_rerank - Enable AROMA structural reranking (default true)
- aroma_alpha - Weight for AROMA vs original score (default 0.6)
- target_languages - Filter results to specific languages
Returns:
{
"ok": true,
"results": [...],
"total": 5,
"query_signature": "L2_2_B0_T2_M0",
"query_mode": "code",
"search_mode": "aroma"
}
The query_signature encodes control flow: L (loops), B (branches), T (try/except), M (match).
Specialized Search Tools
search_tests_for - Find test files:
{"query": "UserService", "limit": 10}
search_config_for - Find config files:
{"query": "database connection", "limit": 5}
search_callers_for - Find callers of a symbol:
{"query": "processPayment", "language": "typescript"}
search_importers_for - Find importers:
{"query": "utils/helpers", "limit": 10}
symbol_graph - Symbol graph navigation (callers / callees / definition / importers / subclasses / base classes):
Query types:
| Type | Description |
|------|-------------|
| callers | Who calls this symbol? |
| callees | What does this symbol call? |
| definition | Where is this symbol defined? |
| importers | Who imports this module/symbol? |
| subclasses | What classes inherit from this symbol? |
| base_classes | What classes does this symbol inherit from? |
Examples:
{"symbol": "ASTAnalyzer", "query_type": "definition", "limit": 10}
{"symbol": "get_embedding_model", "query_type": "callers", "under": "scripts/", "limit": 10}
{"symbol": "qdrant_client", "query_type": "importers", "limit": 10}
{"symbol": "authenticate", "query_type": "callees", "limit": 10}
{"symbol": "BaseModel", "query_type": "subclasses", "limit": 20}
{"symbol": "MyService", "query_type": "base_classes"}
- Supports language, under, depth, and output_format like other tools.
- Use depth=2 or depth=3 for multi-hop traversals (callers of callers).
- If there are no graph hits, it falls back to semantic search.
- Note: Results are "hydrated" with ~500-char source snippets for immediate context.
graph_query - Advanced graph traversals and impact analysis (available to all SaaS users):
Query types:
| Type | Description |
|------|-------------|
| callers | Direct callers of this symbol |
| callees | Direct callees of this symbol |
| transitive_callers | Multi-hop callers (up to depth) |
| transitive_callees | Multi-hop callees (up to depth) |
| impact | What would break if I change this symbol? |
| dependencies | Combined calls + imports |
| definition | Where is this symbol defined? |
| cycles | Detect circular dependencies involving this symbol |
Examples:
{"symbol": "UserService", "query_type": "impact", "depth": 3}
{"symbol": "auth_module", "query_type": "cycles"}
{"symbol": "processPayment", "query_type": "transitive_callers", "depth": 2, "limit": 20}
- Supports language, under, depth, limit, include_paths, and output_format.
- Use include_paths: true to get full traversal paths in results.
- Use depth to control how many hops to traverse (default varies by query type).
- Note: symbol_graph is always available (Qdrant-backed). graph_query provides advanced Memgraph-backed traversals and is available to all SaaS users.
Comparison: symbol_graph vs graph_query
| Feature | symbol_graph | graph_query |
|---------|--------------|------------|
| Availability | Always (Qdrant-backed) | SaaS/Enterprise (Memgraph-backed) |
| Performance | ~2-5ms per query | ~50-200ms per query |
| Supported Relationships | callers, callees, definition, importers, subclasses, base_classes | All symbol_graph + impact, cycles, transitive_* |
| Max Depth | up to 3 | up to 5+ |
| Best For | Direct relationships, exploratory queries | Impact analysis, dependency chains, circular detection |
| Fallback When Unavailable | Falls back to semantic search | N/A (use symbol_graph instead) |
| Latency-Critical Loops | ✅ YES (fast) | ❌ NO (slower) |
Decision Guide:
- Use symbol_graph for: direct callers/callees/definitions, inheritance queries, when you need speed, always as first stop
- Use graph_query for: impact analysis ("what breaks?"), cycle detection, transitive chains, when available and you need depth >3
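The decision guide can be written down as a small selection function. This is a sketch only: pick_graph_tool is a hypothetical helper, and the graph_query_available flag is assumed to be known by the caller (for example from the client's tool listing).

```python
SYMBOL_GRAPH_TYPES = {"callers", "callees", "definition", "importers",
                      "subclasses", "base_classes"}
GRAPH_QUERY_ONLY = {"impact", "cycles", "transitive_callers",
                    "transitive_callees", "dependencies"}

def pick_graph_tool(query_type, depth=1, graph_query_available=False):
    """Return the tool name to try first for a relationship question."""
    if query_type not in SYMBOL_GRAPH_TYPES | GRAPH_QUERY_ONLY:
        raise ValueError(f"unknown query_type: {query_type}")
    wants_graph_query = query_type in GRAPH_QUERY_ONLY or depth > 3
    if wants_graph_query and graph_query_available:
        return "graph_query"
    # Either symbol_graph is sufficient, or graph_query is missing and the
    # caller should combine symbol_graph with targeted search instead.
    return "symbol_graph"
```

Note that when graph_query is unavailable, a query type such as cycles has no direct symbol_graph equivalent, so the caller still has to rephrase the question in terms of direct relationships.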
search_commits_for - Search git history:
{"query": "fixed authentication bug", "limit": 10}
Predict co-changing files (predict_related mode):
{"path": "src/api/auth.py", "predict_related": true, "limit": 10}
Returns ranked files that historically co-change with the given path, along with the most relevant commit message explaining why.
change_history_for_path - File change summary:
{"path": "src/api/auth.py", "include_commits": true}
Memory: Store and Recall Knowledge
Memory tools allow you to persist team knowledge, architectural decisions, and findings for later retrieval across sessions.
Memory Workflow: Store → Retrieve → Reuse
Phase 1: During Exploration (Session 1)
As you discover important patterns, decisions, or findings, store them for future reference:
{
"memory_store": {
"information": "Auth service uses JWT tokens with 24h expiry. Refresh tokens last 7 days. Stored in Redis with LRU eviction.",
"metadata": {
"kind": "decision",
"topic": "auth",
"priority": 5,
"tags": ["security", "jwt", "session-management"]
}
}
}
Phase 2: In Later Sessions
Retrieve and reuse stored knowledge by similarity:
{
"memory_find": {
"query": "token expiration policy",
"topic": "auth",
"limit": 5
}
}
Returns the exact note stored in Phase 1, plus any other auth-related memories.
Phase 3: Blend Code + Memory
When you want BOTH code search results AND stored team knowledge:
{
"context_search": {
"query": "authentication flow",
"include_memories": true,
"per_source_limits": {"code": 6, "memory": 3}
}
}
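Returns: up to 6 code snippets + 3 memory notes, all ranked by relevance, where memory blending is enabled. In public V1 context_search, include_memories=true is compatibility-only: results stay code-only and the response may add memory_note (see Quickstart).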
Timeline and Persistence
| Property | Behavior |
|----------|----------|
| Searchability | Memories searchable immediately after memory_store (indexing is instant) |
| Persistence | Memories persist across sessions indefinitely (durable storage) |
| Scope | Org/workspace scoped: one team's memories don't leak to another |
| Latency | ~100ms per memory_find query (same as code search) |
| Storage | Embedded in same Qdrant collection as code, but logically isolated |
Real-World Example: Session Continuity
Session 1 (Day 1) - Discovery:
Context: Investigating why JWT refresh tokens sometimes expire unexpectedly
→ memory_store(
information="Found: RefreshTokenManager.py line 89 uses session.expire_in instead of constants.REFRESH_TTL. This was a bug introduced in PR #1234 where the constant was 7 days but the session value was hardcoded to 3 days. The mismatch causes premature expiration.",
metadata={"kind": "gotcha", "topic": "auth", "tags": ["bug", "jwt"], "priority": 4}
)
Session 2 (Day 5) - Troubleshooting a Similar Issue:
→ memory_find(query="refresh token expiration problem", topic="auth")
Response: Found Session 1's note about the RefreshTokenManager bug, plus similar findings about token TTL misconfigurations.
→ User goes directly to line 89 of RefreshTokenManager.py and verifies the fix status.
Result: Problem solved in 2 minutes instead of 30 minutes of debugging.
When to Store What
| Memory Kind | Use Case | Example |
|-------------|----------|---------|
| decision | Architectural choices and their rationale | "We chose JWT over sessions because stateless scaling" |
| gotcha | Subtle bugs or trap conditions | "RefreshTokenManager line 89 has TTL mismatch" |
| convention | Team patterns and standards | "All API responses use envelope pattern with status/data/errors" |
| note | General findings or context | "Auth service was moved to separate repo last month" |
| policy | Compliance or operational rules | "Session tokens must be rotated every 24h per SOC2" |
Integration with Code Search
Pattern 1: Pure Code Search
{"search": "authentication validation"}
Returns: code snippets only. Fast, no memory overhead.
Pattern 2: Code + Memory Blend
{
"context_search": {
"query": "authentication validation",
"include_memories": true,
"per_source_limits": {"code": 5, "memory": 2}
}
}
Returns: 5 code snippets + 2 relevant memory notes (team insights about auth validation patterns) where memory blending is enabled; public V1 context_search stays code-only (see Quickstart).
Pattern 3: Memory Only
{"memory_find": {"query": "authentication patterns", "limit": 10}}
Returns: stored team knowledge about auth, useful for onboarding or architecture review.
Common Patterns
Team Onboarding:
- New engineer joins → memory_find(query="project architecture", topic="architecture")
- Retrieves all stored architectural decisions in one place
- Much faster than reading scattered code comments
Incident Response:
- Production auth bug occurs
- → memory_find(query="auth failures", priority_min=3)
- Retrieves gotchas, prior incidents, and known traps
- Faster root-cause diagnosis
Code Review Efficiency:
- Reviewer checks PR modifying auth module
- → context_search(query="authentication standards", include_memories=true)
- Sees both current code AND team conventions/policies
- Makes more informed review decisions
Error Cases and Recovery
| Error | Cause | Recovery |
|-------|-------|----------|
| "No results from memory_find" | Query too specific or memories not yet stored | Broaden query, check metadata filters (topic, tags, kind) |
| "Memory not found in next session" | Wrong workspace/collection or stale cache | Verify workspace matches, run qdrant_list to confirm collection |
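| "include_memories=true returns only code" | Memory store empty for this workspace, or public V1 context_search (include_memories is compatibility-only there) | Start storing with memory_store - next session will have memories; in public V1, query memory_find directly |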
| "Duplicate memories with same info" | Same finding discovered twice | Use memory_find with topic/tags filter, consolidate via note |
Admin and Diagnostics
qdrant_status - Check index health:
{}
qdrant_list - List all collections:
{}
embedding_pipeline_stats - Get cache efficiency, bloom filter stats, pipeline performance:
{}
set_session_defaults - Set defaults for session:
{"collection": "my-project", "language": "python"}
Deployment Mode Capabilities
SaaS Mode: In SaaS deployments, indexing is handled automatically by the VS Code extension upload service. The tools below marked "Self-Hosted Only" are not available in SaaS mode. All search, symbol graph, memory, and session tools work normally.
Self-Hosted Only Tools (not available in SaaS):
| Tool | Purpose | When to Use |
|------|---------|-------------|
| qdrant_index_root | Index entire workspace | Initial indexing or after major codebase reorg |
| qdrant_index | Index subdirectory | Incremental indexing of specific folders |
| qdrant_prune | Remove stale entries | Clean up entries from deleted files |
Tool Availability Matrix
Which tools are available in which deployment modes:
| Tool Category | Tool | SaaS | Self-Hosted | Enterprise |
|---|---|---|---|---|
| Search | search | ✅ | ✅ | ✅ |
| | repo_search / code_search | ✅ | ✅ | ✅ |
| | cross_repo_search | ✅ | ✅ | ✅ |
| | batch_search | ✅ | ✅ | ✅ |
| Search (Specialized) | info_request | ✅ | ✅ | ✅ |
| | context_answer | ✅ | ✅ | ✅ |
| | search_tests_for | ✅ | ✅ | ✅ |
| | search_config_for | ✅ | ✅ | ✅ |
| | search_callers_for | ✅ | ✅ | ✅ |
| | search_importers_for | ✅ | ✅ | ✅ |
| | search_commits_for | ✅ | ✅ | ✅ |
| | change_history_for_path | ✅ | ✅ | ✅ |
| | pattern_search (if enabled) | ✅* | ✅* | ✅* |
| Symbol Graph | symbol_graph | ✅ | ✅ | ✅ |
| | batch_symbol_graph | ✅ | ✅ | ✅ |
| | graph_query | ✅ (limited)** | ✅ | ✅ |
| | batch_graph_query | ✅ (limited)** | ✅ | ✅ |
| Memory | memory_store | ✅ | ✅ | ✅ |
| | memory_find | ✅ | ✅ | ✅ |
| | context_search | ✅ | ✅ | ✅ |
| Session | set_session_defaults | ✅ | ✅ | ✅ |
| | expand_query | ✅ | ✅ | ✅ |
| Admin | qdrant_status | ✅ | ✅ | ✅ |
| | qdrant_list | ✅ | ✅ | ✅ |
| | embedding_pipeline_stats | ✅ | ✅ | ✅ |
| | qdrant_index_root | ❌ | ✅ | ✅ |
| | qdrant_index | ❌ | ✅ | ✅ |
| | qdrant_prune | ❌ | ✅ | ✅ |
Legend:
- ✅ = Available
- ❌ = Not available
- ✅* = Pattern search available only if enabled during deployment
- ✅ (limited)** = SaaS graph_query has limited depth/performance vs Enterprise with dedicated Memgraph
Choosing Your Deployment Mode
| Requirement | Best Fit |
|---|---|
| Automatic indexing via VS Code | SaaS |
| Manual control over indexing pipeline | Self-Hosted |
| Advanced graph queries (cycles, impact analysis) | Self-Hosted or Enterprise |
| High-performance graph traversal | Enterprise (dedicated Memgraph) |
| Cost-sensitive small team | SaaS (pay per upload) |
| Large codebase with frequent indexing | Self-Hosted (unlimited reindex) |
Error Handling and Recovery
Tools return structured errors via an error field or an ok: false flag. Below are common errors and recovery steps by category.
Search Tools (search, repo_search, batch_search, info_request)
| Error | HTTP 400? | Cause | Recovery Steps |
|-------|-----------|-------|-----------------|
| "Collection not found" | Yes | Collection doesn't exist, workspace hasn't been indexed, or collection was deleted | 1. Run qdrant_list() to verify available collections<br>2. Check workspace name in config matches indexed name<br>3. If missing: re-upload workspace to indexing service<br>4. If collection exists but stale: wait for background refresh or trigger reindex |
| "Invalid language filter" | Yes | language parameter has invalid value | Use only valid language codes: "python", "typescript", "go", "rust", "java", etc.<br>Check qdrant_status for supported languages |
| "Timeout during rerank" | No (504) | Reranking took too long (default 5s timeout) | Set rerank_enabled: false to skip reranking<br>OR set rerank_timeout_ms: 10000 for longer timeout<br>OR reduce limit to speed up reranking |
| "Empty results" | No (200) | Query too specific, collection not fully indexed, or no matches exist | 1. Broaden query (remove filters, use more general terms)<br>2. Check language filter is correct<br>3. Run qdrant_status to see point count<br>4. If points=0: indexing is incomplete, wait and retry |
| "Query too long" | Yes | Query exceeds 400 chars or 50 words | Shorten query or split into multiple searches |
| "Syntax error in path_glob" | Yes | Invalid glob pattern in path_glob or not_glob | Check glob syntax: valid wildcards are * (any), ** (any directories), ? (any single char) |
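The "Query too long" limits in the table above can be checked client-side before a call is sent. The following is a minimal sketch, assuming the limits are exactly 400 characters and 50 words as listed; the helper names query_too_long and split_query are hypothetical and not part of any tool API.

```python
MAX_CHARS = 400  # limit taken from the "Query too long" row
MAX_WORDS = 50   # limit taken from the "Query too long" row


def query_too_long(query: str) -> bool:
    """Return True if the query exceeds either documented limit."""
    return len(query) > MAX_CHARS or len(query.split()) > MAX_WORDS


def split_query(query: str) -> list[str]:
    """Greedily split a query into word-bounded chunks that fit both limits.

    A single word longer than MAX_CHARS is kept as its own chunk and
    would still need manual shortening.
    """
    chunks: list[str] = []
    current: list[str] = []
    for word in query.split():
        candidate = current + [word]
        over_words = len(candidate) > MAX_WORDS
        over_chars = len(" ".join(candidate)) > MAX_CHARS
        if current and (over_words or over_chars):
            chunks.append(" ".join(current))
            current = [word]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be sent as its own search, which matches the "split into multiple searches" recovery step.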
Silent Failure Watches:
- Empty results when expecting matches → check under path filter (may be excluding files)
- Results from wrong language → verify language parameter is set correctly
- Reranking disabled silently → check rerank_timeout_ms if you set custom timeout
- Wrong collection queried → session defaults may not match workspace (use set_session_defaults to "cd" into correct collection)
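The "Timeout during rerank" recovery can be automated as a single retry. This is a rough sketch, not the server's behavior: call_tool is a hypothetical callable of the form (tool name, params) -> response dict standing in for whatever MCP client you use, and the error is matched on the message text from the table, which is an assumption about the exact wording returned.

```python
from typing import Callable

# Hypothetical client hook: (tool name, params) -> response dict.
ToolCall = Callable[[str, dict], dict]


def search_with_rerank_fallback(call_tool: ToolCall, params: dict) -> dict:
    """Run repo_search; on a rerank timeout, retry once with reranking disabled."""
    response = call_tool("repo_search", params)
    error = str(response.get("error", ""))
    if response.get("ok") is False and "Timeout during rerank" in error:
        # Recovery step from the table: skip reranking rather than fail.
        retry_params = {**params, "rerank_enabled": False}
        response = call_tool("repo_search", retry_params)
    return response
```

The caller's params dict is copied rather than mutated, so later searches still rerank by default.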
Symbol Graph Tools (symbol_graph, batch_symbol_graph, graph_query)
| Error | Cause | Recovery Steps |
|-------|-------|-----------------|
| "Symbol not found" | Symbol doesn't exist, wrong name, or graph not indexed | 1. Verify exact symbol name using repo_search(symbol="...")<br>2. Check spelling and case sensitivity<br>3. If graph unavailable: use repo_search instead<br>4. For imported symbols: search with full module path |
| "Graph unavailable / not ready" | Memgraph backend not initialized (graph_query only) | Fall back to symbol_graph (always available)<br>graph_query requires SaaS/Enterprise plan with Neo4j/Memgraph |
| "Depth too high" | depth parameter exceeds max for this tool | Reduce depth: symbol_graph max 3, graph_query max 5+<br>For deeper chains, use multiple queries with results as input |
| "Timeout during graph traversal" | Graph query took too long | Reduce depth, reduce limit, or use smaller under path filter |
Silent Failure Watches:
- No callers found when method is clearly called → fuzzy fallback may have triggered, use repo_search to verify method exists
- Symbol seems undefined but code uses it → cross-module imports may not be resolved in graph yet
Context Answer (LLM-powered explanation)
| Error | Cause | Recovery Steps |
|-------|-------|-----------------|
| "Insufficient context" | Retrieved code wasn't enough to answer question | 1. Rephrase question more specifically<br>2. Use expand: true to generate query variations<br>3. Increase budget_tokens for deeper retrieval<br>4. Use repo_search first to verify code exists |
| "Timeout during retrieval or generation" | LLM generation or retrieval took >60s | Set rerank_enabled: false to skip reranking<br>Reduce budget_tokens for faster shallow retrieval<br>Ask simpler question requiring less context |
| "Budget exceeded" | Generated answer would use >budget_tokens | Increase budget_tokens or ask more focused question |
Memory Tools (memory_store, memory_find)
| Error | Cause | Recovery Steps |
|-------|-------|-----------------|
| "Memory not found" | No memories match query or metadata filters | 1. Broaden query (more general terms)<br>2. Remove metadata filters (topic, kind, priority_min)<br>3. Check if memories exist: memory_find(query="*")<br>4. Verify workspace/collection (memories are org-scoped) |
| "Storage failure" | Backend couldn't persist memory | Retry memory_store - likely transient<br>Check qdrant_status for cluster health |
| "Duplicate memory detected" (warning) | Similar memory already exists with higher priority | Review existing memories first: memory_find(query="...", topic="...")<br>Consolidate if same information |
Batch Tools (batch_search, batch_symbol_graph)
| Error | Cause | Recovery Steps |
|-------|-------|-----------------|
| "Too many searches" | searches / queries array > 10 items | Split into multiple batch calls (max 10 per call)<br>If independent: use sequential calls (lower token savings but more granular)<br>If dependent: must be sequential anyway |
| "Mixed error in batch" | Some queries succeeded, others failed | Check individual batch_results array for per-query ok: false<br>Failed queries return error details in batch_results[i].error<br>Successful queries still have results in batch_results[i].results |
| "Timeout on any query" | One query in the batch timed out | Set rerank_enabled: false in that query's override<br>Reduce limit for slow queries<br>Consider running that query separately |
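The "Too many searches" limit can be respected by chunking the list before calling a batch tool. A minimal sketch, assuming the cap is 10 items per call as stated in the table; chunk_searches is a hypothetical helper name.

```python
def chunk_searches(searches: list[dict], max_per_call: int = 10) -> list[list[dict]]:
    """Split a list of search specs into batches of at most max_per_call items."""
    if max_per_call < 1:
        raise ValueError("max_per_call must be at least 1")
    return [
        searches[i:i + max_per_call]
        for i in range(0, len(searches), max_per_call)
    ]
```

Each returned batch would then be passed as the searches array of one batch_search call.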
Cross-Repo & Discovery
| Error | Cause | Recovery Steps |
|-------|-------|-----------------|
| "No collections found" | No indexed repositories available OR discover mode="never" | 1. Run qdrant_list() manually to see available collections<br>2. Try discover: "always" in cross_repo_search<br>3. Verify workspace is indexed<br>4. If nothing indexed: use upload_service to index workspace |
| "Multiple ambiguous collections" | User query matched multiple repos but target unclear | Use target_repos: [...] to explicitly specify repos<br>OR use boundary_key to search with exact interface name<br>OR do two separate targeted searches |
| "Boundary key not found" | boundary_key doesn't exist in other repo | Verify boundary_key is exact string (routes, event names, type names)<br>May be named slightly differently in other repo (check similar names)<br>Try broader search instead of boundary tracing |
Logging and Diagnostics
When errors persist:
- Check cluster health: qdrant_status() shows point counts, last indexed time, scanned_points
- List available collections: qdrant_list() with include_status=true shows health per collection
- Check embedding stats: embedding_pipeline_stats() shows cache hit rate, dedup efficiency
- Verify auth: If authentication errors, check workspace/org identity matches request
- Review recent changes: If started failing recently, check change_history_for_path() for relevant commits
Multi-Repo Navigation (CRITICAL)
When multiple repositories are indexed, you MUST discover and explicitly target collections.
Discovery (Lazy — only when needed)
Don't discover at every session start. Trigger when: search returns no/irrelevant results, user asks a cross-repo question, or you're unsure which collection to target.
// qdrant_list — discover available collections
{}
Context Switching (Session Defaults = cd)
Treat set_session_defaults like cd — it scopes ALL subsequent searches:
// "cd" into backend repo — all searches now target this collection
// set_session_defaults
{"collection": "backend-api-abc123"}
// One-off peek at another repo (does NOT change session default)
// search (or repo_search)
{"query": "login form", "collection": "frontend-app-def456"}
For unified collections: use "repo": "*" or "repo": ["frontend", "backend"]
Cross-Repo Flow Tracing (Boundary-Driven)
NEVER search both repos with the same vague query. Find the interface boundary in Repo A, extract the hard key, then search Repo B with that specific key.
Pattern 1 — Interface Handshake (API/RPC):
// 1. Find client call in frontend
// search
{"query": "login API call", "collection": "frontend-col"}
// → Found: axios.post('/auth/v1/login', ...)
// 2. Search backend for that exact route
// search
{"query": "'/auth/v1/login'", "collection": "backend-col"}
Pattern 2 — Shared Contract (Types/Schemas):
// 1. Find type usage in consumer
// symbol_graph
{"symbol": "UserProfile", "query_type": "importers", "collection": "frontend-col"}
// 2. Find definition in source
// search
{"query": "interface UserProfile", "collection": "shared-lib-col"}
Pattern 3 — Event Relay (Pub/Sub):
// 1. Find producer → extract event name
// search
{"query": "publish event", "collection": "service-a-col"}
// → Found: bus.publish("USER_CREATED", payload)
// 2. Find consumer with exact event name
// search
{"query": "'USER_CREATED'", "collection": "service-b-col"}
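The "extract the hard key" step in Pattern 1 can be sketched as a small helper that pulls the first quoted route out of a snippet and builds the follow-up payload. The regular expression is an illustrative heuristic only (it matches quoted strings that start with a slash), and boundary_query_from_snippet is a hypothetical name, not a tool.

```python
import re
from typing import Optional

# Heuristic: a quoted string literal that begins with "/" is treated as a route.
ROUTE_RE = re.compile(r"""['"](/[A-Za-z0-9_\-/.:{}]*)['"]""")


def boundary_query_from_snippet(snippet: str, target_collection: str) -> Optional[dict]:
    """Build the follow-up search payload from the first route literal in a snippet."""
    match = ROUTE_RE.search(snippet)
    if match is None:
        return None
    route = match.group(1)
    # Quote the route so it is searched as an exact literal, as in step 2 above.
    return {"query": f"'{route}'", "collection": target_collection}
```

Event names or type names (Patterns 2 and 3) would need their own extraction rule; this sketch covers routes only.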
Automated Cross-Repo Search (PRIMARY for Multi-Repo)
cross_repo_search is the PRIMARY tool for multi-repo scenarios. Use it BEFORE manual qdrant_list + repo_search chains.
Discovery Modes:
| Mode | Behavior | When to Use |
|------|----------|-------------|
| "auto" (default) | Discovers only if results empty or no targeting | Normal usage |
| "always" | Always runs discovery before search | First search in session, exploring new codebase |
| "never" | Skips discovery, uses explicit collection | When you know exact collection, speed-critical |
// Search across all repos at once (auto-discovers collections)
// cross_repo_search
{"query": "authentication flow", "discover": "auto"}
// Target specific repos by name
// cross_repo_search
{"query": "login handler", "target_repos": ["frontend", "backend"]}
// Boundary tracing — auto-extracts routes/events/types from results
// cross_repo_search
{"query": "login submit", "trace_boundary": true}
// → Returns boundary_keys: ["/api/auth/login"] + trace_hint for next search
// Follow boundary key to another repo
// cross_repo_search
{"boundary_key": "/api/auth/login", "collection": "backend-col"}
Use cross_repo_search when you need breadth across repos. Use search (or repo_search) with explicit collection when you need depth in one repo.
Multi-Repo Anti-Patterns
- DON'T search both repos with the same vague query (noisy, confusing)
- DON'T assume the default collection is correct; verify with qdrant_list
- DON'T forget to "cd back" after cross-referencing another repo
- DO extract exact strings (route paths, event names, type names) as search anchors
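One way to avoid forgetting to "cd back" is to wrap the switch in a context manager. This is a sketch under stated assumptions: call_tool is a hypothetical (tool name, params) -> response dict hook for your MCP client, and the previous collection is tracked by the caller because nothing in this document says the server reports the current default.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

# Hypothetical client hook: (tool name, params) -> response dict.
ToolCall = Callable[[str, dict], dict]


@contextmanager
def use_collection(call_tool: ToolCall, collection: str, previous: str) -> Iterator[None]:
    """Temporarily "cd" into a collection, then restore the previous default."""
    call_tool("set_session_defaults", {"collection": collection})
    try:
        yield
    finally:
        # "cd back" even if a search inside the block raised.
        call_tool("set_session_defaults", {"collection": previous})
```

For a single lookup, passing collection directly on the search call is lighter, since it does not change the session default at all.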
Query Expansion
expand_query - Generate query variations for better recall:
{"query": "auth flow", "max_new": 2}
Output Formats
- json (default) - Structured output
- toon - Token-efficient compressed format
Set via output_format parameter.
Tool Aliases and Compatibility
Tool Aliases
These tools have alternate names that work identically:
| Primary Name | Alias(es) | When to Use | Note |
|--------------|-----------|------------|------|
| repo_search | code_search | Either name works identically | Both names are equivalent, use whichever is familiar |
| memory_store | (none) | Standard name | Part of memory server, no aliases |
| memory_find | (none) | Standard name | Part of memory server, no aliases |
| search | (none) | Standard name | Auto-routing search, no aliases |
| symbol_graph | (none) | Standard name | Direct symbol queries, no aliases |
Compatibility Wrappers
These wrappers provide backward compatibility for legacy clients by accepting alternate parameter names:
| Wrapper | Primary Tool | Alternate Parameter Names | When to Use |
|---------|--------------|--------------------------|-------------|
| repo_search_compat | repo_search | Accepts q, text (instead of query), top_k (instead of limit) | Legacy clients that don't support standard parameter names |
| context_answer_compat | context_answer | Accepts q, text (instead of query) | Legacy clients using old parameter names |
Preference: Use primary tools and standard parameter names whenever possible. Compat wrappers exist only for legacy client support and may have slower adoption of new features.
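A legacy client can also be migrated by renaming parameters before calling the primary tool. The mapping below comes from the repo_search_compat row; the precedence rule (a standard name wins over its alias, and the first alias wins over later ones) is an assumption made for this sketch, not documented wrapper behavior.

```python
# Alternate names accepted by repo_search_compat, per the table above.
ALIASES = {"q": "query", "text": "query", "top_k": "limit"}


def normalize_params(params: dict) -> dict:
    """Rename legacy parameter names to the standard ones."""
    normalized: dict = {}
    for key, value in params.items():
        standard = ALIASES.get(key, key)
        if key in ALIASES and standard in params:
            # Assumption: an explicit standard name takes precedence over its alias.
            continue
        # setdefault keeps the first alias seen if several map to one name.
        normalized.setdefault(standard, value)
    return normalized
```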
Cross-Server Tools
These tools are provided by separate MCP servers:
| Tool | Server | Purpose |
|------|--------|---------|
| memory_store | Memory server | Persist team knowledge for later retrieval |
| memory_find | Memory server | Search stored memories by similarity |
| All search/symbol tools | Context server | Primary code search and analysis |
All tools are transparently integrated into the unified search interface.
Best Practices
- Use search as your default tool - It auto-routes to the best specialized tool. Only use specific tools when you need precise control or features search doesn't handle (cross-repo, memory, admin).
- Prefer MCP over Read/grep for exploration - Use MCP tools (search, repo_search, symbol_graph, context_answer) for discovery and cross-file understanding. Narrow file/grep use is still fine for exact literal confirmation, exact path/line confirmation, or opening a file you already identified for editing.
- Use symbol_graph first for symbol relationships - It handles callers, callees, definitions, importers, subclasses, and base classes. Use graph_query only when available and you need deeper impact/dependency traversal.
- Start broad, then filter - Begin with search or a semantic query, add filters if too many results
- Use multi-query - Pass 2-3 query variations for better recall on complex searches
- Include snippets - Set include_snippet: true to see code context in results
- Store decisions - Use memory_store to save architectural decisions and context for later
- Check index health - Run qdrant_status if searches return unexpected results
- Use pattern_search for structural matching - When looking for code with similar control flow (retry loops, error handling), use pattern_search instead of repo_search (if enabled)
- Describe patterns in natural language - pattern_search understands "retry with backoff" just as well as actual code examples (if enabled)
- Fire independent searches in parallel - Call multiple search, repo_search, symbol_graph, etc. in the same message block for 2-3x speedup. Alternatively, use batch_search to run N repo_search calls in a single invocation with ~75% token savings
- Use TOON format for discovery - Set output_format: "toon" for 60-80% token reduction on exploratory queries
- Bootstrap sessions with defaults - Call set_session_defaults(output_format="toon", compact=true) early to avoid repeating params
- Two-phase search - Discovery first (limit=3, compact=true), then deep dive (limit=5-8, include_snippet=true) on targets
- Use fallback chains - If context_answer times out, fall back to search or repo_search + info_request(include_explanation=true)
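The two-phase search practice can be sketched as a cheap discovery call followed by one deep-dive call per distinct file found. Several things here are assumptions: call_tool is a hypothetical (tool name, params) -> response dict hook, and passing a single file path as the under filter is a guess at how that filter can be used to narrow the second phase.

```python
from typing import Callable

# Hypothetical client hook: (tool name, params) -> response dict.
ToolCall = Callable[[str, dict], dict]


def two_phase_search(call_tool: ToolCall, query: str, deep_limit: int = 5) -> list[dict]:
    """Discover candidate files cheaply, then fetch snippets for each target."""
    # Phase 1: small, compact result set to locate candidate files.
    discovery = call_tool("repo_search", {"query": query, "limit": 3, "compact": True})
    targets: list[str] = []
    for hit in discovery.get("results", []):
        path = hit.get("path")
        if path and path not in targets:
            targets.append(path)
    # Phase 2: deeper retrieval with snippets, scoped to each target.
    return [
        call_tool(
            "repo_search",
            {"query": query, "limit": deep_limit, "include_snippet": True, "under": path},
        )
        for path in targets
    ]
```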
Return Shapes Reference
Every tool returns a consistent envelope. Understanding the response structure helps you parse results correctly and detect errors.
Universal Response Envelope
All tools return at minimum:
{
"ok": boolean,
"error": "string (only if ok=false)"
}
- ok: true = success (may have zero results but no error)
- ok: false = error (details in error field)
Search Family Return Shape
Applies to: search, repo_search, batch_search, info_request, context_search
{
"ok": true,
"results": [
{
"score": 0.85, // Relevance score (0-1+, higher=better)
"path": "src/auth.py", // File path
"symbol": "authenticate", // Symbol name (optional)
"start_line": 42, // Start line number
"end_line": 67, // End line number
"snippet": "def authenticate...",// Code snippet (if include_snippet=true)
"language": "python" // Programming language
}
],
"total": 5, // Total results found
"used_rerank": true, // Whether reranking was applied
"execution_time_ms": 245 // Query execution time
}
Symbol Graph Return Shape
Applies to: symbol_graph, batch_symbol_graph, graph_query, batch_graph_query
{
"ok": true,
"results": [
{
"path": "src/api/handlers.py", // File path
"start_line": 142, // Start line
"end_line": 145, // End line
"symbol": "handle_login", // Symbol at this location
"symbol_path": "handlers.handle_login", // Qualified symbol
"language": "python", // Programming language
"snippet": "result = authenticate(username, password)", // Code snippet
"hop": 1 // For depth>1: which hop found this
}
],
"symbol": "authenticate", // Symbol queried
"query_type": "callers", // Type of query
"count": 12, // Total results
"depth": 1, // Traversal depth used
"used_graph": true, // Whether graph backend was used
"suggestions": [...] // Fuzzy matches if exact symbol not found
}
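For multi-hop traversals, grouping results by the hop field makes direct and indirect relationships easier to tell apart. A small sketch based on the shape above; treating a missing hop as 1 is an assumption, and group_by_hop is a hypothetical helper.

```python
from collections import defaultdict


def group_by_hop(response: dict) -> dict[int, list[str]]:
    """Map each hop number to the qualified symbols found at that hop."""
    groups: dict[int, list[str]] = defaultdict(list)
    for item in response.get("results", []):
        # Assumption: a missing "hop" means a depth-1 (direct) result.
        hop = item.get("hop", 1)
        # Fall back to the bare symbol name when no qualified path is present.
        groups[hop].append(item.get("symbol_path") or item.get("symbol", ""))
    return dict(sorted(groups.items()))
```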
Unified Search Return Shape
Applies to: search (auto-routing wrapper)
{
"ok": true,
"intent": "search", // Detected intent (search, qa, tests, config, symbols, etc.)
"confidence": 0.92, // Intent detection confidence (0-1)
"tool": "repo_search", // Tool used for routing
"result": { // Result from dispatched tool
"results": [...],
"total": 8,
"used_rerank": true,
"execution_time_ms": 245
},
"plan": ["detect_intent", "dispatch_repo_search"], // Steps taken
"execution_time_ms": 245 // Total time
}
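Because search nests the dispatched tool's payload under result while repo_search returns results at the top level, client code may want one accessor for both. A minimal sketch based only on the two shapes shown in this document; extract_hits is a hypothetical helper name.

```python
def extract_hits(response: dict) -> list[dict]:
    """Return result items from a search wrapper or a direct repo_search response."""
    if response.get("ok") is False:
        raise RuntimeError(response.get("error", "unknown error"))
    # The search wrapper nests the dispatched tool's payload under "result".
    payload = response.get("result", response)
    if not isinstance(payload, dict):
        return []
    return list(payload.get("results", []))
```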
Context Answer Return Shape
Applies to: context_answer
{
"ok": true,
"answer": "The authentication system validates tokens by first checking the JWT signature using the secret from config [1], then verifying expiration time [2]...", // LLM-generated answer with citations [1], [2]...
"citations": [
{
"id": 1, // Citation number
"path": "src/auth/jwt.py", // File path
"start_line": 45, // Start line
"end_line": 52, // End line
"snippet": "def verify_token(token):..." // Optional code snippet
}
// ... more citations
],
"query": ["How does authentication validate tokens"], // Original query
"used": {
"spans": 5, // Code spans retrieved
"tokens": 1842 // Tokens used for answer
}
}
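The [1], [2] markers in answer can be resolved back to file locations through the citations array. A short sketch, assuming markers always take the form of a bracketed integer matching a citation id; cited_sources is a hypothetical helper.

```python
import re


def cited_sources(response: dict) -> list[str]:
    """List "path:start-end" for each [n] marker in the answer, in order of first use."""
    by_id = {c["id"]: c for c in response.get("citations", [])}
    seen: set[int] = set()
    sources: list[str] = []
    for marker in re.findall(r"\[(\d+)\]", response.get("answer", "")):
        cid = int(marker)
        # Skip repeats and markers with no matching citation entry.
        if cid in by_id and cid not in seen:
            seen.add(cid)
            c = by_id[cid]
            sources.append(f'{c["path"]}:{c["start_line"]}-{c["end_line"]}')
    return sources
```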
Memory Tools Return Shape
Applies to: memory_store, memory_find
{
"ok": true,
"id": "abc123...", // Unique ID (memory_store only)
"message": "Successfully stored information", // Status message
"collection": "codebase", // Collection name
"vector": "bge-base-en-v1-5" // Embedding model used
}
memory_find results:
{
"ok": true,
"results": [
{
"id": "abc123...", // Memory ID
"information": "JWT tokens expire after 24h...", // Stored knowledge
"metadata": { // Structured metadata
"kind": "decision",
"topic": "auth",
"created_at": "2024-01-15T10:30:00Z",
"tags": ["security", "architecture"]
},
"score": 0.85, // Similarity score
"highlights": [...] // Query term matches in context
}
],
"total": 3,
"count": 3,
"query": "authentication decisions"
}
Error Response Shape
All tools on error:
{
"ok": false,
"error": "Collection not found", // Error message
"error_code": "COLLECTION_NOT_FOUND" // Optional error code
}
Or HTTP-level errors (504, 400, etc.) with structured response.
Batch Tool Return Shape
Applies to: batch_search, batch_symbol_graph, batch_graph_query
{
"ok": true,
"batch_results": [
{ /* result from search/query 0 */ },
{ /* result from search/query 1 */ },
{ /* result from search/query 2 */ }
],
"count": 3, // Number of results
"elapsed_ms": 123.4 // Total execution time
}
Each item in batch_results has the same schema as the individual tool (repo_search, symbol_graph, etc.).
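Since a batch can succeed overall while individual items fail, it helps to separate the two before using the results. A minimal sketch, assuming each failed item carries ok: false and an error field as in the universal envelope; partition_batch is a hypothetical helper.

```python
def partition_batch(response: dict) -> tuple[list[tuple[int, dict]], list[tuple[int, str]]]:
    """Split batch_results into (index, item) successes and (index, error) failures."""
    succeeded: list[tuple[int, dict]] = []
    failed: list[tuple[int, str]] = []
    for index, item in enumerate(response.get("batch_results", [])):
        if item.get("ok") is False:
            failed.append((index, str(item.get("error", "unknown error"))))
        else:
            succeeded.append((index, item))
    return succeeded, failed
```

Keeping the original index lets a caller retry only the failed queries, for example with rerank_enabled set to false.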
Admin Tools Return Shape
Applies to: qdrant_status, qdrant_list, set_session_defaults
{
"ok": true,
"collections": [
{
"name": "frontend-abc123",
"count": 1234, // Point count
"last_ingested_at": {
"unix": 1704067800,
"iso": "2024-01-01T10:30:00Z"
}
}
]
}
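The collections list can be scanned for entries with a zero point count, which the error tables treat as a sign of incomplete indexing. A small sketch based on the shape above; treating a missing count as zero is an assumption, and unindexed_collections is a hypothetical helper.

```python
def unindexed_collections(response: dict) -> list[str]:
    """Return names of collections whose point count is zero."""
    return [
        c["name"]
        for c in response.get("collections", [])
        # Assumption: a missing "count" is treated the same as zero points.
        if c.get("count", 0) == 0
    ]
```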