# Agent Brain Expert Skill
Expert-level skill for Agent Brain document search with five modes: BM25 (keyword), Vector (semantic), Hybrid (fusion), Graph (knowledge graph), and Multi (comprehensive fusion).
## Contents
- Search Modes
- Mode Selection Guide
- GraphRAG (Knowledge Graph)
- Indexing & Folder Management
- Content Injection
- Job Queue Management
- Server Management
- Cache Management
- When Not to Use
- Best Practices
- Reference Documentation
## Search Modes
| Mode | Speed | Best For | Example Query |
|------|-------|----------|---------------|
| bm25 | Fast (10-50ms) | Technical terms, function names, error codes | "AuthenticationError" |
| vector | Slower (800-1500ms) | Concepts, explanations, natural language | "how authentication works" |
| hybrid | Slower (1000-1800ms) | Comprehensive results combining both | "OAuth implementation guide" |
| graph | Medium (500-1200ms) | Relationships, dependencies, call chains | "what calls AuthService" |
| multi | Slowest (1500-2500ms) | Most comprehensive with entity context | "complete auth flow with dependencies" |
### Mode Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| --mode | hybrid | Search mode: bm25, vector, hybrid, graph, multi |
| --threshold | 0.3 | Minimum similarity (0.0-1.0) |
| --top-k | 5 | Number of results |
| --alpha | 0.5 | Hybrid balance (0=BM25, 1=Vector) |
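The exact fusion used by `hybrid` mode is not specified here; a plausible sketch, assuming a simple linear blend of normalized per-document scores controlled by `--alpha`, filtered by `--threshold`, and truncated to `--top-k` (the `bm25_scores`/`vector_scores` inputs are illustrative, not the tool's internals):

```python
def fuse_hybrid(bm25_scores: dict, vector_scores: dict,
                alpha: float = 0.5, threshold: float = 0.3, top_k: int = 5):
    """Blend per-document BM25 and vector scores: alpha=0 is pure BM25,
    alpha=1 is pure vector, mirroring the --alpha flag above."""
    docs = set(bm25_scores) | set(vector_scores)
    fused = {
        d: (1 - alpha) * bm25_scores.get(d, 0.0) + alpha * vector_scores.get(d, 0.0)
        for d in docs
    }
    # Drop results below --threshold, then keep the --top-k best.
    kept = [(d, s) for d, s in fused.items() if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

With `alpha=0.5`, a document scoring 0.2 on keywords and 0.9 semantically (0.55 fused) outranks one scoring 1.0 on keywords alone (0.5 fused), which is why keyword-heavy corpora benefit from lowering alpha.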
## Mode Selection Guide
### Use BM25 When

Searching for exact technical terms:

```shell
agent-brain query "recursiveCharacterTextSplitter" --mode bm25
agent-brain query "ValueError: invalid token" --mode bm25
agent-brain query "def process_payment" --mode bm25
```

**Counter-example - wrong mode choice:**

```shell
# BM25 is wrong for conceptual queries
agent-brain query "how does error handling work" --mode bm25    # Wrong
agent-brain query "how does error handling work" --mode vector  # Correct
```
### Use Vector When

Searching for concepts or natural language:

```shell
agent-brain query "best practices for error handling" --mode vector
agent-brain query "how to implement caching" --mode vector
```

**Counter-example - wrong mode choice:**

```shell
# Vector is wrong for exact function names
agent-brain query "getUserById" --mode vector  # Wrong - may miss exact match
agent-brain query "getUserById" --mode bm25    # Correct - finds exact match
```
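The bm25-vs-vector split above can be approximated with a rough heuristic: code-like tokens (snake_case, camelCase, PascalCase, error names) signal an exact-term query. This is a sketch of that rule of thumb, not part of the tool, and it only covers the bm25/vector choice:

```python
import re

def pick_mode(query: str) -> str:
    """Route exact-looking identifiers to bm25 and natural language to vector."""
    code_like = re.compile(
        r"[a-z0-9]+_[a-z0-9_]+"    # snake_case: process_payment
        r"|\b[a-z]+[A-Z]\w*"       # camelCase:  getUserById
        r"|\b[A-Z][a-z]+[A-Z]\w*"  # PascalCase: AuthenticationError
        r"|Error\b|def |class "    # error names and code keywords
    )
    return "bm25" if code_like.search(query) else "vector"
```

Relationship-style questions ("what calls X") should still go to graph mode; this heuristic deliberately ignores that case.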
### Use Hybrid When

Need comprehensive results (default mode):

```shell
agent-brain query "OAuth implementation" --mode hybrid --alpha 0.6
agent-brain query "database connection pooling" --mode hybrid
```

Alpha tuning:

- `--alpha 0.3` - more keyword weight (technical docs)
- `--alpha 0.7` - more semantic weight (conceptual docs)
### Use Graph When

Exploring relationships and dependencies:

```shell
agent-brain query "what functions call process_payment" --mode graph
agent-brain query "classes that inherit from BaseService" --mode graph --traversal-depth 3
agent-brain query "modules that import authentication" --mode graph
```

**Prerequisite:** Requires `ENABLE_GRAPH_INDEX=true` during server startup.
### Use Multi When

Need the most comprehensive results:

```shell
agent-brain query "complete payment flow implementation" --mode multi --include-relationships
```
## GraphRAG (Knowledge Graph)
GraphRAG enables relationship-aware retrieval by building a knowledge graph from indexed documents.
### Enabling GraphRAG

```shell
export ENABLE_GRAPH_INDEX=true
agent-brain start
```
### Graph Query Types
| Query Pattern | Example |
|---------------|---------|
| Function callers | "what calls process_payment" |
| Class inheritance | "classes extending BaseController" |
| Import dependencies | "modules importing auth" |
| Data flow | "where does user_id come from" |
See Graph Search Guide for detailed usage.
## Indexing & Folder Management
### Indexing with File Type Presets

```shell
# Index only Python files
agent-brain index ./src --include-type python

# Index Python and documentation
agent-brain index ./project --include-type python,docs

# Index all code files
agent-brain index ./repo --include-type code

# Force full re-index (bypass incremental)
agent-brain index ./docs --force
```

Use `agent-brain types list` to see all 14 available presets.
### Folder Management

```shell
agent-brain folders list                             # List indexed folders with chunk counts
agent-brain folders add ./docs                       # Add folder (triggers indexing)
agent-brain folders add ./src --include-type python  # Add with preset filter
agent-brain folders remove ./old-docs --yes          # Remove folder and evict chunks
```
### Incremental Indexing

Re-indexing a folder automatically detects changes:

- Unchanged files are skipped (mtime + SHA-256 checksum)
- Changed files have old chunks evicted and new ones created
- Deleted files have their chunks automatically removed
- Use `--force` to bypass the manifest and fully re-index
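The mtime-plus-checksum detection described above can be sketched as follows. The manifest shape (`{path: {"mtime": ..., "sha256": ...}}`) is an illustrative assumption, not the tool's actual on-disk format:

```python
import hashlib
from pathlib import Path

def needs_reindex(path: Path, manifest: dict) -> bool:
    """Return True if `path` should be re-chunked. The manifest maps each
    file path to its last-seen mtime and SHA-256 (assumed shape)."""
    entry = manifest.get(str(path))
    stat = path.stat()
    if entry and entry["mtime"] == stat.st_mtime:
        return False  # mtime unchanged: skip hashing entirely
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    changed = not (entry and entry["sha256"] == digest)
    # Record the current state either way, so the next pass is a fast skip.
    manifest[str(path)] = {"mtime": stat.st_mtime, "sha256": digest}
    return changed  # touched-but-identical files fall through to False
```

The mtime check is the cheap fast path; the checksum catches files whose timestamp changed without a content change (e.g. a `touch` or a checkout).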
## Content Injection
Enrich chunk metadata during indexing with custom Python scripts or static JSON metadata.
### When to Use
- Tag chunks with project/team/category metadata
- Classify chunks by content type
- Add custom fields for filtered search
- Merge folder-level metadata into all chunks
### Basic Usage

```shell
# Inject via Python script
agent-brain inject ./docs --script enrich.py

# Inject via static JSON metadata
agent-brain inject ./src --folder-metadata project-meta.json

# Validate script before indexing
agent-brain inject ./docs --script enrich.py --dry-run
```
### Injector Script Protocol

Scripts export a `process_chunk(chunk: dict) -> dict` function:

```python
def process_chunk(chunk: dict) -> dict:
    chunk["project"] = "my-project"
    chunk["team"] = "backend"
    return chunk
```

- Values must be scalars (str, int, float, bool)
- Per-chunk exceptions are logged as warnings, not fatal
- See `docs/INJECTOR_PROTOCOL.md` for the full specification
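A `--dry-run`-style check can be imitated locally: load the script, run `process_chunk` on a sample chunk, and verify every value is a scalar. The `exec`-based loader here is an assumption about how the tool might host scripts, used only for illustration:

```python
def validate_injector(script_source: str, sample_chunk: dict) -> list:
    """Run an injector script against one sample chunk and report
    protocol violations (non-dict result, non-scalar values)."""
    namespace = {}
    exec(script_source, namespace)  # assumed hosting model, not the tool's
    process_chunk = namespace["process_chunk"]
    result = process_chunk(dict(sample_chunk))
    if not isinstance(result, dict):
        return ["process_chunk must return a dict"]
    errors = []
    for key, value in result.items():
        if not isinstance(value, (str, int, float, bool)):
            errors.append(
                f"field {key!r}: values must be scalars, got {type(value).__name__}"
            )
    return errors
```

Catching a list- or dict-valued field before a full index run is exactly what `--dry-run` is for; per-chunk failures during a real run are only logged as warnings, so a systematically broken script can otherwise go unnoticed.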
## Job Queue Management
Indexing runs asynchronously via a job queue. Monitor and manage jobs:
```shell
agent-brain jobs                    # List all jobs
agent-brain jobs --watch            # Live polling every 3s
agent-brain jobs <job_id>           # Job details + eviction summary
agent-brain jobs <job_id> --cancel  # Cancel a job
```
### Eviction Summary

When re-indexing, job details show what changed:

```
Eviction Summary:
  Files added:     3
  Files changed:   2
  Files deleted:   1
  Files unchanged: 42
  Chunks evicted:  15
  Chunks created:  25
```
This confirms incremental indexing is working efficiently.
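The file counters in the summary above amount to a diff of two manifest snapshots. A sketch, assuming each manifest maps file paths to content hashes:

```python
def eviction_summary(old: dict, new: dict) -> dict:
    """Diff two {path: sha256} manifests into the file counters shown above."""
    added = new.keys() - old.keys()
    deleted = old.keys() - new.keys()
    common = old.keys() & new.keys()
    changed = {p for p in common if old[p] != new[p]}
    return {
        "files_added": len(added),
        "files_changed": len(changed),
        "files_deleted": len(deleted),
        "files_unchanged": len(common) - len(changed),
    }
```

A healthy incremental run shows a large `files_unchanged` count relative to the others; if everything lands in `files_changed`, the manifest is probably being bypassed or invalidated.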
## Server Management
### Quick Start

```shell
agent-brain init            # Initialize project (first time)
agent-brain start           # Start server
agent-brain index ./docs    # Index documents
agent-brain query "search"  # Search
agent-brain stop            # Stop when done
```
**Progress Checklist:**

- [ ] `/agent-brain:agent-brain-init` succeeded
- [ ] `/agent-brain:agent-brain-status` shows healthy
- [ ] Document count > 0
- [ ] Query returns results (or "no matches" - not an error)
### Lifecycle Commands
| Command | Description |
|---------|-------------|
| /agent-brain:agent-brain-init | Initialize project config |
| /agent-brain:agent-brain-start | Start with auto-port |
| /agent-brain:agent-brain-status | Show port, mode, document count |
| /agent-brain:agent-brain-list | List all running instances |
| /agent-brain:agent-brain-stop | Graceful shutdown |
### Pre-Query Validation

Before querying, verify setup:

```shell
agent-brain status
```

Expected:

- Status: healthy
- Documents: > 0
- Provider: configured
**Counter-example - querying without validation:**

```shell
# Wrong - querying without checking status
agent-brain query "search term"  # May fail if server not running

# Correct - validate first
agent-brain status && agent-brain query "search term"
```
See Server Discovery Guide for multi-instance details.
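Server discovery via `runtime.json` (recommended in Best Practices instead of assuming port 8000) might be scripted as below. Both the file location and the `port` field are assumptions about the runtime file's schema, used only to illustrate the pattern:

```python
import json
from pathlib import Path

def discover_base_url(runtime_path: str = ".agent-brain/runtime.json") -> str:
    """Read the server port from runtime.json rather than hardcoding 8000.
    Path and schema here are illustrative assumptions."""
    p = Path(runtime_path)
    if not p.exists():
        raise RuntimeError("runtime.json not found - is the server running?")
    runtime = json.loads(p.read_text())
    return f"http://127.0.0.1:{runtime['port']}"
```

Treating a missing runtime file as "server not running" doubles as the pre-query validation step: fail fast instead of issuing queries against a dead port.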
## Cache Management
The embedding cache automatically stores computed embeddings to avoid redundant API calls during reindexing. No setup is required — the cache is active by default.
### When to Check Cache Status
- After indexing — verify cache is working and hit rate is growing
- When queries seem slow — a low or zero hit rate means embeddings are being recomputed on every reindex
- To monitor cache growth — track disk usage over time for large indexes
```shell
agent-brain cache status
```
A healthy cache shows:
- Hit rate > 80% after the first full reindex cycle
- Growing disk entries over time as more content is indexed
- Low misses relative to hits
### When to Clear the Cache
- After changing embedding provider or model — prevents dimension mismatches and stale cached vectors
- Suspected cache corruption — if embeddings seem incorrect or search quality degrades unexpectedly
- To force fresh embeddings — when you need to ensure all vectors reflect the current provider/model
```shell
# Clear with confirmation prompt
agent-brain cache clear

# Clear without prompt (use in scripts)
agent-brain cache clear --yes
```
### Cache is Automatic
No configuration is required. Embeddings are cached on first compute and reused on subsequent reindexes of unchanged content (identified by SHA-256 hash). The cache complements the ManifestTracker — files that haven't changed on disk won't need to recompute embeddings.
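The content-hash keying described above can be sketched as a tiny in-memory cache. The `embed_fn` callback and the dict store are stand-ins for the real provider call and on-disk storage, not the tool's internals:

```python
import hashlib

class EmbeddingCache:
    """Cache embeddings keyed by SHA-256 of the chunk text, so unchanged
    content never triggers a recompute. In-memory stand-in for illustration."""
    def __init__(self, embed_fn):
        self._embed = embed_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1           # seen before: reuse, no API call
        else:
            self.misses += 1
            self._store[key] = self._embed(text)  # first sight: compute once
        return self._store[key]
```

This is also why the hit rate should exceed ~80% after the first full reindex cycle: on the second pass, every unchanged chunk hashes to an existing key.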
See the API Reference for `GET /index/cache` and `DELETE /index/cache` endpoint details, including response schemas.
## When Not to Use

This skill focuses on searching and querying. Do NOT use it for:

- Installation - use the `configuring-agent-brain` skill
- API key configuration - use the `configuring-agent-brain` skill
- Server setup issues - use the `configuring-agent-brain` skill
- Provider configuration - use the `configuring-agent-brain` skill

**Scope boundary:** This skill assumes Agent Brain is already installed and configured, and that the server is running with indexed documents.
## Best Practices

- **Mode Selection**: BM25 for exact terms, Vector for concepts, Hybrid for comprehensive results, Graph for relationships
- **Threshold Tuning**: Start at 0.7; lower to 0.3-0.5 for more results
- **Server Discovery**: Use `runtime.json` rather than assuming port 8000
- **Resource Cleanup**: Run `agent-brain stop` when done
- **Source Citation**: Always reference source filenames in responses
- **Graph Queries**: Use graph mode for "what calls X" and "what imports Y" patterns
- **Traversal Depth**: Start with depth 2; increase to 3-4 for deeper chains
- **File Type Presets**: Use `--include-type python,docs` instead of manual glob patterns
- **Incremental Indexing**: Re-index without `--force` for efficient updates
- **Injection Validation**: Always `--dry-run` injector scripts before full indexing
- **Job Monitoring**: Use `agent-brain jobs --watch` for long-running index jobs
## Reference Documentation

| Guide | Description |
|-------|-------------|
| BM25 Search | Keyword matching for technical queries |
| Vector Search | Semantic similarity for concepts |
| Hybrid Search | Combined keyword and semantic search |
| Graph Search | Knowledge graph and relationship queries |
| Server Discovery | Auto-discovery, multi-agent sharing |
| Provider Configuration | Environment variables and API keys |
| Integration Guide | Scripts, Python API, CI/CD patterns |
| API Reference | REST endpoint documentation |
| Troubleshooting | Common issues and solutions |
## Limitations
- Vector, hybrid, graph, and multi modes require a configured embedding provider
- Graph mode requires additional memory (~500MB extra)
- Supported formats: Markdown, PDF, plain text, code files (Python, JS, TS, Java, Go, Rust, C, C++)
- Not supported: Word docs (.docx), images
- Server requires ~500MB RAM for typical collections (~1GB with graph)
- Ollama requires local installation and model download