Code Semantic Search Skill

Overview

Semantic code search using Phase 1 vector embeddings and Phase 2 hybrid search (semantic + structural). Find code by meaning, not just keywords.

Core principle: Search code by what it does, not what it's called.

Phase 2: Hybrid Search

This skill now supports three search modes:

Hybrid (Default):
- Combines semantic + structural search
- Best accuracy for general queries (95%)
- Slower than the single modes, but still <150ms
- Recommended for general searches

Semantic-Only:
- Uses only Phase 1 semantic vectors
- Fastest (<50ms)
- Good for conceptual searches
- Use when structure doesn't matter

Structural-Only:
- Uses only ast-grep patterns
- Precise for exact matches
- Best for finding function/class definitions
- Use when you need exact structural patterns
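
To make "combines semantic + structural search" more concrete, here is a minimal sketch of one common way to merge two ranked result lists: reciprocal rank fusion. It is an illustration only and is not taken from hybrid-search.cjs. The function name `fuseRankings`, the constant `k = 60`, and the sample file paths are all assumptions.

```javascript
// Hypothetical sketch: merge two ranked result lists with reciprocal rank fusion.
// This is NOT the skill's actual ranking code; names and paths are illustrative.
function fuseRankings(semanticIds, structuralIds, k = 60) {
  const scores = new Map();
  for (const list of [semanticIds, structuralIds]) {
    list.forEach((id, index) => {
      // Ranks are 1-based; a high rank in either list contributes more score.
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, highest first, and return only the ids.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const semantic = ['auth/login.js', 'auth/token.js', 'db/users.js'];
const structural = ['auth/token.js', 'auth/session.js'];
// Resulting order: auth/token.js, auth/login.js, auth/session.js, db/users.js
console.log(fuseRankings(semantic, structural));
```

A result that appears in both lists (here `auth/token.js`) rises to the top, which matches the intuition that agreement between the semantic and structural passes is a strong relevance signal.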

Performance Comparison

| Mode            | Speed  | Accuracy | Best For          |
| --------------- | ------ | -------- | ----------------- |
| Hybrid          | <150ms | 95%      | General search    |
| Semantic-only   | <50ms  | 85%      | Concepts          |
| Structural-only | <50ms  | 100%     | Exact patterns    |
| Phase 1 only    | <50ms  | 80%      | Legacy (fallback) |

When to Use

Use for:

- Finding authentication logic without knowing function names
- Searching for error handling patterns
- Locating database queries
- Finding code related to a concept
- Discovering implementation patterns

Don't use for:

- Exact text matching (use Grep instead)
- File name searches (use Glob instead)
- Simple keyword searches (use ripgrep instead)
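
The split above can be sketched as a small routing function. This is a rough heuristic under stated assumptions, not part of the skill: `chooseSearchTool` is a hypothetical name, and the rules (wildcards or a bare file name mean Glob, quoted text means Grep, one or two bare words mean ripgrep) are illustrative guesses at how a caller might decide.

```javascript
// Hypothetical heuristic for picking a search tool; the rules are assumptions.
function chooseSearchTool(query) {
  const trimmed = query.trim();
  const words = trimmed.split(/\s+/).filter(Boolean);

  // Wildcards, or a single token ending in a file extension: file name search.
  const looksLikeFile = words.length === 1 && /\.[a-z0-9]+$/i.test(trimmed);
  if (/\*/.test(trimmed) || looksLikeFile) return 'Glob';

  // Text wrapped in matching quotes: the caller wants an exact string.
  if (/^(["']).*\1$/.test(trimmed)) return 'Grep';

  // One or two bare words: a simple keyword lookup.
  if (words.length <= 2) return 'ripgrep';

  // Anything longer is treated as a description of behavior.
  return 'code-semantic-search';
}

console.log(chooseSearchTool('**/*.test.js'));
console.log(chooseSearchTool('"TODO: remove"'));
console.log(chooseSearchTool('authenticate'));
console.log(chooseSearchTool('where are login tokens validated'));
```

The four calls route to Glob, Grep, ripgrep, and code-semantic-search respectively. Real queries are messier than this, so treat the thresholds as a starting point.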

Usage Examples

Hybrid Search (Recommended)

// Basic hybrid search
Skill({ skill: 'code-semantic-search', args: 'find authentication logic' });

// With options
Skill({
  skill: 'code-semantic-search',
  args: 'database queries',
  options: {
    mode: 'hybrid',
    language: 'javascript',
    limit: 10,
  },
});

Semantic-Only Search

// Fast conceptual search
Skill({
  skill: 'code-semantic-search',
  args: 'find authentication',
  options: { mode: 'semantic-only' },
});

Structural-Only Search

// Exact pattern matching
Skill({
  skill: 'code-semantic-search',
  args: 'find function authenticate',
  options: { mode: 'structural-only' },
});

Implementation Reference

Hybrid Search: .claude/lib/code-indexing/hybrid-search.cjs

Query Analysis: .claude/lib/code-indexing/query-analyzer.cjs

Result Ranking: .claude/lib/code-indexing/result-ranker.cjs

Integration Points

developer: Code exploration, implementation discovery
architect: System understanding, pattern analysis
code-reviewer: Finding similar patterns, consistency checks
reverse-engineer: Understanding unfamiliar codebases
researcher: Research existing implementations

Iron Laws

ALWAYS use hybrid mode (semantic + structural) for general searches: semantic-only misses exact matches and structural-only misses conceptual variants. Hybrid reaches about 95% accuracy on general queries versus 85% for semantic-only.
ALWAYS use ripgrep/keyword search first for fast keyword discovery: semantic search is for meaning-based queries, while exact strings, function names, and filenames are found faster with ripgrep.
NEVER use semantic search without a meaningful natural-language query: single-word or code-syntax queries produce poor semantic results. Describe what the code does, not what it's called.
ALWAYS combine with code-structural-search for precision refinement: start broad with semantic discovery, then use ast-grep patterns to find exact structural matches within the semantic results.
NEVER discard low-similarity results without checking them: similarity scores are approximations, and a result scored 0.7 may be more relevant than one scored 0.9 for uncommon patterns.
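
The rule about meaningful natural-language queries can be checked before a search is issued. The sketch below is one possible pre-flight check, assuming a hypothetical helper named `validateSemanticQuery`. The two-word minimum and the list of characters treated as code syntax are assumptions chosen for illustration.

```javascript
// Hypothetical pre-flight check for semantic queries; thresholds are assumptions.
function validateSemanticQuery(query) {
  const words = query.trim().split(/\s+/).filter(Boolean);

  // Empty or single-word queries are too vague for semantic matching.
  if (words.length < 2) {
    return { ok: false, reason: 'too short: describe what the code does' };
  }

  // Brackets, braces, semicolons, arrows, or equality operators suggest code.
  if (/[(){};]|=>|==/.test(query)) {
    return { ok: false, reason: 'looks like code: use structural search or ripgrep' };
  }

  return { ok: true, reason: '' };
}

console.log(validateSemanticQuery('auth').ok);
console.log(validateSemanticQuery('function authenticate() {').ok);
console.log(validateSemanticQuery('authentication token validation logic').ok);
```

The first two queries are rejected and the third is accepted. A rejected query is a prompt to rephrase or to switch tools, not an error.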

Anti-Patterns

| Anti-Pattern                                          | Why It Fails                                       | Correct Approach                                                            |
| ----------------------------------------------------- | -------------------------------------------------- | --------------------------------------------------------------------------- |
| Semantic search for exact string matching             | Slower and less accurate than text search          | Use ripgrep/Grep for exact keyword matching                                 |
| Single-word queries ("auth")                          | Too vague for semantic matching; returns noise     | Use natural-language descriptions ("authentication token validation logic") |
| Using semantic-only mode for general searches         | Misses structural variants; 85% vs 95% accuracy    | Use hybrid mode (default) for general queries                               |
| Ignoring search results that don't match expectations | Semantic results find surprising-but-relevant code | Read all results; unexpected matches are often the most valuable            |
| Not combining with structural search                  | Finds concepts but not exact patterns              | Use semantic for discovery → structural for precision                       |

Memory Protocol (MANDATORY)

Before starting: Read .claude/context/memory/learnings.md

After completing:

New pattern -> .claude/context/memory/learnings.md
Issue found -> .claude/context/memory/issues.md
Decision made -> .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.
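
The "after completing" step could be scripted along the following lines. Only the three file names and the memory directory come from the protocol above. The helper name `recordMemory`, the `kind` keys, and the dated-bullet note format are assumptions, so adapt them to whatever format the memory files already use.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Map each kind of note to its memory file, as listed in the protocol.
const MEMORY_FILES = {
  pattern: 'learnings.md',
  issue: 'issues.md',
  decision: 'decisions.md',
};

// Hypothetical helper: append a dated bullet to the matching memory file.
function recordMemory(kind, text, memoryDir = '.claude/context/memory') {
  const fileName = MEMORY_FILES[kind];
  if (!fileName) throw new Error(`Unknown memory kind: ${kind}`);

  // Create the directory if it is missing so the append cannot fail on it.
  fs.mkdirSync(memoryDir, { recursive: true });

  const filePath = path.join(memoryDir, fileName);
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  fs.appendFileSync(filePath, `- ${date}: ${text}\n`);
  return filePath;
}

// Example call (commented out so that loading this file writes nothing):
// recordMemory('pattern', 'Hybrid mode found the retry logic that grep missed');
```

Appending rather than rewriting keeps earlier notes intact if a session is interrupted partway through.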
