Cursor Codebase Indexing Skill

Set up and optimize Cursor's codebase indexing system. Indexing creates embeddings of your code, enabling @Codebase semantic search and improving AI context awareness across Chat, Composer, and Agent mode.

How Indexing Works

Your Code Files
      │
      ▼
  Syntax Chunking ─── splits files into meaningful code blocks
      │
      ▼
  Embedding Generation ─── converts chunks to vector representations
      │
      ▼
  Vector Storage (Turbopuffer) ─── cloud-hosted nearest-neighbor search
      │
      ▼
  @Codebase Query ─── your question → embedding → similarity search → relevant chunks
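
To make the chunking step concrete, here is a minimal sketch that splits a Python file into one chunk per top-level function or class using the standard `ast` module. It only illustrates the idea of syntax-aware chunking: Cursor's actual chunker is not documented here, and the name `chunk_python_source` and the chunk fields are hypothetical.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class.

    Illustrative only: this is NOT Cursor's chunker, just a sketch of
    splitting on syntactic boundaries instead of fixed line counts.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                {"name": node.name, "start": node.lineno, "end": node.end_lineno, "text": text}
            )
    return chunks


sample = '''import os

def login(user, password):
    return check(user, password)

class Session:
    def refresh(self):
        pass
'''

chunks = chunk_python_source(sample)
for chunk in chunks:
    print(chunk["name"], chunk["start"], chunk["end"])
```

Each chunk keeps its line range, so a search hit can point back to a specific block rather than a whole file.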

Key Architecture Details

Merkle tree for change detection: Cursor checks for changes about every 10 minutes and re-indexes only the files that were modified
No plaintext storage: code is not stored server-side; only embeddings and obfuscated metadata are kept
Privacy Mode compatible: with Privacy Mode on, embeddings are computed without retaining source code
Background indexing: small projects complete in seconds; large projects (50K+ files) may take hours for the initial index
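
The change-detection idea can be sketched with a small two-level hash tree: file hashes roll up into directory hashes, which roll up into a root hash. If the root matches, nothing changed; if a directory hash matches, its files are skipped. This is a simplified illustration under assumed names (`build_tree`, `changed_files`), not Cursor's implementation, and it ignores deleted files.

```python
import hashlib


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(files: dict[str, bytes]) -> dict:
    """Build a two-level hash tree: root -> directories -> files."""
    dirs: dict[str, dict[str, str]] = {}
    for path, content in files.items():
        directory, _, name = path.rpartition("/")
        dirs.setdefault(directory, {})[name] = sha(content)
    # A directory hash is derived from the sorted (name, hash) pairs of its files.
    dir_hashes = {
        d: sha("".join(f"{n}:{h}" for n, h in sorted(children.items())).encode())
        for d, children in dirs.items()
    }
    root = sha("".join(f"{d}:{h}" for d, h in sorted(dir_hashes.items())).encode())
    return {"root": root, "dir_hashes": dir_hashes, "files": dirs}


def changed_files(old: dict, new: dict) -> list[str]:
    """Return paths that are new or modified in `new` (deletions are not reported)."""
    if old["root"] == new["root"]:
        return []  # identical trees, nothing to re-index
    changed = []
    for d, dir_hash in new["dir_hashes"].items():
        if old["dir_hashes"].get(d) == dir_hash:
            continue  # whole directory unchanged, skip its files
        old_children = old["files"].get(d, {})
        for name, file_hash in new["files"][d].items():
            if old_children.get(name) != file_hash:
                changed.append(f"{d}/{name}" if d else name)
    return sorted(changed)


before = {"src/auth.py": b"v1", "src/db.py": b"v1", "docs/readme.md": b"v1"}
after = dict(before, **{"src/auth.py": b"v2"})
print(changed_files(build_tree(before), build_tree(after)))
```

Only `src/auth.py` is reported, and the `docs` directory is never inspected file by file because its directory hash is unchanged.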

Initial Setup

Open your project in Cursor
Indexing starts automatically on first open
Check status: look at the bottom status bar for the "Indexing..." indicator
View indexed files: Cursor Settings > Features > Codebase Indexing > View included files

Verify Indexing Status

The status bar shows:

"Indexing..." with progress indicator -- initial indexing in progress
"Indexed" -- indexing complete, @Codebase queries are available
No indicator -- indexing may be disabled or not started

Configuration

.cursorignore

Exclude files from indexing and AI features. Place in project root. Uses .gitignore syntax:

# .cursorignore

# Build artifacts (large, not useful for AI context)
dist/
build/
out/
.next/
target/

# Dependencies
node_modules/
vendor/
venv/
.venv/

# Generated files
*.min.js
*.min.css
*.bundle.js
*.map
*.lock

# Large data files
*.csv
*.sql
*.sqlite
*.parquet
fixtures/
seed-data/

# Secrets (defense in depth -- also use .gitignore)
.env*
**/secrets/
**/credentials/

.cursorindexingignore

Exclude files from indexing only but keep them accessible to AI features when explicitly referenced:

# .cursorindexingignore

# Large test fixtures -- don't index, but allow @Files reference
tests/fixtures/
e2e/recordings/

# Documentation build output
docs/.vitepress/dist/

Difference: .cursorignore hides files from both indexing and AI features. .cursorindexingignore only excludes from the index; files can still be referenced via @Files.
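
The two scopes can be modelled as a small decision function. The sketch below is an assumption-laden illustration: the matcher supports only two kinds of gitignore pattern (directory patterns ending in `/` and simple file-name globs), and the names `matches`, `access`, `indexed`, and `at_files` are hypothetical, not part of Cursor.

```python
from fnmatch import fnmatchcase


def matches(path: str, patterns: list[str]) -> bool:
    """Simplified gitignore-style matching (directory patterns and name globs only)."""
    parts = path.split("/")
    parents, name = parts[:-1], parts[-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            directory = pattern.rstrip("/")
            if "/" in directory:
                # A pattern with an inner slash is anchored at the project root.
                if path.startswith(directory + "/"):
                    return True
            elif directory in parents:
                # A bare directory name matches at any depth.
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False


def access(path: str, cursorignore: list[str], cursorindexingignore: list[str]) -> dict:
    """Report whether a path is indexed and whether it can still be referenced."""
    if matches(path, cursorignore):
        return {"indexed": False, "at_files": False}
    if matches(path, cursorindexingignore):
        return {"indexed": False, "at_files": True}
    return {"indexed": True, "at_files": True}


cursorignore = ["dist/", ".env*"]
cursorindexingignore = ["tests/fixtures/"]

for path in ["src/app.py", "dist/app.js", ".env.local", "tests/fixtures/big.json"]:
    print(path, access(path, cursorignore, cursorindexingignore))
```

A file matched by both lists is treated as fully hidden here, since `.cursorignore` is the stricter of the two.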

Default Exclusions

Cursor automatically excludes everything in .gitignore. You only need .cursorignore for files tracked by git that you want to exclude from AI.

Using the Index

@Codebase Queries

Ask semantic questions about your entire codebase:

@Codebase where is user authentication handled?

@Codebase show me all API endpoints that accept file uploads

@Codebase how does the payment processing flow work?

@Codebase find all places where we connect to Redis

@Codebase performs a nearest-neighbor search using your question's embedding. It returns the most semantically similar code chunks, even if they do not contain the exact keywords you used.
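
A toy version of that search, assuming cosine similarity as the distance measure and hand-written three-dimensional vectors in place of real embeddings (both are assumptions for illustration; the chunk names are made up):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k chunk names whose vectors are most similar to the query."""
    ranked = sorted(index.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical chunk vectors; a real index would hold high-dimensional embeddings.
index = {
    "auth/login.py::verify_password": [0.9, 0.1, 0.0],
    "payments/stripe.py::charge": [0.1, 0.9, 0.1],
    "auth/session.py::refresh_token": [0.8, 0.2, 0.1],
}

# Stand-in for the embedding of "where is user authentication handled?"
query = [1.0, 0.0, 0.0]
print(top_k(query, index))
```

The two auth-related chunks rank first because their vectors point in nearly the same direction as the query, regardless of which words appear in the code.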

@Codebase vs @Files vs Text Search

| Method | When to Use | Context Cost |
|--------|------------|--------------|
| @Codebase | Discovery -- you don't know which files | High (many chunks) |
| @Files | You know exactly which file | Low (one file) |
| @Folders | You know the directory | Medium-High |
| Ctrl+Shift+F | Exact text/regex match | N/A (editor search) |

Use @Codebase for discovery, then switch to @Files once you know where the code lives.

Optimization for Large Projects

Monorepo Strategy

For monorepos with many packages, open the specific package directory instead of the root:

# Instead of opening the entire monorepo:
cursor /path/to/monorepo           # Indexes everything -- slow

# Open the specific package:
cursor /path/to/monorepo/packages/api   # Indexes only this package -- fast

Or use .cursorignore at the root to exclude packages you are not actively working on:

# .cursorignore -- monorepo, focus on api and shared
packages/web/
packages/mobile/
packages/admin/
# packages/api/    ← not listed, so it IS indexed
# packages/shared/ ← not listed, so it IS indexed
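
When the package list is long, the exclusion lines can be generated instead of typed by hand. The helper below is a hypothetical sketch (`cursorignore_for_focus` is not a Cursor feature) that assumes every package lives directly under a `packages/` directory; the demo builds a throwaway directory tree so it can run anywhere.

```python
import tempfile
from pathlib import Path


def cursorignore_for_focus(monorepo_root: str, focus: set[str], packages_dir: str = "packages") -> str:
    """Return .cursorignore lines that exclude every package not named in `focus`."""
    root = Path(monorepo_root) / packages_dir
    ignored = sorted(p.name for p in root.iterdir() if p.is_dir() and p.name not in focus)
    return "".join(f"{packages_dir}/{name}/\n" for name in ignored)


# Demo against a temporary monorepo layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["api", "shared", "web", "mobile", "admin"]:
        (Path(tmp) / "packages" / name).mkdir(parents=True)
    demo_output = cursorignore_for_focus(tmp, focus={"api", "shared"})

print(demo_output, end="")
```

Review the generated lines before writing them to the root `.cursorignore`, since anything listed there is hidden from AI features as well as from the index.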

Re-Indexing

If search results are stale or indexing appears stuck:

Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) > Cursor: Resync Index
Wait for status bar to show indexing progress
If that fails, delete the local cache:
- macOS: ~/Library/Application Support/Cursor/Cache/
- Linux: ~/.config/Cursor/Cache/
- Windows: %APPDATA%\Cursor\Cache\
Restart Cursor and allow full re-index

File Watcher Limits (Linux)

On Linux, large projects may hit the file watcher limit:

# Check current limit
cat /proc/sys/fs/inotify/max_user_watches

# Increase (temporary)
sudo sysctl fs.inotify.max_user_watches=524288

# Increase (permanent)
echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

Enterprise Considerations

Data residency: Embeddings are stored in Turbopuffer (cloud). Filenames are obfuscated and no plaintext code is stored, but metadata is retained
Privacy Mode: With Privacy Mode on, embeddings are computed with zero data retention at the provider
Air-gapped environments: Indexing requires network access to Cursor's embedding API. Not available offline
Indexing scope: Only files in the currently open workspace are indexed. Closing a project removes its index from active queries

Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| @Codebase returns no results | Index not built | Wait for "Indexed" in status bar |
| Search misses known files | File in .gitignore or .cursorignore | Check ignore files |
| Indexing stuck at N% | Large project or network issue | Resync index via Command Palette |
| Stale results after refactor | Index not yet updated | Wait 10 min or manual resync |
| High CPU during indexing | Initial embedding computation | Normal for first run; subsides |
