/utils/gemini - Large Context Processing
Skill Awareness: See skills/_registry.md for all available skills.
- Called by: /dev-scout for large codebases
- Purpose: Use Gemini Flash (1M context) for tasks Claude's context can't fit
- Note: Utility skill, typically called by other skills
Use Gemini Flash for large context tasks like scanning entire codebases.
Why Gemini?
| Model | Context | Best For | Cost |
|-------|---------|----------|------|
| Claude | 200K | Reasoning, writing, precision | Higher |
| Gemini Flash | 1M+ | Scanning, summarization, bulk reading | Lower |
Use case: A codebase has 500+ files, more than fits in Claude's context. Gemini scans and summarizes it, and Claude works from the summary.
Setup
Prerequisites
- Google AI API Key
  - Go to: https://aistudio.google.com/app/apikey
  - Create an API key
  - Copy the key
- Set Environment Variable
    # Add to ~/.bashrc or ~/.zshrc
    export GEMINI_API_KEY="your-api-key-here"
    # Or create .env in project root
    echo "GEMINI_API_KEY=your-api-key-here" >> .env
- Install Python Dependencies
    pip install google-generativeai
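As a rough illustration of how a script could pick up the key from either location described above, here is a minimal sketch. The function name `load_api_key` and the simple `KEY=value` parsing are assumptions for illustration; the actual gemini-scan.py may resolve the key differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    env_file: str = ".env",
) -> Optional[str]:
    """Return GEMINI_API_KEY from the environment, else from a .env file."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if not path.is_file():
        return None

    # Minimal KEY=value parsing (hypothetical; not a full dotenv parser).
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            # Strip optional surrounding quotes from the value.
            return value.strip().strip("\"'") or None
    return None
```

The environment variable takes precedence here, so a key exported in the shell overrides one stored in the project's .env file.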
Quick Setup Script
Run from this skill's folder:
# Interactive setup
./scripts/setup.sh
This will:
- Check for existing API key
- Guide you to get one if missing
- Test the connection
- Verify everything works
Usage
Direct Script Usage
# Scan entire codebase
python scripts/gemini-scan.py --path /path/to/project --output summary.md
# Scan specific directory
python scripts/gemini-scan.py --path /path/to/project/src --output src-summary.md
# Custom prompt
python scripts/gemini-scan.py --path /path/to/project --prompt "List all API endpoints" --output apis.md
From Claude Code
When /dev-scout detects a large codebase:
1. Scout detects 500+ files
→ "Large codebase. Using Gemini for initial scan."
2. Run Gemini scan
→ Bash: python skills/utils/gemini/scripts/gemini-scan.py \
--path . \
--output plans/scout/gemini-summary.md
3. Read Gemini output
→ Read: plans/scout/gemini-summary.md
4. Claude refines and creates final scout.md
→ Uses Gemini summary as input
→ Adds analysis, recommendations
→ Creates structured scout output
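The detection in step 1 comes down to a file count compared against the 500-file threshold. A minimal sketch follows; the function names and the list of skipped directories are illustrative assumptions, not what /dev-scout necessarily does.

```python
import os

# Directories skipped when counting; an illustrative list, not the skill's own.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist", "build"}


def count_files(root: str) -> int:
    """Count regular files under root, pruning common generated directories."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total += len(filenames)
    return total


def should_use_gemini(root: str, threshold: int = 500) -> bool:
    """True when the file count reaches the threshold from the workflow above."""
    return count_files(root) >= threshold
```

Pruning generated directories before counting keeps dependency folders from pushing a small project over the threshold.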
Script: gemini-scan.py
Located at: scripts/gemini-scan.py
Features
- Scans directory recursively
- Respects .gitignore
- Outputs structured markdown
- Configurable file types
- Progress indicator
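To make the recursive scan and file-type filtering concrete, here is a simplified sketch. It is not the script's actual code: `collect_files`, `DEFAULT_EXTENSIONS`, and the default ignore patterns are hypothetical, and it matches plain glob patterns rather than parsing .gitignore.

```python
import fnmatch
import os
from typing import Iterable, List

# Hypothetical default; the script's real list of "common code files" may differ.
DEFAULT_EXTENSIONS = (".py", ".js", ".ts", ".tsx", ".go", ".rs", ".java", ".md")


def collect_files(
    root: str,
    extensions: Iterable[str] = DEFAULT_EXTENSIONS,
    ignore: Iterable[str] = (".git", "node_modules"),
    max_files: int = 1000,
) -> List[str]:
    """Walk root and return up to max_files matching paths, relative to root."""
    exts = tuple(extensions)
    patterns = list(ignore)
    found: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Drop ignored directories in place and sort for a stable order.
        dirnames[:] = sorted(
            d for d in dirnames
            if not any(fnmatch.fnmatch(d, p) for p in patterns)
        )
        for name in sorted(filenames):
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                continue
            if name.endswith(exts):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
                if len(found) >= max_files:
                    return found
    return found
```

Sorting directory and file names makes repeated scans of the same tree produce the same file list, which helps when comparing summaries between runs.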
Arguments
| Arg | Description | Default |
|-----|-------------|---------|
| --path | Directory to scan | Current dir |
| --output | Output file path | stdout |
| --prompt | Custom prompt | Default scan prompt |
| --extensions | File types to include | Common code files |
| --max-files | Max files to process | 1000 |
| --ignore | Additional ignore patterns | None |
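The argument table maps naturally onto `argparse`. The sketch below shows one way the options and defaults could be declared. The help strings and the choice of `nargs` for list-valued options are assumptions, so check the script itself for the exact behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI options listed in the argument table."""
    parser = argparse.ArgumentParser(
        prog="gemini-scan.py",
        description="Scan a codebase with Gemini Flash and write a summary.",
    )
    parser.add_argument("--path", default=".", help="Directory to scan")
    # None means "write to stdout" in this sketch.
    parser.add_argument("--output", default=None, help="Output file path")
    # None means "use the default scan prompt" in this sketch.
    parser.add_argument("--prompt", default=None, help="Custom prompt")
    parser.add_argument("--extensions", nargs="+", default=None,
                        help="File types to include")
    parser.add_argument("--max-files", type=int, default=1000,
                        help="Max files to process")
    parser.add_argument("--ignore", nargs="*", default=[],
                        help="Additional ignore patterns")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```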
Default Scan Prompt
Analyze this codebase and provide:
1. **Project Overview**
- What does this project do?
- Main technologies used
- Project structure
2. **File Organization**
- Key directories and their purposes
- Entry points
- Configuration files
3. **Patterns Detected**
- Architecture patterns (MVC, component-based, etc.)
- Coding conventions
- Common utilities
4. **Key Files**
- Most important files (entry points, configs)
- Core business logic locations
- API/route definitions
5. **Dependencies**
- External packages
- Internal module dependencies
Be concise but comprehensive. Use markdown formatting.
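For orientation, the following sketch shows how a prompt like the one above might be combined with file contents and sent to Gemini using the `google-generativeai` package. The `--- FILE: ... ---` delimiter, the function names, and the model name `gemini-1.5-flash` are assumptions for illustration; the real script may assemble its request differently.

```python
import os
from typing import Iterable, Tuple


def build_request(prompt: str, files: Iterable[Tuple[str, str]]) -> str:
    """Join the prompt and each (path, content) pair into one request text."""
    parts = [prompt.strip(), ""]
    for path, content in files:
        # A path header before each file lets the model refer to locations.
        parts.append(f"--- FILE: {path} ---")
        parts.append(content.rstrip())
        parts.append("")
    return "\n".join(parts)


def summarize(request_text: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send the request to Gemini; needs GEMINI_API_KEY and network access."""
    # Imported lazily so build_request works without the package installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(model_name)  # model name is an assumption
    return model.generate_content(request_text).text
```

Keeping request assembly separate from the network call makes it possible to inspect or size the request before spending any tokens.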
Output Format
Gemini outputs structured markdown:
# Codebase Summary
## Project Overview
{description}
## Technologies
- Framework: Next.js 14
- Database: PostgreSQL with Prisma
- Auth: NextAuth.js
- Styling: Tailwind CSS
## Structure
src/
├── app/          # Next.js App Router pages
├── components/   # React components
├── lib/          # Utilities and helpers
└── prisma/       # Database schema
## Key Files
| File | Purpose |
|------|---------|
| src/app/layout.tsx | Root layout |
| src/lib/auth.ts | Authentication logic |
| prisma/schema.prisma | Database schema |
## Patterns
- Component-based architecture
- Server Components with Client islands
- API routes for backend logic
## Recommendations for Scout
- Focus on src/app/ for routes
- Check src/components/ui/ for base components
- Review prisma/schema.prisma for data model
Integration with /dev-scout
In dev-scout/SKILL.md, add:
### Step 0.5: Large Codebase Check
After counting files:
1. If 500+ files → Use Gemini
   ```bash
   python skills/utils/gemini/scripts/gemini-scan.py \
     --path . \
     --output plans/scout/gemini-summary.md
   ```
2. Read Gemini summary → Use as foundation for scout
3. Deep dive key areas only
   → Gemini identified important files
   → Claude focuses on those
## Troubleshooting
### API Key Not Found
Error: GEMINI_API_KEY not set
**Fix:**
```bash
export GEMINI_API_KEY="your-key"
# Or add to .env file
```
Rate Limit
Error: Rate limit exceeded
Fix:
- Wait a few seconds and retry
- Free tier has limits, consider paid tier for heavy use
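The "wait and retry" advice can be automated with exponential backoff. This is a generic sketch, not part of the script: `retry_with_backoff` is a hypothetical helper, and it treats every exception as retryable, whereas real code would catch only the client's rate-limit error.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), waiting base_delay * 2**n seconds after the n-th failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Assumption: any exception is worth retrying. Re-raise once
            # the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")
```

With the defaults, the waits are 2, 4, and 8 seconds before the fourth and final attempt. The `sleep` parameter exists so the delay can be replaced in tests.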
Context Too Large
Error: Input too large for model
Fix:
- Use --max-files to limit files
- Use --ignore to skip directories
- Split scan into multiple runs
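Splitting a scan into multiple runs can be planned ahead by grouping files under a token budget. The sketch below assumes roughly 4 characters per token, which is a coarse heuristic rather than Gemini's actual tokenizer, and `split_into_batches` is a hypothetical helper.

```python
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def split_into_batches(file_sizes: Dict[str, int], token_budget: int) -> List[List[str]]:
    """Group files into batches whose estimated tokens fit token_budget.

    file_sizes maps path -> size in characters. A single file larger than
    the budget is placed in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for path, size in file_sizes.items():
        tokens = -(-size // CHARS_PER_TOKEN)  # ceiling division
        # Start a new batch when adding this file would exceed the budget.
        if current and used + tokens > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += tokens
    if current:
        batches.append(current)
    return batches
```

Each batch can then be scanned in its own run, for example by pointing `--path` at a subdirectory or passing `--ignore` patterns, and the resulting summaries combined afterwards.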
Cost Considerations
Gemini Flash pricing (as of 2024):
- Input: ~$0.075 per 1M tokens
- Output: ~$0.30 per 1M tokens
Typical codebase scan (500 files):
- ~100K tokens input
- ~5K tokens output
- Cost: ~$0.01-0.02 per scan
Very affordable for occasional use.
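The estimate can be checked with a few lines of arithmetic. This sketch uses the 2024 prices quoted above; the constant and function names are illustrative, and current pricing may differ.

```python
INPUT_PRICE_PER_M = 0.075   # USD per 1M input tokens (2024 figure quoted above)
OUTPUT_PRICE_PER_M = 0.30   # USD per 1M output tokens (2024 figure quoted above)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one scan."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Typical scan: input $0.0075 + output $0.0015
cost = estimate_cost(input_tokens=100_000, output_tokens=5_000)
print(f"${cost:.4f}")  # prints $0.0090
```

For the typical scan this comes to about $0.009, at the low end of the quoted range. Larger codebases scale roughly linearly with input size.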
Security Notes
- API key should be in environment, not code
- Don't commit .env files
- Add to .gitignore: .env, *.env.local
- Code is sent to Gemini for processing, so consider its sensitivity before scanning
Future: MCP Server
If needed, this can be upgraded to an MCP server for tighter integration:
skills/utils/gemini/
├── SKILL.md
├── scripts/
│ ├── setup.sh
│ └── gemini-scan.py
└── mcp-server/ # Future
├── index.ts
└── package.json
For now, the Python script approach is simpler and works well.