/utils/gemini - Large Context Processing Skill

Skill Awareness: See skills/_registry.md for all available skills.

Called by: /dev-scout for large codebases

Purpose: Use Gemini Flash (1M context) for tasks that do not fit in Claude's context window

Note: Utility skill, typically called by other skills

Use Gemini Flash for large context tasks like scanning entire codebases.

Why Gemini?

| Model | Context | Best For | Cost |
|-------|---------|----------|------|
| Claude | 200K | Reasoning, writing, precision | Higher |
| Gemini Flash | 1M+ | Scanning, summarization, bulk reading | Lower |

Use case: A codebase has 500+ files and does not fit in Claude's context. Gemini scans and summarizes it; Claude works from the summary.

Setup

Prerequisites

Google AI API Key
- Go to: https://aistudio.google.com/app/apikey
- Create an API key
- Copy the key

Set Environment Variable

# Add to ~/.bashrc or ~/.zshrc
export GEMINI_API_KEY="your-api-key-here"

# Or create .env in project root
echo "GEMINI_API_KEY=your-api-key-here" >> .env
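
A script can pick the key up from either location. The following is a minimal sketch, not the actual logic of `gemini-scan.py`; the helper name `load_api_key` and the plain `KEY=value` parsing of `.env` are assumptions made for illustration.

```python
import os
from pathlib import Path


def load_api_key(env_file=".env", environ=None):
    """Return GEMINI_API_KEY from the environment, falling back to a .env file.

    Hypothetical helper: the real script may resolve the key differently.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks and comments; accept plain KEY=value lines only.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY":
                return value.strip().strip('"').strip("'")

    raise RuntimeError("GEMINI_API_KEY not set")
```

The environment variable takes precedence here, so a key exported in the shell overrides one stored in the project's `.env`.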

Install Python Dependencies
```
pip install google-generativeai
```

Quick Setup Script

Run from this skill's folder:

# Interactive setup
./scripts/setup.sh

This will:

- Check for existing API key
- Guide you to get one if missing
- Test the connection
- Verify everything works

Usage

Direct Script Usage

# Scan entire codebase
python scripts/gemini-scan.py --path /path/to/project --output summary.md

# Scan specific directory
python scripts/gemini-scan.py --path /path/to/project/src --output src-summary.md

# Custom prompt
python scripts/gemini-scan.py --path /path/to/project --prompt "List all API endpoints" --output apis.md

From Claude Code

When /dev-scout detects a large codebase:

1. Scout detects 500+ files
   → "Large codebase. Using Gemini for initial scan."

2. Run Gemini scan
   → Bash: python skills/utils/gemini/scripts/gemini-scan.py \
       --path . \
       --output plans/scout/gemini-summary.md

3. Read Gemini output
   → Read: plans/scout/gemini-summary.md

4. Claude refines and creates final scout.md
   → Uses Gemini summary as input
   → Adds analysis, recommendations
   → Creates structured scout output
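
Steps 1 and 2 of this workflow could be sketched roughly as follows. This is an illustration under assumptions, not how /dev-scout is implemented: the helper names and the skipped directories (`.git`, `node_modules`) are hypothetical, and only the 500-file threshold and the command line come from the steps above.

```python
import os

LARGE_CODEBASE_THRESHOLD = 500  # file count from step 1


def count_files(root, skip_dirs=(".git", "node_modules")):
    """Count regular files under root, skipping a few directories (illustrative defaults)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        total += len(filenames)
    return total


def needs_gemini(root="."):
    """True when the codebase is large enough to warrant a Gemini scan."""
    return count_files(root) >= LARGE_CODEBASE_THRESHOLD


def gemini_scan_command(root=".", output="plans/scout/gemini-summary.md"):
    """Build the command line from step 2 as an argument list."""
    return [
        "python", "skills/utils/gemini/scripts/gemini-scan.py",
        "--path", root,
        "--output", output,
    ]
```

A caller would then run something like `subprocess.run(gemini_scan_command(), check=True)` when `needs_gemini()` returns true, and read the output file afterwards.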

Script: gemini-scan.py

Located at: scripts/gemini-scan.py

Features

- Scans directory recursively
- Respects .gitignore
- Outputs structured markdown
- Configurable file types
- Progress indicator
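
To make the recursive scan and file-type filtering concrete, here is a simplified sketch. It is not the script's code: `collect_files` and `DEFAULT_EXTENSIONS` are hypothetical names, the extension list is a guess, and matching ignore patterns with `fnmatch` only approximates real .gitignore semantics.

```python
import fnmatch
from pathlib import Path

DEFAULT_EXTENSIONS = {".py", ".js", ".ts", ".tsx", ".md"}  # assumed, not the script's list


def collect_files(root, extensions=DEFAULT_EXTENSIONS, ignore=(), max_files=1000):
    """Recursively list files under root, filtered by extension and ignore patterns.

    Ignore patterns are matched with fnmatch against each path component,
    which is only a rough approximation of .gitignore rules.
    """
    root = Path(root)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if any(fnmatch.fnmatch(part, pat) for part in rel.parts for pat in ignore):
            continue
        if path.suffix.lower() not in extensions:
            continue
        selected.append(rel)
        if len(selected) >= max_files:
            break
    return selected
```

For example, `collect_files(".", ignore=("node_modules", "dist"))` would return relative paths of matching source files outside those two directories.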

Arguments

| Arg | Description | Default |
|-----|-------------|---------|
| --path | Directory to scan | Current dir |
| --output | Output file path | stdout |
| --prompt | Custom prompt | Default scan prompt |
| --extensions | File types to include | Common code files |
| --max-files | Max files to process | 1000 |
| --ignore | Additional ignore patterns | None |
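
An `argparse` definition matching this table might look like the sketch below. The flag names and defaults come from the table; how the script actually parses list-valued options is not documented here, so the use of `nargs="*"` for `--extensions` and `--ignore` is an assumption.

```python
import argparse


def build_parser():
    """Argument parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="gemini-scan.py")
    parser.add_argument("--path", default=".", help="Directory to scan")
    parser.add_argument("--output", default=None, help="Output file path (stdout if omitted)")
    parser.add_argument("--prompt", default=None, help="Custom prompt (default scan prompt if omitted)")
    # nargs="*" is an assumption: the real script may expect a comma-separated string.
    parser.add_argument("--extensions", nargs="*", default=None, help="File types to include")
    parser.add_argument("--max-files", type=int, default=1000, help="Max files to process")
    parser.add_argument("--ignore", nargs="*", default=[], help="Additional ignore patterns")
    return parser
```

With this parser, `build_parser().parse_args(["--path", "src", "--max-files", "50"])` yields `path="src"` and `max_files=50`, with the remaining options at their defaults.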

Default Scan Prompt

Analyze this codebase and provide:

1. **Project Overview**
   - What does this project do?
   - Main technologies used
   - Project structure

2. **File Organization**
   - Key directories and their purposes
   - Entry points
   - Configuration files

3. **Patterns Detected**
   - Architecture patterns (MVC, component-based, etc.)
   - Coding conventions
   - Common utilities

4. **Key Files**
   - Most important files (entry points, configs)
   - Core business logic locations
   - API/route definitions

5. **Dependencies**
   - External packages
   - Internal module dependencies

Be concise but comprehensive. Use markdown formatting.
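
One plausible way to combine this prompt with the scanned files and send it to Gemini is sketched below. The function names and the `=== FILE: ... ===` header format are illustrative, and the model name `gemini-1.5-flash` is an assumption, since this document only says "Gemini Flash". The API calls are those of the `google-generativeai` package installed during setup.

```python
from pathlib import Path


def build_payload(prompt, root, files):
    """Concatenate the prompt and file contents into one text request.

    Each file is introduced by a header line so the model can cite paths.
    The header format is an illustrative choice.
    """
    root = Path(root)
    sections = [prompt.strip(), ""]
    for rel in files:
        text = (root / rel).read_text(encoding="utf-8", errors="replace")
        sections.append(f"=== FILE: {Path(rel).as_posix()} ===")
        sections.append(text.rstrip())
        sections.append("")
    return "\n".join(sections)


def send_to_gemini(payload, api_key, model_name="gemini-1.5-flash"):
    """Send the payload with the google-generativeai package; model name is assumed."""
    import google.generativeai as genai  # lazy import; needs `pip install google-generativeai`

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(payload).text
```

Keeping payload assembly separate from the network call makes it possible to inspect or measure the request before spending any tokens.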

Output Format

Gemini outputs structured markdown:

# Codebase Summary

## Project Overview
{description}

## Technologies
- Framework: Next.js 14
- Database: PostgreSQL with Prisma
- Auth: NextAuth.js
- Styling: Tailwind CSS

## Structure

src/
├── app/          # Next.js App Router pages
├── components/   # React components
├── lib/          # Utilities and helpers
└── prisma/       # Database schema


## Key Files
| File | Purpose |
|------|---------|
| src/app/layout.tsx | Root layout |
| src/lib/auth.ts | Authentication logic |
| prisma/schema.prisma | Database schema |

## Patterns
- Component-based architecture
- Server Components with Client islands
- API routes for backend logic

## Recommendations for Scout
- Focus on src/app/ for routes
- Check src/components/ui/ for base components
- Review prisma/schema.prisma for data model

Integration with /dev-scout

In dev-scout/SKILL.md, add:

### Step 0.5: Large Codebase Check

After counting files:

1. If 500+ files → Use Gemini
   ```bash
   python skills/utils/gemini/scripts/gemini-scan.py \
     --path . \
     --output plans/scout/gemini-summary.md
   ```
2. Read Gemini summary → Use as foundation for scout
3. Deep dive key areas only → Gemini identified important files → Claude focuses on those


## Troubleshooting

### API Key Not Found

Error: GEMINI_API_KEY not set


**Fix:**
```bash
export GEMINI_API_KEY="your-key"
# Or add to .env file
```

### Rate Limit

Error: Rate limit exceeded

**Fix:**
- Wait a few seconds and retry
- The free tier has limits; consider a paid tier for heavy use
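
The "wait and retry" advice can be automated with exponential backoff. This is a generic sketch, not part of the script: `call_with_retry` is a hypothetical helper, and it catches every exception for brevity, whereas real code should catch only the rate-limit error raised by the client library.

```python
import time


def call_with_retry(fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff when it raises.

    Catching Exception is deliberately broad for this sketch; real code should
    catch only the rate-limit error raised by the client library.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 2s, 4s, 8s, ...
```

The `sleep` parameter exists so that tests can substitute a recorder for `time.sleep` and avoid real delays.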

### Context Too Large

Error: Input too large for model

**Fix:**
- Use --max-files to limit files
- Use --ignore to skip directories
- Split scan into multiple runs
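
Splitting a scan into multiple runs amounts to grouping files so that each group stays under a token budget. The sketch below shows one greedy way to do that; both function names are hypothetical, and the four-characters-per-token estimate is a rough rule of thumb rather than Gemini's actual tokenizer.

```python
def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (a common rule of thumb)."""
    return max(1, len(text) // 4)


def split_into_batches(files, token_budget):
    """Group (path, text) pairs into batches whose estimated tokens fit the budget.

    A single file larger than the budget still gets its own batch.
    """
    batches, current, used = [], [], 0
    for path, text in files:
        cost = estimate_tokens(text)
        # Start a new batch when adding this file would exceed the budget.
        if current and used + cost > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Each batch can then be scanned in its own run, with the per-batch summaries merged afterwards.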

Cost Considerations

Gemini Flash pricing (as of 2024):

- Input: ~$0.075 per 1M tokens
- Output: ~$0.30 per 1M tokens

Typical codebase scan (500 files):

- ~100K tokens input
- ~5K tokens output
- Cost: ~$0.01-0.02 per scan

Very affordable for occasional use.
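
The arithmetic behind that estimate can be checked with a few lines. The prices are the approximate figures quoted above; `scan_cost` is a hypothetical helper name.

```python
INPUT_PRICE_PER_M = 0.075   # USD per 1M input tokens (figure quoted above)
OUTPUT_PRICE_PER_M = 0.30   # USD per 1M output tokens (figure quoted above)


def scan_cost(input_tokens, output_tokens):
    """Estimated USD cost of one scan at the quoted rates."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


cost = scan_cost(100_000, 5_000)
print(f"${cost:.4f}")  # $0.0090
```

At these rates the typical scan comes to just under one cent, and the input tokens account for most of it.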

Security Notes

- Keep the API key in the environment, not in code
- Don't commit .env files
- Add to .gitignore: .env, *.env.local
- Gemini processes your code on Google's servers; consider how sensitive the code is before scanning

Future: MCP Server

If needed, this can be upgraded to an MCP server for tighter integration:

skills/utils/gemini/
├── SKILL.md
├── scripts/
│   ├── setup.sh
│   └── gemini-scan.py
└── mcp-server/          # Future
    ├── index.ts
    └── package.json

For now, the Python script approach is simpler and works well.
