Code Review Skill
Comprehensive code review using PAL MCP multi-model consensus. Reviews recent commits, staged changes, or branch comparisons with focus on security, quality, performance, and architecture.
Quick Start
# Review last 5 commits (default)
/sc:code-review
# Review staged changes
/sc:code-review --staged
# Compare feature branch against main
/sc:code-review --branch main
# Security-focused review
/sc:code-review commits=3 --focus security
# Post review as PR comment
/sc:code-review --branch main --post-pr
Philosophy: Ask Early, Ask Often
This skill liberally uses AskUserQuestion at every decision point. Code review is subjective — what matters depends on context, and assumptions about reviewer priorities are frequently wrong. The cost of asking is low; the cost of reviewing the wrong scope or missing what the user cares about is high.
Interactive checkpoints:
- Before reviewing — confirm the review scope and what the user cares about
- When the diff is large — let the user choose what to focus on first
- When categorization is ambiguous — ask about severity levels
- When models disagree — let the user break the tie on issue severity
- After findings — ask what to do next (fix, re-review, post to PR)
- When tradeoffs exist — present them explicitly instead of picking silently
- For critical/high findings — validate severity with user before finalizing
The user should feel like they're directing the review, not receiving a fire-and-forget report.
Behavioral Flow
- Parse - Extract review mode, focus area, and commit range from arguments
- Validate - Verify git state, branch existence, staged changes
- Confirm Scope - Interactive scope confirmation with user
- Discover - List available models via
mcp__pal__listmodels - Gather - Collect git diff, changed files, commit history
- Categorize - Map changed files to review focus areas
- Handle Large Diffs - Interactive chunking strategy for large changes
- Review - Run PAL MCP consensus with multi-model evaluation
- Resolve Disagreements - User adjudicates model conflicts
- Validate Findings - User confirms critical/high severity items
- Format - Present findings in GitHub PR comment style
- Follow-up - Ask user what to do next (fix, deep-dive, post to PR)
- Post - Optionally post to PR via
ghCLI
Flags
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| commits | int | 5 | Number of recent commits to review |
| --staged | bool | false | Review only staged changes |
| --branch | string | - | Compare current branch against target branch |
| --focus | string | full | Focus: security, performance, quality, architecture, full |
| --post-pr | bool | false | Post review as PR comment |
| --models | int | 2 | Number of models for consensus (2-5) |
Review Modes
Modes are mutually exclusive:
| Mode | Trigger | Git Command |
|------|---------|-------------|
| Commits | Default or commits=N | git diff -U10 HEAD~N..HEAD |
| Staged | --staged | git diff -U10 --cached |
| Branch | --branch <target> | git diff -U10 origin/<target>...HEAD |
Phase 1: Validate Git State
# Verify git repository
git rev-parse --is-inside-work-tree
# Get current branch
git branch --show-current
Mode-specific validation:
- Commits: Check
git rev-list --count HEAD>= N - Staged: Check
git diff --cached --name-onlyis non-empty - Branch: Check
git rev-parse --verify "origin/$TARGET"exists
Phase 1b: Confirm Review Scope
After gathering git context, present a summary and let the user confirm or adjust:
AskUserQuestion:
question: "Here's what I'll be reviewing. Does this scope look right?"
header: "Scope"
multiSelect: false
options:
- label: "Looks good — proceed"
description: "<N files changed, +X/-Y lines, commits/staged/branch summary>"
- label: "Too much — narrow scope"
description: "I only want to review a subset of these changes"
- label: "Too little — expand scope"
description: "Include more commits or compare against a different branch"
- label: "Different focus"
description: "I want to focus on a specific area (security, performance, etc.)"
If "Too much — narrow scope": Ask which files or directories to focus on:
AskUserQuestion:
question: "Which areas should I focus the review on?"
header: "Focus"
multiSelect: true
options:
- label: "Backend (app/)"
description: "<N files, +X/-Y lines in app/>"
- label: "Frontend (frontend/)"
description: "<N files, +X/-Y lines in frontend/>"
- label: "Tests"
description: "<N files, +X/-Y lines in tests/>"
- label: "Config/Infra"
description: "<N files in Docker, CI, config>"
If "Too little — expand scope": Ask how to expand (more commits, different branch).
If "Different focus": Ask which review focus area to use, overriding the --focus flag.
Phase 2: Discover Models
Call mcp__pal__listmodels to get available models. Select top N by score, preferring different providers for diversity.
Selection criteria:
- Sort by score descending
- Pick from distinct providers when possible
- Minimum 2 models for meaningful consensus
Phase 3: Gather Git Context
Collect for the selected review mode:
| Data | Purpose |
|------|---------|
| git diff -U10 ... | Full diff with extended context |
| git diff --stat ... | Change statistics |
| git log --oneline ... | Commit summaries |
| Changed file list | File categorization |
Phase 4: Categorize Changed Files
| Pattern | Category | Review Focus |
|---------|----------|--------------|
| *.py, *.ts, *.go, *.rs | Source Code | Security, SOLID, performance |
| tests/**, *_test.* | Test Code | Coverage, edge cases, assertions |
| *.yml, *.yaml, *.toml | Config | Security, correctness |
| *requirements*, *package* | Dependencies | Vulnerabilities, versions |
| Dockerfile*, docker-compose* | Infrastructure | Security, best practices |
| *.md, *.rst | Documentation | Accuracy, completeness |
Skip binary files (note them but exclude from diff review).
Phase 5: Handle Large Diffs
If diff exceeds ~1000 lines, ask user how to handle:
AskUserQuestion:
question: "The diff is large (~<N> lines across <M> files). How should I handle it?"
header: "Large diff"
multiSelect: false
options:
- label: "Review in chunks by category (Recommended)"
description: "Group files by type (backend, frontend, tests, config) and review each separately"
- label: "Focus on highest-risk files only"
description: "Skip low-risk changes (docs, tests, config) and focus on production code"
- label: "Review everything in one pass"
description: "Send the full diff to models — may lose detail on individual files"
- label: "Let me pick specific files"
description: "I'll tell you exactly which files to review"
If "Let me pick specific files":
- List all changed files with their stats
- Ask user to select which files to review
- Scope the diff down accordingly
If "Review in chunks by category":
- Group files by category (source, tests, config, docs)
- Review each group separately with PAL consensus
- Synthesize findings into single report
- Deduplicate overlapping issues
If "Focus on highest-risk files only":
- Filter out documentation, configuration, and test files
- Focus consensus on production source code only
Phase 6: Run PAL MCP Consensus
Use mcp__pal__consensus with discovered models.
Consensus workflow:
step 1: Your initial analysis with the diffstep 2+: Each model responds with findingstotal_steps= number of models + 1
Review checklist by focus area:
| Focus | Checks | |-------|--------| | security | Secrets, injection, auth, input validation, dependencies | | performance | N+1 queries, complexity, caching, memory leaks | | quality | SOLID, naming, error handling, types, dead code | | architecture | Patterns, coupling, separation of concerns, testability | | full | All of the above (default) |
Fallback: If consensus fails, use mcp__pal__codereview for single-model review.
Phase 6a: Present Model Disagreements
If models disagree on the severity or existence of an issue, do NOT resolve the disagreement yourself. Present each significant disagreement:
AskUserQuestion:
question: "Models disagreed on <specific issue>. How should this be classified?"
header: "Severity"
multiSelect: false
options:
- label: "<Model A's severity> (e.g., Critical)"
description: "<Model A> flagged this as <severity> because: <reasoning>"
- label: "<Model B's severity> (e.g., Low)"
description: "<Model B> considers this <severity> because: <reasoning>"
- label: "Dismiss this finding"
description: "This isn't actually an issue — it's intentional or acceptable"
Repeat for each significant disagreement (up to 4 per AskUserQuestion call). Skip for minor differences that don't change the severity tier.
Phase 6b: Validate Critical/High Findings
If any findings are classified as Critical or High, present them to the user for confirmation before including in the final report. False positives at these severity levels erode trust:
AskUserQuestion:
question: "I found <N> critical/high issues. Do these look like real problems, or should I reclassify any?"
header: "Validate"
multiSelect: true
options:
- label: "<Issue 1 summary>"
description: "[FILE:LINE] — <brief description>. Classified as <severity>"
- label: "<Issue 2 summary>"
description: "[FILE:LINE] — <brief description>. Classified as <severity>"
- label: "<Issue 3 summary>"
description: "[FILE:LINE] — <brief description>. Classified as <severity>"
- label: "All look correct"
description: "Keep all critical/high classifications as-is"
Interpretation:
- Selected items are confirmed as real issues — keep them at current severity
- Unselected critical/high items should be downgraded to Medium with a note
- If "All look correct" is selected, keep everything as-is
Skip this question if there are no Critical or High findings.
Use mcp__pal__challenge for validation:
- For each critical/high finding, run
mcp__pal__challengeto validate the severity - Present both the original finding and the challenge result
- Let the user make the final call on severity
Example challenge prompt:
"The code review identified this as a Critical security issue: <issue description>.
Challenge this finding: Is it actually critical? Could it be a false positive?
What's the real-world exploitability? What's the actual severity?"
This adds an adversarial validation layer to prevent false positives.
Phase 7: Format Output
## Code Review: Multi-Model Consensus
**Review Mode**: [Commits (last N) | Staged Changes | Branch vs TARGET]
**Files Reviewed**: N files
**Lines Changed**: +X / -Y
---
### Executive Summary
[2-3 sentence overview]
---
### Critical Issues
> Must be addressed before merge
- [ ] **[FILE:LINE]** - [Issue description]
- **Category**: Security/Performance/Bug Risk
- **Severity**: Critical
- **Recommendation**: [Specific fix]
### High Priority
> Should be addressed
- [ ] **[FILE:LINE]** - [Issue description]
- **Category**: [Category]
- **Recommendation**: [Specific fix]
### Medium Priority
> Recommended improvements
- [ ] **[FILE:LINE]** - [Issue description]
- **Recommendation**: [Suggestion]
### Low Priority / Suggestions
> Nice to have
- [ ] [Suggestion]
---
### Positive Observations
- [Good pattern observed]
---
### Review Summary
| Category | Rating | Notes |
|----------|--------|-------|
| Security | X/5 | [Brief note] |
| Code Quality | X/5 | [Brief note] |
| Performance | X/5 | [Brief note] |
| Architecture | X/5 | [Brief note] |
| Test Coverage | X/5 | [Brief note] |
**Overall Assessment**: [APPROVE / REQUEST CHANGES / NEEDS DISCUSSION]
---
### Model Consensus
| Model | Key Findings |
|-------|--------------|
| [Model 1] | [Summary] |
| [Model 2] | [Summary] |
**Agreement Areas**: [Where models agreed]
**Divergent Views**: [Where models differed]
Phase 8: Post-Review Follow-Up
Always ask the user what they want to do next after presenting the review:
AskUserQuestion:
question: "Review complete (<overall assessment>). What would you like to do next?"
header: "Next"
multiSelect: false
options:
- label: "Auto-fix issues"
description: "Attempt to fix all critical and high issues automatically"
- label: "Review a specific file in detail"
description: "Deep-dive into one file that needs closer inspection"
- label: "Post to PR"
description: "Post this review as a comment on the current PR"
- label: "Re-review with different models"
description: "Run the review again with different model consensus"
- label: "Done"
description: "Review is complete — no further action needed"
If "Auto-fix issues": Ask which severity levels to fix:
AskUserQuestion:
question: "Which issues should I attempt to fix?"
header: "Fix scope"
multiSelect: true
options:
- label: "Critical issues (<N>)"
description: "Must-fix items that block merge"
- label: "High priority (<N>)"
description: "Should-fix items"
- label: "Medium priority (<N>)"
description: "Recommended improvements"
- label: "All issues"
description: "Fix everything that can be automated"
Then attempt fixes using Edit and report results. After each fix, ask if the user wants to continue or review the changes.
If "Review a specific file in detail":
- Ask which file from the changed files list
- Re-run consensus on just that file's diff with maximum context (-U20)
- Present detailed findings for that file only
If "Re-review with different models":
- Run
mcp__pal__listmodelsagain - Let user select different models
- Re-run the entire review with the new model set
Phase 9: Post to PR (Optional)
If --post-pr flag is set AND in PR context:
PR_NUMBER=$(gh pr view --json number -q .number 2>/dev/null)
gh pr comment $PR_NUMBER --body "<review output>"
MCP Integration
PAL MCP (Primary)
| Tool | When | Purpose |
|------|------|---------|
| mcp__pal__listmodels | Always | Discover available models |
| mcp__pal__consensus | Default | Multi-model code review consensus |
| mcp__pal__codereview | Fallback | Single-model code review |
| mcp__pal__challenge | Critical findings | Validate severity of critical issues |
Rube MCP (Optional)
| Tool | When | Purpose |
|------|------|---------|
| mcp__rube__RUBE_SEARCH_TOOLS | --notify | Find notification tools |
| mcp__rube__RUBE_MULTI_EXECUTE_TOOL | --notify | Send review to Slack/Jira |
Error Handling
| Scenario | Action |
|----------|--------|
| Not a git repo | Exit with error |
| No commits in range | Adjust N to available commits |
| No staged changes | Error with hint to git add |
| Branch not found | Error with available branches |
| PAL MCP unavailable | Fall back to mcp__pal__codereview |
| No models available | Error — PAL MCP not configured |
Tool Coordination
- Bash - Git operations, PR posting
- Read - Source file inspection for context
- Grep - Pattern search in changed files
- Edit - Auto-fix issues based on findings
- Glob - File discovery for large diffs
- PAL MCP - Multi-model consensus review
- Rube MCP - External notifications
- AskUserQuestion - Interactive decision points
Notes
- Extended diff context (-U10) provides better code understanding for models
- Binary files are noted but excluded from diff review
- Python files get additional SOLID principles review per CLAUDE.md
- Review depth scales with diff size (larger diffs get chunked)
- Interactive validation prevents false positives on critical findings
- User directs the review flow — this is collaborative, not automated