Tree of Thoughts Reasoning Methodology Skill

Purpose: Systematic parallel exploration of solution spaces through recursive branching, self-reflection, and rigorous evaluation. Use this methodology when facing complex problems with multiple viable solution paths.

When to Use Tree of Thoughts

✅ Use ToT when:

The problem has multiple viable solution approaches (3+ fundamentally different paths)
You need the optimal solution, not just any solution
You can define clear evaluation criteria
The complexity justifies systematic exploration
Trade-offs exist between competing approaches
The decision is strategic or architectural, with long-term impact

❌ Don't use ToT when:

The problem has an obvious single solution path
The decision is time-critical and the trade-offs are simple
The problem is well-defined and has a standard solution
The work is exploratory and breadth matters more than depth

Examples:

"Should we use REST, GraphQL, or gRPC?" (3 paths, clear trade-offs) ✅
"Design distributed caching system balancing latency, consistency, cost" (multi-dimensional) ✅
"Fix this syntax error" (single path) ❌
"Research all available databases" (breadth-of-thought better) ❌

Core Methodology: 5-Step Process

Step 1: Problem Decomposition (5+ Branches)

Objective: Identify 5+ fundamentally different approaches to explore

Actions:

Analyze problem to identify key dimensions (technical, organizational, risk, cost, timeline)
Brainstorm 5-10 distinct approaches (not variations of same approach)
Define evaluation criteria from problem constraints
Validate diversity: each branch should explore a different solution philosophy

Example (Distributed Caching):

Branch A: Write-through consistency (strong consistency, higher latency)
Branch B: Eventual consistency (performance, weaker guarantees)
Branch C: Hybrid tiered (hot data write-through, cold data eventual)
Branch D: Edge-centric (CDN-style, geography-aware)
Branch E: Cost-optimized minimal (single region, no replication)

Deliverable: 5+ distinct approach definitions

Step 2: Parallel Branch Exploration

Objective: Explore each branch systematically with self-reflection

For each branch:

Analyze approach against problem requirements
Consider strengths, weaknesses, trade-offs
Identify assumptions and constraints
End with self-reflection (see template below)

Self-Reflection Template (REQUIRED for each branch):

## Branch [X]: [Approach Name]

[Analysis of this approach: 2-4 paragraphs covering requirements, strengths, weaknesses, trade-offs]

### Self-Reflection
- **Confidence**: [0-100]/100
- **Strengths**: [What makes this approach compelling]
- **Weaknesses**: [Gaps, assumptions, limitations]
- **Trade-offs**: [What you gain vs what you lose]
- **Recommendation**: [Continue deeper exploration? Prune? Why?]

Execution Options:

With Task tool: Spawn 5+ parallel tasks for independent exploration
Without Task tool: Explore branches sequentially using TodoWrite to track progress
Hybrid: Use Task for complex branches, sequential for simple ones

Deliverable: 5+ explored branches with self-reflections
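
The self-reflection template above can be treated as a small record type so that no field is skipped and the confidence value stays in range. The following is a minimal sketch, not part of the skill itself: the class name `SelfReflection`, its field names, and the sample text are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SelfReflection:
    """One explored branch plus its self-reflection (hypothetical type)."""
    branch_id: str
    approach: str
    analysis: str
    confidence: int  # 0-100, as in the template
    strengths: str
    weaknesses: str
    trade_offs: str
    recommendation: str

    def render(self) -> str:
        """Fill the template; reject a confidence outside 0-100."""
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")
        return "\n".join([
            f"## Branch {self.branch_id}: {self.approach}",
            "",
            self.analysis,
            "",
            "### Self-Reflection",
            f"- **Confidence**: {self.confidence}/100",
            f"- **Strengths**: {self.strengths}",
            f"- **Weaknesses**: {self.weaknesses}",
            f"- **Trade-offs**: {self.trade_offs}",
            f"- **Recommendation**: {self.recommendation}",
        ])


# Sample content is illustrative only.
reflection = SelfReflection(
    branch_id="B",
    approach="Eventual consistency",
    analysis="Favors performance over strict guarantees.",
    confidence=72,
    strengths="Low write latency",
    weaknesses="Readers may see stale data",
    trade_offs="Performance gained, strong consistency lost",
    recommendation="Continue deeper exploration",
)
text = reflection.render()
print(text)
```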

Step 3: Branch Evaluation (Scoring)

Objective: Systematically evaluate all branches against criteria

Evaluation Criteria (100 points total, 5 categories × 20 points):

Novelty (0-20): Does it explore new solution space vs obvious approaches?
- 18-20: Innovative approach, fresh perspective
- 12-17: Good approach with some novel elements
- 6-11: Standard approach, minor tweaks
- 0-5: Obvious/conventional approach
Feasibility (0-20): Practically implementable with reasonable resources?
- 18-20: Proven technology, clear implementation path
- 12-17: Feasible with moderate effort/risk
- 6-11: Significant technical challenges
- 0-5: Impractical or resource-intensive
Completeness (0-20): Addresses all stated requirements?
- 18-20: Covers all requirements comprehensively
- 12-17: Covers most requirements, minor gaps
- 6-11: Missing key requirements
- 0-5: Incomplete solution
Confidence (0-20): Branch's self-reflection confidence score?
- 18-20: High confidence (80-100%) with justification
- 12-17: Medium confidence (60-79%)
- 6-11: Low confidence (40-59%)
- 0-5: Very low confidence (<40%)
Alignment (0-20): Matches problem constraints and context?
- 18-20: Perfect fit for constraints
- 12-17: Good fit, minor misalignment
- 6-11: Notable misalignment
- 0-5: Poor fit for context
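
The Confidence criterion is the only one tied directly to a number recorded earlier, the self-reflection percentage. A minimal sketch of that lookup follows. The helper name `confidence_band` is an assumption, and it returns the allowed score range rather than a single score because the rubric does not say where inside a band a score should fall.

```python
def confidence_band(self_reported: float) -> tuple[int, int]:
    """Map a self-reflection confidence (0-100) to the rubric's score band.

    Returns (low, high) on the 0-20 scale. Hypothetical helper; choosing a
    score inside the band is left to the evaluator.
    """
    if not 0 <= self_reported <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if self_reported >= 80:
        return (18, 20)  # high confidence
    if self_reported >= 60:
        return (12, 17)  # medium confidence
    if self_reported >= 40:
        return (6, 11)   # low confidence
    return (0, 5)        # very low confidence


print(confidence_band(85))  # (18, 20)
print(confidence_band(59))  # (6, 11)
```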

Scoring Process:

Review each branch's analysis and self-reflection
Score each branch on all 5 criteria (0-20 per criterion)
Calculate total score (0-100) for each branch
Rank branches by total score
Select highest-scoring branch for deeper exploration

Deliverable: Scored ranking of all branches, winner selected
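
The scoring process above reduces to summing five bounded scores per branch and sorting. The sketch below illustrates it under stated assumptions: `BranchScore`, `CRITERIA`, and `rank` are hypothetical names, and the numeric scores are invented for the example rather than taken from the methodology.

```python
from dataclasses import dataclass

CRITERIA = ("novelty", "feasibility", "completeness", "confidence", "alignment")


@dataclass
class BranchScore:
    name: str
    scores: dict[str, int]  # criterion name -> score in 0..20

    def total(self) -> int:
        """Sum the five criterion scores after validating each one."""
        for criterion in CRITERIA:
            value = self.scores[criterion]  # KeyError if a criterion is missing
            if not 0 <= value <= 20:
                raise ValueError(f"{criterion} must be 0-20, got {value}")
        return sum(self.scores[c] for c in CRITERIA)


def rank(branches: list[BranchScore]) -> list[BranchScore]:
    """Return branches sorted from highest to lowest total score."""
    return sorted(branches, key=lambda b: b.total(), reverse=True)


# Illustrative scores only.
candidates = [
    BranchScore("A: Write-through", dict(
        novelty=8, feasibility=18, completeness=15, confidence=14, alignment=12)),
    BranchScore("B: Eventual consistency", dict(
        novelty=12, feasibility=17, completeness=16, confidence=16, alignment=18)),
    BranchScore("E: Cost-optimized minimal", dict(
        novelty=6, feasibility=19, completeness=9, confidence=12, alignment=10)),
]
ranking = rank(candidates)
winner = ranking[0]
print(winner.name, winner.total())  # B: Eventual consistency 79
```

Because `sorted` is stable, tied branches keep their input order. The methodology does not say how to break ties, so a real tie is probably better resolved by revisiting the branch analyses than by list position.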

Step 4: Recursive Depth Exploration (Level 1+)

Objective: Recursively expand the best branch

Actions:

Take highest-scoring branch from Step 3
Decompose that branch into 5+ sub-approaches or refinements
Repeat Steps 2-3 for the new level (explore → evaluate → select)
Continue recursion until stopping criteria met

Minimum Depth: 4 levels (Level 0 → 1 → 2 → 3)

Level Transition Example:

Level 0: "Distributed caching system" (5 approaches)
  → Winner: Branch B (Eventual consistency)

Level 1: "Eventual consistency variants" (5 refinements)
  - B.1: Last-write-wins
  - B.2: Version vectors
  - B.3: CRDTs
  - B.4: Causal consistency
  - B.5: Session consistency
  → Winner: Branch B.3 (CRDTs)

Level 2: "CRDT implementations" (5 options)
  - B.3.1: G-Counter
  - B.3.2: PN-Counter
  - B.3.3: LWW-Element-Set
  - B.3.4: OR-Set
  - B.3.5: RGA (Replicated Growable Array)
  → Winner: Branch B.3.4 (OR-Set)

Level 3: "OR-Set optimizations" (5 variants)
  [Explore specific implementation strategies]
  → Winner: Branch B.3.4.2 (Tombstone compaction)

Deliverable: Recursive tree with minimum 4 levels explored
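
Read as an algorithm, this step is a greedy descent: expand the current winner, score the candidates, follow the best one, and repeat. The sketch below shows that loop under this reading. The names `explore`, `expand`, `score`, and `should_stop` are hypothetical, the sibling labels at Level 3 are placeholders, and every score is made up for illustration.

```python
from typing import Callable

Branch = str  # a branch is identified by its label, e.g. "B.3.4"


def explore(
    root: Branch,
    expand: Callable[[Branch], list[Branch]],
    score: Callable[[Branch], int],
    should_stop: Callable[[int, int], bool],
) -> list[tuple[Branch, int]]:
    """Follow the highest-scoring branch level by level.

    expand(branch) -> candidate sub-branches for the next level
    score(branch)  -> total 0-100 score for a candidate
    should_stop(levels_explored, best_score) -> True to end the descent
    Returns the winning path as (branch, score) pairs, one per level.
    """
    path: list[tuple[Branch, int]] = []
    current = root
    while True:
        candidates = expand(current)
        if not candidates:
            break  # nothing left to refine
        best = max(candidates, key=score)
        path.append((best, score(best)))
        if should_stop(len(path), path[-1][1]):
            break
        current = best
    return path


# Toy tree mirroring the labels in the example above; scores are invented.
TREE = {
    "root": ["A", "B", "C", "D", "E"],
    "B": ["B.1", "B.2", "B.3", "B.4", "B.5"],
    "B.3": ["B.3.1", "B.3.2", "B.3.3", "B.3.4", "B.3.5"],
    "B.3.4": ["B.3.4.1", "B.3.4.2", "B.3.4.3", "B.3.4.4", "B.3.4.5"],
}
SCORES = {"B": 79, "B.3": 83, "B.3.4": 86, "B.3.4.2": 90}

path = explore(
    "root",
    expand=lambda b: TREE.get(b, []),
    score=lambda b: SCORES.get(b, 50),       # non-winners get a flat 50
    should_stop=lambda levels, best: levels >= 4,
)
print([branch for branch, _ in path])  # ['B', 'B.3', 'B.3.4', 'B.3.4.2']
```

A greedy descent never revisits a pruned branch. If a later level exposes a flaw in the winner, backtracking to the runner-up would need an extension that this sketch does not include.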

Step 5: Final Synthesis

Objective: Synthesize insights into final recommendation

Actions:

Trace winning path: Document Level 0 → Level 1 → Level 2 → Level 3+
Extract key insights: What was learned at each level?
Document pruned branches: Why were alternatives discarded?
Calculate confidence: Final confidence score (see Bayesian formula below)
State assumptions: What assumptions underpin the recommendation?
Provide recommendation: Clear, actionable guidance
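
When branch labels follow the dotted convention used earlier (B, B.3, B.3.4, B.3.4.2), tracing the winning path is mechanical: every prefix of the leaf label is the winner at one level. A small sketch, assuming a hypothetical helper named `trace_path`:

```python
def trace_path(leaf_id: str) -> list[str]:
    """Expand a dotted leaf label into the winner at each level, root first."""
    parts = leaf_id.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


for level, branch in enumerate(trace_path("B.3.4.2")):
    print(f"Level {level}: Branch {branch}")
# Level 0: Branch B
# Level 1: Branch B.3
# Level 2: Branch B.3.4
# Level 3: Branch B.3.4.2
```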

Synthesis Template:

## Tree of Thoughts Analysis Complete

### Winning Path
- **Level 0**: [Chosen approach] (Score: X/100)
- **Level 1**: [Refinement] (Score: X/100)
- **Level 2**: [Sub-refinement] (Score: X/100)
- **Level 3**: [Implementation] (Score: X/100)

### Key Insights
1. [Insight from Level 0]
2. [Insight from Level 1]
3. [Insight from Level 2]
4. [Insight from Level 3]

### Alternatives Considered
- [Branch A]: Pruned because [reason]
- [Branch C]: Pruned because [reason]
- [Branch D]: Pruned because [reason]

### Final Confidence: [X]%

**Justification**: [Why this confidence level based on exploration depth, evidence, and remaining uncertainties]

### Recommendation
[Clear, actionable recommendation with next steps]

### Remaining Uncertainties
- [Assumption 1]
- [Assumption 2]

Deliverable: Comprehensive synthesis with traced path and confidence score

Stopping Criteria

Stop exploration when ANY of:

✅ Reached 4+ levels AND best branch confidence >80%
✅ Reached 6 levels (maximum recommended depth)
✅ All branches converge to same solution across multiple levels
✅ Diminishing returns (winning scores at Level N are similar to those at Level N-1)

Warning signs (don't stop yet):

❌ Only 2-3 levels explored
❌ Confidence <80% without clear reason
❌ Winner not clearly superior to alternatives
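
The stopping rules can be expressed as a single predicate. The sketch below follows the "ANY of" wording literally, so the diminishing-returns rule can fire before four levels are reached, which the warning signs caution against; callers may want to guard that case. The function name, its parameters, and the 2-point `similarity_tolerance` are assumptions, since the text does not define how close two scores must be to count as similar.

```python
def should_stop(
    level_scores: list[int],
    best_confidence: float,
    converged: bool = False,
    similarity_tolerance: int = 2,
) -> tuple[bool, str]:
    """Apply the four stopping criteria; any single one is sufficient.

    level_scores: winning total score (0-100) at each explored level, in order.
    best_confidence: confidence of the current best branch, in percent.
    converged: the caller's judgement that branches converge on one solution.
    similarity_tolerance: ASSUMPTION, "similar" is not defined in the text.
    """
    levels = len(level_scores)
    if levels >= 4 and best_confidence > 80:
        return True, "4+ levels and confidence above 80%"
    if levels >= 6:
        return True, "maximum recommended depth reached"
    if converged:
        return True, "branches converged"
    if levels >= 2 and abs(level_scores[-1] - level_scores[-2]) <= similarity_tolerance:
        return True, "diminishing returns"
    return False, "keep exploring"


print(should_stop([79, 83, 86], best_confidence=85))      # (False, 'keep exploring')
print(should_stop([79, 83, 86, 90], best_confidence=85))  # (True, '4+ levels and confidence above 80%')
```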

Bayesian Confidence Scoring

Purpose: Quantify confidence based on accumulated evidence

Formula:

Prior Odds = P(correct) / (1 - P(correct))
Likelihood Ratio = Evidence strength (from scores)
Posterior Odds = Prior Odds × Likelihood Ratio
Final Confidence = Posterior Odds / (1 + Posterior Odds)

Practical Calculation:

Start with prior confidence: 50% (neutral)
For each evaluation criterion score (0-20):
- Convert to likelihood ratio: LR = 0.25 + (score/20) * 3.75
- Update odds: Odds = Odds × LR
Convert back to probability: Conf = Odds / (1 + Odds)
Cap at 95% (Bayesian humility for unknown unknowns)

Example:

Branch scores: Novelty 18/20, Feasibility 19/20, Completeness 17/20, Confidence 18/20, Alignment 19/20
Likelihood ratios: 3.625, 3.8125, 3.4375, 3.625, 3.8125
Final odds: 1.0 × 3.625 × 3.8125 × 3.4375 × 3.625 × 3.8125 ≈ 656.6
Confidence: 656.6 / 657.6 ≈ 99.8% → Capped at 95%
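
The practical calculation is short enough to run directly, which helps catch arithmetic slips. The sketch below applies the stated procedure to the example scores. The function names are hypothetical, and multiplying all five ratios treats the criteria as independent pieces of evidence, which is how the procedure is written rather than something the code adds.

```python
def likelihood_ratio(score: int) -> float:
    """Convert a 0-20 criterion score to a likelihood ratio in [0.25, 4.0]."""
    if not 0 <= score <= 20:
        raise ValueError("score must be between 0 and 20")
    return 0.25 + (score / 20) * 3.75


def posterior_odds(scores: list[int], prior: float = 0.5) -> float:
    """Start from the prior odds and multiply in one ratio per criterion."""
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1 - prior)
    for score in scores:
        odds *= likelihood_ratio(score)
    return odds


def bayesian_confidence(scores: list[int], prior: float = 0.5, cap: float = 0.95) -> float:
    """Convert posterior odds to a probability and apply the 95% cap."""
    odds = posterior_odds(scores, prior)
    return min(odds / (1 + odds), cap)


example = [18, 19, 17, 18, 19]  # Novelty, Feasibility, Completeness, Confidence, Alignment
print(round(posterior_odds(example), 1))  # 656.6
print(bayesian_confidence(example))       # 0.95
```

Under this mapping a score of 4/20 gives a ratio of exactly 1.0, so only scores below 4 lower the odds. Five scores of 10/20 already push the result past the cap, which suggests the capped figure is best read alongside the raw scores rather than on its own.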

Confidence Interpretation:

90-95%: Exceptional evidence, suitable for critical decisions
80-89%: High confidence, suitable for important decisions
70-79%: Medium confidence, consider additional validation
60-69%: Low confidence, recommend further investigation
<60%: Very low confidence, gather more information

Self-Critique Checklist

After applying ToT methodology, verify:

[ ] Branch Diversity: Are all 5+ branches fundamentally different (not variations)?
[ ] Self-Reflection Quality: Does each branch have genuine self-reflection (not boilerplate)?
[ ] Evaluation Rigor: Did I systematically score all 5 criteria for each branch?
[ ] Depth Achievement: Did I reach minimum 4 levels of exploration?
[ ] Confidence Validity: Is final confidence score justified by exploration depth?
[ ] Pruning Rationale: Can I explain why each non-selected branch was discarded?
[ ] Path Traceability: Can I clearly trace the winning path from root to leaf?
[ ] Synthesis Clarity: Does final output provide actionable recommendation?
[ ] Stopping Appropriateness: Did I stop for valid reasons per criteria?

Common Mistakes to Avoid

Too Few Branches: Using <5 branches reduces exploration quality
Variation vs Diversity: Creating 5 variations of same approach instead of 5 different approaches
Shallow Depth: Stopping at 1-2 levels instead of minimum 4
Biased Evaluation: Favoring familiar approaches without systematic scoring
Missing Self-Reflection: Skipping confidence assessment in branches
Premature Convergence: Selecting winner before thorough evaluation
Over-Recursion: Going beyond 6 levels without clear benefit
Poor Synthesis: Not clearly documenting winning path and rationale

Reference Documentation

Detailed Templates: ~/.claude/skills/tree-of-thoughts/references/tree-of-thoughts-patterns.md

Includes:

Branch exploration template (detailed prompts)
Self-reflection rubric (confidence scoring guide)
Evaluation matrix (scoring examples per criterion)
Level transition logic (when/how to deepen)
Edge case handling (convergence, insufficient diversity)

Quick Start Examples

Example 1: Simple Decision (3 levels)

Problem: Choose between Redis, Memcached, or Hazelcast for caching

Level 0 (3 branches):

Branch A: Redis (rich data structures)
Branch B: Memcached (pure speed)
Branch C: Hazelcast (distributed computing)
→ Winner: Branch A (Redis) - 85/100

Level 1 (Redis deployment options):

A.1: Single instance
A.2: Sentinel (high availability)
A.3: Cluster (horizontal scaling)
A.4: Redis Enterprise
A.5: Managed service (AWS ElastiCache)
→ Winner: Branch A.3 (Cluster) - 88/100

Level 2 (Cluster configuration):

A.3.1: 3 masters, no replicas
A.3.2: 3 masters, 3 replicas
A.3.3: 6 masters, 6 replicas
A.3.4: Auto-scaling cluster
A.3.5: Hybrid (critical data replicated)
→ Winner: Branch A.3.2 (3+3) - 91/100

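Confidence: 88% (3 levels, clear winner at each level). This abbreviated walkthrough starts from 3 branches and stops at 3 levels, below the 5-branch and 4-level minimums that a full analysis requires.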

Example 2: Complex Architecture (5 levels)

Problem: Design microservices communication strategy

Level 0: REST, gRPC, Message Queue, Event Sourcing, GraphQL (5 approaches)
Level 1: [Winner] expanded into 5 sub-approaches
Level 2: [Winner] expanded into 5 implementation variants
Level 3: [Winner] expanded into 5 technology choices
Level 4: [Winner] expanded into 5 deployment patterns

Confidence: 93% (5 levels, 25+ branches explored total)

Summary

Tree of Thoughts is a systematic methodology for exploring complex problem spaces through:

Parallel branching (5+ approaches per level)
Self-reflection (confidence scoring for each branch)
Rigorous evaluation (5 criteria, 0-100 scoring)
Recursive depth (minimum 4 levels)
Bayesian confidence (evidence-based scoring)

Use it for strategic decisions, architectural choices, and optimization problems where systematic exploration yields better outcomes than intuition alone.

Remember: Quality over speed. ToT trades time for rigor. The goal is high-confidence optimal solutions, not quick answers.
