Agentic KPI Tracking Skill

Guides the measurement and tracking of agentic coding KPIs to assess readiness for ZTE (Zero Touch Engineering).

When to Use

Measuring agentic workflow effectiveness
Tracking progress toward ZTE
Analyzing success patterns
Identifying improvement areas

Core KPIs

Summary Metrics

| Metric | Calculation | Target |
| --- | --- | --- |
| Current Streak | Consecutive successes (Attempts <= 2) | Higher is better |
| Longest Streak | Best consecutive success run | Track improvement |
| Average Presence | Mean attempts across all runs | 1 |
| Total Plan Size | Sum of all plan sizes | Track scaling |
| Total Diff Size | Sum of all changes (added + removed) | Track throughput |

Per-Run Metrics

| Metric | Source | Meaning |
| --- | --- | --- |
| Attempts | Count of plan/patch runs | 1 = perfect, higher = retries |
| Plan Size | Lines in plan file | Task complexity |
| Diff Size | Lines added + removed | Change magnitude |
| Files Changed | Number of files modified | Change scope |

Calculation Methods

Attempts Count

Only count workflow restarts:

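# Workflows that count as an attempt; all_adws is the list of workflow names run for this issue
attempts_incrementing = ["adw_plan_iso", "adw_patch_iso"]
attempts = sum(1 for workflow in all_adws if workflow in attempts_incrementing)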

Build, test, and review runs do not increment the count; only plan and patch runs (full replans) do.

Streak Calculation

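# runs is ordered oldest to newest, so walk backward from the latest run
current_streak = 0
for run in reversed(runs):
    if run.attempts <= 2:
        current_streak += 1
    else:
        # The first failure ends the current streak
        break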
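
The loop above covers only the current streak. Longest Streak can be computed in a single forward pass, as in the sketch below. It assumes attempt counts are stored oldest first; the function names and the `SUCCESS_MAX_ATTEMPTS` constant are illustrative, not part of the skill.

```python
SUCCESS_MAX_ATTEMPTS = 2  # success threshold from the Summary Metrics table


def current_streak(attempts: list[int]) -> int:
    """Count consecutive successes ending at the most recent run."""
    streak = 0
    for n in reversed(attempts):
        if n > SUCCESS_MAX_ATTEMPTS:
            break
        streak += 1
    return streak


def longest_streak(attempts: list[int]) -> int:
    """Return the longest run of consecutive successes in the history."""
    best = run = 0
    for n in attempts:
        # Extend the running streak on success, reset it on failure
        run = run + 1 if n <= SUCCESS_MAX_ATTEMPTS else 0
        best = max(best, run)
    return best


history = [1, 1, 1, 3, 2, 1]  # oldest first (assumed ordering)
print(current_streak(history), longest_streak(history))  # prints "2 3"
```

Keeping the threshold in one constant means both streak metrics stay consistent if the success definition changes.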

Diff Statistics

git diff origin/main --shortstat
# Output: X files changed, Y insertions(+), Z deletions(-)
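
One possible way to turn that output into numbers is a small regex parser. The sketch below is an assumption about how the parsing could be done, and `parse_shortstat` is a hypothetical helper name. It accounts for the fact that git uses singular forms ("1 file changed") and omits the insertions or deletions part when that count is zero.

```python
import re


def parse_shortstat(line: str) -> dict[str, int]:
    """Parse one line of `git diff --shortstat` output into counts."""

    def grab(pattern: str) -> int:
        # A missing segment means the count is zero
        match = re.search(pattern, line)
        return int(match.group(1)) if match else 0

    return {
        "files": grab(r"(\d+) files? changed"),
        "added": grab(r"(\d+) insertions?\(\+\)"),
        "removed": grab(r"(\d+) deletions?\(-\)"),
    }


stats = parse_shortstat(" 4 files changed, 67 insertions(+), 23 deletions(-)")
print(f"+{stats['added']}/-{stats['removed']} ({stats['files']} files)")
# prints "+67/-23 (4 files)"
```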

KPI File Format

Store in app_docs/agentic_kpis.md or equivalent:

# Agentic KPIs

## Summary

| Metric | Value |
| --- | --- |
| Current Streak | 5 |
| Longest Streak | 12 |
| Average Presence | 1.3 |
| Total Plan Size | 450 lines |
| Total Diff Size | 2,340 lines |

## Detail

| Date | ADW ID | Issue | Class | Attempts | Plan Size | Diff +/- | Files |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2024-01-15 | abc123 | #45 | /bug | 1 | 35 | +45/-12 | 3 |
| 2024-01-14 | def456 | #44 | /feature | 2 | 85 | +120/-30 | 8 |

Tracking Workflow

Step 1: Gather Current Run Data

From state or git:

ADW ID
Issue number
Issue classification
Plan file path
All workflows run (for attempts)

Step 2: Calculate Metrics

attempts = count_attempts(all_adws)
plan_size = wc_lines(plan_file)
diff_stats = parse_git_diff()

Step 3: Update Detail Table

Add new row with current run data.
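
As a rough illustration, a row in the Detail table format could be rendered like this. The `RunRecord` dataclass and `detail_row` function are hypothetical names introduced for the example; only the column order comes from the KPI file format above.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical container for one run's data."""
    date: str
    adw_id: str
    issue: int
    issue_class: str
    attempts: int
    plan_size: int
    added: int
    removed: int
    files: int


def detail_row(run: RunRecord) -> str:
    """Render one run as a markdown row matching the Detail table columns."""
    cells = [
        run.date,
        run.adw_id,
        f"#{run.issue}",
        run.issue_class,
        str(run.attempts),
        str(run.plan_size),
        f"+{run.added}/-{run.removed}",
        str(run.files),
    ]
    return "| " + " | ".join(cells) + " |"


print(detail_row(RunRecord("2024-01-15", "abc123", 45, "/bug", 1, 35, 45, 12, 3)))
# prints "| 2024-01-15 | abc123 | #45 | /bug | 1 | 35 | +45/-12 | 3 |"
```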

Step 4: Recalculate Summary

Update all summary metrics based on full detail table.
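
A minimal sketch of the non-streak summary metrics, assuming each detail row has already been parsed into a dict with the keys shown. The key names and the `summarize` function are illustrative, not defined by the skill.

```python
from statistics import mean


def summarize(rows: list[dict[str, int]]) -> dict[str, float]:
    """Recompute Average Presence and the two totals from all detail rows."""
    if not rows:
        return {"average_presence": 0.0, "total_plan_size": 0, "total_diff_size": 0}
    return {
        # Average Presence is the mean attempts across all runs
        "average_presence": round(mean(r["attempts"] for r in rows), 2),
        "total_plan_size": sum(r["plan_size"] for r in rows),
        # Diff size counts both added and removed lines
        "total_diff_size": sum(r["added"] + r["removed"] for r in rows),
    }


rows = [
    {"attempts": 1, "plan_size": 35, "added": 45, "removed": 12},
    {"attempts": 2, "plan_size": 85, "added": 120, "removed": 30},
]
print(summarize(rows))
# prints {'average_presence': 1.5, 'total_plan_size': 120, 'total_diff_size': 207}
```

Recomputing from the full table on every update, rather than adjusting the previous summary incrementally, keeps the summary correct even if an earlier row is edited by hand.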

Step 5: Analyze Trends

Is streak increasing?
Is average presence decreasing?
Are plan sizes growing (handling bigger tasks)?

ZTE Readiness Indicators

Based on KPIs, assess ZTE readiness:

| Indicator | Threshold | Status |
| --- | --- | --- |
| Current Streak | >= 5 | Ready to try ZTE |
| Average Presence | <= 1.5 | Good efficiency |
| Recent Failures | 0 in last 10 | High confidence |
| Plan Size Trend | Increasing | Scaling up |
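
The first three thresholds can be checked mechanically, as in the hedged sketch below. It makes several assumptions: a failure is any run with more than 2 attempts (the complement of the success rule), the "0 in last 10" check requires at least 10 recorded runs, and the history is ordered oldest first. The function name `zte_readiness` is hypothetical. Plan Size Trend is left out because the document does not define how the trend is measured.

```python
from statistics import mean


def zte_readiness(attempts: list[int]) -> dict[str, bool]:
    """Evaluate readiness thresholds from per-run attempt counts (oldest first)."""
    # Current streak: consecutive successes ending at the latest run
    streak = 0
    for n in reversed(attempts):
        if n > 2:
            break
        streak += 1

    recent = attempts[-10:]
    return {
        "current_streak": streak >= 5,
        "average_presence": bool(attempts) and mean(attempts) <= 1.5,
        # Assumption: fewer than 10 runs is not enough evidence to pass
        "recent_failures": len(recent) == 10 and all(n <= 2 for n in recent),
    }


history = [3, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1]
print(zte_readiness(history))
# prints {'current_streak': True, 'average_presence': True, 'recent_failures': True}
```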

Key Memory References

@agentic-kpis.md - KPI definitions from Lesson 002
@zte-progression.md - How KPIs relate to ZTE levels
@zte-confidence-building.md - Using KPIs for confidence

Output Format

Provide KPI update:

## KPI Update

**Run:** {adw_id}
**Issue:** #{issue_number} ({issue_class})

### This Run
- Attempts: 1
- Plan Size: 45 lines
- Diff: +67/-23 (4 files)

### Updated Summary
- Current Streak: 6 (was 5)
- Longest Streak: 12 (unchanged)
- Average Presence: 1.28 (improved)

### Analysis
[Trend observations and recommendations]

Anti-Patterns

Gaming metrics (easy tasks only)
Ignoring failures (not counting retries)
Not tracking consistently
Celebrating streaks over actual delivery

Version History

v1.0.0 (2025-12-26): Initial release

Last Updated

Date: 2025-12-26
Model: claude-opus-4-5-20251101
