Agent Skills: product-metrics

Define, instrument, and analyze product metrics across acquisition, activation, engagement, retention, and monetization. Activate when setting OKRs, designing a metrics dashboard, running a weekly or monthly metrics review, diagnosing metric movements, choosing KPIs for a product area, building a metrics framework, or evaluating product health.

Category: Uncategorized
ID: vm0-ai/vm0-skills/product-metrics

Install this agent skill to your local environment:

pnpm dlx add-skill https://github.com/vm0-ai/vm0-skills/tree/HEAD/product-metrics

Skill Files

Browse the full folder contents for product-metrics.


product-metrics/SKILL.md

Skill Metadata

Name: product-metrics
Description: Define, instrument, and analyze product metrics across acquisition, activation, engagement, retention, and monetization. Activate when setting OKRs, designing a metrics dashboard, running a weekly or monthly metrics review, diagnosing metric movements, choosing KPIs for a product area, building a metrics framework, or evaluating product health.

Metrics Architecture

Structure product measurement into three layers, each serving a distinct purpose.

North Star Metric

A single indicator that captures the fundamental value the product delivers. Selection criteria:

  • Value-reflective: Increases when users extract more benefit from the product
  • Forward-looking: Reliably predicts sustained business outcomes like revenue and retention
  • Influenceable: The product team's work can demonstrably move it
  • Broadly understood: Anyone in the organization can grasp its meaning and significance

Illustrative North Star choices by product category:

  • Team collaboration platform: Weekly active teams where three or more members contribute
  • Two-sided marketplace: Weekly completed transactions
  • Enterprise SaaS: Weekly active users who execute the core workflow
  • Media or content product: Weekly minutes of engaged consumption
  • Developer tooling: Weekly production deployments facilitated by the tool

L1 Indicators (Product Health)

Five to seven metrics that collectively represent the full user lifecycle. Organized by lifecycle phase:

Acquisition -- Are new users discovering the product?

  • Volume of new registrations or trial initiations and their trajectory
  • Visitor-to-registration conversion rate
  • Distribution across acquisition channels
  • Per-channel acquisition cost (for paid efforts)

Activation -- Are newcomers reaching the value threshold?

  • Activation rate: fraction of new users who perform the action most predictive of retention
  • Time-to-activation: elapsed duration from registration to activation
  • Onboarding completion rate: fraction who finish the guided setup sequence
  • First value moment: the point at which users first experience the product's core promise

Engagement -- Are active users deriving ongoing value?

  • Active user counts at daily, weekly, and monthly granularity (DAU, WAU, MAU)
  • Stickiness ratio (DAU divided by MAU): how habitual the product is
  • Core action frequency: how often users perform the most meaningful operation
  • Depth per session: volume of activity within a single visit
  • Feature penetration: share of users who adopt specific capabilities

Retention -- Are users returning over time?

  • Cohort retention at standard intervals: day 1, day 7, day 30, day 90
  • Retention curves by signup cohort showing decay and stabilization
  • Churn rate: fraction of users or revenue lost per period
  • Reactivation rate: fraction of previously lapsed users who return

Monetization -- Is user value converting to revenue?

  • Free-to-paid conversion rate (for freemium models)
  • Monthly and annual recurring revenue (MRR / ARR)
  • Average revenue per user or account (ARPU / ARPA)
  • Expansion revenue: growth generated by existing customers
  • Net revenue retention: combined effect of expansion, contraction, and churn
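
Net revenue retention rolls those three forces into a single ratio. A minimal sketch of the arithmetic, using invented figures (every name and number here is illustrative, not from this skill):

  # Hypothetical monthly figures for the existing customer base
  starting_mrr = 100_000  # MRR from customers active at period start
  expansion    =  12_000  # upgrades and seat additions from those customers
  contraction  =   3_000  # downgrades
  churned      =   5_000  # MRR lost to cancellations

  nrr = (starting_mrr + expansion - contraction - churned) / starting_mrr
  print(f"Net revenue retention: {nrr:.0%}")  # 104% -- expansion outpaces losses

A value above 100% means the installed base grows revenue on its own, before any new acquisition.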

Satisfaction -- How do users perceive the experience?

  • Net Promoter Score (NPS)
  • Customer Satisfaction Score (CSAT)
  • Support ticket volume and mean resolution time
  • App store ratings and review sentiment analysis
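
NPS compresses 0-10 "how likely are you to recommend" responses into promoters minus detractors. A quick sketch of the standard calculation (the survey scores are invented):

  # Classify 0-10 recommendation scores: 9-10 promoter, 0-6 detractor
  scores = [10, 9, 9, 8, 7, 6, 10, 3, 9, 8]  # hypothetical responses

  promoters  = sum(1 for s in scores if s >= 9)
  detractors = sum(1 for s in scores if s <= 6)
  nps = (promoters - detractors) / len(scores) * 100
  print(f"NPS: {nps:+.0f}")  # 5 promoters, 2 detractors, n=10 -> +30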

L2 Indicators (Diagnostic Detail)

Granular metrics used to investigate why L1 indicators move:

  • Step-by-step funnel conversion rates
  • Per-feature usage and adoption measurements
  • Segment-level breakdowns: by plan tier, company size, geography, user role
  • Technical performance: page load latency, error rates, API response times
  • Content or feature-level engagement analysis: which surfaces drive the most activity

Key Metric Deep Dives

Active Users (DAU / WAU / MAU)

Definition: Unique users who perform a qualifying action within a day, week, or month.

Critical design choices:

  • Define "active" precisely. Logging in, loading a page, and executing a core action tell fundamentally different stories.
  • Match the timeframe to natural usage cadence. DAU for daily-use products (chat, email). WAU for weekly-use products (project tracking). MAU for episodic products (tax filing, travel booking).

Interpretation guidance:

  • Stickiness (DAU/MAU) above 0.5 signals daily-habit status. Below 0.2 suggests sporadic engagement.
  • Trajectory matters more than absolute level. Watch for growth, plateau, or decline.
  • Segment by user archetype. Power users and occasional visitors exhibit vastly different patterns.
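
A minimal sketch of deriving these counts from a raw event log with pandas. The table and its columns (user_id, timestamp, event) are assumptions about your instrumentation, and "active" is deliberately restricted to the core action:

  import pandas as pd

  # Hypothetical event log; in practice this comes from your analytics store
  events = pd.DataFrame({
      "user_id": [1, 1, 2, 3, 2, 1],
      "timestamp": pd.to_datetime([
          "2024-06-03", "2024-06-04", "2024-06-04",
          "2024-06-10", "2024-06-18", "2024-06-25",
      ]),
      "event": ["core_action"] * 6,  # only the qualifying action counts as "active"
  })

  active = events[events["event"] == "core_action"]
  dau = active.groupby(active["timestamp"].dt.date)["user_id"].nunique()
  mau = active.groupby(active["timestamp"].dt.to_period("M"))["user_id"].nunique()

  # Stickiness for this single month of data: average DAU divided by MAU.
  # Production code should reindex dau over the full calendar so that
  # zero-activity days pull the average down.
  print(f"DAU/MAU stickiness: {dau.mean() / mau.iloc[0]:.2f}")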

Retention

Definition: Of users who arrived in cohort X, what percentage remain active in period Y?

Standard measurement windows:

  • Day 1: Was the initial experience compelling enough to prompt a return?
  • Day 7: Has the user begun forming a usage habit?
  • Day 30: Is the user retained at a meaningful horizon?
  • Day 90: Has the user become durably embedded?

Analytical approaches:

  • Chart retention curves by cohort. Steep initial falloff signals an activation gap. Steady ongoing decline points to an engagement deficit. A flattening curve indicates a healthy stable base.
  • Compare cohorts chronologically. Improving retention in newer cohorts confirms product improvements are landing.
  • Segment by onboarding completion or feature adoption to isolate what behaviors predict lasting retention.
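
A sketch of day-N cohort retention built from signup and activity tables, assuming hypothetical column names (user_id, signup_date, activity_date):

  import pandas as pd

  # Hypothetical inputs: one row per signup, one row per active day
  signups = pd.DataFrame({
      "user_id": [1, 2, 3],
      "signup_date": pd.to_datetime(["2024-06-01", "2024-06-01", "2024-06-08"]),
  })
  activity = pd.DataFrame({
      "user_id": [1, 1, 2, 3],
      "activity_date": pd.to_datetime(
          ["2024-06-02", "2024-06-08", "2024-06-02", "2024-06-09"]),
  })

  df = activity.merge(signups, on="user_id")
  df["day_n"] = (df["activity_date"] - df["signup_date"]).dt.days
  df["cohort"] = df["signup_date"].dt.to_period("W")
  cohort_size = signups.groupby(
      signups["signup_date"].dt.to_period("W"))["user_id"].nunique()

  # Share of each weekly cohort active exactly N days after signup
  for day in (1, 7, 30):
      retained = df[df["day_n"] == day].groupby("cohort")["user_id"].nunique()
      print(f"Day {day} retention:\n{(retained / cohort_size).fillna(0)}")

Plotting these series per cohort produces the retention curves described above; the shape (falloff, decline, flattening) is what carries the diagnosis.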

Funnel Conversion

Definition: The percentage of users who advance from one lifecycle stage to the next.

Typical funnels to instrument:

  • Visitor to registration
  • Registration to activation (first value moment)
  • Free user to paying customer
  • Trial to subscription
  • Monthly plan to annual plan

Analytical approaches:

  • Map the entire funnel and measure conversion at every transition
  • Locate the steepest drop-offs -- these represent the highest-leverage optimization targets
  • Segment conversion by traffic source, plan type, and user profile; different populations convert at very different rates
  • Monitor conversion trends over time to gauge whether iterative improvements are working
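
A minimal sketch of stage-to-stage conversion from per-stage user counts (the stages and counts below are invented for illustration):

  # Hypothetical counts of unique users reaching each stage, in funnel order
  funnel = [
      ("visited",    20_000),
      ("registered",  2_400),
      ("activated",   1_200),
      ("paid",          180),
  ]

  # Conversion at each transition; the steepest drop-off is the
  # highest-leverage optimization target
  for (prev, prev_n), (step, n) in zip(funnel, funnel[1:]):
      print(f"{prev} -> {step}: {n / prev_n:.1%}")
  print(f"end-to-end: {funnel[-1][1] / funnel[0][1]:.2%}")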

Activation Rate

Definition: The fraction of new users who reach the experience where they first realize the product's core value.

Identifying the activation event:

  • Compare behavioral data for retained users versus churned users. What actions distinguish the two groups?
  • The activation event should strongly predict long-term retention
  • It should be reachable within the first session or first few days
  • Examples: created a first project, invited a collaborator, completed the primary workflow, connected an external integration

Operational use:

  • Track activation rate for every registration cohort
  • Measure time-to-activation; shorter intervals almost always correlate with better outcomes
  • Design onboarding sequences that steer users toward the activation moment
  • When testing onboarding changes, evaluate impact on downstream retention, not just activation rate in isolation
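
A sketch of activation rate and time-to-activation per signup cohort, treating "created a first project" as the hypothetical activation event (tables and columns are again assumptions):

  import pandas as pd

  signups = pd.DataFrame({
      "user_id": [1, 2, 3, 4],
      "signup_at": pd.to_datetime(
          ["2024-06-01", "2024-06-02", "2024-06-09", "2024-06-10"]),
  })
  # First activation event per user; users 2 and 4 never activate
  activations = pd.DataFrame({
      "user_id": [1, 3],
      "activated_at": pd.to_datetime(["2024-06-01 04:00", "2024-06-11"]),
  })

  df = signups.merge(activations, on="user_id", how="left")
  df["cohort"] = df["signup_at"].dt.to_period("W")
  df["hours_to_activate"] = (
      df["activated_at"] - df["signup_at"]).dt.total_seconds() / 3600

  print(df.groupby("cohort").agg(
      activation_rate=("activated_at", lambda s: s.notna().mean()),
      median_hours=("hours_to_activate", "median"),
  ))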

Goal-Setting Methodology

OKR Framework (Objectives and Key Results)

Objectives: Qualitative, motivating statements of what the team aims to accomplish.

  • Memorable and directionally inspiring
  • Bounded to a time period (quarter or half)
  • Focused on outcomes, not feature lists

Key Results: Quantitative evidence that the objective has been met.

  • Specific, measurable, and time-bound
  • Framed as outcomes rather than outputs
  • Two to four Key Results per Objective

Worked example:

Objective: Become an essential part of our users' daily routine

Key Results:
- Raise DAU/MAU stickiness from 0.35 to 0.50
- Improve 30-day retention for new cohorts from 40% to 55%
- Achieve >80% task completion rate across three primary workflows

OKR Operating Principles

  • Aim for ambitious-but-plausible targets. Achieving roughly 70% of a stretch OKR signals proper calibration.
  • Key Results measure user and business outcomes, not team output like features shipped or story points completed.
  • Constrain scope: two to three Objectives with two to four Key Results each prevents dilution.
  • If the team is confident of hitting every Key Result, ambition is too low.
  • Conduct a mid-period checkpoint. Reallocate effort toward off-track Key Results if warranted.
  • Score honestly at period's end: 0.0-0.3 = missed, 0.4-0.6 = partial progress, 0.7-1.0 = delivered.
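
One common scoring convention is linear progress from baseline toward target, clamped to [0, 1]; a sketch (the function name is ours, and the example reuses the stickiness Key Result from the worked example above):

  def score_key_result(baseline: float, target: float, current: float) -> float:
      """Linear progress from baseline to target, clamped to [0, 1]."""
      if target == baseline:
          raise ValueError("target must differ from baseline")
      return max(0.0, min(1.0, (current - baseline) / (target - baseline)))

  # Stickiness KR: raise DAU/MAU from 0.35 to 0.50; quarter ended at 0.44
  score = score_key_result(baseline=0.35, target=0.50, current=0.44)
  print(f"score: {score:.2f}")  # 0.60 -> partial progress under the bands above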

Calibrating Metric Targets

  • Baseline: Establish the current value with reliable measurement before committing to a target.
  • External benchmarks: Reference what comparable products or industry reports indicate is achievable.
  • Existing trajectory: If the metric already trends upward at 5% monthly, targeting 6% is not ambitious.
  • Planned investment: Larger bets justify bolder targets.
  • Confidence bands: Set a "commit" level (high confidence) and a "stretch" level (aspirational).

Review Cadences

Weekly Health Check

Objective: Detect anomalies early, monitor active experiments, maintain situational awareness. Duration: 15-30 minutes. Participants: Product manager, optionally the engineering lead.

Agenda:

  • North Star metric: current value and week-over-week delta
  • L1 indicators: flag any notable movements
  • Active experiments: interim results and statistical power
  • Anomaly scan: unexpected spikes, drops, or pattern breaks
  • Triggered alerts: anything that crossed a monitoring threshold

Outcome: If something is off, open an investigation. Otherwise, log observations and proceed.

Monthly Deep Dive

Objective: Assess trends in context, measure progress toward quarterly targets, identify strategic implications. Duration: 30-60 minutes. Participants: Product team and key stakeholders.

Agenda:

  • Full L1 scorecard with month-over-month trends
  • OKR progress: are Key Results on trajectory?
  • Cohort health: are more recent cohorts outperforming earlier ones?
  • Launch performance: how are recently shipped features tracking?
  • Segment divergence: are any user segments behaving differently than expected?

Outcome: Identify one to three areas warranting deeper investigation or adjusted investment. Update priorities if metrics surface new insights.

Quarterly Strategic Review

Objective: Evaluate the quarter holistically, set direction for the next period. Duration: 60-90 minutes. Participants: Product, engineering, design, and leadership.

Agenda:

  • OKR final scoring for the quarter
  • L1 trend analysis spanning the full quarter
  • Year-over-year comparisons for context
  • Competitive and market backdrop: relevant shifts and competitor moves
  • Retrospective: what delivered expected results and what did not

Outcome: Set OKRs for the upcoming quarter. Recalibrate product strategy based on accumulated evidence.

Dashboard Design

Guiding Principles

A well-constructed dashboard answers "how is the product performing?" at a glance.

  1. Design from the decision backward. Identify which decisions the dashboard informs before selecting metrics.

  2. Enforce visual hierarchy. The highest-stakes metric gets the most prominent placement. North Star at the top, L1 indicators below, L2 detail accessible through drill-down.

  3. Always provide context. A raw number in isolation conveys nothing. Pair every metric with: prior-period comparison, target value, and trend direction.

  4. Favor signal density over metric count. Five to ten carefully chosen indicators outperform fifty superficial ones. Relegate the rest to a supplementary report.

  5. Standardize time windows. Display all metrics over the same period. Mixing daily and monthly granularity on one screen breeds confusion.

  6. Use color for instant status:

    • Green: on track or trending favorably
    • Yellow: warrants attention or trending flat
    • Red: off track or declining

  7. Every metric must be actionable. If the team cannot influence a measurement, it does not earn a place on the product dashboard.

Recommended Layout

Row 1: North Star metric with trend line and target overlay.

Row 2: L1 health scorecard -- current value, period change, target, and status indicator for each metric.

Row 3: Key funnels -- visual conversion funnel with drop-off rates at each stage.

Row 4: Experiment and launch tracker -- active tests with preliminary results, recent releases with early performance data.

Drill-down layer: L2 diagnostic metrics, segment breakdowns, and extended time-series charts for investigation.

Dashboard Pitfalls

  • Vanity metrics: Cumulative totals that only climb (all-time signups, lifetime page views) without indicating health
  • Metric overload: Dashboards that require scrolling. If it does not fit on a single screen, trim the metric set.
  • Missing baselines: Numbers shown without prior-period comparison or target reference
  • Abandoned dashboards: Metrics that have not been reviewed or refreshed in months
  • Activity metrics masquerading as outcomes: Measuring internal throughput (tickets closed, pull requests merged) instead of user and business results
  • One-size-fits-all views: Executives, product managers, and engineers need different dashboards. A single view serves none of them well.

Alerting Strategy

Configure automated alerts for metrics that demand prompt response:

  • Threshold alerts: A metric breaches a predefined boundary (error rate exceeds 1%, conversion falls below 5%)
  • Trend alerts: A metric shows sustained decline across multiple consecutive periods
  • Anomaly alerts: A metric deviates significantly from its expected range
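
A minimal sketch of a z-score check over a trailing window, one simple way to implement the anomaly alert above (window size and threshold are illustrative, not prescriptive):

  from statistics import mean, stdev

  def is_anomalous(history: list[float], latest: float,
                   window: int = 28, z_threshold: float = 3.0) -> bool:
      """Flag `latest` if it sits > z_threshold std devs from the trailing window."""
      recent = history[-window:]
      if len(recent) < 2:
          return False  # not enough history to estimate variance
      mu, sigma = mean(recent), stdev(recent)
      if sigma == 0:
          return latest != mu
      return abs(latest - mu) / sigma > z_threshold

  # Example: steady daily conversion rates, then a sudden drop
  daily_conversion = [0.051, 0.049, 0.050, 0.052, 0.048, 0.050, 0.051]
  print(is_anomalous(daily_conversion, 0.031))  # True -- page the responder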

Alert hygiene practices:

  • Every alert must have a corresponding action plan. If nothing can be done, remove the alert.
  • Review and recalibrate alerts periodically. Excessive false positives train teams to ignore all signals.
  • Assign a designated responder for each alert category.
  • Differentiate severity tiers. Not every alert warrants an emergency response.