Utility Skill Skill | Agent Skills

Utility Skill

Overview

A decision framework for agent orchestration based on Liu et al., "Utility-Guided Agent Orchestration for Efficient LLM Tool Use" (arXiv:2603.19896). Each candidate action is scored by subtracting weighted costs from expected gain, producing a single utility value that guides action selection. The framework prevents over-calling tools and premature stopping by making both errors costly. Utility range is [-2.3, 1.0].

When To Use

Deciding whether to dispatch another agent or tool call
Gating expensive tool calls (search, code execution, delegation)
Selecting the right model tier for a sub-task
Continuation decisions after receiving partial results
Verification gating before writing or committing output

When NOT to Use

Single-step operations with one obvious action
Trivial tasks where cost of scoring exceeds benefit
Already-committed actions that cannot be undone

Action Space

A = {respond, retrieve, tool_call, verify, delegate, stop}

| Action | Description | |-----------|------------------------------------------------------| | respond | Emit a final answer from current context | | retrieve | Fetch additional information (search, read, lookup) | | tool_call | Execute a tool (code runner, API, file write) | | verify | Check a prior result for correctness or completeness | | delegate | Spawn a sub-agent or hand off to a specialist | | stop | Terminate the loop and return current state |

Utility Function

U(a | s_t) = Gain(a | s_t)
           - λ₁ · StepCost(a | s_t)
           - λ₂ · Uncertainty(a | s_t)
           - λ₃ · Redundancy(a | s_t)

| Parameter | Default | Rationale | |-----------|---------|---------------------------------------------------| | λ₁ | 1.0 | Cost baseline; all other weights relative to this | | λ₂ | 0.5 | Weak empirical correlation with outcome (r=0.0131) | | λ₃ | 0.8 | Redundancy pruning yields ~10% token savings |

Utility range: [-2.3, 1.0]. Positive values indicate the action is worth taking. Values below the floor (-0.5 default) indicate the action should be skipped.

Termination Conditions

Stop the loop when any of the following is true:

(a) Selected action is stop
(b) Step budget exhausted (default: 10 steps)
(c) All non-stop actions score below the floor (default: -0.5)

High-gain override: If Gain >= 0.7 for any action, condition (c) may be overridden. Document the override and the gain value in your reasoning trace.

Quick Start

Minimal 4-step advisory pattern:

Construct state: gather task context per modules/state-builder.md
Score candidates: evaluate each action in A per modules/action-selector.md
Prefer highest utility: select the action with the maximum U(a | s_t), subject to termination conditions
Log score and decision: record the winning action, its utility value, and step count before executing

Detailed Resources

State Builder: modules/state-builder.md, how to populate s_t from task context
Gain: modules/gain.md, estimating expected information or progress gain
Step Cost: modules/step-cost.md, token, latency, and monetary cost tables
Uncertainty: modules/uncertainty.md, confidence estimation and calibration
Redundancy: modules/redundancy.md, detecting duplicate or low-delta actions
Action Selector: modules/action-selector.md, scoring loop and tie-breaking rules
Integration: modules/integration.md, wiring utility scoring into existing orchestration loops

Exit Criteria

[ ] State constructed with task goal and prior steps
[ ] All six actions scored before selecting one
[ ] Termination condition checked after each step
[ ] Score and decision logged for each step taken
[ ] High-gain overrides documented with gain value

Agent Skills: Utility Skill

Install this agent skill to your local

Skill Files