examples-auto-run
What it does
- Runs `pnpm build && pnpm -r build-check` first.
- Runs `pnpm examples:start-all` in auto-input mode (interactive prompts are auto-answered; HITL/MCP/apply-patch are auto-approved).
- Executes starts in parallel (default concurrency 4) and pipes each start’s stdout/stderr into its own log file under `.tmp/examples-start-logs/`.
- Provides start/stop/status/logs/tail helpers via `run.sh`.
- If the Codex session ends (no disown/nohup), the child processes receive SIGHUP and exit; `stop` is also available to clean up manually.
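The parallel execution with per-start logs described above can be pictured with a small shell analog. This is only a sketch: the real runner's implementation is not shown in this document, the entry names `alpha`, `beta`, and `gamma` are placeholders, and the log directory defaults to a temp directory here so the sketch has no side effects on `.tmp/examples-start-logs/`.

```bash
# Illustrative only; not the actual runner.
# The documented log directory is .tmp/examples-start-logs/; a temp dir is used by default here.
LOG_DIR="${LOG_DIR:-$(mktemp -d)}"
CONCURRENCY="${EXAMPLES_CONCURRENCY:-4}"
mkdir -p "$LOG_DIR"

# One placeholder entry per line. xargs runs up to $CONCURRENCY jobs at once,
# and each job redirects its stdout and stderr into its own log file.
printf '%s\n' alpha beta gamma |
  xargs -P "$CONCURRENCY" -I{} sh -c 'echo "running $1" > "$2/$1.log" 2>&1' _ {} "$LOG_DIR"

ls "$LOG_DIR"
```

Because every job owns its log file, output from concurrent starts never interleaves, which is what makes the per-start logs usable for later validation.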
Usage
# Start (auto mode, concurrency=4 by default)
.agents/skills/examples-auto-run/scripts/run.sh start [extra args to examples:start-all]
# If you invoke the skill name alone ($examples-auto-run):
# - when `.tmp/examples-rerun.txt` exists and is non-empty, it will run `rerun` automatically
# - otherwise it runs the default `start` command.
# Examples:
.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio
# Check status
.agents/skills/examples-auto-run/scripts/run.sh status
# Stop running job (kills pid from .tmp/examples-auto-run.pid)
.agents/skills/examples-auto-run/scripts/run.sh stop
# List logs (per start script)
.agents/skills/examples-auto-run/scripts/run.sh logs
# Tail latest log
.agents/skills/examples-auto-run/scripts/run.sh tail
.agents/skills/examples-auto-run/scripts/run.sh tail basic__start_hello-world.log
# After a run, build a rerun list from the latest main log (auto-skip list is imported from `scripts/run-example-starts.mjs` and server/audio/external skips are honored)
.agents/skills/examples-auto-run/scripts/run.sh collect
# Rerun only the entries in .tmp/examples-rerun.txt
.agents/skills/examples-auto-run/scripts/run.sh rerun
# Show the current auto-skip list (env or defaults)
.agents/skills/examples-auto-run/scripts/run.sh start --print-auto-skip --dry-run
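The choice between `rerun` and `start` when the skill name is invoked alone comes down to one file test. A minimal sketch, assuming a plain size check is what "non-empty" means; `choose_command` is a hypothetical helper, not part of `run.sh`:

```bash
# RERUN_FILE is the path named in the usage comments; choose_command is hypothetical.
RERUN_FILE="${RERUN_FILE:-.tmp/examples-rerun.txt}"

choose_command() {
  # -s is true only when the file exists and its size is greater than zero.
  if [ -s "$1" ]; then
    echo rerun
  else
    echo start
  fi
}

choose_command "$RERUN_FILE"
```

Note that under this assumption a file containing only a newline still counts as non-empty.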
Defaults (overridable via env)
- `EXAMPLES_INTERACTIVE_MODE=auto`
- `AUTO_APPROVE_MCP=1`, `APPLY_PATCH_AUTO_APPROVE=1`, `AUTO_APPROVE_HITL=1` (set in runner)
- `EXAMPLES_CONCURRENCY=4`
- `EXAMPLES_EXECA_TIMEOUT_MS=300000` (5m); `financial-research-agent` and `computer-use` use 10m inside the script.
- Includes interactive; excludes server/audio/external by default:
  - `EXAMPLES_INCLUDE_INTERACTIVE=1`
  - `EXAMPLES_INCLUDE_SERVER=0`
  - `EXAMPLES_INCLUDE_AUDIO=0`
  - `EXAMPLES_INCLUDE_EXTERNAL=0`
- This means `realtime-*` / `nextjs` (tagged as server/audio) are skipped unless you opt in with `--include-server` / `--include-audio` or the corresponding env flags.
- Auto-skip list: `EXAMPLES_AUTO_SKIP` (comma/space separated) overrides the built-in defaults used by both `run.sh` and `run-example-starts.mjs`. Defaults include `agent-patterns:start:llm-as-a-judge`, `agent-patterns:start:routing`, `customer-service:start`, `connectors:start`, `mcp:start:hosted-mcp-on-approval`, `mcp:start:hosted-mcp-human-in-the-loop`.
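To make the comma/space-separated format of `EXAMPLES_AUTO_SKIP` concrete, here is a rough sketch of how such a value could be split into entries. `auto_skip_entries` and `DEFAULT_AUTO_SKIP` are hypothetical names, the default list is shortened to two of the documented entries, and treating an empty value as "unset" is an assumption rather than confirmed runner behavior.

```bash
# Hypothetical helper; the real parsing lives in run.sh / run-example-starts.mjs.
# Shortened default list for illustration (two of the documented defaults).
DEFAULT_AUTO_SKIP="customer-service:start connectors:start"

auto_skip_entries() {
  # Use the override when it is set and non-empty (assumption), else the defaults.
  # Commas and spaces both become newlines; empty lines are dropped.
  printf '%s\n' "${EXAMPLES_AUTO_SKIP:-$DEFAULT_AUTO_SKIP}" |
    tr ', ' '\n\n' |
    grep -v '^$'
}

auto_skip_entries
```

Setting, for example, `EXAMPLES_AUTO_SKIP="connectors:start, customer-service:start"` before calling the helper yields one `package:script` entry per line, the same shape used by the rerun list.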
Cancellation / cleanup
- Jobs are backgrounded but not disowned; if Codex suspends/ends the shell, the process group gets SIGHUP and stops.
- Manual cleanup: `run.sh stop` (removes stale pid if already exited).
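The stop behavior above (kill the recorded pid, or just remove a stale pid file) can be sketched as follows. This is not the actual `run.sh` code; `stop_job` is a hypothetical function and the messages it prints are invented for illustration.

```bash
# Hypothetical sketch of a pid-file based stop.
PID_FILE="${PID_FILE:-.tmp/examples-auto-run.pid}"

stop_job() {
  if [ ! -f "$PID_FILE" ]; then
    echo "not running"
    return 0
  fi
  pid="$(cat "$PID_FILE")"
  # kill -0 delivers no signal; it only checks whether the process still exists.
  if kill -0 "$pid" 2>/dev/null; then
    kill "$pid"
    echo "stopped $pid"
  else
    echo "stale pid $pid"
  fi
  # In both cases the pid file is removed so a later status check is accurate.
  rm -f "$PID_FILE"
}
```

Checking liveness before killing is what lets the same command double as cleanup after the job has already exited on SIGHUP.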
Log locations
- `.tmp/examples-start-logs/<package>__<script>.log` (per start)
- Main runner log path is printed when `start` is invoked.
- Rerun list (generated by `collect`): `.tmp/examples-rerun.txt` (one `package:script` per line).
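When validating a run it helps to map a rerun-list entry to its log file. The sketch below assumes that colons in the script name become underscores, which is inferred from the `basic__start_hello-world.log` example in the usage section and not confirmed by the runner source; `log_name_for` is a hypothetical helper, and package names are assumed to contain no colon.

```bash
log_name_for() {
  # $1 = package, $2 = script. Colon-to-underscore is an inferred convention.
  printf '%s__%s.log\n' "$1" "$(printf '%s' "$2" | tr ':' '_')"
}

# Split a package:script entry at the first colon, then build the log name.
entry="basic:start:hello-world"
log_name_for "${entry%%:*}" "${entry#*:}"   # prints basic__start_hello-world.log
```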
Notes
- Auto-skip is centralized (same defaults as above) and can be overridden via `EXAMPLES_AUTO_SKIP`. Auto-skip entries are excluded from rerun collection and will be removed from rerun execution automatically.
- Auto-input map covers common interactive prompts; HITL/MCP/apply-patch auto-approve via env is enabled by the runner.
- Shell tool approvals are auto-approved in auto mode (`SHELL_AUTO_APPROVE=1`).
- `rerun` runs entries sequentially, continues after failures, and rewrites `.tmp/examples-rerun.txt` with only the remaining failures. Auto-skip entries are not re-added.
- Behavioral validation is not done in the runner, so Codex must immediately perform it after every `start` or `rerun` invocation without waiting for the user to ask. Required steps:
  - Read the example source to infer intended flow from code/comments (tools invoked, expected outputs, guards, approvals).
  - Read the matching log under `.tmp/examples-start-logs/`.
  - Compare intent vs. log: confirm key actions/results happened; flag omissions or divergences.
  - Do this for all exit-0 entries, not just samples.
  - Summarize findings right after the run completes; when “OK”, note what was checked (e.g., “tools called + final message emitted”).
  - When reporting, do not omit or ellipsize outputs that justify the validation; include the full relevant lines (keep it concise but untruncated).
- The runner prints a full table after the summary: one row per start script with `status`, `package:script`, `info` (reason/exit/skipped), and the log path. If the run stops before the table appears, point the analyzer at the latest `main_*.log` to reconstruct a table and validations.
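The `rerun` semantics in the notes above (sequential, continue after failures, rewrite the list with only remaining failures, never re-add auto-skip entries) can be sketched as a short loop. This is an illustration under stated assumptions, not the real implementation: `rerun_entries` and `run_entry` are hypothetical, and `run_entry` is a stand-in that fails for any entry ending in `:fail`.

```bash
run_entry() {
  # Stand-in for launching one example; entries ending in ":fail" fail here.
  case "$1" in *:fail) return 1 ;; *) return 0 ;; esac
}

rerun_entries() {
  # $1 = rerun list file, $2 = space-separated auto-skip entries
  list="$1"
  skip=" $2 "
  remaining="$(mktemp)"
  while IFS= read -r entry || [ -n "$entry" ]; do
    [ -z "$entry" ] && continue
    # Auto-skip entries are dropped and never written back.
    case "$skip" in *" $entry "*) continue ;; esac
    # Run one entry at a time; record a failure and keep going.
    if ! run_entry "$entry" < /dev/null; then
      echo "$entry" >> "$remaining"
    fi
  done < "$list"
  # The list now holds only the entries that still fail.
  mv "$remaining" "$list"
}
```

Because the list is rewritten on every pass, repeating the rerun step shrinks it until only genuinely failing entries remain, and an empty file means the next bare skill invocation falls back to `start`.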