examples-auto-run Skill


What it does

Runs pnpm build && pnpm -r build-check first.
Runs pnpm examples:start-all in auto-input mode: interactive prompts are auto-answered, and HITL, MCP, and apply-patch requests are auto-approved.
Executes starts in parallel (default concurrency 4) and pipes each start’s stdout/stderr into its own log file under .tmp/examples-start-logs/.
Provides start/stop/status/logs/tail helpers via run.sh.
If the Codex session ends (no disown/nohup), the child processes receive SIGHUP and exit; stop is also available to clean up manually.

Usage

# Start (auto mode, concurrency=4 by default)
.agents/skills/examples-auto-run/scripts/run.sh start [extra args to examples:start-all]
# If you invoke the skill name alone ($examples-auto-run):
#   - when `.tmp/examples-rerun.txt` exists and is non-empty, it will run `rerun` automatically
#   - otherwise it runs the default `start` command.

# Examples:
.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio

# Check status
.agents/skills/examples-auto-run/scripts/run.sh status

# Stop running job (kills pid from .tmp/examples-auto-run.pid)
.agents/skills/examples-auto-run/scripts/run.sh stop

# List logs (per start script)
.agents/skills/examples-auto-run/scripts/run.sh logs

# Tail latest log
.agents/skills/examples-auto-run/scripts/run.sh tail
.agents/skills/examples-auto-run/scripts/run.sh tail basic__start_hello-world.log

# After a run, build a rerun list from the latest main log (auto-skip list is imported from `scripts/run-example-starts.mjs` and server/audio/external skips are honored)
.agents/skills/examples-auto-run/scripts/run.sh collect
# Rerun only the entries in .tmp/examples-rerun.txt
.agents/skills/examples-auto-run/scripts/run.sh rerun
# Show the current auto-skip list (env or defaults)
.agents/skills/examples-auto-run/scripts/run.sh start --print-auto-skip --dry-run
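
The bare-invocation rule described in the comments above (run rerun when .tmp/examples-rerun.txt exists and is non-empty, otherwise run start) can be sketched as a small shell check. This is only an illustration of the rule, not the skill's actual dispatch code; the function name choose_command is hypothetical.

```shell
# Hypothetical helper: decide which run.sh subcommand a bare skill invocation maps to.
# Prints "rerun" when the rerun list exists and is non-empty, otherwise "start".
choose_command() {
  rerun_file="${1:-.tmp/examples-rerun.txt}"
  # `-s` is true only when the file exists and its size is greater than zero.
  if [ -s "$rerun_file" ]; then
    echo rerun
  else
    echo start
  fi
}

# Usage (script path taken from the commands above):
# .agents/skills/examples-auto-run/scripts/run.sh "$(choose_command)"
```

Note that `-s` only checks the file size, so a list containing nothing but blank lines would still count as non-empty in this sketch.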

Defaults (overridable via env)

EXAMPLES_INTERACTIVE_MODE=auto
AUTO_APPROVE_MCP=1, APPLY_PATCH_AUTO_APPROVE=1, AUTO_APPROVE_HITL=1 (set in runner)
EXAMPLES_CONCURRENCY=4
- sandbox:start:memory-generation is still serialized by the runner because its pre-stop memory generation can contend with shared local resources. Other hosted, Unix-local, and Docker sandbox examples remain parallelized.
EXAMPLES_EXECA_TIMEOUT_MS=300000 (5m)
- financial-research-agent and computer-use use 10m inside the script.
Includes interactive; excludes server/audio/external by default:
- EXAMPLES_INCLUDE_INTERACTIVE=1
- EXAMPLES_INCLUDE_SERVER=0
- EXAMPLES_INCLUDE_AUDIO=0
- EXAMPLES_INCLUDE_EXTERNAL=0
- This means realtime-* / nextjs (tagged as server/audio) are skipped unless you opt in with --include-server / --include-audio or the corresponding env flags.
Auto-skip list: EXAMPLES_AUTO_SKIP (comma/space separated) overrides the built-in defaults used by both run.sh and run-example-starts.mjs. Defaults include agent-patterns:start:llm-as-a-judge, connectors:start, mcp:start:hosted-mcp-on-approval, mcp:start:hosted-mcp-human-in-the-loop, tools:start:codex, tools:start:codex-same-thread.
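
Because EXAMPLES_AUTO_SKIP accepts comma- or space-separated entries, a reader checking an override may want to see how such a value breaks into individual entries. The following is a minimal sketch of that splitting, assuming nothing about how run.sh or run-example-starts.mjs parse the value internally; split_skip_list is a hypothetical name.

```shell
# Hypothetical helper: split a comma/space separated skip list into one entry per line.
split_skip_list() {
  # Map commas, spaces, and tabs to newlines, then drop the empty lines
  # left behind by consecutive separators.
  printf '%s\n' "$1" | tr ', \t' '\n\n\n' | sed '/^$/d'
}

# Usage: preview an override before passing it to the runner
# (entry names are taken from the built-in defaults listed above).
# split_skip_list "connectors:start, tools:start:codex"
# EXAMPLES_AUTO_SKIP="connectors:start, tools:start:codex" \
#   .agents/skills/examples-auto-run/scripts/run.sh start --print-auto-skip --dry-run
```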

Codex execution environment

When Codex invokes start or rerun, run the command outside the Codex sandbox by default (sandbox_permissions=require_escalated). Several examples start nested sandboxes, browsers, npm helpers, or local service processes; running from inside the Codex sandbox can produce environment-only failures such as Playwright browser launch permission errors, npm cache permission errors, or nested sandbox setup errors.
Use sandboxed execution only when the user explicitly asks for it or when running a narrow dry-run / log inspection command that does not execute examples.

Cancellation / cleanup

Jobs are backgrounded but not disowned; if Codex suspends/ends the shell, the process group gets SIGHUP and stops.
Manual cleanup: run.sh stop (removes stale pid if already exited).
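
The stop behavior described above (kill the recorded pid, or just remove the pid file when the process has already exited) can be sketched as follows. This is an assumption-laden illustration rather than the contents of run.sh: stop_job and its messages are hypothetical, and only the default pid file path comes from the Usage section.

```shell
# Hypothetical sketch of a stop helper.
stop_job() {
  pid_file="${1:-.tmp/examples-auto-run.pid}"
  if [ ! -f "$pid_file" ]; then
    echo "not running"
    return 0
  fi
  pid="$(cat "$pid_file")"
  # `kill -0` sends no signal; it only checks that the process can be signaled.
  if kill -0 "$pid" 2>/dev/null; then
    kill "$pid"
    echo "stopped $pid"
  else
    # The process already exited, so the pid file is stale.
    echo "stale pid $pid"
  fi
  rm -f "$pid_file"
}
```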

Log locations

.tmp/examples-start-logs/<package>__<script>.log (per start)
The main runner log path is printed when start is invoked.
Rerun list (generated by collect): .tmp/examples-rerun.txt (one package:script per line).
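
To jump from a rerun entry to its log, the package:script form has to be mapped onto the <package>__<script>.log naming shown above. The sketch below assumes that colons inside the script name become underscores, which is what the sample name basic__start_hello-world.log from the Usage section suggests; the runner's real naming rule is not confirmed here, and log_path_for is a hypothetical name.

```shell
# Hypothetical helper: map a "package:script" entry to its per-start log path.
# Assumption: colons in the script name are replaced with underscores.
log_path_for() {
  entry="$1"
  package="${entry%%:*}"   # text before the first colon
  script="${entry#*:}"     # text after the first colon
  printf '.tmp/examples-start-logs/%s__%s.log\n' \
    "$package" "$(printf '%s' "$script" | tr ':' '_')"
}

# Usage: print the expected log path for every entry in the rerun list.
# while IFS= read -r entry; do log_path_for "$entry"; done < .tmp/examples-rerun.txt
```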

Notes

Auto-skip is centralized (same defaults as above) and can be overridden via EXAMPLES_AUTO_SKIP. Auto-skip entries are excluded from rerun collection and will be removed from rerun execution automatically.
Auto-input map covers common interactive prompts; HITL/MCP/apply-patch auto-approve via env is enabled by the runner.
Shell tool approvals are auto-approved in auto mode (SHELL_AUTO_APPROVE=1).
rerun runs entries sequentially, continues after failures, and rewrites .tmp/examples-rerun.txt with only the remaining failures. Auto-skip entries are not re-added.
Behavioral validation is not done in the runner, so Codex must immediately perform it after every start or rerun invocation without waiting for the user to ask. Required steps:
1. Read the example source to infer intended flow from code/comments (tools invoked, expected outputs, guards, approvals).
2. Read the matching log under .tmp/examples-start-logs/.
3. Compare intent vs. log: confirm key actions/results happened; flag omissions or divergences.
4. Do this for all exit-0 entries, not just samples.
5. Summarize findings right after the run completes; when “OK”, note what was checked (e.g., “tools called + final message emitted”).
6. When reporting, do not omit or truncate the output that justifies the validation; quote the relevant lines in full and keep the surrounding commentary concise.
The runner prints a full table after the summary: one row per start script with status, package:script, info (reason/exit/skipped), and the log path. If the run stops before the table appears, point the analyzer at the latest main_*.log to reconstruct a table and validations.
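
When the table is missing, the first step is locating the latest main_*.log. A minimal sketch of that lookup is below. The directory holding the main logs is an assumption (the runner prints the real path when start is invoked), so the function takes the directory as an argument; latest_main_log is a hypothetical name.

```shell
# Hypothetical helper: print the most recently modified main_*.log in a directory.
# Assumption: the default directory below; pass the directory reported by the runner.
latest_main_log() {
  dir="${1:-.tmp/examples-start-logs}"
  # `ls -t` sorts newest first; the error is silenced when nothing matches,
  # in which case the function prints nothing.
  ls -t "$dir"/main_*.log 2>/dev/null | head -n 1
}

# Usage:
# tail -n 50 "$(latest_main_log)"
```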
