Agent Skills: OpenAI transcriptions API

Transcribe audio via the OpenAI Audio Transcriptions API (Whisper).

Category: Uncategorized
ID: openclaw/openclaw/openai-whisper-api

Repository: openclaw
License: MIT

Install this agent skill locally:

pnpm dlx add-skill https://github.com/openclaw/openclaw/tree/HEAD/skills/openai-whisper-api

Skill Files


skills/openai-whisper-api/SKILL.md

Skill Metadata

Name: openai-whisper-api
Description: "OpenAI Audio Transcriptions API via curl; gpt-4o-transcribe, mini, diarize, or whisper-1."

OpenAI transcriptions API

Transcribe audio through /v1/audio/transcriptions. Set OPENAI_BASE_URL for an OpenAI-compatible proxy or local gateway.
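
As a rough illustration of what a request to this endpoint looks like, here is a minimal sketch using curl. The helper names transcriptions_url and transcribe_file are hypothetical, and the sketch assumes that OPENAI_BASE_URL, when set, already ends in /v1 (the convention used by the OpenAI SDKs); the skill's own script may handle the base URL differently.

```shell
# Sketch only: not the skill's actual script.
# Assumption: OPENAI_BASE_URL, when set, already includes the /v1 suffix.
transcriptions_url() (
  base="${OPENAI_BASE_URL:-https://api.openai.com/v1}"
  # Strip one trailing slash so the joined URL has no double slash.
  base="${base%/}"
  printf '%s/audio/transcriptions\n' "$base"
)

# Hypothetical helper: POST one audio file and print the JSON response.
# $1 = path to the audio file, $2 = model (defaults to gpt-4o-transcribe)
transcribe_file() {
  curl -sS "$(transcriptions_url)" \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    -F "file=@$1" \
    -F "model=${2:-gpt-4o-transcribe}"
}
```

With OPENAI_BASE_URL unset, transcriptions_url prints the hosted endpoint; pointing it at a proxy such as http://localhost:8080/v1 (an example address, not one from the skill) redirects the same request there.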

Quick start

{baseDir}/scripts/transcribe.sh /path/to/audio.m4a

Defaults:

  • Model: gpt-4o-transcribe
  • Output: <input>.txt

Useful flags

{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-transcribe --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-mini-transcribe
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-transcribe-diarize --json
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model whisper-1
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language en
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --prompt "Speaker names: Peter, Daniel"
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
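
To make the flags above more concrete, the following sketch shows one plausible way they could map onto multipart form fields of the transcriptions endpoint. The function form_fields is hypothetical, and the choice of response_format=text without --json and response_format=json with it is an assumption; the real script may send different values.

```shell
# Sketch only: a hypothetical flag-to-form-field mapping, not the real script.
# Prints one "name=value" form field per line for the given command line.
form_fields() (
  model="gpt-4o-transcribe"   # default model, per the Defaults list
  language=""; prompt=""; json=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --model)    model="$2"; shift 2 ;;
      --language) language="$2"; shift 2 ;;
      --prompt)   prompt="$2"; shift 2 ;;
      --json)     json=1; shift ;;
      --out)      shift 2 ;;   # output path stays local; it is not sent
      *)          shift ;;     # input path or unknown argument: ignored here
    esac
  done
  printf 'model=%s\n' "$model"
  # Assumption: --json selects JSON output, otherwise plain text is requested.
  if [ "$json" -eq 1 ]; then
    printf 'response_format=json\n'
  else
    printf 'response_format=text\n'
  fi
  if [ -n "$language" ]; then printf 'language=%s\n' "$language"; fi
  if [ -n "$prompt" ]; then printf 'prompt=%s\n' "$prompt"; fi
)
```

Each printed line could be passed to curl as a -F argument alongside the file field.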

Notes:

  • Supported upload formats include mp3, mp4, mpeg, mpga, m4a, wav, webm.
  • 25 MB upload limit on the hosted API.
  • Use gpt-4o-transcribe-diarize for speaker labels; with that model the script sends chunking_strategy=auto and rejects --prompt.
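
The size limit and the diarize rules in these notes can be checked before uploading. The sketch below is one way to do that; check_upload and MAX_BYTES are hypothetical names, and treating "25 MB" as 25 × 1024 × 1024 bytes is an assumption. It deliberately does not reject files by extension, since the format list above is not stated to be exhaustive.

```shell
# Sketch only: pre-flight checks mirroring the notes above.
MAX_BYTES=$((25 * 1024 * 1024))   # assumption: "25 MB" means 25 MiB

# $1 = audio file, $2 = model, $3 = optional prompt text
# Prints any extra form field the model needs; returns non-zero on rejection.
check_upload() (
  file="$1"; model="$2"; prompt="${3:-}"
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 1
  fi
  # wc -c counts bytes; the arithmetic expansion strips padding some wc builds add.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "file is $size bytes; limit is $MAX_BYTES" >&2
    return 1
  fi
  if [ "$model" = "gpt-4o-transcribe-diarize" ]; then
    if [ -n "$prompt" ]; then
      echo "--prompt is not accepted with the diarize model" >&2
      return 1
    fi
    echo "chunking_strategy=auto"
  fi
)
```

A caller might run check_upload first and only invoke curl when it succeeds, splitting or re-encoding the audio when the size check fails.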

API key

Set OPENAI_API_KEY in the environment, or configure the key in the active OpenClaw config file ($OPENCLAW_CONFIG_PATH, default ~/.openclaw/openclaw.json) as in the snippet below. OPENAI_BASE_URL can optionally be set as well.

{
  skills: {
    "openai-whisper-api": {
      apiKey: "OPENAI_KEY_HERE",
    },
  },
}
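
The lookup described above can be sketched as follows. The functions config_path and key_source are hypothetical, and checking the environment variable before the config file is an assumption, since the precedence is not stated. The sketch only reports where a key would come from; it does not parse the config file, whose unquoted keys are not strict JSON.

```shell
# Sketch only: reports the likely key source without reading the key itself.
config_path() {
  # Falls back to the documented default when OPENCLAW_CONFIG_PATH is unset.
  printf '%s\n' "${OPENCLAW_CONFIG_PATH:-$HOME/.openclaw/openclaw.json}"
}

key_source() {
  # Assumption: the environment variable is consulted before the config file.
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$(config_path)" ]; then
    echo "config:$(config_path)"
  else
    echo "none"
    return 1
  fi
}
```

Running key_source before a transcription gives a quick way to see whether a missing key is the reason a request would fail.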