Agent Skills: Synthesize Speech

Synthesize text to speech with Kokoro TTS. TRIGGERS - speak this, kokoro tts, text to speech, synthesize voice, say this.

Uncategorized
ID: terrylica/cc-skills/synthesize

Install this agent skill locally:

pnpm dlx add-skill https://github.com/terrylica/cc-skills/tree/HEAD/plugins/kokoro-tts/skills/synthesize

plugins/kokoro-tts/skills/synthesize/SKILL.md

Skill Metadata

Name: synthesize
Description: "Synthesize text to speech with Kokoro TTS. TRIGGERS - speak this, kokoro tts, text to speech, synthesize voice, say this."

Synthesize Speech

Generate speech from text using the Kokoro TTS CLI tool. Supports single WAV output or chunked streaming for long text.

Quick Usage

# Single WAV
~/.local/share/kokoro/.venv/bin/python ~/.local/share/kokoro/tts_generate.py \
  --text "Hello from Kokoro TTS" --voice af_heart --lang en-us --speed 1.0 \
  --output /tmp/kokoro-tts-$$.wav

# Play it
afplay /tmp/kokoro-tts-$$.wav

Parameters

| Parameter | Default    | Description                          |
| --------- | ---------- | ------------------------------------ |
| --text    | (required) | Text to synthesize                   |
| --voice   | af_heart   | Voice name (see voice catalog)       |
| --lang    | en-us      | Language code (en-us, zh, ja, etc.)  |
| --speed   | 1.0        | Speech speed multiplier              |
| --output  | (required) | Output WAV path                      |
| --chunk   | off        | Chunked streaming mode for long text |
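When the CLI is driven from a script rather than typed by hand, the parameter table can be mirrored in a small helper. The following is a minimal sketch, not part of the skill itself: the function name `build_command` and the constant `KOKORO_HOME` are hypothetical, and the install path is taken from the Quick Usage commands. Only the flags and defaults listed in the table are used.

```python
from pathlib import Path

# Assumption: install location copied from the Quick Usage commands.
KOKORO_HOME = Path.home() / ".local" / "share" / "kokoro"


def build_command(text, output, voice="af_heart", lang="en-us", speed=1.0, chunk=False):
    """Return an argv list for tts_generate.py using the documented flags and defaults."""
    if not text.strip():
        # The troubleshooting table lists blank input as an error, so fail early.
        raise ValueError("--text must be non-empty")
    argv = [
        str(KOKORO_HOME / ".venv" / "bin" / "python"),
        str(KOKORO_HOME / "tts_generate.py"),
        "--text", text,
        "--voice", voice,
        "--lang", lang,
        "--speed", str(speed),
        "--output", str(output),
    ]
    if chunk:
        # --chunk is a bare switch (default: off).
        argv.append("--chunk")
    return argv
```

The resulting list can be handed to `subprocess.run(build_command("Hello", "/tmp/out.wav"), check=True)`. Passing a list instead of a shell string avoids quoting problems when the text contains quotes or shell metacharacters.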

Voice Catalog

See Voice Catalog for all available voices with quality grades.

Top voices:

| Voice ID  | Name   | Grade | Gender |
| --------- | ------ | ----- | ------ |
| af_heart  | Heart  | A     | Female |
| af_bella  | Bella  | A-    | Female |
| af_nicole | Nicole | B-    | Female |

Chunked Streaming

For long text, use --chunk to get progressive playback:

~/.local/share/kokoro/.venv/bin/python ~/.local/share/kokoro/tts_generate.py \
  --text "Long text here..." --voice af_heart --lang en-us --speed 1.0 \
  --output /tmp/kokoro-tts-$$.wav --chunk

Each chunk WAV path is printed to stdout as it becomes ready. The final line is DONE <ms>.
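The stdout protocol described above (one chunk path per line, then a final `DONE <ms>` line) can be consumed incrementally. The sketch below is an illustration under assumptions: it treats every non-empty line before `DONE` as a WAV path, returns the `<ms>` value as an uninterpreted string, and the names `parse_chunk_stream` and `play_chunks` are hypothetical. Playback uses `afplay`, as in Quick Usage.

```python
import subprocess
from typing import Iterable, Iterator, Tuple


def parse_chunk_stream(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("chunk", path) for each chunk line and ("done", value) for the final line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line == "DONE" or line.startswith("DONE "):
            # Everything after "DONE" is returned as-is; its unit is not interpreted here.
            yield ("done", line[len("DONE"):].strip())
            return
        yield ("chunk", line)


def play_chunks(argv):
    """Run the CLI (argv should include --chunk) and play each chunk as it is reported."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for kind, value in parse_chunk_stream(proc.stdout):
            if kind == "chunk":
                subprocess.run(["afplay", value], check=True)
            else:
                print(f"generation finished, reported value: {value}")
```

Keeping the parser separate from the subprocess call makes it easy to check against captured output, for example `list(parse_chunk_stream(["/tmp/a.wav\n", "DONE 1234\n"]))`.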

Troubleshooting

| Issue            | Cause            | Solution                           |
| ---------------- | ---------------- | ---------------------------------- |
| No audio output  | Model not loaded | Run /kokoro-tts:install first      |
| Empty text error | Input was blank  | Provide non-empty --text           |
| Slow generation  | First-run warmup | Normal; subsequent runs are faster |
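Two of the issues in the table can be caught before the CLI is invoked. The sketch below is a rough pre-flight check, not something the skill ships: the function name `preflight` is hypothetical, and it only verifies that the interpreter and script from the Quick Usage paths exist. That is a necessary condition for a working install, not proof that the model is loaded.

```python
from pathlib import Path


def preflight(text, kokoro_home=None):
    """Return a list of human-readable problems; an empty list means nothing obvious is wrong."""
    # Assumption: default install location copied from the Quick Usage commands.
    home = Path(kokoro_home) if kokoro_home else Path.home() / ".local" / "share" / "kokoro"
    problems = []
    if not text or not text.strip():
        problems.append("empty text: provide non-empty --text")
    for required in (home / ".venv" / "bin" / "python", home / "tts_generate.py"):
        if not required.exists():
            problems.append(f"missing {required}: run /kokoro-tts:install first")
    return problems
```

Calling `preflight("Hello")` before synthesis and printing any returned messages gives a clearer hint than a failed subprocess call.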