Agent Skills in category: audio-processing

8 skills match this category. Browse curated collections and explore related Agent Skills.

whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

speech-to-text, multilingual, automatic-speech-recognition, transcription
ovachiever
81

sound-effects-generator

Generate audio tones, noise, DTMF signals, and simple sound effects programmatically. Export to WAV or MP3 format.

audio-generation, sound-effects, DTMF, WAV
dkyazzentwatwa
3
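
To illustrate the kind of DTMF generation the sound-effects-generator description refers to, here is a minimal Python sketch using only the standard library. It is not the skill's own code: the names `DTMF_FREQS`, `dtmf_samples`, and `write_wav` are hypothetical, and the 8 kHz sample rate, 0.2 s duration, and 16-bit mono WAV output are assumptions chosen for brevity. The frequency pairs are the standard DTMF keypad frequencies.

```python
import math
import struct
import wave

# Standard DTMF keypad frequencies as (row Hz, column Hz).
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477), "A": (697, 1633),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477), "B": (770, 1633),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477), "C": (852, 1633),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477), "D": (941, 1633),
}


def dtmf_samples(key, duration=0.2, sample_rate=8000, amplitude=0.5):
    """Return float samples in [-amplitude, amplitude] for one DTMF key."""
    low, high = DTMF_FREQS[key]
    count = round(duration * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate
        # A DTMF digit is the sum of two sine waves; halve the sum so the
        # result never exceeds the requested amplitude.
        mixed = math.sin(2 * math.pi * low * t) + math.sin(2 * math.pi * high * t)
        samples.append(amplitude * 0.5 * mixed)
    return samples


def write_wav(path, samples, sample_rate=8000):
    """Write float samples as 16-bit mono PCM (hypothetical helper)."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

A call such as `write_wav("five.wav", dtmf_samples("5"))` would write a single tone burst. MP3 export is not covered here because the standard library has no MP3 encoder.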

audio-converter

Convert audio files between formats (MP3, WAV, FLAC, OGG, M4A) with bitrate and sample rate control. Batch processing supported.

audio-conversion, batch-processing, bitrate-control, sample-rate
dkyazzentwatwa
3

audio-trimmer

Cut, trim, and edit audio segments with fade effects, speed control, concatenation, and basic audio manipulations.

audio-editing, fade-effects, speed-control, concatenation
dkyazzentwatwa
3

podcast-splitter

Split audio files by detecting silence gaps. Auto-segment podcasts into chapters, remove long silences, and export individual clips.

podcast, audio-editing, silence-detection, audio-segmentation
dkyazzentwatwa
3
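
The silence-gap splitting described for podcast-splitter can be sketched roughly as follows. This is an illustrative Python example, not the skill's implementation: the function name `split_on_silence`, the per-sample amplitude threshold, and the default values are all assumptions. Real tools more often measure loudness over short windows (for example RMS in dBFS) than per sample.

```python
def split_on_silence(samples, sample_rate, threshold=0.02, min_silence=0.5):
    """Return (start, end) sample-index pairs for the non-silent segments.

    samples      float samples in [-1.0, 1.0]
    threshold    absolute amplitude below which a sample counts as quiet
    min_silence  seconds of continuous quiet needed to end a segment
    """
    min_gap = max(1, int(min_silence * sample_rate))
    segments = []
    start = None    # index where the current segment began, if any
    quiet_run = 0   # consecutive quiet samples seen inside the segment

    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_gap:
                # The gap is long enough: close the segment where the
                # quiet run began (end index is exclusive).
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0

    if start is not None:
        # Close the last segment, dropping any trailing quiet samples.
        segments.append((start, len(samples) - quiet_run))
    return segments
```

Each returned pair can be used to slice the sample list into a clip. Gaps shorter than `min_silence` stay inside a segment, so brief pauses do not split it.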

audio-normalizer

Use when asked to normalize audio volume, match loudness, or apply peak/RMS normalization to audio files.

audio-normalization, loudness-matching, peak-normalization, RMS-normalization
dkyazzentwatwa
3
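
The difference between the peak and RMS normalization named in the audio-normalizer description can be shown with a small Python sketch. The function names and target levels below are hypothetical and are not taken from the skill. Samples are assumed to be floats in the range [-1.0, 1.0].

```python
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def rms(samples):
    """Root-mean-square level of the samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_normalize(samples, target_rms=0.1):
    """Scale samples so their RMS level equals target_rms, then clip."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)
    gain = target_rms / current
    # Clipping keeps values in range, but it lowers the resulting RMS
    # whenever the gain pushes peaks past full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

Peak normalization only guarantees headroom, while RMS normalization tracks average level and is therefore closer to matching perceived loudness between files.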

ASR

Implement speech-to-text (ASR, automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64-encoded audio files and returns text transcriptions.

speech-to-text, asr, audio-transcription, audio-input
UholySmokes
1

audio-analysis

Audio analysis with Tone.js and Web Audio API including FFT, frequency data extraction, amplitude measurement, and waveform analysis. Use when extracting audio data for visualizations, beat detection, or any audio-reactive features.

tone.js, web-audio-api, frequency-analysis, beat-detection
Bbeierle12
3