audio-processing | Agent Skills

whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

speech-to-text, multilingual, automatic-speech-recognition, transcription

ovachiever

sound-effects-generator

Generate audio tones, noise, DTMF signals, and simple sound effects programmatically. Export to WAV or MP3 format.

audio-generation, sound-effects, DTMF, WAV

dkyazzentwatwa
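
As a rough illustration of the kind of programmatic generation sound-effects-generator describes, the sketch below builds a DTMF tone with only the Python standard library and writes it as a 16-bit mono WAV. It is not taken from the skill itself: the function names (`dtmf_samples`, `write_wav`), the 8000 Hz sample rate, and the default duration and amplitude are assumptions chosen for the example. The frequency pairs are the standard DTMF keypad frequencies.

```python
import math
import struct
import wave

# Standard DTMF (low, high) frequency pairs in Hz for a 12-key keypad.
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}


def dtmf_samples(key, duration=0.2, sample_rate=8000, amplitude=0.5):
    """Return 16-bit integer samples for one DTMF key (hypothetical helper)."""
    low, high = DTMF_FREQS[key]
    count = int(duration * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate
        # Average the two sine waves so the mixed signal stays within [-1, 1].
        value = 0.5 * (math.sin(2 * math.pi * low * t)
                       + math.sin(2 * math.pi * high * t))
        samples.append(int(amplitude * 32767 * value))
    return samples


def write_wav(path, samples, sample_rate=8000):
    """Write integer samples as a mono, 16-bit PCM WAV file (hypothetical helper)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    write_wav("dtmf_5.wav", dtmf_samples("5"))
```

MP3 export is left out here because the standard library has no MP3 encoder; a real tool would need an external encoder for that.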

audio-converter

Convert audio files between formats (MP3, WAV, FLAC, OGG, M4A) with bitrate and sample rate control. Batch processing supported.

audio-conversion, batch-processing, bitrate-control, sample-rate

dkyazzentwatwa

audio-trimmer

Cut, trim, and edit audio segments with fade effects, speed control, concatenation, and basic audio manipulations.

audio-editing, fade-effects, speed-control, concatenation

dkyazzentwatwa
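
The trimming and fade operations that audio-trimmer describes can be sketched on a plain list of float samples. This is only an illustration under assumed conventions, not the skill's own code: `trim` and `fade` are hypothetical helper names, samples are assumed to be mono floats, and the fades are linear.

```python
def trim(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (seconds), clamped to the clip."""
    start = max(0, int(start_s * sample_rate))
    end = min(len(samples), int(end_s * sample_rate))
    return samples[start:end]


def fade(samples, sample_rate, fade_in_s=0.0, fade_out_s=0.0):
    """Apply linear fade-in and fade-out ramps, returning a new list."""
    out = list(samples)
    n_in = min(len(out), int(fade_in_s * sample_rate))
    n_out = min(len(out), int(fade_out_s * sample_rate))
    for i in range(n_in):
        out[i] *= i / n_in  # gain ramps up from 0 at the first sample
    for i in range(n_out):
        out[len(out) - 1 - i] *= i / n_out  # gain ramps down to 0 at the last sample
    return out
```

Concatenation in this representation is simply list addition (`clip_a + clip_b`), provided both clips share the same sample rate.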

podcast-splitter

Split audio files by detecting silence gaps. Auto-segment podcasts into chapters, remove long silences, and export individual clips.

podcast, audio-editing, silence-detection, audio-segmentation

dkyazzentwatwa
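
Splitting on silence gaps, as podcast-splitter describes, can be approximated with a simple amplitude threshold. The sketch below is a minimal illustration and not the skill's implementation: `find_segments`, the 0.02 threshold, and the 0.5 s minimum gap are all assumed values, and samples are assumed to be mono floats in the range [-1, 1].

```python
def find_segments(samples, sample_rate, threshold=0.02, min_silence_s=0.5):
    """Return (start, end) sample-index pairs for non-silent segments.

    A segment ends once at least min_silence_s of consecutive samples stay
    below the threshold. End indices are exclusive.
    """
    min_silence = max(1, int(min_silence_s * sample_rate))
    segments = []
    seg_start = None  # index where the current segment began, if any
    silent_run = 0    # consecutive quiet samples seen inside the segment
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if seg_start is None:
                seg_start = i
            silent_run = 0
        elif seg_start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # Close the segment at the first quiet sample of this gap.
                segments.append((seg_start, i - silent_run + 1))
                seg_start = None
                silent_run = 0
    if seg_start is not None:
        # Close the final segment, dropping any trailing quiet samples.
        segments.append((seg_start, len(samples) - silent_run))
    return segments
```

Each returned pair can be used to slice the sample list into an individual clip. Gaps shorter than the minimum stay inside a segment, so brief pauses in speech do not create new chapters.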

audio-normalizer

Use when asked to normalize audio volume, match loudness, or apply peak/RMS normalization to audio files.

audio-normalization, loudness-matching, peak-normalization, RMS-normalization

dkyazzentwatwa
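
To make the difference between the peak and RMS normalization mentioned for audio-normalizer concrete, here is a small sketch on float samples in the range [-1, 1]. The function names and default targets are assumptions for illustration, not the skill's API. Peak normalization scales so the loudest sample hits a target level; RMS normalization scales so the average energy hits a target, which can push peaks past full scale, so this sketch clips them.

```python
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # silent or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def rms(samples):
    """Root-mean-square level of the samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_normalize(samples, target_rms=0.1):
    """Scale samples to a target RMS level, hard-clipping to [-1, 1]."""
    current = rms(samples)
    if current == 0:
        return list(samples)
    gain = target_rms / current
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

Hard clipping is the simplest safeguard but distorts the signal; a production tool would more likely reduce the gain or apply a limiter.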

ASR

Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64 encoded audio files and returns accurate text transcriptions.

speech-to-text, asr, audio-transcription, audio-input

UholySmokes

audio-analysis

Audio analysis with Tone.js and the Web Audio API, including FFT, frequency data extraction, amplitude measurement, and waveform analysis. Use when extracting audio data for visualizations, beat detection, or other audio-reactive features.

tone.js, web-audio-api, frequency-analysis, beat-detection

Bbeierle12
