speech-to-text | Agent Skills

Azure AI Services

This skill should be used when the user asks about "Azure AI Search", "Cognitive Search", "AI Foundry", "Azure OpenAI", "speech to text", "text to speech", "Azure AI", "vector search", "semantic search", or mentions Azure AI and cognitive services. Provides best practices and MCP tool guidance for Azure AI services.

Azure, cognitive-services, AI, search

charris-msft

superwhisper-custom-mode

Guide for creating effective Custom Mode prompts and examples for Superwhisper, an AI dictation app. Use when users want to create, improve, or understand Superwhisper custom mode instructions for processing dictated speech with context-awareness, and when users want to generate examples.

prompt-engineering, example, speech-to-text, custom-mode

miguelarios

transcribe-and-analyze

Transcribe audio and video from URLs (YouTube, direct media links) using WhisperKit locally. Optionally analyze transcripts with AI when explicitly requested. Use when users provide URLs to media content and request transcription or speech-to-text conversion.

speech-to-text, audio-processing, video-processing, whisper

buddyh

ASR

Implement speech-to-text (ASR, automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64-encoded audio files and returns accurate text transcriptions.

speech-to-text, asr, audio-transcription, audio-input

UholySmokes

youtube-transcript

Download YouTube video transcripts when the user provides a YouTube URL or asks to download, get, or fetch a transcript from YouTube. Also use when the user wants to transcribe a YouTube video or get its captions or subtitles.

youtube, rest-api, speech-to-text, subtitles