gpu-media-processor
Process audio, video, and media on cloud GPUs. Transcribe with Whisper, clone voices, generate videos, upscale images, and run batch media processing. All results sync back to your Mac.
whisper
OpenAI's general-purpose speech recognition model. Supports transcription in 99 languages, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
blackbelt-meeting-summary
Processes BlackBelt coaching call transcripts and generates Basecamp-ready summaries. Can post directly to Basecamp. Use when the user says "process meeting transcripts", "summarize BlackBelt meetings", "check for new transcripts", or "post to basecamp".
transcript-saver
Save and export the current Claude Code session as a shareable HTML transcript. Use this skill when the user asks to save, export, archive, or publish their current Claude Code conversation or session. Triggers on phrases like "save this transcript", "export this session", "create a transcript", "archive this conversation", "publish to gist", or "share this session". Wraps Simon Willison's claude-code-transcripts tool for in-session use.
youtube-to-knowledge-doc
Use when the user provides a YouTube URL and says "document this", "create notes", or "save this video". Automatically extracts the transcript, determines folder placement, and generates Knowledge Framework documentation with MECE structure, Mermaid diagrams, and clickable timestamp citations.
meeting-recorder
Join Google Meet calls, transcribe audio in real-time, and participate via chat. Use when asked to join a meeting, transcribe a call, attend a video conference, or take meeting notes.
podcast-to-content-suite
Transform podcast transcripts into comprehensive content marketing suites including blog posts, social media content, newsletters, SEO-optimized articles, and timestamps. Use when the user provides podcast transcripts or wants to repurpose podcast content.
supadata
Supadata API via curl. Use this skill to extract transcripts from YouTube/TikTok/Instagram videos and scrape web content to markdown.
assemblyai-streaming
Use this skill when working with AssemblyAI’s Speech-to-Text and LLM Gateway APIs, especially for streaming/live transcription, meeting notetakers, and voice agents that need low-latency transcripts and audio analysis.
transcribe
Speech-to-text transcription using the Groq Whisper API. Supports m4a, mp3, wav, ogg, flac, and webm.
youtube-transcript
Fetch transcripts from YouTube videos for summarization and analysis.
youtube-transcript
Download YouTube video transcripts when the user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when the user wants to transcribe or get captions/subtitles from a YouTube video.
video-transcript-downloader
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.
Video Processor
Download and process videos from YouTube and other platforms. Supports video download, audio extraction, format conversion (mp4, webm), and Whisper transcription. Use when the user mentions YouTube download, video conversion, audio extraction, transcription, mp4, webm, ffmpeg, yt-dlp, or Whisper transcription.
openai-whisper-api
Transcribe audio via the OpenAI Audio Transcriptions API (Whisper).
markitdown
Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPUBs, and more.