Agent Skills with tag: video-processing

15 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

youtube-transcript

Download YouTube video transcripts when the user provides a YouTube URL or asks to download, get, or fetch a transcript from YouTube. Also use when the user wants to transcribe a YouTube video or get its captions or subtitles.

youtube, transcript, subtitle, caption
prof-ramos
0

youtube-processor

Process YouTube videos into summarized Obsidian notes. Use when given a YouTube URL to summarize, extract insights, or turn videos into notes. Triggers on "summarize this video", "process this YouTube", "what's this video about", or any YouTube URL shared for processing.

youtube, summarization, obsidian, note-taking
Eddale
0

media-processing

Process multimedia files with FFmpeg (video/audio encoding, conversion, streaming, filtering, hardware acceleration) and ImageMagick (image manipulation, format conversion, batch processing, effects, composition). Use when converting media formats, encoding videos with specific codecs (H.264, H.265, VP9), resizing/cropping images, extracting audio from video, applying filters and effects, optimizing file sizes, creating streaming manifests (HLS/DASH), generating thumbnails, batch processing images, creating composite images, or implementing media processing pipelines. Supports 100+ formats, hardware acceleration (NVENC, QSV), and complex filtergraphs. Use when: processing images, video, or audio; FFmpeg; ImageMagick; media conversion.

ffmpeg, imagemagick, media-processing, video-processing
wollfoo
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens. Use when: AI, LLM, vision, embedding, image analysis, Gemini API.

google-gemini, multimodal, image-processing, audio-processing
wollfoo
1

video-thumbnail-extractor

Extract frames from videos at specific timestamps or intervals, find best frames, and generate thumbnail grids for previews.

video-processing, thumbnail-generation, frame-extraction, video-previews
dkyazzentwatwa
3

video-to-gif

Convert video clips to optimized GIFs with speed control, cropping, text overlays, and file size optimization. Create perfect GIFs for social media, documentation, and presentations.

gif, video-processing, image-optimization, social-media
dkyazzentwatwa
3

video-captioner

Use when asked to add text overlays, subtitles, or captions to videos with customizable positioning, styling, and timing.

video-processing, subtitles, captions, text-overlays
dkyazzentwatwa
3

video-metadata-inspector

Use when asked to inspect video file metadata, get video duration, resolution, codec information, frame rate, or bitrate.

video-processing, metadata, video-duration, video-resolution
dkyazzentwatwa
3

timelapse-creator

Create timelapse videos from image sequences with frame rate control, transitions, and quality optimization.

video-processing, timelapse, frame-rate, transitions
dkyazzentwatwa
3

transcribe-and-analyze

Transcribe audio and video from URLs (YouTube, direct media links) using WhisperKit locally. Optionally analyze transcripts with AI when explicitly requested. Use when users provide URLs to media content and request transcription or speech-to-text conversion.

speech-to-text, audio-processing, video-processing, whisper
buddyh
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

google-gemini, multimodal-ai, image-analysis, audio-processing
zircote
42

youtube-to-markdown

Use when the user asks to extract, get, or fetch YouTube video transcripts, subtitles, or captions. Writes the video details and transcription into a structured Markdown file.

youtube, markdown, srt, structured-output
vre
6

cloudinary

Upload images and videos to Cloudinary with CDN delivery and transformations. Use this skill for media hosting, optimization, resizing, format conversion, and video concatenation.

cloudinary, media-hosting, image-optimization, video-processing
vm0-ai
12

video-comparer

This skill should be used when comparing two videos to analyze compression results or quality differences. Generates interactive HTML reports with quality metrics (PSNR, SSIM) and frame-by-frame visual comparisons. Triggers when users mention "compare videos", "video quality", "compression analysis", "before/after compression", or request quality assessment of compressed videos.

video-processing, video-quality, compression-analysis, quality-metrics
daymade
15713

video-frames

Extract frames or short clips from videos using ffmpeg.

ffmpeg, media-conversion, video-processing, terminal
steipete
2,731
407