Agent Skills with tag: image-processing

26 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

image

Generate and edit images using AI models via OpenRouter. Supports Nano Banana Pro (Gemini 3 Pro Image), FLUX, and other image generation models.

text-to-image, image-processing, generative-art, ai-models
flight505
0

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

PyTorch, object-detection, image-processing, real-time-processing
ovachiever
81

histolab

Digital pathology image processing toolkit for whole slide images (WSI). Use this skill when working with histopathology slides, processing H&E or IHC stained tissue images, extracting tiles from gigapixel pathology images, detecting tissue regions, segmenting tissue masks, or preparing datasets for computational pathology deep learning pipelines. Applies to WSI formats (SVS, TIFF, NDPI), tile-based analysis, and histological image preprocessing workflows.

digital-pathology, whole-slide-imaging, histopathology, image-processing
ovachiever
81

cloudflare-vectorize

cloudflare, vectorization, image-processing, automation
ovachiever
81

multimodal-looker

multimodal, computer-vision, image-processing, deep-learning
bahayonghang
0

gemini-cli

Use Gemini CLI when processing images, PDFs, or large files, when a 1M+ token context is needed, or when a task calls for Gemini's strong reasoning and fine-grained domain knowledge.

gemini, image-processing, pdf, large-context
metrovoc
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting data from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens. Use when: AI, LLM, vision, embedding, image analysis, Gemini API.

google-gemini, multimodal, image-processing, audio-processing
wollfoo
1

sprite-sheet-generator

Combine multiple images into sprite sheets with customizable grid layouts and generate CSS sprite maps for web development.

sprite-sheet, css, web-development, image-processing
dkyazzentwatwa
3

qr-barcode-reader

Use when asked to scan, decode, read, or extract data from QR codes or barcodes in images.

qr-code, barcode, image-processing, data-extraction
dkyazzentwatwa
3

business-card-scanner

Extract contact information (name, company, email, phone, address) from business card images using OCR.

OCR, image-processing, contact-extraction, business-cards
dkyazzentwatwa
3

color-palette-extractor

Extract dominant colors from images with K-means clustering, generate color palettes, and export them as CSS, JSON, or ASE.

color-palette, image-processing, k-means, css-export
dkyazzentwatwa
3

photo-collage-maker

Create photo collages with grid layouts, custom arrangements, borders, and backgrounds. Combine multiple images into single compositions.

image-processing, photo-collage, layout-design, customization
dkyazzentwatwa
3

image-processing

Process, transform, and analyze images using common operations.

image-processing, image-analysis, computer-vision
tatat
1

gemini-imagegen

Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

text-to-image, image-processing, generative-art, api
phrazzld
21

computer-vision

Image processing, object detection, segmentation, and vision models. Use for image classification, object detection, or visual analysis tasks.

computer-vision, image-processing, machine-learning, object-detection
pluginagentmarketplace
21

Pixel Art Exporter

Export sprites to PNG, GIF, or spritesheet formats with JSON metadata for game engines. Use when the user wants to "export", "save", "output", "render", "generate", "create file", mentions file formats like "PNG", "GIF", "animated GIF", "spritesheet", "sprite sheet", "texture atlas", "tile sheet", or game engine integration with "Unity", "Godot", "Phaser", "Unreal", "GameMaker". Trigger on layout terms ("horizontal", "vertical", "grid", "packed", "strip"), scaling ("2x", "4x", "upscale", "pixel-perfect"), file operations ("save as", "export to", "output to"), metadata formats ("JSON", "XML", "metadata", "atlas data"), and delivery terms ("for web", "for game", "for Twitter", "for itch.io", "optimized").

pixel-art, sprite-sheet, texture-atlas, json
willibrandon
14

Convex Agents Files

Handles file uploads, image attachments, and media processing in agent conversations. Use this when agents analyze images, process documents, or generate files.

file-uploads, agent-tool-interface, image-processing, document-processing
Sstobo
81

gemini-imagegen

Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

text-to-image, image-to-image, image-processing, generative-art
gupsammy
102

Page 1 of 2 · 26 results