Agent Skills with tag: image-analysis

13 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

look-at

This skill should be used when the user asks to 'look at', 'analyze', 'describe', 'extract from', or 'what's in' media files like PDFs, images, diagrams, screenshots, or charts. Triggers include: 'what does this image show', 'extract the table from this PDF', 'describe this diagram', 'what's in this screenshot', 'analyze this chart', 'read this image', 'get text from this PDF', 'summarize this document', or requests for specific data extraction from visual or document files. Use when analyzed/interpreted content is needed rather than literal file reading (which uses Read tool).

pdf, image-analysis, data-extraction, document-summarization
edwinhu
0

pathml

Computational pathology toolkit for analyzing whole-slide images (WSI) and multiparametric imaging data. Use this skill when working with histopathology slides, H&E stained images, multiplex immunofluorescence (CODEX, Vectra), spatial proteomics, nucleus detection/segmentation, tissue graph construction, or training ML models on pathology data. Supports 160+ slide formats including Aperio SVS, NDPI, DICOM, OME-TIFF for digital pathology workflows.

digital-pathology, whole-slide-imaging, histopathology, machine-learning
ovachiever
81

omero-integration

Microscopy data management platform. Access images via Python, retrieve datasets, analyze pixels, manage ROIs and annotations, and run batch processing for high-content screening and microscopy workflows.

microscopy, image-analysis, python, high-content-screening
ovachiever
81

image-comparison-tool

Compare images with SSIM similarity scoring, pixel difference highlighting, and side-by-side visualization.

image-comparison, SSIM, pixel-difference, side-by-side-visualization
dkyazzentwatwa
3

image-processing

Process, transform, and analyze images using common operations.

image-processing, image-analysis, computer-vision
tatat
1

VLM

Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to analyze images, describe visual content, or create applications that combine image understanding with conversational AI. Supports image URLs and base64 encoded images for multimodal interactions.

computer-vision, multimodal, image-analysis, conversational-ai
UholySmokes
1

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

google-gemini, multimodal-ai, image-analysis, audio-processing
zircote
42

Fluxwing Screenshot Importer

Import UI screenshots and generate uxscii components automatically using vision analysis. Use when the user wants to import, convert, or generate .uxm components from screenshots or images.

computer-vision, image-analysis, ui-components, screenshot-capture
trabian
101

color-palette-extractor

Extract color palettes from images, websites, or designs. Identifies dominant colors, generates complementary schemes, and exports in multiple formats (HEX, RGB, HSL, Tailwind, CSS variables). Use when users need color schemes from images, brand colors, or design system palettes.

image-analysis, design-tokens, theming, color-palette
OneWave-AI
237

screenshot-to-code

Convert UI screenshots into working HTML/CSS/React/Vue code. Detects design patterns and components, and generates responsive layouts. Use this when users provide screenshots of websites, apps, or UI designs and want a code implementation.

design-to-code, image-analysis, html-css, react
OneWave-AI
237

Computer Vision

Implement computer vision tasks including image classification, object detection, segmentation, and pose estimation using PyTorch and TensorFlow.

computer-vision, deep-learning, pytorch, tensorflow
aj-geddes
301

ui-designer

Extract design systems from reference UI images and generate implementation-ready UI design prompts. Use when users provide UI screenshots/mockups and want to create consistent designs, generate design systems, or build MVP UIs matching reference aesthetics.

design-system, ui-design, image-analysis, prompt-generation
daymade
15713

omero-integration

Microscopy data management platform. Access images via Python, retrieve datasets, analyze pixels, manage ROIs and annotations, and run batch processing for high-content screening and microscopy workflows.

python, API, microscopy, image-analysis
K-Dense-AI
3,233360