Scripts
1. adb-screen-capture.py
Capture Android device screen and save locally.
# Basic usage
uv run .claude/skills/adb-screen-detection/scripts/adb-screen-capture.py
# Specify device
uv run .claude/skills/adb-screen-detection/scripts/adb-screen-capture.py --device 127.0.0.1:5555
# Custom output path
uv run .claude/skills/adb-screen-detection/scripts/adb-screen-capture.py --output /tmp/screen.png
# JSON output
uv run .claude/skills/adb-screen-detection/scripts/adb-screen-capture.py --json
Output:
{
"device": "127.0.0.1:5555",
"timestamp": "2025-12-01T10:30:45Z",
"local_path": "/tmp/screenshot.png",
"size": [1080, 2400],
"success": true
}
2. adb-ocr-extract.py
Extract all visible text from device screen using Tesseract OCR.
# Basic usage (uses most recent screenshot)
uv run .claude/skills/adb-screen-detection/scripts/adb-ocr-extract.py
# Specify screenshot path
uv run .claude/skills/adb-screen-detection/scripts/adb-ocr-extract.py --image /tmp/screen.png
# Search for specific text
uv run .claude/skills/adb-screen-detection/scripts/adb-ocr-extract.py --search "Login"
# JSON output with coordinates
uv run .claude/skills/adb-screen-detection/scripts/adb-ocr-extract.py --json
Output:
{
"text": ["Login", "Username", "Password", "Submit"],
"detected": true,
"search_found": true,
"search_term": "Login",
"coordinates": {
"Login": [[100, 200, 150, 230]]
}
}
3. adb-find-element.py
Find UI element by template matching or OCR text search.
# Find by OCR text
uv run .claude/skills/adb-screen-detection/scripts/adb-find-element.py \
--method ocr \
--target "Login Button" \
--threshold 0.8
# Find by template image
uv run .claude/skills/adb-screen-detection/scripts/adb-find-element.py \
--method template \
--template /path/to/template.png \
--threshold 0.8
# JSON output
uv run .claude/skills/adb-screen-detection/scripts/adb-find-element.py \
--method ocr \
--target "Login" \
--json
Output:
{
"found": true,
"method": "ocr",
"target": "Login",
"coordinates": {
"x": 100,
"y": 200,
"width": 150,
"height": 30
},
"confidence": 0.95,
"message": "Element found at (100, 200)"
}
4. adb-tap-coordinate.py
Tap device screen at specific coordinates.
# Tap at coordinates
uv run .claude/skills/adb-screen-detection/scripts/adb-tap-coordinate.py \
--x 100 \
--y 200 \
--device 127.0.0.1:5555
# Tap with verification (check screen after tap)
uv run .claude/skills/adb-screen-detection/scripts/adb-tap-coordinate.py \
--x 100 \
--y 200 \
--verify-text "Next Screen" \
--timeout 5
# JSON output
uv run .claude/skills/adb-screen-detection/scripts/adb-tap-coordinate.py \
--x 100 \
--y 200 \
--json
Output:
{
"device": "127.0.0.1:5555",
"tap": {
"x": 100,
"y": 200
},
"success": true,
"verified": true,
"verify_text": "Next Screen",
"verification_match": true
}
Usage Patterns
Pattern 1: Verify Screen State Before Action
# 1. Capture current screen
adb-screen-capture.py
# 2. Check for expected element
adb-find-element.py --method ocr --target "Login Button"
# 3. If found, tap it
adb-tap-coordinate.py --x 100 --y 200 --verify-text "Welcome"
Pattern 2: OCR-Based Automation
# 1. Capture screen
adb-screen-capture.py
# 2. Extract all text
adb-ocr-extract.py --search "Settings"
# 3. Get coordinates and tap
adb-find-element.py --method ocr --target "Settings"
adb-tap-coordinate.py --x 150 --y 300
Pattern 3: Template-Based Element Detection
# 1. Have known UI template images in ./templates/
# 2. Capture screen
adb-screen-capture.py
# 3. Match against templates
adb-find-element.py --method template --template ./templates/button.png
# 4. Tap matched location
adb-tap-coordinate.py --x $(jq -r '.coordinates.x') --y $(jq -r '.coordinates.y')
Architecture
Design Principles:
- Independent: Each script can run standalone
- Chainable: Scripts output JSON for piping
- Stateless: No dependencies between executions
- Verifiable: Always verify screen state before proceeding
- Timeout Protected: All network operations have timeouts
Dependency Relationship:
adb-screen-capture.py (foundation)
↓
adb-ocr-extract.py (uses capture)
adb-find-element.py (uses capture or templates)
↓
adb-tap-coordinate.py (uses find-element for verification)
Integration Points
Used By:
adb-navigation-base- Wait for elements between actionsadb-magisk- Verify Magisk UI stateadb-karrot- Verify app state during automationadb-workflow-orchestrator- Screen verification in workflows
Dependencies:
- System:
adbcommand-line tool - Python: pytesseract, opencv-python, pillow, numpy
Troubleshooting
OCR Not Working
- Install Tesseract:
brew install tesseract(macOS) orapt-get install tesseract-ocr(Linux) - Set TESSDATA_PREFIX:
export TESSDATA_PREFIX=/usr/local/share/tessdata
Template Matching Too Strict/Loose
- Adjust
--thresholdparameter (0.0-1.0) - Higher threshold = stricter matching
- Recommended: 0.8-0.9 for reliable detection
Device Offline
- Check ADB connection:
adb devices - Reconnect:
adb connect <device> - Restart ADB:
adb kill-server && adb start-server
Workflows
This skill includes TOON-based workflow definitions for automation.
What is TOON?
TOON (Task-Oriented Orchestration Notation) is a structured workflow definition language that pairs with Markdown documentation. Each workflow consists of:
- [name].toon - Orchestration logic and execution steps
- [name].md - Complete documentation and usage guide
This TOON+MD pairing approach is inspired by the BMAD METHOD pattern, adapted to use TOON instead of YAML for better orchestration support.
Available Workflows
Workflow files are located in workflow/ directory:
Example Workflows (adb-screen-detection):
workflow/screen-verification.toon- Capture and verify screen stateworkflow/element-detection.toon- Find elements via OCR or template matchingworkflow/screen-monitoring.toon- Continuous screen monitoring and analysis
Running a Workflow
Execute any workflow using the ADB workflow orchestrator:
uv run .claude/skills/adb-workflow-orchestrator/scripts/adb-run-workflow.py \
--workflow .claude/skills/adb-screen-detection/workflow/screen-verification.toon \
--param device="127.0.0.1:5555"
Workflow Documentation
Each workflow includes comprehensive documentation in the corresponding .md file:
- Purpose and use case
- Prerequisites and requirements
- Available parameters
- Execution phases and steps
- Success criteria
- Error handling and recovery
- Example commands
See the workflow/ directory for complete TOON file definitions and documentation.
Creating New Workflows
To create custom workflows for this skill:
- Create a new
.toonfile in theworkflow/directory - Define phases, steps, and parameters using TOON v4.0 syntax
- Create corresponding
.mdfile with comprehensive documentation - Test with the workflow orchestrator
For more information, refer to the TOON specification and the workflow orchestrator documentation.
Version: 1.0.0 Status: ✅ Foundation Tier Scripts: 4 (all MCP-ready) Last Updated: 2025-12-01 Tier: 2 (Foundation)