Seer
Overview
Capture a precise screenshot of a visible app window, annotate it for quick UI mockups, then compare against baselines to keep visual state in the agent loop.
Quick start
- Ensure the target app is running and Screen Recording + Accessibility are enabled for your terminal. (Automation -> System Events required for typing.)
- Run the script:
  - bash scripts/capture_app_window.sh (defaults to frontmost app, output .seer/capture/app-window-<app>-YYYYMMDD-HHMMSS-<pid>-<rand>.png)
  - bash scripts/capture_app_window.sh /path/to/out.png "Promptlight" (custom output + process name)
- (Optional) Record video + extract frames:
  - bash scripts/record_app_window.sh --duration 3 --frames --fps 20
  - bash scripts/record_simulator.sh --duration 3 --summary --summary-sheet --summary-gif
  - bash scripts/record_screen.sh --duration 3
  - bash scripts/extract_frames.sh /path/to/video.mov --fps 20
  - bash scripts/record_app_window.sh --duration 3 --summary --summary-sheet --summary-gif
  - bash scripts/summarize_video.sh /path/to/video.mov --mode scene --sheet --gif
- (Optional) Create a mockup with annotations:
  - bash scripts/mockup_ui.sh --spec spec.json
  - bash scripts/mockup_ui.sh --spec spec.json --json
- (Optional) Store + compare in the visual loop:
  - bash scripts/loop_compare.sh /path/to/out.png web-home
  - First run creates a baseline under $SEER_LOOP_DIR (default .seer/loop).
  - Attach the current image (and diff image, if generated) with view_image.
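The first-run baseline behaviour in the last step can be sketched in Python. This is a minimal sketch, not the actual loop_compare.sh logic: the function names, the baselines/<name>.png file layout, and the byte-hash comparison are assumptions made for illustration (the real comparison is done by compare_images.py, which emits diff metrics and an optional diff image).

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_or_seed(current: Path, baseline_name: str, loop_dir: Path) -> str:
    """Seed a baseline on the first run; otherwise report match or differs.

    Assumption: baselines live at <loop_dir>/baselines/<baseline_name>.png.
    Byte equality is a stand-in for a real image diff.
    """
    baseline = loop_dir / "baselines" / f"{baseline_name}.png"
    if not baseline.exists():
        # First run: the current capture becomes the baseline.
        baseline.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(current, baseline)
        return "baseline-created"
    same = file_digest(current) == file_digest(baseline)
    return "match" if same else "differs"
```

Calling compare_or_seed(Path("/path/to/out.png"), "web-home", Path(".seer/loop")) twice on an unchanged capture would seed the baseline on the first call and report a match on the second.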
Usage
- bash scripts/capture_app_window.sh --help
- bash scripts/capture_app_window.sh [out_path] [process_name]
  - out_path default: .seer/capture/app-window-<app>-YYYYMMDD-HHMMSS-<pid>-<rand>.png
  - process_name default: frontmost app
  - set SEER_OUT_DIR to change default output root (falls back to SEER_TMP_DIR for legacy behavior)
- bash scripts/type_into_app.sh --help
- bash scripts/type_into_app.sh --app "Promptlight" --text "hello" --enter
- bash scripts/type_into_app.sh --app "Promptlight" --click-rel 120,180 --text "hello"
- bash scripts/type_into_app.sh --text "hello" --no-activate
- bash scripts/record_app_window.sh --help
- bash scripts/record_app_window.sh --duration 3 --frames --fps 20
- bash scripts/record_app_window.sh --duration 3 --summary --summary-mode scene --summary-max 24 --summary-sheet --summary-gif
- bash scripts/record_app_window.sh --simulator --duration 3 --summary --summary-sheet
- bash scripts/record_simulator.sh --duration 3 --summary --summary-sheet
- bash scripts/record_screen.sh --help
- bash scripts/record_screen.sh --duration 3 --display 1
- bash scripts/record_screen.sh --duration 3 --region 100,100,800,600
- bash scripts/extract_frames.sh --help
- bash scripts/extract_frames.sh /path/to/video.mov --fps 20
- bash scripts/summarize_video.sh --help
- bash scripts/summarize_video.sh /path/to/video.mov --mode scene --sheet --gif
- bash scripts/mockup_ui.sh --help
- bash scripts/mockup_ui.sh --spec spec.json
- bash scripts/mockup_ui.sh --spec spec.json --json
- python3 scripts/excalidraw_from_text.py --help
- python3 scripts/excalidraw_from_text.py --text "header: Settings; list: Account, Notifications, Privacy; button: Log out"
- python3 scripts/excalidraw_from_text.py --text $'screen: Home\nheader: Home\nbutton: Get started\n\nscreen: Settings\nheader: Settings\nlist: Account, Notifications\nbutton: Log out' --theme classic --fidelity medium
- cat prompt.txt | python3 scripts/excalidraw_from_text.py --name settings
- python3 scripts/excalidraw_from_text.py --text "lib: Search field | Search" --json (explicit Excalidraw library item)
- python3 scripts/annotate_image.py input.png output.png --spec spec.json
- python3 scripts/annotate_image.py --spec-help (prints JSON spec schema)
- annotate_image.py supports top-level defaults (e.g., auto_scale, outline, text_bg), spotlight annotations to dim the background, fit (enabled by default) to auto-adjust rect/spotlight bounds, and anchor/from/to for auto-anchoring labels and arrows.
- bash scripts/loop_compare.sh --help
- bash scripts/loop_compare.sh [--loop-dir <path>] [--resize] [--update-baseline] <current_path> <baseline_name>
  - set SEER_LOOP_DIR to change default loop directory (default .seer/loop)
  - consider adding .seer/ to .gitignore
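How the default capture path and the SEER_OUT_DIR / SEER_TMP_DIR fallback fit together can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the format of the <rand> token is not specified in this document, and keeping the capture/ subfolder under a custom root is an assumption rather than confirmed script behavior.

```python
import os
from datetime import datetime
from pathlib import Path


def resolve_out_root(env=None) -> Path:
    """Pick the output root: SEER_OUT_DIR, then SEER_TMP_DIR, then .seer."""
    env = os.environ if env is None else env
    return Path(env.get("SEER_OUT_DIR") or env.get("SEER_TMP_DIR") or ".seer")


def default_capture_path(app: str, now: datetime, pid: int, rand: str, env=None) -> Path:
    """Build <root>/capture/app-window-<app>-YYYYMMDD-HHMMSS-<pid>-<rand>.png.

    Assumption: the capture/ subfolder is also used under a custom root.
    """
    stamp = now.strftime("%Y%m%d-%H%M%S")
    name = f"app-window-{app}-{stamp}-{pid}-{rand}.png"
    return resolve_out_root(env) / "capture" / name
```

Passing env explicitly keeps the sketch easy to test; leaving it as None reads the real process environment.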
Workflow
- Capture
  - scripts/capture_app_window.sh
  - If it fails, rerun with explicit process name or verify permissions.
- Record (optional)
  - scripts/record_app_window.sh --duration 3 --frames --fps 20
  - scripts/record_simulator.sh --duration 3 --summary --summary-sheet
  - scripts/record_screen.sh --duration 3
  - Use frames for granular UI change analysis.
  - record_app_window.sh auto-activates when --process/--simulator is used and warns if frontmost app differs.
- Compare (optional)
  - scripts/loop_compare.sh <current_path> <baseline_name>
  - Stores baselines/, latest/, history/, diffs/, reports/ under .seer/loop (or $SEER_LOOP_DIR)
- Inspect
  - Use view_image to load the current image and diff image.
- Iterate
  - Repeat after UI changes or window repositioning.
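When the capture and compare steps are driven from an agent or test harness, the command lines can be assembled programmatically. The helper names below are hypothetical; the argument order follows the usage strings documented for capture_app_window.sh and loop_compare.sh, and the sketch only builds argument lists without running anything.

```python
def capture_argv(out_path=None, process_name=None):
    """argv for: bash scripts/capture_app_window.sh [out_path] [process_name]."""
    if process_name is not None and out_path is None:
        # Both arguments are positional, so a process name needs an out_path.
        raise ValueError("process_name requires out_path")
    argv = ["bash", "scripts/capture_app_window.sh"]
    argv += [arg for arg in (out_path, process_name) if arg is not None]
    return argv


def loop_compare_argv(current_path, baseline_name, loop_dir=None,
                      resize=False, update_baseline=False):
    """argv for loop_compare.sh with its documented optional flags."""
    argv = ["bash", "scripts/loop_compare.sh"]
    if loop_dir is not None:
        argv += ["--loop-dir", loop_dir]
    if resize:
        argv.append("--resize")
    if update_baseline:
        argv.append("--update-baseline")
    return argv + [current_path, baseline_name]
```

Each list can be handed to subprocess.run from the directory that contains scripts/. Because the capture arguments are positional, the sketch rejects a process name given without an output path.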
Video summary flags (record_app_window.sh --summary)
- --summary-mode <scene|fps|keyframes>: selection strategy (default: scene)
- --summary-scene <threshold>: scene-change sensitivity (default: 0.30)
- --summary-fps <n>: sampling rate for fps mode (default: 2)
- --summary-max <n>: cap frame count (default: 24, 0 disables cap)
- --summary-out <dir>: output folder
- --summary-sheet: create sheet.png contact sheet
- --summary-sheet-cols <n>: contact sheet columns (default: auto)
- --summary-gif: create preview.gif
- --summary-gif-width <px>: GIF max width (default: 640)
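The interaction between the scene threshold and the frame cap can be illustrated with a small selection routine. This is a sketch under stated assumptions, not the scripts' implementation: it assumes each frame has a scene-change score between 0 and 1, that frames scoring above the threshold are kept, and that the cap is applied by even subsampling. The function name is hypothetical.

```python
def pick_summary_frames(scene_scores, threshold=0.30, max_frames=24):
    """Return indices of frames whose scene score exceeds the threshold.

    max_frames caps the result by even subsampling; 0 disables the cap.
    Both the scoring model and the capping strategy are assumptions.
    """
    picked = [i for i, score in enumerate(scene_scores) if score > threshold]
    if max_frames > 0 and len(picked) > max_frames:
        step = len(picked) / max_frames
        picked = [picked[int(k * step)] for k in range(max_frames)]
    return picked
```

With scores [0.0, 0.5, 0.1, 0.9, 0.4, 0.2] and the default threshold, frames 1, 3, and 4 are selected; a cap of 2 keeps frames 1 and 3, and a cap of 0 leaves the selection untouched.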
Resources
scripts/
- capture_app_window.sh: grabs window bounds via System Events and runs screencapture -x -R.
- record_app_window.sh: records a window region to .mov via screencapture -v (optionally extracts frames).
- record_simulator.sh: convenience wrapper for iOS Simulator recording.
- record_screen.sh: records full screen or a region to .mov via screencapture -v.
- extract_frames.sh: extracts frames from a video via ffmpeg.
- summarize_video.sh: extracts representative frames (scene/fps/keyframes) + optional contact sheet or GIF (falls back to fps when scene yields too few frames).
- type_into_app.sh: focuses app and types text via System Events keystrokes.
- excalidraw_from_text.py: converts a natural-language-ish prompt into a .excalidraw scene file under .seer/excalidraw/ (supports screen: Name for multi-screen; uses the bundled Excalidraw library when present).
- annotate_image.py: draws arrows, rectangles, and text on an image (requires python3 -m pip install pillow).
- mockup_ui.sh: capture window (optional) then annotate using a JSON spec.
- compare_images.py: compares baseline vs current and emits diff metrics + optional diff image (requires python3 -m pip install pillow).
- loop_compare.sh: manages baselines, history, and diff outputs for visual regression loops.
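As an illustration of the capture step, the sketch below turns window bounds into a screencapture invocation. It assumes the bounds (x, y, width, height in screen points) have already been obtained, for example from System Events, and it is not the script's actual code; the function name is hypothetical. The -x flag silences the capture sound and -R restricts the capture to a rectangle.

```python
def screencapture_argv(x, y, width, height, out_path):
    """Build argv for a silent capture of one screen rectangle."""
    if width <= 0 or height <= 0:
        raise ValueError("window bounds must have a positive size")
    # screencapture takes the rectangle as a comma-separated x,y,w,h value.
    rect = f"{int(x)},{int(y)},{int(width)},{int(height)}"
    return ["screencapture", "-x", f"-R{rect}", out_path]
```

The resulting list can be passed to subprocess.run on macOS; the terminal still needs Screen Recording permission for the capture to succeed.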
assets/
- assets/excalidraw/wireframe-ui-kit.excalidrawlib: default Excalidraw UI library used by excalidraw_from_text.py when present (override with --library or disable with --no-library).
- assets/excalidraw/basic-ux-wireframing-elements.excalidrawlib: fallback library (smaller) if the UI kit is missing.
Output layout (default)
Under .seer/:
- capture/: window screenshots
- record/: window recordings + extracted frame folders
- record/: video summaries + contact sheets/GIFs
- mockup/: annotated mockups + their capture/spec/meta (also writes latest-* convenience copies)
- excalidraw/: generated .excalidraw scenes (also writes latest-*.excalidraw)
- loop/: visual regression loop storage (baselines/latest/history/diffs/reports)