Create Media With Vofy CLI
Deliver generated media with a deterministic, non-interactive workflow.
Workflow
- Run vofy status; if auth fails, tell the user to run vofy login.
- Identify output type, source assets, aspect ratio, duration, resolution, and whether local files are required.
- Load imagine-prompt when the user gives a rough idea, asks for prompt improvement, or model-specific wording matters.
- Choose the simplest matching mode from the tables below.
- Pick a default model unless the user named one; load imagine-models only for strict limits, price, or special flags.
- Build one non-interactive command with --yes and, when local output is useful, --download-to ./output.
- Return file paths or resource URLs; for async jobs, return the task id and next check command.
If vofy is missing, stop and ask the user to install vofy-cli@0.1.7 and authenticate once.
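The first workflow step and the missing-CLI rule can be sketched as a small preflight function. This is a minimal sketch, not part of the CLI: the function name vofy_preflight is hypothetical, and it assumes that vofy status exits non-zero when authentication fails.

```shell
# Hypothetical preflight helper (not shipped with vofy).
# Assumption: `vofy status` exits non-zero when authentication fails.
vofy_preflight() {
  # Stop early if the CLI is not on PATH.
  if ! command -v vofy >/dev/null 2>&1; then
    echo "vofy not found: install vofy-cli@0.1.7 and authenticate once" >&2
    return 127
  fi
  # Check the current session before building any create command.
  if ! vofy status >/dev/null 2>&1; then
    echo "auth check failed: run 'vofy login'" >&2
    return 1
  fi
  return 0
}
```

Calling vofy_preflight before any create command keeps the rest of the workflow non-interactive: the only outcomes are success, an install request, or a login request.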
Default Model Shortcuts
| Need | Default |
| --- | --- |
| General image | seedream-4.5 |
| Image editing or transparent assets | gpt-image-1.5 |
| Premium text-to-video | veo-3.1 |
| Fast video draft | veo-3.1-fast |
| Animate an image | kling-3.0 |
| Transform or extend video | seedance-2.0 |
Check vofy models <model> before adding optional flags such as --audio, --background, --web-search, --multi-shot, or motion-control settings.
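The shortcut table can be expressed as a lookup, which is convenient when a script has to pick a model without user input. The sketch below is illustrative only: the function name default_model and the need labels (image, image-edit, and so on) are hypothetical and not part of the vofy CLI; the model names come from the table above.

```shell
# Hypothetical lookup: maps an illustrative need label to the default model
# from the table above. The labels are assumptions, not CLI values.
default_model() {
  case "$1" in
    image)           echo "seedream-4.5" ;;
    image-edit)      echo "gpt-image-1.5" ;;
    video-premium)   echo "veo-3.1" ;;
    video-draft)     echo "veo-3.1-fast" ;;
    image-to-video)  echo "kling-3.0" ;;
    video-transform) echo "seedance-2.0" ;;
    *)
      echo "unknown need: $1" >&2
      return 1
      ;;
  esac
}
```

A model named by the user should still take precedence over whatever this lookup returns.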
Image Modes
| Intent | Mode | Required flags |
| --- | --- | --- |
| Text prompt | text_to_image | --prompt |
| Transform image | image_to_image | --prompt --image <path> |
| Edit masked area | inpainting | --prompt --image <path> --mask <path> |
Base command:
vofy image create --model <model> --prompt "<prompt>" --aspect-ratio <ratio> --resolution <resolution> --yes --download-to ./output
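As a rough illustration of how the required flags in the table combine with the base command, the sketch below assembles an argument list per image mode and prints it for review. The function name build_image_cmd is hypothetical. The optional --aspect-ratio and --resolution flags are left out on purpose; append them only with values the chosen model supports.

```shell
# Hypothetical builder: prints a preview of the `vofy image create` command
# for one image mode. The preview joins arguments with spaces, so it does not
# preserve shell quoting; to execute, replace the final printf with "$@".
build_image_cmd() {
  local mode="$1" model="$2" prompt="$3" image="${4:-}" mask="${5:-}"
  set -- vofy image create --model "$model" --prompt "$prompt"
  case "$mode" in
    text_to_image)
      ;;
    image_to_image)
      [ -n "$image" ] || { echo "image_to_image needs an image path" >&2; return 1; }
      set -- "$@" --image "$image"
      ;;
    inpainting)
      [ -n "$image" ] && [ -n "$mask" ] || { echo "inpainting needs image and mask paths" >&2; return 1; }
      set -- "$@" --image "$image" --mask "$mask"
      ;;
    *)
      echo "unsupported image mode: $mode" >&2
      return 1
      ;;
  esac
  # Always non-interactive, with local output.
  set -- "$@" --yes --download-to ./output
  printf '%s\n' "$*"
}
```

For example, build_image_cmd inpainting gpt-image-1.5 "remove the logo" ./in.png ./mask.png previews a command that carries both --image and --mask, matching the inpainting row.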
Video Modes
| Intent | Mode | Required flags |
| --- | --- | --- |
| Text prompt | text_to_video | --prompt |
| Animate image | image_to_video | --prompt --first-frame <path> |
| Morph images | interpolation | --first-frame <path> --last-frame <path> |
| Image references | reference_images | --prompt --reference-image <path> |
| Mixed references | multimodal_reference | --mode multimodal_reference --reference-image <path> plus optional --reference-video / --reference-audio |
| Transform video | video_to_video | --mode video_to_video --prompt --video <path> |
| Extend video | video_extension | --mode video_extension --prompt --video <path> |
| Control motion | motion_control | Model-specific trajectory flags |
Base command:
vofy video create --model <model> --prompt "<prompt>" --duration <seconds> --aspect-ratio <ratio> --yes --download-to ./output
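The video table can be sketched the same way. The builder below covers five of the modes and is only an illustration: build_video_cmd is a hypothetical name, the reference and motion-control modes are omitted because their flags are model-specific, and --duration and --aspect-ratio are left for the caller to append once the model's supported values are known.

```shell
# Hypothetical builder: prints a preview of the `vofy video create` command.
# Arguments: mode, model, prompt, source path (frame or video), last-frame path.
# The preview does not preserve shell quoting; to execute, use "$@" instead.
build_video_cmd() {
  local mode="$1" model="$2" prompt="$3" src="${4:-}" last="${5:-}"
  set -- vofy video create --model "$model"
  case "$mode" in
    text_to_video)
      set -- "$@" --prompt "$prompt"
      ;;
    image_to_video)
      set -- "$@" --prompt "$prompt" --first-frame "$src"
      ;;
    interpolation)
      set -- "$@" --first-frame "$src" --last-frame "$last"
      ;;
    video_to_video|video_extension)
      # These two modes share --video, so the mode is always passed explicitly.
      set -- "$@" --mode "$mode" --prompt "$prompt" --video "$src"
      ;;
    *)
      echo "unsupported video mode: $mode" >&2
      return 1
      ;;
  esac
  set -- "$@" --yes --download-to ./output
  printf '%s\n' "$*"
}
```

Passing the mode explicitly for the two --video modes is the design choice worth keeping, since the flag alone does not say whether the clip should be transformed or extended.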
Result Handling
- Sync create commands wait for completion and print output by default.
- --download-to ./output saves files locally and creates the directory if needed.
- --result-url prints generated resource URLs explicitly after completion.
- --async returns early; use vofy tasks --plain --type video and vofy task <id_or_prefix> --download-to ./output later.
- If the command fails because a value is unsupported, run vofy models <model> and retry with one of the listed ratios, resolutions, durations, or modes.
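For async jobs, the workflow asks for the task id and the next check command. A minimal sketch of that hand-off follows; next_check_command is a hypothetical helper, and it assumes the task id has already been read from the output of the --async command, whose exact format is not specified here.

```shell
# Typical async flow, using only the commands listed above (not executed here):
#   vofy video create ... --yes --async
#   vofy tasks --plain --type video
#   vofy task <id_or_prefix> --download-to ./output
#
# Hypothetical helper: prints the follow-up command for a known task id.
next_check_command() {
  local task_id="${1:-}"
  if [ -z "$task_id" ]; then
    echo "missing task id" >&2
    return 1
  fi
  printf 'vofy task %s --download-to ./output\n' "$task_id"
}
```

Returning the printed line together with the task id gives the user a single command to run once the job has finished.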
Common Validation Traps
- --video is ambiguous; always add --mode video_to_video or --mode video_extension.
- Mixing --reference-image with --reference-video or --reference-audio requires --mode multimodal_reference.
- kling-2.6 needs resolution=1080p for --audio and for last-frame interpolation.
- kling-3.0 --multi-shot requires --shot-type; customize uses --multi-prompt, while intelligence uses --prompt.
- Source-driven modes may ignore --aspect-ratio or --resolution; trust derived values from input media.
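The first two traps are mechanical enough to check before a command is sent. The sketch below is a hypothetical pre-check, not a CLI feature: check_video_args scans an argument list and rejects --video without an explicit video mode, and mixed references without --mode multimodal_reference. The model-specific kling rules are not covered.

```shell
# Hypothetical pre-check for `vofy video create` arguments.
# Returns 0 when neither of the two --mode traps is present, 1 otherwise.
check_video_args() {
  local has_video=0 has_ref_image=0 has_ref_other=0 mode="" prev="" arg
  for arg in "$@"; do
    case "$arg" in
      --video)                             has_video=1 ;;
      --reference-image)                   has_ref_image=1 ;;
      --reference-video|--reference-audio) has_ref_other=1 ;;
    esac
    # Remember the value that follows --mode.
    if [ "$prev" = "--mode" ]; then
      mode="$arg"
    fi
    prev="$arg"
  done
  if [ "$has_video" -eq 1 ] && [ "$mode" != "video_to_video" ] && [ "$mode" != "video_extension" ]; then
    echo "--video needs --mode video_to_video or --mode video_extension" >&2
    return 1
  fi
  if [ "$has_ref_image" -eq 1 ] && [ "$has_ref_other" -eq 1 ] && [ "$mode" != "multimodal_reference" ]; then
    echo "mixed references need --mode multimodal_reference" >&2
    return 1
  fi
  return 0
}
```

For instance, check_video_args --prompt "slow zoom" --video ./clip.mp4 fails because no mode is given, while adding --mode video_to_video makes it pass.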
See examples.md for broader scenarios and commands-reference.md for full CLI help.