YouTube Publish (Scripted Flow)
Use scripts in order. Stop for validation after copy + thumbnail generation. If the user did not already specify them, ask up front for:
- the exact publish day/time
- whether they want English dubbing for YouTube
- whether they also want the English X variant
Rule: the English X variant depends on the English YouTube dubbing pack. It is valid to do English dubbing for YouTube without doing the English X variant, but not the other way around.
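The dependency rule above can be checked before the workflow starts. This is a minimal sketch: the `PublishDecisions` container and `validate_decisions` helper are hypothetical names, not part of the existing scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishDecisions:
    """Up-front answers collected from the user (hypothetical container)."""

    publish_at: str          # "YYYY-MM-DD HH:MM" or "private"
    english_dubbing: bool    # English dubbing pack for YouTube
    english_x_variant: bool  # English native X video


def validate_decisions(decisions: PublishDecisions) -> list[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    problems = []
    # The English X variant is built from the English-dubbed video,
    # so it cannot be requested without the dubbing pack.
    if decisions.english_x_variant and not decisions.english_dubbing:
        problems.append("English X variant requires the English YouTube dubbing pack")
    return problems
```

Dubbing without the X variant returns no problems; the X variant without dubbing returns one.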
Behavior rules for the agent
- Tone & Authority: Titles and copy must focus on engineering, architecture, and solving developer friction. Avoid reaction-style hype and mass-content phrasing.
- Title Blacklist (strict): Forbidden in titles: `RIP`, `Increíble`, `Brutal`, `Locura`, `Definitivo`, `¿El fin de...?`, and crown/fire emojis.
- Technical Anchor (strict): Every title must include at least one engineering keyword: `Orquestación`, `Despliegue`, `Infraestructura`, `Clean Architecture`, `Refactorización`, `Pipeline`, `Capa de Abstracción`.
- Title Derivation: Do not ask for a title hint; derive it from the video stem and the technical density of the SRT.
- Scheduling: If the user provides a publish time, resolve it to an exact `YYYY-MM-DD HH:MM` using system time and pass `--publish-at` + `--timezone`. Always determine and pass `--timezone`.
- Missing decisions: If the publish date/time, English YouTube dubbing, or the English X variant were not explicitly provided, ask for them before starting the workflow.
- Content Generation Engine: Titles, thumbnail ideas, description, chapters, and LinkedIn copy must be written by the same model executing this skill.
- English Variant: When the user wants a multilingual YouTube version, always generate an English pack for the same video: translated transcript, English title, English description, and dubbed English audio.
- English Scope: The English pack is for YouTube multi-language audio/localization only. Do not create English social posts unless the user explicitly asks for them.
- English X Dependency: If the user wants the English X variant, that implies the English YouTube dubbing pack must also be produced first.
- Thumbnail Generation: Generate 3 thumbnails using the presenter photo set. Default presenter is `antonio` (`assets/antonio-1.png`, `antonio-2.png`, `antonio-3.png`). If the user indicates the video is from Nino, switch presenter to `nino` (`assets/nino-1.png`, `nino-2.png`, `nino-3.png`). Keep only two non-negotiables: (1) massive bold white text (max 3-4 words), (2) cinematic dark look with cyan/magenta accents. Everything else should adapt to the video's narrative with maximum creative freedom.
- Reference Photos (strict): For each generated thumbnail, always pass the 3 presenter images together as references. They are identity anchors (not fixed poses); the model is free to choose the best posture/composition.
- Thumbnail Engine: Reuse `nano-banana-pro/scripts/generate_image.py` as the single image generation engine. Keep default model behavior (Flash). Only override the model explicitly when requested.
- Thumbnail Creativity Rule: Deliver 1 safer option + 2 exploratory options. Avoid producing near-duplicates. If an unconventional concept communicates better for that specific video, prioritize it.
- English Thumbnail Rule: If the English YouTube dubbing pack is requested, once the user chooses the final thumbnail you must create an English-edited version of that same thumbnail by editing the selected image so its main headline text is in English instead of Spanish. Keep the composition, styling, and identity intact; only adapt the main headline text.
- Workflow: Upload a private draft before generating copy so the video URL can be used in social text.
- Newsletter: Disabled in this flow. Do not generate or schedule newsletter here.
- X Strategy: Do not schedule/publish to X via PostFlow in this flow. X is handled as native video upload outside this step.
- X Variants: The default X asset is the Spanish native video with the selected Spanish thumbnail embedded as the first 500ms. If the user requested the English X variant too, also build an English native X video using the English-dubbed video plus the English-edited thumbnail.
- Links: In social posts, the comment must not be just the link; it must include a brief descriptive text inviting viewers to watch (e.g., "Watch the full technical analysis here: https://...").
- Comment Sequence: For final publish/update, always use this order: set video to `unlisted`, insert promo comment (`Domina la IA...`), then set final status (`private` with `publishAt` if scheduled, otherwise `private`).
- Schedule Decision Required: Never publish without an explicit decision in `Programación (final)`: either a date `YYYY-MM-DD HH:MM` or `private`.
- Timing: Schedule social posts 15 minutes after the YouTube publish time.
- Audio Consistency: `prepare_video.py` runs audio normalization in `auto` mode by default. It analyzes LUFS/true peak/LRA and only re-encodes audio when out of target.
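The Scheduling and Timing rules can be combined into one small calculation: take the resolved `YYYY-MM-DD HH:MM` publish time, attach the timezone, and add 15 minutes to get the social schedule. A minimal sketch, where `social_schedule` is a hypothetical helper and the output format is assumed to match the ISO 8601 value with offset that the socials step takes:

```python
from datetime import datetime, timedelta, tzinfo

# Social posts go out 15 minutes after the YouTube publish time.
SOCIAL_DELAY = timedelta(minutes=15)


def social_schedule(publish_at: str, tz: tzinfo) -> str:
    """Return the ISO 8601 timestamp (with UTC offset) for the social posts.

    publish_at is the local wall-clock time in "YYYY-MM-DD HH:MM" form;
    tz is the timezone that the publish time is expressed in.
    """
    local = datetime.strptime(publish_at, "%Y-%m-%d %H:%M").replace(tzinfo=tz)
    return (local + SOCIAL_DELAY).isoformat()
```

For an IANA name such as `Europe/Madrid`, pass `zoneinfo.ZoneInfo("Europe/Madrid")` as `tz`.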
Content Styles
LinkedIn Post Style
- Length/Format: 600–900 characters, 3–6 short paragraphs, 1–2 emojis.
- Strategy (Signal vs Noise): Start with a principle or real engineering problem, not with "new video" framing.
- Identity: Keep the tone of technical authority. Fewer creator-marketing phrases, more architecture conclusions and tradeoffs.
- Scope: 1 central idea focused on technical authority. No digressions.
- Closing: Final line “Link en el primer comentario.” followed by a short question or CTA.
- Restrictions: No hashtags.
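The mechanical parts of these rules (length, paragraph count, hashtags, closing line) can be checked automatically before asking the user for validation. A minimal sketch, assuming paragraphs are separated by blank lines; `check_linkedin_post` is a hypothetical helper, and it does not attempt to count emojis or judge tone.

```python
import re

CLOSING_LINE = "Link en el primer comentario."


def check_linkedin_post(text: str) -> list[str]:
    """Return the style problems found in a LinkedIn post draft."""
    problems = []
    if not 600 <= len(text) <= 900:
        problems.append(f"length {len(text)} is outside 600-900 characters")

    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    if not 3 <= len(paragraphs) <= 6:
        problems.append(f"{len(paragraphs)} paragraphs, expected 3-6")

    # A hashtag is '#' followed by a word character and not glued to a previous word.
    if re.search(r"(?<!\w)#\w", text):
        problems.append("hashtags are not allowed")

    # The closing line must open the last paragraph's CTA.
    if not paragraphs or CLOSING_LINE not in paragraphs[-1]:
        problems.append("missing closing line in the last paragraph")
    return problems
```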
Scripted flow (order)
- Prepare video
  - Command: `python scripts/prepare_video.py --videos /path/v1.mp4 [/path/v2.mp4 ...]`
  - Audio behavior (default): `--audio-normalization auto` targets `-14 LUFS`, `-1 dBTP`, and max `LRA 9`; it skips normalization when the audio is already in range.
  - Optional:
    - `--audio-normalization always` to force normalization.
    - `--audio-normalization off` to skip analysis and normalization.
  - Output JSON with `workdir`, `video`, `slug`.
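The three normalization modes reduce to one decision: re-encode or not. The sketch below illustrates that decision using the targets listed above. It is not the code of `prepare_video.py`; `needs_normalization` is a hypothetical helper, and the ±1 LU tolerance around the loudness target is an assumption, since the script's actual in-range window is not stated here.

```python
TARGET_LUFS = -14.0         # integrated loudness target
MAX_TRUE_PEAK_DBTP = -1.0   # true-peak ceiling
MAX_LRA = 9.0               # maximum loudness range
LUFS_TOLERANCE = 1.0        # assumption: the real tolerance is not documented here


def needs_normalization(lufs: float, true_peak: float, lra: float, mode: str = "auto") -> bool:
    """Decide whether the audio should be re-encoded for the given mode."""
    if mode == "off":
        return False   # skip analysis and normalization entirely
    if mode == "always":
        return True    # force normalization regardless of measurements
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode}")
    in_range = (
        abs(lufs - TARGET_LUFS) <= LUFS_TOLERANCE
        and true_peak <= MAX_TRUE_PEAK_DBTP
        and lra <= MAX_LRA
    )
    return not in_range
```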
- Upload draft (private)
  - Command: `python scripts/upload_draft.py --video <video> --output-video-id <workdir>/video_id.txt --client-secret <path>`
  - Write `video_id.txt` and create `video_url.txt`.
- Transcribe + clean
  - Command: `python scripts/transcribe_parakeet.py --video <video> --out-dir <workdir>`
  - Outputs:
    - `transcript.es.cleaned.srt`
    - `transcript.es.dub.srt` (same transcript resegmented into more natural dubbing units)
  - After transcription, copy the SRT to the vault transcripts folder: `cp <workdir>/transcript.es.cleaned.srt ~/Documents/aipal/transcripts/<YYYY-MM-DD>-<slug>.srt`
  - Use today's date and the video slug from step 1 as the filename.
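The destination filename for the vault copy follows directly from the date and slug. A minimal sketch of that naming rule; `vault_transcript_path` is a hypothetical helper, and the optional `today` parameter exists only to make the result reproducible.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def vault_transcript_path(slug: str, today: Optional[date] = None) -> Path:
    """Build ~/Documents/aipal/transcripts/<YYYY-MM-DD>-<slug>.srt."""
    day = today or date.today()
    vault = Path.home() / "Documents" / "aipal" / "transcripts"
    # date.isoformat() yields YYYY-MM-DD, matching the required prefix.
    return vault / f"{day.isoformat()}-{slug}.srt"
```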
- Prepare English dubbing assets (when multilingual output is requested)
  - Read `<workdir>/transcript.es.dub.srt` when present (fallback: `transcript.es.cleaned.srt`) and create:
    - `<workdir>/transcript.en.srt` translated to natural English while preserving timestamps.
    - `<workdir>/title.en.txt` with 1 English YouTube title for the dubbed track.
    - `<workdir>/description.en.txt` with 1 English YouTube description for the dubbed track.
  - Generate dubbed English audio using the `youtube-dubber` project.
  - Default dubbing path:
    - `scripts/dub_voxtral.py`
    - model `voxtral-mini-tts-latest`
    - English reference clip from the presenter's own voice when available
  - Fallback path:
    - Chatterbox / Qwen only if Voxtral is unavailable or clearly worse for a specific run
  - Save at least:
    - `<workdir>/dubbed_audio.en.wav`
    - `<workdir>/dubbed_video.en.mp4` if the dubbing pipeline also muxes the video
    - `<workdir>/title.en.txt`
    - `<workdir>/description.en.txt`
  - Keep the English title/description technically faithful to the Spanish source, not marketing-localized beyond what is needed for natural English.
  - The goal is to run Voxtral through the same timed dubbing pipeline as the other models, not a manual narration-only shortcut.
- Generate copy with the calling model
  - Read `<workdir>/transcript.es.cleaned.srt` directly and generate:
    - 3 Technical Authority Titles.
    - 3 Thumbnail ideas (Artifact-based).
    - Description (remove any self-link to current video).
    - Chapters (MM:SS).
    - LinkedIn post (per rules).
  - Save the result into `<workdir>/content.md`.
  - Also save thumbnail concepts into `<workdir>/ideas.json` with this shape:

        {
          "titles": ["...", "...", "..."],
          "thumbnails": [
            {"text": "...", "artifact": "...", "concept": "..."},
            {"text": "...", "artifact": "...", "concept": "..."},
            {"text": "...", "artifact": "...", "concept": "..."}
          ]
        }

  - Make sure `content.md` contains at least these sections so downstream validation stays compatible:

        # Pack YouTube — <slug>

        ## Enlace del vídeo
        <video_url>

        ## Títulos
        - ...
        - ...
        - ...

        ## Ideas de thumbnails
        1. Texto: ... Artifact: ... Concept: ...

        ## Descripción
        ...

        ## Capítulos
        00:00 ...

        ## LinkedIn
        ...

        ## Título (final)

        ## Descripción (final)

        ## Capítulos (final)

        ## Post LinkedIn (final)

        ## Thumbnail (final)

        ## Programación (final)
        (YYYY-MM-DD HH:MM o "private")

        ## Title (EN)

        ## Description (EN)

  - Title quality gate: reject title candidates that break blacklist or technical-anchor rules.
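The title quality gate can be applied to each candidate before it is written to `content.md`. A minimal sketch using the blacklist and technical-anchor keywords from the behavior rules; `title_problems` is a hypothetical helper, and the case-insensitive, whole-word matching is an assumption about how strictly the rules should be read.

```python
import re

BLACKLIST = ["RIP", "Increíble", "Brutal", "Locura", "Definitivo", "¿El fin de"]
FORBIDDEN_EMOJIS = ["\U0001F451", "\U0001F525"]  # crown, fire
ANCHORS = [
    "Orquestación", "Despliegue", "Infraestructura", "Clean Architecture",
    "Refactorización", "Pipeline", "Capa de Abstracción",
]


def title_problems(title: str) -> list[str]:
    """Return the reasons a title candidate must be rejected (empty if it passes)."""
    lowered = title.casefold()
    problems = []
    for term in BLACKLIST:
        # Whole-word match so "RIP" does not fire inside "scripts" or "descripción".
        if re.search(rf"(?<!\w){re.escape(term.casefold())}(?!\w)", lowered):
            problems.append(f"blacklisted term: {term}")
    if any(emoji in title for emoji in FORBIDDEN_EMOJIS):
        problems.append("crown/fire emoji")
    if not any(anchor.casefold() in lowered for anchor in ANCHORS):
        problems.append("missing technical anchor keyword")
    return problems
```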
- Generate 3 thumbnails
  - Use presenter photos according to context: by default `antonio`, and `nino` only when explicitly requested for a Nino video. Create 3 images into `<workdir>/thumb-1.png`, `thumb-2.png`, `thumb-3.png`.
  - The same model executing the skill must derive the 3 image prompts from the transcript and `ideas.json`, then save each prompt into `<workdir>/thumb-1.prompt.txt`, `thumb-2.prompt.txt`, `thumb-3.prompt.txt`.
  - Keep the two anchors fixed (massive white text + cinematic cyan/magenta look), but allow concept/composition/artifact/background to vary freely by story.
  - Target mix: 1 safe option + 2 exploratory options.
  - Example with multiple inputs: `uv run /path/to/nano-banana-pro/scripts/generate_image.py --prompt "Antonio working..." --filename "thumb-1.png" --input-image assets/antonio-1.png assets/antonio-2.png assets/antonio-3.png`
  - If using helper scripts:
    - Batch render from an existing `ideas.json`: `python scripts/generate_missing_thumbs.py --presenter antonio --out-dir <videos-root>`
    - Nino video: `python scripts/generate_missing_thumbs.py --presenter nino --out-dir <videos-root>`
    - Optional override: `--image-model <model>` only if explicitly needed; otherwise keep the default model.
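The reference-photo rule (always pass all 3 presenter images) is easy to get wrong when commands are typed by hand. The sketch below assembles the argument list for one thumbnail using only the flags shown in the example above; `thumbnail_command` is a hypothetical helper, and the `/path/to/` prefix is left as the placeholder it is in the source.

```python
PRESENTERS = ("antonio", "nino")


def thumbnail_command(index: int, prompt: str, presenter: str = "antonio") -> list[str]:
    """Build the generate_image.py invocation for thumb-<index>.png."""
    if presenter not in PRESENTERS:
        raise ValueError(f"unknown presenter: {presenter}")
    # All three photos are always passed together as identity anchors.
    references = [f"assets/{presenter}-{n}.png" for n in (1, 2, 3)]
    return [
        "uv", "run", "/path/to/nano-banana-pro/scripts/generate_image.py",
        "--prompt", prompt,
        "--filename", f"thumb-{index}.png",
        "--input-image", *references,
    ]
```

The returned list can be passed to `subprocess.run` once the script path is filled in.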
- Stop to ask for validation of:
  - Title (choose one of the 3 generated).
  - Thumbnail (choose one of the 3 generated).
  - Description (edit if needed).
  - Chapters (edit if needed).
  - LinkedIn post (edit if needed).
  - English title (edit if needed).
  - English description (edit if needed).
  - After the user confirms the final thumbnail and the English YouTube dubbing pack is enabled, create the English-edited thumbnail before any final X-video build.
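Once validation is complete, the Schedule Decision Required rule can be enforced by reading the `Programación (final)` section of `content.md`. A minimal sketch, assuming the decision is written on the first non-empty line under that heading; `schedule_decision` is a hypothetical helper.

```python
from datetime import datetime
from typing import Optional

SECTION = "## Programación (final)"


def schedule_decision(content_md: str) -> Optional[str]:
    """Return the explicit decision ("YYYY-MM-DD HH:MM" or "private"), else None."""
    lines = content_md.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith(SECTION):
            continue
        for candidate in lines[i + 1:]:
            value = candidate.strip()
            if value.startswith("#"):
                break          # next section reached: nothing was filled in
            if not value:
                continue       # skip blank lines inside the section
            if value == "private":
                return value
            try:
                datetime.strptime(value, "%Y-%m-%d %H:%M")
            except ValueError:
                return None    # something is written, but it is not a valid decision
            return value
        return None
    return None
```

A `None` result means the flow should stop and ask the user rather than publish.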
- Update YouTube
  - Command: `python scripts/update_youtube.py --video-id <id> --title "..." --description-file <desc.txt> --thumbnail <thumb.png> --publish-at "YYYY-MM-DD HH:MM" --timezone <IANA> --client-secret <path>`
- Build native X video variant (after thumbnail choice)
  - Command: `python scripts/build_x_native_video.py --video <video.mp4> --thumbnail <thumb.png> --output <workdir>/video-x.mp4 --intro-ms 500`
  - Result: a version ready for X where the first 500ms shows the selected thumbnail as a static cover frame.
  - Always build the Spanish X variant from the original Spanish video + final Spanish thumbnail.
  - If the user requested the English X variant too, also build: `python scripts/build_x_native_video.py --video <workdir>/dubbed_video.en.mp4 --thumbnail <english-thumb.png> --output <workdir>/video-x.en.mp4 --intro-ms 500`
  - The English X variant must use the English-edited thumbnail, not the Spanish one.
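The two X builds differ only in their input video, thumbnail, and output name. A minimal sketch that produces both argument lists from the commands above; `x_build_commands` is a hypothetical helper, and it returns the commands without running them.

```python
from pathlib import Path
from typing import Optional


def x_build_commands(
    workdir: str,
    video: str,
    thumbnail: str,
    english_thumbnail: Optional[str] = None,
) -> list[list[str]]:
    """Return the build_x_native_video.py invocations for the requested variants."""
    work = Path(workdir)

    def command(source, thumb, output) -> list[str]:
        return [
            "python", "scripts/build_x_native_video.py",
            "--video", str(source),
            "--thumbnail", str(thumb),
            "--output", str(output),
            "--intro-ms", "500",
        ]

    # The Spanish variant is always built from the original video.
    commands = [command(video, thumbnail, work / "video-x.mp4")]
    # The English variant uses the dubbed video and the English-edited thumbnail.
    if english_thumbnail is not None:
        commands.append(
            command(work / "dubbed_video.en.mp4", english_thumbnail, work / "video-x.en.mp4")
        )
    return commands
```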
- Schedule socials (PostFlow, excluding X)
  - Command: `python scripts/schedule_socials.py --text-file <linkedin.txt> --scheduled-date <ISO8601+offset> --comment-url <video_url>  --image <thumb.png>`
  - This script publishes to configured socials except X.
  - Note: `schedule_socials.py` percent-encodes underscores in the `--comment-url` (e.g. `_` -> `%5F`) to avoid LinkedIn URL formatting issues.
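The underscore encoding described in the note, combined with the rule that the comment must carry a short invitation and not just the link, looks roughly like this. The sketch illustrates the transformation only; it is not the code of `schedule_socials.py`, and both function names are hypothetical. Since the script already encodes the URL itself, pass it the raw URL.

```python
def encode_underscores(url: str) -> str:
    """Percent-encode underscores so LinkedIn does not mangle the link."""
    return url.replace("_", "%5F")


def link_comment(url: str, invitation: str = "Watch the full technical analysis here:") -> str:
    """Build a first-comment text: a short invitation followed by the encoded link."""
    return f"{invitation} {encode_underscores(url)}"
```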
- Final Reminder
  - Explicitly remind the user to go to YouTube Studio to:
    - Enable monetization (not supported via API).
    - Add End Screens (not supported via API).
  - If multilingual output was requested, upload the English dubbed audio track and apply the English title/description in the YouTube Studio multi-language UI.
  - If the English X variant was requested, remember that there should now be two native X assets ready: the Spanish `video-x.mp4` and the English `video-x.en.mp4`.