Agent Skills: YouTube to Markdown

Use when user asks YouTube video extraction, get, fetch, transcripts, subtitles, or captions, or channel browsing. Writes video details and transcription into structured markdown file.

UncategorizedID: vre/flow-state/youtube-to-markdown

Install this agent skill to your local

pnpm dlx add-skill https://github.com/vre/flow-state/tree/HEAD/youtube-to-markdown

Skill Files

Browse the full folder contents for youtube-to-markdown.

Download Skill

Loading file tree…

youtube-to-markdown/SKILL.md

Skill Metadata

Name
youtube-to-markdown
Description
Use when user asks YouTube video extraction, get, fetch, transcripts, subtitles, or captions, or channel browsing. Writes video details and transcription into structured markdown file.

YouTube to Markdown

Multiple videos: Process one video at a time, sequentially. Do not run parallel extractions. Do not create your own scripts. Channel subfolder naming: Channel Name (channel_id) — always include the channel ID.

Step -1: Detect input type

If input is a channel URL (contains /@, /channel/, /c/, /user/) or a bare channel ID (starts with UC, 24 chars), but NOT watch?v=: Read and follow ./subskills/channel_browse.md.

Otherwise: Continue to Step 0.

Step 0: Check if extracted before

python3 ./scripts/run.py check "<YOUTUBE_URL>" "<output_directory>"

Output JSON contains video_id. Set BASE_NAME = youtube_{video_id} for all subsequent steps.

Tell the user: "Valitse 'Yes, allow all edits during this session'. Kirjoitan warmup-tiedoston Write-oikeuksien varmistamiseksi, jotta agenttien kirjoitusoikeudet toimivat ilman lisäkyselyjä."

Then Write "# Processing {video_id}\n" to <output_directory>/${BASE_NAME}_warmup.tmp. If Write fails because file exists and hasn't been read, Read it first then Write.

If exists: false: Continue to Step 1.

If exists: true: Read and follow ./subskills/update_flow.md.

Step 1: Choose output

AskUserQuestion:

  • question: "What do you want to extract from the video?"
  • header: "Output"
  • multiSelect: false
  • options: A. "Summary + Comments (Recommended)" - Summary cross-analyzed with comments. Timestamped transcript stored for reference. B. "Summary + Comments + Formatted Transcript with Watch Guide" - Option A + cleaned and formatted full transcript with synthesized watch guide → double tokens C. "Summary Only" - Summary of video content. Timestamped transcript stored for reference. D. "Formatted Transcript Only" - Cleaned and formatted full transcript

Step 2: Execute modules

Based on user's choice, read and follow each subskill instruction in ./subskills/{file}.

Concurrency: "|" = run concurrently as parallel foreground agents in one message. Never use run_in_background.

Delegation: Subagents cannot launch sub-subagents. Modules that need parallel Agent calls (transcript_polish.md) must be read and executed by the orchestrator directly — do NOT delegate them to an Agent. Only leaf modules (no internal Agent calls) can be delegated.

  • A: transcript_extract.md → (transcript_summarize.md | comment_extract.md) → comment_summarize.md
  • B: transcript_extract.md → (transcript_summarize.md | transcript_polish.md | comment_extract.md) → comment_summarize.md
  • C: transcript_extract.md → transcript_summarize.md
  • D: transcript_extract.md → transcript_polish.md

Bold = orchestrator reads and executes directly (not delegated to Agent).

For option B only, create marker file before running modules:

python3 ./scripts/run.py flag "<output_directory>/${BASE_NAME}_watch_guide_requested.flag"

Step 3: Finalize

python3 ./scripts/run.py assemble [flag] "${BASE_NAME}" "<output_directory>"

Flags: A=--summary-comments, B=(none), C=--summary-only, D=--transcript-only

Use --debug to keep intermediate files.