PowerPoint Automation
AI-powered PPTX generation using the Orchestrator-Workers pattern.
When to Use
- PowerPoint, PPTX, create presentation, slides
- Convert web articles/blog posts to presentations
- Translate English PPTX to Japanese
- Create presentations using custom templates
Quick Start
From Web Article:
Create a 15-slide presentation from: https://zenn.dev/example/article
From Existing PPTX:
Translate this presentation to Japanese: input/presentation.pptx
Workflow
TRIAGE → PLAN → PREPARE_TEMPLATE → EXTRACT → TRANSLATE → BUILD → REVIEW → DONE
| Phase | Script/Agent | Description |
| ------- | ------------------------- | ---------------------- |
| EXTRACT | extract_images.py | Content → content.json |
| BUILD | create_from_template.py | Generate PPTX |
| REVIEW | PPTX Reviewer | Quality check |
Key Scripts
→ references/SCRIPTS.md for complete reference
| Script | Purpose |
| ------------------------- | -------------------------------------- |
| create_from_template.py | Generate PPTX from content.json (main) |
| reconstruct_analyzer.py | Convert PPTX → content.json |
| extract_images.py | Extract images from PPTX/web |
| validate_content.py | Validate content.json schema |
| validate_pptx.py | Detect text overflow |
content.json (IR)
All agents communicate via this intermediate format:
{
"slides": [
{ "type": "title", "title": "Title", "subtitle": "Sub" },
{ "type": "content", "title": "Topic", "items": ["Point 1"] }
]
}
→ references/schemas/content.schema.json
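A minimal sketch of a structural check on this format, using only the standard library. It is not the project's validate_content.py: the function name `check_content`, the set of slide types, and the rules applied are assumptions drawn from the example above, and the real schema may be stricter.

```python
import json

# Slide types seen in this document's examples; the real schema may allow more.
KNOWN_TYPES = {"title", "content", "section"}


def check_content(doc: dict) -> list[str]:
    """Return a list of human-readable problems (empty means the check passed)."""
    slides = doc.get("slides")
    if not isinstance(slides, list) or not slides:
        return ["'slides' must be a non-empty list"]
    problems = []
    for i, slide in enumerate(slides):
        if not isinstance(slide, dict):
            problems.append(f"slide {i}: must be an object")
            continue
        if slide.get("type") not in KNOWN_TYPES:
            problems.append(f"slide {i}: unknown type {slide.get('type')!r}")
        items = slide.get("items", [])
        if not isinstance(items, list) or not all(isinstance(x, str) for x in items):
            problems.append(f"slide {i}: 'items' must be a list of strings")
    return problems


sample = json.loads("""
{"slides": [
  {"type": "title", "title": "Title", "subtitle": "Sub"},
  {"type": "content", "title": "Topic", "items": ["Point 1"]}
]}
""")
print(check_content(sample))  # prints []
```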
Templates
| Template | Purpose | Layouts |
| ---------------------- | --------------------------- | --------- |
| assets/template.pptx | Default (Japanese, 16:9) | 4 layouts |
template layout details
| Index | Name | Category | Use |
| ----- | ----------------------- | -------- | ---------------------------- |
| 0 | タイトル スライド | title | Opening slide |
| 1 | タイトルとコンテンツ | content | Standard content |
| 2 | 1_タイトルとコンテンツ | content | Standard content (alternate) |
| 3 | セクション見出し | section | Section divider |
Usage example:
python scripts/create_from_template.py assets/template.pptx content.json output.pptx --config assets/template_layouts.json
Template management best practices
Consolidating multiple designs (slide masters)
If the template PPTX contains multiple slide masters, output can be unstable.
To check:
python scripts/create_from_template.py assets/template.pptx --list-layouts
To fix:
- Open the template in PowerPoint
- Select [View] → [Slide Master]
- Delete the unneeded slide masters
- After saving, regenerate template_layouts.json:
python scripts/analyze_template.py assets/template.pptx
Hierarchy in content.json
To give bullets a hierarchy (indentation), use the bullets form instead of items (items render flat):
{"type": "content", "bullets": [
  {"text": "Item 1", "level": 0},
  {"text": "Detail 1", "level": 1},
  {"text": "Item 2", "level": 0}
]}
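The two forms can be handled uniformly by normalizing both into the bullets shape before rendering. The sketch below shows one way to do that; `normalize_bullets` and `render_outline` are hypothetical helper names, not functions from the project's scripts.

```python
def normalize_bullets(slide: dict) -> list[dict]:
    """Return the slide's bullets as [{"text": ..., "level": ...}, ...]."""
    if "bullets" in slide:
        # Already hierarchical; default a missing level to 0.
        return [{"text": b["text"], "level": b.get("level", 0)} for b in slide["bullets"]]
    # Flat form: every item becomes a top-level bullet.
    return [{"text": text, "level": 0} for text in slide.get("items", [])]


def render_outline(slide: dict) -> str:
    """Render bullets as indented text, two spaces per level, for a quick preview."""
    return "\n".join("  " * b["level"] + "- " + b["text"] for b in normalize_bullets(slide))


flat = {"type": "content", "items": ["Item 1", "Item 2"]}
nested = {"type": "content", "bullets": [
    {"text": "Item 1", "level": 0},
    {"text": "Detail 1", "level": 1},
]}
print(render_outline(nested))
```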
Agents
→ references/agents/ for definitions
| Agent | Purpose |
| ------------- | --------------------- |
| Orchestrator | Pipeline coordination |
| Localizer | Translation (EN ↔ JA) |
| PPTX Reviewer | Final quality check |
Design Principles
- SSOT: content.json is canonical
- SRP: Each agent/script has one purpose
- Fail Fast: Max 3 retries per phase
- Human in Loop: User confirms at PLAN phase
URL Format in Slides
Reference URLs must use "Title - URL" format for APPENDIX slides:
VPN Gateway の新機能 - https://learn.microsoft.com/ja-jp/azure/vpn-gateway/whats-new
→ references/content-guidelines.md for details
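A small sketch of producing and checking the "Title - URL" form. The helper names and the regular expression are assumptions for illustration; references/content-guidelines.md remains the authority on the exact rules.

```python
import re

# "Title - URL": a title, a spaced hyphen, then an http(s) URL ending the line.
REFERENCE_RE = re.compile(r"^(?P<title>.+?) - (?P<url>https?://\S+)$")


def format_reference(title: str, url: str) -> str:
    """Build a reference line in "Title - URL" form."""
    return f"{title.strip()} - {url.strip()}"


def parse_reference(line: str):
    """Return (title, url), or None when the line is not in "Title - URL" form."""
    m = REFERENCE_RE.match(line.strip())
    return (m.group("title"), m.group("url")) if m else None


line = format_reference(
    "VPN Gateway の新機能",
    "https://learn.microsoft.com/ja-jp/azure/vpn-gateway/whats-new",
)
print(parse_reference(line))
```

A bare URL with no title returns None, which makes it easy to flag APPENDIX entries that still need a title.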
References
| File | Content |
| --------------------- | -------------------- |
| SCRIPTS.md | Script documentation |
| USE_CASES.md | Workflow examples |
| content-guidelines.md | URL format, bullets |
| agents/ | Agent definitions |
| schemas/ | JSON schemas |
Technical Content Addition (Azure/MS Topics)
When adding Azure/Microsoft technical content to slides, follow the same verification workflow as QA:
Workflow
[Content Request] → [Researcher] → [Reviewer] → [PPTX Update]
                         ↓              ↓
                  Docs MCP search  Content verification
Required Steps
- Research Phase: Use microsoft_docs_search / microsoft_docs_fetch to gather official information
- Review Phase: Verify the accuracy of content before adding to slides
- Build Phase: Update content.json and regenerate PPTX
Forbidden
- ❌ Adding technical content without MCP verification
- ❌ Skipping review for "simple additions"
- ❌ Generating PPTX while PowerPoint has the file open
File Lock Prevention
Before generating PPTX, check if the file is locked:
# Check if file is locked
$path = "path/to/file.pptx"
try { [IO.File]::OpenWrite($path).Close(); "File is writable" }
catch { "File is LOCKED - close PowerPoint first" }
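The same check can be done from Python when the build step itself should refuse to run. This is a sketch under the assumption that an open PowerPoint file raises PermissionError on Windows (the error this document mentions for same-name saves); on other platforms a lock may not be detected this way. `is_writable` is a hypothetical helper name.

```python
from pathlib import Path


def is_writable(path: str) -> bool:
    """Return False when the file exists but cannot be opened for writing."""
    p = Path(path)
    if not p.exists():
        return True  # nothing to lock; the file will be created on save
    try:
        # "r+b" opens for update without truncating the existing content.
        with open(p, "r+b"):
            return True
    except PermissionError:
        return False


if not is_writable("path/to/file.pptx"):
    print("File is LOCKED - close PowerPoint first")
```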
Shape-based Architecture Diagrams
When creating network/architecture diagrams, use PowerPoint shapes instead of ASCII art text boxes. ASCII art is unreadable in presentation mode.
Design Pattern
from pptx.dml.color import RGBColor
from pptx.enum.dml import MSO_LINE
from pptx.enum.shapes import MSO_CONNECTOR, MSO_SHAPE
from pptx.util import Cm, Pt
# Color scheme
AZURE_BLUE = RGBColor(0, 120, 212)
LIGHT_BLUE = RGBColor(232, 243, 255)
ONPREM_GREEN = RGBColor(16, 124, 65)
LIGHT_GREEN = RGBColor(232, 248, 237)
# Outer frame (Azure VNet); slide, left, top, w, h are supplied by the caller
box = slide.shapes.add_shape(MSO_SHAPE.ROUNDED_RECTANGLE, left, top, w, h)
box.fill.solid()
box.fill.fore_color.rgb = LIGHT_BLUE
box.line.color.rgb = AZURE_BLUE
# Dashed connector (tunnel); named enum members replace the magic numbers
# (the raw value 2 is SQUARE_DOT in MSO_LINE, not DASH)
conn = slide.shapes.add_connector(MSO_CONNECTOR.STRAIGHT, x1, y1, x2, y2)
conn.line.color.rgb = AZURE_BLUE
conn.line.dash_style = MSO_LINE.DASH
Layout Tips
- Use Cm() for positioning (not Inches()): easier to reason about on metric-based slides
- Leave at least 1.5cm vertical gap between Azure and on-premises zones for tunnel lines
- Place labels inside boxes (not overlapping edges) to avoid visual clutter
- Use color coding to distinguish zones: blue = Azure, green = on-premises, orange = cross-connect
- For dual diagrams (side-by-side), split slide into left/right halves with 12cm left margin for the right diagram
Anti-patterns
❌ ASCII art in textboxes (unreadable in presentation mode)
❌ Overlapping shapes due to insufficient spacing
❌ Placing labels outside their parent containers
❌ Using absolute EMU values without helper functions
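To illustrate the last anti-pattern, the sketch below keeps all layout arithmetic in centimetres and converts to EMU in one place. It uses only the standard library so the numbers can be checked without a slide; `cm` and `stack_zones` are hypothetical helpers (python-pptx's own Cm() does the same conversion), and the zone sizes are made-up values.

```python
EMU_PER_CM = 360_000  # 1 cm = 360,000 EMU


def cm(value: float) -> int:
    """Convert centimetres to EMU."""
    return round(value * EMU_PER_CM)


def stack_zones(top_cm: float, heights_cm: list[float], gap_cm: float = 1.5) -> list[tuple[int, int]]:
    """Return (top, height) in EMU for zones stacked vertically with a fixed gap."""
    boxes, cursor = [], top_cm
    for h in heights_cm:
        boxes.append((cm(cursor), cm(h)))
        cursor += h + gap_cm
    return boxes


# Azure zone on top, on-premises zone below, 1.5 cm apart for the tunnel lines.
azure, onprem = stack_zones(top_cm=3.0, heights_cm=[6.0, 5.0])
print(azure, onprem)
```

The resulting integers can be passed directly as the top and height arguments of add_shape, since python-pptx length values are plain integers in EMU.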
Hyperlink Batch Processing
Batch-add hyperlinks and page titles to all URLs in a presentation:
Workflow
import re
url_pattern = re.compile(r'(https?://[^\s)]+)')
# 1. Build URL→Title map (use MCP docs_search or fetch_webpage)
URL_TITLES = {
    'https://learn.microsoft.com/.../whats-new': 'Azure VPN Gateway の新機能',
    # ... one entry per URL
}
# 2. Iterate all runs and add hyperlinks (prs is an already opened Presentation)
for slide in prs.slides:
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for para in shape.text_frame.paragraphs:
            for run in para.runs:
                urls = url_pattern.findall(run.text)
                for url in urls:
                    if not (run.hyperlink and run.hyperlink.address):
                        run.hyperlink.address = url.rstrip('/')
                    # Prepend title if missing
                    title = URL_TITLES.get(url.rstrip('/'))
                    if title and title not in run.text:
                        run.text = f'{title}\n{url}'
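Before running the batch over a real deck, it can help to check what the URL pattern actually captures from typical slide text. The sketch below is a standalone variant of the extraction step; stripping trailing sentence punctuation is an added assumption, not something the loop above does.

```python
import re

URL_PATTERN = re.compile(r"https?://[^\s)]+")


def find_urls(text: str) -> list[str]:
    """Return URLs in text, without trailing slashes or sentence punctuation."""
    return [u.rstrip("/.,;") for u in URL_PATTERN.findall(text)]


print(find_urls("See (https://example.com/docs/) and https://example.com/a."))
# prints ['https://example.com/docs', 'https://example.com/a']
```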
Verification
hlink_count = sum(
1 for slide in prs.slides
for shape in slide.shapes if shape.has_text_frame
for para in shape.text_frame.paragraphs
for run in para.runs
if run.hyperlink and run.hyperlink.address
)
print(f'Hyperlinks: {hlink_count}')
Font Theme Token Resolution (ZIP-level)
python-pptx sometimes leaves theme tokens (+mn-ea, +mj-lt) unresolved, causing font fallback. Fix via ZIP-level string replacement:
import os, re, shutil, zipfile
FONT_JA = 'BIZ UDPゴシック'
FONT_LATIN = 'BIZ UDPGothic'
# out is the path of the PPTX already written by prs.save()
tmp = out + '.tmp'
shutil.copy2(out, tmp)
with zipfile.ZipFile(tmp, 'r') as zin:
    with zipfile.ZipFile(out, 'w', zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename.endswith('.xml'):
                content = data.decode('utf-8')
                content = content.replace('+mn-ea', FONT_JA)
                content = content.replace('+mj-ea', FONT_JA)
                content = content.replace('+mn-lt', FONT_LATIN)
                content = content.replace('+mj-lt', FONT_LATIN)
                # Force every East Asian typeface to FONT_JA
                content = re.sub(
                    r'(<a:ea typeface=")[^"]*(")',
                    f'\\g<1>{FONT_JA}\\2', content
                )
                data = content.encode('utf-8')
            zout.writestr(item, data)
os.remove(tmp)
⚠️ Always do this after prs.save(), not before.
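One way to confirm the replacement worked is to scan the saved archive for tokens that remain. The sketch below counts them per XML part; `count_theme_tokens` is a hypothetical helper, and the in-memory archive only stands in for a real PPTX so the example runs on its own.

```python
import io
import re
import zipfile

# Theme font tokens: +mn / +mj (minor / major) with an -lt, -ea, or -cs script suffix.
TOKEN_RE = re.compile(r'typeface="\+m[nj]-(?:lt|ea|cs)"')


def count_theme_tokens(pptx) -> dict[str, int]:
    """Map each XML part name to the number of unresolved theme tokens it still holds."""
    counts = {}
    with zipfile.ZipFile(pptx) as z:
        for name in z.namelist():
            if name.endswith(".xml"):
                n = len(TOKEN_RE.findall(z.read(name).decode("utf-8")))
                if n:
                    counts[name] = n
    return counts


# Self-contained demo on an in-memory archive standing in for a saved PPTX.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("ppt/slides/slide1.xml", '<a:latin typeface="+mn-lt"/><a:ea typeface="+mn-ea"/>')
    z.writestr("ppt/slides/slide2.xml", '<a:ea typeface="BIZ UDPGothic"/>')
remaining = count_theme_tokens(buf)
print(remaining)  # prints {'ppt/slides/slide1.xml': 2}
```

An empty result after the ZIP-level pass means no tokens were left behind; with a real file, pass the path to the saved PPTX instead of the buffer.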
Section Management via XML
PowerPoint sections are stored as an extension in ppt/presentation.xml. python-pptx has no native section API.
Adding/Updating Sections
import re, uuid, zipfile
SECTION_URI = '{521415D9-36F7-43E2-AB2F-B90AF26B5E84}'
P14_NS = 'http://schemas.microsoft.com/office/powerpoint/2010/main'
# Read presentation.xml from ZIP
with zipfile.ZipFile(pptx_path) as z:
    pres_xml = z.read('ppt/presentation.xml').decode('utf-8')
# Ensure p14 namespace is declared
if f'xmlns:p14="{P14_NS}"' not in pres_xml:
    pres_xml = pres_xml.replace('<p:presentation',
                                f'<p:presentation xmlns:p14="{P14_NS}"', 1)
# Extract slide IDs
slide_ids = re.findall(r'<p:sldId id="(\d+)"', pres_xml)
# Define sections: (name, start_slide_0based)
sections = [("表紙", 0), ("本編", 2), ("Appendix", 15)]
# Build section XML
section_parts = []
for idx, (name, start) in enumerate(sections):
    end = sections[idx+1][1] if idx+1 < len(sections) else len(slide_ids)
    refs = ''.join(f'<p14:sldId id="{slide_ids[i]}"/>'
                   for i in range(start, min(end, len(slide_ids))))
    sec_id = '{' + str(uuid.uuid4()).upper() + '}'
    section_parts.append(
        f'<p14:section name="{name}" id="{sec_id}">'
        f'<p14:sldIdLst>{refs}</p14:sldIdLst></p14:section>'
    )
# Insert into extLst
new_ext = (f'<p:ext uri="{SECTION_URI}">'
           f'<p14:sectionLst xmlns:p14="{P14_NS}">'
           + ''.join(section_parts)
           + '</p14:sectionLst></p:ext>')
# Write back to ZIP
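The write-back step is left as a comment above. A ZIP member cannot be edited in place, so one approach is to copy the archive to a new file and swap in the modified part along the way. The sketch below shows that idea with the standard library; `replace_zip_member` is a hypothetical helper, and the in-memory archives stand in for real PPTX paths.

```python
import io
import zipfile


def replace_zip_member(src, dst, member: str, new_data: bytes) -> None:
    """Copy the archive at src to dst, swapping in new_data for one member."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = new_data if item.filename == member else zin.read(item.filename)
            zout.writestr(item, data)


# Demo with in-memory archives; with real files, src and dst must be different paths.
src = io.BytesIO()
with zipfile.ZipFile(src, "w") as z:
    z.writestr("ppt/presentation.xml", "<old/>")
    z.writestr("ppt/slides/slide1.xml", "<slide/>")
dst = io.BytesIO()
replace_zip_member(src, dst, "ppt/presentation.xml", b"<new/>")
with zipfile.ZipFile(dst) as z:
    updated = z.read("ppt/presentation.xml")
    untouched = z.read("ppt/slides/slide1.xml")
print(updated)  # prints b'<new/>'
```

For the section workflow, new_data would be the modified pres_xml encoded as UTF-8.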
Important Notes
- The URI {521415D9-36F7-43E2-AB2F-B90AF26B5E84} is specific to the presenter's PowerPoint version; some versions use different URIs
- Always remove existing section XML before inserting new ones (avoid duplicates)
- Section changes only show in PowerPoint's slide sorter view after re-opening the file
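The remove-then-insert rule can be sketched as a single string operation. This assumes the section extension is written with the p: prefix exactly as in the snippet above and that the only p:extLst in presentation.xml is the presentation-level one; `upsert_section_ext` is a hypothetical helper name.

```python
import re

SECTION_URI = "{521415D9-36F7-43E2-AB2F-B90AF26B5E84}"

# Matches an existing section extension, whatever it currently contains.
SECTION_EXT_RE = re.compile(
    r'<p:ext uri="' + re.escape(SECTION_URI) + r'">.*?</p:ext>', re.DOTALL
)


def upsert_section_ext(pres_xml: str, new_ext: str) -> str:
    """Drop any existing section extension, then insert new_ext exactly once."""
    pres_xml = SECTION_EXT_RE.sub("", pres_xml)
    pos = pres_xml.rfind("</p:extLst>")
    if pos != -1:
        # Reuse the existing extension list.
        return pres_xml[:pos] + new_ext + pres_xml[pos:]
    closing = pres_xml.rfind("</p:presentation>")
    if closing == -1:
        raise ValueError("not a presentation.xml document")
    # No extension list yet: create one as the last child of p:presentation.
    return pres_xml[:closing] + f"<p:extLst>{new_ext}</p:extLst>" + pres_xml[closing:]


new_ext = f'<p:ext uri="{SECTION_URI}">new</p:ext>'
once = upsert_section_ext("<p:presentation><p:sldIdLst/></p:presentation>", new_ext)
twice = upsert_section_ext(once, new_ext)
print(twice.count(SECTION_URI))  # prints 1
```

Running the function twice leaves a single section extension, which is the property the note about duplicates asks for.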
Slide Layout Change (Safe Pattern)
python-pptx does NOT safely support direct layout swapping. Use the add-move-hide-cleanup pattern:
1. add_slide(target_layout): new slide at the end
2. Set title text on the new slide's placeholder (placeholder_format.idx == 0)
3. Move new slide to old slide's position via sldIdLst XML manipulation (reverse order)
4. Hide & clear old slide (show='0', remove shapes)
5. Save, re-open, delete hidden slides in a separate pass
from pptx.oxml.ns import qn
old_slide = prs.slides[target_idx]  # capture before the move shifts indices
# Step 3: Move new slide (last) before old slide
sldIdLst = prs.part._element.find(qn('p:sldIdLst'))
slides_list = list(sldIdLst)
new_el = slides_list[-1]
old_el = slides_list[target_idx]
sldIdLst.remove(new_el)
sldIdLst.insert(list(sldIdLst).index(old_el), new_el)
# Step 4: Hide old slide (now at target_idx + 1)
old_slide._element.set('show', '0')
for shape in list(old_slide.shapes):
    shape._element.getparent().remove(shape._element)
Forbidden Patterns (★ Critical)
| Pattern | Problem | Result |
|---------|---------|--------|
| rel._target = new_layout.part | Layout reference corrupted | PowerPoint repair dialog |
| prs.part.drop_rel(rId) for slide deletion | Orphan XML in ZIP | Duplicate name warning → corruption |
| show='0' while indices shift | Wrong slides hidden | Content silently disappears |
Safe Hidden Slide Cleanup
Delete hidden slides in a separate script/pass after saving, in reverse index order:
# Cleanup pass (separate from insertion)
from pptx import Presentation
from pptx.oxml.ns import qn
prs = Presentation(saved_file)
sldIdLst = prs.part._element.find(qn('p:sldIdLst'))
empty_hidden_indices = []
for i, slide in enumerate(prs.slides):
    if slide._element.get('show') == '0':
        # Verify truly empty before deleting
        has_content = any(
            para.text.strip()
            for shape in slide.shapes if shape.has_text_frame
            for para in shape.text_frame.paragraphs
        )
        if has_content:
            del slide._element.attrib['show']  # Restore, not delete
        else:
            empty_hidden_indices.append(i)
# Delete empty hidden slides (reverse order)
for idx in reversed(empty_hidden_indices):
    el = list(sldIdLst)[idx]
    rId = el.get(qn('r:id'))
    sldIdLst.remove(el)
    prs.part.drop_rel(rId)
prs.save(output_new_name)  # Always save to NEW filename
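The reason for deleting in reverse index order is easiest to see on a plain list: each deletion shifts every later position down by one, so forward deletion hits the wrong element. The sketch below uses strings in place of slides.

```python
def delete_indices(items: list, indices: list[int]) -> list:
    """Delete the given positions from a copy of items, highest index first."""
    result = list(items)
    for idx in sorted(indices, reverse=True):
        del result[idx]
    return result


slides = ["s0", "s1", "s2", "s3", "s4"]
print(delete_indices(slides, [1, 3]))  # prints ['s0', 's2', 's4']

# Forward order: after deleting index 1, the old "s3" sits at index 2,
# so deleting index 3 removes "s4" instead.
wrong = list(slides)
for idx in [1, 3]:
    del wrong[idx]
print(wrong)  # prints ['s0', 's2', 's3']
```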
Post-Processing (URL Linkification)
⚠️ create_from_template.py does not process footer_url. Post-processing required.
Items Requiring Post-Processing
| Item | Processing |
| --------------- | ---------------------------------- |
| footer_url | Add linked textbox at slide bottom |
| URLs in bullets | Convert to hyperlinks |
| Reference URLs | Linkify URLs in Appendix |
Save with Different Name (File Lock Workaround)
PowerPoint locks open files. Saving under the same name raises PermissionError, so always save under a different name:
prs.save('file_withURL.pptx')
| Processing | Suffix |
| ------------- | ---------- |
| URL added | _withURL |
| Final version | _final |
| Fixed version | _fixed |
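A small helper can derive these names so the suffix is always placed before the extension. This is a sketch; `with_stage_suffix` is a hypothetical name, not a function from the project's scripts.

```python
from pathlib import Path


def with_stage_suffix(path: str, suffix: str) -> str:
    """Return path with suffix inserted before the extension."""
    p = Path(path)
    return str(p.with_name(p.stem + suffix + p.suffix))


print(with_stage_suffix("file.pptx", "_withURL"))  # prints file_withURL.pptx
```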
16:9 Slide Centering (Known Issue)
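L9: The default placeholders from Presentation() are laid out for 4:3 (25.4cm). Changing to 16:9 with slide_width = Cm(33.867) leaves the placeholder positions at their 4:3 values, so every slide appears shifted left.
Recommended pattern: Blank + manual placement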
from pptx import Presentation
from pptx.enum.text import PP_ALIGN
from pptx.util import Cm
prs = Presentation()
prs.slide_width = Cm(33.867) # 16:9
prs.slide_height = Cm(19.05)
SW = prs.slide_width
# Use the Blank layout (no placeholders)
slide = prs.slides.add_slide(prs.slide_layouts[6])
# Center relative to SW
margin = Cm(3)
tb = slide.shapes.add_textbox(margin, Cm(5), SW - margin * 2, Cm(3))
p = tb.text_frame.paragraphs[0]
p.text = "Centered Title"
p.alignment = PP_ALIGN.CENTER
Anti-patterns
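❌ Using Layouts 0-5 on 16:9 slides (placeholders are laid out for 25.4cm and sit left of center)
❌ Using placeholders without adjusting their positions after changing slide_width
✅ Blank layout + add_textbox() with symmetric margins relative to SW
✅ If the template PPTX itself was created as 16:9, Layouts 0-5 are fine too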
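The size of the left shift follows from the two slide widths. A short worked calculation, assuming a placeholder that was centered on the 25.4 cm layout; `centered_left` is a hypothetical helper name.

```python
SLIDE_W_4_3 = 25.4     # cm, width the default placeholders are laid out for
SLIDE_W_16_9 = 33.867  # cm, width after switching to 16:9


def centered_left(slide_width_cm: float, shape_width_cm: float) -> float:
    """Left offset (cm) that centers a shape of the given width on the slide."""
    return (slide_width_cm - shape_width_cm) / 2


# A placeholder centered for a 25.4 cm slide sits this far left of center at 16:9:
shift = centered_left(SLIDE_W_16_9, SLIDE_W_4_3)
print(round(shift, 2))  # prints 4.23
```

The same function gives the left offset for manually placed text boxes: a box 27.867 cm wide needs a left offset of 3 cm, which matches the symmetric-margin pattern above.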
Template Corruption Recovery
L10: If *.pptx binary is added to .gitattributes after git add, CRLF/encoding conversion corrupts the binary (the UTF-8 replacement character EF BF BD appears in the file).
Diagnosis
with open('template.pptx', 'rb') as f:
    data = f.read()
count = data.count(b'\xef\xbf\xbd')
print(f'UTF-8 replacement chars: {count}') # non-zero means the file is corrupted
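The byte count can be paired with a structural check, since a PPTX must also be a readable ZIP archive. The sketch below combines both using the standard library; `diagnose_pptx` and the report keys are hypothetical, and the replacement-character count stays a heuristic rather than a proof of corruption.

```python
import io
import zipfile


def diagnose_pptx(data: bytes) -> dict:
    """Report basic integrity facts about the bytes of a .pptx file."""
    report = {
        "replacement_chars": data.count(b"\xef\xbf\xbd"),
        "is_zip": zipfile.is_zipfile(io.BytesIO(data)),
        "bad_member": None,
    }
    if report["is_zip"]:
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as z:
                # Name of the first member failing its CRC check, or None.
                report["bad_member"] = z.testzip()
        except zipfile.BadZipFile:
            report["is_zip"] = False
    return report


# Demo: a small valid archive versus bytes damaged by text conversion.
good = io.BytesIO()
with zipfile.ZipFile(good, "w") as z:
    z.writestr("ppt/presentation.xml", "<p:presentation/>")
good_report = diagnose_pptx(good.getvalue())
bad_report = diagnose_pptx(b"\xef\xbf\xbdnot a zip")
print(good_report, bad_report)
```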
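Recovery
# Regenerate an empty template with python-pptx
from pptx import Presentation
from pptx.util import Cm
prs = Presentation()
prs.slide_width = Cm(33.867) # 16:9
prs.slide_height = Cm(19.05)
prs.save('template_new.pptx')
# → 11 layouts are generated automatically (beware the 4:3 placeholders)
Prevention
- Set up .gitattributes before the first commit
- When installing via an automated installer such as skill-ninja, check that .gitignore exclusions are consistent with binary file handling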
Video Embedding (ZIP Direct Manipulation)
L11: python-pptx does not officially support embedding MP4. However, a PPTX is a ZIP archive, so embedding is possible by manipulating it directly with lxml + zipfile.
Required operations
- slide XML: inject a:videoFile + p14:media into p:pic
- slide rels: add the video/image relationships (rId)
- [Content_Types].xml: add <Default Extension="mp4" ContentType="video/mp4"/>
- ZIP: store the MP4 file and poster image under ppt/media/
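Of these operations, the [Content_Types].xml change is the simplest to show in isolation. The sketch below adds the mp4 Default entry only when it is missing. It uses the standard library's ElementTree rather than lxml to stay dependency-free, and `ensure_mp4_default` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
ET.register_namespace("", CT_NS)  # keep the default namespace unprefixed on output


def ensure_mp4_default(content_types_xml: str) -> str:
    """Add <Default Extension="mp4" ContentType="video/mp4"/> unless already present."""
    root = ET.fromstring(content_types_xml)
    defaults = root.findall(f"{{{CT_NS}}}Default")
    if any(d.get("Extension", "").lower() == "mp4" for d in defaults):
        return content_types_xml
    new = ET.Element(f"{{{CT_NS}}}Default", {"Extension": "mp4", "ContentType": "video/mp4"})
    root.insert(len(defaults), new)  # place it next to the existing Default entries
    return ET.tostring(root, encoding="unicode")


sample = (
    f'<Types xmlns="{CT_NS}">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/ppt/presentation.xml" ContentType="x"/>'
    "</Types>"
)
patched = ensure_mp4_default(sample)
print(patched)
```

The serialized result has no XML declaration; when writing it back into the archive, encode it as UTF-8.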
XML pattern
<p:pic>
  <p:nvPicPr>
    <p:cNvPr id="100" name="Video 1">
      <a:hlinkClick r:id="" action="ppaction://media"/>
    </p:cNvPr>
    <p:cNvPicPr><a:picLocks noChangeAspect="1"/></p:cNvPicPr>
    <p:nvPr>
      <a:videoFile r:link="rId10"/>
      <p:extLst>
        <p:ext uri="{DAA4B4D4-6D71-4841-9C94-3DE7FCFB9230}">
          <p14:media r:embed="rId11"/>
        </p:ext>
      </p:extLst>
    </p:nvPr>
  </p:nvPicPr>
  <p:blipFill>
    <a:blip r:embed="rId12"/> <!-- poster image -->
    <a:stretch><a:fillRect/></a:stretch>
  </p:blipFill>
  <p:spPr>
    <a:xfrm>
      <a:off x="720000" y="1260000"/>
      <a:ext cx="10752120" cy="5058000"/>
    </a:xfrm>
    <a:prstGeom prst="rect"><a:avLst/></a:prstGeom>
  </p:spPr>
</p:pic>
Slide Rels (3 relationships required)
<Relationship Id="rId10" Type=".../relationships/video"
Target="../media/video.mp4" TargetMode="Internal"/>
<Relationship Id="rId11" Type=".../2007/relationships/media"
Target="../media/video.mp4"/>
<Relationship Id="rId12" Type=".../relationships/image"
Target="../media/poster.png"/>
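Notes
- PowerPoint may ask whether to repair the file (minor XML inconsistency) → choose "Yes" and it repairs automatically
- A poster image is required (thumbnail for display)
- Watch the file size: storing video in the ZIP bloats the PPTX. Git LFS is recommended for version control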
Done Criteria
- [ ] content.json generated and validated
- [ ] PPTX file created successfully
- [ ] No text overflow detected
- [ ] User confirmed output quality