# Terraform Documentation Generator

## Overview
Transform Terraform code into comprehensive, user-friendly documentation. This skill analyzes Terraform configurations and generates formatted documentation including module READMEs, architecture overviews, operational runbooks, and quick reference guides.
## Quick Start

Provide Terraform files or point to a Terraform module/configuration, then specify the documentation type needed:

- "Document this Terraform module"
- "Create architecture docs for these Terraform files"
- "Generate a runbook for this infrastructure"
- "Create a quick reference for this Terraform configuration"

The skill will analyze the code and produce formatted documentation appropriate for the target audience.
## Detailed Instructions

### Step 1: Analyze Terraform Code

Parse and understand the Terraform configuration:

**Identify Key Components:**
- Resources - What infrastructure is being created (aws_instance, azurerm_storage_account, etc.)
- Variables - Input parameters, their types, defaults, and descriptions
- Outputs - Values exported by the module
- Providers - Cloud providers and versions required
- Data sources - External data being referenced
- Locals - Internal computed values
- Modules - Child modules being called

**Extract Metadata:**
- Required provider versions
- Terraform version constraints
- Backend configuration
- Dependencies between resources
- Security configurations (IAM, security groups, policies)

**Understand Purpose:**
- What problem does this infrastructure solve?
- What services/resources are the core components?
- How do components interact?
- What are the security and networking patterns?
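
As a rough first pass at the component inventory described above, the top-level block types in a configuration can be counted with standard shell tools. This is a minimal sketch rather than a real HCL parser: it assumes `terraform fmt`-style layout where every top-level block keyword starts at the beginning of a line, and the function name `tf_block_inventory` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count top-level Terraform block types in the given files.
# Assumption: each top-level block starts at column 0 and the keyword is
# followed by whitespace (the layout produced by `terraform fmt`).
tf_block_inventory() {
  awk '
    /^(resource|variable|output|provider|data|locals|module|terraform)[ \t]/ {
      count[$1]++            # $1 is the block keyword, e.g. "resource"
    }
    END {
      for (kind in count) print kind "=" count[kind]
    }
  ' "$@" | sort              # awk iteration order is unspecified, so sort it
}
```

A call such as `tf_block_inventory *.tf` prints one `kind=count` line per block type found, which can seed the component list before reading each block in detail.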
### Step 2: Choose Documentation Format

Select the appropriate documentation type based on user request:

#### Module README
- **For**: Terraform modules that will be reused
- **Audience**: Developers using the module
- **Focus**: Inputs, outputs, usage examples

#### Architecture Documentation
- **For**: Complete infrastructure setups
- **Audience**: Technical stakeholders, architects
- **Focus**: High-level design, component relationships, diagrams

#### Runbook
- **For**: Operational infrastructure
- **Audience**: DevOps, SRE, operations teams
- **Focus**: Deployment, troubleshooting, maintenance

#### Quick Reference
- **For**: Complex configurations
- **Audience**: Team members working with the code
- **Focus**: Key resources, variables, common tasks
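
The selection described above can be sketched as a simple keyword match on the request text. The function name `doc_type_for_request`, the keywords, and the returned labels are illustrative assumptions, not part of any existing tool; requests that match nothing fall back to the module README.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a free-text request to one of the four
# documentation types. Keywords and labels are assumptions for illustration.
doc_type_for_request() {
  local request
  # Lowercase the request so matching is case-insensitive.
  request="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$request" in
    *runbook*|*troubleshoot*)      echo "runbook" ;;
    *architecture*|*diagram*)      echo "architecture" ;;
    *"quick reference"*|*cheat*)   echo "quick-reference" ;;
    *)                             echo "module-readme" ;;
  esac
}
```

For example, `doc_type_for_request "Generate a runbook for this infrastructure"` prints `runbook`.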
### Step 3: Generate Documentation

#### Module README Format

````markdown
# [Module Name]

## Overview
[Brief description of what this module creates and why]

## Architecture
[Simple diagram or description of resources created]

## Usage

### Basic Example
```hcl
module "example" {
  source = "./modules/[module-name]"

  # Required variables
  [var1] = "[value1]"
  [var2] = "[value2]"
}
```

### Complete Example
[Full working example with all options]

## Requirements
| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| aws | >= 4.0 |

## Providers
| Name | Version |
|------|---------|
| aws | >= 4.0 |

## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| [var1] | [description] | string | "default" | no |
| [var2] | [description] | number | n/a | yes |

## Outputs
| Name | Description |
|------|-------------|
| [output1] | [description] |
| [output2] | [description] |

## Resources Created
- [resource_type]: [description of what it does]
- [resource_type]: [description of what it does]

## Security Considerations
- [Security feature 1]
- [Security feature 2]

## Cost Considerations
[Notes about what will incur costs]

## Examples
See the examples directory for more usage examples.

## License
[License information]
````
#### Architecture Documentation Format

````markdown
# [Infrastructure Name] - Architecture Documentation

## Overview
[High-level description of the infrastructure]

### Purpose
[What problem this infrastructure solves]

### Key Components
- **[Component 1]**: [Role and purpose]
- **[Component 2]**: [Role and purpose]
- **[Component 3]**: [Role and purpose]

## Architecture Diagram
[ASCII diagram or mermaid diagram showing component relationships]

Example:

```text
┌─────────────────┐
│   CloudFront    │
└────────┬────────┘
         │
    ┌────▼─────┐
    │   ALB    │
    └────┬─────┘
         │
  ┌──────┴──────┐
  │     ECS     │
  │   Cluster   │
  └──────┬──────┘
         │
    ┌────▼─────┐
    │   RDS    │
    └──────────┘
```

## Components

### [Component 1 Name]
**Type**: [e.g., Load Balancer, Database, Compute]
**Purpose**: [What this component does]
**Configuration Highlights**:
- [Key setting 1]
- [Key setting 2]

**Resources**:
- `[resource_name]` - [description]

**Dependencies**:
- Depends on: [other components]
- Used by: [other components]

### [Component 2 Name]
[Repeat structure]

## Networking

### VPC Architecture
- **CIDR Block**: [CIDR]
- **Subnets**: [Public/Private configuration]
- **Availability Zones**: [AZ strategy]

### Security Groups
- **[SG Name]**: [Purpose and rules]

### Network Flow
[Description of how traffic flows through the architecture]

## Security

### IAM Roles and Policies
- **[Role Name]**: [Purpose and permissions]

### Encryption
- **At Rest**: [What's encrypted and how]
- **In Transit**: [TLS/SSL configuration]

### Access Control
[How access is controlled]

## High Availability & Disaster Recovery

### HA Strategy
[How high availability is achieved]

### Backup Strategy
[What's backed up and how]

### Disaster Recovery
- **RTO**: [Recovery Time Objective]
- **RPO**: [Recovery Point Objective]

## Monitoring & Logging

### CloudWatch Metrics
- [Key metrics being tracked]

### Logging
- [What's being logged and where]

### Alarms
- [Critical alarms configured]

## Cost Optimization

### Cost Breakdown
- [Major cost drivers]

### Optimization Opportunities
- [Potential cost savings]

## Deployment
See [RUNBOOK.md](RUNBOOK.md) for deployment instructions.

## Terraform Configuration
- **Terraform Version**: [version]
- **Providers**: [list]
- **Backend**: [backend type and configuration]

## File Structure

```text
terraform/
├── main.tf           # Main resource definitions
├── variables.tf      # Input variables
├── outputs.tf        # Output values
├── providers.tf      # Provider configurations
└── modules/          # Child modules
```

## Maintenance

### Updates
[How to update the infrastructure]

### Scaling
[How to scale resources]

## Support
For issues or questions, contact [team/email].
````
#### Runbook Format

````markdown
# [Infrastructure Name] - Operations Runbook

## Quick Reference
| Item | Value |
|------|-------|
| AWS Account | [account-id] |
| Region | [region] |
| Environment | [prod/staging/dev] |
| Terraform Backend | [S3 bucket/location] |

## Prerequisites
- Terraform >= [version]
- AWS CLI configured with appropriate credentials
- Required permissions: [list]

## Initial Deployment

### Step 1: Configure Backend
```bash
# Initialize Terraform backend
terraform init \
  -backend-config="bucket=[bucket-name]" \
  -backend-config="key=[state-key]" \
  -backend-config="region=[region]"
```

### Step 2: Review Configuration
```bash
# Review variables
cat terraform.tfvars

# Validate configuration
terraform validate

# Plan deployment
terraform plan -out=tfplan
```

### Step 3: Deploy
```bash
# Apply the plan
terraform apply tfplan

# Verify deployment
terraform output
```

## Daily Operations

### Viewing Current State
```bash
# Show current state
terraform show

# List all resources
terraform state list

# Get specific resource details
terraform state show [resource_address]
```

### Making Changes
```bash
# 1. Update .tf files or variables
# 2. Plan changes
terraform plan -out=tfplan

# 3. Review the plan carefully
# 4. Apply changes
terraform apply tfplan
```

### Checking Outputs
```bash
# View all outputs
terraform output

# Get specific output
terraform output [output_name]
```

## Common Tasks

### Adding a New Environment
1. Copy `terraform.tfvars.example` to `[env].tfvars`
2. Update variables for the new environment
3. Deploy: `terraform apply -var-file=[env].tfvars`

### Scaling Resources
**Scale Up/Down [Resource Type]:**
```bash
# Update the variable
terraform apply -var="[instance_count]=5"

# Or update terraform.tfvars and apply
terraform apply
```

### Updating Security Groups
```bash
# 1. Update security group rules in main.tf
# 2. Plan to see impact
terraform plan

# 3. Apply changes
terraform apply
```

### Rotating Credentials
**Database Credentials:**
```bash
# 1. Update secrets in AWS Secrets Manager
# 2. Force resource update
terraform apply -replace="[resource_address]"
```

### Backup and Recovery
**Create Manual Backup:**
```bash
# For RDS
aws rds create-db-snapshot \
  --db-instance-identifier [db-name] \
  --db-snapshot-identifier [snapshot-name]
```

**Restore from Backup:**
[Step-by-step restoration procedure]

## Troubleshooting

### Issue: Terraform State Lock
**Symptoms**: "Error acquiring the state lock"

**Solution**:
```bash
# Check who has the lock (in DynamoDB)
aws dynamodb get-item \
  --table-name [lock-table] \
  --key '{"LockID":{"S":"[state-path]"}}'

# Force unlock (use with caution!)
terraform force-unlock [lock-id]
```

### Issue: Resource Creation Fails
**Symptoms**: Apply fails with error creating resource

**Diagnosis**:
- Check AWS service limits
- Verify IAM permissions
- Review CloudWatch logs

**Solution**:
```bash
# Get detailed error output (debug logging is enabled via TF_LOG)
TF_LOG=DEBUG terraform apply 2>&1 | tee terraform-debug.log

# Check specific resource
terraform state show [resource_address]

# Force recreation if needed (-replace supersedes the deprecated `terraform taint`)
terraform apply -replace="[resource_address]"
```

### Issue: Drift Detection
**Symptoms**: Manual changes made outside Terraform

**Detection**:
```bash
# Run plan to detect drift
terraform plan -refresh-only

# Refresh state to match reality
terraform apply -refresh-only
```

**Resolution**:
- Option 1: Update Terraform to match reality
- Option 2: Revert manual changes to match Terraform

### Issue: Dependency Errors
**Symptoms**: "depends on resource that doesn't exist"

**Solution**:
```bash
# Refresh state (-refresh-only supersedes the deprecated `terraform refresh`)
terraform apply -refresh-only

# Re-import resource if needed
terraform import [resource_address] [resource_id]
```

## Disaster Recovery

### Complete Infrastructure Loss
**Recovery Steps**:

1. **Retrieve State File**:
   ```bash
   # State is in S3 backend
   aws s3 cp s3://[bucket]/[key] terraform.tfstate
   ```
2. **Verify State**:
   ```bash
   terraform state list
   ```
3. **Rebuild**:
   ```bash
   terraform apply
   ```

### State File Corruption
**Recovery**:
```bash
# Retrieve previous version from S3
aws s3api list-object-versions \
  --bucket [bucket] \
  --prefix [key]

# Download specific version
aws s3api get-object \
  --bucket [bucket] \
  --key [key] \
  --version-id [version] \
  terraform.tfstate.backup
```

## Maintenance Windows

### Applying Updates
**Best Practices**:
- Always run during maintenance window
- Create backup before changes
- Test in non-prod environment first
- Have rollback plan ready

**Procedure**:
```bash
# 1. Create state backup
terraform state pull > backup-$(date +%Y%m%d-%H%M%S).tfstate

# 2. Plan changes
terraform plan -out=tfplan

# 3. Apply the saved plan (a saved plan is applied without an interactive prompt)
terraform apply tfplan

# 4. Verify
[verification commands]
```

### Rolling Back Changes
```bash
# Revert to previous state
# (the push is rejected if the backup has an older serial than the remote state;
#  -force overrides that check, use with caution)
terraform state push backup-[timestamp].tfstate

# Or revert code and re-apply
git revert [commit]
terraform apply
```

## Monitoring

### Key Metrics to Watch
- [Metric 1]: [What it means, threshold]
- [Metric 2]: [What it means, threshold]

### Accessing Logs
```bash
# CloudWatch Logs
aws logs tail [log-group-name] --follow

# Application logs
[specific log access commands]
```

### Alerts
| Alert | Severity | Action |
|-------|----------|--------|
| [Alert name] | Critical | [What to do] |
| [Alert name] | Warning | [What to do] |

## Decommissioning

### Destroying Infrastructure
**WARNING: This is irreversible!**

```bash
# 1. Backup critical data first!
[backup commands]

# 2. Create final state backup
terraform state pull > final-backup-$(date +%Y%m%d).tfstate

# 3. Plan destroy
terraform plan -destroy

# 4. Destroy (with confirmation)
terraform destroy

# 5. Clean up backend
[cleanup commands]
```

## Contacts
| Role | Contact | Escalation |
|------|---------|------------|
| Primary | [name/email] | [phone] |
| Secondary | [name/email] | [phone] |
| Manager | [name/email] | [phone] |

## References
- Architecture Docs: ARCHITECTURE.md
- Module README: README.md
- Terraform Docs: https://terraform.io/docs
````
#### Quick Reference Format

````markdown
# [Infrastructure Name] - Quick Reference

## Terraform Basics

### Initialize
```bash
terraform init
```

### Plan
```bash
terraform plan
terraform plan -out=tfplan
terraform plan -var-file=prod.tfvars
```

### Apply
```bash
terraform apply
terraform apply tfplan
terraform apply -auto-approve  # Use with caution!
```

### Destroy
```bash
terraform destroy
terraform destroy -target=[resource_address]
```

## Common Commands

### State Management
```bash
# List resources
terraform state list

# Show resource details
terraform state show [resource_address]

# Move resource
terraform state mv [source] [destination]

# Remove resource from state
terraform state rm [resource_address]

# Import existing resource
terraform import [resource_address] [resource_id]
```

### Outputs
```bash
# Show all outputs
terraform output

# Show specific output
terraform output [output_name]

# JSON format
terraform output -json
```

### Workspace Management
```bash
# List workspaces
terraform workspace list

# Create workspace
terraform workspace new [name]

# Switch workspace
terraform workspace select [name]

# Delete workspace
terraform workspace delete [name]
```

## Key Resources

### [Resource Category 1]
| Resource | Address | Purpose |
|----------|---------|---------|
| [Name] | [resource_address] | [What it does] |

### [Resource Category 2]
| Resource | Address | Purpose |
|----------|---------|---------|
| [Name] | [resource_address] | [What it does] |

## Variables

### Required Variables
| Variable | Type | Description | Example |
|----------|------|-------------|---------|
| [var1] | string | [description] | "value" |
| [var2] | number | [description] | 10 |

### Optional Variables
| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| [var1] | string | "default" | [description] |
| [var2] | bool | true | [description] |

## Outputs
| Output | Description | Usage |
|--------|-------------|-------|
| [output1] | [description] | [how to use it] |
| [output2] | [description] | [how to use it] |

## File Structure

```text
.
├── main.tf              # Main configuration
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── providers.tf         # Provider configs
├── terraform.tfvars     # Variable values (gitignored)
├── versions.tf          # Version constraints
└── modules/
    └── [module-name]/
        ├── main.tf
        ├── variables.tf
        └── outputs.tf
```

## Useful Snippets

### Get Resource ARN
```bash
terraform state show [resource_address] | grep arn
```

### Find Resources by Type
```bash
terraform state list | grep [resource_type]
```

### Format All Files
```bash
terraform fmt -recursive
```

### Validate Configuration
```bash
terraform validate
```

### Check for Drift
```bash
terraform plan -refresh-only
```

### Target Specific Resource
```bash
terraform apply -target=[resource_address]
```

## Troubleshooting Quick Fixes

### State Lock Issue
```bash
terraform force-unlock [lock-id]
```

### Refresh State
```bash
# -refresh-only supersedes the deprecated `terraform refresh`
terraform apply -refresh-only
```

### Force Recreate a Resource
```bash
# -replace supersedes the deprecated `terraform taint`
terraform apply -replace="[resource_address]"
```

### Debug Mode
```bash
TF_LOG=DEBUG terraform apply 2>&1 | tee debug.log
```

## AWS CLI Helpers

### Verify Resources Exist
```bash
# EC2 instances
aws ec2 describe-instances --filters "Name=tag:Name,Values=[name]"

# S3 buckets
aws s3 ls

# RDS databases
aws rds describe-db-instances

# Load balancers
aws elbv2 describe-load-balancers
```

### Get Resource IDs
```bash
# VPC ID
aws ec2 describe-vpcs --filters "Name=tag:Name,Values=[name]" --query 'Vpcs[0].VpcId'

# Subnet IDs
aws ec2 describe-subnets --filters "Name=vpc-id,Values=[vpc-id]" --query 'Subnets[*].SubnetId'
```

## Emergency Contacts
| Issue | Contact |
|-------|---------|
| Infrastructure down | [contact] |
| Security incident | [contact] |
| Cost spike | [contact] |

## Links
````
### Step 4: Enhance with Context
**Add Explanatory Content:**
Transform technical resource names into human-readable descriptions:
- `aws_instance.web` → "Web Server EC2 Instance"
- `aws_db_instance.main` → "Primary PostgreSQL Database"
- `aws_lb.main` → "Application Load Balancer"
**Explain Relationships:**
- "The ALB distributes traffic to the ECS tasks"
- "Security group allows inbound HTTPS from CloudFront only"
- "RDS database is accessible only from private subnets"
**Include Diagrams:**
Use ASCII art or Mermaid syntax for visual representation:
```mermaid
graph TD
    A[CloudFront] --> B[ALB]
    B --> C[ECS Service]
    C --> D[RDS Database]
    C --> E[ElastiCache]
```
**Add Context:**
- Why certain resources are configured the way they are
- Security implications of configurations
- Cost considerations
- Best practices being followed
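
To illustrate the diagram step above, a Mermaid definition can be generated from a plain list of component relationships. This is a minimal sketch under stated assumptions: the input is one whitespace-separated `source target` pair per line, the node names are simple identifiers without spaces, and the function name `edges_to_mermaid` is hypothetical. How the pairs are obtained (for example by reading `terraform graph` output) is left out here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn "source target" pairs (one per line on stdin)
# into a Mermaid flowchart definition.
edges_to_mermaid() {
  echo "graph TD"
  while read -r from to; do
    # Skip blank or incomplete lines.
    [ -n "$from" ] && [ -n "$to" ] || continue
    printf '    %s --> %s\n' "$from" "$to"
  done
}
```

For instance, `printf 'CloudFront ALB\nALB ECS\n' | edges_to_mermaid` emits a `graph TD` header followed by one arrow line per pair, ready to paste into a `mermaid` code block.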
## Examples
### Example 1: Simple Module to README
**Input (Terraform Module):**
```hcl
# variables.tf
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "instance_count" {
  description = "Number of instances"
  type        = number
  default     = 2
}

# main.tf
resource "aws_instance" "web" {
  count         = var.instance_count
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  tags = {
    Name        = "web-${var.environment}-${count.index}"
    Environment = var.environment
  }
}

# outputs.tf
output "instance_ids" {
  description = "IDs of created instances"
  value       = aws_instance.web[*].id
}
```

**Output (Module README):**

````markdown
# Web Server Module

## Overview
Creates multiple EC2 instances for web server hosting. Designed for flexibility across environments with configurable instance counts.

## Usage

### Basic Example
```hcl
module "web_servers" {
  source = "./modules/web-server"

  environment = "production"
}
```

### Custom Instance Count
```hcl
module "web_servers" {
  source = "./modules/web-server"

  environment    = "staging"
  instance_count = 3
}
```

## Requirements
| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| aws | >= 4.0 |

## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| environment | Environment name (used for tagging) | string | n/a | yes |
| instance_count | Number of web server instances to create | number | 2 | no |

## Outputs
| Name | Description |
|------|-------------|
| instance_ids | List of EC2 instance IDs created by this module |

## Resources Created
- **EC2 Instances**: t3.micro Ubuntu instances, count based on `instance_count` variable
- **Tags**: Automatic environment and name tagging for resource management

## Cost Considerations
- Each t3.micro instance costs approximately $0.0104/hour (~$7.50/month)
- Default configuration (2 instances) = ~$15/month
````
### Example 2: Infrastructure to Architecture Docs
**Input (Terraform Configuration):**
```hcl
# VPC and networking
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# Application load balancer
resource "aws_lb" "main" {
  name               = "app-lb"
  load_balancer_type = "application"
  subnets            = aws_subnet.public[*].id
}

# ECS cluster
resource "aws_ecs_cluster" "main" {
  name = "app-cluster"
}

resource "aws_ecs_service" "api" {
  name            = "api-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 2

  load_balancer {
    target_group_arn = aws_lb_target_group.api.arn
    container_name   = "api"
    container_port   = 8080
  }
}

# RDS database
resource "aws_db_instance" "main" {
  engine            = "postgres"
  instance_class    = "db.t3.small"
  allocated_storage = 20
}
```

**Output (Architecture Documentation):**

````markdown
# Application Infrastructure - Architecture

## Overview
Containerized web application running on AWS ECS with PostgreSQL database backend. Designed for high availability across multiple availability zones with automatic load balancing.

## Architecture Diagram

```text
         Internet
             │
             ▼
    ┌─────────────────┐
    │   Application   │
    │  Load Balancer  │
    │ (Public Subnets)│
    └────────┬────────┘
             │
        ┌────┴─────┐
        │          │
        ▼          ▼
    ┌────────┐ ┌────────┐
    │  ECS   │ │  ECS   │
    │ Task 1 │ │ Task 2 │
    │ (API)  │ │ (API)  │
    └───┬────┘ └───┬────┘
        │          │
        └────┬─────┘
             │
             ▼
        ┌──────────┐
        │PostgreSQL│
        │   RDS    │
        └──────────┘
```

## Components

### Networking Layer
**VPC Configuration**
- **CIDR Block**: 10.0.0.0/16
- **Public Subnets**: 2 subnets across different AZs (10.0.0.0/24, 10.0.1.0/24)
- **High Availability**: Multi-AZ deployment for redundancy

### Load Balancing
**Application Load Balancer**
- **Type**: Application Load Balancer (Layer 7)
- **Placement**: Public subnets across 2 availability zones
- **Purpose**: Distributes incoming traffic across ECS tasks
- **Health Checks**: Monitors task health and routes only to healthy instances

### Compute Layer
**ECS Cluster**
- **Container Orchestration**: AWS ECS manages container lifecycle
- **API Service**: 2 task instances running continuously
- **Container Port**: 8080
- **Scaling**: Can be configured for auto-scaling based on demand

### Database Layer
**PostgreSQL RDS**
- **Engine**: PostgreSQL
- **Instance Type**: db.t3.small (2 vCPU, 2 GB RAM)
- **Storage**: 20 GB SSD
- **Accessibility**: Private access from ECS tasks only

## Traffic Flow
1. User request arrives at Application Load Balancer
2. ALB selects a target from the ECS tasks that are currently passing health checks
3. Request is routed to that healthy ECS task
4. API container processes request
5. If data is needed, task queries PostgreSQL RDS
6. Response returns through ALB to user

## High Availability
- **Multi-AZ Deployment**: Resources spread across 2 availability zones
- **Load Balancing**: ALB automatically routes around failed tasks
- **Task Redundancy**: 2 API tasks running simultaneously
- **Database**: RDS with automated backups (can enable Multi-AZ)

## Security
- **Network Isolation**: Database not directly accessible from internet
- **Security Groups**: (To be documented based on security group configs)
- **IAM Roles**: ECS tasks use IAM roles for AWS service access

## Cost Breakdown
| Resource | Estimated Monthly Cost |
|----------|----------------------|
| ALB | ~$16 |
| ECS Tasks (2 × t3.small) | ~$30 |
| RDS db.t3.small | ~$25 |
| **Total** | **~$71/month** |

## Scalability
- **Horizontal Scaling**: Increase `desired_count` in ECS service
- **Database Scaling**: Upgrade instance class or enable read replicas
- **Load Balancer**: Automatically scales to handle traffic
````
### Example 3: Complex Setup to Runbook

**Input (Production Infrastructure):**

```hcl
terraform {
  backend "s3" {
    bucket = "company-terraform-state"
    key    = "prod/infrastructure.tfstate"
    region = "us-east-1"
  }
}

# Multi-tier web application with autoscaling
```

**Output (Operational Runbook):**

See the full Runbook format above for the complete structure. Key sections would include:
- Backend configuration details
- Step-by-step deployment procedures
- Common operational tasks
- Troubleshooting guides specific to the resources
- Emergency contacts and escalation procedures
## Best Practices

### Documentation Clarity
- **Use Plain Language**: Avoid unnecessarily complex terminology
- **Be Specific**: Include actual values, not just placeholders
- **Provide Context**: Explain the "why", not just the "what"
- **Include Examples**: Real, working code examples
- **Keep Updated**: Mark documentation date/version

### Technical Accuracy
- **Extract from Code**: Don't guess at variable types or defaults
- **Verify Commands**: Test all command examples
- **Check Versions**: Include accurate version requirements
- **Validate Links**: Ensure all references point to correct locations
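
One way to follow the "extract from code" guidance is to derive the inputs table directly from the variable declarations instead of typing it by hand. The sketch below is not a full HCL parser and is only an illustration: it assumes `terraform fmt`-style layout with one single-line attribute per line and each variable block closed by `}` at the start of a line, and the function name `tf_inputs_table` is hypothetical. A dedicated tool such as terraform-docs is likely more robust for real modules.

```bash
#!/usr/bin/env bash
# Hypothetical helper: render a Markdown inputs table from Terraform files.
# Assumptions: one attribute per line, single-line values, and every
# variable block closed by a brace at column 0.
tf_inputs_table() {
  awk '
    # Return the text after the first "=" on an attribute line.
    function value_of(line) {
      sub(/^[^=]*=[ \t]*/, "", line)
      return line
    }
    BEGIN {
      print "| Name | Description | Type | Default | Required |"
      print "|------|-------------|------|---------|:--------:|"
    }
    /^variable[ \t]/ {
      name = $2; gsub(/"/, "", name)
      desc = ""; type = "any"; def = "n/a"; required = "yes"
      next
    }
    name != "" && /^[ \t]*description[ \t]*=/ { desc = value_of($0); gsub(/"/, "", desc) }
    name != "" && /^[ \t]*type[ \t]*=/        { type = value_of($0) }
    # A variable with a default is optional for callers.
    name != "" && /^[ \t]*default[ \t]*=/     { def = value_of($0); required = "no" }
    name != "" && /^[}]/ {
      printf "| %s | %s | %s | %s | %s |\n", name, desc, type, def, required
      name = ""
    }
  ' "$@"
}
```

Running `tf_inputs_table variables.tf` against the variables from Example 1 would yield one row per variable, with `environment` marked as required and `instance_count` showing its default of `2`.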
### Audience Awareness

**For Developers:**
- Focus on module usage and API
- Include code examples
- Explain inputs/outputs clearly

**For Operations:**
- Focus on deployment and troubleshooting
- Include runbooks and commands
- Provide escalation procedures

**For Architects:**
- Focus on design decisions
- Include diagrams
- Explain trade-offs and alternatives

### Formatting Standards
- **Consistent Structure**: Use templates for similar docs
- **Code Blocks**: Always specify language for syntax highlighting
- **Tables**: Use for structured data (variables, outputs)
- **Diagrams**: Include visual representations
- **Links**: Cross-reference related documentation
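
The code block rule above lends itself to an automated check. The following is a minimal sketch, assuming fences are plain triple backticks that are not nested inside longer fences; the function name `find_untagged_fences` is hypothetical. It reports each opening fence that has no language tag and returns a non-zero status if any are found.

```bash
#!/usr/bin/env bash
# Hypothetical check: report opening code fences that lack a language tag.
# Assumption: fences are triple backticks and are never nested.
find_untagged_fences() {
  awk '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }   # three backticks
    {
      line = $0
      sub(/^[ \t]+/, "", line)                  # ignore indentation
      if (index(line, fence) != 1) next         # not a fence line
      if (in_block) { in_block = 0; next }      # closing fence
      in_block = 1                              # opening fence
      info = substr(line, 4)                    # text after the backticks
      gsub(/[ \t]/, "", info)
      if (info == "") {
        printf "%s:%d: code fence without a language\n", FILENAME, FNR
        missing++
      }
    }
    END { exit (missing > 0) }
  ' "$@"
}
```

Used as `find_untagged_fences README.md || echo "fix the fences listed above"`, it can gate generated documentation before it is published.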
## Additional Resources
See the templates directory for:
See the examples directory for:
See the scripts directory for:
- terraform-docs integration - Automated documentation generation
- Validation script - Check documentation completeness
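
A completeness check of the kind listed above could look like the following. This is only a sketch of what such a script might do, not the contents of the scripts directory: the function name `check_readme_sections` is hypothetical, and the list of required sections is an assumption drawn from the Module README format in this document.

```bash
#!/usr/bin/env bash
# Hypothetical completeness check for a generated module README.
# The required section list is an assumption based on the README format above.
check_readme_sections() {
  local file="$1" missing=0 section
  for section in "Overview" "Usage" "Requirements" "Inputs" "Outputs"; do
    # A section counts as present if any heading level carries its exact title.
    if ! grep -qE "^#+ ${section}[[:space:]]*$" "$file"; then
      echo "missing section: ${section}"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_readme_sections README.md` prints one line per absent section and returns a non-zero status when anything is missing, so it can run as a step in CI.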