Terraform Documentation Generator Skill

Overview

Transform Terraform code into comprehensive, user-friendly documentation. This skill analyzes Terraform configurations and generates formatted documentation including module READMEs, architecture overviews, operational runbooks, and quick reference guides.

Quick Start

Provide Terraform files or point to a Terraform module/configuration, then specify the documentation type needed:

"Document this Terraform module"
"Create architecture docs for these Terraform files"
"Generate a runbook for this infrastructure"
"Create a quick reference for this Terraform configuration"

The skill will analyze the code and produce formatted documentation appropriate for the target audience.

Detailed Instructions

Step 1: Analyze Terraform Code

Parse and understand the Terraform configuration:

Identify Key Components:

Resources - What infrastructure is being created (aws_instance, azurerm_storage_account, etc.)
Variables - Input parameters, their types, defaults, and descriptions
Outputs - Values exported by the module
Providers - Cloud providers and versions required
Data sources - External data being referenced
Locals - Internal computed values
Modules - Child modules being called
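
As a rough illustration of this inventory step, the sketch below lists the top-level blocks found in a set of `.tf` files. The function name `tf_inventory` is hypothetical, and the approach (plain `awk` pattern matching instead of a real HCL parser) is an assumption made for brevity; it only recognizes block headers that start a line.

```shell
# tf_inventory: list the top-level blocks declared in the given .tf files.
# Hypothetical helper. It only matches block headers that start a line, so it
# assumes conventionally formatted code (for example after `terraform fmt`).
tf_inventory() {
  awk '
    /^(resource|data)[ \t]+"/ {
      # Two labels: type and name, e.g. resource "aws_instance" "web" {
      type = $2; name = $3
      gsub(/["{]/, "", type); gsub(/["{]/, "", name)
      print $1 "\t" type "." name
      next
    }
    /^(variable|output|module|provider)[ \t]+"/ {
      # One label: the block name
      name = $2
      gsub(/["{]/, "", name)
      print $1 "\t" name
      next
    }
    /^locals[ \t]*[{]/ { print "locals\t-" }
  ' "$@"
}
```

Running `tf_inventory *.tf` in a module directory would print one tab-separated line per block, which can then be grouped by kind when filling in the templates in Step 3.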

Extract Metadata:

Required provider versions
Terraform version constraints
Backend configuration
Dependencies between resources
Security configurations (IAM, security groups, policies)
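
Part of this metadata can be pulled out mechanically. The following is a minimal sketch, assuming single-line `required_version` and `backend "..."` declarations; the function name `tf_metadata` is hypothetical, and provider versions, dependencies, and security settings are deliberately left out.

```shell
# tf_metadata: print the Terraform version constraint and the backend type.
# Hypothetical helper; it reads single-line declarations only.
tf_metadata() {
  awk '
    /^[ \t]*required_version[ \t]*=/ {
      line = $0
      sub(/^[^"]*"/, "", line)   # drop everything up to the opening quote
      sub(/".*$/, "", line)      # drop the closing quote and anything after
      print "terraform_version\t" line
    }
    /^[ \t]*backend[ \t]+"/ {
      name = $2
      gsub(/["{]/, "", name)     # strip quotes and a trailing brace
      print "backend\t" name
    }
  ' "$@"
}
```

For a configuration that declares `required_version = ">= 1.0"` and `backend "s3"`, this would report the constraint `>= 1.0` and the backend type `s3`, which map onto the "Terraform Version" and "Backend" fields of the templates.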

Understand Purpose:

What problem does this infrastructure solve?
What services/resources are the core components?
How do components interact?
What are the security and networking patterns?

Step 2: Choose Documentation Format

Select the appropriate documentation type based on user request:

Module README

For: Terraform modules that will be reused
Audience: Developers using the module
Focus: Inputs, outputs, usage examples

Architecture Documentation

For: Complete infrastructure setups
Audience: Technical stakeholders, architects
Focus: High-level design, component relationships, diagrams

Runbook

For: Operational infrastructure
Audience: DevOps, SRE, operations teams
Focus: Deployment, troubleshooting, maintenance

Quick Reference

For: Complex configurations
Audience: Team members working with the code
Focus: Key resources, variables, common tasks
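
The selection among these four formats can be pictured as a simple keyword match on the request. The sketch below is illustrative only: the function name `doc_type_for`, the keyword list, and the returned labels are assumptions, not rules defined by this skill.

```shell
# doc_type_for: map a free-form request to one of the four documentation types.
# The keywords and labels are illustrative assumptions, not an exhaustive rule set.
doc_type_for() {
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *runbook*|*troubleshoot*|*operations*) echo "runbook" ;;
    *architecture*|*diagram*)              echo "architecture" ;;
    *"quick reference"*|*cheat*)           echo "quick-reference" ;;
    *module*|*readme*)                     echo "module-readme" ;;
    *)                                     echo "unknown" ;;
  esac
}
```

With these patterns, the four Quick Start prompts map to `module-readme`, `architecture`, `runbook`, and `quick-reference` respectively; anything unmatched returns `unknown`, in which case asking the user is the safer choice.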

Step 3: Generate Documentation

Module README Format

# [Module Name]

## Overview
[Brief description of what this module creates and why]

## Architecture
[Simple diagram or description of resources created]

## Usage

### Basic Example
```hcl
module "example" {
  source = "./modules/[module-name]"

  # Required variables
  [var1] = "[value1]"
  [var2] = "[value2]"
}

Complete Example

[Full working example with all options]

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| aws | >= 4.0 |

Providers

| Name | Version |
|------|---------|
| aws | >= 4.0 |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| [var1] | [description] | string | "default" | no |
| [var2] | [description] | number | n/a | yes |

Outputs

| Name | Description |
|------|-------------|
| [output1] | [description] |
| [output2] | [description] |

Resources Created

[resource_type]: [description of what it does]
[resource_type]: [description of what it does]

Security Considerations

[Security feature 1]
[Security feature 2]

Cost Considerations

[Notes about what will incur costs]

Examples

See the examples directory for more usage examples.

License

[License information]


#### Architecture Documentation Format

```markdown
# [Infrastructure Name] - Architecture Documentation

## Overview

[High-level description of the infrastructure]

### Purpose
[What problem this infrastructure solves]

### Key Components
- **[Component 1]**: [Role and purpose]
- **[Component 2]**: [Role and purpose]
- **[Component 3]**: [Role and purpose]

## Architecture Diagram

[ASCII diagram or mermaid diagram showing component relationships]

Example:

┌─────────────────┐
│   CloudFront    │
└────────┬────────┘
         │
    ┌────▼─────┐
    │   ALB    │
    └────┬─────┘
         │
  ┌──────┴──────┐
  │     ECS     │
  │   Cluster   │
  └──────┬──────┘
         │
    ┌────▼─────┐
    │   RDS    │
    └──────────┘


## Components

### [Component 1 Name]

**Type**: [e.g., Load Balancer, Database, Compute]

**Purpose**: [What this component does]

**Configuration Highlights**:
- [Key setting 1]
- [Key setting 2]

**Resources**:
- `[resource_name]` - [description]

**Dependencies**:
- Depends on: [other components]
- Used by: [other components]

### [Component 2 Name]
[Repeat structure]

## Networking

### VPC Architecture
- **CIDR Block**: [CIDR]
- **Subnets**: [Public/Private configuration]
- **Availability Zones**: [AZ strategy]

### Security Groups
- **[SG Name]**: [Purpose and rules]

### Network Flow
[Description of how traffic flows through the architecture]

## Security

### IAM Roles and Policies
- **[Role Name]**: [Purpose and permissions]

### Encryption
- **At Rest**: [What's encrypted and how]
- **In Transit**: [TLS/SSL configuration]

### Access Control
[How access is controlled]

## High Availability & Disaster Recovery

### HA Strategy
[How high availability is achieved]

### Backup Strategy
[What's backed up and how]

### Disaster Recovery
- **RTO**: [Recovery Time Objective]
- **RPO**: [Recovery Point Objective]

## Monitoring & Logging

### CloudWatch Metrics
- [Key metrics being tracked]

### Logging
- [What's being logged and where]

### Alarms
- [Critical alarms configured]

## Cost Optimization

### Cost Breakdown
- [Major cost drivers]

### Optimization Opportunities
- [Potential cost savings]

## Deployment

See [RUNBOOK.md](RUNBOOK.md) for deployment instructions.

## Terraform Configuration

- **Terraform Version**: [version]
- **Providers**: [list]
- **Backend**: [backend type and configuration]

## File Structure

terraform/
├── main.tf          # Main resource definitions
├── variables.tf     # Input variables
├── outputs.tf       # Output values
├── providers.tf     # Provider configurations
└── modules/         # Child modules


## Maintenance

### Updates
[How to update the infrastructure]

### Scaling
[How to scale resources]

## Support

For issues or questions, contact [team/email].

Runbook Format

# [Infrastructure Name] - Operations Runbook

## Quick Reference

| Item | Value |
|------|-------|
| AWS Account | [account-id] |
| Region | [region] |
| Environment | [prod/staging/dev] |
| Terraform Backend | [S3 bucket/location] |

## Prerequisites

- Terraform >= [version]
- AWS CLI configured with appropriate credentials
- Required permissions: [list]

## Initial Deployment

### Step 1: Configure Backend

```bash
# Initialize Terraform backend
terraform init \
  -backend-config="bucket=[bucket-name]" \
  -backend-config="key=[state-key]" \
  -backend-config="region=[region]"

Step 2: Review Configuration

# Review variables
cat terraform.tfvars

# Validate configuration
terraform validate

# Plan deployment
terraform plan -out=tfplan

Step 3: Deploy

# Apply the plan
terraform apply tfplan

# Verify deployment
terraform output

Daily Operations

Viewing Current State

# Show current state
terraform show

# List all resources
terraform state list

# Get specific resource details
terraform state show [resource_address]

Making Changes

# 1. Update .tf files or variables
# 2. Plan changes
terraform plan -out=tfplan

# 3. Review the plan carefully
# 4. Apply changes
terraform apply tfplan

Checking Outputs

# View all outputs
terraform output

# Get specific output
terraform output [output_name]

Common Tasks

Adding a New Environment

Copy terraform.tfvars.example to [env].tfvars
Update variables for the new environment
Deploy: terraform apply -var-file=[env].tfvars

Scaling Resources

Scale Up/Down [Resource Type]:

# Update the variable
terraform apply -var="[instance_count]=5"

# Or update terraform.tfvars and apply
terraform apply

Updating Security Groups

# 1. Update security group rules in main.tf
# 2. Plan to see impact
terraform plan

# 3. Apply changes
terraform apply

Rotating Credentials

Database Credentials:

# 1. Update secrets in AWS Secrets Manager
# 2. Force resource update
terraform apply -replace="[resource_address]"

Backup and Recovery

Create Manual Backup:

# For RDS
aws rds create-db-snapshot \
  --db-instance-identifier [db-name] \
  --db-snapshot-identifier [snapshot-name]

Restore from Backup:

[Step-by-step restoration procedure]

Troubleshooting

Issue: Terraform State Lock

Symptoms: "Error acquiring the state lock"

Solution:

# Check who has the lock (in DynamoDB)
aws dynamodb get-item \
  --table-name [lock-table] \
  --key '{"LockID":{"S":"[state-path]"}}'

# Force unlock (use with caution!)
terraform force-unlock [lock-id]

Issue: Resource Creation Fails

Symptoms: Apply fails with error creating resource

Diagnosis:

Check AWS service limits
Verify IAM permissions
Review CloudWatch logs

Solution:

# Get detailed error output by enabling debug logging
TF_LOG=DEBUG terraform apply 2>&1 | tee terraform-debug.log

# Check specific resource
terraform state show [resource_address]

# Force recreation if needed (terraform taint is deprecated in favor of -replace)
terraform apply -replace="[resource_address]"

Issue: Drift Detection

Symptoms: Manual changes made outside Terraform

Detection:

# Run plan to detect drift
terraform plan -refresh-only

# Refresh state to match reality
terraform apply -refresh-only

Resolution:

Option 1: Update Terraform to match reality
Option 2: Revert manual changes to match Terraform

Issue: Dependency Errors

Symptoms: "depends on resource that doesn't exist"

Solution:

# Refresh state (terraform refresh is deprecated in favor of -refresh-only)
terraform apply -refresh-only

# Re-import resource if needed
terraform import [resource_address] [resource_id]

Disaster Recovery

Complete Infrastructure Loss

Recovery Steps:

Retrieve State File:

# State is in S3 backend
aws s3 cp s3://[bucket]/[key] terraform.tfstate

Verify State:
```
terraform state list
```
Rebuild:
```
terraform apply
```

State File Corruption

Recovery:

# Retrieve previous version from S3
aws s3api list-object-versions \
  --bucket [bucket] \
  --prefix [key]

# Download specific version
aws s3api get-object \
  --bucket [bucket] \
  --key [key] \
  --version-id [version] \
  terraform.tfstate.backup

Maintenance Windows

Applying Updates

Best Practices:

Always run during maintenance window
Create backup before changes
Test in non-prod environment first
Have rollback plan ready

Procedure:

# 1. Create state backup
terraform state pull > backup-$(date +%Y%m%d-%H%M%S).tfstate

# 2. Plan changes
terraform plan -out=tfplan

# 3. Apply the saved plan (a plan file is applied without an interactive approval prompt)
terraform apply tfplan

# 4. Verify
[verification commands]

Rolling Back Changes

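# Restore the previous state snapshot (this changes state only, not the real resources)
# Terraform may reject a snapshot with an older serial unless -force is given; use with caution
terraform state push backup-[timestamp].tfstate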

# Or revert code and re-apply
git revert [commit]
terraform apply

Monitoring

Key Metrics to Watch

[Metric 1]: [What it means, threshold]
[Metric 2]: [What it means, threshold]

Accessing Logs

# CloudWatch Logs
aws logs tail [log-group-name] --follow

# Application logs
[specific log access commands]

Alerts

| Alert | Severity | Action |
|-------|----------|--------|
| [Alert name] | Critical | [What to do] |
| [Alert name] | Warning | [What to do] |

Decommissioning

Destroying Infrastructure

WARNING: This is irreversible!

# 1. Backup critical data first!
[backup commands]

# 2. Create final state backup
terraform state pull > final-backup-$(date +%Y%m%d).tfstate

# 3. Plan destroy
terraform plan -destroy

# 4. Destroy (with confirmation)
terraform destroy

# 5. Clean up backend
[cleanup commands]

Contacts

| Role | Contact | Escalation |
|------|---------|------------|
| Primary | [name/email] | [phone] |
| Secondary | [name/email] | [phone] |
| Manager | [name/email] | [phone] |

References

Architecture Docs: ARCHITECTURE.md
Module README: README.md
Terraform Docs: https://terraform.io/docs


#### Quick Reference Format

```markdown
# [Infrastructure Name] - Quick Reference

## Terraform Basics

### Initialize
```bash
terraform init

Plan

terraform plan
terraform plan -out=tfplan
terraform plan -var-file=prod.tfvars

Apply

terraform apply
terraform apply tfplan
terraform apply -auto-approve  # Use with caution!

Destroy

terraform destroy
terraform destroy -target=[resource_address]

Common Commands

State Management

# List resources
terraform state list

# Show resource details
terraform state show [resource_address]

# Move resource
terraform state mv [source] [destination]

# Remove resource from state
terraform state rm [resource_address]

# Import existing resource
terraform import [resource_address] [resource_id]

Outputs

# Show all outputs
terraform output

# Show specific output
terraform output [output_name]

# JSON format
terraform output -json

Workspace Management

# List workspaces
terraform workspace list

# Create workspace
terraform workspace new [name]

# Switch workspace
terraform workspace select [name]

# Delete workspace
terraform workspace delete [name]

Key Resources

[Resource Category 1]

| Resource | Address | Purpose |
|----------|---------|---------|
| [Name] | [resource_address] | [What it does] |

[Resource Category 2]

| Resource | Address | Purpose |
|----------|---------|---------|
| [Name] | [resource_address] | [What it does] |

Variables

Required Variables

| Variable | Type | Description | Example |
|----------|------|-------------|---------|
| [var1] | string | [description] | "value" |
| [var2] | number | [description] | 10 |

Optional Variables

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| [var1] | string | "default" | [description] |
| [var2] | bool | true | [description] |

Outputs

| Output | Description | Usage |
|--------|-------------|-------|
| [output1] | [description] | [how to use it] |
| [output2] | [description] | [how to use it] |

File Structure

.
├── main.tf              # Main configuration
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── providers.tf         # Provider configs
├── terraform.tfvars     # Variable values (gitignored)
├── versions.tf          # Version constraints
└── modules/
    └── [module-name]/
        ├── main.tf
        ├── variables.tf
        └── outputs.tf

Useful Snippets

Get Resource ARN

terraform state show [resource_address] | grep arn

Find Resources by Type

terraform state list | grep [resource_type]

Format All Files

terraform fmt -recursive

Validate Configuration

terraform validate

Check for Drift

terraform plan -refresh-only

Target Specific Resource

terraform apply -target=[resource_address]

Troubleshooting Quick Fixes

State Lock Issue

terraform force-unlock [lock-id]

Refresh State

terraform apply -refresh-only  # terraform refresh is deprecated

Force Resource Recreation (terraform taint is deprecated)

terraform apply -replace="[resource_address]"

Debug Mode

TF_LOG=DEBUG terraform apply 2>&1 | tee debug.log

AWS CLI Helpers

Verify Resources Exist

# EC2 instances
aws ec2 describe-instances --filters "Name=tag:Name,Values=[name]"

# S3 buckets
aws s3 ls

# RDS databases
aws rds describe-db-instances

# Load balancers
aws elbv2 describe-load-balancers

Get Resource IDs

# VPC ID
aws ec2 describe-vpcs --filters "Name=tag:Name,Values=[name]" --query 'Vpcs[0].VpcId'

# Subnet IDs
aws ec2 describe-subnets --filters "Name=vpc-id,Values=[vpc-id]" --query 'Subnets[*].SubnetId'

Emergency Contacts

| Issue | Contact |
|-------|---------|
| Infrastructure down | [contact] |
| Security incident | [contact] |
| Cost spike | [contact] |

Links


### Step 4: Enhance with Context

**Add Explanatory Content:**

Transform technical resource names into human-readable descriptions:
- `aws_instance.web` → "Web Server EC2 Instance"
- `aws_db_instance.main` → "Primary PostgreSQL Database"
- `aws_lb.main` → "Application Load Balancer"

**Explain Relationships:**
- "The ALB distributes traffic to the ECS tasks"
- "Security group allows inbound HTTPS from CloudFront only"
- "RDS database is accessible only from private subnets"

**Include Diagrams:**

Use ASCII art or Mermaid syntax for visual representation:

graph TD
    A[CloudFront] --> B[ALB]
    B --> C[ECS Service]
    C --> D[RDS Database]
    C --> E[ElastiCache]


**Add Context:**
- Why certain resources are configured the way they are
- Security implications of configurations
- Cost considerations
- Best practices being followed

## Examples

### Example 1: Simple Module to README

**Input (Terraform Module):**

```hcl
# variables.tf
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "instance_count" {
  description = "Number of instances"
  type        = number
  default     = 2
}

# main.tf
resource "aws_instance" "web" {
  count         = var.instance_count
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  tags = {
    Name        = "web-${var.environment}-${count.index}"
    Environment = var.environment
  }
}

# outputs.tf
output "instance_ids" {
  description = "IDs of created instances"
  value       = aws_instance.web[*].id
}

Output (Module README):

# Web Server Module

## Overview

Creates multiple EC2 instances for web server hosting. Designed for flexibility across environments with configurable instance counts.

## Usage

### Basic Example

```hcl
module "web_servers" {
  source = "./modules/web-server"

  environment = "production"
}

Custom Instance Count

module "web_servers" {
  source = "./modules/web-server"

  environment    = "staging"
  instance_count = 3
}

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| aws | >= 4.0 |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| environment | Environment name (used for tagging) | string | n/a | yes |
| instance_count | Number of web server instances to create | number | 2 | no |

Outputs

| Name | Description |
|------|-------------|
| instance_ids | List of EC2 instance IDs created by this module |

Resources Created

EC2 Instances: t3.micro Ubuntu instances, count based on instance_count variable
Tags: Automatic environment and name tagging for resource management

Cost Considerations

Each t3.micro instance costs approximately $0.0104/hour (~$7.50/month)
Default configuration (2 instances) = ~$15/month


### Example 2: Infrastructure to Architecture Docs

**Input (Terraform Configuration):**

```hcl
# VPC and networking
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# Application load balancer
resource "aws_lb" "main" {
  name               = "app-lb"
  load_balancer_type = "application"
  subnets            = aws_subnet.public[*].id
}

# ECS cluster
resource "aws_ecs_cluster" "main" {
  name = "app-cluster"
}

resource "aws_ecs_service" "api" {
  name            = "api-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 2

  load_balancer {
    target_group_arn = aws_lb_target_group.api.arn
    container_name   = "api"
    container_port   = 8080
  }
}

# RDS database
resource "aws_db_instance" "main" {
  engine            = "postgres"
  instance_class    = "db.t3.small"
  allocated_storage = 20
}

Output (Architecture Documentation):

# Application Infrastructure - Architecture

## Overview

Containerized web application running on AWS ECS with PostgreSQL database backend. Designed for high availability across multiple availability zones with automatic load balancing.

## Architecture Diagram

     Internet
         │
         ▼
┌─────────────────┐
│   Application   │
│  Load Balancer  │
│ (Public Subnets)│
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌────────┐ ┌────────┐
│  ECS   │ │  ECS   │
│ Task 1 │ │ Task 2 │
│ (API)  │ │ (API)  │
└───┬────┘ └───┬────┘
    │          │
    └────┬─────┘
         │
         ▼
    ┌──────────┐
    │PostgreSQL│
    │   RDS    │
    └──────────┘


## Components

### Networking Layer

**VPC Configuration**
- **CIDR Block**: 10.0.0.0/16
- **Public Subnets**: 2 subnets across different AZs (10.0.0.0/24, 10.0.1.0/24)
- **High Availability**: Multi-AZ deployment for redundancy

### Load Balancing

**Application Load Balancer**
- **Type**: Application Load Balancer (Layer 7)
- **Placement**: Public subnets across 2 availability zones
- **Purpose**: Distributes incoming traffic across ECS tasks
- **Health Checks**: Monitors task health and routes only to healthy instances

### Compute Layer

**ECS Cluster**
- **Container Orchestration**: AWS ECS manages container lifecycle
- **API Service**: 2 task instances running continuously
- **Container Port**: 8080
- **Scaling**: Can be configured for auto-scaling based on demand

### Database Layer

**PostgreSQL RDS**
- **Engine**: PostgreSQL
- **Instance Type**: db.t3.small (2 vCPU, 2 GB RAM)
- **Storage**: 20 GB SSD
- **Accessibility**: Private access from ECS tasks only

## Traffic Flow

1. User request arrives at Application Load Balancer
2. ALB selects a target from the ECS tasks that are passing its periodic health checks
3. Request is routed to that ECS task
4. API container processes request
5. If data is needed, task queries PostgreSQL RDS
6. Response returns through ALB to user

## High Availability

- **Multi-AZ Deployment**: Resources spread across 2 availability zones
- **Load Balancing**: ALB automatically routes around failed tasks
- **Task Redundancy**: 2 API tasks running simultaneously
- **Database**: RDS with automated backups (can enable Multi-AZ)

## Security

- **Network Isolation**: Database not directly accessible from internet
- **Security Groups**: (To be documented based on security group configs)
- **IAM Roles**: ECS tasks use IAM roles for AWS service access

## Cost Breakdown

| Resource | Estimated Monthly Cost |
|----------|----------------------|
| ALB | ~$16 |
| ECS Tasks (2 × t3.small) | ~$30 |
| RDS db.t3.small | ~$25 |
| **Total** | **~$71/month** |

## Scalability

- **Horizontal Scaling**: Increase `desired_count` in ECS service
- **Database Scaling**: Upgrade instance class or enable read replicas
- **Load Balancer**: Automatically scales to handle traffic

Example 3: Complex Setup to Runbook

Input (Production Infrastructure):

terraform {
  backend "s3" {
    bucket = "company-terraform-state"
    key    = "prod/infrastructure.tfstate"
    region = "us-east-1"
  }
}

# Multi-tier web application with autoscaling

Output (Operational Runbook):

See the full Runbook format above for the complete structure. Key sections would include:

Backend configuration details
Step-by-step deployment procedures
Common operational tasks
Troubleshooting guides specific to the resources
Emergency contacts and escalation procedures

Best Practices

Documentation Clarity

Use Plain Language: Avoid unnecessarily complex terminology
Be Specific: Include actual values, not just placeholders
Provide Context: Explain the "why", not just the "what"
Include Examples: Real, working code examples
Keep Updated: Mark documentation date/version
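
One way to act on "Be Specific" is to scan generated documentation for template placeholders that were never filled in. The sketch below is an assumption-laden example: the function name `find_placeholders` is hypothetical, and it treats any bracketed token that starts with a letter and is not followed by `(` as a leftover placeholder.

```shell
# find_placeholders: report lines that still contain [template] placeholders.
# Markdown links such as [RUNBOOK.md](RUNBOOK.md) are skipped by checking that
# the closing bracket is not followed by "(". Index expressions like [0] are
# ignored because the token must start with a letter.
find_placeholders() {
  awk '
    BEGIN { pat = "\\[[A-Za-z][A-Za-z0-9_ /.-]*\\]" }
    {
      rest = $0
      while (match(rest, pat)) {
        token = substr(rest, RSTART, RLENGTH)
        after = substr(rest, RSTART + RLENGTH, 1)
        if (after != "(") print FILENAME ":" FNR ": " token
        rest = substr(rest, RSTART + RLENGTH)
      }
    }
  ' "$@"
}
```

Each reported line has the form `file:line: [token]`, so a call such as `find_placeholders README.md RUNBOOK.md` gives a quick list of values that still need to be taken from the Terraform code.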

Technical Accuracy

Extract from Code: Don't guess at variable types or defaults
Verify Commands: Test all command examples
Check Versions: Include accurate version requirements
Validate Links: Ensure all references point to correct locations
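
To show what "Extract from Code" can look like in practice, here is a minimal sketch that turns `variable` blocks into rows of the Inputs table used in the Module README format. The function name `tf_inputs_table` is hypothetical; the sketch assumes single-line `description`, `type`, and `default` attributes and a closing `}` at the start of a line, so multi-line defaults or nested types would need a real HCL parser or a tool such as terraform-docs.

```shell
# tf_inputs_table: render variable blocks as rows of an Inputs table.
# Sketch only: single-line attributes, one closing brace per variable block.
tf_inputs_table() {
  awk '
    function value_of(line) {
      sub(/^[^=]*=[ \t]*/, "", line)   # keep what follows the first "="
      return line
    }
    function unquote(text) {
      gsub(/^"|"$/, "", text)          # strip surrounding double quotes
      return text
    }
    BEGIN {
      print "| Name | Description | Type | Default | Required |"
      print "|------|-------------|------|---------|:--------:|"
    }
    /^variable[ \t]+"/ {
      name = $2; gsub(/["{]/, "", name)
      # A variable without a type accepts any value; without a default it is required
      desc = ""; type = "any"; def = "n/a"; required = "yes"
      in_var = 1
      next
    }
    in_var && /^[ \t]+description[ \t]*=/ { desc = unquote(value_of($0)) }
    in_var && /^[ \t]+type[ \t]*=/        { type = value_of($0) }
    in_var && /^[ \t]+default[ \t]*=/     { def = value_of($0); required = "no" }
    in_var && /^[}]/ {
      printf "| %s | %s | %s | %s | %s |\n", name, desc, type, def, required
      in_var = 0
    }
  ' "$@"
}
```

Applied to the `variables.tf` from Example 1, this yields one row for `environment` (required, no default) and one for `instance_count` (default `2`), with types and defaults copied from the code instead of being guessed.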

Audience Awareness

For Developers:

Focus on module usage and API
Include code examples
Explain inputs/outputs clearly

For Operations:

Focus on deployment and troubleshooting
Include runbooks and commands
Provide escalation procedures

For Architects:

Focus on design decisions
Include diagrams
Explain trade-offs and alternatives

Formatting Standards

Consistent Structure: Use templates for similar docs
Code Blocks: Always specify language for syntax highlighting
Tables: Use for structured data (variables, outputs)
Diagrams: Include visual representations
Links: Cross-reference related documentation
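
The "Code Blocks" rule can be checked automatically. The sketch below, with the hypothetical function name `untagged_fences`, reports opening code fences that carry no language tag. It assumes fences are written as three backticks at the start of a line and are not nested.

```shell
# untagged_fences: print file:line for opening code fences without a language tag.
# Assumes three-backtick fences at the start of a line and no nested fences.
untagged_fences() {
  awk '
    /^```/ {
      if (!in_fence) {
        in_fence = 1
        # An opening fence followed only by whitespace has no language tag
        if ($0 ~ /^```[ \t]*$/) print FILENAME ":" FNR
      } else {
        in_fence = 0
      }
    }
  ' "$@"
}
```

An empty result for `untagged_fences README.md` suggests every code block names its language; any reported line is a fence to tag with `hcl`, `bash`, or another appropriate language.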

Additional Resources

See the templates directory for:

See the examples directory for:

See the scripts directory for:

terraform-docs integration - Automated documentation generation
Validation script - Check documentation completeness
