Agent Skills: Autoscaling Configuration

Configure autoscaling for Kubernetes, VMs, and serverless workloads based on metrics, schedules, and custom indicators.

Category: Uncategorized
ID: aj-geddes/useful-ai-prompts/autoscaling-configuration

Install this agent skill locally:

pnpm dlx add-skill https://github.com/aj-geddes/useful-ai-prompts/tree/HEAD/skills/autoscaling-configuration

skills/autoscaling-configuration/SKILL.md

Skill Metadata

Name: autoscaling-configuration

Autoscaling Configuration

Overview

Implement autoscaling strategies to automatically adjust resource capacity based on demand, ensuring cost efficiency while maintaining performance and availability.

When to Use

  • Traffic-driven workload scaling
  • Time-based scheduled scaling
  • Resource utilization optimization
  • Cost reduction
  • High-traffic event handling
  • Batch processing optimization
  • Database connection pooling

Quick Start

Minimal example (truncated here; the reference guides contain the full manifest):

# hpa-configuration.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
# ... (see reference guides for full implementation)
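
As a rough guide to how the 70% CPU target above translates into replica counts, the Kubernetes documentation describes the HPA's core calculation as the ratio of the current metric value to the target value, applied to the current replica count. The worked numbers below (4 replicas at 90% average utilization) are illustrative assumptions, not values from this skill:

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
    \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\qquad \text{e.g.} \qquad
\left\lceil 4 \times \frac{90}{70} \right\rceil = \lceil 5.14 \rceil = 6
```

The result is then clamped to the minReplicas/maxReplicas range (2 to 20 here). When several metrics are configured, the HPA computes a desired count for each and uses the largest.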

Reference Guides

Detailed implementations in the references/ directory:

| Guide | Contents |
|---|---|
| Kubernetes Horizontal Pod Autoscaler | Kubernetes Horizontal Pod Autoscaler |
| AWS Auto Scaling | AWS Auto Scaling |
| Custom Metrics Autoscaling | Custom Metrics Autoscaling |
| Autoscaling Script | Autoscaling Script |
| Monitoring Autoscaling | Monitoring Autoscaling |
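
The custom metrics guide itself is not reproduced on this page. As a rough sketch of what a custom-metric HPA can look like in the autoscaling/v2 API, the variant of the Quick Start manifest below scales on a per-pod request rate. The metric name http_requests_per_second and the target of 100 are hypothetical, and the setup assumes a custom metrics adapter (for example, Prometheus Adapter) already serves that metric through the custom metrics API:

```yaml
# Hypothetical variant of the Quick Start HPA; the metric name and target
# value are assumptions. Avoid running two HPAs against the same Deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # must be exposed by a metrics adapter
        target:
          type: AverageValue
          averageValue: "100"  # target average per pod across the Deployment
```

A Pods metric is averaged across the target's pods and compared with averageValue, so it suits signals such as request rate that track load more directly than CPU.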

Best Practices

✅ DO

  • Set appropriate min/max replicas
  • Monitor the metric aggregation window
  • Implement cooldown periods
  • Use multiple metrics
  • Test scaling behavior
  • Monitor scaling events
  • Plan for peak loads
  • Implement fallback strategies
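
For the cooldown item above, the autoscaling/v2 API exposes a behavior section with stabilization windows and rate-limiting policies. The sketch below extends the Quick Start HPA; the specific window lengths and policy values are illustrative assumptions and should be tuned against observed traffic:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Use the highest recommendation from the last 5 minutes before
      # removing pods, which damps flapping (assumed value).
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50          # remove at most 50% of current pods...
          periodSeconds: 60  # ...per 60-second period
    scaleUp:
      stabilizationWindowSeconds: 0  # react to load increases immediately
      policies:
        - type: Pods
          value: 4           # add at most 4 pods per 60-second period
          periodSeconds: 60
      selectPolicy: Max
```

Scaling up quickly and down slowly is a common starting point: it favors availability during spikes and accepts some extra cost while load settles.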

❌ DON'T

  • Set min replicas to 1 for production workloads
  • Scale too aggressively
  • Ignore cooldown periods
  • Rely on a single metric
  • Forget to test scaling
  • Scale below resource needs
  • Neglect monitoring
  • Deploy without capacity tests
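
Related to "Scale below resource needs": Utilization targets like the one in the Quick Start are computed as a percentage of each container's resource requests, so the target Deployment needs requests set before the HPA can act on those metrics. A minimal sketch, where the image, labels, and request sizes are placeholder assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  # spec.replicas is omitted so the HPA owns the replica count.
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:1.0.0  # hypothetical image
          resources:
            requests:
              cpu: 250m      # 70% utilization target = 175m average per pod
              memory: 256Mi
            limits:
              memory: 512Mi
```

If requests are set well below what the application actually uses, utilization reads high and the HPA scales out more than needed; if they are set too high, scaling triggers late.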