Agent Skills: CAST AI Upgrade & Migration

|

UncategorizedID: jeremylongshore/claude-code-plugins-plus-skills/castai-upgrade-migration

Install this agent skill to your local

pnpm dlx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/HEAD/plugins/saas-packs/castai-pack/skills/castai-upgrade-migration

Skill Files

Browse the full folder contents for castai-upgrade-migration.

Download Skill

Loading file tree…

plugins/saas-packs/castai-pack/skills/castai-upgrade-migration/SKILL.md

Skill Metadata

Name
castai-upgrade-migration
Description
|

CAST AI Upgrade & Migration

Overview

Upgrade CAST AI components: Helm charts for the agent/controller/evictor, Terraform provider version, and workload autoscaler. Includes rollback procedures for each component.

Prerequisites

  • Current CAST AI components installed
  • Staging cluster for testing upgrades first
  • Helm and kubectl access
  • Change management approval for production

Instructions

Step 1: Check Current Versions

# Helm chart versions
helm list -n castai-agent -o json | jq '.[] | {name: .name, chart: .chart, appVersion: .app_version}'

# Available versions
helm repo update
helm search repo castai-helm --versions | head -20

# Terraform provider version
grep "castai/castai" .terraform.lock.hcl
terraform providers

Step 2: Review Changelog

# Check CAST AI changelog for breaking changes
# https://docs.cast.ai/changelog/

# Check Terraform provider releases
# https://github.com/castai/terraform-provider-castai/releases

Step 3: Upgrade on Staging First

# Upgrade agent
helm upgrade castai-agent castai-helm/castai-agent \
  -n castai-agent --reuse-values

# Upgrade cluster controller
helm upgrade cluster-controller castai-helm/castai-cluster-controller \
  -n castai-agent --reuse-values

# Upgrade evictor
helm upgrade castai-evictor castai-helm/castai-evictor \
  -n castai-agent --reuse-values

# Upgrade workload autoscaler
helm upgrade castai-workload-autoscaler castai-helm/castai-workload-autoscaler \
  -n castai-agent --reuse-values

# Verify all pods restarted cleanly
kubectl get pods -n castai-agent -w

Step 4: Upgrade Terraform Provider

# Update version constraint in versions.tf
terraform {
  required_providers {
    castai = {
      source  = "castai/castai"
      version = "~> 7.5"  # Update to target version
    }
  }
}
# Upgrade provider
terraform init -upgrade
terraform plan -var-file=environments/staging.tfvars

# Review plan carefully for resource recreation
# Apply if plan looks safe
terraform apply -var-file=environments/staging.tfvars

Step 5: Validate After Upgrade

# Verify agent is online
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/external-clusters/${CASTAI_CLUSTER_ID}" \
  | jq '{agentStatus, name}'

# Verify autoscaler policies still applied
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
  | jq '.enabled'

# Check for errors in new agent version
kubectl logs -n castai-agent deployment/castai-agent --tail=50 | grep -i error

Rollback Procedure

# Helm rollback to previous release
helm rollback castai-agent -n castai-agent
helm rollback cluster-controller -n castai-agent

# Terraform rollback
terraform plan -var-file=environments/staging.tfvars  # Review
# If needed, pin previous provider version:
# version = "= 7.4.2"
terraform init -upgrade
terraform apply -var-file=environments/staging.tfvars

Error Handling

| Issue | Cause | Solution | |-------|-------|----------| | Agent CrashLoop after upgrade | Breaking chart change | helm rollback to previous | | Terraform plan shows destroy | Major provider version jump | Pin intermediate version | | Policies reset after upgrade | Chart default values changed | Pass --reuse-values | | Spot handler incompatible | Node format changed | Upgrade all components together |

Resources

Next Steps

For CI pipeline integration, see castai-ci-integration.