Agent Skills: k8s

Kubernetes ops skill for deploying, operating, and troubleshooting services on Kubernetes. Use for tasks like writing manifests/Helm, configuring deployments/services/ingress, autoscaling, observability, RBAC, secrets/configmaps, rollout/rollback, incident debugging, and production readiness checks.

Uncategorized
ID: muzhicaomingwang/ai-ideas/k8s

Author

muzhicaomingwang

https://github.com/muzhicaomingwang

Repository

muzhicaomingwang/ai-ideas

Install this agent skill locally

pnpm dlx add-skill https://github.com/muzhicaomingwang/ai-ideas/tree/HEAD/.project/ai/ops/skills/k8s

Skill Files

.project/ai/ops/skills/k8s/SKILL.md

Skill Metadata

Name: k8s

k8s

Use this skill for Kubernetes operations and release-related work.

Defaults / assumptions to confirm

Cluster type: managed (EKS/GKE/ACK) vs self-hosted
Packaging: raw YAML vs Helm vs Kustomize
Ingress: NGINX/ALB/APISIX/Istio
Observability stack: Prometheus/Grafana, Loki/ELK, tracing

Workflow

Understand service requirements

Ports, protocols, health checks, resources (CPU/mem), storage needs.
SLOs: latency, availability, RPO/RTO.
Dependencies: DB, cache, MQ, external APIs.

Deployment design

Use Deployment for stateless; StatefulSet for stable identities/storage.
Define readinessProbe and livenessProbe (and startupProbe if needed).
Set resources.requests/limits and choose the appropriate QoS class (Guaranteed, Burstable, or BestEffort).
Use PodDisruptionBudget for availability during maintenance.
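
The design points above can be sketched as manifests built in Python and printed as JSON, which kubectl accepts alongside YAML. This is a minimal sketch, not a reference implementation: the service name, image, /healthz path, probe timings, and resource sizes are all hypothetical, and qos_class only approximates the Kubernetes rules by comparing quantity strings literally.

```python
import json


def qos_class(containers):
    """Approximate the QoS class for a pod from its containers' resources.

    Simplification: quantities are compared as strings, so "1" and "1000m"
    would be treated as different values.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def build_deployment(name, image, port, replicas=3):
    """Stateless Deployment with probes and resource settings (values are assumptions)."""
    labels = {"app": name}
    probe = {"httpGet": {"path": "/healthz", "port": port}}  # hypothetical endpoint
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        "readinessProbe": {**probe, "periodSeconds": 5},
        "livenessProbe": {**probe, "periodSeconds": 10, "failureThreshold": 3},
        # startupProbe gives slow starters up to 30 * 5s before liveness applies.
        "startupProbe": {**probe, "periodSeconds": 5, "failureThreshold": 30},
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


def build_pdb(name, min_available):
    """PodDisruptionBudget keeping at least min_available pods during voluntary disruptions."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": name}},
        },
    }


deployment = build_deployment("web", "registry.example.com/web:1.0.0", 8080)
pdb = build_pdb("web", 2)
print(json.dumps(deployment, indent=2))
print(json.dumps(pdb, indent=2))
print(qos_class(deployment["spec"]["template"]["spec"]["containers"]))
```

With the CPU request below the CPU limit, the sketch classifies the pod as Burstable; setting requests equal to limits for both CPU and memory would make it Guaranteed.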

Config & secrets

Config: ConfigMap (non-sensitive), mounted as files or injected as environment variables.
Secrets: Secret (sensitive) + external secret manager if available.
Never commit plaintext secrets; prefer sealed/external secrets.
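
As an illustration of the split above, the following sketch builds a ConfigMap, a Secret, and the container fields that consume them. All names and values are hypothetical. Note that a Secret's data field is only base64-encoded, not encrypted, which is one reason a Secret manifest should not be committed as-is.

```python
import base64
import json


def build_configmap(name, data):
    """ConfigMap for non-sensitive settings; values must be strings."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(data),
    }


def build_secret(name, values):
    """Opaque Secret; the data field carries base64-encoded values (encoding, not encryption)."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


def container_env(configmap_name, secret_name, secret_key, env_name):
    """Container fields: all ConfigMap keys as env vars, plus one key from the Secret."""
    return {
        "envFrom": [{"configMapRef": {"name": configmap_name}}],
        "env": [
            {
                "name": env_name,
                "valueFrom": {
                    "secretKeyRef": {"name": secret_name, "key": secret_key}
                },
            }
        ],
    }


# Hypothetical names and values, for illustration only.
configmap = build_configmap("web-config", {"LOG_LEVEL": "info"})
secret = build_secret("web-secret", {"db-password": "change-me"})
env_fields = container_env("web-config", "web-secret", "db-password", "DB_PASSWORD")
print(json.dumps([configmap, env_fields], indent=2))
```

In practice the Secret object would be generated at deploy time or supplied by a sealed/external secrets controller rather than stored in the repository.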

Networking

Service types and DNS.
Ingress/Gateway routing, TLS termination, timeouts.
NetworkPolicy, if the cluster's network plugin enforces it.
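
A rough sketch of the Service and Ingress pieces follows. It assumes the default cluster domain cluster.local, an ingress-nginx controller (the timeout annotation is specific to that controller), and hypothetical host, service, and TLS secret names.

```python
import json


def service_dns(service, namespace, cluster_domain="cluster.local"):
    """In-cluster DNS name of a Service; cluster.local is the usual default domain."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def build_service(name, port, target_port):
    """ClusterIP Service routing port to the pods' target_port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


def build_ingress(name, host, service, port, tls_secret, read_timeout_s=60):
    """Ingress with TLS termination; the annotation assumes ingress-nginx."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                "nginx.ingress.kubernetes.io/proxy-read-timeout": str(read_timeout_s)
            },
        },
        "spec": {
            "ingressClassName": "nginx",
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


service = build_service("web", 80, 8080)
ingress = build_ingress("web", "web.example.com", "web", 80, "web-tls")
print(service_dns("web", "prod"))
print(json.dumps([service, ingress], indent=2))
```

Other controllers (ALB, APISIX, Istio) express timeouts and TLS differently, so the annotation and ingressClassName would need to change accordingly.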

Scaling & resilience

HPA based on CPU/memory/custom metrics.
Graceful shutdown (preStop, terminationGracePeriodSeconds).
Retry with backoff on the client side; avoid retry storms.
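
Two of these points reduce to small formulas. The sketch below shows the replica calculation the HPA documentation describes (desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance, 0.1 by default) and a capped exponential backoff with full jitter, one common way to keep clients from retrying in lockstep. The function names and default values are illustrative assumptions.

```python
import math
import random


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """HPA-style scaling decision based on the ratio of current to target metric."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: no scaling action.
        return current_replicas
    return math.ceil(current_replicas * ratio)


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)) seconds."""
    upper = min(cap, base * (2 ** attempt))
    return rng() * upper


# 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# Delays for the first five retry attempts (random, so values vary per run).
print([round(backoff_delay(n), 2) for n in range(5)])
```

The jitter matters as much as the exponent: without it, clients that failed together retry together, which is how retry storms start.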

Observability

Standard logs with correlation IDs.
Metrics: RPS, p95 latency, error rate, saturation.
Alerts and dashboards; runbook links.
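
To make the metrics concrete, here is a small worked example that derives RPS, p95 latency, and error rate from a window of request records. The record shape (latency_ms, status) is a hypothetical one, the percentile uses the nearest-rank method, and counting only 5xx responses as errors is an assumption. A real setup would normally compute these in the metrics backend rather than in application code.

```python
def p95(values):
    """95th percentile by the nearest-rank method (integer arithmetic for the rank)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def summarize(requests, window_seconds):
    """Summarize one observation window of request records."""
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "rps": len(requests) / window_seconds,
        "p95_ms": p95(latencies),
        "error_rate": errors / len(requests),
    }


# Synthetic window: 100 requests over 10 seconds, latencies 1..100 ms, two 5xx responses.
window = [
    {"latency_ms": i, "status": 500 if i > 98 else 200}
    for i in range(1, 101)
]
summary = summarize(window, 10)
print(summary)
```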

Release operations

Rolling updates, canary/blue-green if needed.
kubectl rollout status, plus a rollback plan (for example, kubectl rollout undo).
Post-deploy verification checks and smoke tests.
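
One way to wire the rollout check and the rollback plan together is sketched below. It assumes kubectl is on the PATH and already pointed at the right cluster; the deployment name, namespace, and timeout are hypothetical. The runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess


def rollout_status_cmd(name, namespace, timeout="120s"):
    """argv for waiting on a Deployment rollout."""
    return [
        "kubectl", "rollout", "status", f"deployment/{name}",
        "-n", namespace, f"--timeout={timeout}",
    ]


def rollout_undo_cmd(name, namespace):
    """argv for rolling a Deployment back to its previous revision."""
    return ["kubectl", "rollout", "undo", f"deployment/{name}", "-n", namespace]


def verify_or_rollback(name, namespace, run=subprocess.run):
    """Wait for the rollout; if it does not complete, roll back.

    `run` must accept an argv list and return an object with a `returncode`
    attribute, as subprocess.run does.
    """
    result = run(rollout_status_cmd(name, namespace))
    if result.returncode == 0:
        return "rolled-out"
    run(rollout_undo_cmd(name, namespace))
    return "rolled-back"


# Dry run: show the commands without executing them.
print(" ".join(rollout_status_cmd("web", "prod")))
print(" ".join(rollout_undo_cmd("web", "prod")))
```

Calling verify_or_rollback("web", "prod") with the default runner would execute the commands for real; smoke tests would then run only on the "rolled-out" result.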

Troubleshooting checklist

kubectl get/describe pods, events, and logs.
Check probes, image pull, env/config, DNS, network, and resource throttling.
For performance: node pressure, HPA behavior, GC/heap, connection pool limits.
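
The first pass of this checklist can be partly automated. The sketch below reads a pod object shaped like the output of kubectl get pod -o json and maps common container state reasons to a next step. The hint table is an illustrative assumption and far from exhaustive.

```python
# Hypothetical hint table: common reasons mapped to a suggested next check.
HINTS = {
    "ImagePullBackOff": "check image name/tag, registry access, and imagePullSecrets",
    "ErrImagePull": "check image name/tag, registry access, and imagePullSecrets",
    "CrashLoopBackOff": "inspect the previous container's logs (kubectl logs --previous)",
    "CreateContainerConfigError": "check that referenced ConfigMap/Secret names and keys exist",
    "OOMKilled": "compare memory usage with the memory limit; check heap settings",
}


def triage(pod):
    """Return (container, reason, hint) tuples for known problem states in a pod."""
    findings = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for status in statuses:
        reasons = []
        waiting = status.get("state", {}).get("waiting")
        if waiting:
            reasons.append(waiting.get("reason"))
        # lastState.terminated records why the previous instance stopped.
        terminated = status.get("lastState", {}).get("terminated")
        if terminated:
            reasons.append(terminated.get("reason"))
        for reason in reasons:
            if reason in HINTS:
                findings.append((status["name"], reason, HINTS[reason]))
    return findings


# Synthetic pod status for illustration.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "web",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}
for container, reason, hint in triage(sample_pod):
    print(f"{container}: {reason}: {hint}")
```

An empty result does not mean the pod is healthy; events, probe failures, DNS, and node pressure still need the manual checks listed above.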

Output expectations when making changes

Provide manifests (or Helm values/templates) + brief deployment notes.
Include resource sizing rationale and probe settings.
Include rollback instructions and verification steps.