kubernetes-patterns
Kubernetes deployment patterns including Deployments, Services, Ingress, ConfigMaps, Secrets, resource management, health checks, and horizontal pod autoscaling. Covers kubectl commands, YAML manifests, and GitOps workflows. Use when deploying to Kubernetes, managing containers, setting up services, or troubleshooting pod issues.
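As a sketch of the Deployment-with-health-checks pattern this skill covers, here is a minimal example using the official kubernetes Python client; the image, namespace, labels, and probe path are placeholder assumptions, not part of the skill itself.

```python
# Minimal sketch: create a Deployment with resource limits and a liveness probe.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

container = client.V1Container(
    name="web",
    image="registry.example.com/web:1.0.0",  # hypothetical image
    ports=[client.V1ContainerPort(container_port=8080)],
    resources=client.V1ResourceRequirements(
        requests={"cpu": "100m", "memory": "128Mi"},
        limits={"cpu": "500m", "memory": "256Mi"},
    ),
    liveness_probe=client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/healthz", port=8080),
        initial_delay_seconds=5,
        period_seconds=10,
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```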
docker-optimization
Optimize Docker images for Python applications including multi-stage builds (70%+ size reduction), security scanning with Trivy, layer caching, and distroless base images. Use when creating Dockerfiles, reducing image size, improving build performance, or scanning for vulnerabilities.
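A small, hedged companion sketch using the Docker SDK for Python to build an image and report its size, which is a quick way to confirm a multi-stage Dockerfile actually shrank the result; the build path and tag are placeholder assumptions.

```python
# Build an image from the current directory's Dockerfile and print its size.
import docker

client = docker.from_env()
image, build_logs = client.images.build(
    path=".",                # directory containing the (multi-stage) Dockerfile
    tag="myapp:multistage",  # hypothetical tag
    rm=True,                 # remove intermediate containers after the build
)
size_mb = image.attrs["Size"] / 1_000_000
print(f"{image.tags[0]}: {size_mb:.1f} MB")
```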
aztec-deploy
Generate TypeScript deployment scripts for Aztec contracts with fee payment configuration. Use when deploying contracts, setting up deployment pipelines, or configuring fee payment methods.
gpu-inference-server
Set up AI inference servers on cloud GPUs. Create private LLM APIs (vLLM, TGI), image generation endpoints, embedding services, and more, all with OpenAI-compatible interfaces that work with existing tools.
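Since the endpoints are OpenAI-compatible, the stock openai Python client can point at a private server unchanged; in this sketch the base URL, API key, and model name are placeholder assumptions for a self-hosted deployment.

```python
# Talk to a private vLLM/TGI server through the standard OpenAI client.
from openai import OpenAI

client = OpenAI(
    base_url="http://my-gpu-host:8000/v1",    # hypothetical private endpoint
    api_key="not-needed-for-private-server",  # many self-hosted servers ignore this
)
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whatever model the server loaded
    messages=[{"role": "user", "content": "Summarize PagedAttention in one line."}],
)
print(resp.choices[0].message.content)
```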
aws-agentic-ai
Comprehensive AWS Bedrock AgentCore expertise for deploying and managing all AgentCore services. Use when working with Gateway, Runtime, Memory, Identity, or any AgentCore component. Covers MCP target deployment, credential management, schema optimization, runtime configuration, memory management, and identity services.
hf-spaces-expert
This skill should be used when creating or configuring Hugging Face Spaces, including ZeroGPU hardware, secrets/env variables, persistent storage, repo-based deploys, and build/memory troubleshooting.
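A hedged sketch of the repo-based workflow using huggingface_hub; the repo id, secret, and hardware tier string are assumptions and may vary by account and library version.

```python
# Create a Space, set a secret, and request hardware via huggingface_hub.
from huggingface_hub import HfApi

api = HfApi(token="hf_...")  # token elided
api.create_repo(repo_id="me/my-demo", repo_type="space", space_sdk="gradio")
api.add_space_secret(repo_id="me/my-demo", key="OPENAI_API_KEY", value="sk-...")
api.request_space_hardware(repo_id="me/my-demo", hardware="zero-a10g")  # assumed ZeroGPU tier string
```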
serving-llms-vllm
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
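A minimal offline-inference sketch with vLLM's Python API; the model name and sampling values are assumptions. The same engine can be exposed as an OpenAI-compatible server with `vllm serve <model>`.

```python
# Offline batch inference with vLLM (PagedAttention + continuous batching).
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    gpu_memory_utilization=0.90,               # fraction of VRAM for the KV cache
)
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain continuous batching in one sentence."], params)
print(outputs[0].outputs[0].text)
```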
tensorrt-llm
Optimizes LLM inference with NVIDIA TensorRT-LLM for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.
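A hedged sketch of TensorRT-LLM's high-level LLM API, which builds or loads an engine on first use; the model name is a placeholder and the API surface has shifted between releases.

```python
# High-level TensorRT-LLM API (mirrors vLLM's interface in recent releases).
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # engine compiled on first load
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```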
llama-cpp
Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU.
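A minimal sketch with the llama-cpp-python bindings loading a GGUF model; the model path and offload settings are placeholder assumptions.

```python
# Run a GGUF-quantized model with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3.1-8b-instruct.Q4_K_M.gguf",  # hypothetical path
    n_ctx=4096,
    n_gpu_layers=-1,  # offload all layers if a GPU/Metal backend is available
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why is GGUF quantization useful?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```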
modal
Run Python code in the cloud with serverless containers, GPUs, and autoscaling. Use when deploying ML models, running batch processing jobs, scheduling compute-intensive tasks, or serving APIs that require GPU acceleration or dynamic scaling.
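A minimal Modal sketch deploying a GPU function as a serverless container; the app name, image packages, and GPU type are assumptions.

```python
# A GPU function that runs in Modal's cloud; invoke with `modal run app.py`.
import modal

app = modal.App("gpu-demo")
image = modal.Image.debian_slim().pip_install("torch")

@app.function(gpu="A10G", image=image)
def cuda_check() -> str:
    import torch
    return f"CUDA available: {torch.cuda.is_available()}"

@app.local_entrypoint()
def main():
    print(cuda_check.remote())  # executes remotely in the serverless container
```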
argocd-image-updater
Automate container image updates for Kubernetes workloads managed by Argo CD. USE WHEN configuring ArgoCD Image Updater, setting up automatic image updates, configuring update strategies (semver, digest, newest-build, alphabetical), implementing git write-back, troubleshooting image update issues, or working with ImageUpdater CRDs. Covers installation, configuration, authentication, and best practices.
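Image Updater is driven largely by annotations on the Argo CD Application resource; here is a hedged sketch applying them with the kubernetes Python client, where the application name, image alias, and registry are placeholder assumptions.

```python
# Patch Image Updater annotations onto an Argo CD Application.
from kubernetes import client, config

config.load_kube_config()
annotations = {
    "argocd-image-updater.argoproj.io/image-list": "web=registry.example.com/web",
    "argocd-image-updater.argoproj.io/web.update-strategy": "semver",
    "argocd-image-updater.argoproj.io/write-back-method": "git",  # git write-back
}
client.CustomObjectsApi().patch_namespaced_custom_object(
    group="argoproj.io", version="v1alpha1", namespace="argocd",
    plural="applications", name="my-app",
    body={"metadata": {"annotations": annotations}},
)
```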
ArgoRollouts
Argo Rollouts progressive delivery controller for Kubernetes. USE WHEN user mentions rollouts, canary deployments, blue-green deployments, progressive delivery, traffic shifting, analysis templates, or Argo Rollouts. Provides deployment strategies, CLI commands, metrics analysis, and YAML examples.
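A hedged sketch of a canary Rollout created through the kubernetes Python client; names, image, and step weights are assumptions, and the same manifest could be applied as YAML with kubectl.

```python
# Canary Rollout: shift 20% of traffic, pause, shift 60%, then wait for promotion.
from kubernetes import client, config

config.load_kube_config()
rollout = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Rollout",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 5,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [
                {"name": "web", "image": "registry.example.com/web:1.0.0"}
            ]},
        },
        "strategy": {"canary": {"steps": [
            {"setWeight": 20},
            {"pause": {"duration": "1m"}},
            {"setWeight": 60},
            {"pause": {}},  # pause indefinitely until manual promotion
        ]}},
    },
}
client.CustomObjectsApi().create_namespaced_custom_object(
    group="argoproj.io", version="v1alpha1", namespace="default",
    plural="rollouts", body=rollout,
)
```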
GithubPages
Complete GitHub Pages deployment and management system. Static site hosting with Jekyll, custom domains, and GitHub Actions. USE WHEN user mentions 'github pages', 'deploy static site', 'host website on github', 'jekyll site', 'custom domain for github', OR wants to publish a website from a repository.
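A hedged sketch enabling Pages through the GitHub REST API (POST /repos/{owner}/{repo}/pages); OWNER, REPO, and the token are placeholders.

```python
# Enable GitHub Pages for a repository, publishing from the main branch root.
import requests

resp = requests.post(
    "https://api.github.com/repos/OWNER/REPO/pages",
    headers={
        "Authorization": "Bearer ghp_...",  # token elided
        "Accept": "application/vnd.github+json",
    },
    json={"source": {"branch": "main", "path": "/"}},  # or "build_type": "workflow"
)
resp.raise_for_status()
print(resp.json().get("html_url"))
```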
knative
Knative serverless platform for Kubernetes. Use when deploying serverless workloads, configuring autoscaling (scale-to-zero), event-driven architectures, traffic management (blue-green, canary), CloudEvents routing, Brokers/Triggers/Sources, or working with Knative Serving/Eventing/Functions. Covers installation, networking (Kourier/Istio/Contour), and troubleshooting.
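A hedged sketch of a scale-to-zero Knative Service created with the kubernetes Python client; the name, image, and autoscaling annotation values are assumptions.

```python
# Create a Knative Service that can scale to zero between requests.
from kubernetes import client, config

config.load_kube_config()
service = {
    "apiVersion": "serving.knative.dev/v1",
    "kind": "Service",
    "metadata": {"name": "hello"},
    "spec": {
        "template": {
            "metadata": {"annotations": {
                "autoscaling.knative.dev/min-scale": "0",   # allow scale-to-zero
                "autoscaling.knative.dev/max-scale": "10",
            }},
            "spec": {"containers": [
                {"image": "ghcr.io/knative/helloworld-go:latest"}  # placeholder image
            ]},
        }
    },
}
client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.knative.dev", version="v1", namespace="default",
    plural="services", body=service,
)
```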
deployment-checklist
Production deployment readiness checklist covering environment, security, monitoring, and operational concerns. Use before deploying to production or when setting up new environments.
sf-deploy
k8s-helm
Kubernetes and Helm patterns. Use for deployment configs, service definitions, ConfigMaps, Secrets, and Helm chart management.
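Helm ships no official Python API, so here is a hedged sketch shelling out to the helm CLI; the release name, chart path, and values file are placeholder assumptions.

```python
# Idempotent release: install the chart if absent, upgrade it otherwise.
import subprocess

subprocess.run(
    [
        "helm", "upgrade", "--install", "web", "./charts/web",
        "--namespace", "apps", "--create-namespace",
        "-f", "values.prod.yaml",
        "--atomic",  # roll back automatically if the release fails
    ],
    check=True,
)
```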
kubernetes-deployment-patterns
Kubernetes deployment strategies and workload patterns for production-grade applications. Use when deploying to Kubernetes, implementing rollout strategies, or designing cloud-native application architectures.