Agent Skills with tag: distributed-computing

30 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

ray-train

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

training-orchestration · distributed-computing · hyperparameter-tuning · scalability
ovachiever
81

pytorch-fsdp

Expert guidance for Fully Sharded Data Parallel (FSDP) training with PyTorch: parameter sharding, mixed precision, CPU offloading, and FSDP2.

pytorch · distributed-computing · mixed-precision · cpu-offloading
ovachiever
81

deepspeed

Expert guidance for distributed training with DeepSpeed: ZeRO optimization stages, pipeline parallelism, FP16/BF16/FP8, 1-bit Adam, and sparse attention.

distributed-computing · DeepSpeed · model-optimization · parallelism
ovachiever
81

huggingface-accelerate

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

pytorch · distributed-computing · deep-learning · huggingface
ovachiever
81

dask

Parallel and distributed computing. Scales pandas and NumPy workloads beyond memory with parallel DataFrames and Arrays, multi-file processing, and task graphs. Use for larger-than-RAM datasets and parallel workflows.

distributed-computing · parallel-processing · pandas · numpy
ovachiever
81

big-data

Apache Spark, Hadoop, distributed computing, and large-scale data processing for petabyte-scale workloads

apache-spark · hadoop · distributed-computing · big-data
pluginagentmarketplace
11

logging

Centralized logging with ELK Stack, Loki, Fluentd, and log analysis for distributed systems

distributed-computing · ELK-stack · Loki · Fluentd
pluginagentmarketplace
2

distributed-claude-sender

Send prompts to a remote Claude instance on a VPS. Use for distributed AI collaboration, different model backends, or an independent context.

anthropic-sdk · llm-integration · distributed-computing · multi-backend
ebowwa
32

pytorch

Building and training neural networks with PyTorch. Use when implementing deep learning models, training loops, data pipelines, model optimization with torch.compile, distributed training, or deploying PyTorch models.

pytorch · deep-learning · neural-network-architectures · gpu-acceleration
itsmostafa
10

spark-engineer

Use when building Apache Spark applications, distributed data processing pipelines, or optimizing big data workloads. Invoke for DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics.

apache-spark · big-data · distributed-computing · batch-processing
Jeffallan
245

microservices-architect

Use when designing distributed systems, decomposing monoliths, or implementing microservices patterns. Invoke for service boundaries, DDD, saga patterns, event sourcing, service mesh, distributed tracing. Keywords: microservices, service mesh, distributed systems, Kubernetes, event-driven.

microservices · distributed-computing · event-driven-architecture · service-mesh
Jeffallan
245

distributed-tracing

Implement distributed tracing with Jaeger and Zipkin for tracking requests across microservices. Use when debugging distributed systems, tracking request flows, or analyzing service performance.

microservices · distributed-computing · monitoring · performance-tuning
aj-geddes
301

microservices-architecture

Design and implement microservices architecture including service boundaries, communication patterns, API gateways, service mesh, service discovery, and distributed system patterns. Use when building microservices, distributed systems, or service-oriented architectures.

microservices · distributed-computing · api-gateway · service-mesh
aj-geddes
301

batch-processing-jobs

Implement robust batch processing systems with job queues, schedulers, background tasks, and distributed workers. Use when processing large datasets, scheduled tasks, async operations, or resource-intensive computations.

batch-processing · queueing-model · distributed-computing · cron-jobs
aj-geddes
301

network-tracing

Instrument API requests with spans and distributed tracing. Use when tracking request latency, correlating client-backend traces, or debugging API issues.

api · network-troubleshooting · performance-optimization · distributed-computing
nexus-labs-automation
917

lambda

AWS Lambda serverless functions for event-driven compute. Use when creating functions, configuring triggers, debugging invocations, optimizing cold starts, setting up event source mappings, or managing layers.

aws-lambda · serverless · event-driven · distributed-computing
itsmostafa
933415

parallel-execution

Patterns for parallel subagent execution using Task tool with run_in_background. Use when coordinating multiple independent tasks, spawning dynamic subagents, or implementing features that can be parallelized.

distributed-computing · autonomous-agent · concurrency · scalable-algorithms
CloudAI-X
1,074163

designing-architecture

Designs software architecture and selects appropriate patterns for projects. Use when designing systems, choosing architecture patterns, structuring projects, making technical decisions, or when asked about microservices, monoliths, or architectural approaches.

software-architecture · architecture-patterns · microservices · monolith
CloudAI-X
1,074163

Page 1 of 2 · 30 results