scalable-algorithms | Agent Skills

ray-data

Scalable data processing for ML workloads. Streaming execution across CPU/GPU; supports Parquet, CSV, JSON, and images. Integrates with Ray Train, PyTorch, and TensorFlow. Scales from a single machine to hundreds of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

scalable-algorithms, batch-processing, streaming-data, machine-learning

ovachiever

arboreto

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

gene-regulatory-networks, transcriptomics, single-cell-rna-seq, rna-seq

ovachiever

Generative Framework

Conversation-driven specification and execution of healthcare data generation at scale

ehr, synthetic-data-generation, prompt-engineering, agent-orchestration

mark64oswald

scalability-advisor

Guidance for scaling systems from startup to enterprise scale. Use when planning for growth, diagnosing bottlenecks, or designing systems that need to handle 10x-1000x current load.

performance-optimization, scalable-algorithms, root-cause-analysis, monitoring

alirezarezvani

4110

parallel-execution

Patterns for parallel subagent execution using Task tool with run_in_background. Use when coordinating multiple independent tasks, spawning dynamic subagents, or implementing features that can be parallelized.

distributed-computing, autonomous-agent, concurrency, scalable-algorithms

CloudAI-X

1,074 | 163

beads

genomics, sequence-analysis, scalable-algorithms, bioinformatics

steveyegge

9,257 | 563

zarr-python

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

python, distributed-computing, scalable-algorithms, database-integration

K-Dense-AI

3,233 | 360

vaex

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that don't fit in memory.

python, data-analysis, big-data, scalable-algorithms

K-Dense-AI

3,233 | 360

pytorch-lightning

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

python, deep-learning, distributed-computing, scalable-algorithms

K-Dense-AI

3,233 | 360

arboreto

gene-regulatory-networks, transcriptomics, rna-seq, scalable-algorithms

K-Dense-AI

3,233 | 360

polars

Fast DataFrame library (Apache Arrow). Select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, expression API, for high-performance data analysis workflows.

python, data-analysis, scalable-algorithms, dataframe

K-Dense-AI

3,233 | 360

dask

Parallel/distributed computing. Scale pandas/NumPy beyond memory, parallel DataFrames/Arrays, multi-file processing, task graphs, for larger-than-RAM datasets and parallel workflows.

python, distributed-computing, data-analysis, scalable-algorithms

K-Dense-AI

3,233 | 360
