
Agent Skills with tag: data-preprocessing

24 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

ray-data

Scalable data processing for ML workloads. Streaming execution across CPU and GPU; supports Parquet, CSV, JSON, and images. Integrates with Ray Train, PyTorch, and TensorFlow. Scales from a single machine to hundreds of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

scalable-algorithms, batch-processing, streaming-data, machine-learning
ovachiever
81

flowio

Parse FCS (Flow Cytometry Standard) files, versions 2.0 to 3.1. Extract events as NumPy arrays, read metadata and channels, and convert to CSV or DataFrame for flow cytometry data preprocessing.

flow-cytometry, fcs, data-preprocessing, numpy
ovachiever
81

process-mining-assistant

Perform an end-to-end process mining analysis via a command-line workflow that progressively ingests, profiles, cleans, mines and reports on event logs using PM4Py. The workflow generates stage-based artefacts (including versioned notebooks) and pauses at decision checkpoints so the user can validate findings and choose how to proceed.

process-mining, workflow-automation, data-preprocessing, pm4py
Wattysaid
0

data-analyzer

A skill for data analysis and visualization.

data-analysis, data-visualization, data-preprocessing, data-synthesis
Shin0205go
0

field-extraction-parsing

Extract structured fields from unstructured log data using OPAL parsing functions. Covers extract_regex() for pattern matching with type casting, split() for delimited data, parse_json() for JSON logs, and JSONPath for navigating parsed structures. Use when you need to convert raw log text into queryable fields for analysis, filtering, or aggregation.

logs, parsing, data-preprocessing, json
rustomax
11

data-analysis

Analyze data files (CSV, JSON) and generate insights, summaries, and statistical analysis

data-preprocessing, statistical-analysis, csv, json
tatat
1

ml-fundamentals

Master machine learning foundations - algorithms, preprocessing, feature engineering, and evaluation

algorithms, data-preprocessing, feature-engineering, model-evaluation
pluginagentmarketplace
11

python-analytics

Python data analysis with pandas, numpy, and analytics libraries

python, pandas, numpy, data-preprocessing
pluginagentmarketplace
1

data-analytics-foundations

Core data analytics concepts, Excel/Google Sheets fundamentals, and data collection techniques

data-collection, excel, google-sheets, data-preprocessing
pluginagentmarketplace
1

data-cleaning

Data cleaning, preprocessing, and quality assurance techniques

data-preprocessing, data-quality, data-cleaning, data-validation
pluginagentmarketplace
1

token-efficient

Use when processing 50+ items, analyzing CSV/log files, executing code in sandbox, or searching for tools. Load for data processing tasks. Achieves 98%+ token savings via in-sandbox execution, progressive disclosure, and pagination. Supports heredocs for multi-line bash.

token-cost-optimization, data-preprocessing, bash, sandbox-execution
ingpoc
5

data-processing

Process JSON with jq and YAML/TOML with yq. Filter, transform, query structured data efficiently. Triggers on: parse JSON, extract from YAML, query config, Docker Compose, K8s manifests, GitHub Actions workflows, package.json, filter data.

jq, yq, shell-scripting, data-preprocessing
0xDarkMatter
3

funsloth-check

Validate datasets for Unsloth fine-tuning. Use when the user wants to check a dataset, analyze tokens, calculate Chinchilla optimality, or prepare data for training.

data-preprocessing, token-optimization, llm, dataset-validation
chrisvoncsefalvay
4

symmetry-discovery-questionnaire

Use when ML engineers need to identify symmetries in their data but don't know where to start. Invoke when user mentions data symmetry, invariance discovery, what transformations matter, or needs help recognizing patterns their model should respect. Works collaboratively through domain analysis, transformation testing, and physical constraint identification.

machine-learning, ml-development, data-preprocessing, symmetry-discovery
lyndonkl
82

pandas-pro

Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation, missing value handling, groupby operations, or performance optimization.

pandas, dataframe, data-preprocessing, time-series-analysis
Jeffallan
245

fine-tuning-expert

Use when fine-tuning LLMs, training custom models, or optimizing model performance for specific tasks. Invoke for parameter-efficient methods, dataset preparation, or model adaptation.

llm, data-preprocessing, hyperparameter-tuning, performance-tuning
Jeffallan
245

json-transformer

Transform, manipulate, and analyze JSON data structures with advanced operations.

json, data-preprocessing, data-analysis
CuriousLearner
163

csv-processor

Parse, transform, and analyze CSV files with advanced data manipulation capabilities.

data-preprocessing, data-analysis, file-conversion, csv-processing
CuriousLearner
163

Page 1 of 2 · 24 results