senior-data-engineer
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.
data-engineering
Data engineering, machine learning, AI, and MLOps. From data pipelines to production ML systems and LLM applications.
data-engineering
Data pipeline architecture, ETL/ELT patterns, data modeling, and production data platform design
etl-tools
Apache Airflow, dbt, Prefect, Dagster, and modern data orchestration for production data pipelines
data-engineering
Master data engineering, ETL/ELT, data warehousing, SQL optimization, and analytics. Use when building data pipelines, designing data systems, or working with large datasets.
streams
Master Node.js streams for memory-efficient processing of large datasets, real-time data handling, and building data pipelines
data-architecture
Design data architectures with modeling, pipelines, and governance
flowerpower
Create and manage data pipelines using the FlowerPower framework with Hamilton DAGs and uv. Use when users ask to create flowerpower projects, pipelines, or Hamilton dataflows, or ask about flowerpower configuration, execution, or CLI commands.
data-engineer
Expert in data pipelines, ETL processes, and data infrastructure
mcpgraph
Build no-code MCP servers with tools that compose and orchestrate other MCP tools, with data transformation and conditional logic.
creating-bauplan-pipelines
Creates bauplan data pipeline projects with SQL and Python models. Use when starting a new pipeline, defining DAG transformations, writing models, or setting up bauplan project structure from scratch.
excel-parser
Smart Excel/CSV file parsing with routing based on file complexity analysis. Analyzes file structure (merged cells, row count, table layout) using lightweight metadata scanning, then recommends a processing strategy: either high-speed Pandas mode for standard tables or semantic HTML mode for complex reports. Use when processing Excel/CSV files with unknown or varying structure, where a trade-off between speed and accuracy is needed.
discover-data
Automatically discovers data pipeline and ETL skills when working on ETL tasks. Activates for data development tasks.