AzureML Project Scaffolding Skill

A battle-tested structure for AI projects that require reproducible experimentation, leveraging AzureML for cloud execution. It ensures reproducibility from day one without sacrificing the path to production — and without breaking the ability to keep experimenting once you're there. Code, environments, specs, and dependencies are wired so that what runs locally runs on AzureML, with no surprises.

Principles

These principles are foundational. Every decision about project structure, tooling, or workflow must be evaluated against them.

Three layers — Each layer depends only on inner layers:
1. Code: the what. Pure Python, no platform deps.
2. Specification: the how. Job YAML. Declares how code executes on a target platform. Lives next to the code it describes.
3. Orchestration: the when. Makefile, CI. Triggers execution. Knows about specs, knows nothing about code internals.
Litmus test — If Python code imports or shells out to anything platform-specific (az, mlflow.register_model, endpoint APIs), it has escaped the Code layer. If a job YAML knows about scheduling, version registration, or what happens after the job finishes, it has escaped the Specification layer. Push the concern up to the next layer. Every generated or modified file must respect this layering — never merge concerns across layers even when it seems expedient.
One mental model — Everything is a package: a uv workspace member with its own pyproject.toml, [build-system], src layout, source, tests, and dependencies. Same structure, same commands, everywhere.
```
src/my_package/
├── pyproject.toml       # deps, metadata, [build-system]
├── aml-job.yaml         # AzureML job spec (optional, if executable)
├── src/my_package/      # package source (src layout)
│   ├── __init__.py
│   └── __main__.py      # entry point (optional, if executable)
└── tests/
```
Same structure for every package — no special cases, nothing to restructure later. A package is for computation — read from paths, do work, write to paths. If a task doesn't compute (registering assets, deploying models, downloading data via platform tools), it isn't a package — it's orchestration.
Explicit deps — Each package declares its own dependencies — including other workspace packages via [tool.uv.sources] — in its pyproject.toml. Runs are isolated per package, so undeclared imports fail by design. This keeps cloud jobs lean and makes deploying a subset of packages straightforward.
Colocation — Everything needed to understand and run a piece of work lives together in one folder. Easy to find, easy to reason about.
Run anywhere the same — Same command, same lockfile, same result — whether on your laptop, a colleague's machine, a VM, or AzureML. One Dockerfile serves as both devcontainer and cloud runner. Python deps installed at runtime by uv, not baked in.
Complexity must be earned — Start with the simplest correct thing. Add structure only when a specific need demands it. But respect what exists: if the project has grown beyond the basics, that complexity was earned and should not be regressed without understanding why it was introduced.
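
The litmus test under Three layers can be approximated mechanically. The following is a minimal sketch, not part of the scaffold: the deny-lists are assumptions seeded from the examples the litmus test names (az, mlflow.register_model), the azure/azureml import roots are a guess at what "platform-specific" covers, and the function names are hypothetical.

```python
import ast

# Hypothetical deny-lists seeded from the litmus test's examples; tune per project.
PLATFORM_MODULES = {"azure", "azureml"}      # assumed import roots
PLATFORM_CALLS = {"mlflow.register_model"}   # named in the litmus test
PLATFORM_CLIS = {"az"}                       # named in the litmus test


def dotted_name(node: ast.AST) -> str:
    """Return 'a.b.c' for a chain of names and attributes, else ''."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def first_argv_item(call: ast.Call) -> str:
    """Return the program name if the first argument is a literal list like ['az', ...]."""
    if call.args and isinstance(call.args[0], (ast.List, ast.Tuple)) and call.args[0].elts:
        head = call.args[0].elts[0]
        if isinstance(head, ast.Constant) and isinstance(head.value, str):
            return head.value
    return ""


def find_layer_escapes(source: str) -> list[tuple[int, str]]:
    """Return (line, reason) pairs where code reaches for platform-specific tooling."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in PLATFORM_MODULES:
                    findings.append((node.lineno, f"imports {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in PLATFORM_MODULES:
                findings.append((node.lineno, f"imports from {node.module}"))
        elif isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in PLATFORM_CALLS:
                findings.append((node.lineno, f"calls {name}"))
            elif name.startswith("subprocess.") and first_argv_item(node) in PLATFORM_CLIS:
                findings.append((node.lineno, f"shells out to {first_argv_item(node)}"))
    return sorted(findings)
```

A check like this only catches literal, statically visible cases; it is a prompt for review rather than a guarantee that the Code layer is clean.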

A lean Makefile orchestrates everything: self-documenting (make help), the single entry point for running, testing, and managing the project. All packages are uv workspace members resolved by one lockfile at the root. Keep one target per concept (run/test/aml); avoid package-specific aliases unless explicitly requested.

Initializing a Project

A complete minimal project lives in assets/. Use it as the reference for every file's exact content and structure. Its contents match the project tree outlined below and the package tree outlined above.

Steps

1. Scaffold the root. Copy the core assets to get the root pyproject.toml (workspace declaration, dev deps only), Makefile, AGENTS.md, .devcontainer/ (Dockerfile + devcontainer.json), and .env.
2. Create your first package. Add a folder under src/ with its own pyproject.toml, src/<name>/ (with __init__.py and __main__.py), and tests/. Adapt the mypkg package from assets/: grep for mypkg and replace it with your package name everywhere (pyproject.toml, imports, etc.). Treat __main__.py as starter sample logic.
3. Create your .env.local. Copy .env and substitute the placeholders.
4. Reopen in the devcontainer. This must be done by a human.
5. Lock deps. Run make sync, which creates uv.lock. Commit it.
6. Verify locally. Run the verification loop below. Do not continue until every command passes.

```
<project>/
├── .devcontainer/          # Dockerfile + devcontainer.json
├── .env                    # Azure config (safe defaults, committed)
├── AGENTS.md               # project context for AI agents
├── Makefile                # single entry point
├── pyproject.toml          # workspace root, dev deps only
├── uv.lock                 # committed; the reproducibility anchor
└── src/
    └── <package>/          # one package to start
```

Verify

All three must exit 0 before proceeding. Fix and re-run from make sync until they do.

```
make sync                    # uv.lock exists at root
make run pkg=<package_name>  # produces expected stdout/files
make test                    # all tests pass
```

Key rules

Always a uv workspace, even with one package. The root pyproject.toml declares members = ["src/*"] and has no runtime deps — only dev tools in [dependency-groups].
uv.lock is committed. Created/updated automatically by uv run and make sync (which runs uv sync --all-packages under the hood). Always use make sync instead of bare uv sync — the flag ensures every workspace member is installed, so make test and imports work.
One Dockerfile, two roles: devcontainer and cloud runner. The devcontainer is optional; you can develop without it. But uv only isolates Python deps; OS-level dependencies (system libraries, CLI tools, native builds) can still conflict across projects. The devcontainer solves that, and because the same Dockerfile backs both local development and cloud execution, skipping it means losing the guarantee that your local environment matches AzureML exactly. Python deps are not baked into the image; dependencies follow this split:
1. Python deps (uv-managed) → pyproject.toml / uv.lock.
2. System deps (OS libs/tools) → Dockerfile.
3. Dev-only tooling deps (for example Azure CLI + ml) → .devcontainer/devcontainer.json features.
.env is committed with empty/safe defaults. Per-developer overrides go in .env.local (gitignored).
Tool/runtime version alignment — Keep tool targets (for example Ruff target-version and type-checker Python version) aligned with requires-python in root and package pyproject.toml files.

Existing projects

Map each independently runnable piece to a package under src/, extract its deps into a pyproject.toml, and follow the same steps above. Get one package working end-to-end first, then migrate the rest. If clashes exist (e.g., an existing AGENTS.md), merge the files rather than overwriting them.

Cloud execution (after local works)

Keep cloud as a separate step: first make run, then make aml to submit to AzureML.

Steps

1. Add aml-job.yaml to the package folder if it doesn't exist yet. Copy from ./assets/src/mypkg/aml-job.yaml and rename mypkg references. For the full schema, see the $schema link inside the file.
2. Align the YAML with __main__.py. The command and inputs in aml-job.yaml must match the current entry point and any arguments it expects. If __main__.py changed since the YAML was created, update the YAML to reflect the current state.
3. Ensure .env / .env.local are populated. Cloud submission requires valid Azure configuration (subscription, resource group, workspace). Ask the human to verify .env.local has all values filled in before proceeding.
4. Fill YAML placeholders. Ask the human to provide values for any remaining placeholders in the YAML: compute target (<azure-ml-cluster-name>), dataset references, etc.
5. Submit. Run make aml pkg=<package_name> from the project root.
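
The "Ensure .env / .env.local are populated" step can be backed by a small pre-flight check. The sketch below makes several assumptions: that both files use plain KEY=value lines, that .env.local overrides .env, and that unfilled placeholders look like <angle-bracket> tokens. The variable names are hypothetical; the real keys are whatever the project's .env defines.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env_text: str, local_text: str, required: list[str]) -> list[str]:
    """Return required keys that are empty or still hold a <placeholder>."""
    merged = {**parse_env(env_text), **parse_env(local_text)}  # .env.local wins
    return [
        key for key in required
        if not merged.get(key) or merged[key].startswith("<")
    ]


# Hypothetical key names, for illustration only.
REQUIRED = ["AZURE_SUBSCRIPTION_ID", "AZURE_RESOURCE_GROUP", "AZURE_ML_WORKSPACE"]

env = "AZURE_SUBSCRIPTION_ID=\nAZURE_RESOURCE_GROUP=<resource-group>\nAZURE_ML_WORKSPACE=shared-ws\n"
local = "# per-developer overrides\nAZURE_SUBSCRIPTION_ID=0000\n"
print(missing_settings(env, local, REQUIRED))
```

An empty result means submission can proceed; anything else is a list of values to ask the human for.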

Verify

Ask the human to confirm in Azure ML Studio: job completed, tags/metrics visible, outputs/ contains expected artifacts.

Beyond the job

This skill covers what runs inside a job and how to submit it. What happens after — registering outputs as versioned data or model assets, deploying models to endpoints, scheduling recurring runs — is orchestration that lives outside the job, typically in CI pipelines or operational scripts. The same layer rule applies: those concerns never leak into Python code or job YAML. How they're implemented varies by project; where they live does not — always the outermost layer.

Why CLI v2 over Python SDK

YAML is a clear, declarative run contract.
Python code stays platform-agnostic.
Matches the layers: code (what), YAML spec (how), Makefile (when).

Example files to inspect

./assets/src/mypkg/aml-job.yaml: command, inputs, code path, environment build context, compute.
./assets/src/mypkg/src/mypkg/__main__.py: how to persist results in AzureML:
- tags = run metadata labels,
- metrics = tracked numeric values,
- stdout/stderr = captured AzureML logs,
- ./outputs = persisted job artifacts.
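
When inspecting the command and inputs fields of aml-job.yaml, a common drift is an input referenced in the command but not declared, or declared but never used. The sketch below checks that on already-parsed values, using the ${{inputs.<name>}} expression syntax of AzureML CLI v2 command jobs. The function name and the sample command and inputs are hypothetical, not the contents of the mypkg spec.

```python
import re

# Matches ${{inputs.name}} expressions, tolerating spaces inside the braces.
INPUT_REF = re.compile(r"\$\{\{\s*inputs\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def check_alignment(command: str, inputs: dict) -> dict[str, list[str]]:
    """Compare inputs referenced in the command with inputs declared in the spec."""
    referenced = set(INPUT_REF.findall(command))
    declared = set(inputs)
    return {
        "undeclared": sorted(referenced - declared),  # used in command, missing from inputs
        "unused": sorted(declared - referenced),      # declared but never passed to the code
    }


# Hypothetical values as they might appear after loading a job YAML.
command = "python -m mypkg --data ${{inputs.data}} --epochs ${{ inputs.epochs }}"
inputs = {"data": {"type": "uri_folder"}, "learning_rate": 0.1}
print(check_alignment(command, inputs))
```

This covers only the spec's internal consistency; whether the flags match what __main__.py actually parses still needs a look at the entry point itself.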

Extensibility patterns (optional)

Keep the core scaffold minimal. Add these only when the project needs them. Each reference file includes an AGENTS.md section; merge it into the project's AGENTS.md when applying the extension so new agent sessions discover the added capabilities.

Linting & hooks: team-level quality automation with Ruff, Ty, and mdformat, optionally wired through pre-commit (details).
Experimentation & traceability: outputs-by-run in runs/ for local runs, cloud-job output download, and git-linked experiment commits for diff-from-main traceability (details).
Pipelines: multi-step execution with composable packages/components (details).
Datasets: download registered Data Assets by name or raw blob data to the developer's machine for local usage (details).
