Agent Skills: Azure Machine Learning SDK v2 for Python

|

UncategorizedID: microsoft/agent-skills/azure-ai-ml-py

Install this agent skill to your local

pnpm dlx add-skill https://github.com/microsoft/skills/tree/HEAD/.github/plugins/azure-sdk-python/skills/azure-ai-ml-py

Skill Files

Browse the full folder contents for azure-ai-ml-py.

Download Skill

Loading file tree…

.github/plugins/azure-sdk-python/skills/azure-ai-ml-py/SKILL.md

Skill Metadata

Name
azure-ai-ml-py
Description
|

Azure Machine Learning SDK v2 for Python

Client library for managing Azure ML resources: workspaces, jobs, models, data, and compute.

Installation

pip install azure-ai-ml

Environment Variables

AZURE_SUBSCRIPTION_ID=<your-subscription-id>  # Required for all auth methods
AZURE_RESOURCE_GROUP=<your-resource-group>  # Required for all auth methods
AZURE_ML_WORKSPACE_NAME=<your-workspace-name>  # Required for all auth methods
AZURE_TOKEN_CREDENTIALS=prod # Required only if DefaultAzureCredential is used in production

Authentication & Lifecycle

πŸ”‘ Two rules apply to every code sample below:

  1. Prefer DefaultAzureCredential. It works locally (Azure CLI / VS Code / Developer CLI) and in Azure (managed identity, workload identity) with no code change. Avoid connection strings, account/API keys β€” they bypass Entra audit and rotation.
    • Local dev: DefaultAzureCredential works as-is.
    • Production: set AZURE_TOKEN_CREDENTIALS=prod (or AZURE_TOKEN_CREDENTIALS=<specific_credential>) to constrain the credential chain to production-safe credentials.
  2. Wrap every client in a context manager so HTTP transports, sockets, and token caches are released deterministically:
    • Sync: with <Client>(...) as client:
    • Async: async with <Client>(...) as client: and async with DefaultAzureCredential() as credential: (from azure.identity.aio)

Snippets may abbreviate this setup, but production code should always follow both rules.

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
import os

# Local dev: DefaultAzureCredential. Production: set AZURE_TOKEN_CREDENTIALS=prod or AZURE_TOKEN_CREDENTIALS=<specific_credential>
credential = DefaultAzureCredential(require_envvar=True)
# Or use a specific credential directly in production:
# See https://learn.microsoft.com/python/api/overview/azure/identity-readme?view=azure-python#credential-classes
# credential = ManagedIdentityCredential()
with MLClient(
    credential=credential,
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE_NAME"]
) as ml_client:
    for ws in ml_client.workspaces.list():
        print(ws.name)

From Config File

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Uses config.json in current directory or parent
with MLClient.from_config(
    credential=DefaultAzureCredential()
) as ml_client:
    for ws in ml_client.workspaces.list():
        print(ws.name)

Long-lived ml_client: Subsequent examples in this skill assume ml_client was created via the pattern above and is alive for the lifetime of your script. In production, wrap your top-level workflow in a single with MLClient(...) as ml_client: block so the underlying HTTP transport closes cleanly on exit.

Workspace Management

Create Workspace

from azure.ai.ml.entities import Workspace

ws = Workspace(
    name="my-workspace",
    location="eastus",
    display_name="My Workspace",
    description="ML workspace for experiments",
    tags={"purpose": "demo"}
)

ml_client.workspaces.begin_create(ws).result()

List Workspaces

for ws in ml_client.workspaces.list():
    print(f"{ws.name}: {ws.location}")

Data Assets

Register Data

from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Register a file
my_data = Data(
    name="my-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/train.csv",
    type=AssetTypes.URI_FILE,
    description="Training data"
)

ml_client.data.create_or_update(my_data)

Register Folder

my_data = Data(
    name="my-folder-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/",
    type=AssetTypes.URI_FOLDER
)

ml_client.data.create_or_update(my_data)

Model Registry

Register Model

from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    name="my-model",
    version="1",
    path="./model/",
    type=AssetTypes.CUSTOM_MODEL,
    description="My trained model"
)

ml_client.models.create_or_update(model)

List Models

for model in ml_client.models.list(name="my-model"):
    print(f"{model.name} v{model.version}")

Compute

Create Compute Cluster

from azure.ai.ml.entities import AmlCompute

cluster = AmlCompute(
    name="cpu-cluster",
    type="amlcompute",
    size="Standard_DS3_v2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=120
)

ml_client.compute.begin_create_or_update(cluster).result()

List Compute

for compute in ml_client.compute.list():
    print(f"{compute.name}: {compute.type}")

Jobs

Command Job

from azure.ai.ml import command, Input

job = command(
    code="./src",
    command="python train.py --data ${{inputs.data}} --lr ${{inputs.learning_rate}}",
    inputs={
        "data": Input(type="uri_folder", path="azureml:my-dataset:1"),
        "learning_rate": 0.01
    },
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
    display_name="training-job"
)

returned_job = ml_client.jobs.create_or_update(job)
print(f"Job URL: {returned_job.studio_url}")

Monitor Job

ml_client.jobs.stream(returned_job.name)

Pipelines

from azure.ai.ml import dsl, Input, Output
from azure.ai.ml.entities import Pipeline

@dsl.pipeline(
    compute="cpu-cluster",
    description="Training pipeline"
)
def training_pipeline(data_input):
    prep_step = prep_component(data=data_input)
    train_step = train_component(
        data=prep_step.outputs.output_data,
        learning_rate=0.01
    )
    return {"model": train_step.outputs.model}

pipeline = training_pipeline(
    data_input=Input(type="uri_folder", path="azureml:my-dataset:1")
)

pipeline_job = ml_client.jobs.create_or_update(pipeline)

Environments

Create Custom Environment

from azure.ai.ml.entities import Environment

env = Environment(
    name="my-env",
    version="1",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="./environment.yml"
)

ml_client.environments.create_or_update(env)

Datastores

List Datastores

for ds in ml_client.datastores.list():
    print(f"{ds.name}: {ds.type}")

Get Default Datastore

default_ds = ml_client.datastores.get_default()
print(f"Default: {default_ds.name}")

MLClient Operations

| Property | Operations | |----------|------------| | workspaces | create, get, list, delete | | jobs | create_or_update, get, list, stream, cancel | | models | create_or_update, get, list, archive | | data | create_or_update, get, list | | compute | begin_create_or_update, get, list, delete | | environments | create_or_update, get, list | | datastores | create_or_update, get, list, get_default | | components | create_or_update, get, list |

Best Practices

  1. Pick sync OR async and stay consistent. Do not mix azure.ai.ml sync clients with azure.ai.ml async clients in the same call path. Choose one mode per module.
  2. Always use context managers for clients and async credentials. Wrap every client in with MLClient(...) as client: (sync) or async with MLClient(...) as client: (async). For async DefaultAzureCredential from azure.identity.aio, also use async with credential: so tokens and transports are cleaned up.
  3. Use versioning for data, models, and environments
  4. Configure idle scale-down to reduce compute costs
  5. Use environments for reproducible training
  6. Stream job logs to monitor progress
  7. Register models after successful training jobs
  8. Use pipelines for multi-step workflows
  9. Tag resources for organization and cost tracking