Study notes
Why Azure CLI
To train a model in an Azure Machine Learning workspace, you can use:
- Designer in the Azure Machine Learning Studio
- Python SDK
- Azure CLI
To automate the training and retraining of models more effectively, the CLI is the preferred approach.
Check the installed CLI and extension versions:
az --version
Result:
azure-cli 2.45.0
...
Extensions:
ml 2.14.0
You must have version 2.x for both. If not, run:
az upgrade
az extension remove -n azure-cli-ml
az extension remove -n ml
az extension add -n ml -y
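The version requirement above can also be checked from a script. This is a minimal sketch, assuming `az extension show --name ml --query version --output tsv` prints a bare version string such as 2.14.0; the function name `ml_extension_is_v2` is hypothetical and not part of the Azure CLI.

```shell
# Hypothetical helper: succeeds only if the installed ml extension is 2.x.
ml_extension_is_v2() {
    # Expected to print something like "2.14.0"; empty if the extension is missing.
    ml_version=$(az extension show --name ml --query version --output tsv 2>/dev/null || true)
    case "$ml_version" in
        2.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (commented out so the snippet has no side effects when sourced):
# ml_extension_is_v2 || az extension add -n ml -y
```

The helper only reports the state; the install step stays a separate, explicit command.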
Assume you are logged in (if not, run az login).
Check/set the active subscription.
az account show
# get the current default subscription using show
az account show --output table
# get the current default subscription using list
az account list --query "[?isDefault]"
# change the active subscription using the subscription ID
az account set --subscription "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
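In an automation script, the check and the switch can be combined so that the subscription is changed only when needed. This is a sketch under the assumption that `az account show --query id --output tsv` prints just the subscription ID; `ensure_subscription` is a hypothetical name.

```shell
# Hypothetical helper: switch subscription only when it is not already active.
ensure_subscription() {
    wanted_id="$1"
    # Empty if not logged in or if no subscription is selected.
    current_id=$(az account show --query id --output tsv 2>/dev/null || true)
    if [ "$current_id" = "$wanted_id" ]; then
        echo "Subscription $wanted_id is already active"
    else
        az account set --subscription "$wanted_id"
        echo "Switched to subscription $wanted_id"
    fi
}

# Usage:
# ensure_subscription "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
```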
Create resource group
# It will be created in the active subscription
az group create --name "GROUP_NAME" --location "eastus"
# Set it as the default
az configure --defaults group="GROUP_NAME"
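For repeated runs, one possible pattern is to create the group only when it does not exist yet and then set it as the default. This sketch relies on `az group exists`, which prints true or false; the function name `ensure_group` is hypothetical.

```shell
# Hypothetical helper: create the resource group if needed, then make it the default.
ensure_group() {
    group_name="$1"
    group_location="$2"
    if [ "$(az group exists --name "$group_name")" = "true" ]; then
        echo "Resource group $group_name already exists"
    else
        az group create --name "$group_name" --location "$group_location"
    fi
    # Later az commands can then omit --resource-group.
    az configure --defaults group="$group_name"
}

# Usage:
# ensure_group "GROUP_NAME" "eastus"
```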
Create workspace
# It will be created in the active subscription and default resource group
az ml workspace create --name "WS_NAME"
#Set the default workspace
az configure --defaults workspace="WS_NAME"
Create compute instance
# It will be created in the active subscription, default resource group, and default workspace
#--resource-group: Name of resource group. If you configured a default group with az configure --defaults group=<name>, you don't need to use this parameter.
#--workspace-name: Name of the Azure Machine Learning workspace. If you configured a default workspace with az configure --defaults workspace=<name>, you don't need to use this parameter.
#--name: Name of compute target. The name should be fewer than 24 characters and unique within an Azure region.
#--size: VM size to use for the compute instance.
#--type: Type of compute target. To create a compute instance, use ComputeInstance
az ml compute create --name "INSTANCE_NAME" --size STANDARD_DS11_V2 --type ComputeInstance
Create compute cluster
# It will be created in the active subscription, default resource group, and default workspace
#--type: To create a compute cluster, use AmlCompute.
#--min-instances: The minimum number of nodes used on the cluster. The default is 0 nodes.
#--max-instances: The maximum number of nodes. The default is 4.
az ml compute create --name "CLUSTER_NAME" --size STANDARD_DS11_V2 --max-instances 2 --type AmlCompute
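When the same setup script runs more than once, the cluster can be created only if it is missing. This is a sketch that assumes `az ml compute show` exits with a non-zero status when the named compute does not exist; `ensure_cluster` is a hypothetical name, and the size and node counts simply mirror the command above.

```shell
# Hypothetical helper: create the compute cluster only if it does not exist yet.
ensure_cluster() {
    cluster_name="$1"
    if az ml compute show --name "$cluster_name" >/dev/null 2>&1; then
        echo "Compute $cluster_name already exists"
    else
        # --min-instances 0 lets the cluster scale down to zero nodes when idle.
        az ml compute create --name "$cluster_name" --size STANDARD_DS11_V2 \
            --min-instances 0 --max-instances 2 --type AmlCompute
    fi
}

# Usage:
# ensure_cluster "CLUSTER_NAME"
```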
Create dataset
Two files are needed:
data_local_path.yaml
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: lab-data
version: 1
path: data
description: Dataset pointing to diabetes data stored as CSV on local computer. Data is uploaded to default datastore.
lab.data.csv
Run:
az ml data create --file ./PATH_TO_YAML_FILE/data_local_path.yaml
Once the dataset is created, a summary is shown in the prompt. You can also view the dataset in Azure ML Studio in the Data tab.
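To confirm from the command line where the data ended up, the registered asset can be queried. This is a sketch assuming the output of `az ml data show` contains a top-level `path` field; `data_asset_path` is a hypothetical name, and lab-data and version 1 come from the YAML file above.

```shell
# Hypothetical helper: print the storage path of a registered data asset.
data_asset_path() {
    asset_name="$1"
    asset_version="$2"
    # Assumption: the asset description exposes its location as "path".
    az ml data show --name "$asset_name" --version "$asset_version" \
        --query path --output tsv
}

# Usage:
# data_asset_path lab-data 1
```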
List datastores
az ml datastore list
Find it in the Azure portal
Storage account (in the resource group that contains the Azure ML workspace)
- Storage browser
- Blob containers
- azureml-blobstore.....
- LocalUpload
Create environment
You expect to use a compute cluster in the future to retrain the model whenever needed. To train the model on either a compute instance or compute cluster, all necessary packages need to be installed on the compute to run the code. Instead of manually installing these packages every time you use a new compute, you can list them in an environment.
Every Azure Machine Learning workspace will by default have a list of curated environments when you create the workspace. Curated environments include common machine learning packages to train a model.
Two files are needed (in the same folder for this example; the conda_file path in basic-env.yml must match the actual location of basic-env-ml.yml):
basic-env-ml.yml
name: basic-env-ml
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
    - numpy
    - pandas
    - scikit-learn
    - matplotlib
    - azureml-mlflow
basic-env.yml
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: basic-env-scikit
version: 1
image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
conda_file: file:conda-envs/basic-env-ml.yml
Run:
az ml environment create --file ./PATH_TO_YAML_FILE/basic-env.yml
Stop instance:
az ml compute stop --name "INSTANCE_NAME"
List resource groups
az group list --output table
Delete resource group
az group delete --name GROUP_NAME
Delete workspace
# Uses the default workspace set with az configure; otherwise add --name "WS_NAME"
az ml workspace delete
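Because deleting a resource group removes everything in it, a scripted cleanup may benefit from a dry-run switch. This is a sketch: `delete_group` and the `DRY_RUN` variable are hypothetical, while `--yes` (skip the confirmation prompt) and `--no-wait` (return without waiting for completion) are options of `az group delete`.

```shell
# Hypothetical helper: delete a resource group, or only print the command
# when DRY_RUN is 1 (the default here, as a safety net).
delete_group() {
    group_name="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "Would run: az group delete --name $group_name --yes --no-wait"
    else
        az group delete --name "$group_name" --yes --no-wait
    fi
}

# Usage:
# delete_group "GROUP_NAME"        # prints the command only
# DRY_RUN=0
# delete_group "GROUP_NAME"        # actually deletes the group
```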
References:
How to manage Azure resource groups – Azure CLI | Microsoft Learn
Manage workspace assets with CLI (v2) - Training | Microsoft Learn