Exploring developer tools for workspace interaction

Set up Azure ML resources with the CLI

To train a machine learning model in Azure Machine Learning, a data scientist first needs to prepare the required infrastructure. Using the Azure CLI together with the Azure Machine Learning extension, you can create a workspace along with supporting resources such as a compute instance.

Open Cloud Shell [>_] in the Azure portal, and then run the following commands. You can choose your own names for the resource group, the Azure Machine Learning workspace, and the other resources, as long as Azure accepts them.

Add the Azure ML CLI extension
```
 az extension add -n ml -y
```

Create a resource group

 az group create --name "rgDataScience" --location "eastus"

Create an Azure ML workspace

 az ml workspace create --name "mlw-nicolas-ws" -g "rgDataScience"

Provision a compute instance (dev environment)

Another key piece of the setup for training machine learning models is compute power. While models can technically be trained on a local machine, it’s usually more efficient and cost-effective to use cloud resources.

Within an Azure Machine Learning workspace, data scientists often need a virtual machine to run Jupyter notebooks and experiment with code. For this development work, a compute instance provides the best option.
```
 az ml compute create --name "phantom" --size STANDARD_DS11_V2 --type ComputeInstance -w mlw-nicolas-ws -g rgDataScience
```
Provision a compute cluster (for training jobs)

While a compute instance works well for development tasks, a compute cluster is more appropriate for training machine learning models. The cluster automatically scales up from zero nodes when a job is submitted, runs the job, and then scales back down to zero once it’s finished, helping to reduce costs.
```
 az ml compute create --name "aml-cluster" --size STANDARD_DS11_V2 --max-instances 2 --type AmlCompute -w mlw-nicolas-ws -g rgDataScience
```

These commands create the resource group, the workspace, and the two compute resources.

In the Azure portal, open mlw-nicolas-ws (or whatever you named your machine learning workspace), go to the Overview page, and select Launch studio.

Within the Azure Machine Learning studio, navigate to the Compute page and verify that the compute instance and cluster you created in the previous section exist. The compute instance should be running, and the cluster should be in the Succeeded state with 0 nodes running.
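The same check can also be scripted. The sketch below assumes an MLClient object named ml_client, like the one created in the SDK section of this article, and relies only on its compute.get call. The helper name summarize_compute is made up for illustration, and the fields are read with getattr because not every compute type exposes every one of them.

```python
def summarize_compute(ml_client, names):
    """Return one summary dict per compute target name.

    `ml_client` is expected to be an azure.ai.ml.MLClient; any object that
    exposes `compute.get(name)` works, which keeps the sketch testable offline.
    """
    summaries = []
    for name in names:
        target = ml_client.compute.get(name)
        summaries.append({
            "name": target.name,
            "type": target.type,
            # Not every compute type exposes every field, so default to None.
            "provisioning_state": getattr(target, "provisioning_state", None),
            "state": getattr(target, "state", None),
            "max_instances": getattr(target, "max_instances", None),
        })
    return summaries


# Usage in a notebook where `ml_client` already exists:
# for row in summarize_compute(ml_client, ["phantom", "aml-cluster"]):
#     print(row)
```

For the resources created above, you would expect the compute instance to report a running state and the cluster to report a maximum of 2 instances.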

Use the Python SDK to train a model

After confirming that the required compute resources are available, the next step is to use the Python SDK to run a training script. You’ll set up and run the SDK on the compute instance, while the actual model training will be executed on the compute cluster.

In your compute instance, there are a number of options in the Applications field. Select the Terminal application to launch the terminal (you may need to click the ellipsis to expand the selection).

Install the Python SDK on the compute instance by running the following commands in the terminal:

 pip uninstall azure-ai-ml
 pip install azure-ai-ml

Run the following command to clone a Git repository containing notebooks, data, and other files to your workspace:

 git clone https://github.com/MicrosoftLearning/mslearn-azure-ml.git azure-ml-labs

When the command has completed, in the Files pane, select ↻ to refresh the view and verify that a new Users/your-user-name/azure-ml-labs folder has been created.

Open the Labs/02/Run training script.ipynb notebook. If a notification appears asking you to authenticate, select Authenticate and follow the necessary steps. Verify that the notebook uses the Python 3.10 - AzureML kernel, shown in the upper-right corner of the notebook environment. Each kernel has its own image with its own set of packages pre-installed.

Once you have authenticated successfully, the notebook shows a green status indicator.

Run all cells in the notebook.

A new job will be created in the Azure Machine Learning workspace. The job tracks the inputs defined in the job configuration, the code used, and the outputs like metrics to evaluate the model.

from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient

try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

This code is setting up authentication. It first tries automatic login using DefaultAzureCredential, which can pick up credentials from sources like the Azure CLI or environment variables. If that doesn’t work, it falls back to InteractiveBrowserCredential, which asks you to sign in manually through a browser.

# Get a handle to workspace
ml_client = MLClient.from_config(credential=credential)

This code is connecting to your Azure Machine Learning workspace. It uses the authenticated credential and the settings in your local config.json file. The MLClient.from_config method reads that file and creates a client object (ml_client) you can use to manage resources like datasets, compute, jobs, and models inside the workspace.
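For orientation, here is a minimal sketch of what a config.json file contains and how its fields can be checked before calling MLClient.from_config. The three keys are the ones the workspace configuration uses; the subscription ID is a placeholder, and the helper load_workspace_config is a hypothetical name introduced only for this example.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values: substitute your own subscription ID and resource names.
example_config = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "rgDataScience",
    "workspace_name": "mlw-nicolas-ws",
}


def load_workspace_config(path):
    """Read config.json and check that the three workspace fields are present."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"subscription_id", "resource_group", "workspace_name"} - config.keys()
    if missing:
        raise KeyError(f"config.json is missing: {sorted(missing)}")
    return config


# Write the example to a temporary folder and read it back.
with tempfile.TemporaryDirectory() as folder:
    config_path = Path(folder) / "config.json"
    config_path.write_text(json.dumps(example_config, indent=2), encoding="utf-8")
    loaded = load_workspace_config(config_path)

print(loaded["workspace_name"])
```

On a compute instance you normally do not need to write this file yourself; the sketch only shows which settings the client picks up.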

To train a model, you'll first create the diabetes-training.py script in the src folder. The script uses the diabetes.csv file in the same folder as the training data.

%%writefile src/diabetes-training.py
# import libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve

# load the diabetes dataset
print("Loading Data...")
diabetes = pd.read_csv('diabetes.csv')

# separate features and labels
X, y = diabetes[['Pregnancies','PlasmaGlucose','DiastolicBloodPressure','TricepsThickness','SerumInsulin','BMI','DiabetesPedigree','Age']].values, diabetes['Diabetic'].values

# split data into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)

# set regularization hyperparameter
reg = 0.01

# train a logistic regression model
print('Training a logistic regression model with regularization rate of', reg)
model = LogisticRegression(C=1/reg, solver="liblinear").fit(X_train, y_train)

# calculate accuracy
y_hat = model.predict(X_test)
acc = np.average(y_hat == y_test)
print('Accuracy:', acc)

# calculate AUC
y_scores = model.predict_proba(X_test)
auc = roc_auc_score(y_test,y_scores[:,1])
print('AUC: ' + str(auc))

This cell writes the training script to src/diabetes-training.py; the script itself runs later, when the job is submitted. The script first loads the diabetes dataset and separates the input features (such as pregnancies, glucose, BMI, and age) from the target label (whether the patient is diabetic). The data is then split into a training set and a test set. A logistic regression model is trained with a regularization rate of 0.01, which scikit-learn receives as the inverse regularization strength C = 1/0.01 = 100. After training, the script evaluates the model in two ways: by checking its accuracy (how often it predicts correctly) and by calculating the AUC score (how well it distinguishes between diabetic and non-diabetic patients).
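To make the two metrics concrete, the following sketch computes them by hand on a handful of made-up labels and scores. It is not how scikit-learn implements roc_auc_score; it uses the equivalent pairwise definition of AUC (the share of positive/negative pairs in which the positive example gets the higher score) and assumes a 0.5 threshold for turning scores into predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("Accuracy:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC only depends on how the scores rank the patients, which is why the script reports both.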

The dataset is divided into input features (X) such as pregnancies, glucose, BMI, and age, and the output label (y), which indicates if a patient has diabetes. Using train_test_split, 70% of the data is used to train the model and 30% is kept aside for testing, ensuring the model is evaluated on data it hasn’t seen before. Logistic regression is chosen because it’s a simple and effective method for binary classification problems like predicting diabetes.
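Conceptually, the 70/30 split amounts to shuffling the rows with a fixed seed and cutting the list in two. The sketch below shows that idea in plain Python; split_rows is a hypothetical helper, not scikit-learn's implementation, and it will not reproduce the exact rows that train_test_split selects.

```python
import random


def split_rows(rows, test_size=0.30, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 patient records
train, test = split_rows(rows)
print(len(train), len(test))
```

Fixing the seed plays the same role as random_state=0 in the training script: rerunning the code yields the same split, so results stay comparable between runs.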

Run the cell below to submit the job that trains a classification model to predict diabetes.

from azure.ai.ml import command

# configure job
job = command(
    code="./src",
    command="python diabetes-training.py",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="aml-cluster",
    display_name="diabetes-pythonv2-train",
    experiment_name="diabetes-training"
)

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

This code defines and submits a training job to Azure Machine Learning. The command function specifies the script to run (diabetes-training.py in the src folder), the environment to use (a prebuilt scikit-learn environment), and the compute target (aml-cluster). It also sets a display name and experiment name for tracking. The job is then submitted with ml_client.create_or_update(job), and a link is printed so you can monitor progress in Azure ML Studio.
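If you prefer to follow the job from the notebook rather than the studio, one option is to poll its status. The sketch below assumes ml_client and returned_job from the cell above and relies only on ml_client.jobs.get(name).status. The helper wait_for_job, the polling interval, and the set of terminal states are assumptions made for this example.

```python
import time

# Assumed set of terminal states; check the SDK documentation for the full list.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(ml_client, job_name, poll_seconds=30, max_polls=120):
    """Poll a job until it reaches a terminal state and return that state.

    `ml_client` is expected to be an azure.ai.ml.MLClient; the sketch only
    relies on `ml_client.jobs.get(name).status`.
    """
    for _ in range(max_polls):
        status = ml_client.jobs.get(job_name).status
        print("Job status:", status)
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")


# Usage in the notebook, after the job has been submitted:
# final_state = wait_for_job(ml_client, returned_job.name)
```

The SDK also provides ml_client.jobs.stream(returned_job.name), which blocks and prints the job logs until the job finishes; polling is shown here because it makes the status transitions visible.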

Review your job history in the Azure Machine Learning studio

Either select the job URL provided as output in the notebook, or navigate to the Jobs page in the Azure Machine Learning studio.
A new experiment is listed named diabetes-training. Select the latest job diabetes-pythonv2-train.
Review the job’s Properties. Note the job Status:
- Queued: The job is waiting for compute to become available.
- Preparing: The compute cluster is resizing or the environment is being installed on the compute target.
- Running: The training script is being executed.
- Finalizing: The training script ran and the job is being updated with all final information.
- Completed: The job successfully completed and is terminated.
- Failed: The job failed and is terminated.
Under Outputs + logs, you’ll find the output of the script in user_logs/std_log.txt. Outputs from print statements in the script will show here. If there’s an error because of a problem with your script, you’ll find the error message here too.

Under Code, you’ll find the folder you specified in the job configuration. This folder includes the training script and dataset.

The training job ran successfully on the compute cluster. Using the provided dataset, a logistic regression model was trained with a regularization rate of 0.01. The results showed an accuracy of 0.774 and an AUC of 0.8483. In plain terms, the model classified about 77% of the test patients correctly, and if you pick one diabetic and one non-diabetic patient at random, the model gives the diabetic patient the higher score about 85% of the time. These outcomes confirm that the entire workflow worked as intended, from setting up compute resources, to running the training script, to reviewing the results in Azure Machine Learning studio.

The dataset used here contains health records of patients, where each row represents one person. For each patient, details such as number of pregnancies, blood sugar level, blood pressure, triceps skin thickness, insulin level, body mass index (BMI), and age are recorded. There is also a family history score (DiabetesPedigree) that reflects how strongly diabetes runs in the family. The last column, Diabetic, shows whether the patient has diabetes (1) or does not have diabetes (0). This makes the data useful for building a model that can predict the likelihood of diabetes based on a person’s health information.

Delete Azure resources

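When you finish exploring Azure Machine Learning, you should delete the resources you’ve created to avoid unnecessary Azure costs. The simplest way is to delete the resource group (rgDataScience in this walkthrough), which removes the workspace and the compute resources inside it.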
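If you want to script the cleanup, one possibility is to call the Azure CLI from Python. The sketch below only builds and prints the az group delete command; the helper name delete_group_command is made up for this example, and the line that actually runs the command is left commented out because deletion cannot be undone.

```python
import subprocess


def delete_group_command(resource_group, wait=False):
    """Build the `az group delete` command as an argument list."""
    command = ["az", "group", "delete", "--name", resource_group, "--yes"]
    if not wait:
        command.append("--no-wait")  # return immediately; deletion continues in Azure
    return command


command = delete_group_command("rgDataScience")
print(" ".join(command))

# Uncomment to actually delete the resource group and everything in it:
# subprocess.run(command, check=True)
```

The --yes flag skips the confirmation prompt, so double-check the resource group name before running the command for real.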

Conclusion

In this article, we explored how to use developer tools in Azure Machine Learning to set up the necessary infrastructure, authenticate with the Python SDK, and run a training script on the cloud. We created both a compute instance for development and a compute cluster for scalable training, then used them to train a logistic regression model on the diabetes dataset. The model reached an accuracy of 0.774 and an AUC of 0.8483, and the exercise showed how Azure ML ties the workflow together, from managing resources and running experiments to monitoring outputs in the studio. This process provides a strong foundation for experimenting with more complex models and larger datasets in future projects.

Reference

https://microsoftlearning.github.io/mslearn-azure-ml/Instructions/02-Explore-developer-tools.html
