Making data available in Azure Machine Learning
I hold dual bachelor's degrees (BSc) in Control Engineering and Computer Engineering and an MSc in Artificial Intelligence. I have worked as a .NET full-stack/backend developer for almost 5 years. I am passionate about both software and AI engineering and currently specialize in Azure Cloud.
In enterprise settings, instead of keeping data only on local machines, it is often more efficient to store it in a centralized location that allows shared access for multiple data scientists and machine learning engineers.
Setting up Azure Resources
In the Cloud Shell terminal, enter the following commands to remove any previous copy of the lab folder and clone this repo:
rm -r azure-ml-labs -f
git clone https://github.com/MicrosoftLearning/mslearn-azure-ml.git azure-ml-labs
After the repo has been cloned, enter the following commands to change to the folder for this lab and run the setup.sh script it contains:
cd azure-ml-labs/Labs/03
./setup.sh
Setting up resources will take a couple of minutes.
Explore the default datastores
When you create an Azure Machine Learning workspace, a Storage Account is automatically created and connected to your workspace. You’ll explore how the Storage Account is connected.
In the Azure portal, navigate to the new resource group named rg-dp100-… > select the Storage Account > Data storage (left menu) > Containers.

Select + Add container and name the new container training-data.
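The automatically connected Storage Account shows up in the workspace as a set of default datastores. As a rough sketch, they could also be listed from Python. This assumes the azure-ai-ml and azure-identity packages are installed (the SDK install comes later in this walkthrough) and that you pass in your own subscription ID and resource group name. The function names summarize_datastores and list_workspace_datastores are made up for illustration.

```python
def summarize_datastores(datastores):
    """Return sorted (name, type) pairs for an iterable of datastore objects."""
    return sorted((ds.name, str(ds.type)) for ds in datastores)


def list_workspace_datastores(subscription_id, resource_group, workspace_name):
    """Connect to a workspace and summarize the datastores registered in it."""
    # Imported lazily so summarize_datastores works without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
    return summarize_datastores(ml_client.datastores.list())
```

Calling list_workspace_datastores with your subscription ID, your rg-dp100-… resource group name, and the workspace name mlw-dp100-labs should return one (name, type) pair per datastore, including the ones created together with the workspace.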
Copy the access key
To create a datastore in the Azure Machine Learning workspace, you need to provide credentials. An easy way to give the workspace access to Blob storage is to use the account key.
In the Storage Account, go to Security + networking > Access keys, select Show for the Key field under key1, and copy the key value.
Copy the name of your storage account from the top of the page. The name should start with mlwdp100storage… You’ll need to paste this value into the notebook later too.
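To show where the two copied values end up, here is a minimal sketch of registering the training-data container as a datastore with the Python SDK. It is an illustration under assumptions, not necessarily the code in the lab notebook: the datastore name blob_training_data, the description text, and both function names are hypothetical, and ml_client is assumed to be an already authenticated azure.ai.ml.MLClient.

```python
def blob_datastore_settings(account_name, account_key, container_name="training-data"):
    """Bundle the values copied from the portal into one settings dict."""
    if not account_name or not account_key:
        raise ValueError("Both the storage account name and the account key are required.")
    return {
        "name": "blob_training_data",  # hypothetical datastore name
        "description": "Blob storage for training data",  # hypothetical text
        "account_name": account_name,
        "container_name": container_name,
        "account_key": account_key,
    }


def create_blob_datastore(ml_client, settings):
    """Register the Blob container as a datastore in the workspace."""
    # Imported lazily so the settings helper works without the SDK installed.
    from azure.ai.ml.entities import AccountKeyConfiguration, AzureBlobDatastore

    store = AzureBlobDatastore(
        name=settings["name"],
        description=settings["description"],
        account_name=settings["account_name"],
        container_name=settings["container_name"],
        credentials=AccountKeyConfiguration(account_key=settings["account_key"]),
    )
    return ml_client.datastores.create_or_update(store)
```

Because the account key grants full access to the Storage Account, it is safer to read it from an environment variable or a secret store than to leave it pasted in a saved notebook.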

Clone the lab materials
To create a datastore and data assets with the Python SDK, you’ll need to clone the lab materials into the workspace.
In the Azure portal, navigate to the mlw-dp100-labs workspace > Overview > Launch studio. In the studio, open the Compute page and verify that the compute instance and cluster you created in the previous section exist.
In the Compute instances tab, find your compute instance, and select the Terminal application.
In the terminal, install the Python SDK on the compute instance by running the following commands:
pip uninstall azure-ai-ml
pip install azure-ai-ml
pip install mltable
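To confirm that the install worked, the package versions can be checked from Python, for example in a notebook cell or a python session in the same terminal. This is a small sketch that uses only the standard library; the function name installed_versions is made up for illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for package in packages:
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None
    return versions


# Report the two packages installed by the commands above.
for package, version in installed_versions(["azure-ai-ml", "mltable"]).items():
    print(f"{package}: {version or 'not installed'}")
```

If either package is reported as not installed, rerun the pip commands in the environment that the notebook kernel uses.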

Run the following command to clone a Git repository containing notebooks, data, and other files to your workspace:
git clone https://github.com/MicrosoftLearning/mslearn-azure-ml.git azure-ml-labs
When the command has completed, in the Files pane, click ↻ to refresh the view and verify that a new Users/your-user-name/azure-ml-labs folder has been created.

Create a datastore and data assets
The code to create a datastore and data assets with the Python SDK is provided in a notebook.
Open the Labs/03/Work with data.ipynb notebook. If a notification appears asking you to authenticate, select Authenticate and follow the necessary steps.
Verify that the notebook uses the Python 3.10 - AzureML kernel.
Run all cells in the notebook.
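For orientation, the following is a minimal sketch of how a data asset can be registered with the Python SDK. It is written under assumptions and is not a copy of the notebook: the datastore name and path in the usage note are hypothetical, both function names are made up for illustration, and ml_client is assumed to be an authenticated azure.ai.ml.MLClient.

```python
def datastore_uri(datastore_name, relative_path):
    """Build an azureml:// URI that points at a path inside a datastore."""
    return f"azureml://datastores/{datastore_name}/paths/{relative_path.lstrip('/')}"


def register_data_asset(ml_client, name, path, asset_type="uri_folder", description=""):
    """Register a data asset; asset_type is 'uri_file', 'uri_folder', or 'mltable'."""
    # Imported lazily so datastore_uri works without the SDK installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data

    allowed = {AssetTypes.URI_FILE, AssetTypes.URI_FOLDER, AssetTypes.MLTABLE}
    if asset_type not in allowed:
        raise ValueError(f"Unsupported asset type: {asset_type}")

    asset = Data(path=path, type=asset_type, name=name, description=description)
    return ml_client.data.create_or_update(asset)
```

For example, register_data_asset(ml_client, "my-folder-asset", datastore_uri("blob_training_data", "data-asset-path/")) would register a folder in a datastore named blob_training_data, where the asset name, datastore name, and path are all placeholders.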


Delete Azure resources
When you finish exploring Azure Machine Learning, you should delete the resources you’ve created to avoid unnecessary Azure costs. To do so, delete the rg-dp100-… resource group.
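The resource group can be deleted in the Azure portal. As an alternative, here is a hedged sketch of doing it from Python. It assumes the azure-identity and azure-mgmt-resource packages are installed, which this walkthrough does not otherwise require, and both function names are made up for illustration. Deleting a resource group is irreversible, so check the name before running it.

```python
def find_lab_resource_groups(group_names, prefix="rg-dp100-"):
    """Return the resource group names that start with the lab's prefix."""
    return [name for name in group_names if name.startswith(prefix)]


def delete_resource_group(subscription_id, group_name):
    """Delete one resource group and block until the deletion has finished."""
    # Assumption: azure-identity and azure-mgmt-resource are installed.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.resource import ResourceManagementClient

    client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)
    poller = client.resource_groups.begin_delete(group_name)
    poller.wait()  # deletion is a long-running operation
```

The first helper only filters names and deletes nothing, so it can be used to double-check which group would be affected before passing one explicit name to delete_resource_group.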
Reference
https://microsoftlearning.github.io/mslearn-azure-ml/Instructions/03-Make-data-available.html

