ml-model-development - Softsolution Sahand

Machine Learning Model development

In my previous post: Azure Machine Learning workspace we have set up a Workspace, Compute and Notebooks.

In this post we will learn how to develop a training script with a notebook on an Azure Machine Learning cloud workstation. This tutorial covers the basics you need to get started:

Set up and configuring the cloud workstation. Your cloud workstation is powered by an Azure Machine Learning compute instance, which is pre-configured with environments to support your various model development needs.
Use cloud-based development environments.
Use MLflow to track your model metrics, all from within a notebook.

Prerequisites

To use Azure Machine Learning, you’ll first need a workspace. If you don’t have one, complete Create resources you need to get started to create a workspace and learn more

Start with Notebooks

The Notebooks section in your workspace is a good place to start learning about Azure Machine Learning and its capabilities. Here you can connect to compute resources, work with a terminal, and edit and run Jupyter Notebooks and scripts.

Sign in to Azure Machine Learning studio.
Select your workspace if it isn’t already open.
On the left navigation, select Notebooks.
If you don’t have a compute instance, you’ll see Create compute in the middle of the screen. Select Create compute and fill out the form. You can use all the defaults. (If you already have a compute instance, you’ll instead see Terminal in that spot. You’ll use Terminal later in this tutorial.) it is looks like as following figure:

Set up a new environment for prototyping (optional)

In order for your script to run, you need to be working in an environment configured with the dependencies and libraries the code expects. This section helps you create an environment tailored to your code. To create the new Jupyter kernel your notebook connects to, you’ll use a YAML file that defines the dependencies.

Upload a file.Files you upload are stored in an Azure file share, and these files are mounted to each compute instance and shared within the workspace.
1. Download this conda environment file, workstation_env.yml to your computer by using the Download raw file button at the top right.
2. Select Add files, then select Upload files to upload it to your workspace.

Select Browse and select file(s).
Select workstation_env.yml file you downloaded.
Select Upload.

You’ll see the workstation_env.yml file under your username folder in the Files tab. Select this file to preview it, and see what dependencies it specifies. You’ll see contents like this:

name: workstation_env
dependencies:
  - python=3.8
  - pip=21.2.4
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - pandas>=1.1,<1.2
  - pip:
    - mlflow==2.4.1 
    - azureml-mlflow==1.51.0
    - psutil>=5.8,<5.9
    - ipykernel~=6.0
    - matplotlib

Create a kernel.

Now use the Azure Machine Learning terminal to create a new Jupyter kernel, based on the workstation_env.yml file.

Select Terminal to open a terminal window. You can also open the terminal from the left command bar:

If the compute instance is stopped, select Start compute and wait until it’s running.

Once the compute is running, you see a welcome message in the terminal, and you can start typing commands.
View your current conda environments. The active environment is marked with a *.

conda env list

then you see the following:

(azureml_py38) azureuser@mlcompute1:~/cloudfiles/code/Users/mehrdad.zandi$ conda env list
# conda environments:
#
base                     /anaconda
azureml_py310_sdkv2      /anaconda/envs/azureml_py310_sdkv2
azureml_py38          *  /anaconda/envs/azureml_py38
azureml_py38_PT_TF       /anaconda/envs/azureml_py38_PT_TF
jupyter_env              /anaconda/envs/jupyter_env

(azureml_py38) azureuser@mlcompute1:~/cloudfiles/code/Users/mehrdad.zandi$

3. Create the environment based on the conda file provided. It takes a few minutes to build this environment.

conda env create -f workstation_env.yml

It takes a while and prompt to you:

done
#
# To activate this environment, use
#
#     $ conda activate workstation_env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

4. Activate the new environment.

conda activate workstation_env

5. Validate the correct environment is active, again looking for the environment marked with a *.

conda env list

6. Create a new Jupyter kernel based on your active environment.

python -m ipykernel install --user --name workstation_env --display-name "mehzan Workstation Env"

7. Close the terminal window by typing exit.

We now have a new kernel. Next you’ll open a notebook and use this kernel.

Create a notebook

Select Add files, and choose Create new file:

2.Name your new notebook develop-mehzan.ipynb (or enter your preferred name).

3. If the compute instance is stopped, select Start compute and wait until it’s running.

4. You’ll see the notebook is connected to the default kernel in the top right Switch to use the mehzan Workstation Env kernel if you created the kernel.

Develop a training script

We develop a Python training script that predicts credit card default payments, using the prepared test and training datasets from the UCI dataset.

This code uses sklearn for training and MLflow for logging the metrics.

Now press to the new Notebook: develop-mehzan.ipynp and copy the following code to the terminal ( on the above figure which is showing with line: 1)

Start with code that imports the packages and libraries you’ll use in the training script. (copy the following codes to

import os
import argparse
import pandas as pd
import mlflow
import mlflow.sklearn
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

2. Next, load and process the data for this experiment. In this tutorial, you read the data from a file on the internet.

# load the data
credit_df = pd.read_csv(
    "https://azuremlexamples.blob.core.windows.net/datasets/credit_card/default_of_credit_card_clients.csv",
    header=1,
    index_col=0,
)

train_df, test_df = train_test_split(
    credit_df,
    test_size=0.25,
)

3. Get the data ready for training:

# Extracting the label column
y_train = train_df.pop("default payment next month")

# convert the dataframe values to array
X_train = train_df.values

# Extracting the label column
y_test = test_df.pop("default payment next month")

# convert the dataframe values to array
X_test = test_df.values

4. Add code to start autologging with MLflow, so that you can track the metrics and results. With the iterative nature of model development, MLflow helps you log model parameters and results. Refer back to those runs to compare and understand how your model performs. The logs also provide context for when you’re ready to move from the development phase to the training phase of your workflows within Azure Machine Learning.

# set name for logging
mlflow.set_experiment("Develop on cloud tutorial")
# enable autologging with MLflow
mlflow.sklearn.autolog()

5. Train a model.

# Train Gradient Boosting Classifier
print(f"Training with data of shape {X_train.shape}")

mlflow.start_run()
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

print(classification_report(y_test, y_pred))
# Stop logging for this model
mlflow.end_run()

Note: You can ignore the mlflow warnings. You’ll still get all the results you need tracked.

Press to the arrow on upper left, shown in the following figure:

When you run the whole code you can see the following output:

Training with data of shape (22500, 23)
              precision    recall  f1-score   support

           0       0.85      0.95      0.90      5864
           1       0.68      0.39      0.50      1636

    accuracy                           0.83      7500
   macro avg       0.77      0.67      0.70      7500
weighted avg       0.81      0.83      0.81      7500

2023/11/16 14:10:28 WARNING mlflow.utils.autologging_utils: MLflow autologging encountered a warning: "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils."

Iterate

Now that you have model results, you may want to change something and try again. For example, try a different classifier technique:

# Train  AdaBoost Classifier
from sklearn.ensemble import AdaBoostClassifier

print(f"Training with data of shape {X_train.shape}")

mlflow.start_run()
ada = AdaBoostClassifier()

ada.fit(X_train, y_train)

y_pred = ada.predict(X_test)

print(classification_report(y_test, y_pred))
# Stop logging for this model
mlflow.end_run()

When you run the code you can see the following output:

Training with data of shape (22500, 23)
              precision    recall  f1-score   support

           0       0.83      0.96      0.89      5809
           1       0.69      0.34      0.45      1691

    accuracy                           0.82      7500
   macro avg       0.76      0.65      0.67      7500
weighted avg       0.80      0.82      0.79      7500

Examine results

Now that you’ve tried two different models, use the results tracked by MLFfow to decide which model is better. You can reference metrics like accuracy, or other indicators that matter most for your scenarios. You can dive into these results in more detail by looking at the jobs created by MLflow.

On the left navigation, select Jobs.

2. Select the link for Develop on cloud tutorial.

3. There are two different jobs shown, one for each of the models you tried. These names are autogenerated. As you hover over a name, use the pencil tool next to the name if you want to rename it.

4. Select the link for the first job. The name appears at the top. You can also rename it here with the pencil tool.

5. The page shows details of the job, such as properties, outputs, tags, and parameters. Under Tags, you’ll see the estimator_name, which describes the type of model.

6. Select the Metrics tab to view the metrics that were logged by MLflow. (Expect your results to differ, as you have a different training set.)

7. Select the Images tab to view the images generated by MLflow.

Stop compute instance

If you’re not going to use it now, stop the compute instance:

In the studio, on the left, select Compute.
In the top tabs, select Compute instances
Select the compute instance in the list.
On the top toolbar, select Stop.

Delete all resources

Important : The resources that you created can be used as prerequisites to other Azure Machine Learning tutorials and how-to articles.

If you don’t plan to use any of the resources that you created, delete them so you don’t incur any charges:

In the Azure portal, select Resource groups on the far left.
From the list, select the resource group that you created.
Select Delete resource group.
Enter the resource group name. Then select Delete.

Conclusion

In this post I have described Machine Learning Model development, setting up a new environment, creating notebook, develop a training script and Examining results of iterations.

Next Post is Machine Learning and Python language

This post is part of “Machine-learning-Step by step”.

Back to home page

useful links to train:

Train a model in Azure Machine Learning

Deploy a model as an online endpoint