Mar 8, 2021

Guide to choosing a compute option in Azure Machine Learning

Azure Machine Learning provides four main compute options, each with a specific purpose. In this post we will go through each of them and see where they fit in your ML experiments.

The initial concepts of Machine Learning and AI came out back in the 1950s, but people could not explore their full power because most Machine Learning algorithms are computationally expensive. The computing power of early systems was not enough to run complex algorithms on large amounts of data. Once cloud computing and GPUs opened the arena for complex computations, machine learning and deep learning got a boost, and they are now widely used in real-world applications.

As a leading cloud-based machine learning platform, Azure Machine Learning addresses three main burdens in the Machine Learning model development and deployment process.

   1. Setting up a development environment, including resolving platform and software library dependencies.

   2. Setting up the computing environments (parallel processing libraries such as CUDA, etc.).

   3. Setting up and managing deployment environments.

Since these are key areas in the MLOps life cycle, selecting the appropriate compute resource for these tasks is vital to managing and deploying a successful machine learning experiment. Azure Machine Learning Studio centralises all the resources in an easily accessible portal, allowing the developer to select the most suitable resource for their needs.

Azure Machine Learning Studio offers four main compute categories, each for a specific purpose. We will go through each of them and see where they fit in our Machine Learning experiments.

Compute Instances

 

If you don't want to spend time setting up a development environment, or you have limited local compute resources for your Machine Learning experiments, the best option is to use cloud-based, pre-configured compute resources for your task.

There is no need to mess around with configuring CUDA environments and Python packages. You can go through a few steps in a wizard to create a pre-configured compute instance on Azure.

Creating a compute instance on Azure Machine Learning Studio

You can create a fully managed compute instance by clicking 'new' on the compute instances tab. The procedure is much like creating a virtual machine on Azure. The virtual machine we create here has most of the machine learning and deep learning libraries pre-installed. If you need GPU-based computing, you have to select an N-series VM on Azure. (Make sure the region you selected has the required VM families.)
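
If you prefer code over the wizard, the same step can be sketched with the Azure ML Python SDK. This is a minimal sketch, assuming SDK v1 (the azureml-core package) and an existing workspace. The helper names and the two VM sizes are assumptions chosen for illustration, so check which sizes your region actually offers.

```python
def instance_settings(needs_gpu: bool) -> dict:
    """Build keyword arguments for ComputeInstance.provisioning_configuration."""
    # Example sizes only (assumptions): an N-series size for GPU work,
    # a general-purpose D-series size otherwise.
    vm_size = "STANDARD_NC6" if needs_gpu else "STANDARD_DS3_V2"
    return {"vm_size": vm_size, "ssh_public_access": False}


def create_instance(workspace, name: str, needs_gpu: bool = False):
    """Provision a compute instance in the given workspace (needs azureml-core)."""
    # Imported lazily so the helper above works without the SDK installed.
    from azureml.core.compute import ComputeInstance, ComputeTarget

    config = ComputeInstance.provisioning_configuration(**instance_settings(needs_gpu))
    instance = ComputeTarget.create(workspace, name, config)
    instance.wait_for_completion(show_output=True)
    return instance
```

With a workspace loaded through Workspace.from_config(), calling create_instance(ws, "my-instance") would provision a CPU instance. The instance name here is hypothetical.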

To run experiments on the compute instance, you can use JupyterLab, Jupyter Notebook or RStudio. Azure recently enabled terminal access to the compute instance, which comes in handy when setting up environments.

Tips –

- GPU-based compute instances are costly. Create such instances only if you really need to do Deep Learning experiments.

- Think of a complex deep learning scenario where data preprocessing needs a large amount of CPU time while model training should be done on GPUs… You can use two compute instances: preprocessing happens on a CPU-based instance, while the expensive GPU-based instance is used only for model training. (These two processes can be connected using Azure Machine Learning pipelines.)

- Make sure to deallocate the resources when you are not using them. (Otherwise you will need a fat wallet in your pocket.)
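
To see why splitting the work across two instances pays off, here is a back-of-the-envelope sketch. The hourly rates are hypothetical numbers picked for easy arithmetic, not real Azure prices, and the function names are made up for this example.

```python
# Hypothetical hourly rates in USD, for illustration only (not real Azure prices).
CPU_RATE = 0.25
GPU_RATE = 3.00


def cost_single_gpu(preprocess_hours: float, train_hours: float) -> float:
    """Cost when both preprocessing and training run on the GPU instance."""
    return (preprocess_hours + train_hours) * GPU_RATE


def cost_split(preprocess_hours: float, train_hours: float) -> float:
    """Cost when preprocessing runs on a CPU instance and training on a GPU instance."""
    return preprocess_hours * CPU_RATE + train_hours * GPU_RATE
```

With these made-up rates, 10 hours of preprocessing plus 2 hours of training costs 36.0 on a single GPU instance and 8.5 when split, assuming each instance is stopped as soon as its stage finishes.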

Compute clusters

Compute clusters in Azure ML come to the rescue when we have complex computations to perform. You can run highly compute-intensive tasks such as automated ML experiments or hyperparameter tuning on these pre-configured clusters. The main advantage of using compute clusters for such experiments is the ability to run computations in parallel.

Creating a compute cluster on Azure Machine Learning Studio

The main difference when creating a compute cluster is that you select its maximum number of nodes. Choose it according to the computational intensity of the experiments you intend to run on the cluster.

Selecting low-priority virtual machines is cost-effective when running experiments on clusters, but the jobs may get preempted, which can delay the computations.
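
The node count and the low-priority choice map to a couple of parameters if you create the cluster from code. Below is a minimal sketch, assuming SDK v1 (the azureml-core package). The helper names and the VM size are assumptions for illustration.

```python
def cluster_settings(max_nodes: int, low_priority: bool = False) -> dict:
    """Build keyword arguments for AmlCompute.provisioning_configuration."""
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    return {
        "vm_size": "STANDARD_DS3_V2",  # example size, an assumption
        "min_nodes": 0,  # lets the cluster scale down to zero nodes when idle
        "max_nodes": max_nodes,
        "vm_priority": "lowpriority" if low_priority else "dedicated",
    }


def create_cluster(workspace, name: str, max_nodes: int, low_priority: bool = False):
    """Provision a compute cluster in the given workspace (needs azureml-core)."""
    # Imported lazily so the helper above works without the SDK installed.
    from azureml.core.compute import AmlCompute, ComputeTarget

    config = AmlCompute.provisioning_configuration(
        **cluster_settings(max_nodes, low_priority)
    )
    cluster = ComputeTarget.create(workspace, name, config)
    cluster.wait_for_completion(show_output=True)
    return cluster
```

Keeping min_nodes at 0 is a design choice in this sketch: an idle cluster then holds no nodes, at the price of a short wait while nodes are allocated for the next job.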

The underlying technology behind compute clusters is Docker containers: the experiment is containerized and pushed to the cluster for computation or training.

Inference clusters

Setting up an inference cluster on Azure Machine Learning Studio

The end result of the machine learning experiments you perform sits on inference clusters. Azure Machine Learning uses Azure Kubernetes Service (AKS) for managing large-scale endpoint deployments. You can adjust the number of nodes and the configuration of the cluster nodes according to the requirements of the production environment. (Normally it is recommended to use Azure Container Instances for dev-test and AKS for production web service endpoints.)
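
The dev-test versus production recommendation can be sketched as a small helper that picks the deployment configuration. This assumes SDK v1 (the azureml-core package). The stage names, the function names and the default resource values are assumptions made for this example.

```python
def service_kind(stage: str) -> str:
    """Map a deployment stage to a service type: ACI for dev-test, AKS for production."""
    if stage in ("dev", "test"):
        return "aci"
    if stage == "prod":
        return "aks"
    raise ValueError(f"unknown stage: {stage!r}")


def deployment_config(stage: str, cpu_cores: float = 1, memory_gb: float = 1):
    """Return a web service deployment configuration (needs azureml-core)."""
    # Imported lazily so service_kind works without the SDK installed.
    from azureml.core.webservice import AciWebservice, AksWebservice

    if service_kind(stage) == "aci":
        return AciWebservice.deploy_configuration(
            cpu_cores=cpu_cores, memory_gb=memory_gb
        )
    return AksWebservice.deploy_configuration(
        cpu_cores=cpu_cores, memory_gb=memory_gb, autoscale_enabled=True
    )
```

The returned configuration would then be passed to Model.deploy together with your model and inference configuration. For AKS you also need to pass the inference cluster as the deployment target.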

Attached compute

This is an interesting feature in Azure Machine Learning that lets you push your machine learning workloads to external computing environments. As of now, Azure Machine Learning supports:

   • Azure Databricks

   • Data Lake Analytics

   • HDInsight

   • Virtual machine

If you are going to attach an external VM to Azure Machine Learning, it must be running the Ubuntu operating system in order to connect as an attached compute.

Choosing the appropriate compute resources for your experiment is essential in managing the MLOps process. Make sure to plan ahead before selecting and spinning up the compute resources.
