Advent of 2022, Day 22 – Batch endpoints for batch scoring

This post is part of the series of Azure Machine Learning posts.

Batch endpoints are a great and simple way to run inference over large volumes of data. They simplify the process of hosting your models for batch scoring.

First, we import the needed Python libraries:

from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import (
    BatchEndpoint,
    BatchDeployment,
    Model,
    Environment,
    BatchRetrySettings,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
import random
import string

Once the packages are imported and the workspace is configured (for more details, follow the notebook on GitHub), we need to pick a unique endpoint name and define the endpoint configuration:

#  Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "mnist-batch-" + endpoint_suffix

# endpoint configuration
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A batch endpoint for scoring images from the MNIST dataset.",
    tags={"type": "deep-learning"},
)
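The suffix logic above can be wrapped into a small helper, which makes it easier to reuse and to check the result. This is only a sketch: the function name `make_endpoint_name` and its parameters are illustrative assumptions, not part of the Azure ML SDK.

```python
import random
import string


def make_endpoint_name(prefix: str = "mnist-batch-", suffix_length: int = 5) -> str:
    """Return the prefix plus a random lowercase/digit suffix (hypothetical helper)."""
    allowed_chars = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(allowed_chars) for _ in range(suffix_length))
    return prefix + suffix


endpoint_name = make_endpoint_name()
print(endpoint_name)  # the suffix differs on every run
```

The random suffix only lowers the chance of a name clash; it does not guarantee uniqueness.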

Following this, we create the endpoint:

# creation
ml_client.begin_create_or_update(endpoint).result()

Next comes the model registration. Make sure to check the data/Day22 folder to get the model files.

model_name = "mnist-classification-torch"
model_local_path = "./Day22/model/"

if not any(filter(lambda m: m.name == model_name, ml_client.models.list())):
    print(f"Model {model_name} is not registered. Creating...")
    model = ml_client.models.create_or_update(
        Model(name=model_name, path=model_local_path, type=AssetTypes.CUSTOM_MODEL)
    )

# Let's get a reference to the model:
model = ml_client.models.get(name=model_name, label="latest")

Fig 1: Result of model registration
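The "create only if missing" check used for the model (and later for the compute) can be written as one reusable function. The sketch below is an assumption for illustration: `is_registered` is a hypothetical helper, and stand-in objects are used in place of a real `ml_client.models.list()` call so that it runs without a workspace.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def is_registered(name: str, assets: Iterable[Any]) -> bool:
    """Return True if any asset in the iterable has a matching .name attribute."""
    return any(asset.name == name for asset in assets)


# Stand-in objects instead of the result of ml_client.models.list()
fake_models = [SimpleNamespace(name="mnist-classification-torch")]

print(is_registered("mnist-classification-torch", fake_models))  # True
print(is_registered("some-other-model", fake_models))            # False
```

Against a real workspace, the same helper would be called as `is_registered(model_name, ml_client.models.list())`.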

We also need compute for the deployment to run on. The compute script should look familiar:

from azure.ai.ml.entities import AmlCompute

compute_name = "cpu-cluster"

if not any(filter(lambda m: m.name == compute_name, ml_client.compute.list())):
    print(f"Compute {compute_name} is not created. Creating...")
    compute_cluster = AmlCompute(
        name=compute_name,
        description="CPU cluster compute",
        min_instances=0,
        max_instances=2,  # must be at least the deployment's instance_count (2)
    )
    ml_client.compute.begin_create_or_update(compute_cluster).result()

With the compute in place, we define the environment:

env = Environment(
    conda_file="./Day22/environment/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest",
)

And finally, we configure the deployment:

deployment = BatchDeployment(
    name="mnist-torch-dpl",
    description="A deployment using Torch to solve the MNIST classification dataset.",
    endpoint_name=endpoint_name,
    model=model,
    code_configuration=CodeConfiguration(
        code="./Day22/code/", scoring_script="batch_driver.py"
    ),
    environment=env,
    compute=compute_name,
    instance_count=2,
    max_concurrency_per_instance=2,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=30),
    logging_level="info",
)
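To get a feel for what `mini_batch_size`, `instance_count` and `max_concurrency_per_instance` mean together, a little arithmetic helps. The sketch below is a rough estimate under stated assumptions: `batch_plan` is a hypothetical helper, the figure of 1000 input files is made up for illustration, and the number of "waves" assumes every mini-batch takes about the same time.

```python
import math


def batch_plan(n_files, mini_batch_size=10, instance_count=2,
               max_concurrency_per_instance=2):
    """Estimate mini-batches, parallel workers and processing waves."""
    # Each call to the scoring script receives up to mini_batch_size files
    mini_batches = math.ceil(n_files / mini_batch_size)
    # Every instance runs this many scoring processes side by side
    workers = instance_count * max_concurrency_per_instance
    # Rough number of rounds needed if all mini-batches take equal time
    waves = math.ceil(mini_batches / workers)
    return mini_batches, workers, waves


# Hypothetical input of 1000 files with the deployment settings above
print(batch_plan(1000))  # (100, 4, 25)
```

With the settings used in this deployment, four scoring processes work in parallel, each receiving ten files per call.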

And create the deployment.
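The creation call is not listed here. Assuming it mirrors the endpoint creation shown earlier, a minimal sketch could look like this (the wrapper function `create_deployment` is hypothetical and only there to keep the example self-contained):

```python
def create_deployment(ml_client, deployment):
    """Submit the deployment and block until the long-running operation ends.

    Assumes ml_client is an azure.ai.ml.MLClient and deployment is the
    BatchDeployment configured above.
    """
    poller = ml_client.begin_create_or_update(deployment)
    return poller.result()
```

In the notebook this would be called as `create_deployment(ml_client, deployment)`.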

Fig 2. Creation of deployment

For the last part, you need to invoke the endpoint, which starts the batch scoring job.

# Folder with the MNIST sample images to score
job_input = Input(
    type="uri_folder",
    path="https://pipelinedata.blob.core.windows.net/sampledata/mnist",
)

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint_name,
    input=job_input,
)
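Invoking the endpoint returns immediately, so you will usually want to wait for the job and fetch its output. The sketch below is one possible way, not the notebook's own code: `wait_and_download` is a hypothetical helper, and the output name `"score"` is an assumption that may differ in your setup. Depending on the SDK version, the predictions may also sit on a child job rather than on the job returned by `invoke`.

```python
def wait_and_download(ml_client, job_name, download_path=".", output_name="score"):
    """Stream the job logs until it finishes, then download one named output.

    Assumes ml_client is an azure.ai.ml.MLClient; output_name is an assumption.
    """
    # Blocks and prints the logs until the job reaches a terminal state
    ml_client.jobs.stream(job_name)
    # Downloads the named output to a local folder
    ml_client.jobs.download(
        name=job_name,
        download_path=download_path,
        output_name=output_name,
    )
```

For the job started above, the call would be `wait_and_download(ml_client, job.name)`.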

Once the batch job finishes and you have the predictions, you can run any further analysis on them.

The complete set of code, documents, notebooks, and all of the materials will be available at the GitHub repository: https://github.com/tomaztk/Azure-Machine-Learning

Happy Advent of 2022!
