Use computer vision to measure agriculture yield with Amazon Rekognition Custom Labels

In the agriculture sector, identifying and counting the fruit on trees plays an important role in crop estimation. The concept of renting and leasing a tree is becoming popular, where a tree owner leases the tree every year before the harvest based on the estimated fruit yield. The common practice of manually counting fruit is a time-consuming and labor-intensive process. It’s one of the hardest but most important tasks for achieving better results in your crop management system. This estimation of the amount of fruit and flowers helps farmers make better decisions, not only on leasing prices but also on cultivation practices and plant disease prevention.

This is where an automated machine learning (ML) solution for computer vision (CV) can help farmers. Amazon Rekognition Custom Labels is a fully managed computer vision service that allows you to build custom models to classify and identify objects in images that are specific and unique to your business.

Rekognition Custom Labels doesn’t require you to have any prior computer vision expertise. You can get started by simply uploading tens of images instead of thousands. If the images are already labeled, you can begin training a model in just a few clicks. If not, you can label them directly within the Rekognition Custom Labels console, or use Amazon SageMaker Ground Truth to label them. Rekognition Custom Labels uses transfer learning to automatically inspect the training data, select the right model framework and algorithm, optimize the hyperparameters, and train the model. When you’re satisfied with the model accuracy, you can start hosting the trained model with just one click.

In this post, we showcase how you can build an end-to-end solution using Rekognition Custom Labels to detect and count fruit to measure agriculture yield.

Solution overview

We create a custom model to detect fruit using the following steps:

  1. Label a dataset with images containing fruit using Amazon SageMaker Ground Truth.
  2. Create a project in Rekognition Custom Labels.
  3. Import your labeled dataset.
  4. Train the model.
  5. Test the new custom model using the automatically generated API endpoint.

Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end model development and inference process.

Prerequisites

To create an agriculture yield measuring model, you first need to prepare a dataset to train the model with. For this post, our dataset is composed of images of fruit. The following images show some examples.

We sourced our images from our own garden. You can download the image files from the GitHub repo.

For this post, we only use a handful of images to showcase the fruit yield use case. You can experiment further with more images.

To prepare your dataset, complete the following steps:

  1. Create an Amazon Simple Storage Service (Amazon S3) bucket.
  2. Create two folders inside this bucket, called raw_data and test_data, to store images for labeling and model testing.
  3. Choose Upload to upload the images to their respective folders from the GitHub repo.

The uploaded images aren’t labeled. You label the images in the following step.
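
If you prefer to set this up from code instead of the console, the following minimal sketch shows one way to do it with boto3. The bucket name and local folder names are placeholders rather than values from this post, so adjust them to your environment.

import os
import boto3

s3 = boto3.client("s3")
bucket = "your-bucket-name"  # placeholder: bucket names must be globally unique

# Create the bucket (outside us-east-1, add a CreateBucketConfiguration
# with the appropriate LocationConstraint)
s3.create_bucket(Bucket=bucket)

# Upload the images downloaded from the GitHub repo into the two folders
for folder in ("raw_data", "test_data"):
    for file_name in os.listdir(folder):
        s3.upload_file(os.path.join(folder, file_name), bucket, f"{folder}/{file_name}")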

Label your dataset using Ground Truth

To train the ML model, you need labeled images. Ground Truth provides an easy process to label the images. The labeling task is performed by a human workforce; in this post, you create a private workforce. You can use Amazon Mechanical Turk for labeling at scale.

Create a labeling workforce

Let’s first create our labeling workforce. Complete the following steps:

  1. On the SageMaker console, under Ground Truth in the navigation pane, choose Labeling workforces.
  2. On the Private tab, choose Create private team.
  3. For Team name, enter a name for your workforce (for this post, labeling-team).
  4. Choose Create private team.
  5. Choose Invite new workers.

  6. In the Add workers by email address section, enter the email addresses of your workers. For this post, enter your own email address.
  7. Choose Invite new workers.

You have created a labeling workforce, which you use in the next step while creating a labeling job.

Create a Ground Truth labeling job

To create your labeling job, complete the following steps:

  1. On the SageMaker console, under Ground Truth, choose Labeling jobs.
  2. Choose Create labeling job.
  3. For Job name, enter fruits-detection.
  4. Select I want to specify a label attribute name different from the labeling job name.
  5. For Label attribute name, enter Labels.
  6. For Input data setup, select Automated data setup.
  7. For S3 location for input datasets, enter the S3 location of the images, using the bucket you created earlier (s3://{your-bucket-name}/raw-data/images/).
  8. For S3 location for output datasets, select Specify a new location and enter the output location for annotated data (s3://{your-bucket-name}/annotated-data/).
  9. For Data type, choose Image.
  10. Choose Complete data setup.
    This creates the image manifest file and updates the S3 input location path. Wait for the message “Input data connection successful.”
  11. Expand Additional configuration.
  12. Confirm that Full dataset is selected.
    This is used to specify whether you want to provide all the images to the labeling job or a subset of images based on filters or random sampling.
  13. For Task category, choose Image because this is a task for image annotation.
  14. Because this is an object detection use case, for Task selection, select Bounding box.
  15. Leave the other options as default and choose Next.
  16. Choose Next.

    Now you specify your workers and configure the labeling tool.
  17. For Worker types, select Private. For this post, you use an internal workforce to annotate the images. You also have the option to select a public contractual workforce (Amazon Mechanical Turk) or a partner workforce (Vendor managed), depending on your use case.
  18. For Private teams, choose the team you created earlier.
  19. Leave the other options as default and scroll down to Bounding box labeling tool. It’s essential to provide clear instructions here in the labeling tool for the private labeling team. These instructions act as a guide for annotators while labeling. Good instructions are concise, so we recommend limiting the verbal or textual instructions to two sentences and focusing on visual instructions. In the case of image classification, we recommend providing one labeled image for each of the classes as part of the instructions.
  20. Add two labels: fruit and no_fruit.
  21. Enter detailed instructions in the Description field to guide the workers. For example: You need to label fruits in the provided image. Please ensure that you select label 'fruit' and draw the box around the fruit just to fit the fruit for better quality of label data. You also need to label other areas which look similar to fruit but are not fruit with label 'no_fruit'. You can also optionally provide examples of good and bad labeling images. You need to make sure that these images are publicly accessible.
  22. Choose Create to create the labeling job.

After the job is successfully created, the next step is to label the input images.

Start the labeling job

Once you have successfully created the job, the status of the job is InProgress. This means that the job is created and the private workforce is notified via email regarding the task assigned to them. Because you have assigned the task to yourself, you should receive an email with instructions to log in to the Ground Truth Labeling project.

  1. Open the email and choose the link provided.
  2. Enter the user name and password provided in the email.
    You may have to change the temporary password provided in the email to a new password after login.
  3. After you log in, select your job and choose Start working.

    You can use the provided tools to zoom in, zoom out, move, and draw bounding boxes in the images.
  4. Choose your label (fruit or no_fruit) and then draw a bounding box in the image to annotate it.
  5. When you’re finished, choose Submit.

Now you have correctly labeled images that will be used by the ML model for training.

Create your Amazon Rekognition project

To create your agriculture yield measuring project, complete the following steps:

  1. On the Amazon Rekognition console, choose Custom Labels.
  2. Choose Get Started.
  3. For Project name, enter fruits_yield.
  4. Choose Create project.

You can also create a project on the Projects page. You can access the Projects page via the navigation pane. The next step is to provide images as input.
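
You can also create the project programmatically. The following is a minimal boto3 sketch; it assumes your credentials and Region are already configured and uses the same project name as the console steps.

import boto3

rekognition = boto3.client("rekognition")

# Create the Custom Labels project; keep the returned ARN for later steps
response = rekognition.create_project(ProjectName="fruits_yield")
project_arn = response["ProjectArn"]
print(project_arn)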

Import your dataset

To create your agriculture yield measuring model, you first need to import a dataset to train the model with. For this post, our dataset is already labeled using Ground Truth.

  1. For Import images, select Import images labeled by SageMaker Ground Truth.
  2. For Manifest file location, enter the S3 bucket location of your manifest file (s3://{your-bucket-name}/fruits_image/annotated_data/fruits-labels/manifests/output/output.manifest).
  3. Choose Create Dataset.

You can see your labeled dataset.

Now you have the input dataset that the ML model will train on.
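
If you prefer the SDK route, you can attach the Ground Truth output to the project as a training dataset with a call like the following. This is a hedged sketch: the project ARN, bucket, and manifest key are placeholders for your own values.

import boto3

rekognition = boto3.client("rekognition")

rekognition.create_dataset(
    ProjectArn=project_arn,  # ARN returned when you created the project
    DatasetType="TRAIN",
    DatasetSource={
        "GroundTruthManifest": {
            "S3Object": {
                "Bucket": "your-bucket-name",  # placeholder
                "Name": "path/to/output.manifest",  # placeholder: your Ground Truth output manifest
            }
        }
    },
)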

Train your model

After you label your images, you’re ready to train your model.

  1. Choose Train model.
  2. For Choose project, choose your project fruits_yield.
  3. Choose Train Model.

Wait for the training to complete. Now you can start testing the performance for this trained model.
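
Training can also be started programmatically with the CreateProjectVersion API. The following is a minimal sketch; the version name and output location are placeholders, and the job runs asynchronously, so poll the API or check the console for completion.

import boto3

rekognition = boto3.client("rekognition")

response = rekognition.create_project_version(
    ProjectArn=project_arn,  # ARN of the fruits_yield project
    VersionName="fruits_yield.v1",  # placeholder version name
    OutputConfig={"S3Bucket": "your-bucket-name", "S3KeyPrefix": "model-output"},  # placeholders
)
model_arn = response["ProjectVersionArn"]  # used to start, test, and stop the model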

Test your model

Your agriculture yield measuring model is now ready for use and should be in the Running state. To test the model, complete the following steps:

Step 1: Start the model

On your model details page, on the Use model tab, choose Start.

Rekognition Custom Labels also provides the API calls for starting, using, and stopping your model.
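
For example, a minimal boto3 sketch for starting the model looks like the following; the model ARN is a placeholder for the ARN shown on your model details page.

import boto3

rekognition = boto3.client("rekognition")

# Start hosting the trained model; MinInferenceUnits controls throughput (and cost)
rekognition.start_project_version(
    ProjectVersionArn="ENTER_YOUR_MODEL_ARN_HERE",  # placeholder
    MinInferenceUnits=1,
)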

Step 2: Test the model

When the model is in the Running state, you can use the sample testing script analyzeImage.py to count the amount of fruit in an image.

  1. Download this script from the GitHub repo.
  2. Edit this file to replace the parameter bucket with your bucket name and model with your Amazon Rekognition model ARN.

We use the parameters photo and min_confidence as input for this Python script.
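
At its core, the script calls the DetectCustomLabels API with those parameters and counts the returned fruit labels (the full script in the repo also handles reading the parameters and rendering the results). A simplified sketch of that call, with placeholder values:

import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_custom_labels(
    ProjectVersionArn="ENTER_YOUR_MODEL_ARN_HERE",  # the model parameter
    MinConfidence=85,  # the min_confidence parameter
    Image={"S3Object": {"Bucket": "your-bucket-name", "Name": "test_data/15.jpeg"}},  # the photo parameter
)

# Count only the detections labeled 'fruit'
fruit_count = sum(1 for label in response["CustomLabels"] if label["Name"] == "fruit")
print(f"Detected {fruit_count} fruit(s)")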

You can run this script locally using the AWS Command Line Interface (AWS CLI) or using AWS CloudShell. In our example, we ran the script via the CloudShell console. Note that CloudShell is free to use.

Make sure to install the required dependencies using the command pip3 install boto3 Pillow if they aren’t already installed.

  1. Upload the file analyzeImage.py to CloudShell using the Actions menu.

The following screenshot shows the output, which detected two fruits in the input image. We supplied 15.jpeg as the photo argument and 85 as the min_confidence value.

The following example shows image 15.jpeg with two bounding boxes.

You can run the same script with other images and experiment by changing the confidence score further.

Step 3: Stop the model

When you’re done, remember to stop the model to avoid incurring unnecessary charges. On your model details page, on the Use model tab, choose Stop.
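
The equivalent API call is StopProjectVersion; here is a minimal sketch with a placeholder ARN:

import boto3

rekognition = boto3.client("rekognition")

# Stop the running model so you stop accruing inference charges
rekognition.stop_project_version(ProjectVersionArn="ENTER_YOUR_MODEL_ARN_HERE")  # placeholder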

Clean up

To avoid incurring unnecessary charges, delete the resources used in this walkthrough when they’re no longer needed. You need to delete the Amazon Rekognition project and the S3 bucket.

Delete the Amazon Rekognition project

To delete the Amazon Rekognition project, complete the following steps:

  1. On the Amazon Rekognition console, choose Use Custom Labels.
  2. Choose Get started.
  3. In the navigation pane, choose Projects.
  4. On the Projects page, select the project that you want to delete.
  5. Choose Delete.
    The Delete project dialog box appears.
  6. If the project has no associated models or datasets, enter delete and choose Delete to delete the project.
  7. If the project has associated models or datasets:
    1. Enter delete to confirm that you want to delete the models and datasets.
    2. Choose Delete associated models, Delete associated datasets, or Delete associated datasets and models, depending on whether the project has models, datasets, or both.
      Model deletion might take a while to complete. Note that the Amazon Rekognition console can’t delete models that are in training or running; stop any running models that are listed, wait until the models listed as training are complete, and try again. If you close the dialog box during model deletion, the models are still deleted, and you can delete the project later by repeating this procedure.
    3. Enter delete to confirm that you want to delete the project.
    4. Choose Delete to delete the project.

Delete your S3 bucket

You first need to empty the bucket and then delete it.

  1. On the Amazon S3 console, choose Buckets.
  2. Select the bucket that you want to empty, then choose Empty.
  3. Confirm that you want to empty the bucket by entering the bucket name into the text field, then choose Empty.
  4. Choose Delete.
  5. Confirm that you want to delete the bucket by entering the bucket name into the text field, then choose Delete bucket.
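
If you prefer to clean up from code, the following minimal boto3 sketch empties and then deletes the bucket (assuming versioning was never enabled on it); the bucket name is a placeholder.

import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("your-bucket-name")  # placeholder

# Empty the bucket first, then delete it
bucket.objects.all().delete()
bucket.delete()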

Conclusion

In this post, we showed you how to create an object detection model with Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?


About the authors

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.

Sameer Goel is a Sr. Solutions Architect in the Netherlands, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi. You can find him on LinkedIn.

Read More

Amazon SageMaker Automatic Model Tuning now supports SageMaker Training Instance Fallbacks

Today Amazon SageMaker announced the support of SageMaker training instance fallbacks for Amazon SageMaker Automatic Model Tuning (AMT) that allow users to specify alternative compute resource configurations.

SageMaker automatic model tuning finds the best version of a model by running many training jobs on your dataset using the ranges of hyperparameters that you specify for your algorithm. Then, it chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose.

Previously, users could only specify a single instance configuration. This can lead to problems when the specified instance type isn’t available due to high utilization, causing your training jobs to fail with an InsufficientCapacityError (ICE). AMT used smart retries to avoid these failures in many cases, but it remained powerless in the face of sustained low capacity.

This new feature means that you can specify a list of instance configurations in order of preference, such that your AMT job will automatically fall back to the next instance in the list in the event of low capacity.

In the following sections, we walk through these high-level steps for overcoming an ICE:

  1. Define HyperParameter Tuning Job Configuration
  2. Define the Training Job Parameters
  3. Run a Hyperparameter Tuning Job
  4. Describe training jobs

Define HyperParameter Tuning Job Configuration

The HyperParameterTuningJobConfig object describes the tuning job, including the search strategy, the objective metric used to evaluate training jobs, the ranges of the parameters to search, and the resource limits for the tuning job. This aspect wasn’t changed with today’s feature release. Nevertheless, we’ll go over it to give a complete example.

The ResourceLimits object specifies the maximum number of training jobs and parallel training jobs for this tuning job. In this example, we’re doing a random search strategy and specifying a maximum of 10 jobs (MaxNumberOfTrainingJobs) and 5 concurrent jobs (MaxParallelTrainingJobs) at a time.

The ParameterRanges object specifies the ranges of hyperparameters that this tuning job searches. We specify the name, as well as the minimum and maximum value of the hyperparameter to search. In this example, we define the minimum and maximum values for the Continuous and Integer parameter ranges and the name of the hyperparameter (“eta”, “max_depth”).

AmtTuningJobConfig={
            "Strategy": "Random",
            "ResourceLimits": {
              "MaxNumberOfTrainingJobs": 10,
              "MaxParallelTrainingJobs": 5
            },
            "HyperParameterTuningJobObjective": {
              "MetricName": "validation:rmse",
              "Type": "Minimize"
            },
            "ParameterRanges": {
              "CategoricalParameterRanges": [],
              "ContinuousParameterRanges": [
                {
                    "MaxValue": "1",
                    "MinValue": "0",
                    "Name": "eta"
                }
              ],
              "IntegerParameterRanges": [
                {
                  "MaxValue": "6",
                  "MinValue": "2",
                  "Name": "max_depth"
                }
              ]
            }
          }

Define the Training Job Parameters

In the training job definition, we define the input needed to run a training job using the algorithm that we specify. After the training completes, SageMaker saves the resulting model artifacts to an Amazon Simple Storage Service (Amazon S3) location that you specify.

Previously, we specified the instance type, count, and volume size under the ResourceConfig parameter. When the instance under this parameter was unavailable, an Insufficient Capacity Error (ICE) was thrown.

To avoid this, we now have the HyperParameterTuningResourceConfig parameter under the TrainingJobDefinition, where we specify a list of instances to fall back on. The format of these instances is the same as in the ResourceConfig. The job will traverse the list top-to-bottom to find an available instance configuration. If an instance is unavailable, then instead of an Insufficient Capacity Error (ICE), the next instance in the list is chosen, thereby overcoming the ICE.

TrainingJobDefinition={
    "HyperParameterTuningResourceConfig": {
        "InstanceConfigs": [
            {
                "InstanceType": "ml.m4.xlarge",
                "InstanceCount": 1,
                "VolumeSizeInGB": 5
            },
            {
                "InstanceType": "ml.m5.4xlarge",
                "InstanceCount": 1,
                "VolumeSizeInGB": 5
            }
        ]
    },
    "AlgorithmSpecification": {
        "TrainingImage": "433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest",
        "TrainingInputMode": "File"
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "CompressionType": "None",
            "ContentType": "json",
            "DataSource": {
                "S3DataSource": {
                    "S3DataDistributionType": "FullyReplicated",
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://<bucket>/test/"
                }
            },
            "RecordWrapperType": "None"
        }
    ],
    "OutputDataConfig": {
        "S3OutputPath": "s3://<bucket>/output/"
    },
    "RoleArn": "arn:aws:iam::340308762637:role/service-role/AmazonSageMaker-ExecutionRole-20201117T142856",
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 259200
    },
    "StaticHyperParameters": {
        "training_script_loc": "q2bn-sagemaker-test_6"
    }
}

Run a Hyperparameter Tuning Job

In this step, we’re creating and running a hyperparameter tuning job with the hyperparameter tuning resource configuration defined above.

We initialize a SageMaker client and create the job by specifying the tuning config, training job definition, and a job name.

import boto3
sm = boto3.client('sagemaker')     
                    
sm.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="my-job-name",
    HyperParameterTuningJobConfig=AmtTuningJobConfig,
    TrainingJobDefinition=TrainingJobDefinition) 

Running an AMT job with the support of SageMaker training instance fallbacks empowers the user to overcome insufficient capacity by themselves, thereby reducing the chance of a job failure.

Describe training jobs

The following function lists all instance types used during the experiment and can be used to verify whether a SageMaker training job has automatically fallen back to the next instance in the list during resource allocation.

def list_instances(name):
    """List the instance configurations used by all training jobs of a tuning job."""
    job_list = []
    instances = []

    def _get_training_jobs(name, next_token=None):
        # Page through all training jobs launched by the tuning job
        if next_token:
            response = sm.list_training_jobs_for_hyper_parameter_tuning_job(
                HyperParameterTuningJobName=name, NextToken=next_token)
        else:
            response = sm.list_training_jobs_for_hyper_parameter_tuning_job(
                HyperParameterTuningJobName=name)
        for job in response['TrainingJobSummaries']:
            job_list.append(job['TrainingJobName'])
        next_token = response.get('NextToken', None)
        if next_token:
            _get_training_jobs(name, next_token=next_token)

    _get_training_jobs(name)

    # Collect the resolved ResourceConfig of each training job
    for job_name in job_list:
        details = sm.describe_training_job(TrainingJobName=job_name)
        instances.append(details['ResourceConfig'])
    return instances

list_instances("my-job-name")  

The output of the function above displays all of the instances that the AMT job is using to run the experiment.

Conclusion

In this post, we demonstrated how you can now define a pool of instances on which your AMT experiment can fall back in the case of InsufficientCapacityError. We saw how to define a hyperparameter tuning job configuration, as well as specify the maximum number of training jobs and maximum parallel jobs. Finally, we saw how to overcome the InsufficientCapacityError by using the HyperParameterTuningResourceConfig parameter, which can be specified under the training job definition.

To learn more about AMT, visit Amazon SageMaker Automatic Model Tuning.


About the authors

Doug Mbaya is a Senior Partner Solution architect with a focus in data and analytics. Doug works closely with AWS partners, helping them integrate data and analytics solutions in the cloud.

Kruthi Jayasimha Rao is a Partner Solutions Architect in the Scale-PSA team. Kruthi conducts technical validations for Partners, enabling them to progress in the Partner Path.

Bernard Jollans is a Software Development Engineer for Amazon SageMaker Automatic Model Tuning.

Read More

Create Amazon SageMaker model building pipelines and deploy R models using RStudio on Amazon SageMaker

In November 2021, in collaboration with RStudio PBC, we announced the general availability of RStudio on Amazon SageMaker, the industry’s first fully managed RStudio Workbench IDE in the cloud. You can now bring your current RStudio license to easily migrate your self-managed RStudio environments to Amazon SageMaker in just a few simple steps.

RStudio is one of the most popular IDEs among R developers for machine learning (ML) and data science projects. RStudio provides open-source tools for R and enterprise-ready professional software for data science teams to develop and share their work in the organization. Bringing RStudio on SageMaker not only gives you access to the AWS infrastructure in a fully managed way, but it also gives you native access to SageMaker.

In this post, we explore how you can use SageMaker features via RStudio on SageMaker to build a SageMaker pipeline that processes data, then trains and registers your R models. We also explore using SageMaker for our model deployment, all using R.

Solution overview

The following diagram shows the architecture used in our solution. All code used in this example can be found in the GitHub repository.

Prerequisites

To follow this post, access to RStudio on SageMaker is required. If you’re new to using RStudio on SageMaker, review Get started with RStudio on Amazon SageMaker.

We also need to build custom Docker containers. We use AWS CodeBuild to build these containers, so you need a few extra AWS Identity and Access Management (IAM) permissions that you might not have by default. Before you proceed, make sure that the IAM role that you’re using has a trust policy with CodeBuild:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "codebuild.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The following permissions are also required in the IAM role to run a build in CodeBuild and push the image to Amazon Elastic Container Registry (Amazon ECR):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "codebuild:DeleteProject",
                "codebuild:CreateProject",
                "codebuild:BatchGetBuilds",
                "codebuild:StartBuild"
            ],
            "Resource": "arn:aws:codebuild:*:*:project/sagemaker-studio*"
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogStream",
            "Resource": "arn:aws:logs:*:*:log-group:/aws/codebuild/sagemaker-studio*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:GetLogEvents",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:log-group:/aws/codebuild/sagemaker-studio*:log-stream:*"
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ecr:CreateRepository",
                "ecr:BatchGetImage",
                "ecr:CompleteLayerUpload",
                "ecr:DescribeImages",
                "ecr:DescribeRepositories",
                "ecr:UploadLayerPart",
                "ecr:ListImages",
                "ecr:InitiateLayerUpload", 
                "ecr:BatchCheckLayerAvailability",
                "ecr:PutImage"
            ],
            "Resource": "arn:aws:ecr:*:*:repository/sagemaker-studio*"
        },
        {
            "Sid": "ReadAccessToPrebuiltAwsImages",
            "Effect": "Allow",
            "Action": [
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer"
            ],
            "Resource": [
                "arn:aws:ecr:*:763104351884:repository/*",
                "arn:aws:ecr:*:217643126080:repository/*",
                "arn:aws:ecr:*:727897471807:repository/*",
                "arn:aws:ecr:*:626614931356:repository/*",
                "arn:aws:ecr:*:683313688378:repository/*",
                "arn:aws:ecr:*:520713654638:repository/*",
                "arn:aws:ecr:*:462105765813:repository/*"
            ]
        },
        {
            "Sid": "EcrAuthorizationTokenRetrieval",
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:DeleteObject",
              "s3:PutObject"
              ],
            "Resource": "arn:aws:s3:::sagemaker-*/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket"
            ],
            "Resource": "arn:aws:s3:::sagemaker*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:GetRole",
                "iam:ListRoles"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/*",
            "Condition": {
                "StringLikeIfExists": {
                    "iam:PassedToService": "codebuild.amazonaws.com"
                }
            }
        }
    ]
}

Create baseline R containers

To use our R scripts for processing and training on SageMaker processing and training jobs, we need to create our own Docker containers containing the necessary runtime and packages. The ability to use your own container, which is part of the SageMaker offering, gives great flexibility to developers and data scientists to use the tools and frameworks of their choice, with virtually no limitations.

We create two R-enabled Docker containers: one for processing jobs and one for training and deployment of our models. Processing data typically requires different packages and libraries than modeling, so it makes sense here to separate the two stages and use different containers.

For more details about using containers with SageMaker, refer to Using Docker containers with SageMaker.

The container used for processing is defined as follows:

FROM public.ecr.aws/docker/library/r-base:4.1.2

# Install tidyverse
RUN apt update && apt-get install -y --no-install-recommends \
    r-cran-tidyverse
    
RUN R -e "install.packages(c('rjson'))"

ENTRYPOINT ["Rscript"]

For this post, we use a simple and relatively lightweight container. Depending on your or your organization’s needs, you may want to pre-install several more R packages.

The container used for training and deployment is defined as follows:

FROM public.ecr.aws/docker/library/r-base:4.1.2

RUN apt-get -y update && apt-get install -y --no-install-recommends \
    wget \
    apt-transport-https \
    ca-certificates \
    libcurl4-openssl-dev \
    libsodium-dev
    
RUN apt-get update && apt-get install -y python3-dev python3-pip 
RUN pip3 install boto3
RUN R -e "install.packages(c('readr','plumber', 'reticulate'),dependencies=TRUE, repos='http://cran.rstudio.com/')"

ENV PATH="/opt/ml/code:${PATH}"

WORKDIR /opt/ml/code

COPY ./docker/run.sh /opt/ml/code/run.sh
COPY ./docker/entrypoint.R /opt/ml/entrypoint.R

RUN /bin/bash -c 'chmod +x /opt/ml/code/run.sh'

ENTRYPOINT ["/bin/bash", "run.sh"]

The RStudio kernel runs on a Docker container, so you won’t be able to build and deploy the containers using Docker commands directly on your Studio session. Instead, you can use the very useful library sagemaker-studio-image-build, which essentially outsources the task of building containers to CodeBuild.

With the following commands, we create two Amazon ECR registries: sagemaker-r-processing and sagemaker-r-train-n-deploy, and build the respective containers that we use later:

if (!py_module_available("sagemaker-studio-image-build")){py_install("sagemaker-studio-image-build", pip=TRUE)}
system("cd pipeline-example ; sm-docker build . —file ./docker/Dockerfile-train-n-deploy —repository sagemaker-r-train-and-deploy:1.0")
system("cd pipeline-example ; sm-docker build . —file ./docker/Dockerfile-processing —repository sagemaker-r-processing:1.0")

Create the pipeline

Now that the containers are built and ready, we can create the SageMaker pipeline that orchestrates the model building workflow. The full code of this is under the file pipeline.R in the repository. The easiest way to create a SageMaker pipeline is by using the SageMaker SDK, which is a Python library that we can access using the library reticulate. This gives us access to all functionalities of SageMaker without leaving the R language environment.

The pipeline we build has the following components:

  • Preprocessing step – This is a SageMaker processing job (utilizing the sagemaker-r-processing container) responsible for preprocessing the data and splitting the data into train and test datasets.
  • Training step – This is a SageMaker training job (utilizing the sagemaker-r-train-n-deploy container) responsible for training the model. In this example, we train a simple linear model.
  • Evaluation step – This is a SageMaker processing job (utilizing the sagemaker-r-processing container) responsible for performing evaluation of the model. Specifically in this example, we’re interested in the RMSE (root mean square error) on the test dataset, which we want to use in the next step as well as to associate with the model itself.
  • Conditional step – This is a conditional step, native to SageMaker pipelines, that allows us to branch the pipeline logic based on some parameter. In this case, the pipeline branches based on the value of RMSE that is calculated in the previous step.
  • Register model step – If the preceding conditional step is True, and the performance of the model is acceptable, then the model is registered in the model registry. For more information, refer to Register and Deploy Models with Model Registry.

First call the upsert function to create (or update) the pipeline and then call the start function to actually start running the pipeline:

source("pipeline-example/pipeline.R")
my_pipeline <- get_pipeline(input_data_uri=s3_raw_data)

upserted <- my_pipeline$upsert(role_arn=role_arn)
started <- my_pipeline$start()

Inspect the pipeline and model registry

One of the great things about using RStudio on SageMaker is that by being on the SageMaker platform, you can use the right tool for the right job and swiftly switch between them based on what you need to do.

As soon as we start the pipeline run, we can switch to Amazon SageMaker Studio, which allows us to visualize the pipeline and monitor current and previous runs of it.

To view details about the pipeline we just created and ran, navigate to the Studio IDE interface, choose SageMaker resources, choose Pipelines on the drop-down menu, and choose the pipeline (in this case, AbalonePipelineUsingR).

This reveals details of the pipeline, including all current and previous runs. Choose the latest one to bring up a visual representation of the pipeline, as per the following screenshot.

The DAG of the pipeline is created automatically by the service based on the data dependencies between steps, as well as on any custom dependencies you add (we haven’t added any in this example).

When the run is complete, if successful, you should see all the steps turn green.

Choosing any of the individual steps brings up details about the specific step, including inputs, outputs, logs, and initial configuration settings. This allows you to drill down in the pipeline and investigate any failed steps.

Similarly, when the pipeline has finished running, a model is saved in the model registry. To access it, in the SageMaker resources pane, choose Model registry on the drop-down and choose your model. This reveals the list of registered models, as shown in the following screenshot. Choose one to open the details page for that particular model version.

After you open a version of the model, choose Update Status and Approve to approve the model.

At this point, based on your use case, you can set up this approval to trigger further actions, including the deployment of the model as per your needs.

Serverless deployment of the model

After you’ve trained and registered a model on SageMaker, deploying the model on SageMaker is straightforward.

There are several options of how you can deploy a model, such as batch inference, real-time endpoints, or asynchronous endpoints. Each method comes with several required configurations, including choosing the instance type you want as well as the scaling mechanism.

For this example, we use the recently announced feature of SageMaker, Serverless Inference (in preview mode as of the time of writing), to deploy our R model on a serverless endpoint. For this type of endpoint, we only define the amount of RAM that we want to be allocated to the model for inference, as well as the maximum number of allowed concurrent invocations of the model. SageMaker takes care of hosting the model and auto scaling as needed. You’re only charged for the exact number of seconds and data used by the model, with no cost for idle time.

You can deploy the model to a serverless endpoint with the following code:

model_package_arn <- 'ENTER_MODEL_PACKAGE_ARN_HERE'
model <- sagemaker$ModelPackage(
                        role=role_arn, 
                        model_package_arn=model_package_arn, 
                        sagemaker_session=session)
serverless_config <- sagemaker$serverless$ServerlessInferenceConfig(
                        memory_size_in_mb=1024L, 
                        max_concurrency=5L)
model$deploy(serverless_inference_config=serverless_config, 
             endpoint_name="serverless-r-abalone-endpoint")

If you see the error ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Invalid approval status "PendingManualApproval", the model you want to deploy hasn’t been approved. Follow the steps from the previous section to approve your model.

Invoke the endpoint by sending a request to the HTTP endpoint we deployed, or instead use the SageMaker SDK. In the following code, we invoke the endpoint on some test data:

library(jsonlite)
x = list(features=format_csv(abalone_t[1:3,1:11]))
x = toJSON(x)

# test the endpoint
predictor <- sagemaker$predictor$Predictor(endpoint_name="serverless-r-abalone-endpoint", sagemaker_session=session)
predictor$predict(x)

The endpoint we invoked was a serverless endpoint, and as such we’re charged for the exact duration and data used. You might notice that the first time you invoke the endpoint it takes about a second to respond. This is due to the cold start time of the serverless endpoint. If you make another invocation soon after, the model returns the prediction in real time because it’s already warm.

When you finish experimenting with the endpoint, you can delete it with the following command:

predictor$delete_endpoint(delete_endpoint_config=TRUE)

Conclusion

In this post, we walked through the process of creating a SageMaker pipeline using R in our RStudio environment and showcased how to deploy our R model on a serverless endpoint on SageMaker using the SageMaker model registry.

With the combination of RStudio and SageMaker, you can now create and orchestrate complete end-to-end ML workflows on AWS using your preferred language, R.

To dive deeper into this solution, I encourage you to review the source code of this solution, as well as other examples, on GitHub.


About the Author

Georgios Schinas is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in London and works closely with customers in UK and Ireland. Georgios helps customers design and deploy machine learning applications in production on AWS with a particular interest in MLOps practices and enabling customers to perform machine learning at scale. In his spare time, he enjoys traveling, cooking and spending time with friends and family.

Read More

MLOps at the edge with Amazon SageMaker Edge Manager and AWS IoT Greengrass

Internet of Things (IoT) has enabled customers in multiple industries, such as manufacturing, automotive, and energy, to monitor and control real-world environments. By deploying a variety of edge IoT devices such as cameras, thermostats, and sensors, you can collect data, send it to the cloud, and build machine learning (ML) models to predict anomalies, failures, and more. However, if the use case requires real-time prediction, you need to enrich your IoT solution with ML at the edge (ML@Edge) capabilities. ML@Edge is a concept that decouples the ML model’s lifecycle from the app lifecycle and allows you to run an end-to-end ML pipeline that includes data preparation, model building, model compilation and optimization, model deployment (to a fleet of edge devices), model execution, and model monitoring and governing. You deploy the app once and run the ML pipeline as many times as you need.

As you can imagine, to implement all the steps proposed by the ML@Edge concept is not trivial. There are many questions that developers need to address in order to implement a complete ML@Edge solution, for example:

  • How do I operate ML models on a fleet (hundreds, thousands, or millions) of devices at the edge?
  • How do I secure my model while deploying and running it at the edge?
  • How do I monitor my model’s performance and retrain it, if needed?

In this post, you learn how to answer all these questions and build an end-to-end solution for automating your ML@Edge pipeline. You’ll see how to use Amazon SageMaker Edge Manager, Amazon SageMaker Studio, and AWS IoT Greengrass v2 to create an MLOps (ML Operations) environment that automates the process of building and deploying ML models to large fleets of edge devices.

In the next sections, we present a reference architecture that details all the components and workflows required to build a complete solution for MLOps focused on edge workloads. Then we dive deep into the steps this solution runs automatically to build and prepare a new model. We also show you how to prepare the edge devices to start deploying, running, and monitoring ML models, and demonstrate how to monitor and maintain the ML models deployed to your fleet of devices.

Solution overview

Productionization of robust ML models requires the collaboration of multiple personas, such as data scientists, ML engineers, data engineers, and business stakeholders, under a semi-automated infrastructure following specific operations (MLOps). Also, the modularization of the environment is important in order to give all these different personas the flexibility and agility to develop or improve (independently of the workflow) the component for which they are responsible. An example of such an infrastructure consists of multiple AWS accounts that enable this collaboration and productionization of the ML models both in the cloud and at the edge. In the following reference architecture, we show how we organized the multiple accounts and services that compose this end-to-end MLOps platform for building ML models and deploying them at the edge.

This solution consists of the following accounts:

  • Data lake account – Data engineers ingest, store, and prepare data from multiple data sources, including on-premises databases and IoT devices.
  • Tooling account – IT operators manage and check CI/CD pipelines for automated continuous delivery and deployment of ML model packages across the pre-production and production accounts for remote edge devices. Runs of CI/CD pipelines are automated through the usage of Amazon EventBridge, which monitors change status events of ML models and targets AWS CodePipeline.
  • Experimentation and development account – Data scientists can conduct research and experiment with multiple modeling techniques and algorithms to solve business problems based on ML, creating proof of concept solutions. ML engineers and data scientists collaborate to scale a proof of concept, creating automated workflows using Amazon SageMaker Pipelines to prepare data and build, train, and package ML models. The deployment of the pipelines is driven via CI/CD pipelines, while the version control of the models is achieved using the Amazon SageMaker model registry. Data scientists evaluate the metrics of multiple model versions and request the promotion of the best model to production by triggering the CI/CD pipeline.
  • Pre-production account – Before the promotion of the model to the production environment, the model needs to be tested to ensure robustness in a simulation environment. Therefore, the pre-production environment is a simulator of the production environment, in which SageMaker model endpoints are deployed and tested automatically. Test methods might include an integration test, stress test, or ML-specific tests on inference results. In this case, the production environment isn’t a SageMaker model endpoint but an edge device. To simulate an edge device in pre-production, two approaches are possible: use an Amazon Elastic Compute Cloud (Amazon EC2) instance with the same hardware characteristics, or use an in-lab testbed consisting of the actual devices. With this infrastructure, the CI/CD pipeline deploys the model to the corresponding simulator and conducts the multiple tests automatically. After the tests run successfully, the CI/CD pipeline requires manual approval (for example, from the IoT stakeholder) to promote the model to production.
  • Production account – In the case of hosting the model on the AWS Cloud, the CI/CD pipeline deploys a SageMaker model endpoint on the production account. In this case, the production environment consists of multiple fleets of edge devices. Therefore, the CI/CD pipeline uses Edge Manager to deploy the models to the corresponding fleet of devices.
  • Edge devices – Remote edge devices are hardware devices that can run ML models using Edge Manager. It allows the application on those devices to manage the models, run inference against the models, and capture data securely into Amazon Simple Storage Service (Amazon S3).

SageMaker projects help you to automate the process of provisioning resources inside each of these accounts. We don’t dive deep into this feature, but to learn more about how to build a SageMaker project template that deploys ML models across accounts, check out Multi-account model deployment with Amazon SageMaker Pipelines.

Pre-production account: Digital twin

After the training process, the resulting model needs to be evaluated. In the pre-production account, you have a simulated Edge device. It represents the digital twin of the edge device on which the ML model runs in production. This environment has the dual purpose of performing the classic tests (such as unit, integration, and smoke) and to be a playground for the development team. This device is simulated using an EC2 instance where all the components needed to manage the ML model were deployed.

The involved services are as follows:

  • AWS IoT Core – We use AWS IoT Core to create AWS IoT thing objects, create a device fleet, register the device fleet so it can interact with the cloud, create X.509 certificates to authenticate edge devices to AWS IoT Core, associate the role alias (generated when the fleet was created) with AWS IoT Core, get an AWS account-specific endpoint for the credential provider, get an official Amazon Root CA file, and upload the Amazon CA file to Amazon S3.
  • Amazon SageMaker Neo – SageMaker Neo automatically optimizes machine learning models for inference to run faster with no loss in accuracy. It supports machine learning models already built with DarkNet, Keras, MXNet, PyTorch, TensorFlow, TensorFlow-Lite, ONNX, or XGBoost and trained in Amazon SageMaker or anywhere else. You then choose your target hardware platform, which can be a SageMaker hosting instance or an edge device based on processors from Ambarella, Apple, ARM, Intel, MediaTek, Nvidia, NXP, Qualcomm, RockChip, Texas Instruments, or Xilinx.
  • Edge Manager – We use Edge Manager to register and manage the edge device within the SageMaker fleets. Fleets are collections of logically grouped devices you can use to collect and analyze data. In addition, the Edge Manager packager packages the optimized model and creates an AWS IoT Greengrass V2 component that can be deployed directly. You can use Edge Manager to operate ML models on a fleet of smart cameras, smart speakers, robots, and other SageMaker device fleets.
  • AWS IoT Greengrass V2 – AWS IoT Greengrass allows you to deploy components into the simulated devices using an EC2 instance. By using the AWS IoT Greengrass V2 agent in the EC2 instances, we can simplify the access, management, and deployment of the Edge Manager agent and model to devices. Without AWS IoT Greengrass V2, setting up devices and fleets to use Edge Manager requires you to manually copy the agent from an S3 release bucket. With the AWS IoT Greengrass V2 and Edge Manager integration, it’s possible to use AWS IoT Greengrass V2 components. Components are pre-built software modules that can connect edge devices to AWS services or third-party services via AWS IoT Greengrass.
  • Edge Manager agent – The Edge Manager agent is deployed via AWS IoT Greengrass V2 in the EC2 instance. The agent can load multiple models at a time and make inference with loaded models on edge devices. The number of models the agent can load is determined by the available memory on the device.
  • Amazon S3 – We use an S3 bucket to store the inference captured data from the Edge Manager agent.

We can define a pre-production account as a digital twin for testing ML models before moving them into real edge devices. This offers the following benefits:

  • Agility and flexibility – Data scientists and ML engineers need to quickly validate if the ML model and associated scripts (preprocessing and inference scripts) will work on the edge device. However, IoT and data science departments in large enterprises may be different entities. By identically replicating the technology stack in the cloud, data scientists and ML engineers can iterate and consolidate artifacts prior to deployment.
  • Accelerated risk assessment and production time – Deployment on the edge device is the final stage of the process. After you validate everything in an isolated and self-contained environment, secure it to be adherent to the specifications required by the edge in terms of quality, performance, and integration. This helps avoid further involvement of other people in the IoT department to fix and iterate on artifact versions.
  • Improved team collaboration and enhanced quality and performance – The development team can immediately assess the impact of the ML model by analyzing edge hardware metrics and measuring the level of interactions with third-party tools (for example, I/O rate). Then, the IoT team is only responsible for deployment to the production environment, and can be confident that the artifacts are accurate for a production environment.
  • Integrated playground for testing – Given the target of ML models, the pre-production environment in a traditional workflow should be represented by an edge device outside the cloud environment. This introduces another level of complexity. Integrations are needed to collect metrics and feedback. Instead, by using the digital twin simulated environment, interactions are reduced and time to market is shortened.

Production account and edge environment

After the tests are complete and the artifact stability is achieved, you can proceed to production deployment through the pipelines. Artifact deployment occurs programmatically after an operator has approved the artifact. However, access to the AWS Management Console is granted to operators in read-only mode to be able to monitor metadata associated with the fleets and therefore have insight into the version of the deployed ML model and other metrics associated with the lifecycle.

Edge device fleets belong to the AWS production account. This account has specific security and networking configurations to allow communication between the cloud and edge devices. The main AWS services deployed in the production account are Edge Manager, which is responsible for managing all the device fleets, collecting data, and operating ML models, and AWS IoT Core, which manages IoT thing objects, certificates, role alias, and endpoints.

At the same time, we need to configure an edge device with the services and components to manage ML models. The main components are as follows:

  • AWS IoT Greengrass V2
  • An Edge Manager agent
  • AWS IoT certificates
  • Application.py, which is responsible for orchestrating the inference process (retrieving information from the edge data source and performing inference using the Edge Manager agent and loaded ML model)
  • A connection to Amazon S3 or the data lake account to store inferenced data

Automated ML pipeline

Now that you know more about the organization and the components of the reference architecture, we can dive deeper into the ML pipeline that we use to build, train, and evaluate the ML model inside the development account.

A pipeline (built using Amazon SageMaker Model Building Pipelines) is a series of interconnected steps that is defined by a JSON pipeline definition. This pipeline definition encodes a pipeline using a Directed Acyclic Graph (DAG). This DAG gives information on the requirements for and relationships between each step of your pipeline. The structure of a pipeline’s DAG is determined by the data dependencies between steps. These data dependencies are created when the properties of a step’s output are passed as the input to another step.

To enable data science teams to easily automate the creation of new versions of ML models, it’s important to introduce validation steps and automated data for continuously feeding and improving ML models, as well as model monitoring strategies for enabling pipeline triggering. The following diagram shows an example pipeline.

For enabling automations and MLOps capabilities, it’s important to create modular components for creating reusable code artifacts that can be sharable across different steps and ML use cases. This enables you to quickly move the implementation from an experimentation phase to a production phase by automating the transition.

The steps for defining an ML pipeline for enabling the continuous training and versioning of ML models are as follows:

  • Preprocessing – The process of data cleaning, feature engineering, and dataset creation for training the ML algorithm
  • Training – The process of training the developed ML algorithm for generating a new version of the ML model artifact
  • Evaluation – The process of evaluation of the generated ML model, for extracting key metrics related to the model behavior on new data not seen during the training phase
  • Registration – The process of versioning the new trained ML model artifact by linking the metrics extracted with the generated artifact

You can see more details of how to build a SageMaker pipeline in the following notebook.

Trigger CI/CD pipelines using EventBridge

When you finish building the model, you can start the deployment process. The last step of the SageMaker pipeline defined in the previous section registers a new version of the model in the specific SageMaker model registry group. The deployment of a new version of the ML model is managed using the model registry status. By manually approving or rejecting an ML model version, this step raises an event that is captured by EventBridge. This event can then start a new pipeline (CI/CD this time) for creating a new version of the AWS IoT Greengrass component that is then deployed to the pre-production and production accounts. The following screenshot shows our defined EventBridge rule.

This rule monitors the SageMaker model package group by looking for updates of model packages in the status Approved or Rejected.

The EventBridge rule is then configured to target CodePipeline, which starts the workflow of creating a new AWS IoT Greengrass component by using Amazon SageMaker Neo and Edge Manager.
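
For reference, an equivalent rule and target can be created with boto3 along the following lines. This is a sketch under assumptions: the model package group name, pipeline ARN, and the IAM role that lets EventBridge start the pipeline are all placeholders.

import json
import boto3

events = boto3.client("events")

# Fire when a model package in the group is approved or rejected
events.put_rule(
    Name="model-package-state-change",  # placeholder rule name
    EventPattern=json.dumps({
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {
            "ModelPackageGroupName": ["my-model-package-group"],  # placeholder
            "ModelApprovalStatus": ["Approved", "Rejected"],
        },
    }),
)

# Start the CodePipeline that builds the AWS IoT Greengrass component
events.put_targets(
    Rule="model-package-state-change",
    Targets=[{
        "Id": "codepipeline-target",
        "Arn": "arn:aws:codepipeline:<region>:<account-id>:<pipeline-name>",  # placeholder
        "RoleArn": "arn:aws:iam::<account-id>:role/<eventbridge-invoke-pipeline-role>",  # placeholder
    }],
)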

Optimize ML models for the target architecture

Neo allows you to optimize ML models for performing inference on edge devices (and in the cloud). It automatically optimizes the ML models for better performance based on the target architecture, and decouples the model from the original framework, allowing you to run it on a lightweight runtime.

Refer to the following notebook for an example of how to compile a PyTorch Resnet18 model using Neo.
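
For orientation, a compilation job can also be created directly with boto3, roughly as follows. Treat this as a sketch: the job name, role, S3 locations, input shape, and target device are placeholders, and the notebook linked above remains the reference implementation.

import boto3

sagemaker_client = boto3.client("sagemaker")

sagemaker_client.create_compilation_job(
    CompilationJobName="resnet18-neo-compilation",  # placeholder
    RoleArn="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",  # placeholder
    InputConfig={
        "S3Uri": "s3://<bucket>/models/model.tar.gz",  # trained model artifact (placeholder)
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',  # expected input shape for Resnet18
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://<bucket>/compiled/",  # placeholder
        "TargetDevice": "jetson_nano",  # example edge target
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)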

Build the deployment package by including the AWS IoT Greengrass component

Edge Manager allows you to manage, secure, deploy, and monitor models to a fleet of edge devices. In the following notebook, you can see more details of how to build a minimalist fleet of edge devices and run some experiments with this feature.

After you configure the fleet and compile the model, you need to run an Edge Manager packaging job, which prepares the model to be deployed to the fleet. You can start a packaging job by using the Boto3 SDK. For our parameters, we use the optimized model and model metadata. By adding the following parameters to OutputConfig, the job also prepares an AWS IoT Greengrass V2 component with the model:

  • PresetDeploymentType
  • PresetDeploymentConfig

See the following code:

import json
import time

import boto3

sagemaker_client = boto3.client("sagemaker")

# compilation_job_name, role, bucket_name, component_name, and component_version
# are assumed to be defined earlier in your environment
sagemaker_client.create_edge_packaging_job(
    EdgePackagingJobName="mlops-edge-packaging-{}".format(int(time.time() * 1000)),
    CompilationJobName=compilation_job_name,
    ModelName="PytorchMLOpsEdgeModel",
    ModelVersion="1.0.0",
    RoleArn=role,
    OutputConfig={
        # Where the packaged model artifact is stored
        "S3OutputLocation": "s3://{}/model/".format(bucket_name),
        # Ask Edge Manager to also build an AWS IoT Greengrass v2 component for the model
        "PresetDeploymentType": "GreengrassV2Component",
        "PresetDeploymentConfig": json.dumps(
            {"ComponentName": component_name, "ComponentVersion": component_version}
        ),
    },
)

Deploy ML models at the edge at scale

Now it’s time to deploy the model to your fleet of edge devices. First, we need to ensure that we have the necessary AWS Identity and Access Management (IAM) permissions to provision our IoT devices and deploy components to them. We require two basic elements to start onboarding devices into our IoT platform:

  • IAM policy – This policy, attached to the user or role performing the provisioning, allows the automatic provisioning of devices. It should have IoT write permissions to create the IoT thing and group, as well as to attach the necessary policies to the device. For more information, refer to Minimal IAM policy for installer to provision resources.
  • IAM role – This role is attached to the IoT things and groups that we create. You can create this role at provisioning time with basic permissions, but it will lack permissions for services like Amazon S3 or AWS Key Management Service (AWS KMS) that might be needed later. We recommend creating this role beforehand and reusing it when you provision devices, as shown in the sketch after this list. For more information, refer to Authorize core devices to interact with AWS.
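The following is a minimal sketch of creating such a role ahead of time with the Boto3 SDK; the role name, policy name, and bucket ARNs are placeholders, and you should scope the permissions to your own resources:

import json
import boto3

iam = boto3.client("iam")

# Trust policy so the AWS IoT credentials provider can assume the role
# on behalf of the Greengrass core devices
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "credentials.iot.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="GreengrassV2TokenExchangeRole",  # placeholder role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Inline policy granting the S3 access the devices need later, for example
# to upload Edge Manager capture data; add AWS KMS or other permissions as required
iam.put_role_policy(
    RoleName="GreengrassV2TokenExchangeRole",
    PolicyName="EdgeDeviceS3Access",  # placeholder policy name
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::my-mlops-bucket", "arn:aws:s3:::my-mlops-bucket/*"],
        }],
    }),
)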

AWS IoT Greengrass installation and provisioning

After we have the IAM policy and role in place, we’re ready to install the AWS IoT Greengrass Core software with automatic resource provisioning. Although it’s possible to provision the IoT resources with manual steps, the AWS IoT Greengrass v2 nucleus installer can conveniently provision them automatically during installation. This is the preferred option to quickly onboard new devices onto the platform. Besides default-jdk, other packages must be installed, such as curl, unzip, and python3.

When we provision our device, the IoT thing name must be exactly the same as the edge device name defined in Edge Manager; otherwise, data won’t be captured to the destination S3 bucket.

The installer can create the AWS IoT Greengrass role and alias during installation if they don’t exist. However, they’ll be created with minimal permissions and will require manually adding more policies to interact with other services such as Amazon S3. We recommend creating these IAM resources beforehand, as shown earlier, and then reusing them as you onboard new devices into the account.

Model and inference component packaging

After our code has been developed, we can deploy both the code (for inference) and our ML models as components into our devices.

After the ML model is trained in SageMaker, you can optimize it with Neo using a SageMaker compilation job. The resulting compiled model artifacts can then be packaged into an AWS IoT Greengrass v2 component using the Edge Manager packager, and registered as a custom component in the My Components section on the AWS IoT Greengrass console. This component already contains the necessary lifecycle commands to download and decompress the model artifact on the device, so that the inference code can load the model and run inference on the captured images.

For the inference code, we must create a component using the console or the AWS Command Line Interface (AWS CLI). First, we package our inference source code and its dependencies and upload them to Amazon S3. After we upload the code, we can create our component using a recipe in YAML or JSON, like the following example:

---
RecipeFormatVersion: '2020-01-25'
ComponentName: dummymodel.inference
ComponentVersion: 0.0.1
ComponentDescription: Deploys inference code to a client
ComponentPublisher: Amazon Web Services, Inc.
ComponentDependencies:
  # Token exchange service so the component can use AWS credentials on the device
  aws.greengrass.TokenExchangeService:
    VersionRequirement: '>=0.0.0'
    DependencyType: HARD
  # The model component built by the Edge Manager packaging job
  dummymodel:
    VersionRequirement: '>=0.0.0'
    DependencyType: HARD
Manifests:
  - Platform:
      os: linux
      architecture: "*"
    Lifecycle:
      install: |-
        apt-get install -y python3-pip
        pip3 install numpy
        pip3 install sysv_ipc
        pip3 install boto3
        pip3 install grpcio-tools
        pip3 install grpcio
        pip3 install protobuf
        pip3 install sagemaker
        tar xf {artifacts:path}/sourcedir.tar.gz
      run:
        script: |-
          sleep 5 && sudo python3 {work:path}/inference.py
    Artifacts:
      - URI: s3://BUCKET-NAME/path/to/inference/sourcedir.tar.gz
        Permission:
          Execute: OWNER

This example recipe shows the name and description of our component, as well as the prerequisites installed before our run script command. The recipe unpacks the artifact into a work folder on the device, and we use that path to run our inference code. The AWS CLI command to create a component from this recipe is:

aws greengrassv2 create-component-version --region $REGION \
                                          --inline-recipe fileb://path/to/recipe.yaml

You can now see this component created on the AWS IoT Greengrass console.

Note that the component version matters and must be specified in the recipe file. Repeating the same version number returns an error.

After our model and inference code have been set up as components, we’re ready to deploy them.

Deploy the application and model using AWS IoT Greengrass

In the previous sections, you learned how to package the inference code and the ML models. Now we can create a deployment that includes both components, along with the configuration needed for our inference code to interact with the model on the edge device.

The Edge Manager agent is the component that should be installed on each edge device in order to enable all the Edge Manager capabilities. On the SageMaker console, we have a device fleet defined, which has an associated S3 bucket; all edge devices associated with the fleet capture and report their data to this S3 path. The agent can be deployed as a component in AWS IoT Greengrass v2, which makes it easier to install and configure than when the agent is deployed in standalone mode. When deploying the agent as a component, we need to specify its configuration parameters, namely the device fleet and the S3 path.

We create a deployment configuration with the custom components for the model and code we just created. This setup is defined in a JSON file that lists the deployment name and target, as well as the components in the deployment. We can add and update the configuration parameters of each component, such as in the Edge Manager agent, where we specify the fleet name and bucket.

{
    "targetArn": "targetArn",
    "deploymentName": "dummy-deployment",
    "components": {
        "aws.greengrass.Nucleus": {
            "componentVersion": "2.5.3"
        },
        "aws.greengrass.Cli": {
            "componentVersion": "2.5.3"
        },
        "aws.greengrass.SageMakerEdgeManager": {
            "componentVersion": "1.1.0",
            "configurationUpdate": {
                "merge": "{\"DeviceFleetName\":\"FLEET-NAME\",\"BucketName\":\"BUCKET-NAME\"}"
            }
        },
        "dummymodel.inference": {
            "componentVersion": "0.0.1"
        },
        "dummymodel": {
            "componentVersion": "0.0.1"
        }
    }
}

It’s worth noting that we have added not only the model and inference components and the agent, but also the AWS IoT Greengrass CLI and nucleus as components. The former can help debug certain deployments locally on the device. The latter is added to the deployment to configure the necessary network access from the device itself if needed (for example, proxy settings), and also in case you want to perform an OTA upgrade of the AWS IoT Greengrass v2 nucleus. The nucleus isn’t redeployed because it’s already installed on the device; only the configuration update is applied (unless an upgrade is in place). To deploy, we simply run the following command over the preceding configuration. Remember to set the target ARN to which the deployment will be applied (an IoT thing or thing group). We can also deploy these components from the console.

aws greengrassv2 create-deployment --region $REGION \
                                   --cli-input-json file://path/to/deployment.json

Monitor and manage ML models deployed to the edge

Now that your application is running on the edge devices, it’s time to understand how to monitor the fleet to improve governance, maintenance, and visibility. On the SageMaker console, choose Edge Manager in the navigation pane, then choose Edge device fleets. From here, choose your fleet.

On the fleet’s detail page, you can see metadata about the models that are running on each device in your fleet. The fleet report is generated every 24 hours.
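The same information is also available programmatically; for example, the following sketch retrieves the fleet report and the per-device model metadata (the fleet name is a placeholder):

import boto3

sagemaker_client = boto3.client("sagemaker")

# High-level fleet report: device and model statistics, agent versions
report = sagemaker_client.get_device_fleet_report(DeviceFleetName="mlops-edge-fleet")
print(report["DeviceStats"], report["ModelStats"])

# Per-device metadata, including the models deployed on each device
devices = sagemaker_client.list_devices(DeviceFleetName="mlops-edge-fleet")
for device in devices["DeviceSummaries"]:
    print(device["DeviceName"], [m["ModelName"] for m in device.get("Models", [])])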

Data captured by each device through the Edge Manager agent is sent to an S3 bucket in JSON Lines format (JSONL). Sending captured data is managed at the application level, so you’re free to decide whether to send this data, how, and how often.
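As an example of consuming this data, the following sketch reads the captured JSON Lines records from the fleet’s S3 location; the bucket and prefix are placeholders that depend on how your fleet and agent are configured:

import json
import boto3

s3 = boto3.client("s3")
bucket = "BUCKET-NAME"                  # the fleet's capture bucket (placeholder)
prefix = "edge-manager/captured-data/"  # placeholder prefix used by your fleet

records = []
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read().decode("utf-8")
        # Each file is JSON Lines: one JSON document per line
        records.extend(json.loads(line) for line in body.splitlines() if line.strip())

print(f"Loaded {len(records)} captured inference records")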

You can use this data for many things, such as monitoring data drift and model quality, building a new dataset, enriching a data lake, and more. A simple example is when you identify data drift in the way users are interacting with your application and need to train a new model. You then build a new dataset with the captured data and copy it back to the development account. This can automatically start a new run of your environment that builds a new model and redeploys it to the whole fleet to maintain the performance of the deployed solution.

Conclusion

In this post, you learned how to build a complete solution that combines MLOps and ML@Edge using AWS services. Building such a solution is not trivial, but we hope the reference architecture presented in this post can inspire and help you build a solid architecture for your own business challenges. You can also use just the parts or modules of this architecture that integrate with your existing MLOps environment. By prototyping one single module at a time and using the appropriate AWS services to address each piece of this challenge, you can learn how to build a robust MLOps environment and also further simplify the final architecture.

As a next step, we encourage you to try out SageMaker Edge Manager to manage your ML at the edge lifecycle. For more information on how Edge Manager works, see Deploy models at the edge with SageMaker Edge Manager.


About the authors

Bruno Pistone is an AI/ML Specialist Solutions Architect for AWS based in Milan. He works with customers of any size to help them deeply understand their technical needs and design AI and machine learning solutions that make the best use of the AWS Cloud and the Amazon Machine Learning stack. His fields of expertise are end-to-end machine learning, machine learning industrialization, and MLOps. He enjoys spending time with his friends, exploring new places, and traveling to new destinations.

Matteo Calabrese is an AI/ML Customer Delivery Architect on the AWS Professional Services team. He works with EMEA large enterprises on AI/ML projects, helping them propose, design, deliver, scale, and optimize ML production workloads. His main areas of expertise are ML operations (MLOps) and machine learning at the edge. His goal is to shorten time to value and accelerate business outcomes by providing AWS best practices. In his spare time, he enjoys hiking and traveling.

Raúl Díaz García is a Sr. Data Scientist on the AWS Professional Services team. He works with large enterprise customers across EMEA, where he helps them enable solutions related to computer vision and machine learning in the IoT space.

Sokratis Kartakis is a Senior Machine Learning Specialist Solutions Architect for Amazon Web Services. Sokratis focuses on enabling enterprise customers to industrialize their machine learning (ML) solutions by exploiting AWS services and shaping their operating model (MLOps foundation) and transformation roadmap using best development practices. He has spent over 15 years inventing, designing, leading, and implementing innovative end-to-end production-level ML and Internet of Things (IoT) solutions in the domains of energy, retail, health, finance/banking, motorsports, and more. Sokratis likes to spend his spare time with family and friends, or riding motorbikes.

Samir Araújo is an AI/ML Solutions Architect at AWS. He helps customers create AI/ML solutions that solve their business challenges using AWS. He has been working on several AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. He likes playing with hardware and automation projects in his free time, and has a particular interest in robotics.
