Run distributed hyperparameter and neural architecture tuning jobs with Syne Tune

Today we announce the general availability of Syne Tune, an open-source Python library for large-scale distributed hyperparameter and neural architecture optimization. It provides implementations of several state-of-the-art global optimizers, such as Bayesian optimization, Hyperband, and population-based training. Additionally, it supports constrained and multi-objective optimization, and allows you to bring your own global optimization algorithm.

With Syne Tune, you can run hyperparameter and neural architecture tuning jobs locally on your machine or remotely on Amazon SageMaker by changing just one line of code. The former is a well-suited backend for smaller workloads and fast experimentation on local CPUs or GPUs. The latter is well-suited for larger workloads, which come with a substantial amount of implementation overhead. Syne Tune makes it easy to use SageMaker as a backend to reduce wall clock time by evaluating a large number of configurations on parallel Amazon Elastic Compute Cloud (Amazon EC2) instances, while taking advantage of SageMaker’s rich set of functionalities (including pre-built Docker deep learning framework images, EC2 Spot Instances, experiment tracking, and virtual private networks).

By open-sourcing Syne Tune, we hope to create a community that brings together academic and industrial researchers in machine learning (ML). Our goal is to create synergies between these two groups by enabling academics to easily validate small-scale experiments at larger scale, and industrial practitioners to use a broader set of state-of-the-art optimizers.

In this post, we discuss hyperparameter and architecture optimization in ML, and show you how to launch tuning experiments on your local machine and also on SageMaker for large-scale experiments.

Hyperparameter and architecture optimization in machine learning

Every ML algorithm comes with a set of hyperparameters that control the training algorithm or the architecture of the underlying statistical model. Typical examples of such hyperparameters for deep neural networks are the learning rate or the number of units per layer. Setting these hyperparameters correctly is crucial to obtaining top-notch predictive performance.

To overcome the daunting process of trial and error, hyperparameter and architecture optimization aims to automatically find the specific configuration that maximizes the validation performance of our ML algorithm. Arguably, the easiest method to solve this global optimization problem is random search, where configurations are sampled from a predefined probability distribution. A more sample-efficient technique is Bayesian optimization, which maintains a probabilistic model of the objective function (here, the validation performance) to guide the search toward the global optimum in a sequential manner.
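
To make the idea concrete, the following is a minimal, framework-agnostic sketch of random search over a toy search space; the ranges and the objective function are placeholders standing in for training and validating a real model.

import math
import random

# Toy search space: ranges are illustrative only.
search_space = {
    "lr": (1e-5, 1e-1),          # sampled log-uniformly
    "dropout_rate": (0.0, 0.9),  # sampled uniformly
}

def sample_config():
    lo, hi = search_space["lr"]
    lr = 10 ** random.uniform(math.log10(lo), math.log10(hi))
    dropout_rate = random.uniform(*search_space["dropout_rate"])
    return {"lr": lr, "dropout_rate": dropout_rate}

def validation_score(config):
    # Stand-in for training a model with this configuration and
    # measuring its validation accuracy.
    return random.random()

best_config, best_score = None, float("-inf")
for _ in range(50):  # evaluate 50 randomly sampled configurations
    config = sample_config()
    score = validation_score(config)
    if score > best_score:
        best_config, best_score = config, score

print(best_config, best_score)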

Unfortunately, with ever-increasing dataset sizes and ever-deeper models, training deep neural networks can be prohibitively slow to tune. Recent advances in hyperparameter optimization, such as Hyperband or MoBster, early stop the evaluation of configurations that are unlikely to achieve a good performance and reallocate the resources that would have been consumed to the evaluation of other candidate configurations. You can obtain further gains by using distributed resources to parallelize the tuning process. Because the time to train a deep neural network can vary widely across hyperparameter and architecture configurations, optimal resource allocation requires our optimizer to asynchronously decide which configuration to run next by taking the pending evaluation of other configurations into account. Next, we see how this works in practice and how we can run this either on a local machine or on SageMaker.

Tune hyperparameters with Syne Tune

We now detail how to tune hyperparameters with Syne Tune. First, you need a script that takes hyperparameters as arguments and reports results as soon as they are observed. Let’s look at a simplified example of a script that exposes the learning rate, dropout rate, and momentum as hyperparameters, and reports the validation accuracy after each training epoch:

from argparse import ArgumentParser
from syne_tune.report import Reporter

if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('--lr', type=float)
    parser.add_argument('--dropout_rate', type=float)
    parser.add_argument('--momentum', type=float)
    parser.add_argument('--epochs', type=int)  # needed by the training loop below

    args, _ = parser.parse_known_args()
    report = Reporter()

    for epoch in range(1, args.epochs + 1):
        # ... train model and get validation accuracy        
        val_acc = compute_accuracy()
        
        # Feed the score back to Syne Tune.
        report(epoch=epoch, val_acc=val_acc)

The important part is the call to report. It transmits results to a scheduler, which decides whether to continue the evaluation of a configuration (a trial) and can later use this data to select new configurations. For our running example, we train a computer vision model adapted from the SageMaker examples on GitHub.

We define the search space for the hyperparameters (dropout, learning rate, momentum) that we want to optimize by specifying the ranges:

from syne_tune.search_space import loguniform, uniform

max_epochs = 27
config_space = {
    "epochs": max_epochs,
    "lr": loguniform(1e-5, 1e-1),
    "momentum": uniform(0.8, 1.0),
    "dropout_rate": loguniform(1e-5, 1.0),
}

We also specify the scheduler we want to use, Hyperband in our case:

from syne_tune.optimizer.schedulers.hyperband import HyperbandScheduler

scheduler = HyperbandScheduler(
    config_space,
    max_t=max_epochs,
    resource_attr='epoch',
    searcher='random',
    metric="val_acc",
    mode="max",
)

Hyperband is a method that randomly samples configurations and early stops trials that aren’t performing well enough after a few epochs. We use this particular scheduler for our example, but many others are available; for example, switching to searcher='bayesopt' enables us to use MoBster, which uses a surrogate model to sample new configurations to evaluate.

We’re now ready to define and launch a hyperparameter tuning job. First, we define the number of workers that evaluate trials concurrently and how long the optimization should run in seconds. Importantly, we use the local backend to evaluate our training script “train_cifar100.py” (see the full code). This means that the tuning happens on the local machine with one Python subprocess per worker. See the following code:

from syne_tune.backend.local_backend import LocalBackend
from syne_tune.tuner import Tuner
from syne_tune.stopping_criterion import StoppingCriterion

tuner = Tuner(
    backend=LocalBackend(entry_point="train_cifar100.py"),
    scheduler=scheduler,
    stop_criterion=StoppingCriterion(max_wallclock_time=7200),
    n_workers=4,
)

tuner.run()

As soon as the tuning starts, Syne Tune outputs the following line:

INFO:syne_tune.tuner:results of trials will be saved on /home/ec2-user/syne-tune/train-cifar100-2021-11-05-13-29-01-468

The log of the trials is stored in the aforementioned folder for further analysis. At any time during the tuning job, we can easily get the results obtained so far by calling load_experiment("train-cifar100-2021-11-05-15-22-27-531") and plotting the best result obtained since the start of the tuning job:

from syne_tune.experiments import load_experiment
tuning_experiment = load_experiment("train-cifar100-2021-11-05-15-22-27-531")
tuning_experiment.plot()

The following graph shows our results.

More fine-grained information is available if desired; the results obtained during tuning are stored as well as the scheduler and tuner state—namely, the state of the optimization process. For instance, we can plot the metric obtained for each trial over time (recall that we run four trials asynchronously). In the following figure, each trace represents the evaluation of a configuration as a function of the wall clock time; a dot is a trial stopped after one epoch.

We clearly see the effect of early stopping—only the most promising configurations are evaluated fully and poor performing configurations are stopped early, often after just evaluating a single epoch.
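
If you want to reproduce this kind of per-trial plot yourself, the following is a minimal sketch, assuming the object returned by load_experiment exposes its results as a pandas DataFrame and that it contains columns like trial_id, st_tuner_time, and val_acc; these names are assumptions and may differ across Syne Tune versions, so check the columns of your own results first.

import matplotlib.pyplot as plt
from syne_tune.experiments import load_experiment

tuning_experiment = load_experiment("train-cifar100-2021-11-05-15-22-27-531")
results = tuning_experiment.results  # one row per reported (trial, epoch) result

# Plot validation accuracy over wall clock time, one trace per trial.
for trial_id, trial_results in results.groupby("trial_id"):
    plt.plot(trial_results["st_tuner_time"], trial_results["val_acc"], marker="o")

plt.xlabel("wall clock time (s)")
plt.ylabel("validation accuracy")
plt.show()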

We can also easily switch to another scheduler, for example, random search or MoBster:

from syne_tune.optimizer.schedulers.fifo import FIFOScheduler

# Random search
scheduler = FIFOScheduler(
    config_space,
    searcher='random',
    metric="val_acc",
    mode="max",
)

# MoBster (model-based Hyperband)
scheduler = HyperbandScheduler(
    config_space,
    max_t=max_epochs,
    resource_attr='epoch',
    searcher='bayesopt',
    metric="val_acc",
    mode="max",
)

If we then run the same code with the new schedulers, we can compare all three methods. We see in the following figure that Hyperband only continues well-performing trials, and early stops poorly performing configurations.

Therefore, Hyperband evaluates many more configurations than random search (see the following figure), which uses resources to evaluate every configuration until the end. This can lead to drastic speedups of the tuning process in practice.

MoBster further improves over Hyperband by using a probabilistic surrogate model of the objective function.

The following figure shows all configurations that Hyperband samples during the tuning job.

In comparison, MoBster samples more promising configurations around the well-performing range (brighter color being better) of the search space instead of sampling them uniformly at random like Hyperband.

Run large-scale tuning jobs with Syne Tune and SageMaker

The previous example showed how to tune hyperparameters on a local machine. Sometimes, we need more powerful machines or a large number of workers, which motivates the use of a cloud infrastructure. Syne Tune provides a very simple way to run tuning jobs on SageMaker. Let’s look at how this can be achieved.

We first upload the CIFAR-100 dataset to Amazon Simple Storage Service (Amazon S3) so that it’s available on the EC2 instances:

import sagemaker

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
prefix = "sagemaker/DEMO-pytorch-cnn-cifar100"
role = sagemaker.get_execution_role()
inputs = sagemaker_session.upload_data(path="data", bucket=bucket, key_prefix="data/cifar100")

Next, we specify that we want trials to be run on the SageMaker backend. We use the SageMaker framework (PyTorch) in this particular example because we have a PyTorch training script, but you can use any SageMaker framework (such as XGBoost, TensorFlow, Scikit-learn, or Hugging Face).

A SageMaker framework is a Python wrapper that allows you to run ML code easily by providing a pre-made Docker image that works seamlessly on CPU and GPU for many framework versions. In this particular example, all we need to do is instantiate the PyTorch wrapper with our training script:

from sagemaker.pytorch import PyTorch
from syne_tune.backend.sagemaker_backend.sagemaker_utils import get_execution_role
from syne_tune.backend.sagemaker_backend.sagemaker_backend import SagemakerBackend

backend = SagemakerBackend(
    sm_estimator=PyTorch(
        entry_point="./train_cifar100.py",
        instance_type="ml.g4dn.xlarge",
        instance_count=1,
        role=get_execution_role(),
        framework_version='1.7.1',
        py_version='py3',
    ),
    inputs=inputs,
)

We can now run our tuning job again, but this time we use 20 workers, each having their own GPU:

tuner = Tuner(
    backend=backend,
    scheduler=scheduler,
    stop_criterion=StoppingCriterion(max_wallclock_time=7200, max_cost=20.0),
    n_workers=20,
    tuner_name="cifar100-on-sagemaker"
)

tuner.run()

After each instance initiates a training job, you see the status update as in the local case. An important difference from the local backend is that the total estimated dollar cost is displayed, as well as the cost per worker.

trial_id      status  iter  dropout_rate  epochs        lr  momentum  epoch  val_acc  worker-time  worker-cost
        0  InProgress     1      0.003162      30  0.001000  0.900000    1.0   0.4518         50.0     0.010222
        1  InProgress     1      0.037723      30  0.000062  0.843500    1.0   0.1202         50.0     0.010222
        2  InProgress     1      0.000015      30  0.000865  0.821807    1.0   0.4121         50.0     0.010222
        3  InProgress     1      0.298864      30  0.006991  0.942469    1.0   0.2283         49.0     0.010018
        4  InProgress     0      0.000017      30  0.028001  0.911238      -        -                  -
        5  InProgress     0      0.000144      30  0.000080  0.870546      -        -            -            -
6 trials running, 0 finished (0 until the end), 387.53s wallclock-time, 0.04068444444444444$ estimated cost

Because we specified max_wallclock_time=7200 and max_cost=20.0, the tuning job stops when the wall clock time or the estimated cost exceeds the specified bound. In addition to being estimated, the cost can also be treated as an objective and optimized with our multi-objective optimizers (see the GitHub repo for an example). As shown in the following figures, the SageMaker backend allows you to evaluate many more hyperparameter and architecture configurations in the same wall clock time than the local backend does and, as a result, increases the likelihood of finding a better configuration.

Conclusion

In this post, we saw how to use Syne Tune to launch tuning experiments on your local machine and also on SageMaker for large-scale experiments. To learn more about the library, check out our GitHub repo for documentation and examples that show, for instance, how to run model-based Hyperband, tune multiple objectives, or run with your own scheduler. We look forward to your contributions and seeing how this solution can address everyday tuning of ML pipelines and models.


About the Author

David Salinas is a Sr Applied Scientist at AWS.

Aaron Klein is an Applied Scientist at AWS.

Matthias Seeger is a Principal Applied Scientist at AWS.

Cedric Archambeau is a Principal Applied Scientist at AWS and Fellow of the European Lab for Learning and Intelligent Systems.

Read More

Your guide to AI and ML at AWS re:Invent 2021

It’s almost here! Only 9 days until AWS re:Invent 2021, and we’re very excited to share some highlights you might enjoy this year. The AI/ML team has been working hard to serve up some amazing content and this year, we have more session types for you to enjoy. Back in person, we now have chalk talks, workshops, builders’ sessions, and our traditional breakout sessions. Last year we hosted the first-ever machine learning (ML) keynote, and we are continuing the tradition. We also have more interactive and fun events happening with our AWS DeepRacer League and AWS BugBust Challenge. There are over 200 AI/ML sessions, including breakout sessions with customers such as Aon Corporation, Qualtrics, Shutterstock, and Bloomberg.

To help you plan your agenda for this year’s re:Invent, here are some highlights of the AI/ML track. You can also get the scoop from some of our AI/ML Community Heroes. So buckle up, and start registering for your favorite sessions.

Swami Sivasubramanian keynote

Wednesday, December 1, 8:30 am PT

Join Swami Sivasubramanian, Vice President, Machine Learning, AWS, for an exploration of what it takes to put data into action with an end-to-end data strategy, including the latest news on databases, analytics, and ML.

AI/ML leadership session with Bratin Saha

Wednesday, December 1, 4:00 pm PT

With the rise in compute power and data proliferation, ML has moved from the periphery to being a core part of businesses and organizations across industries. AWS customers use ML and AI services to make accurate predictions, get deeper insights from their data, reduce operational overhead, improve customer experiences, and create entirely new lines of business. In this session, hear from Bratin Saha, Vice President, Machine Learning, AWS, and explore how AWS services can help you move from idea to production with ML.

AI/ML session preview

Here’s a preview of some of the different sessions we’re offering this year by session type. You can always log in to the event portal to favorite or register for any of these sessions, or search the catalog for over 200 other sessions available.

Breakout sessions

Prepare data for ML with ease, speed, and accuracy (AIM319)

Join this session to learn how to prepare data for ML in minutes using Amazon SageMaker. SageMaker offers tools to simplify data preparation so that you can label, prepare, and understand your data. Walk through a complete data-preparation workflow, including how to label training datasets using SageMaker Ground Truth, as well as how to extract data from multiple data sources, transform it using the prebuilt visualization templates in SageMaker Data Wrangler, and create model features. Also, learn how to improve efficiency by using SageMaker Feature Store to create a repository to store, retrieve, and share features.

Achieve high performance and cost-effective model deployment (AIM408)

To maximize your ML investments, high-performance and cost-effective techniques are needed to scale model deployments. In this session, learn about the deployment options available in Amazon SageMaker, including optimized infrastructure choices; real-time, asynchronous, and batch inference; multi-container endpoints; multi-model endpoints; auto scaling; model monitoring; and CI/CD integration for your ML workloads. Discover how to choose the right inference option for your ML use case. Then, hear from Goldman Sachs about how they use SageMaker for fast, low-latency, and scalable deployments to provide relevant research content recommendations for their clients.

Implementing MLOps practices with Amazon SageMaker, featuring Vanguard (AIM320)

Implementing MLOps practices helps data scientists and operations engineers collaborate to prepare, build, train, deploy, and manage models at scale. During this session, explore the breadth of MLOps features in Amazon SageMaker that help you provision consistent model development environments, automate ML workflows, implement CI/CD pipelines for ML, monitor models in production, and standardize model governance capabilities. Then, hear from Vanguard as they share their journey enabling MLOps to achieve ML at scale for their polyglot model development platforms using SageMaker features, including SageMaker projects, SageMaker Pipelines, SageMaker Model Registry, and SageMaker Model Monitor.

Enhancing the customer experience with Amazon Personalize (AIM204)

Personalizing content for a customer online is key to breaking through the noise. Yet, brands face challenges that often prevent them from providing these seamless, relevant experiences. Learn how easy it is to use Amazon Personalize to tailor product and content recommendations to ensure that your users are getting the content they want, leading to increased engagement and retention.

AI/ML for sustainability innovation: Insight at the edge (AIM207)

As climate change, wildlife conservation, public health, racial and economic equity, and new energy solutions become increasingly interdependent, scalable solutions are needed for actionable analysis at the intersection of these fields. In this session, learn how the power of AI/ML and IoT can be brought as close as possible to the challenging edge environments that provide data to create these insights. Also learn how AWS puts AI/ML in the hands of the largest-scale fisheries on the planet, and how organizations can leverage data to support more sustainable, resilient supply chains.

Get started with AWS computer vision services (AIM202)

This session provides an overview of AWS computer vision services and demonstrates how these pretrained and customizable ML capabilities can help you get started quickly—no ML expertise required. Learn how to deploy these models onto the device of your choice to run an inference locally or use cloud APIs for your specific computing needs. Learn first-hand how Shutterstock uses AWS computer vision services to create performance at scale for media analysis, content moderation, and quality inspection use cases.

Chalk talk sessions

Build an ML-powered demand planning system using Amazon Forecast (AIM310)

This chalk talk explores how you can use Amazon Forecast to build an ML-powered, fully automated demand planning system for your business or your multi-tenant SaaS platform without needing any ML expertise. Forecast automatically generates highly accurate forecasts using ML, explains the drivers behind those forecasts, and keeps your ML models always up to date to capture new trends.

Hello, is it conversational AI you’re looking for? (AIM305)

Customers calling in for support expect a personalized experience and a quick resolution to their issue. With chatbots, you can provide automated and human-like conversational experiences for your customers. In this chalk talk, discuss strategies to design personalized experiences using Amazon Lex and Amazon Polly. Explore how to design conversation paths, customize responses, integrate with your applications, and enable self-service use cases to scale your customer support functions.

Harness the power of ML to protect your business with Amazon Fraud Detector (AIM308)

How does more than 20 years of Amazon experience fighting fraud translate into an AI service that can help companies detect more online fraud faster? In this session, learn how Amazon Fraud Detector transforms raw data into highly accurate ML-based fraud detection models. Then, discover how the service does data preparation and validation, feature engineering, data enrichment, and model training and tuning. Finally, with actual customer examples across a wide range of industries and fraud use cases, find out how the service makes deployment easy.

Deep learning applications with PyTorch (AIM404)

By using PyTorch in Amazon SageMaker, you have a flexible deep learning framework combined with a fully managed ML solution that allows you to transition seamlessly from research prototyping to production deployment. In this session, hear from the PyTorch team on the latest features and library releases. Also, learn how to develop with PyTorch using SageMaker for key use cases, such as using a BERT model for natural language processing (NLP) and instance segmentation for fine-grained computer vision with distributed training and model parallelism.

Explore, analyze, and process data using Jupyter notebooks (AIM324)

Before using a dataset to train a model, you need to explore, analyze, and preprocess it. During this chalk talk, learn how to use Amazon SageMaker to complete these tasks in a Jupyter notebook environment.

Machine learning at the edge with Amazon SageMaker (AIM410)

More ML models are being deployed on edge devices such as robots and smart cameras. In this chalk talk, dive into building computer vision (CV) applications at the edge for predictive maintenance, industrial IoT, and more. Learn how to operate and monitor multiple models across a fleet of devices. Also walk through the process to build and train CV models with Amazon SageMaker and how to package, deploy, and manage them with SageMaker Edge Manager. The chalk talk also covers edge device setup and MLOps lifecycle with over-the-air model updates and data capture to the cloud.

Builders’ sessions

Build and deploy a custom computer vision model in 60 minutes (AIM314)

Amazon Rekognition Custom Labels is an automated ML feature that enables customers to quickly train their own custom models for detecting business-specific objects and scenes from images—no ML expertise is required. In this builders’ session, learn how to use Amazon Rekognition Custom Labels to build and deploy your own computer vision model and push it to an application to showcase inference on images from a camera feed. Bring your laptop and an AWS account.

Easily label training data for machine learning at scale (AIM406)

Join this session to learn how to create high-quality labels while also reducing your data labeling costs by up to 70%. This builders’ session walks through the different workflow options in Amazon SageMaker Ground Truth, such as automatic labeling and assistive labeling features like auto-segmentation and image label verification. It also details how to build highly accurate training datasets for company brand logos, so you can build an ML model for company brand protection.

Workshop sessions

Develop your ML project with Amazon SageMaker (AIM402)

In this workshop, learn how to develop a full ML project end to end with Amazon SageMaker. Start with data exploration and analysis, data cleansing, and feature engineering with SageMaker Data Wrangler. Then, store features in SageMaker Feature Store, extract features for training with SageMaker Processing, train a model with SageMaker training, and then deploy it with SageMaker hosting. Also, learn how to use SageMaker Studio as an IDE and SageMaker Pipelines for orchestrating the ML workflow.

End-to-end 3D machine learning on Amazon SageMaker (AIM414)

As lidar sensors become more accessible and cost-effective, customers increasingly use point cloud data in new spaces like autonomous driving, robotics, and augmented reality. The growing availability of lidar sensors has increased use of point cloud data for ML tasks like 3D object detection, segmentation, object synthesis, and reconstruction. This workshop features Amazon SageMaker Ground Truth and explains how to ingest raw 3D point cloud data, label it, train a 3D object detection model, and deploy the model. The model in this session will be trained on an autonomous vehicle dataset.

AI workflow automation for document processing (AIM316)

Mortgage packets have hundreds of documents in various layouts and formats. With ML, you can set up a document-processing pipeline to automate mortgage application workflows like extracting text from W2s, paystubs, and deeds; classifying documents; or using custom entity recognition to pull out specific data points. In this workshop, learn various ways to use optical character recognition (OCR), NLP, and human-in-the-loop services to build a document-processing pipeline to automate mortgage applications—saving time, reducing manual effort, and improving ROI for your organization.

Boost the value of your media content with ML-powered search (AIM315)

Consumers rely on content not only to entertain but also to educate and facilitate purchasing decisions. To meet this demand, media content production is exploding. However, the process of producing, distributing, and monetizing this content is often complex, expensive, and time-consuming. Applying artificial intelligence and ML capabilities like image and video analysis, audio transcription, machine translation, and text analytics can solve many of these problems. In this workshop, utilize ML to extract detailed metadata from content and make it available for search, discovery, and editing use cases.

Instantly detect and diagnose anomalies within your business data (AIM302)

Anomalies in business data often indicate potential issues or even opportunities. ML can help you detect anomalies and then act on them proactively. In this workshop, learn how Amazon Lookout for Metrics automatically detects anomalies across thousands of metrics in near-real time and reduces false alarms.

Join the first annual AWS BugBust re:Invent Challenge and help set a Guinness record

The largest code fixing challenge is here! Python and Java developers of all skill levels can compete to fix software bugs, earn points, and win an array of prizes including Amazon Echo Dots, hoodies, and the grand prize of $1,500 USD. As you bust bugs, you also become part of an attempt to set the record for the largest bug fixing challenge with the Guinness World Records. All registered participants who fix even one bug will receive exclusive prizes and a certificate from AWS and Guinness to commemorate their contribution. Let the bug busting begin! You can join the challenge virtually or in-person at the AWS BugBust Hub in the main expo. Register now for free.

AWS DeepRacer: The fastest way to get rolling with machine learning

Developers of all skill levels from beginners to experts can get hands-on with ML by using AWS DeepRacer to train models in a cloud-based 3D racing simulator. Racers from virtually anywhere in the world can compete in the AWS DeepRacer League, the first global autonomous racing league driven by reinforcement learning. The race is on now! Sign in to AWS DeepRacer and compete in the AWS re:Invent Open for prizes and glory now through December 31, 2021. Tune in to the AWS DeepRacer League Championships on Twitch November 19 and 22 to see the 40 fastest developers of the 2021 season compete live. Learn from the best as they vie for a chance to advance to the Championship Cup Finale during Swami Sivasubramanian’s keynote on December 1, where they will race for their shot at $20,000 USD in cash prizes and the right to hoist the Championship Cup!

For those attending re:Invent in Las Vegas, don’t miss out on the opportunity to take your model from Sim2Real (simulation to reality) on the AWS DeepRacer Speedway inside the content hub at Caesar’s Forum. Upload your model and race a 1/18th scale autonomous RC car on a physical track. Stop by Tuesday afternoon to participate in the livestreamed wildcard race for a chance to win a trip back for re:Invent 2022. No model? No problem! The all-new AWS DeepRacer Arcade is available in the expo, where you can literally get in the driver’s seat and take the wheel in this educational racing game. Take a spin on the virtual track and then compete against a featured AWS DeepRacer autonomous model in this arcade racing experience, with prizes and giveaways galore. Shift into the fast lane on your ML learning journey with AWS DeepRacer.

Head over to the re:Invent portal to build your schedule so you’re ready to hit the ground running. Be sure to stop by and talk to our experts at the AI/ML booth, or chat with the speakers after sessions. We can’t wait to see you in Las Vegas!


About the Authors

Andrea Youmans is a Product Marketing Manager on the AI Services team at AWS. Over the past 10 years she has worked in the technology and telecommunications industries, focused on developer storytelling and marketing campaigns. In her spare time, she enjoys heading to the lake with her husband and Aussie dog Oakley, tasting wine and enjoying a movie from time to time.

Read More

AWS AI/ML Community attendee guides to AWS re:Invent 2021

The AWS AI/ML Community has compiled a series of session guides to AWS re:Invent 2021 to help you get the most out of re:Invent this year. They covered four distinct categories relevant to AI/ML. With a number of our guide authors attending re:Invent virtually, you will find a balance between virtually accessible sessions and sessions available in-person.

The AWS AI/ML Community is a vibrant group of developers, data scientists, researchers, and business decision-makers that dive deep into artificial intelligence and machine learning (ML) concepts, contribute with real-world experiences, and collaborate on building projects together.

Community guides for developers new to machine learning

From AWS ML Hero Mike Chambers: AWS re:Invent 2021: How To, tips, and my session selection (video). In this video—which should be required viewing for anyone new to re:Invent—Mike dives deep, beyond simply recommending sessions, with loads of tips and advice for how to make the most of your re:Invent experience—in-person or virtual.

AWS ML Hero Cyrus Wong’s top five sessions AI/ML newbies should attend! For folks new to ML on AWS, spend your time learning and making use of Amazon AI/ML services with Cyrus’s top five re:Invent sessions.

AWS re:Invent 2021: How to maximize your in-person learning experience as a new Machine Learning practitioner, from AWS ML Community Builder Martin Paradesi. For those attending re:Invent in-person this year, check out Martin’s guide for five sessions curated for new ML practitioners.

From our new Egypt-based AWS ML Hero Salah Elhossiny: Top 5 AWS ML Sessions to Attend at AWS re:Invent 2021. For those new to AWS ML, spend your time learning and using Amazon SageMaker with the best five AWS re:Invent sessions to help you get started quickly!

Community guides for AI/ML developers

AWS ML Hero Juv Chan’s top five recommendations for AI/ML builders and architects. Juv, a Sr. Cloud AI Engineer/Architect, ML Hero, and re:Invent Championship Cup 2019 finalist, shares his top five session picks and can’t-miss photos from re:Invent 2019.

Top 5 Sessions for AI/ML Developers at AWS re:Invent 2021, from AWS ML Community Builder Brooke Jamieson. For those attending re:Invent virtually this year, check out Brooke’s guide.

AWS ML Hero Tomasz Ptak’s AWS re:Invent 2021 schedule. Tomasz shares his session picks plus tips and advice for making the most of your re:Invent experience.

Production-grade ML re:Invent 2021 sessions guide, from AWS ML Community Builder Kyle Gallatin. Kyle shares five ML talks skewed toward his interests in scalable, production-grade ML.

Community guides for MLOps developers

AWS ML Hero Rustem Feyzkhanov’s top MLOps breakout sessions to look forward to at re:Invent 2021. Rustem shares seven sessions to help you stay in the loop of MLOps in the AWS Cloud.

AWS ML Community Builder Phil Basford’s must-see sessions. For those interested in MLOps, ML architecture, edge computing, or data analytics, see Phil’s guide and his tips on how to have fun in Vegas and at home for those attending virtually.

Community guides for ML data scientists

AWS ML Hero’s Philipp Schmid’s remote guide for your virtual re:Invent 2021, focused on NLP and machine learning. Attending remote from Germany, Hugging Face ML engineer and AWS ML Hero Philipp Schmid offers an in-depth guide.

AWS ML Community Builder Pier Paolo Ippolito’s top five suggestions for ML data scientists. Pier, a data scientist at SAS and editor at Towards Data Science, shares his top five picks curated for technical ML builders.

Other AWS ML Community guides worth exploring

AWS ML Hero Kesha Williams’s Machine Learning Attendee Guide 2021. The official AWS Hero guide from Kesha dives deep across all session categories. Check this guide out for a full walkthrough of how to build your schedule, and the ultimate deep dive into Kesha’s ML session picks.

Lastly, we have a unique in-depth guide from AWS ML Community Builder Janos Tolgyesi. Learn how to fight climate change with ML skills and make the Earth a better place with ML at re:Invent 2021. Janos shares his session picks and a bonus session suggestion for those interested in beer, plus personalized recommendations!

Whether you’re attending in-person or virtually this year, we hope these recommendations and advice from the AWS ML Community help you make the most of your re:Invent experience. Have a great re:Invent!


About the Author

Paxton Hall is a Marketing Program Manager for the AWS AI/ML Community on the AI/ML Education team at AWS. He has worked in retail and experiential marketing for the past 7 years, focused on developing communities and marketing campaigns. Out of the office, he’s passionate about public lands access and conservation, and enjoys backcountry skiing, climbing, biking, and hiking throughout Washington’s Cascade mountains.

Read More

Understand drivers that influence your forecasts with explainability impact scores in Amazon Forecast

We’re excited to launch explainability impact scores in Amazon Forecast, which help you understand the factors that impact your forecasts for specific items and time durations of interest. Forecast is a managed service for developers that uses machine learning (ML) to generate more accurate demand forecasts, without requiring any ML experience. To increase forecast model accuracy, you can add additional information or attributes such as price, promotion, category details, holidays, or weather information to your forecasting model, but you may not know how each attribute influences your forecast. With today’s launch, you can now understand how each attribute impacts your forecasted values using the explainability feature, which we discuss in this post.

ML-based forecasting models, which are more accurate than heuristic rules or human judgment, can drive significant improvement in revenue and customer experience. However, business leaders often lose trust in ML systems when forecasted numbers differ drastically from their intuition. Because demand planning decisions have a high impact on the business, leaders may end up overriding forecasts when they are expected to take model predictions at face value, without understanding why those forecasts were generated and what factors are pushing them higher or lower. This can compromise forecast accuracy, and you may lose the benefit of ML forecasting.

Amazon Forecast now provides explainability, which gives you item-level insights across your preferred time duration. Having a certain level of understanding on why a particular forecast value is high or low at a particular time is helpful for decision-making and building trust and confidence in your ML solutions. Explainability reports include impact scores, which help you understand how each attribute in your training data contributes to either increasing or decreasing your forecasted values for specific items. In addition, you can choose to understand explainability for your entire forecast horizon or for specific time durations. Explainability removes the need of running multiple manual analyses to understand past sales and external variable trends to explain forecast results.

How to interpret explainability impact scores

Explainability helps you better understand how the attributes in your datasets, such as price, category, or holidays, impact your forecast values. Forecast uses a metric called impact scores to quantify the relative impact of each attribute and determine whether they generally increase or decrease forecast values.

Impact scores measure the relative impact attributes have on forecast values. For example, if the price attribute has an impact score that is twice as large as the brand_id attribute, you can conclude that the price of an item has twice the impact on forecast values as the product brand. Impact scores also provide information on whether an attribute increases or decreases the forecasted value. A negative impact score reflects that the attribute tends to decrease the value of the forecast.

Impact scores measure the relative impact of attributes to each other, not the absolute impact. If an attribute has a low impact score, that doesn’t necessarily mean that it has a low impact on forecast values; it means that it has a lower impact on forecast values than other attributes used by the predictor. If you change attributes in your predictor, the impact scores may differ, and the attribute with the low impact score may have a higher score relative to other attributes. Also, you can’t use impact scores to determine whether particular attributes improve the model accuracy or not. You should use accuracy metrics such as weighted quantile loss and others provided by Forecast to assess predictor accuracy.

The following example explainability report graph shows the relative impact of different attributes on the forecasted value of item_id 1 across all the time points in the forecast horizon. The relative impact is in the following order: Price has the highest impact, followed by StoreLocation, then Promo and Holiday_US. Price has the highest influence on item_id 1 and tends to increase the forecast value. StoreLocation has the second highest impact on item_id 1 but tends to decrease the forecast value. Because Promo has an impact score close to 0.2, Price has five times more impact than Promo on the forecasted value of item_id 1, and both attributes tend to increase the forecast value. Holiday_US has an impact score of 0, which means that this attribute doesn’t increase or decrease the forecast value for item_id 1 relative to other attributes.

The following image shows an example of the explainability report export file with the impact scores for specific time series and time points as well as aggregated scores across those time series and time points.

Generate explainability impact scores

In this section, we walk through how to generate explainability impact scores for your forecasts using the Forecast console. To use the new CreateExplainability API, refer to the notebook in our GitHub repo or review Forecast Explainability.
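
For reference, the following is a rough boto3 sketch of what a CreateExplainability call might look like; the names, ARNs, S3 paths, and schema are placeholders, and you should confirm the exact parameters and date formats in the Forecast API reference before using them.

import boto3

forecast = boto3.client("forecast")

response = forecast.create_explainability(
    ExplainabilityName="demand-explainability",  # hypothetical name
    ResourceArn="arn:aws:forecast:us-east-1:123456789012:forecast/demand-forecast",  # your forecast ARN
    ExplainabilityConfig={
        "TimeSeriesGranularity": "SPECIFIC",  # score only the time series you upload
        "TimePointGranularity": "SPECIFIC",   # score only a specific time duration
    },
    DataSource={
        "S3Config": {
            "Path": "s3://your-bucket/time-series-list.csv",
            "RoleArn": "arn:aws:iam::123456789012:role/ForecastRole",
        }
    },
    Schema={"Attributes": [{"AttributeName": "item_id", "AttributeType": "string"}]},
    StartDateTime="2021-11-01T00:00:00",
    EndDateTime="2021-11-07T00:00:00",
)
print(response["ExplainabilityArn"])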

  1. On the Forecast console, create a dataset group. Upload your historical demand dataset as target time series followed by related time series or item metadata that you want to use for more accurate forecasting and for which you’re interested in seeing explainability impact scores.

  2. In the navigation pane, under your dataset, choose Predictors.
  3. Choose Train new predictor.

Forecast uses AutoPredictor as the default training option. No further action is needed from you, but remember that only forecasts generated from a model trained with AutoPredictor are eligible for generating explainability impact scores later.

  4. Now that your model is trained, choose Forecasts in the navigation pane.
  5. Choose Create a forecast.
  6. Select your trained predictor to create a forecast.
  7. Choose Insights in the navigation pane.
  8. Choose Create explainability.

  9. Choose the forecast that you want to generate explainability impact scores for.
  10. Choose if you want to see impact scores for all the time points in the forecast horizon or only for a specific time duration.

You can specify up to 500 consecutive time points per explainability report.

  11. Upload the list of specific time series for which you want to see explainability impact scores.

A time series is a unique combination of item ID and dimension. You can specify up to 50 time series per Forecast explainability.

  12. Specify the schema of the CSV file that you have uploaded.
  13. Choose Create explainability.

It takes less than an hour to generate the explainability impact scores.

  14. When the job status is active, choose the explainability job to view the impact score.

Here you can review the explainability impact score graph. You can use the controls at the top of the graph to drill down to specific time series or time points or view at an aggregated level.

  15. To export all the impact scores, choose Create explainability export in the Explainability exports section.
  16. Provide the export details and choose Create explainability export.

The export is saved in an Amazon Simple Storage Service (Amazon S3) bucket that you specify.

  17. When the export is complete, navigate to your S3 bucket to review the explainability report CSV file.

The following is an example of your explainability export CSV file. Depending on how large your dataset is, multiple files may be exported.

Aggregate explainability impact scores for category-level analysis

You may want to review explainability for a group of items together, which can have more than 50 items. For example, a grocery retailer might be interested in understanding what is driving the forecasts for all their fruits and vegetables, and this category may consist of more than 50 SKUs in their data. However, Forecast lets you specify up to 50 time series per Forecast explainability job. If you have more than 50 time series, you need to run the explainability job multiple times with different items in each job and then combine them.

The explainability export file provides two types of impact scores: normalized impact scores and raw impact scores. Raw impact scores are based on Shapley values and aren’t scaled or bounded. Normalized impact scores scale the raw scores to a value between -1 and 1. Raw impact scores are useful for combining and comparing scores across different explainability resources. To analyze a category, aggregate the raw impact scores of all its time series across the multiple explainability jobs, then compare the aggregated scores to find the relative influence of each attribute. You can view an example of how to do so by following the notebook in our GitHub repo.
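
As a sketch of that aggregation, assuming each export file contains one row per time series and attribute with a raw impact score column; the column names used below are assumptions and may differ from your actual export.

import glob
import pandas as pd

# Load the export CSVs produced by several explainability jobs (paths are placeholders).
frames = [pd.read_csv(path) for path in glob.glob("explainability-exports/*.csv")]
scores = pd.concat(frames, ignore_index=True)

# Aggregate the raw (unscaled) impact scores per attribute across all time series
# in the category, then rank attributes by their relative influence.
aggregated = (
    scores.groupby("attribute_name")["raw_impact_score"]  # hypothetical column names
    .mean()
    .sort_values(ascending=False)
)
print(aggregated)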

Conclusion

Forecast now provides explainability for specific items and time durations of interest. With the explainability feature, you can understand how each attribute impacts your forecasted values. To learn more, review Forecast Explainability and the notebook in our GitHub repo. If you are interested in aggregated explainability for all your items at the predictor level, review our blog on using the CreateAutoPredictor API here. Explainability is available in all Regions where Forecast is publicly available. For more information about Region availability, see AWS Regional Services.


About the Authors

Namita Das is a Sr. Product Manager for Amazon Forecast. Her current focus is to democratize machine learning by building no-code/low-code ML services. On the side, she frequently advises startups and loves training her dog with new tricks.

Dima Fayyad is a Software Development Engineer on the Amazon Forecast team. She is passionate about machine learning and AI and is currently working on large-scale distributed systems in the forecasting space. In her free time, she enjoys exploring different cuisines, traveling, and skiing.

Youngsuk Park is a Machine Learning Scientist at AWS AI and Amazon Forecast. His research lies in the interplay between machine learning, optimization, and decision-making, with over 10 publications in top-notch ML/AI venues. Before joining AWS, he obtained a PhD from Stanford University.

Shannon Killingsworth is a UX Designer for Amazon Forecast. His current work is creating console experiences that are usable by anyone, and integrating new features into the console experience. In his spare time, he is a fitness and automobile enthusiast.

Read More

New Amazon Forecast API that creates up to 40% more accurate forecasts and provides explainability

We’re excited to announce a new forecasting API for Amazon Forecast that generates up to 40% more accurate forecasts and helps you understand which factors, such as price, holidays, weather, or item category, are most influencing your forecasts. Forecast uses machine learning (ML) to generate more accurate demand forecasts, without requiring any ML experience. Forecast brings the same technology used at Amazon to developers as a fully managed service, removing the need to manage resources.

With today’s launch, Forecast can now produce up to 40% more accurate forecasts by using a combination of ML algorithms that are best suited for your data. In many scenarios, ML experts train separate models for different parts of their dataset to improve forecasting accuracy. This process of segmenting your data and applying different algorithms can be very challenging for non-ML experts. Forecast uses ML to learn not only the best algorithm for each item, but the best ensemble of algorithms for each item, leading to up to 40% better accuracy on forecasts.

To further increase forecast model accuracy, you can add additional information or attributes such as price, promotion, category details, holidays, or weather information, but you may not know how each attribute influences your forecast. Forecasting is mission critical, and therefore having a certain level of attribute explainability is helpful for decision-making. With today’s launch, Forecast now helps you understand and explain how your forecasting model is making predictions by providing explainability reports after your model has been trained. Explainability reports include impact scores, so you can understand how each attribute in your training data contributes to either increasing or decreasing your forecasted values. By understanding how your model makes predictions, you can make more informed business decisions. For example, you can verify that your model is behaving as expected by confirming that attributes with a high impact score represent a valid signal for predictions in your business problem.

You can bring in your recent data to incorporate the latest insights before forecasting for the next period. However, in doing so, you have to train your entire forecasting model again, which is a time-consuming process. Most Forecast customers deploy their forecasting workflow within their operations, such as an inventory management solution, and run it at a set cadence. Because retraining on the entire dataset can be time-consuming, customer operations may get delayed. With today’s launch, you can save up to 50% of retraining time by choosing to incrementally retrain your models with the new information that you have added.

To get more accurate forecasts, faster retraining, and explainability, use the new experience through the AWS Management Console or the CreateAutoPredictor API. This launch is accompanied with new pricing, which you can review at Amazon Forecast pricing.

Interpreting model explainability

Explainability helps you better understand how the attributes in your datasets, such as price, category, or holidays, impact your forecast values. Forecast uses a metric called impact scores to quantify the relative impact of each attribute and determine whether they generally increase or decrease forecast values.

Impact scores measure the relative impact attributes have on forecast values. For example, if the price attribute has an impact score that is twice as large as the brand_id attribute, you can conclude that the price of an item has twice the impact on forecast values as the product brand. Impact scores also provide information on whether an attribute increases or decreases the forecasted value. A negative impact score reflects that the attribute tends to decrease the value of the forecast.

Impact scores measure the relative impact of attributes to each other, not the absolute impact. If an attribute has a low impact score, that doesn’t necessarily mean that it has a low impact on forecast values; it means that it has a lower impact on forecast values than other attributes used by the predictor. If you change attributes in your predictor, the impact scores may differ, and the attribute with the low impact score may have a higher score relative to other attributes. Also, you can’t use impact scores to determine whether particular attributes improve the model accuracy or not. You should use accuracy metrics such as weighted quantile loss and others provided by Forecast to assess predictor accuracy.

In the following graph, we take an example of a predictor where the relative impact of attributes is as follows: US holidays, promos, weather, price, and category. US holidays has the highest impact on the forecast values. US holidays tend to increase the forecasted value. Category has the lowest impact on the forecast values, and this attribute tends to decrease the forecast value.

Train a new predictor with the new Forecast API

In this section, we walk through how to train a new predictor using the newly launched forecasting API through the console. To use the new CreateAutoPredictor API directly, refer to the notebook in our GitHub repo or review Training Predictors.
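
For reference, a minimal boto3 sketch of the corresponding CreateAutoPredictor call might look like the following; the names and the dataset group ARN are placeholders, and you should verify the parameters against the Forecast API reference.

import boto3

forecast = boto3.client("forecast")

response = forecast.create_auto_predictor(
    PredictorName="demand-auto-predictor",  # hypothetical name
    ForecastHorizon=14,                     # how far into the future to forecast
    ForecastFrequency="D",                  # forecasting frequency (daily here)
    ForecastTypes=["0.5", "0.9"],           # quantiles to forecast
    DataConfig={
        "DatasetGroupArn": "arn:aws:forecast:us-east-1:123456789012:dataset-group/demand"
    },
    ExplainPredictor=True,  # also generate the predictor explainability report
)
print(response["PredictorArn"])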

  1. On the Forecast console, create a dataset group and upload your historical demand dataset as target time series followed by any related time series or item metadata that you want to use for more accurate forecasting.
  2. In the navigation pane, under your dataset, choose Predictors.
  3. Choose Train new predictor.
  4. In the Predictor settings section, enter a name for your predictor, how long in the future you want to forecast with the forecasting frequency, and the number of quantiles you want to forecast for.
  5. AutoPredictor is enabled by default; no further action is needed from you.
  6. For Optimization metric, you can choose an accuracy metric for AutoPredictor to optimize when tuning the model. We leave this as the default for our walkthrough.
  7. To get the predictor explainability report, select Enable predictor explainability.
  8. Under the input data configuration, you can add local weather information and national holidays for more accurate demand forecasts.
  9. In the Attribute configuration section, you can choose filling options for missing values.
  10. Choose Start to start training your predictor.
  11. After your predictor is trained, choose your predictor on the Predictors page.

On the predictor’s details page, you can view the overall predictor accuracy metrics and the explainability impact score.

  12. Now that your model is trained, choose Forecasts in the navigation pane.
  13. Choose Create a forecast.
  14. For Predictor, choose your trained predictor to create a forecast.

Retrain your predictor with new data

We now walk through how to use the Forecast console to retrain your predictor when you have new data for the same forecasting problem. You can also follow the notebook in our GitHub repo to learn how to use the CreateAutoPredictor API for retraining your predictor.
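
As a rough sketch of the API-based flow (the names and ARN are placeholders; check the notebook and the API reference for the exact parameters), retraining is again a CreateAutoPredictor call that references the existing predictor:

import boto3

forecast = boto3.client("forecast")

# Retrain on the freshly imported data by referencing the existing AutoPredictor;
# the source predictor's configuration is copied over to the new predictor.
response = forecast.create_auto_predictor(
    PredictorName="demand-auto-predictor-retrained",  # hypothetical name
    ReferencePredictorArn="arn:aws:forecast:us-east-1:123456789012:predictor/demand-auto-predictor",
)
print(response["PredictorArn"])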

Before you retrain your predictor, you have to re-import your dataset with the latest available historical observations.

  1. On the Forecast console, under your dataset group in the navigation pane, choose Datasets.

In our example, we only update the target time series data. You can follow the same steps to update the related time series data as well.

  2. Choose the dataset name to view the details.
  3. In the Dataset imports section, choose Create dataset import.
  4. Provide the Amazon Simple Storage Service (Amazon S3) location of your dataset and complete importing your data.
  5. After your dataset has been imported, choose Predictors in the navigation pane.
  6. Select the predictor for which AutoPredictor enabled is True.

Only predictors with AutoPredictor enabled are eligible to be retrained.

  7. On the Predictor actions menu, choose Retrain.
  8. Enter a new name for the retrained predictor and choose Retrain predictor.

All the predictor configuration from the source predictor is automatically copied over to the new predictor that you retrain.

You’re redirected to the predictor details page where you can review the predictor settings.

  9. Now that your model is trained, choose Forecasts in the navigation pane.
  10. Choose Create a forecast.
  11. Choose your trained predictor to create a forecast.

Upgrade your existing legacy predictor to AutoPredictor

You can easily move your existing predictors to AutoPredictor to take advantage of more accurate forecasts by using a predictor that selects the best ensemble of algorithms for each item, faster retraining, and predictor explainability. Forecast takes the old predictor as a reference and creates a new AutoPredictor. You can follow the notebook in our GitHub repo to do the same through the CreateAutoPredictor API.

  1. On the Forecast console, choose a dataset group for which you have previously trained a predictor.
  2. In the navigation pane, under your dataset, choose Predictors.

An Upgrade link is next to any legacy predictor for which AutoPredictor is False.

  3. Select your predictor and on the Predictor actions menu, choose Upgrade.
  4. Enter the name of the new predictor.

All the predictor configurations from the old predictor are automatically copied over to train the new AutoPredictor.

You’re redirected to the predictor details page where you can review the predictor settings.

  5. Now that your model is trained, choose Forecasts in the navigation pane.
  6. Choose Create a forecast.
  7. Choose your trained predictor to create a forecast.

Conclusion

To get more accurate forecasts, faster retraining, and explainability, you can follow the steps mentioned in this post or follow the notebook in our GitHub repo. If you want to upgrade your existing forecasting models to the new CreateAutoPredictor API, you can do so with one click either through the console or as shown in the notebook in our GitHub repo. To learn more, review Training Predictors. We recommend reviewing the pricing for using these new features. All these new capabilities are available in all Regions where Forecast is publicly available. For more information about Region availability, see AWS Regional Services.


About the Authors

Namita Das is a Sr. Product Manager for Amazon Forecast. Her current focus is to democratize machine learning by building no-code/low-code ML services. On the side, she frequently advises startups and loves training her dog with new tricks.

Jitendra Bangani is an Engineering Manager at AWS, leading a growing team of curious and driven engineers for Amazon Forecast. He started his career at Amazon as an intern in 2013; since then he has helped build engaging shopping experiences, hyperscale distributed systems, and autonomous AI services that delight Amazon and AWS customers.

Hilaf Hasson is a Machine Learning Scientist at AWS, and currently leads the R&D team of scientists working on Amazon Forecast. Before joining AWS, he held multiple faculty positions, including as an Assistant Professor of Mathematics at Stanford University.

 Adarsh Singh works as a Software Development Engineer in the Amazon Forecast team. In his current role, he focuses on engineering problems and building scalable distributed systems that provide the most value to end users. In his spare time, he enjoys watching anime and playing video games.

Chinmay Bapat is a Sr. Software Development Engineer in the Amazon Forecast team. His interests lie in the applications of machine learning and building scalable distributed systems. Outside of work, he enjoys playing board games and cooking.

Read More

Next Gen Stats Decision Guide: Predicting fourth-down conversion

It is fourth-and-one on the Texans’ 36-yard line with 3:21 remaining on the clock in a tie game. Should the Colts’ head coach Frank Reich send out kicker Rodrigo Blankenship to attempt a 54-yard field goal or rely on his offense to convert a first down? Frank chose to go for it, leading to a first-down conversion and an eventual touchdown to seal the win. Was this the optimal call or a gamble that ended up working? Through a collaboration between the NFL’s Next Gen Stats team and AWS, NFL fans can now get an answer to this question.

Like the Colts-Texans example, the decision of what to do on a fourth down late in the game can be the difference between a win and a loss. While it can be tempting to focus on fourth-downs late in the game, even fourth-down decisions that occur early in the game can be important. Fourth-down decisions early in the game can have reverberating effects that compound over the course of a game or season. Head coaches who consistently make the right call on the fourth down put their teams in the best possible position to win, but how does a coach know what the right call is? What factors do they have to weigh, and how can a computer give fans insights into this complicated decision-making process?

The problem can be represented as a tree of choices and their respective potential outcomes. On any fourth down, a team has three main options: punt, kick a field goal, or go for it. If a team punts, their opponent generally gains possession of the ball at some point farther down the field. On a field goal attempt, the two main outcomes are the offensive team either makes the field goal or misses the field goal. If they make the field goal, they gain three points. If they miss the field goal, the defense gains possession of the ball at the location of the attempt. Similarly, if a team chooses to go for it, there are two main outcomes. Either the team gains enough yards for a first-down (or potentially a touchdown), or the defense gains possession of the ball at the end of the play.

When coaches decide what to do on a fourth-down, they must weigh all the potential outcomes and the impact of these outcomes on the odds of winning the game. To help fans understand a coach’s decision, the NFL and AWS partnered to create the Next Gen Stats Decision Guide. The Next Gen Stats Decision Guide is a suite of machine learning (ML) models designed to determine the optimal fourth-down call. The decision guide does this by predicting the odds of each potential fourth-down outcome and the resulting odds of winning the game. By comparing the odds of winning the game for each fourth-down choice, the Next Gen Stats Decision Guide provides a data-driven answer to that optimal fourth-down call.

Going back to Frank Reich’s decision, the Colts needed 0.25 yards to gain a first down. What is the probability that they convert? As shown in the following figure, our fourth-down conversion probability model predicts an 81% chance. When paired with the updated win probability of 75% if they convert, we get an expected win probability of 69%. However, if they choose to kick a field goal, the chance of making the field goal is around 42%. Paired with the win probability of 71% if successful, we get an expected win probability of 56%. Based on these expected probabilities, the Next Gen Stats Decision Guide recommends going for it with a 13% difference.

In addition to fourth-down decisions, coaches must decide what to do after scoring a touchdown. The team can kick an extra point (+1 point) or elect to attempt a two-point conversion (+2 points). The application of the Next Gen Stats Decision Guide to fourth-down plays and after-touchdown plays has been presented before, and is a good primer for this discussion. In this post, we focus on the models that determine the probability of converting on fourth down. We share how we engineered the features, developed the ML model, and chose the metrics used to evaluate the quality of its predictions.

Go-for-it model

If a team chooses to go for it on a fourth-down, the team must gain enough yards to make a first-down on that single play. This means that not all fourth-downs are equal. Some require the offense to gain less than a yard, while others may occasionally require the offense to gain more than 10 yards. The location on the field, time left on the clock, and relative strengths of the teams are among the important parameters in understanding the odds of success. In building the Go-for-it model, we examine these and other factors to determine which features are most important in constructing a performant model.

Problem formulation

The odds of converting on a fourth down can be formulated as a multi-class classification problem. In this formulation, each class represents the offense gaining some number of yards on the play. The probability of each class is used as the odds that the team will gain that number of yards on the play. The following histogram shows the yards gained on third- and fourth-down plays from 2016–2020. An initial approach might be to make each class in the model represent an integer number of yards gained, but the histogram shows that this approach will be difficult. Classes in the long tail of the graph (roughly 40–100 yards) occur infrequently, and this sort of class imbalance can be difficult to account for in model training.

To combat the potential class imbalance, we used an unequal distribution of yards to classes. Instead of each yard gained being an individual class, we used 17 different classes to encompass all the potential outcomes shown in the graph.

As shown in the following table, we use one class for all negative or zero-yards-gained results. For 1–15 yards gained, we use one class for each potential outcome. The reason for this breakdown is that 88% of fourth-down plays have between 1 and 15 yards to go. This enables the model to capture a large majority of fourth-down situations with high fidelity. To address plays with more than 15 yards to go, we employ a decay factor to represent the decreasing probability of gaining more yards on a single play.

Yards                   | Model Classes (17)
Less than or equal to 0 | 0
1–15 yards              | 1–15 (15 classes)
16+ yards               | 16

The following equation shows the decay factor used, where the probability of converting ($P_{\mathrm{conversion}}$) is the probability of getting 16 or more yards ($P_{16+}$) divided by the actual distance needed for a first down ($d$) minus 15 yards:

$$P_{\mathrm{conversion}} = \frac{P_{16+}}{d - 15}$$
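
To make the formulation concrete, the following sketch shows one way to map the model's 17 class probabilities to a conversion probability for a given distance. The handling of distances of 15 yards or less (summing the probabilities of gaining at least the required yards) is our assumption based on the class definitions above; only the decay factor for distances beyond 15 yards comes directly from the equation.

    import numpy as np

    def conversion_probability(class_probs, yards_to_go):
        """class_probs: length-17 vector from the multi-class model.
        Index 0 -> 0 or fewer yards gained, 1-15 -> exact yards gained, 16 -> 16+ yards."""
        class_probs = np.asarray(class_probs)
        if yards_to_go <= 15:
            needed = int(np.ceil(yards_to_go))  # e.g. 0.25 yards to go -> needs class 1 or higher
            return class_probs[needed:].sum()
        # Beyond 15 yards to go, apply the decay factor: P(16+) / (d - 15)
        return class_probs[16] / (yards_to_go - 15)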

Features

Just as a coach needs to consider many factors when deciding what to do in a game, the conversion probability models also have many potential features to use. Part of the modeling process involved determining which features to incorporate into the model. We used feature importance measures like correlation to help us identify several high-value features (see the following table). These features include the actual yards-to-go, the Vegas spread, and the historical aggregations of expected points added (EPA) by team and quarterback.

The actual yards-to-go is arguably the most important feature for this model, aligning with general football knowledge. The more yards a team needs to gain, the less likely the team is to achieve that outcome. What makes the actual yards-to-go metric even more valuable in this model is that it is derived from the NGS tracking data. Traditional NFL datasets often represent the yards-to-go as an integer, which obscures the variable nature of the game. With the NGS tracking data, we can get a measurement of the football’s location with sub-foot accuracy. This allows our model to understand the difference between fourth and inches versus fourth and 1 yard.

Although the actual yards-to-go is a clear metric to provide the model, some information is harder to quantify immediately and provide to the model. For example, a coach understands the unique skillsets of their team and the opposition, both on that day and historically. To assess coaching decisions, the model needs a way to use similar information. The Vegas lines are a useful condensation of vast amounts of situational and historical knowledge about the teams into a small set of numbers. Specifically, the point spread and the total points lines capture information about prevailing beliefs regarding the relative strengths of the teams, and the model found these values useful.

Input Features      | Description
actualYardsToGo     | The yards to go as measured using NGS tracking data between the ball at snap and the yards-to-go marker
isCalledPass        | Is the play predicted to be a pass or a rush?
totalLine           | The closing spread line for the game
possessionTeamLine  | The number of points the possession team is favored by according to Vegas
possessionTeamTotal | The number of total points the possession team is expected to score as indicated by the Vegas total and spread lines
offEpa              | A team offense's average expected points added per play over the last X number of plays in similar situations
defEpa              | A team defense's average expected points added allowed per play over the last X number of plays in similar situations
qbEpa               | A team offense's average expected points added per play over the last X number of plays when the quarterback on the field attempted a pass, run, or was sacked
qbSuccessEpa        | Quarterback success EPA for the last N similar plays

Similar to how the Vegas lines provide game-level insight into relative team strengths, we can use EPA values to provide insight into relative team strengths at a more granular level. These EPA values, calculated using other NGS models, provide insight into how the team has performed in similar situations in the past. The EPA models can be broken down by the offense, defense, and quarterback. This provides the model with information about how successful the respective teams have been in the past in addition to how successful the current quarterback has been. The following figure shows the relative importance of the features after HPO. As discussed earlier, this feature importance makes intuitive sense.
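
The EPA values themselves come from other NGS models, but the aggregation into features is a straightforward rolling average. The sketch below illustrates the idea for the offense-side feature; the column names, window size, and the definition of "similar situations" are assumptions for illustration, not the production feature pipeline.

    import pandas as pd

    def rolling_offense_epa(plays: pd.DataFrame, window: int = 100) -> pd.Series:
        """Average EPA per play for each offense over its previous `window` plays.
        Expects one row per play with (at least) 'gameId', 'playId', 'offenseTeam', 'epa'."""
        plays = plays.sort_values(["gameId", "playId"])
        return (
            plays.groupby("offenseTeam")["epa"]
            # shift(1) excludes the current play so its own outcome doesn't leak into the feature
            .transform(lambda s: s.shift(1).rolling(window, min_periods=10).mean())
        )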

Model training

To train the model, we used all the data from third- and fourth-down plays from 2016–2019 regular seasons as the training set. We held out the data from 2020 for the testing set.

For the model architecture, we compared a handful of different models, including XGBoost, PyTorch Tabular, and AutoML-based models. Of these options, the XGBoost model provided the best results. Its predictions can also be explained by using Shapley Additive exPlanations (SHAP) feature importance measures. Because our goal is to optimize for conversion probabilities, we used the Brier score (a probabilistic loss function) to measure the performance of our models. The Brier score measures the mean squared difference between the predicted probabilities assigned to the possible outcomes and the actual outcomes. A lower Brier score is better.
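
For reference, the following is a minimal multi-class Brier score computation of the kind described above (the mean squared difference between the predicted probability vector and the one-hot encoded outcome); the exact variant used in the project may differ.

    import numpy as np

    def multiclass_brier(y_true, y_prob):
        """y_true: integer class labels, shape (n,). y_prob: predicted probabilities, shape (n, k)."""
        y_true = np.asarray(y_true)
        y_prob = np.asarray(y_prob)
        one_hot = np.eye(y_prob.shape[1])[y_true]
        return float(np.mean(np.sum((y_prob - one_hot) ** 2, axis=1)))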

To optimize our models, we used Amazon SageMaker hyperparameter optimization (HPO) to fine-tune XGBoost parameters like learning rate, max depth, subsamples, alpha, and gamma. The SageMaker-managed HPO service helped us run multiple experiments in parallel to identify optimal hyperparameter configurations. Each experiment took only a few minutes because tuning jobs are distributed across 10 instances. In addition, we used SageMaker features, including automatic early stopping and warm starting from previous tuning jobs. This combined with custom metrics improved the performance of the model within minutes. Examples of various SageMaker-based HPO tuning jobs are available on GitHub.
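
The following sketch shows what such a tuning job can look like with the SageMaker Python SDK and the built-in XGBoost algorithm. The S3 paths, instance type, parameter ranges, and job counts are placeholders, and we tune against the built-in validation:mlogloss metric for brevity; tuning directly against a custom Brier metric, as the team did, requires a training script that emits that metric.

    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

    session = sagemaker.Session()
    role = sagemaker.get_execution_role()  # assumes you run this inside SageMaker

    # Built-in XGBoost container; bucket and prefix names are placeholders.
    xgb_image = sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.3-1")
    estimator = Estimator(
        image_uri=xgb_image,
        role=role,
        instance_count=1,
        instance_type="ml.m5.2xlarge",
        output_path="s3://my-bucket/ngs-go-for-it/output",
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(
        objective="multi:softprob", num_class=17, num_round=300, eval_metric="mlogloss"
    )

    tuner = HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:mlogloss",  # stand-in for the custom Brier metric
        objective_type="Minimize",
        hyperparameter_ranges={
            "eta": ContinuousParameter(0.01, 0.3),      # learning rate
            "max_depth": IntegerParameter(3, 10),
            "subsample": ContinuousParameter(0.5, 1.0),
            "alpha": ContinuousParameter(0, 10),
            "gamma": ContinuousParameter(0, 5),
        },
        max_jobs=40,
        max_parallel_jobs=10,          # run tuning jobs across 10 instances in parallel
        early_stopping_type="Auto",    # stop unpromising training jobs early
    )
    tuner.fit({
        "train": TrainingInput("s3://my-bucket/ngs-go-for-it/train.csv", content_type="csv"),
        "validation": TrainingInput("s3://my-bucket/ngs-go-for-it/validation.csv", content_type="csv"),
    })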

Go-for-it model results

After training and HPO, the XGBoost model achieved a Brier score of 0.21. In addition to the Brier score, we examined the model predictions to ensure they were recreating known aspects of the game. For example, the odds of converting on a fourth-down play decrease as the number of yards needed for a first down increases. The following figure shows the model's predicted conversion probabilities as a function of the yards-to-go. We can observe two key trends. First, as expected, the conversion probability decreases as the yards-to-go increases. Second, a team is generally better off running the ball in short yards-to-go situations and passing the ball in long yards-to-go situations.

For the Next Gen Stats Decision Guide, it's not sufficient for the model to make correct predictions. It must also assign valid probabilities to those predictions. To examine the validity of the model probabilities, we compare the probabilities against the aggregate play outcomes, as shown in the following graph. The model predictions were binned into 10%-wide categories from 0–90%. For each bin, the fraction of plays that were converted was calculated (bar height). For an ideal model, each bin height should sit roughly at the midpoint of its bin (solid line). The graph shows that when the model provides a conversion probability between 0–60%, the actual aggregate outcomes of these plays closely match the model's predictions. For model predictions between 60–90%, the model appears to slightly underestimate the offense's probability of converting (most notably between 60–70%). In situations where the agreement is poor, we can use postprocessing techniques to increase the agreement between play outcomes and the model probabilities. For an example with deep learning models, see Quantifying uncertainty in deep learning systems.
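
A sketch of the binning used for this check is shown below; the bin width and output format are illustrative. For the postprocessing mentioned above, a monotone calibrator such as scikit-learn's IsotonicRegression fit on held-out predictions is one common option.

    import numpy as np

    def calibration_table(pred_probs, converted, n_bins=10):
        """Compare mean predicted conversion probability with the observed conversion
        rate in 10%-wide bins (pred_probs in [0, 1], converted as 0/1 outcomes)."""
        pred_probs = np.asarray(pred_probs, dtype=float)
        converted = np.asarray(converted, dtype=float)
        bins = np.clip((pred_probs * n_bins).astype(int), 0, n_bins - 1)
        rows = []
        for b in range(n_bins):
            mask = bins == b
            if mask.any():
                rows.append({
                    "bin": f"{b * 100 // n_bins}-{(b + 1) * 100 // n_bins}%",
                    "mean_predicted": pred_probs[mask].mean(),
                    "observed_rate": converted[mask].mean(),
                    "plays": int(mask.sum()),
                })
        return rows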

ML production pipeline

For the model in production, we used SageMaker for preprocessing, training, and postprocessing. The model is hosted on Amazon Elastic Kubernetes Service (Amazon EKS) for a highly scalable, available, and secure production deployment. The following figure shows a high-level diagram of the production pipeline. All steps are automated and require minimal maintenance.

Summary

AWS and the NFL NGS team jointly developed the Next Gen Stats Decision Guide, which helps fans understand the choices coaches make at pivotal moments in the game. The odds of converting on a fourth-down play are a key component of the Next Gen Stats Decision Guide. In this post, we provided insight into how AWS helped the NFL create the model powering fourth-down conversions and discussed methods to assess model performance.

The NGS team will be hosting these models as part of the 2021 NFL season. Keep an eye out for the Next Gen Stats Decision Guide during the next NFL game.

You can find full examples of creating custom training jobs, implementing HPO, and deploying models on SageMaker at the AWS Labs GitHub repo. If you would like us to help and accelerate your use of ML, contact the Amazon ML Solutions Lab program.


About the Authors

Selvan Senthivel is a Senior ML Engineer with Amazon ML Solutions Lab team at AWS, focusing on helping customers on Machine Learning, Deep Learning problems and end-to-end ML solutions. He was the founding engineering lead of Amazon Comprehend Medical service and contributed to the design/architecture of multiple AWS AI services.

Lin Lee Cheong is a Senior Scientist and Manager with the Amazon ML Solutions Lab team at Amazon Web Services. She works with strategic AWS customers to explore and apply artificial intelligence and machine learning to discover new insights and solve complex problems.

Tyler Mullenbach is a Principal Data Science Manager with AWS Professional Services. He leads a global team of data science consultants focusing on helping customers turn their data into insights and bring ML models to production.

Ankit Tyagi is a Senior Software Engineer with the NFL’s Next Gen Stats team. He focuses on backend data pipelines and machine learning for delivering stats to fans. Outside of work, you can find him playing tennis, experimenting with brewing beer, or playing guitar.

Mike Band is the Lead Analyst for NFL’s Next Gen Stats. He contributes to the ideation, development, and communication of advanced football performance metrics for the NFL Media Group, NFL Broadcast Partners, and fans.

Juyoung Lee is a Senior Software Engineer with the NFL’s Next Gen Stats. Her work focuses on designing and developing machine learning models to create stats for fans. On her spare time, she enjoys being active by playing Ultimate Frisbee and doing CrossFit.

Michael Schaefer was the Director of Product and Analytics for NFL’s Next Gen Stats. His work focuses on the design and execution of statistics, applications, and content delivered to NFL Media, NFL Broadcaster Partners, and fans.

Michael Chi is the Director of Technology for NFL’s Next Gen Stats. He is responsible for all technical aspects of the platform which is used by all 32 clubs, NFL Media and Broadcast Partners. In his free time, he enjoys being outdoors and spending time with his family.

Read More

Chain custom Amazon SageMaker Ground Truth jobs for image processing

Amazon SageMaker Ground Truth supports many different types of labeling jobs, including several image-based labeling workflows like image-level labels, bounding box-specific labels, or pixel-level labeling. For situations not covered by these standard approaches, Ground Truth also supports custom image-based labeling, which allows you to create a labeling workflow with a completely unique UI and associated processing. Beyond that, you can chain different Ground Truth labeling jobs together so that the output of one job acts as the input to another job, to allow even more flexibility in a labeling workflow by breaking the job into multiple stages.

In this post, we show how to chain two custom Ground Truth jobs together to perform advanced image manipulations, including isolating portions of images, and de-skewing images that were photographed from an angle. Additionally, we demonstrate several techniques for augmenting source images, which are helpful for situations where you have a limited number of source images.

Extracting regions of an image

Suppose we’re tasked with creating a machine learning (ML) model that processes an image of a shelving unit and determines whether any of the bins in that shelving unit need restocking. Due to the size of the storage room, a single camera is used to capture images of several shelving units, each from a different angle. The following image is an example of such a shelving unit.

Figure 1: A shelving unit with many bins full, photographed from an angle

For training or inference, we need images of individual bins, rather than the overall shelving unit. The model we're developing takes an image of a single bin and returns a classification of Empty or Full. This classification feeds into an automated restocking system, allowing us to maintain stock levels at the bin level without the trouble of someone physically checking the levels.

Unfortunately, because the shelf images are taken at an angle, each bin is skewed and has a different size and shape. Because any bin images extracted from the main image are rectangular, the extracted images include undesirable content, as shown in the following image of two adjoining bins.

Figure 2: A closeup of a single bin, which shows two adjoining bins

In this example, we’ve isolated a rectangular region that bounds a given bin, but because the image was taken from an angle, portions of the bins on the left and right are also partially included. Because a rectangular section includes information from other bins, an image like this performs poorly when used for training or for inference.

To solve this, we can select a non-rectangular section of the original image and warp it to create a new image. The following image demonstrates the results of a warp transformation applied to the original image.

Figure 3: Original shelving unit with just the bins isolated, and the image warped to make it orthogonal

This warping accomplishes two tasks. First, we’ve selected just the shelving unit, cropping out the nearby walls, floor, and any other irrelevant areas near the edges of the shelves. Second, the warping of the image results in each bin being more rectangular than the original version.

This warped image doesn’t have any new content—it’s just a distortion of the original image. But by performing this warping, each bin can be selected using a rectangular bounding box, which provides needed consistency, no matter what position a bin is in. Compare the following two bin images: the image on the left is extracted from the original image, and the image on the right is the same bin, extracted from the de-skewed image.

Figure 4: A single bin from the original image (left) compared with the bin from the warped image (right)

The bottom opening of the bin was originally at an angle, and now it’s horizontal. Overall, we’ve reduced the amount of the bin shown, and increased the proportion of the contents of the bin within the image. This improves our ML training process, because each bin image has less superfluous content.

Ground Truth jobs

Each custom Ground Truth labeling job is defined with a web-based user interface and two associated AWS Lambda functions (for more information, see Processing with AWS Lambda). One function runs prior to each image displayed by the UI, and the other runs after the user finishes the labeling job for all the images. Ground Truth offers several pre-made user interfaces (like bounding box-based selection), but you can also create your own custom UI if needed, as we do for this example.

When Ground Truth jobs are chained together, the output of one job is used as the input of another job. For this task, we use two chained jobs to process our images, as illustrated in the following diagram.

Figure 5: Architecture diagram showing two chained Ground Truth jobs, each with a Pre- and Post- UI Lambda function

Images that need to be labeled are stored in Amazon Simple Storage Service (Amazon S3). The first Ground Truth job retrieves images from Amazon S3 and displays them one at a time, waiting for the user to specify the four corners of the shelving unit within the image, using a custom UI. When that step is complete, the post-UI Lambda function uses the corner coordinates to warp or de-skew each image, which is then saved to the same S3 bucket that the original image resides in. Note that this step isn't necessary during inference: for a situation where the camera is in a fixed location, you can save those corner coordinates for later use during inference.
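
The de-skew itself is a standard four-point perspective transform. The following is a minimal sketch with OpenCV; the output size is arbitrary here, and the repo's implementation may differ in details such as corner ordering.

    import cv2
    import numpy as np

    def deskew(image, corners, out_w=1200, out_h=900):
        """Warp the quadrilateral given by `corners` (top-left, top-right,
        bottom-right, bottom-left, in pixel coordinates) into a flat out_w x out_h image."""
        src = np.array(corners, dtype=np.float32)
        dst = np.array([[0, 0], [out_w - 1, 0], [out_w - 1, out_h - 1], [0, out_h - 1]],
                       dtype=np.float32)
        matrix = cv2.getPerspectiveTransform(src, dst)
        return cv2.warpPerspective(image, matrix, (out_w, out_h))

    # Usage: deskewed = deskew(cv2.imread("shelf.jpg"), [(x1, y1), (x2, y2), (x3, y3), (x4, y4)])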

After the first Ground Truth job has de-skewed the source image, the second job uses simple bounding boxes to label each bin within the de-skewed image. The post-UI Lambda function then extracts the individual bin images, augments them with rotations, flipping, and color and brightness alterations, and writes the resulting data to Amazon S3, where it can be used for model training or other purposes.

You can find example code and deployment instructions in the GitHub repo.

Custom user interface

From a labeler’s perspective, after they log in and select a job, they use the custom UI to select the four corners of the shelving unit.

Figure 6: The custom Ground Truth UI for the first labeling job

For custom Ground Truth user interfaces, a set of custom tags is available, known as Crowd tags. These tags include bounding boxes, lines, points, and other user interface elements that you can use to build a labeling UI. In this case, we use the crowd-polygon tag, which is displayed as a yellow polygon.

After the labeler draws a polygon with four corners on the UI for all source images, they exit the UI by choosing Done. At this point, the post-UI Lambda function is run and each de-skewed image is saved to Amazon S3. When the function is complete, control is passed to the next chained Ground Truth job.

Generally, chained Ground Truth jobs reuse an output manifest file as the input manifest file for the next (chained) labeling job. In this case, we created a new image, so we modify the pre-UI Lambda function so it passes in the correct (de-skewed) file name, rather than the original, skewed image file name.
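
A minimal sketch of that modified pre-UI (pre-annotation) Lambda function might look like the following. The response keys follow the custom labeling workflow contract; the naming convention for the de-skewed file is hypothetical, and the actual code in the repo may differ.

    def lambda_handler(event, context):
        """Pre-annotation Lambda for the second (chained) job: point the task at the
        de-skewed image written by the first job instead of the original source image."""
        source_ref = event["dataObject"]["source-ref"]              # e.g. s3://bucket/images/shelf_001.jpg
        deskewed_ref = source_ref.replace(".jpg", "_deskewed.jpg")  # assumed naming convention
        return {
            "taskInput": {"taskObject": deskewed_ref},
            "isHumanAnnotationRequired": "true",
        }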

The second job in the chain uses the bounding box-based labeling functionality that is built in to Ground Truth. The bounding boxes don’t cover the entire contents of each bin, but they do cover the openings of the bins. This provides enough data to create a model to detect whether a bin is full or empty.

Figure 7: De-skewed image with bounding boxes from the second chained Ground Truth labeling job

After the labeler selects all the bins, they exit the UI by choosing Done. At this point, the post-UI Lambda function runs and crops out each bin image, makes variations of it for image augmentation purposes, and saves the variations into a folder structure in Amazon S3 based on classification. The top level of the folder structure is named training_data, with two subfolders: empty and full. Each subfolder contains images of bins that are either empty or full, suitable for use in model training.

Image augmentation

Image augmentation is a technique sometimes used in image-based ML workloads. It’s especially helpful when the number of source images is low or when they offer limited variety. Typically, image augmentation is performed by taking a source image and creating multiple variants of it, altering factors like brightness, contrast, and coloring, and even cropping or rotating the image. These variations help the resulting model be more robust and capable of handling images that are dissimilar to the original training images.

In this example, we use image augmentation methods in the post-UI Lambda function of the second Ground Truth job. The labeler has specified the bounding boxes for each bin image in the Ground Truth UI, and that data is used to extract portions of the overall image. Those extracted portions are of the individual bins, and these smaller images are used as input into our image augmentation process.

In our case, we create 14 variants of each bin image, with variations of brightness, contrast, and sharpness, as well as horizontal flipping combined with these variations. With this approach, a single source image of a shelving unit with 24 bins generates 14 variants for each bin image, for a total of 336 images that can be used for training a model. The following shows an original bin image (upper left) and each of its variants.
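
A sketch of that augmentation step with Pillow is shown below. The enhancement factors are illustrative; combining the original with two brightness, two contrast, and two sharpness variants, then mirroring each, yields the 14 images per bin described above.

    from PIL import Image, ImageEnhance, ImageOps

    def augment_bin(bin_image: Image.Image):
        """Return 14 variants of a cropped bin image: the original plus six
        brightness/contrast/sharpness variants, and a horizontal flip of each."""
        variants = [bin_image]
        for enhancer_cls in (ImageEnhance.Brightness, ImageEnhance.Contrast, ImageEnhance.Sharpness):
            for factor in (0.7, 1.3):  # illustrative enhancement factors
                variants.append(enhancer_cls(bin_image).enhance(factor))
        return variants + [ImageOps.mirror(img) for img in variants]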

Conclusion

Custom Ground Truth jobs provide a great deal of flexibility, and using them with images allows advanced functionality like cropping and de-skewing images, as well as performing custom image augmentation. The supplied Crowd HTML tags support many different labeling approaches like polygons, lines, text boxes, modal alerts, key point placement, and others. Combined with the power of pre-UI and post-UI Lambda functions, a custom Ground Truth job allows you to construct complex labeling jobs to support a wide variety of use cases, and combining different custom jobs by chaining them together provides even more options.

You can use the GitHub repo associated with this post as a starting point for your own chained image labeling jobs. You can also extend the code to support additional image augmentation methods (like cropping or rotating the source images), or modify it to fit your particular use case.

To learn more about chained Ground Truth jobs, see Chaining Labeling Jobs.

For more information about the Crowd tags you can use in the Ground Truth UI, see Crowd HTML Elements Reference.


About the Author

Greg Sommerville is a Senior Prototyping Architect on the AWS Envision Engineering Americas Prototyping team, where he helps AWS customers implement innovative solutions to challenging problems with machine learning, IoT and serverless technologies. He lives in Ann Arbor, Michigan and enjoys practicing yoga, catering to his dogs, and playing poker.

Read More

Accelerate data preparation using Amazon SageMaker Data Wrangler for diabetic patient readmission prediction

Patient readmission to the hospital after prior visits for the same disease places an additional burden on healthcare providers, the health system, and patients. Machine learning (ML) models, if built and trained properly, can help us understand the reasons for readmission and predict readmission accurately. ML could allow providers to create better treatment plans and care, which would translate to a reduction of both cost and mental stress for patients. However, ML is complex, which has limited its adoption by organizations that don’t have the resources to recruit a team of data engineers and scientists to build ML workloads. In this post, we show you how to build an ML model based on the XGBoost algorithm to predict diabetic patient readmission easily and quickly with the graphical interface of Amazon SageMaker Data Wrangler.

Data Wrangler is an Amazon SageMaker Studio feature designed to allow you to explore and transform tabular data for ML use cases without coding. Data Wrangler is the fastest and easiest way to prepare data for ML. It gives you the ability to use a visual interface to access data and perform exploratory data analysis (EDA) and feature engineering. It also seamlessly operationalizes your data preparation steps by allowing you to export your data flow into Amazon SageMaker Pipelines, a Data Wrangler job, Python file, or Amazon SageMaker Feature Store.

Data Wrangler comes with over 300 built-in transforms and custom transformations using either Python, PySpark, or SparkSQL runtime. It also comes with built-in data analysis capabilities for charts (such as scatter plot or histogram) and time-saving model analysis capabilities such as feature importance, target leakage, and model explainability.

In this post, we explore the key capabilities of Data Wrangler using the UCI diabetic patient readmission dataset. We showcase how you can build ML data transformation steps without writing sophisticated code, and how to create a model training, feature store, or ML pipeline with reproducibility for a diabetic patient readmission prediction use case.

We also have published a related GitHub project repo that includes the end-to-end ML workflow steps and relevant assets, including Jupyter notebooks.

We walk you through the following high-level steps:

  • Studio prerequisites and input dataset setup
  • Design your Data Wrangler flow file
  • Create processing and training jobs for model building
  • Host a trained model for real-time inference

Studio prerequisites and input dataset setup

To use Studio and Studio notebooks, you must complete the Studio onboarding process. Although you can choose from a few authentication methods, the simplest way to create a Studio domain is to follow the Quick start instructions. The Quick start uses the same default settings as the standard Studio setup. You can also choose to onboard using AWS Single Sign-On (AWS SSO) for authentication (see Onboard to Amazon SageMaker Studio Using AWS SSO).

Dataset

The patient readmission dataset captures 10 years (1999–2008) of clinical care at 130 US hospitals and integrated delivery networks. It includes over 50 features representing patient and hospital outcomes with about 100,000 observations.

You can start by downloading the public dataset and uploading it to an Amazon Simple Storage Service (Amazon S3) bucket. For demonstration purposes, we split the dataset into four tables based on feature categories: diabetic_data_hospital_visits.csv, diabetic_data_demographic.csv, diabetic_data_labs.csv, and diabetic_data_medication.csv. Review and run the code in datawrangler_workshop_pre_requisite.ipynb. If you leave everything at its default inside the notebook, the CSV files will be available in s3://sagemaker-${region}-${account_number}/sagemaker/demo-diabetic-datawrangler/.

Design your Data Wrangler flow file

To get started, on the Studio File menu, choose New, then choose Data Wrangler Flow.

This launches a Data Wrangler instance and configures it with the Data Wrangler app. The process takes a few minutes to complete.

Load the data from Amazon S3 into Data Wrangler

To load the data into Data Wrangler, complete the following steps:

  1. On the Import tab, choose Amazon S3 as the data source.
  2. Choose Add data source.

You could also import data from Amazon Athena, Amazon Redshift, or Snowflake. For more information about the currently supported import sources, see Import.

  1. Select the CSV files from the bucket s3://sagemaker-${region}-${account_number}/sagemaker/demo-diabetic-datawrangler/ one at a time.
  2. Choose Import for each file.

When the import is complete, data in an S3 bucket is available inside Data Wrangler for preprocessing.

Join the CSV files

Now that we have imported multiple CSV source datasets, let’s join them into a consolidated dataset.

  1. On the Data flow tab, for Data types, choose the plus sign.
  2. On the menu, choose Join.
  3. Choose the diabetic_data_hospital_visits.csv dataset as the Right dataset.
  4. Choose Configure to set up the join criteria.
  5. For Name, enter a name for the join.
  6. For Join type, choose a join type (for this post, Inner).
  7. Choose the columns for Left and Right.
  8. Choose Apply to preview the joined dataset.
  9. Choose Add to add it to the data flow file.

Built-in analysis

Before we apply any transformations on the input source, let’s perform a quick analysis of the dataset. Data Wrangler provides several built-in analysis types, like histogram, scatter plot, target leakage, bias report, and quick model. For more information about analysis types, see Analyze and Visualize.

Target leakage

Target leakage occurs when information in an ML training dataset is strongly correlated with the target label, but isn’t available when the model is used for prediction. You might have a column in your dataset that serves as a proxy for the column you want to predict with your model. For classification tasks, Data Wrangler calculates the prediction quality metric of ROC-AUC, which is computed individually for each feature column via cross-validation to generate a target leakage report.
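
Data Wrangler produces this report for you, but the underlying idea can be approximated in a few lines: fit a small model on each feature by itself and cross-validate its ROC-AUC against the target. The sketch below is a rough illustration only (it binarizes the label and assumes missing values are already handled), not Data Wrangler's actual implementation.

    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    def per_feature_auc(df: pd.DataFrame, target: str = "readmitted") -> pd.Series:
        """Cross-validated ROC-AUC of each feature on its own; values near 1.0 hint at
        leakage, values near 0.5 mean the feature is uninformative by itself."""
        y = (df[target] != "NO").astype(int)  # binary stand-in for the readmission label
        scores = {}
        for col in df.columns.drop(target):
            x = df[[col]].copy()
            if x[col].dtype == object:
                x[col] = x[col].astype("category").cat.codes  # simple encoding for strings
            scores[col] = cross_val_score(
                DecisionTreeClassifier(max_depth=3), x, y, cv=3, scoring="roc_auc"
            ).mean()
        return pd.Series(scores).sort_values(ascending=False)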

  1. On the Data Flow tab, for Join, choose the plus sign.
  2. Choose Add analysis.
  3. For Analysis type, choose Target Leakage.
  4. For Analysis name, enter a name.
  5. For Max features, enter 50.
  6. For Problem Type, choose classification.
  7. For Target, choose readmitted.
  8. Choose Preview to generate the report.

As shown in the preceding screenshot, there is no indication of target leakage in our input dataset. However, a few features like encounter_id_1, encounter_id_0, weight, and payer_code are marked as possibly redundant, with a predictive ability (ROC-AUC) of 0.5. This means these features by themselves don’t provide any useful information towards predicting the target. Before making the decision to drop these uninformative features, you should consider whether they could add value when used in tandem with other features. For our use case, we keep them as is and move to the next step.

  1. Choose Save to save the analysis into your Data Wrangler data flow file.

Bias report

AI/ML systems are only as good as the data we put into them. ML-based systems are more accessible than ever before, and with the growth of adoption throughout various industries, further questions arise surrounding fairness and how it is ensured across these ML systems. Understanding how to detect and avoid bias in ML models is imperative and complex. With the built-in bias report in Data Wrangler, data scientists can quickly detect bias during the data preparation stage of the ML workflow. Bias report analysis uses Amazon SageMaker Clarify to perform bias analysis.

To generate a bias report, you must specify the target column that you want to predict and a facet or column that you want to inspect for potential biases. For example, we can generate a bias report on the gender feature for Female values to see whether there is any class imbalance.
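
Under the hood, one of the simplest pretraining metrics Clarify reports is class imbalance (CI) for the chosen facet. Assuming the joined data is loaded in a pandas DataFrame named df, the facet-level computation looks like this sketch:

    # Class imbalance (CI) for the gender facet: (n_a - n_d) / (n_a + n_d),
    # where n_d is the Female (facet) count and n_a is everyone else.
    n_d = (df["gender"] == "Female").sum()
    n_a = (df["gender"] != "Female").sum()
    class_imbalance = (n_a - n_d) / (n_a + n_d)  # values near 0 indicate balanced representation
    print(f"CI for gender=Female: {class_imbalance:.3f}")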

  1. On the Analysis tab, choose Create new analysis.
  2. For Analysis type, choose Bias Report.
  3. For Analysis name, enter a name.
  4. For Select the column your model predicts, choose readmitted.
  5. For Predicted value, enter NO.
  6. For Column to analyze for bias, choose gender.
  7. For Column value to analyze for bias, choose Female.
  8. Leave remaining settings at their default.
  9. Choose Check for bias to generate the bias report.

As shown in the bias report, there is no significant bias in our input dataset, which means the dataset has a fair amount of representation by gender. For our dataset, we can move forward with a hypothesis that there is no inherent bias in our dataset. However, based on your use case and dataset, you might want to run similar bias reporting on other features of your dataset to identify any potential bias. If any bias is detected, you can consider applying a suitable transformation to address that bias.

  1. Choose Save to add this report to the data flow file.

Histogram

In this section, we use a histogram to gain insights into the target label patterns inside our input dataset.

  1. On the Analysis tab, choose Create new analysis.
  2. For Analysis type, choose Histogram.
  3. For Analysis name, enter a name.
  4. For X axis, choose readmitted.
  5. For Color by, choose race.
  6. For Facet by, choose gender.
  7. Choose Preview to generate a histogram.

This ML problem is a multi-class classification problem. However, we can observe a major target class imbalance between patients readmitted <30 days, readmitted >30 days, and NO readmission. We can also see that the class proportions are similar across gender and race. To improve our potential model predictability, we can merge <30 and >30 into a single positive class. Merging the target labels this way turns our ML problem into a binary classification problem. As we demonstrate in the next section, we can do this easily by adding the respective transformations.

Transformations

When it comes to training an ML model for structured or tabular data, decision tree-based algorithms are considered best in class. This is due to their use of ensemble tree methods, which boost weak learners through gradient-based optimization.

For our medical source dataset, we use the SageMaker built-in XGBoost algorithm because it’s one of the most popular decision tree-based ensemble ML algorithms. The XGBoost algorithm accepts only numerical values as input, so as a prerequisite we must apply categorical feature transformations to our source dataset.

Data Wrangler comes with over 300 built-in transforms, which require no coding. Let’s use built-in transforms to apply a few key transformations and prepare our training dataset.

Handle missing values

To address missing values, complete the following steps:

  1. Switch to the Data tab to bring up all the built-in transforms.
  2. Expand Handle missing in the list of transforms.
  3. For Transform, choose Impute.
  4. For Column type, choose Numeric.
  5. For Input column, choose diag_1.
  6. For Imputing strategy, choose Mean.
  7. By default, the operation is performed in place, but you can provide an optional Output column name, which creates a new column with imputed values. For this post, we go with the default in-place update.
  8. Choose Preview to preview the results.
  9. Choose Add to include this transformation step into the data flow file.
  10. Repeat these steps for the diag_2 and diag_3 features and impute missing values.

Search and edit features with special characters

Because our source dataset has features with special characters, we need to clean them before training. Let’s use the search and edit transform.

  1. Expand Search and edit in the list of transforms.
  2. For Transform, choose Find and replace substring.
  3. For Input column, choose race.
  4. For Pattern, enter ?.
  5. For Replacement string, choose Other.
  6. Leave Output column blank for in-place replacements.
  7. Choose Preview.
  8. Choose Add to add the transform to your data flow.
  9. Repeat the same steps for other features to replace weight and payer_code with 0 and medical_specialty with Other.

One-hot encoding for categorical features

To use one-hot encoding for categorical features, complete the following steps:

  1. Expand Encode categorical in the list of transforms.
  2. For Transform, choose One-hot encode.
  3. For Input column, choose race.
  4. For Output style, choose Columns.
  5. Choose Preview.
  6. Choose Add to add the change to the data flow.
  7. Repeat these steps for age and medical_specialty_filler to one-hot encode those categorical features as well.

Ordinal encoding for categorical features

To use ordinal encoding for categorical features, complete the following steps:

  1. Expand Encode categorical in the list of transforms.
  2. For Transform, choose Ordinal encode.
  3. For Input column, choose gender.
  4. For Invalid handling strategy, choose Keep.
  5. Choose Preview.
  6. Choose Add to add the change to the data flow.

Custom transformations: Add new features to your dataset

If we decide to store our transformed features in Feature Store, a prerequisite is to insert the eventTime feature into the dataset. We can easily do that using a custom transformation.

  1. Expand Custom Transform in the list of transforms.
  2. Choose Python (Pandas) and enter the following line of code:
    # Table is available as variable `df`
    import time
    df['eventTime'] = time.time()

  3. Choose Preview to view the results.
  4. Choose Add to add the change to the data flow.

Transform the target label

The target label readmitted has three classes: NO readmission, readmitted <30 days, and readmitted >30 days. We saw in our histogram analysis that there is a strong class imbalance because the majority of the patients weren’t readmitted. We can combine the latter two classes into a positive class to denote the patients being readmitted, and turn the classification problem into a binary case instead of multi-class. Let’s use the search and edit transform to convert these string values to binary values.

  1. Expand Search and edit in the list of transforms.
  2. For Transform, choose Find and replace substring.
  3. For Input column, choose readmitted.
  4. For Pattern, enter >30|<30.
  5. For the Replacement string, enter 1.

This converts all the values that have either >30 or <30 values to 1.

  1. Choose Preview to view the results.
  2. Choose Add to add this transform to the data flow.

Let’s repeat the same steps to convert NO values to 0.

  1. Expand Search and edit in the list of transforms.
  2. For Transform, choose Find and replace substring.
  3. For Input column, choose readmitted.
  4. For Pattern, enter NO.
  5. For Replacement string, enter 0.
  6. Choose Preview to review the converted column.
  7. Choose Add to add the transform to our data flow.

Now our target label readmitted is ready for ML training.
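
If you were reproducing this label transform outside Data Wrangler, the equivalent pandas logic is a one-liner (column and class values as used in this post):

    # Merge the two readmission classes into a positive label and encode NO as 0.
    df["readmitted"] = df["readmitted"].map({"<30": 1, ">30": 1, "NO": 0})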

Position the target label as the first column for the XGBoost algorithm

Because we’re going to use the XGBoost built-in SageMaker algorithm to train the model, the algorithm assumes that the target label is in the first column. Let’s position the target label as such in order to use this algorithm.

  1. Expand Manage columns in the list of transforms.
  2. For Transform, choose Move column.
  3. For Move type, choose Move to start.
  4. For Column to move, choose readmitted.
  5. Choose Preview.
  6. Choose Add to add the change to your data flow.

Drop redundant columns

Next, we drop any redundant columns.

  1. Expand Manage columns in the list of transforms.
  2. For Transform, choose Drop column.
  3. For Column to drop, choose encounter_id_0.
  4. Choose Preview.
  5. Choose Add to add the changes to the flow file.
  6. Repeat these steps for the other redundant columns: patient_nbr_0, encounter_id_1, and patient_nbr_1.

At this stage, we have run a few analyses and applied a few transformations to our raw input dataset. If you want to preserve the transformed state of the input dataset, like a checkpoint, you can do so by choosing Export data. This option allows you to persist the transformed dataset to an S3 bucket.

Quick Model analysis

Now that we have applied transformations to our initial dataset, let’s explore the Quick Model analysis feature. Quick model helps you quickly evaluate the training dataset and produce importance scores for each feature. A feature importance score indicates how useful a feature is at predicting a target label. The feature importance score is between 0–1; a higher number indicates that the feature is more important to the whole dataset. Because our use case relates to the classification problem type, the quick model also generates an F1 score for the current dataset.

  1. Switch back to the Analysis tab and choose Create new analysis to bring up the built-in analyses.
  2. For Analysis type, choose Quick Model.
  3. Enter a name for your analysis.
  4. For Label, choose readmitted.
  5. Choose Preview and wait for the model to be trained and the results to appear.

The resulting quick model F1 score shows 0.618 (your generated score might be different) with the transformed dataset. Data Wrangler performs several steps to generate the F1 score, including preprocessing, training, evaluating, and finally calculating feature importance. For more details about these steps, see Quick Model.

With the quick model analysis feature, data scientists can iterate through applicable transformations until they reach the desired transformed dataset, one that can potentially lead to better accuracy and meet business expectations.

  1. Choose Save to add the quick model analysis to the data flow.

Export options

We’re now ready to export our data flow for further processing.

  1. Navigate back to the data flow designer by choosing Back to data flow at the top left.
  2. On the Export tab, choose Steps to reveal the Data Wrangler flow steps.
  3. Choose the last step to mark it with a check.
  4. Choose Export step to reveal the export options.

As of this writing, you have four export options:

  • Save to S3 – Save the data to an S3 bucket using a SageMaker processing job
  • Pipeline – Export a Jupyter notebook that creates a SageMaker pipeline with your data flow
  • Python Code – Export your data flow to Python code
  • Feature Store – Export a Jupyter notebook that creates a Feature Store feature group and adds features to an offline or online feature store
  1. Choose Save to S3 to generate a fully implemented Jupyter notebook that creates a processing job using your data flow file.

Run processing and training jobs for model building

In this section, we show how to run processing and training jobs using the generated Jupyter notebook from Data Wrangler.

Submit a processing job

We’re now ready to submit a SageMaker processing job using our data flow file.

Run all the cells up to and including the Create Processing Job cell inside the exported notebook.

The cell Create Processing Job triggers a new SageMaker processing job by provisioning managed infrastructure and running the required Data Wrangler Docker container on that infrastructure.

You can check the status of the submitted processing job by running the next cell Job Status & S3 Output Location.

You can also check the status of the submitted processing job on the SageMaker console.

Train a model with SageMaker

Now that the data has been processed, let’s train a model using the data. The same notebook has sample steps to train a model using the SageMaker built-in XGBoost algorithm. Because our use case is a binary classification ML problem, we need to change the objective to binary:logistic inside the sample training steps.
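
In the generated notebook, this amounts to changing the hyperparameters on the estimator before calling fit. The sketch below assumes the notebook's estimator and input variables; the names here are placeholders and may differ from the sample.

    # xgb_estimator, train_input, and validation_input are created earlier in the exported notebook.
    xgb_estimator.set_hyperparameters(
        objective="binary:logistic",  # binary readmission label instead of the multi-class default
        eval_metric="auc",
        num_round=100,
    )
    xgb_estimator.fit({"train": train_input, "validation": validation_input})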

Now we’re ready to run our training job using the SageMaker managed infrastructure. Run the cell Start the Training Job.

You can monitor the status of the submitted training job on the SageMaker console, on the Training jobs page.

Host a trained model for real-time inference

We now use another notebook available on GitHub under the project folder hosting/Model_deployment_Steps.ipynb. This is a simple notebook with two cells: the first cell has code for deploying your model to a persistent endpoint. You need to update model_url with your training job output S3 model artifact.

The second cell in the notebook runs inference on the sample test file under test_data/test_data_UCI_sample.csv. As you can see, we can generate predictions for the synthetic observations inside the CSV file. That concludes the ML workflow.
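
For reference, the two cells boil down to something like the following sketch. The image URI version, endpoint name, and instance type are placeholders, and the test CSV must contain feature columns only (no target), matching what the model was trained on.

    import boto3
    import sagemaker
    from sagemaker.model import Model

    session = sagemaker.Session()
    image_uri = sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.3-1")

    # Cell 1: deploy the trained model artifact (update model_url with your training job output).
    model_url = "s3://<your-bucket>/<training-job-name>/output/model.tar.gz"
    model = Model(image_uri=image_uri, model_data=model_url,
                  role=sagemaker.get_execution_role(), sagemaker_session=session)
    model.deploy(initial_instance_count=1, instance_type="ml.m5.large",
                 endpoint_name="diabetic-readmission-endpoint")

    # Cell 2: run inference against the sample test file.
    runtime = boto3.client("sagemaker-runtime")
    with open("test_data/test_data_UCI_sample.csv") as f:
        payload = f.read()
    response = runtime.invoke_endpoint(EndpointName="diabetic-readmission-endpoint",
                                       ContentType="text/csv", Body=payload)
    print(response["Body"].read().decode())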

Clean up

After you have experimented with the steps in this post, perform the following cleanup steps to stop incurring charges:

  1. On the SageMaker console, under Inference in the navigation pane, choose Endpoints.
  2. Select your hosted endpoint.
  3. On the Actions menu, choose Delete.
  4. On the SageMaker Studio Control Panel, navigate to your SageMaker user profile.
  5. Under Apps, locate your Data Wrangler app and choose Delete app.

Conclusion

In this post, we explored Data Wrangler capabilities using a public medical dataset related to patient readmission and demonstrated how to perform feature transformations using built-in transforms and quick analysis. We showed how to generate, without much coding, the required steps to trigger data processing and ML training. This no-code/low-code capability of Data Wrangler accelerates training data preparation and increases data scientist agility with faster iterative data preparation. In the end, we hosted our trained model and ran inferences against synthetic test data. We encourage you to check out our GitHub repository to get hands-on practice and find new ways to improve model accuracy! To learn more about SageMaker, visit the SageMaker Development Guide.


About the Authors

Shyam Namavaram is a Senior Solutions Architect at AWS. He has over 20 years of experience architecting and building distributed, hybrid, and cloud-native applications. He passionately works with customers accelerating their AI/ML adoption by providing technical guidance and helping them innovate and build secure cloud solutions on AWS. He specializes in AI/ML, containers, and analytics technologies. Outside of work, he loves playing sports and exploring nature with trekking.

Michael Hsieh is a Senior AI/ML Specialist Solutions Architect. He works with customers to advance their ML journey with a combination of Amazon ML offerings and his ML domain knowledge. As a Seattle transplant, he loves exploring the great nature the region has to offer, such as the hiking trails, the scenery, kayaking in the SLU, and the sunset at Shilshole Bay.

Read More