Thirteen new projects focus on ensuring fairness in AI algorithms and the systems that incorporate them.
Domenico Giannone’s never-ending drive to learn more from economic data
How the Amazon Supply Chain Optimization Technologies principal economist uses his expertise in time series econometrics to forecast aggregate demand.
Optimal pricing for maximum profit using Amazon SageMaker
This is a guest post by Viktor Enrico Jeney, Senior Machine Learning Engineer at Adspert.
Adspert is a Berlin-based ISV that developed a bid management tool designed to automatically optimize performance marketing and advertising campaigns. The company’s core principle is to automate maximization of profit of ecommerce advertising with the help of artificial intelligence. The continuous development of advertising platforms paves the way for new opportunities, which Adspert expertly utilizes for their customers’ success.
Adspert’s primary goal is to simplify the process for users while optimizing ad campaigns across different platforms. This includes using information gathered across the various platforms and balancing it against an overall budget set above the level of any single platform. Adspert’s focus is to optimize a customer’s goal attainment, regardless of which platform is used. Adspert continues to add platforms as necessary to give its customers significant advantages.
In this post, we share how Adspert created the pricing tool from scratch using different AWS services like Amazon SageMaker and how Adspert collaborated with the AWS Data Lab to accelerate this project from design to build in record time.
The pricing tool reprices a seller-selected product on an ecommerce marketplace based on the visibility and profit margin to maximize profits on the product level.
As a seller, it’s essential that your products are always visible, because visibility drives sales. The most important factor in ecommerce sales is simply whether your offer is visible to customers instead of a competitor’s offer.
Although it certainly depends on the specific ecommerce platform, we’ve found that product price is one of the most important factors affecting visibility. However, prices change often and fast; for this reason, the pricing tool needs to act in near-real time to increase visibility.
Overview of solution
The following diagram illustrates the solution architecture.
The solution contains the following components:
- Amazon Relational Database Service (Amazon RDS) for PostgreSQL is the main source of data; the product information is stored in an RDS for PostgreSQL database.
- Product listing change information arrives in real time in an Amazon Simple Queue Service (Amazon SQS) queue.
- Product information stored in Amazon RDS is ingested in near-real time into the raw layer using the change data capture (CDC) pattern available in AWS Database Migration Service (AWS DMS).
- Product listing notifications coming from Amazon SQS are ingested in near-real time into the raw layer using an AWS Lambda function.
- The original source data is stored in the Amazon Simple Storage Service (Amazon S3) raw layer bucket using Parquet data format. This layer is the single source of truth for the data lake. The partitioning used on this storage supports the incremental processing of data.
- AWS Glue extract, transform, and load (ETL) jobs clean the product data, removing duplicates and applying data consolidation and generic transformations not tied to a specific business case.
- The Amazon S3 stage layer receives prepared data that is stored in Apache Parquet format for further processing. The partitioning used on the stage store supports the incremental processing of data.
- The AWS Glue jobs created in this layer use the data available in the Amazon S3 stage layer and apply the use case-specific business rules and required calculations. The results data from these jobs are stored in the Amazon S3 analytics layer.
- The Amazon S3 analytics layer is used to store the data that is used by the ML models for training purposes. The partitioning used on the curated store is based on the data usage expected. This may be different to the partitioning used on the stage layer.
- The repricing ML model is a Scikit-Learn Random Forest implementation in SageMaker Script Mode, which is trained using data available in the S3 bucket (the analytics layer).
- An AWS Glue data processing job prepares data for the real-time inference. The job processes data ingested in the S3 bucket (stage layer) and invokes the SageMaker inference endpoint. The data is prepared to be used by the SageMaker repricing model. AWS Glue was preferred to Lambda, because the inference requires different complex data processing operations like joins and window functions on a high volume of data (billions of daily transactions). The results from the repricing model invocations are stored in the S3 bucket (inference layer).
- The trained model is deployed to a SageMaker endpoint. This endpoint is invoked by the AWS Glue inference processor, generating near-real-time price recommendations to increase product visibility.
- The predictions generated by the SageMaker inference endpoint are stored in the Amazon S3 inference layer.
- The Lambda predictions optimizer function processes the recommendations generated by the SageMaker inference endpoint and generates a new price recommendation that focuses on maximizing the seller profit, applying a trade-off between sales volume and sales margin.
- The price recommendations generated by the Lambda predictions optimizer are submitted to the repricing API, which updates the product price on the marketplace.
- The updated price recommendations generated by the Lambda predictions optimizer are stored in the Amazon S3 optimization layer.
- The AWS Glue prediction loader job reloads the predictions generated by the ML model into the source RDS for PostgreSQL database for auditing and reporting purposes. AWS Glue Studio was used to implement this component; it’s a graphical interface that makes it easy to create, run, and monitor ETL jobs in AWS Glue.
Data preparation
The dataset for Adspert’s visibility model is created from an SQS queue and ingested into the raw layer of our data lake in real time with Lambda. Afterwards, the raw data is sanitized by performing simple transformations, like removing duplicates. This process is implemented in AWS Glue. The result is stored in the staging layer of our data lake. The notifications provide the competitors for a given product, with their prices, fulfilment channels, shipping times, and many more variables. They also provide a platform-dependent visibility measure, which can be expressed as a Boolean variable (visible or not visible). We receive a notification any time an offer change happens, which adds up to several million events per month over all our customers’ products.
From this dataset, we extract the training data as follows: for every notification, we pair the visible offers with every non-visible offer, and vice versa. Every data point represents a competition between two sellers, in which there is a clear winner and loser. This processing job is implemented in an AWS Glue job with Spark. The prepared training dataset is pushed to the analytics S3 bucket to be used by SageMaker.
Train the model
For each pair of offers, our model classifies whether a given offer will be visible. This model enables us to calculate the best price for our customers, increase visibility based on competition, and maximize their profit. On top of that, this classification model can give us deeper insights into the reasons for our listings being visible or not visible. We use the following features:
- Ratio of our price to competitors’ prices
- Difference in fulfilment channels
- Amount of feedback for each seller
- Feedback rating of each seller
- Difference in minimum shipping times
- Difference in maximum shipping times
- Availability of each seller’s product
Adspert uses SageMaker to train and host the model. We use Scikit-Learn Random Forest implementation in SageMaker Script Mode. We also include some feature preprocessing directly in the Scikit-Learn pipeline in the training script. See the following code:
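The following is a minimal sketch of what such a preprocessing function could look like; the column names are hypothetical placeholders, not the actual schema used by Adspert:

import numpy as np
import pandas as pd

def transform_price(df: pd.DataFrame) -> np.ndarray:
    # Divide our price by the minimum of the competitor price and an
    # external price column, then take the logarithm so the model reasons
    # about relative rather than absolute price differences.
    # Column names here are illustrative placeholders.
    ratio = df["own_price"] / np.minimum(df["competitor_price"], df["external_price"])
    return np.log(ratio).to_frame("log_price_ratio").values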
One of the most important preprocessing functions is transform_price, which divides the price by the minimum of the competitor price and an external price column. We’ve found that this feature has a relevant impact on the model accuracy. We also apply the logarithm to let the model decide based on relative price differences, not absolute price differences.
In the training_script.py script, we first define how to build the Scikit-Learn ColumnTransformer to apply the specified transformers to the columns of a dataframe:
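The following is a sketch of such a builder; the mapping of transformers to columns is an assumption for illustration:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer, StandardScaler

def build_column_transformer() -> ColumnTransformer:
    # Map each transformer to the dataframe columns it should be applied to.
    # The column names are illustrative placeholders, not the real schema.
    return ColumnTransformer(
        transformers=[
            ("price", FunctionTransformer(transform_price),
             ["own_price", "competitor_price", "external_price"]),
            ("scale", StandardScaler(),
             ["feedback_count", "feedback_rating",
              "min_shipping_time_diff", "max_shipping_time_diff"]),
        ],
        remainder="passthrough",
    )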
In the training script, we load the data from Parquet into a Pandas dataframe, define the pipeline of the ColumnTransformer and the RandomForestClassifier, and train the model. Afterwards, the model is serialized using joblib:
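A condensed sketch of that part of the training script might look as follows; the label column name, hyperparameters, and model file name are assumptions:

import argparse
import os

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN"))
    args = parser.parse_args()

    # Load the prepared training data from the analytics layer
    df = pd.read_parquet(args.train)
    X, y = df.drop(columns=["is_visible"]), df["is_visible"]  # hypothetical label column

    # Chain the ColumnTransformer defined above with the Random Forest classifier
    pipeline = Pipeline([
        ("preprocessing", build_column_transformer()),
        ("classifier", RandomForestClassifier(n_estimators=100, n_jobs=-1)),
    ])
    pipeline.fit(X, y)

    # Serialize the fitted pipeline so it can be loaded again at inference time
    joblib.dump(pipeline, os.path.join(args.model_dir, "model.joblib"))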
In the training script, we also have to implement functions for inference:
- input_fn – Is responsible for parsing the data from the request body of the payload
- model_fn – Loads and returns the model that has been dumped in the training section of the script
- predict_fn – Contains our implementation to request a prediction from the model using the data from the payload
- predict_proba – In order to draw predicted visibility curves, we return the class probability using the predict_proba function, instead of the binary prediction of the classifier
See the following code:
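The following is a sketch of these functions following the conventions of the SageMaker Scikit-Learn container; the model file name matches the training sketch above and is an assumption:

import json
import os

import joblib
import pandas as pd

def model_fn(model_dir):
    # Load the pipeline that was dumped with joblib during training
    return joblib.load(os.path.join(model_dir, "model.joblib"))

def input_fn(request_body, request_content_type):
    # Parse the JSON request body into a pandas DataFrame
    if request_content_type == "application/json":
        return pd.DataFrame(json.loads(request_body))
    raise ValueError(f"Unsupported content type: {request_content_type}")

def predict_fn(input_data, model):
    # Return class probabilities instead of binary predictions so that
    # predicted visibility curves can be drawn downstream
    return model.predict_proba(input_data)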
The following figure shows the impurity-based feature importances returned by the Random Forest Classifier.
With SageMaker, we were able to train the model on a large amount of data (up to 14 billion daily transactions) without putting load on our existing instances or having to set up a separate machine with enough resources. Moreover, because the instances are immediately shut down after the training job, training with SageMaker was extremely cost-efficient. The model deployment with SageMaker required no additional effort. A single function call in the Python SDK is sufficient to host our model as an inference endpoint, and it can be easily requested from other services using the SageMaker Python SDK as well. See the following code:
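The following is a minimal sketch of training and hosting with the SageMaker Python SDK; the instance types, framework version, and S3 paths are illustrative choices, not the values used by Adspert:

from sagemaker.sklearn.estimator import SKLearn

role = '<IAM ROLE>'

sklearn_estimator = SKLearn(
    entry_point='training_script.py',
    role=role,
    instance_count=1,
    instance_type='ml.m5.2xlarge',
    framework_version='1.0-1',
)

# Train on the prepared dataset stored in the analytics layer
sklearn_estimator.fit({'train': '<s3://BUCKET/ANALYTICS/PATH>'})

# A single call hosts the trained model as an inference endpoint
predictor = sklearn_estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
)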
The model artefact is stored in Amazon S3 by the fit function. As seen in the following code, the model can be loaded as a SKLearnModel object using the model artefact, script path, and some other parameters. Afterwards, it can be deployed to the desired instance type and number of instances.
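A sketch of loading and deploying the model artefact; the paths and framework version are placeholders:

from sagemaker.sklearn.model import SKLearnModel

model = SKLearnModel(
    model_data='<s3://BUCKET/PATH/model.tar.gz>',  # artefact written by the fit call
    role='<IAM ROLE>',
    entry_point='training_script.py',  # also contains the inference functions
    framework_version='1.0-1',
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
)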
Evaluate the model in real time
Whenever a new notification is sent for one of our products, we want to calculate and submit the optimal price. To calculate optimal prices, we create a prediction dataset in which we compare our own offer with each competitor’s offer for a range of possible prices. These data points are passed to the SageMaker endpoint, which returns the predicted probability of being visible against each competitor for each given price. We call the probability of being visible the predicted visibility. The result can be visualized as a curve for each competitor, portraying the relationship between our price and the visibility, as shown in the following figure.
In this example, the visibility against Competitor 1 is almost a piecewise constant function, suggesting that we mainly have to decrease the price below a certain threshold, roughly the price of the competitor, to become visible. However, the visibility against Competitor 2 doesn’t decrease as steeply. On top of that, we still have a 50% chance of being visible even with a very high price. Analyzing the input data revealed that the competitor has a low number of ratings, which happen to be very poor. Our model learned that this specific ecommerce platform gives a disadvantage to sellers with poor feedback ratings. We discovered similar effects for the other features, like fulfilment channel and shipping times.
The necessary data transformations and inferences against the SageMaker endpoint are implemented in AWS Glue. The AWS Glue job works in micro-batches on the real-time data ingested from Lambda.
Calculate optimal prices using predicted visibilities
Finally, we want to calculate the aggregated visibility curve, which is the predicted visibility for each possible price. Our offer is visible if it’s better than all other sellers’ offers. Assuming independence between the probabilities of being visible against each seller given our price, the probability of being visible against all sellers is the product of the respective probabilities. That means the aggregated visibility curve can be calculated by multiplying all curves.
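For illustration, the aggregation can be expressed in a few lines of NumPy; the array shape is an assumption:

import numpy as np

def aggregate_visibility(visibility_per_competitor: np.ndarray) -> np.ndarray:
    # visibility_per_competitor has shape (n_competitors, n_price_points) and
    # holds the predicted probability of being visible against each competitor
    # at every candidate price. Assuming independence, the probability of being
    # visible against all sellers is the product of the individual curves.
    return np.prod(visibility_per_competitor, axis=0)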
The following figures show the predicted visibilities returned from the SageMaker endpoint.
The following figure shows the aggregated visibility curve.
To calculate the optimal price, the visibility curve is first smoothed and then multiplied by the margin. To calculate the margin, we use the costs of goods and the fees. The cost of goods sold and fees are the static product information synced via AWS DMS. Based on the profit function, Adspert calculates the optimal price and submits it to the ecommerce platform through the platform’s API.
This is implemented in the AWS Lambda prediction optimizer.
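The following sketch illustrates the idea; the smoothing method and cost model used by Adspert aren't detailed in this post, so a simple moving average stands in here:

import numpy as np

def optimal_price(prices, aggregated_visibility, cost_of_goods, fees):
    # Smooth the aggregated visibility curve with a simple moving average
    kernel = np.ones(5) / 5
    smoothed_visibility = np.convolve(aggregated_visibility, kernel, mode="same")
    # Expected profit at each candidate price = margin * predicted visibility
    margin = prices - cost_of_goods - fees
    expected_profit = margin * smoothed_visibility
    # The optimal price maximizes the expected profit
    return prices[np.argmax(expected_profit)]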
The following figure shows the relation between predicted visibility and price.
The following figure shows the relation between price and profit.
Conclusion
Adspert’s existing approach to profit maximization is focused on bid management to increase the returns from advertising. To achieve superior performance on ecommerce marketplaces, however, sellers have to consider both advertising and competitive pricing of their products. With this new ML model to predict visibility, we can extend our functionality to also adjust customers’ prices.
The new pricing tool has to be capable of automated training of the ML model on a large amount of data, as well as real-time data transformations, predictions, and price optimizations. In this post, we walked through the main steps of our price optimization engine, and the AWS architecture we implemented in collaboration with the AWS Data Lab to achieve those goals.
Taking ML models from concept to production is typically complex and time-consuming. You have to manage large amounts of data to train the model, choose the best algorithm for training it, manage the compute capacity while training it, and then deploy the model into a production environment. SageMaker reduced this complexity by making it much more straightforward to build and deploy the ML model. After we chose the right algorithms and frameworks from the wide range of choices available, SageMaker managed all of the underlying infrastructure to train our model and deploy it to production.
If you’d like to start familiarizing yourself with SageMaker, the Immersion Day workshop can help you gain an end-to-end understanding of how to build ML use cases, covering feature engineering, the various built-in algorithms, and how to train, tune, and deploy the ML model in a production-like scenario. It guides you through bringing your own model and performing an on-premises ML workload lift-and-shift to the SageMaker platform. It further demonstrates advanced concepts like model debugging, model monitoring, and AutoML, and helps you evaluate your ML workload through the AWS ML Well-Architected lens.
If you’d like help accelerating the implementation of use cases that involve data, analytics, AI and ML, serverless, and container modernization, please contact the AWS Data Lab.
About the authors
Viktor Enrico Jeney is a Senior Machine Learning Engineer at Adspert based in Berlin, Germany. He creates solutions for prediction and optimization problems in order to increase customers’ profits. Viktor has a background in applied mathematics and loves working with data. In his free time, he enjoys learning Hungarian, practicing martial arts, and playing the guitar.
Ennio Pastore is a data architect on the AWS Data Lab team. He is an enthusiast of everything related to new technologies that have a positive impact on businesses and general livelihood. Ennio has over 9 years of experience in data analytics. He helps companies define and implement data platforms across industries, such as telecommunications, banking, gaming, retail, and insurance.
“Data science could be used everywhere”
How Jared Wilber is using his skills as a storyteller and data scientist to help others learn about machine learning.
Amazon Comprehend announces lower annotation limits for custom entity recognition
Amazon Comprehend is a natural-language processing (NLP) service you can use to automatically extract entities, key phrases, language, sentiments, and other insights from documents. For example, you can immediately start detecting entities such as people, places, commercial items, dates, and quantities via the Amazon Comprehend console, AWS Command Line Interface, or Amazon Comprehend APIs. In addition, if you need to extract entities that aren’t part of the Amazon Comprehend built-in entity types, you can create a custom entity recognition model (also known as custom entity recognizer) to extract terms that are more relevant for your specific use case, like names of items from a catalog of products, domain-specific identifiers, and so on. Creating an accurate entity recognizer on your own using machine learning libraries and frameworks can be a complex and time-consuming process. Amazon Comprehend simplifies your model training work significantly. All you need to do is load your dataset of documents and annotations, and use the Amazon Comprehend console, AWS CLI, or APIs to create the model.
To train a custom entity recognizer, you can provide training data to Amazon Comprehend as annotations or entity lists. In the first case, you provide a collection of documents and a file with annotations that specify the location where entities occur within the set of documents. Alternatively, with entity lists, you provide a list of entities with their corresponding entity type label, and a set of unannotated documents in which you expect your entities to be present. Both approaches can be used to train a successful custom entity recognition model; however, there are situations in which one method may be a better choice. For example, when the meaning of specific entities could be ambiguous and context-dependent, providing annotations is recommended because this might help you create an Amazon Comprehend model that is capable of better using context when extracting entities.
Annotating documents can require quite a lot of effort and time, especially if you consider that both the quality and quantity of annotations have an impact on the resulting entity recognition model. Imprecise or too few annotations can lead to poor results. To help you set up a process for acquiring annotations, we provide tools such as Amazon SageMaker Ground Truth, which you can use to annotate your documents more quickly and generate an augmented manifest annotations file. However, even if you use Ground Truth, you still need to make sure that your training dataset is large enough to successfully build your entity recognizer.
Until today, to start training an Amazon Comprehend custom entity recognizer, you had to provide a collection of at least 250 documents and a minimum of 100 annotations per entity type. Today, we’re announcing that, thanks to recent improvements in the models underlying Amazon Comprehend, we’ve reduced the minimum requirements for training a recognizer with plain text CSV annotation files. You can now build a custom entity recognition model with as few as three documents and 25 annotations per entity type. You can find further details about new service limits in Guidelines and quotas.
To showcase how this reduction can help you get started with the creation of a custom entity recognizer, we ran some tests on a few open-source datasets and collected performance metrics. In this post, we walk you through the benchmarking process and the results we obtained while working on subsampled datasets.
Dataset preparation
In this post, we explain how we trained an Amazon Comprehend custom entity recognizer using annotated documents. In general, annotations can be provided as a CSV file, an augmented manifest file generated by Ground Truth, or a PDF file. Our focus is on CSV plain text annotations, because this is the type of annotation impacted by the new minimum requirements. CSV files should have the following structure:
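For illustration, an annotations CSV follows this layout; the file name, offsets, and entity type names below are hypothetical:

File,Line,Begin Offset,End Offset,Type
documents.txt,0,13,23,ENTITY_TYPE_1
documents.txt,0,40,46,ENTITY_TYPE_2
documents.txt,1,0,10,ENTITY_TYPE_1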
The relevant fields are as follows:
- File – The name of the file containing the documents
- Line – The number of the line containing the entity, starting with line 0
- Begin Offset – The character offset in the input text (relative to the beginning of the line) that shows where the entity begins, considering that the first character is at position 0
- End Offset – The character offset in the input text that shows where the entity ends
- Type – The name of the entity type you want to define
Additionally, when using this approach, you have to provide a collection of training documents as .txt files with one document per line, or one document per file.
For our tests, we used the SNIPS Natural Language Understanding benchmark, a dataset of crowdsourced utterances distributed among seven user intents (AddToPlaylist, BookRestaurant, GetWeather, PlayMusic, RateBook, SearchCreativeWork, SearchScreeningEvent). The dataset was published in 2018 in the context of the paper Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces by Coucke, et al.
The SNIPS dataset is made of a collection of JSON files condensing both annotations and raw text files. The following is a snippet from the dataset:
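Broadly, each SNIPS record stores an utterance as a list of text segments, some of which carry an entity label. The following record is illustrative only; the field names and values are assumptions and may differ slightly from the actual files:

{
  "data": [
    { "text": "book a table for " },
    { "text": "four", "entity": "party_size_number" },
    { "text": " at a " },
    { "text": "restaurant", "entity": "restaurant_type" }
  ]
}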
Before creating our entity recognizer, we transformed the SNIPS annotations and raw text files into a CSV annotations file and a .txt documents file.
The following is an excerpt from our annotations.csv file:
The following is an excerpt from our documents.txt file:
Sampling configuration and benchmarking process
For our experiments, we focused on a subset of entity types from the SNIPS dataset:
- BookRestaurant – Entity types: spatial_relation, poi, party_size_number, restaurant_name, city, timeRange, restaurant_type, served_dish, party_size_description, country, facility, state, sort, cuisine
- GetWeather – Entity types: condition_temperature, current_location, geographic_poi, timeRange, state, spatial_relation, condition_description, city, country
- PlayMusic – Entity types: track, artist, music_item, service, genre, sort, playlist, album, year
Moreover, we subsampled each dataset to obtain different configurations in terms of number of documents sampled for training and number of annotations per entity (also known as shots). This was done by using a custom script designed to create subsampled datasets in which each entity type appears at least k times, within a minimum of n documents.
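One simple way to implement such a subsampling step is a greedy pass over shuffled documents; this is an illustrative sketch, not the actual script used for the benchmark:

import random
from collections import Counter

def subsample(doc_to_entity_types, k_shots, n_min_docs, seed=0):
    # doc_to_entity_types maps a document ID to the list of entity types
    # annotated in that document (illustrative data structure).
    rng = random.Random(seed)
    docs = list(doc_to_entity_types)
    rng.shuffle(docs)

    all_types = {t for types in doc_to_entity_types.values() for t in types}
    selected, counts = [], Counter()
    for doc in docs:
        undersampled = any(counts[t] < k_shots for t in all_types)
        if len(selected) >= n_min_docs and not undersampled:
            break
        selected.append(doc)
        counts.update(doc_to_entity_types[doc])
    return selected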
Each model was trained using a specific subsample of the training datasets; the nine model configurations are illustrated in the following table.
| Subsampled dataset name | Number of documents sampled for training | Number of documents sampled for testing | Average number of annotations per entity type (shots) |
| --- | --- | --- | --- |
| snips-BookRestaurant-subsample-A | 132 | 17 | 33 |
| snips-BookRestaurant-subsample-B | 257 | 33 | 64 |
| snips-BookRestaurant-subsample-C | 508 | 64 | 128 |
| snips-GetWeather-subsample-A | 91 | 12 | 25 |
| snips-GetWeather-subsample-B | 185 | 24 | 49 |
| snips-GetWeather-subsample-C | 361 | 46 | 95 |
| snips-PlayMusic-subsample-A | 130 | 17 | 30 |
| snips-PlayMusic-subsample-B | 254 | 32 | 60 |
| snips-PlayMusic-subsample-C | 505 | 64 | 119 |
To measure the accuracy of our models, we collected evaluation metrics that Amazon Comprehend automatically computes when training an entity recognizer:
- Precision – This indicates the fraction of entities detected by the recognizer that are correctly identified and labeled. From a different perspective, precision can be defined as tp / (tp + fp), where tp is the number of true positives (correct identifications) and fp is the number of false positives (incorrect identifications).
- Recall – This indicates the fraction of entities present in the documents that are correctly identified and labeled. It’s calculated as tp / (tp + fn), where tp is the number of true positives and fn is the number of false negatives (missed identifications).
- F1 score – This is a combination of the precision and recall metrics, which measures the overall accuracy of the model. The F1 score is the harmonic mean of the precision and recall metrics, and is calculated as 2 * Precision * Recall / (Precision + Recall).
For comparing performance of our entity recognizers, we focus on F1 scores.
Considering that, given a dataset and a subsample size (in terms of number of documents and shots), you can generate different subsamples, we generated 10 subsamples for each one of the nine configurations, trained the entity recognition models, collected performance metrics, and averaged them using micro-averaging. This allowed us to get more stable results, especially for few-shot subsamples.
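As a worked sketch of the micro-averaging step, assuming the per-run true positive, false positive, and false negative counts are available:

def micro_averaged_f1(confusion_counts):
    # confusion_counts: list of (tp, fp, fn) tuples, one per subsample run
    tp = sum(c[0] for c in confusion_counts)
    fp = sum(c[1] for c in confusion_counts)
    fn = sum(c[2] for c in confusion_counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)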
Results
The following table shows the micro-averaged F1 scores computed on performance metrics returned by Amazon Comprehend after training each entity recognizer.
| Subsampled dataset name | Entity recognizer micro-averaged F1 score (%) |
| --- | --- |
| snips-BookRestaurant-subsample-A | 86.89 |
| snips-BookRestaurant-subsample-B | 90.18 |
| snips-BookRestaurant-subsample-C | 92.84 |
| snips-GetWeather-subsample-A | 84.73 |
| snips-GetWeather-subsample-B | 93.27 |
| snips-GetWeather-subsample-C | 93.43 |
| snips-PlayMusic-subsample-A | 80.61 |
| snips-PlayMusic-subsample-B | 81.80 |
| snips-PlayMusic-subsample-C | 85.04 |
The following column chart shows the distribution of F1 scores for the nine configurations we trained as described in the previous section.
We can observe that we were able to successfully train custom entity recognition models even with as few as 25 annotations per entity type. If we focus on the three smallest subsampled datasets (snips-BookRestaurant-subsample-A, snips-GetWeather-subsample-A, and snips-PlayMusic-subsample-A), we see that, on average, we were able to achieve an F1 score of 84%, which is a pretty good result considering the limited number of documents and annotations we used. If we want to improve the performance of our model, we can collect additional documents and annotations and train a new model with more data. For example, with medium-sized subsamples (snips-BookRestaurant-subsample-B, snips-GetWeather-subsample-B, and snips-PlayMusic-subsample-B), which contain twice as many documents and annotations, we obtained on average an F1 score of 88% (5% improvement with respect to subsample-A datasets). Finally, larger subsampled datasets (snips-BookRestaurant-subsample-C, snips-GetWeather-subsample-C, and snips-PlayMusic-subsample-C), which contain even more annotated data (approximately four times the number of documents and annotations used for subsample-A datasets), provided a further 2% improvement, raising the average F1 score to 90%.
Conclusion
In this post, we announced a reduction of the minimum requirements for training a custom entity recognizer with Amazon Comprehend, and ran some benchmarks on open-source datasets to show how this reduction can help you get started. Starting today, you can create an entity recognition model with as few as 25 annotations per entity type (instead of 100), and at least three documents (instead of 250). With this announcement, we’re lowering the barrier to entry for users interested in using Amazon Comprehend custom entity recognition technology. You can now start running your experiments with a very small collection of annotated documents, analyze preliminary results, and iterate by including additional annotations and documents if you need a more accurate entity recognition model for your use case.
To learn more and get started with a custom entity recognizer, refer to Custom entity recognition.
Special thanks to my colleagues Jyoti Bansal and Jie Ma for their precious help with data preparation and benchmarking.
About the author
Luca Guida is a Solutions Architect at AWS; he is based in Milan and supports Italian ISVs in their cloud journey. With an academic background in computer science and engineering, he started developing his AI/ML passion at university. As a member of the natural language processing (NLP) community within AWS, Luca helps customers be successful while adopting AI/ML services.
Promote feature discovery and reuse across your organization using Amazon SageMaker Feature Store and its feature-level metadata capability
Amazon SageMaker Feature Store helps data scientists and machine learning (ML) engineers securely store, discover, and share curated data used in training and prediction workflows. Feature Store is a centralized store for features and associated metadata, allowing features to be easily discovered and reused by data scientist teams working on different projects or ML models.
With Feature Store, you have always been able to add metadata at the feature group level. Data scientists who want the ability to search and discover existing features for their models now have the ability to search for information at the feature level by adding custom metadata. For example, the information can include a description of the feature, the date it was last modified, its original data source, certain metrics, or the level of sensitivity.
The following diagram illustrates the architecture relationships between feature groups, features, and associated metadata. Note how data scientists can now specify descriptions and metadata at both the feature group level and the individual feature level.
In this post, we explain how data scientists and ML engineers can use feature-level metadata with the new search and discovery capabilities of Feature Store to promote better feature reuse across their organization. This capability can significantly help data scientists in the feature selection process and, as a result, help you identify features that lead to increased model accuracy.
Use case
For the purposes of this post, we use two feature groups, customer and loan.
The customer feature group has the following features:
- age – Customer’s age (numeric)
- job – Type of job (one-hot encoded, such as admin or services)
- marital – Marital status (one-hot encoded, such as married or single)
- education – Level of education (one-hot encoded, such as basic 4y or high school)
The loan feature group has the following features:
- default – Has credit in default? (one-hot encoded: no or yes)
- housing – Has housing loan? (one-hot encoded: no or yes)
- loan – Has personal loan? (one-hot encoded: no or yes)
- total_amount – Total amount of loans (numeric)
The following figure shows example feature groups and feature metadata.
The purpose of adding a description and assigning metadata to each feature is to increase the speed of discovery by enabling new search parameters along which a data scientist or ML engineer can explore features. These can reflect details about a feature such as its calculation, whether it’s an average over 6 months or 1 year, origin, creator or owner, what the feature means, and more.
In the following sections, we provide two approaches to search and discover features and configure feature-level metadata: the first using Amazon SageMaker Studio directly, and the second programmatically.
Feature discovery in Studio
You can easily search and query features using Studio. With the new enhanced search and discovery capabilities, you can immediately retrieve results using a simple type-ahead of a few characters.
The following screenshot demonstrates the following capabilities:
- You can access the Feature Catalog tab and observe features across feature groups. The features are presented in a table that includes the feature name, type, description, parameters, date of creation, and associated feature group’s name.
- You can directly use the type-ahead functionality to immediately return search results.
- You have the flexibility to use different types of filter options: All, Feature name, Description, or Parameters. Note that All will return all features where either Feature name, Description, or Parameters match the search criteria.
- You can narrow down the search further by specifying a date range using the Created from and Created to fields and specifying parameters using the Search parameter key and Search parameter value fields.
After you have selected a feature, you can choose the feature’s name to bring up its details. When you choose Edit Metadata, you can add a description and up to 25 key-value parameters, as shown in the following screenshot. Within this view, you can ultimately create, view, update, and delete the feature’s metadata. The following screenshot illustrates how to edit feature metadata for total_amount.
As previously stated, adding key-value pairs to a feature gives you more dimensions along which to search for a given feature. For our example, the feature’s origin has been added to every feature’s metadata. When you choose the search icon and filter along the key-value pair origin: job, you can see all the features that were one-hot-encoded from this base attribute.
Feature discovery using code
You can also access and update feature information through the AWS Command Line Interface (AWS CLI) and SDK (Boto3) rather than directly through the AWS Management Console. This allows you to integrate the feature-level search functionality of Feature Store with your own custom data science platforms. In this section, we interact with the Boto3 API endpoints to update and search feature metadata.
To begin improving feature search and discovery, you can add metadata using the update_feature_metadata API. In addition to the description and created_date fields, you can add up to 25 parameters (key-value pairs) to a given feature.
The following code is an example of five possible key-value parameters that have been added to the job_admin feature. This feature was created, along with job_services and job_none, by one-hot-encoding job.
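The following is a sketch of that call using Boto3; the description and parameter values shown are illustrative, not the actual metadata used in the post:

import boto3

sagemaker_client = boto3.client("sagemaker")

sagemaker_client.update_feature_metadata(
    FeatureGroupName="customer",
    FeatureName="job_admin",
    Description="One-hot encoding of the job attribute (admin category)",
    ParameterAdditions=[
        {"Key": "author", "Value": "data-science-team-member"},
        {"Key": "team", "Value": "credit-risk"},
        {"Key": "origin", "Value": "job"},
        {"Key": "sensitivity", "Value": "low"},
        {"Key": "env", "Value": "prod"},
    ],
)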
After author, team, origin, sensitivity, and env have been added to the job_admin feature, data scientists or ML engineers can retrieve them by calling the describe_feature_metadata API. You can navigate to the Parameters object in the response for the metadata we previously added to our feature. The describe_feature_metadata API endpoint allows you to get greater insight into a given feature by getting its associated metadata.
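For example, continuing the sketch above with the same feature group and feature names:

response = sagemaker_client.describe_feature_metadata(
    FeatureGroupName="customer",
    FeatureName="job_admin",
)

# The metadata added earlier is returned in the Description and Parameters fields
print(response["Description"])
print(response["Parameters"])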
You can search for features by using the SageMaker search API with metadata as search parameters. The following code is an example function that takes a search_string parameter as an input and returns all features where the feature’s name, description, or parameters match the condition:
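A sketch of such a function using the Boto3 search API with the FeatureMetadata resource; the filter names (in particular AllParameters, assumed to search across all key-value parameters) and the lack of pagination handling are simplifications:

def search_features(search_string):
    # Return features whose name, description, or parameters contain the string
    response = sagemaker_client.search(
        Resource="FeatureMetadata",
        SearchExpression={
            "Filters": [
                {"Name": "FeatureName", "Operator": "Contains", "Value": search_string},
                {"Name": "Description", "Operator": "Contains", "Value": search_string},
                {"Name": "AllParameters", "Operator": "Contains", "Value": search_string},
            ],
            "Operator": "Or",
        },
    )
    # Each result wraps the matching feature's metadata
    return [result["FeatureMetadata"] for result in response["Results"]]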
The following code snippet uses our search_features function to retrieve all features for which either the feature name, description, or parameters contain the word job:
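Continuing the sketch above:

matching_features = search_features("job")
for feature in matching_features:
    print(feature["FeatureGroupName"], feature["FeatureName"], feature.get("Description"))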
The following screenshot contains the list of matching feature names as well as their corresponding metadata, including timestamps for each feature’s creation and last modification. You can use this information to improve discovery and visibility into your organization’s features.
Conclusion
SageMaker Feature Store provides a purpose-built feature management solution to help organizations scale ML development across business units and data science teams. Improving feature reuse and feature consistency are primary benefits of a feature store. In this post, we explained how you can use feature-level metadata to improve search and discovery of features. This included creating metadata around a variety of use cases and using it as additional search parameters.
Give it a try, and let us know what you think in the comments. If you want to learn more about collaborating and sharing features within Feature Store, refer to Enable feature reuse across accounts and teams using Amazon SageMaker Feature Store.
About the authors
Arnaud Lauer is a Senior Partner Solutions Architect in the Public Sector team at AWS. He enables partners and customers to understand how best to use AWS technologies to translate business needs into solutions. He brings more than 16 years of experience in delivering and architecting digital transformation projects across a range of industries, including the public sector, energy, and consumer goods. Artificial intelligence and machine learning are some of his passions. Arnaud holds 12 AWS certifications, including the ML Specialty Certification.
Nicolas Bernier is an Associate Solutions Architect, part of the Canadian Public Sector team at AWS. He is currently conducting a master’s degree with a research area in Deep Learning and holds five AWS certifications, including the ML Specialty Certification. Nicolas is passionate about helping customers deepen their knowledge of AWS by working with them to translate their business challenges into technical solutions.
Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build AI/ML solutions. Mark’s work covers a wide range of ML use cases, with a primary interest in computer vision, deep learning, and scaling ML across the enterprise. He has helped companies in many industries, including insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Mark holds six AWS certifications, including the ML Specialty Certification. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services.
Khushboo Srivastava is a Senior Product Manager for Amazon SageMaker. She enjoys building products that simplify machine learning workflows for customers. In her spare time, she enjoys playing violin, practicing yoga, and traveling.
Amazon wins best-paper award at first AutoML conference
Paper presents a criterion for halting the hyperparameter optimization process.
Scale YOLOv5 inference with Amazon SageMaker endpoints and AWS Lambda
After data scientists carefully come up with a satisfying machine learning (ML) model, the model must be deployed to be easily accessible for inference by other members of the organization. However, deploying models at scale with optimized cost and compute efficiencies can be a daunting and cumbersome task. Amazon SageMaker endpoints provide an easily scalable and cost-optimized solution for model deployment. The YOLOv5 model, distributed under the GPLv3 license, is a popular object detection model known for its runtime efficiency as well as detection accuracy. In this post, we demonstrate how to host a pre-trained YOLOv5 model on SageMaker endpoints and use AWS Lambda functions to invoke these endpoints.
Solution overview
The following image outlines the AWS services used to host the YOLOv5 model using a SageMaker endpoint and invoke the endpoint using Lambda. The SageMaker notebook accesses a YOLOv5 PyTorch model from an Amazon Simple Storage Service (Amazon S3) bucket, converts it to YOLOv5 TensorFlow SavedModel format, and stores it back to the S3 bucket. This model is then used when hosting the endpoint. When an image is uploaded to Amazon S3, it acts as a trigger to run the Lambda function. The function utilizes OpenCV Lambda layers to read the uploaded image and run inference using the endpoint. After the inference is run, you can use the results obtained from it as needed.
In this post, we walk through the process of utilizing a YOLOv5 default model in PyTorch and converting it to a TensorFlow SavedModel. This model is hosted using a SageMaker endpoint. Then we create and publish a Lambda function that invokes the endpoint to run inference. Pre-trained YOLOv5 models are available on GitHub. For the purpose of this post, we use the yolov5l model.
Prerequisites
As a prerequisite, we need to set up the following AWS Identity and Access Management (IAM) roles with appropriate access policies for SageMaker, Lambda, and Amazon S3:
- SageMaker IAM role – This requires AmazonS3FullAccess policies attached for storing and accessing the model in the S3 bucket
- Lambda IAM role – This role needs multiple policies:
  - To access images stored in Amazon S3, we require the following IAM policies:
    - s3:GetObject
    - s3:ListBucket
  - To run the SageMaker endpoint, we need access to the following IAM policies:
    - sagemaker:ListEndpoints
    - sagemaker:DescribeEndpoint
    - sagemaker:InvokeEndpoint
    - sagemaker:InvokeEndpointAsync
You also need the following resources and services:
- The AWS Command Line Interface (AWS CLI), which we use to create and configure Lambda.
- A SageMaker notebook instance. These come with Docker pre-installed, and we use this to create the Lambda layers. To set up the notebook instance, complete the following steps:
- On the SageMaker console, create a notebook instance and provide the notebook name, instance type (for this post, we use ml.c5.large), IAM role, and other parameters.
- Clone the public repository and add the YOLOv5 repository provided by Ultralytics.
Host YOLOv5 on a SageMaker endpoint
Before we can host the pre-trained YOLOv5 model on SageMaker, we must export and package it in the correct directory structure inside model.tar.gz. For this post, we demonstrate how to host YOLOv5 in the saved_model format. The YOLOv5 repo provides an export.py file that can export the model in many different ways. After you clone the YOLOv5 repo and enter its directory from the command line, you can export the model with the following command:
$ cd yolov5
$ pip install -r requirements.txt tensorflow-cpu
$ python export.py --weights yolov5l.pt --include saved_model --nms
This command creates a new directory called yolov5l_saved_model inside the yolov5 directory. Inside the yolov5l_saved_model directory, we should see the following items:
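A TensorFlow SavedModel export typically contains items like the following; the exact contents can vary with the YOLOv5 and TensorFlow versions used:

assets/
variables/
saved_model.pb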
To create a model.tar.gz file, move the contents of yolov5l_saved_model to export/Servo/1. From the command line, we can compress the export directory by running the following command and upload the model to the S3 bucket:
$ mkdir export && mkdir export/Servo
$ mv yolov5l_saved_model export/Servo/1
$ tar -czvf model.tar.gz export/
$ aws s3 cp model.tar.gz "<s3://BUCKET/PATH/model.tar.gz>"
Then, we can deploy a SageMaker endpoint from a SageMaker notebook by running the following code:
import os
import tensorflow as tf
from tensorflow.keras import backend
from sagemaker.tensorflow import TensorFlowModel
model_data = '<s3://BUCKET/PATH/model.tar.gz>'
role = '<IAM ROLE>'
model = TensorFlowModel(model_data=model_data,
framework_version='2.8', role=role)
INSTANCE_TYPE = 'ml.m5.xlarge'
ENDPOINT_NAME = 'yolov5l-demo'
predictor = model.deploy(initial_instance_count=1,
instance_type=INSTANCE_TYPE,
endpoint_name=ENDPOINT_NAME)
The preceding script takes approximately 2–3 minutes to fully deploy the model to the SageMaker endpoint. You can monitor the status of the deployment on the SageMaker console. After the model is hosted successfully, the model is ready for inference.
Test the SageMaker endpoint
After the model is successfully hosted on a SageMaker endpoint, we can test it out, which we do using a blank image. The testing code is as follows:
import json
import boto3
import numpy as np

ENDPOINT_NAME = 'yolov5l-demo'
modelHeight, modelWidth = 640, 640

# SageMaker runtime client used to invoke the endpoint
runtime = boto3.client('runtime.sagemaker')

# Create a blank test image and scale pixel values to [0, 1]
blank_image = np.zeros((modelHeight, modelWidth, 3), np.uint8)
data = np.array(blank_image.astype(np.float32)/255.)
payload = json.dumps([data.tolist()])

response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
ContentType='application/json',
Body=payload)
result = json.loads(response['Body'].read().decode())
print('Results: ', result)
Set up Lambda with layers and triggers
We use OpenCV to demonstrate the model by passing an image and getting the inference results. Lambda doesn’t come with external libraries like OpenCV pre-built, so we need to build them before we can invoke the Lambda code. Furthermore, we want to make sure that we don’t rebuild external libraries like OpenCV every time Lambda is invoked. For this purpose, Lambda provides the ability to create Lambda layers. We can define what goes in these layers, and they can be consumed by the Lambda code every time it’s invoked. We also demonstrate how to create the Lambda layers for OpenCV. For this post, we use an Amazon Elastic Compute Cloud (Amazon EC2) instance to create the layers.
After we have the layers in place, we create the app.py script, which is the Lambda code that uses the layers, runs the inference, and gets results. The following diagram illustrates this workflow.
Create Lambda layers for OpenCV using Docker
Use Dockerfile as follows to create the Docker image using Python 3.7:
FROM amazonlinux
RUN yum update -y
RUN yum install gcc openssl-devel bzip2-devel libffi-devel wget tar gzip zip make -y
# Install Python 3.7
WORKDIR /
RUN wget https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tgz
RUN tar -xzvf Python-3.7.12.tgz
WORKDIR /Python-3.7.12
RUN ./configure --enable-optimizations
RUN make altinstall
# Install Python packages
RUN mkdir /packages
RUN echo "opencv-python" >> /packages/requirements.txt
RUN mkdir -p /packages/opencv-python-3.7/python/lib/python3.7/site-packages
RUN pip3.7 install -r /packages/requirements.txt -t /packages/opencv-python-3.7/python/lib/python3.7/site-packages
# Create zip files for Lambda Layer deployment
WORKDIR /packages/opencv-python-3.7/
RUN zip -r9 /packages/cv2-python37.zip .
WORKDIR /packages/
RUN rm -rf /packages/opencv-python-3.7/
Build and run Docker and store the output ZIP file in the current directory under layers:
$ docker build --tag aws-lambda-layers:latest <PATH/TO/Dockerfile>
$ docker run --rm -it -v $(pwd):/layers aws-lambda-layers cp /packages/cv2-python37.zip /layers
Now we can upload the OpenCV layer artifacts to Amazon S3 and create the Lambda layer:
$ aws s3 cp layers/cv2-python37.zip s3://<BUCKET>/<PATH/TO/STORE/ARTIFACTS>
$ aws lambda publish-layer-version --layer-name cv2 --description "Open CV" --content S3Bucket=<BUCKET>,S3Key=<PATH/TO/STORE/ARTIFACTS>/cv2-python37.zip --compatible-runtimes python3.7
After the preceding commands run successfully, you have an OpenCV layer in Lambda, which you can review on the Lambda console.
Create the Lambda function
We utilize the app.py script to create the Lambda function and use OpenCV. In the following code, change the values for BUCKET_NAME and IMAGE_LOCATION to the location for accessing the image:
import os, logging, json, time, urllib.parse
import boto3, botocore
import numpy as np, cv2
logger = logging.getLogger()
logger.setLevel(logging.INFO)
client = boto3.client('lambda')
# S3 BUCKETS DETAILS
s3 = boto3.resource('s3')
BUCKET_NAME = "<NAME OF S3 BUCKET FOR INPUT IMAGE>"
IMAGE_LOCATION = "<S3 PATH TO IMAGE>/image.png"
# INFERENCE ENDPOINT DETAILS
ENDPOINT_NAME = 'yolov5l-demo'
config = botocore.config.Config(read_timeout=80)
runtime = boto3.client('runtime.sagemaker', config=config)
modelHeight, modelWidth = 640, 640
# RUNNING LAMBDA
def lambda_handler(event, context):
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
# INPUTS - Download Image file from S3 to Lambda /tmp/
input_imagename = key.split('/')[-1]
logger.info(f'Input Imagename: {input_imagename}')
s3.Bucket(BUCKET_NAME).download_file(IMAGE_LOCATION + '/' + input_imagename, '/tmp/' + input_imagename)
# INFERENCE - Invoke the SageMaker Inference Endpoint
logger.info(f'Starting Inference ... ')
orig_image = cv2.imread('/tmp/' + input_imagename)
if orig_image is not None:
start_time_iter = time.time()
# pre-processing input image
image = cv2.resize(orig_image.copy(), (modelWidth, modelHeight), interpolation = cv2.INTER_AREA)
data = np.array(image.astype(np.float32)/255.)
payload = json.dumps([data.tolist()])
# run inference
response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME, ContentType='application/json', Body=payload)
# get the output results
result = json.loads(response['Body'].read().decode())
end_time_iter = time.time()
# get the total time taken for inference
inference_time = round((end_time_iter - start_time_iter)*100)/100
logger.info(f'Inference Completed ... ')
# OUTPUTS - Using the output to utilize in other services downstream
return {
"statusCode": 200,
"body": json.dumps({
"message": "Inference Time:// " + str(inference_time) + " seconds.",
"results": result
}),
}
Deploy the Lambda function with the following code:
$ zip app.zip app.py
$ aws s3 cp app.zip s3://<BUCKET>/<PATH/TO/STORE/FUNCTION>
$ aws lambda create-function --function-name yolov5-lambda --handler app.lambda_handler --region us-east-1 --runtime python3.7 --environment "Variables={BUCKET_NAME=$BUCKET_NAME,S3_KEY=$S3_KEY}" --code S3Bucket=<BUCKET>,S3Key="<PATH/TO/STORE/FUNCTION/app.zip>"
Attach the OpenCV layer to the Lambda function
After we have the Lambda function and layer in place, we can connect the layer to the function as follows:
$ aws lambda update-function-configuration --function-name yolov5-lambda --layers <cv2-LAYER-VERSION-ARN>
We can review the layer settings via the Lambda console.
Trigger Lambda when an image is uploaded to Amazon S3
We use an image upload to Amazon S3 as a trigger to run the Lambda function. For instructions, refer to Tutorial: Using an Amazon S3 trigger to invoke a Lambda function.
You should see the following function details on the Lambda console.
Run inference
After you set up Lambda and the SageMaker endpoint, you can test the output by invoking the Lambda function. We use an image upload to Amazon S3 as a trigger to invoke Lambda, which in turn invokes the endpoint for inference. As an example, we upload the following image to the Amazon S3 location <S3 PATH TO IMAGE>/test_image.png configured in the previous section.
After the image is uploaded, the Lambda function is triggered to download and read the image data and send it to the SageMaker endpoint for inference. The output result from the SageMaker endpoint is obtained and returned by the function in JSON format, which we can use in different ways. The following image shows example output overlayed on the image.
Clean up
Depending on the instance type, SageMaker notebooks can require significant compute usage and cost. To avoid unnecessary costs, we advise stopping the notebook instance when it’s not in use. Additionally, Lambda functions and SageMaker endpoints incur charges only when they’re invoked. Therefore, no cleanup is necessary for those services. However, if an endpoint isn’t being used any longer, it’s good practice to remove the endpoint and the model.
Conclusion
In this post, we demonstrated how to host a pre-trained YOLOv5 model on a SageMaker endpoint and use Lambda to invoke inference and process the output. The detailed code is available on GitHub.
To learn more about SageMaker endpoints, check out Create your endpoint and deploy your model and Build, test, and deploy your Amazon SageMaker inference models to AWS Lambda, which highlights how you can automate the process of deploying YOLOv5 models.
About the authors
Kevin Song is an IoT Edge Data Scientist at AWS Professional Services. Kevin holds a PhD in Biophysics from The University of Chicago. He has over 4 years of industry experience in Computer Vision and Machine Learning. He is involved in helping customers in the sports and life sciences industry deploy Machine Learning models.
Romil Shah is an IoT Edge Data Scientist at AWS Professional Services. Romil has over 6 years of industry experience in Computer Vision, Machine Learning and IoT edge devices. He is involved in helping customers optimize and deploy their Machine Learning models for edge devices for industrial setup.
20B-parameter Alexa model sets new marks in few-shot learning
With an encoder-decoder architecture — rather than decoder only — the Alexa Teacher Model excels other large language models on few-shot tasks such as summarization and machine translation.
Amazon Scholar Kathleen McKeown receives dual honors
McKeown awarded IEEE Innovation in Societal Infrastructure Award and named a member of the American Philosophical Society.