Francesco Locatello on the four NeurIPS papers he coauthored this year, which largely concern generalization to out-of-distribution test data.
Amazon Rekognition Labels adds 600 new labels, including landmarks, and now detects dominant colors
Amazon Rekognition offers pre-trained and customizable computer vision capabilities to extract information and insights from images and videos. One such capability is Amazon Rekognition Labels, which detects objects, scenes, actions, and concepts in images. Customers such as Synchronoss, Shutterstock, and Nomad Media use Amazon Rekognition Labels to automatically add metadata to their content library and enable content-based search results. TripleLift uses Amazon Rekognition Labels to determine the best moments to dynamically insert ads that complement the viewing experience for the audience. VidMob uses Amazon Rekognition Labels to extract metadata from ad creatives to understand the unique role of creative decision-making in ad performance, so marketers can produce ads that impact key objectives they care about most. Additionally, thousands of other customers use Amazon Rekognition Labels to support many other use cases, such as classifying trail or hiking photos, detecting people or vehicles in security camera footage, and classifying identity document pictures.
Amazon Rekognition Labels for images detects 600 new labels, including landmarks and activities, and improves accuracy for over 2,000 existing labels. In addition, Amazon Rekognition Labels now supports Image Properties to detect dominant colors of an image, its foreground and background, as well as detected objects with bounding boxes. Image Properties also measures image brightness, sharpness, and contrast. Lastly, Amazon Rekognition Labels now organizes label results using two additional fields, `aliases` and `categories`, and supports filtering of those results. In the following sections, we review the new capabilities and their benefits in more detail with some examples.
New labels
Amazon Rekognition Labels has added over 600 new labels, expanding the list of supported labels. The following are some examples of the new labels:
- Popular landmarks – Brooklyn Bridge, Colosseum, Eiffel Tower, Machu Picchu, Taj Mahal, etc.
- Activities – Applause, Cycling, Celebrating, Jumping, Walking Dog, etc.
- Damage detection – Car Dent, Car Scratch, Corrosion, Home Damage, Roof Damage, Termite Damage, etc.
- Text and documents – Bar Chart, Boarding Pass, Flow Chart, Notebook, Invoice, Receipt, etc.
- Sports – Baseball Game, Cricket Bat, Figure Skating, Rugby, Water Polo, etc.
- Many more – Boat Racing, Fun, Cityscape, Village, Wedding Proposal, Banquet, etc.
With these labels, customers in image sharing, stock photography, or broadcast media can automatically add new metadata to their content library to improve their search capabilities.
Let’s look at a label detection example for the Brooklyn Bridge.
The following table shows the labels and confidence scores returned in the API response.
| Labels | Confidence Scores |
| --- | --- |
| Brooklyn Bridge | 95.6 |
| Bridge | 95.6 |
| Landmark | 95.6 |
Improved labels
Amazon Rekognition Labels has also improved the accuracy for over 2,000 labels. The following are some examples of the improved labels:
- Activities – Diving, Driving, Reading, Sitting, Standing, etc.
- Apparel and accessories – Backpack, Belt, Blouse, Hoodie, Jacket, Shoe, etc.
- Home and indoors – Swimming Pool, Potted Plant, Pillow, Fireplace, Blanket, etc.
- Technology and computing – Headphones, Mobile Phone, Tablet Computer, Reading, Laptop, etc.
- Vehicles and automotive – Truck, Wheel, Tire, Bumper, Car Seat, Car Mirror, etc.
- Text and documents – Passport, Driving License, Business Card, Document, etc.
- Many more – Dog, Kangaroo, Town Square, Festival, Laughing, etc.
Image Properties for dominant color detection and image quality
Image Properties is a new capability of Amazon Rekognition Labels for images, and can be used with or without the label detection functionality. Note: Image Properties is priced separately from Amazon Rekognition Labels, and is only available with the updated SDKs.
Dominant color detection
Image Properties identifies dominant colors in an image based on pixel percentages. These dominant colors are mapped to the 140 CSS color palette, RGB, hex code, and 12 simplified colors (green, pink, black, red, yellow, cyan, brown, orange, white, purple, blue, grey). By default, the API returns up to 10 dominant colors unless you specify the number of colors to return. The maximum number of dominant colors the API can return is 12.
When used standalone, Image Properties detects the dominant colors of an entire image as well as its foreground and background. When used together with label detection functionalities, Image Properties also identifies the dominant colors of detected objects with bounding boxes.
Customers in image sharing or stock photography can use dominant color detection to enrich their image library metadata to improve content discovery, allowing their end-users to filter by color or search objects with specific colors, such as “blue chair” or “red shoes.” Additionally, customers in advertising can determine ad performance based on the colors of their creative assets.
Image quality
In addition to dominant color detection, Image Properties also measures image quality through brightness, sharpness, and contrast scores. Each of these scores ranges from 0–100. For example, a very dark image will return low brightness values, whereas a brightly lit image will return high values.
With these scores, customers in image sharing, advertising, or ecommerce can perform quality inspection and filter out images with low brightness and sharpness to reduce false label predictions.
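The following is a minimal sketch of how Image Properties might be requested alongside labels with the AWS SDK for Python (Boto3) and how the quality scores and dominant colors could be read back. The request and response field names are assumptions based on the public DetectLabels API and should be confirmed against the API reference; the S3 object is a placeholder.

```python
import boto3

rekognition = boto3.client("rekognition")

# Request both labels and image properties in a single DetectLabels call.
# Field names below are assumptions and may differ slightly from the current API.
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "photos/example.jpg"}},  # hypothetical object
    Features=["GENERAL_LABELS", "IMAGE_PROPERTIES"],
    Settings={"ImageProperties": {"MaxDominantColors": 10}},  # assumed setting name
)

# Image quality scores range from 0-100
quality = response["ImageProperties"]["Quality"]
print(quality["Brightness"], quality["Sharpness"], quality["Contrast"])

# Dominant colors mapped to simplified colors, CSS colors, and hex codes
for color in response["ImageProperties"]["DominantColors"]:
    print(color["SimplifiedColor"], color["HexCode"], color["PixelPercent"])
```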
The following image shows an example with the Eiffel Tower.
The following table is an example of Image Properties data returned in the API response.
The following image is an example for a red chair.
The following is an example of Image Properties data returned in the API response.
The following image is an example for a dog with a yellow background.
The following is an example of Image Properties data returned in the API response.

New aliases and categories fields
Amazon Rekognition Labels now returns two new fields, `aliases` and `categories`, in the API response. Aliases are other names for the same label, and categories group individual labels together based on 40 common themes, such as `Food and Beverage` and `Animals and Pets`. With the label detection model update, aliases are no longer returned in the primary list of label names. Instead, aliases are returned in the new `aliases` field in the API response. Note: Aliases and categories are only returned with the updated SDKs.
Customers in photo sharing, ecommerce, or advertising can use aliases and categories to organize their content metadata taxonomy to further enhance content search and filtering:
- Aliases example – Because `Car` and `Automobile` are aliases, you can add metadata to an image with `Car` and `Automobile` at the same time.
- Categories example – You can use categories to create a category filter or display all images related to a particular category, such as `Food and Beverage`, without having to explicitly add metadata to each image with `Food and Beverage`.
The following image shows a label detection example with aliases and categories for a diver.
The following table shows the labels, confidence scores, aliases, and categories returned in the API response.
| Labels | Confidence Scores | Aliases | Categories |
| --- | --- | --- | --- |
| Nature | 99.9 | – | Nature and Outdoors |
| Water | 99.9 | – | Nature and Outdoors |
| Scuba Diving | 99.9 | Aqua Scuba | Travel and Adventure |
| Person | 99.9 | Human | Person Description |
| Leisure Activities | 99.9 | Recreation | Travel and Adventure |
| Sport | 99.9 | Sports | Sports |
The following image is an example for a cyclist.
The following table contains the labels, confidence scores, aliases, and categories returned in the API response.
| Labels | Confidence Scores | Aliases | Categories |
| --- | --- | --- | --- |
| Sky | 99.9 | – | Nature and Outdoors |
| Outdoors | 99.9 | – | Nature and Outdoors |
| Person | 98.3 | Human | Person Description |
| Sunset | 98.1 | Dusk, Dawn | Nature and Outdoors |
| Bicycle | 96.1 | Bike | Hobbies and Interests |
| Cycling | 85.1 | Cyclist, Bike Cyclist | Actions |
Inclusion and exclusion filters
Amazon Rekognition Labels introduces new inclusion and exclusion filtering options in the API input parameters to narrow down the specific list of labels returned in the API response. You can provide an explicit list of labels or categories that you want to include or exclude. Note: These filters are available with the updated SDKs.
Customers can use inclusion and exclusion filters to obtain specific labels or categories they are interested in without having to create additional logic in their application. For example, customers in insurance can use `LabelCategoriesInclusionFilter` to only include label results in the `Damage Detection` category.
The following code is an API sample request with inclusion and exclusion filters:
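A minimal sketch of such a request with Boto3 is shown below. The filter key names follow the terminology used in this post; the exact casing and nesting should be verified against the DetectLabels API reference, and the S3 object is a placeholder.

```python
import boto3

rekognition = boto3.client("rekognition")

# Return only labels in the Animal and Pets category, excluding Dog and Cat.
# Filter key names mirror this post's terminology and are assumptions.
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "images/example.jpg"}},  # hypothetical object
    Features=["GENERAL_LABELS"],
    Settings={
        "GeneralLabels": {
            "LabelCategoriesInclusionFilter": ["Animal and Pets"],
            "LabelsExclusionFilter": ["Dog", "Cat"],
        }
    },
)
```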
The following are examples of how inclusion and exclusion filters work:
- If you only want to detect `Person` and `Car`, and don’t care about other labels, you can specify `["Person", "Car"]` in `LabelsInclusionFilter`.
- If you want to detect all labels except for `Clothing`, you can specify `["Clothing"]` in `LabelsExclusionFilter`.
- If you want to detect only labels within the `Animal and Pets` categories except for `Dog` and `Cat`, you can specify `["Animal and Pets"]` in `LabelCategoriesInclusionFilter`, with `["Dog", "Cat"]` in `LabelsExclusionFilter`.
- If a label is specified in `LabelsInclusionFilter` or `LabelsExclusionFilter`, its aliases will be included or excluded accordingly, because `aliases` is a sub-taxonomy of labels. For example, because `Automobile` is an alias of `Car`, if you specify `Car` in `LabelsInclusionFilter`, the API will return the `Car` label with `Automobile` in the `aliases` field.
Conclusion
Amazon Rekognition Labels detects 600 new labels and improves accuracy for over 2,000 existing labels. Along with these updates, Amazon Rekognition Labels now supports Image Properties, aliases and categories, as well as inclusion and exclusion filters.
To try the new label detection model with its new features, log in to your AWS account and check out the Amazon Rekognition console for label detection and image properties. To learn more, visit Detecting labels.
About the authors
Maria Handoko is a Senior Product Manager at AWS. She focuses on helping customers solve their business challenges through machine learning and computer vision. In her spare time, she enjoys hiking, listening to podcasts, and exploring different cuisines.
Shipra Kanoria is a Principal Product Manager at AWS. She is passionate about helping customers solve their most complex problems with the power of machine learning and artificial intelligence. Before joining AWS, Shipra spent over 4 years at Amazon Alexa, where she launched many productivity-related features on the Alexa voice assistant.
Generate cold start forecasts for products with no historical data using Amazon Forecast, now up to 45% more accurate
Now with Amazon Forecast, you can generate up to 45% more accurate forecasts for products with no historical data. Forecast is a managed service that uses machine learning (ML) to generate accurate demand forecasts, without requiring any ML experience. Accurate forecasting is the foundation for inventory optimization, logistics planning, and workforce management, and it enables businesses to be better prepared to serve their customers. Cold start forecasting is a common challenge where there is a need to generate a forecast but no historical data exists for the product. This is typical in industries such as retail, manufacturing, or consumer packaged goods, where new products are rapidly introduced by bringing newly developed products to market, onboarding brands or catalogs for the very first time, or cross-selling products into new regions. With this launch, we improved on our existing approach to cold start forecasting and now provide forecasts that are up to 45% more accurate.
It can be challenging to develop a cold start forecasting model because traditional statistical forecasting methods such as Autoregressive Integrated Moving Average (ARIMA) or Exponential Smoothing are built on the concept that a product’s historical data can be used to predict its future values. Without historical data, the model parameters can’t be calculated, and thus the model can’t be built. Forecast already had the ability to generate forecasts for cold start products using proprietary neural network algorithms such as DeepAR+ and CNN-QR. These models learn relationships between products and can generate forecasts for products with no historical data. However, the usage of item metadata to establish these relationships was implicit, which meant that the networks were not able to fully extrapolate trend characteristics for cold start products.
Today, we launched a new approach for cold start forecasting that is up to 45% more accurate than before. This approach improves our treatment of item metadata through which we identify explicit products within your dataset that have the most similar characteristics to the cold start products. By focusing on this subset of similar products, we are able to better learn trends to generate a forecast for the cold start product. For example, a fashion retailer introducing a new T-shirt line will want to forecast demand for that line to optimize store inventory. You can provide Forecast with historical data for other products in your catalog such as existing T-shirt lines, jackets, trousers, and shoes, as well as item metadata such as brand name, color, size, and product category for both new and existing products. With this metadata, Forecast automatically detects the products that are most closely related to the new T-shirt line and uses those to generate forecasts for the T-shirt line.
This feature is available in all Regions where Forecast is publicly available through the AWS Management Console or the AutoPredictor API. For more information about Region availability, see AWS Regional Services. To get started on using Forecast for cold start forecasting, refer to Generating Forecasts or the GitHub notebook.
Solution overview
The steps in this post demonstrate how to use Forecast for cold start forecasting on the AWS Management Console. We walk through an example of a retailer generating an inventory demand forecast for a newly launched product by following the three steps in Forecast: importing your data, training a predictor, and creating a forecast. To directly use the Forecast API for cold start forecasting, follow the notebook in our GitHub repo, which provides an analogous demonstration.
Import your training data
To use the new cold start forecasting method, you must import two CSV files: one file containing the target time series data (showing the prediction target), and another file containing the item metadata (showing product characteristics such as size or color). Forecast identifies cold start products as those products that are present in the item metadata file but aren’t present in the target time series file.
To correctly identify your cold start product, ensure that the item ID of your cold start product is entered as a row in your item metadata file and that it’s not contained in the target time series file. For multiple cold start products, enter each product item ID as a separate row in the item metadata file. If you don’t yet have an item ID for your cold start product, you can use any alphanumeric combination less than 64 characters that isn’t already representative of another product in your dataset.
In our example, the target time series file contains the product item ID, timestamp, and demand (inventory), and the item metadata file contains the product item ID, color, product category, and location.
To import your data, complete the following steps:
- On the Forecast console, choose View dataset groups.
- Choose Create dataset group.
- For Dataset group name, enter a dataset name (for this post, my_company_shoe_inventory).
- For Forecasting domain, choose a forecasting domain (for this post, Retail).
- Choose Next.
- On the Create target time series dataset page, provide the dataset name, frequency of your data, and data schema.
- Provide the dataset import details.
- Choose Start.
The following screenshot shows the information for the target time series page filled out for our example.
You’re redirected to the dashboard that you can use to track progress.
- To import the item metadata file, on the dashboard, choose Import.
- On the Create item metadata dataset page, provide the dataset name and data schema.
- Provide the dataset import details.
- Choose Start.
The following screenshot shows the information filled out for our example.
Train a predictor
Next, we train a predictor.
- On the dashboard, choose Train predictor.
- On the Train predictor page, enter a name for your predictor, how long in the future you want to forecast and at what frequency, and the number of quantiles you want to forecast for.
- Enable AutoPredictor. This is required for cold start forecasting.
- Choose Create.
The following screenshot shows the information filled out for our example.
Create a forecast
After our predictor is trained (this can take approximately 2.5 hours), we create a forecast for the newly launched product. You will know that your predictor is trained when you see the View Predictors button on your dashboard.
- Choose Create a forecast on the dashboard.
- On the Create a forecast page, enter a forecast name, choose the predictor that you created, and specify the forecast quantiles (optional) and the items to generate a forecast for.
- Choose Start.
Export your forecasts
After your forecast is created, you can export the data to CSV. You will know that your forecast is created when you see the status is active.
- Choose Create forecast export.
- Enter the export file name (for this post, my_cold_start_forecast_export).
- For Export location, specify the Amazon Simple Storage Service (Amazon S3) location.
- Choose Start.
- To download the export, navigate to the S3 file path location from the console, then select the file and choose Download.
The export file contains the timestamp, item ID, item metadata, and the forecasts for each quantile selected.
View your forecasts
After your forecast is created, you can view the forecasts for the new products graphically on the console.
- Choose Query forecast on the dashboard.
- Choose the name of the forecast created in the previous step (my_cold_start_forecast in our example).
- Enter the start date and end date you want to view your forecast over.
- In the item ID field for the forecast key, add the unique ID of your cold start product.
- Choose Get forecast.
In the figure, you will see the forecast for any quantile selected.
Conclusion
With Forecast, you’re able to obtain the same forecasting insights for cold-start products with no historical data, now up to 45% more accurate than before. To generate cold start forecasts with Forecast, open the Forecast console and follow the steps outlined in this post, or refer to our GitHub notebook on how to access the functionality via API. To learn more, refer to Generating Forecasts.
About the authors
Brandon Nair is a Senior Product Manager for Amazon Forecast. His professional interest lies in creating scalable machine learning services and applications. Outside of work he can be found exploring national parks, perfecting his golf swing or planning an adventure trip.
Manas Dadarkar is a Software Development Manager owning the engineering of the Amazon Forecast service. He is passionate about the applications of machine learning and making ML technologies easily available for everyone to adopt and deploy to production. Outside of work, he has multiple interests including travelling, reading and spending time with friends and family.
Bharat Nandamuri is a Sr Software Engineer working on Amazon Forecast. He is passionate about building high scale backend services with focus on Engineering for ML systems. Outside of work, he enjoys playing chess, hiking and watching movies.
Gaurav Gupta is an Applied Scientist at AWS AI labs and Amazon Forecast. His research interests lie in machine learning for sequential data, operator learning for partial differential equations, wavelets. He completed his PhD from University of Southern California before joining AWS.
Identify key insights from text documents through fine-tuning and HPO with Amazon SageMaker JumpStart
Organizations across industries such as retail, banking, finance, healthcare, manufacturing, and lending often have to deal with vast amounts of unstructured text documents coming from various sources, such as news, blogs, product reviews, customer support channels, and social media. These documents contain critical information that’s key to making important business decisions. As an organization grows, it becomes a challenge to extract critical information from these documents. With the advancement of natural language processing (NLP) and machine learning (ML) techniques, we can uncover valuable insights and connections from these textual documents quickly and with high accuracy, thereby helping companies make quality business decisions on time. Fully managed NLP services have also accelerated the adoption of NLP. Amazon Comprehend is a fully managed service that enables you to build custom NLP models that are specific to your requirements, without the need for any ML expertise.
In this post, we demonstrate how to utilize state-of-the-art ML techniques to solve five different NLP tasks: document summarization, text classification, question answering, named entity recognition, and relationship extraction. For each of these NLP tasks, we demonstrate how to use Amazon SageMaker to perform the following actions:
- Deploy and run inference on a pre-trained model
- Fine-tune the pre-trained model on a new custom dataset
- Further improve the fine-tuning performance with SageMaker automatic model tuning
- Evaluate model performance on the hold-out test data with various evaluation metrics
Although we cover five specific NLP tasks in this post, you can use this solution as a template to generalize fine-tuning pre-trained models with your own dataset, and subsequently run hyperparameter optimization to improve accuracy.
JumpStart solution templates
Amazon SageMaker JumpStart provides one-click, end-to-end solutions for many common ML use cases. Explore the following use cases for more information on available solution templates:
- Demand forecasting
- Credit rating prediction
- Fraud detection
- Computer vision
- Extract and analyze data from documents
- Predictive maintenance
- Churn prediction
- Personalized recommendations
- Reinforcement learning
- Healthcare and life sciences
- Financial pricing
The JumpStart solution templates cover a variety of use cases, under each of which several different solution templates are offered (this Document Understanding solution is under the “Extract and analyze data from documents” use case).
Choose the solution template that best fits your use case from the JumpStart landing page. For more information on specific solutions under each use case and how to launch a JumpStart solution, see Solution Templates.
Solution overview
The following image demonstrates how you can use this solution with SageMaker components. SageMaker training jobs are used to train the various NLP models, and SageMaker endpoints are used to deploy the models in each stage. We use Amazon Simple Storage Service (Amazon S3) alongside SageMaker to store the training data and model artifacts, and Amazon CloudWatch to log training and endpoint outputs.
Open the Document Understanding solution
Navigate to the Document Understanding solution in JumpStart.
Now we can take a closer look at some of the assets that are included in this solution, starting with the demo notebook.
Demo notebook
You can use the demo notebook to send example data to already deployed model endpoints for the document summarization and question answering tasks. The demo notebook quickly allows you to get hands-on experience by querying the example data.
After you launch the Document Understanding solution, open the demo notebook by choosing Use Endpoint in Notebook.
Let’s dive deeper into each of the five main notebooks for this solution.
Prerequisites
In Amazon SageMaker Studio, ensure you’re using the `PyTorch 1.10 Python 3.8 CPU Optimized` image/kernel to open the notebooks. Training uses five ml.g4dn.2xlarge instances, so you should raise a service limit increase request if your account requires increased limits for this type.
Text classification
Text classification refers to classifying an input sentence to one of the class labels of the training dataset. This notebook demonstrates how to use the JumpStart API for text classification.
Deploy and run inference on the pre-trained model
The text classification model we’ve chosen to use is built upon a text embedding model (`tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2`) from TensorFlow Hub, which is pre-trained on the Wikipedia and BookCorpus datasets.

The model available for deployment is created by attaching a binary classification layer to the output of the text embedding model, and then fine-tuning the entire model on the SST-2 dataset, which consists of positive and negative movie reviews.
To run inference on this model, we first need to download the inference container (`deploy_image_uri`), inference script (`deploy_source_uri`), and pre-trained model (`base_model_uri`). We then pass those as parameters to instantiate a SageMaker model object, which we can then deploy:
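The following is a minimal sketch of that flow using the SageMaker JumpStart utilities. The instance type and the inference script name are assumptions, and the model ID is taken from the embedding model mentioned above.

```python
from sagemaker import image_uris, model_uris, script_uris, get_execution_role
from sagemaker.model import Model
from sagemaker.predictor import Predictor

model_id, model_version = "tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2", "*"
inference_instance_type = "ml.m5.xlarge"  # assumed instance type

# Retrieve the inference container, inference script, and pre-trained model artifacts
deploy_image_uri = image_uris.retrieve(
    region=None, framework=None, image_scope="inference",
    model_id=model_id, model_version=model_version,
    instance_type=inference_instance_type,
)
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)
base_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)

# Instantiate a SageMaker model object and deploy it to a real-time endpoint
model = Model(
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    model_data=base_model_uri,
    entry_point="inference.py",  # assumed entry point name
    role=get_execution_role(),
    predictor_cls=Predictor,
)
predictor = model.deploy(initial_instance_count=1, instance_type=inference_instance_type)
```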
The following code shows our responses:
Fine-tune the pre-trained model on a custom dataset
We just walked through running inference on a pre-trained BERT model, which was fine-tuned on the `SST-2` dataset.

Next, we discuss how to fine-tune a model on a custom dataset with any number of classes. The dataset we use for fine-tuning is still the `SST-2` dataset. You can replace this dataset with any dataset that you’re interested in.
We retrieve the training Docker container, training algorithm source, and pre-trained model:
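A sketch of those retrieval calls, reusing the model ID and version from the earlier sketch, might look like the following; the training instance type is an assumption.

```python
from sagemaker import image_uris, model_uris, script_uris

training_instance_type = "ml.g4dn.2xlarge"  # assumed training instance type

train_image_uri = image_uris.retrieve(
    region=None, framework=None, image_scope="training",
    model_id=model_id, model_version=model_version,
    instance_type=training_instance_type,
)
train_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="training"
)
train_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="training"
)
```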
For algorithm-specific hyperparameters, we start by fetching a Python dictionary of the training hyperparameters that the algorithm accepts with their default values. You can override them with custom values, as shown in the following code:
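For example, a sketch of fetching and overriding the defaults; the specific hyperparameter names are assumptions and depend on the chosen model.

```python
from sagemaker import hyperparameters

# Fetch the default hyperparameters for this model and override a few of them
hyperparameters_dict = hyperparameters.retrieve_default(
    model_id=model_id, model_version=model_version
)
hyperparameters_dict["epochs"] = "3"          # assumed hyperparameter name
hyperparameters_dict["batch_size"] = "64"     # assumed hyperparameter name
```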
The dataset (`SST-2`) is split into training, validation, and test sets, where the training set is used to fit the model, the validation set is used to compute evaluation metrics that can be used for HPO, and the test set is used as hold-out data for evaluating model performance. Next, the training and validation datasets are uploaded to Amazon S3 and used to launch the fine-tuning training job:
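A minimal sketch of launching that training job with a SageMaker Estimator follows; it reuses the URIs and hyperparameters from the previous sketches, and the S3 paths, channel names, and training script name are placeholders.

```python
from sagemaker import get_execution_role
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base

training_job_name = name_from_base("jumpstart-tc-finetune")

tc_estimator = Estimator(
    role=get_execution_role(),
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",   # assumed training script name
    instance_count=1,
    instance_type=training_instance_type,
    hyperparameters=hyperparameters_dict,
    output_path="s3://my-bucket/tc-output/",  # hypothetical output location
)

tc_estimator.fit(
    {
        "training": "s3://my-bucket/sst2/train/",        # hypothetical S3 paths
        "validation": "s3://my-bucket/sst2/validation/",
    },
    job_name=training_job_name,
)
```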
After the fine-tuning job is complete, we deploy the model, run inference on the hold-out test dataset, and compute evaluation metrics. Because it’s a binary classification task, we use the accuracy score and F1 score as the evaluation metrics. A larger value indicates better performance. The following screenshot shows our results.
Further improve the fine-tuning performance with SageMaker automatic model tuning
In this step, we demonstrate how you can further improve model performance by fine-tuning the model with SageMaker automatic model tuning. Automatic model tuning, also known as hyperparameter optimization (HPO), finds the best version of a model by running multiple training jobs on your dataset with a range of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose, on the validation dataset.
First, we set the objective as the accuracy score on the validation data (`val_accuracy`) and define metrics for the tuning job by specifying the objective metric name and a regular expression (regex). The regular expression is used to match the algorithm’s log output and capture the numeric values of metrics. Next, we specify hyperparameter ranges to select the best hyperparameter values from. We set the total number of tuning jobs to six and distribute these jobs across three different Amazon Elastic Compute Cloud (Amazon EC2) instances for running parallel tuning jobs. See the following code:
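A sketch of those settings is shown below; the regex and the hyperparameter name are assumptions and should match what the training script actually logs and accepts.

```python
from sagemaker.tuner import ContinuousParameter

# Objective metric captured from each training job's logs
objective_metric_name = "val_accuracy"
metric_definitions = [
    {"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"}  # assumed log format
]

# Ranges from which the tuner samples candidate values
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-5, 1e-3, scaling_type="Logarithmic"),  # assumed name
}

max_jobs = 6           # total tuning jobs
max_parallel_jobs = 3  # jobs run in parallel on separate instances
```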
We pass those values to instantiate a SageMaker Estimator object, similar to what we did in the previous fine-tuning step. Instead of calling the `fit` function of the `Estimator` object, we pass the `Estimator` object in as a parameter to the `HyperparameterTuner` constructor and call its `fit` function to launch the tuning jobs:
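For example, a minimal sketch under the same assumptions, reusing the estimator and tuning settings from the previous sketches:

```python
from sagemaker.tuner import HyperparameterTuner

tuner = HyperparameterTuner(
    estimator=tc_estimator,
    objective_metric_name=objective_metric_name,
    hyperparameter_ranges=hyperparameter_ranges,
    metric_definitions=metric_definitions,
    objective_type="Maximize",
    max_jobs=max_jobs,
    max_parallel_jobs=max_parallel_jobs,
)

tuner.fit(
    {
        "training": "s3://my-bucket/sst2/train/",        # hypothetical S3 paths
        "validation": "s3://my-bucket/sst2/validation/",
    }
)
```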
After the tuning jobs are complete, we deploy the model that gives the best evaluation metric score on the validation dataset, perform inference on the same hold-out test dataset we did in the previous section, and compute evaluation metrics.
The results show that the model selected by automatic model tuning significantly outperforms the model fine-tuned in the previous section on a hold-out test dataset.
Named entity recognition
Named entity recognition (NER) is the process of detecting and classifying named entities into predefined categories, such as names of persons, organizations, locations, and quantities. There are many real-world use cases for NER, such as recommendation engines, categorizing and assigning customer support tickets to the right department, extracting essential information from patient reports in healthcare, and content classification from news and blogs.
Deploy and run inference on the pre-trained model
We deploy the En_core_web_md model from the spaCy library. spaCy is an open-source NLP library that can be used for various tasks, and has built-in methods for NER. We use an AWS PyTorch Deep Learning Container (DLC) with a script mode and install the spaCy library as a dependency on top of the container.
Next, an entry point script (`entry_point.py`) is specified, containing all the code to download and load the `En_core_web_md` model and perform inference on the data that is sent to the endpoint. Finally, we still need to provide `model_data` as the pre-trained model for inference. Because the pre-trained `En_core_web_md` model is downloaded on the fly, as specified in the entry script, we provide an empty archive file. After the endpoint is deployed, you can invoke the endpoint directly from the notebook using the SageMaker Python SDK’s `Predictor`. See the following code:
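A minimal sketch of that deployment with the SageMaker Python SDK follows; the framework versions, source directory, empty archive location, and the example payload are assumptions.

```python
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel

spacy_model = PyTorchModel(
    model_data="s3://my-bucket/models/empty-model.tar.gz",  # empty archive; the spaCy model downloads at runtime
    role=get_execution_role(),
    entry_point="entry_point.py",
    source_dir="code",            # assumed source directory
    framework_version="1.10",     # assumed DLC version
    py_version="py38",
)

ner_predictor = spacy_model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Invoke the endpoint directly from the notebook; the payload format depends on entry_point.py
response = ner_predictor.predict("AWS announced new Amazon Rekognition Labels features in 2022.")
```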
The input data for the model is a textual document. The named entity model extracts noun chunks and named entities in the textual document and classifies them into a number of different types (such as people, places, and organizations). The example input and output are shown in the following code. The `start_char` parameter indicates the character offset for the start of the span, and `end_char` indicates the end of the span.
Fine-tune the pre-trained model on a custom dataset
In this step, we demonstrate how to fine-tune a pre-trained language model for NER on your own dataset. The fine-tuning step updates the model parameters to capture the characteristics of your own data and improve accuracy. We use the WikiANN (PAN-X) dataset to fine-tune the DistilBERT-base-uncased Transformer model from Hugging Face.
The dataset is split into training, validation, and test sets.
Next, we specify the hyperparameters of the model, and use an AWS Hugging Face DLC with script mode (argument `entry_point`) to trigger the fine-tuning job:
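A sketch of that training job using the Hugging Face estimator follows; the DLC versions, hyperparameter names, source directory, and S3 channels are assumptions.

```python
from sagemaker import get_execution_role
from sagemaker.huggingface import HuggingFace

ner_estimator = HuggingFace(
    entry_point="entry_point.py",
    source_dir="code",                 # assumed source directory
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.g4dn.2xlarge",
    transformers_version="4.17",       # assumed DLC versions
    pytorch_version="1.10",
    py_version="py38",
    hyperparameters={
        "model_name": "distilbert-base-uncased",  # assumed hyperparameter names
        "epochs": 3,
        "learning_rate": 5e-5,
    },
)

ner_estimator.fit(
    {
        "train": "s3://my-bucket/wikiann/train/",           # hypothetical S3 paths
        "validation": "s3://my-bucket/wikiann/validation/",
    }
)
```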
After the fine-tuning job is complete, we deploy an endpoint and query it with the hold-out test data. To query the endpoint, each text string needs to be tokenized into one or multiple tokens and sent to the transformer model. Each token gets a predicted named entity tag. Because each text string can be tokenized into one or multiple tokens, we need to duplicate the ground truth named entity tag of the string to all the tokens associated with it. The notebook provided walks you through the steps to achieve this.
Lastly, we use the Hugging Face built-in evaluation metric seqeval to compute evaluation scores on the hold-out test data. The evaluation metrics used are overall precision, overall recall, overall F1, and accuracy. The following screenshot shows our results.
Further improve the fine-tuning performance with SageMaker automatic model tuning
Similar to text classification, we demonstrate how you can further improve model performance by fine-tuning the model with SageMaker automatic model tuning. To run the tuning job, we need to define an objective metric we want to use for evaluating model performance on the validation dataset (the F1 score in this case), hyperparameter ranges to select the best hyperparameter values from, and tuning job configurations such as the maximum number of tuning jobs and the number of parallel jobs to launch at a time:
After the tuning jobs are complete, we deploy the model that gives the best evaluation metric score on the validation dataset, perform inference on the same hold-out test dataset we did in the previous section, and compute evaluation metrics.
We can see that the model with HPO achieves significantly better performance across all metrics.
Question answering
Question answering is useful when you want to query a large amount of text for specific information. It allows a user to express a question in natural language and get an immediate and brief response. Question answering systems powered by NLP can be used in search engines and phone conversational interfaces.
Deploy and run inference on the pre-trained model
Our pre-trained model is the extractive question answering (EQA) model bert-large-uncased-whole-word-masking-finetuned-squad built on a Transformer model from Hugging Face. We use an AWS PyTorch DLC with script mode and install the transformers library as a dependency on top of the container. Similar to the NER task, we provide an empty archive file in the argument `model_data` because the pre-trained model is downloaded on the fly. After the endpoint is deployed, you can invoke the endpoint directly from the notebook using the SageMaker Python SDK’s `Predictor`. See the following code:
All we need to do is construct a dictionary object with two keys. `context` is the text that we wish to retrieve information from. `question` is the natural language query that specifies what information we’re interested in extracting. We call `predict` on our predictor, and we should get a response from the endpoint that contains the most likely answers:
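For example, a minimal sketch, assuming the deployed predictor object is named `eqa_predictor` and using placeholder text; the payload format ultimately depends on the entry script.

```python
data = {
    "context": "Amazon SageMaker JumpStart provides pre-trained models, solution templates, "
               "and example notebooks to help you get started with machine learning quickly.",
    "question": "What does Amazon SageMaker JumpStart provide?",
}

# The predictor serializes the dictionary and sends it to the deployed EQA endpoint
response = eqa_predictor.predict(data)
print(response)
```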
We have the response, and we can print out the most likely answers that have been extracted from the preceding text. Each answer has a confidence score used for ranking (but this score shouldn’t be interpreted as a true probability). In addition to the verbatim answer, you also get the start and end character indexes of the answer from the original context:
Now we fine-tune this model with our own custom dataset to get better results.
Fine-tune the pre-trained model on a custom dataset
In this step, we demonstrate how to fine-tune a pre-trained language model for EQA on your own dataset. The fine-tuning step updates the model parameters to capture the characteristics of your own data and improve accuracy. We use the SQuAD2.0 dataset to fine-tune a text embedding model bert-base-uncased from Hugging Face. The model available for fine-tuning attaches an answer extracting layer to the text embedding model and initializes the layer parameters to random values. The fine-tuning step fine-tunes all the model parameters to minimize prediction error on the input data and returns the fine-tuned model.
Similar to the text classification task, the dataset (SQuAD2.0) is split into training, validation, and test sets.
Next, we specify the hyperparameters of the model, and use the JumpStart API to trigger a fine-tuning job:
After the fine-tuning job is complete, we deploy the model, run inference on the hold-out test dataset, and compute evaluation metrics. The evaluation metrics used are the average exact matching score and average F1 score. The following screenshot shows the results.
Further improve the fine-tuning performance with SageMaker automatic model tuning
Similar to the previous sections, we use a `HyperparameterTuner` object to launch tuning jobs:
After the tuning jobs are complete, we deploy the model that gives the best evaluation metric score on the validation dataset, perform inference on the same hold-out test dataset we did in the previous section, and compute evaluation metrics.
We can see that the model with HPO shows a significantly better performance on the hold-out test data.
Relationship extraction
Relationship extraction is the task of extracting semantic relationships from text, which usually occur between two or more entities. Relationship extraction plays an important role in extracting structured information from unstructured sources such as raw text. In this notebook, we demonstrate two use cases of relationship extraction.
Fine-tune the pre-trained model on a custom dataset
We use a relationship extraction model built on a BERT-base-uncased model from the Hugging Face transformers library. The model for fine-tuning attaches a linear classification layer that takes a pair of token embeddings output by the text embedding model and initializes the layer parameters to random values. The fine-tuning step fine-tunes all the model parameters to minimize prediction error on the input data and returns the fine-tuned model.

The dataset we fine-tune the model on is SemEval-2010 Task 8. The model returned by fine-tuning can be further deployed for inference.
The dataset contains training, validation, and test sets.
We use the AWS PyTorch DLC with script mode from the SageMaker Python SDK, where the `transformers` library is installed as a dependency on top of the container. We define the SageMaker `PyTorch` estimator and a set of hyperparameters, such as the pre-trained model, learning rate, and number of epochs, to perform the fine-tuning. The code for fine-tuning the relationship extraction model is defined in `entry_point.py`. See the following code:
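A minimal sketch of that estimator configuration is shown below; the framework versions, hyperparameter names, source directory, and S3 paths are assumptions.

```python
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch

re_estimator = PyTorch(
    entry_point="entry_point.py",
    source_dir="code",             # assumed source directory
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.g4dn.2xlarge",
    framework_version="1.10",      # assumed DLC version
    py_version="py38",
    hyperparameters={
        "model_name": "bert-base-uncased",  # assumed hyperparameter names
        "epochs": 3,
        "learning_rate": 2e-5,
    },
)

re_estimator.fit(
    {
        "train": "s3://my-bucket/semeval2010-task8/train/",           # hypothetical S3 paths
        "validation": "s3://my-bucket/semeval2010-task8/validation/",
    }
)
```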
Further improve the fine-tuning performance with SageMaker automatic model tuning
Similar to the previous sections, we use a `HyperparameterTuner` object to interact with SageMaker hyperparameter tuning APIs. We can start the hyperparameter tuning job by calling the `fit` method:
When the hyperparameter tuning job is complete, we perform inference and check the evaluation score.
We can see that the model with HPO shows better performance on the hold-out test data.
Document summarization
Document or text summarization is the task of condensing large amounts of text data into a smaller subset of meaningful sentences that represent the most important or relevant information within the original content. Document summarization is a useful technique to distill important information from large amounts of text data to a few sentences. Text summarization is used in many use cases, such as document processing and extracting information from blogs, articles, and news.
This notebook demonstrates deploying the document summarization model T5-base from the Hugging Face transformers library. We also test the deployed endpoints using a text article and evaluate results using the Hugging Face built-in evaluation metric ROUGE.
Similar to the question answering and NER notebooks, we use the `PyTorchModel` from the SageMaker Python SDK along with an `entry_point.py` script to load the T5-base model to an HTTPS endpoint. After the endpoint is successfully deployed, we can send a text article to the endpoint to get a prediction response:
Next, we evaluate and compare the text article and summarization result using the ROUGE metric. Three evaluation metrics are calculated: `rougeN`, `rougeL`, and `rougeLsum`. `rougeN` measures the number of matching n-grams between the model-generated text (summarization result) and a reference (input text). The metrics `rougeL` and `rougeLsum` measure the longest matching sequences of words by looking for the longest common substrings in the generated and reference summaries. For each metric, confidence intervals for precision, recall, and F1 score are calculated. See the following code:
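A sketch of that evaluation with the Hugging Face evaluate library follows; the variable names are placeholders, and older notebooks may load the metric through the datasets library instead.

```python
import evaluate

rouge = evaluate.load("rouge")

# summary_text is the endpoint's summarization output; article_text is the original input article
scores = rouge.compute(
    predictions=[summary_text],
    references=[article_text],
)
print(scores)
```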
Clean up
Resources created for this solution can be deleted using the Delete all resources button from the SageMaker Studio IDE. Each notebook also provides a clean-up section with the code to delete the endpoints.
Conclusion
In this post, we demonstrated how to utilize state-of-the-art ML techniques to solve five different NLP tasks: document summarization, text classification, question answering, named entity recognition, and relationship extraction using JumpStart. Get started with JumpStart now!
About the Authors
Dr. Xin Huang is an Applied Scientist for Amazon SageMaker JumpStart and Amazon SageMaker built-in algorithms. He focuses on developing scalable machine learning algorithms. His research interests are in the area of natural language processing, explainable deep learning on tabular data, and robust analysis of non-parametric space-time clustering. He has published many papers in ACL, ICDM, KDD conferences, and Royal Statistical Society: Series A journal.
Vivek Gangasani is a Senior Machine Learning Solutions Architect at Amazon Web Services. He helps Startups build and operationalize AI/ML applications. He is currently focused on combining his background in Containers and Machine Learning to deliver solutions on MLOps, ML Inference and low-code ML. In his spare time, he enjoys trying new restaurants and exploring emerging trends in AI and deep learning.
Geremy Cohen is a Solutions Architect with AWS where he helps customers build cutting-edge, cloud-based solutions. In his spare time, he enjoys short walks on the beach, exploring the bay area with his family, fixing things around the house, breaking things around the house, and BBQing.
Neelam Koshiya is an enterprise solution architect at AWS. Her current focus is to help enterprise customers with their cloud adoption journey for strategic business outcomes. In her spare time, she enjoys reading and being outdoors.
A quick guide to Amazon’s papers at NeurIPS 2022
Topics range from specific applications, such as computer vision, to more general problems, such as continual learning, to popular AI methods, such as variational autoencoders.
Easy and accurate forecasting with AutoGluon-TimeSeries
AutoGluon-TimeSeries is the latest addition to AutoGluon, which helps you easily build powerful time series forecasting models with as little as three lines of code.
Time series forecasting is a common task in a wide array of industries as well as scientific domains. Having access to reliable forecasts for supply, demand, or capacity is crucial to planning for businesses. However, time series forecasting is a difficult problem, especially when thousands of potentially related time series are available, such as sales in a large catalog in ecommerce, or capacity at hundreds of operational sites.
Simple statistical or judgement-based forecasting methods are often already strong baselines that are difficult to improve on with novel machine learning (ML) methods. Moreover, applications of recent advances in ML to forecasting are varied, with only a few methods, such as DeepAR [1] or Temporal Fusion Transformers [2], emerging as popular choices. However, these methods are difficult to train, tune, and deploy in production, requiring expert knowledge of ML and time series analysis.
AutoML is a fast-growing topic within ML, focusing on automating common tasks in ML pipelines, including feature preprocessing, model selection, model tuning, ensembling, and deployment. AutoGluon-TimeSeries is the latest addition to AutoGluon, one of the leading open-source AutoML solutions, and builds on AutoGluon’s powerful framework for AutoML in forecasting tasks. AutoGluon-TimeSeries was designed to build powerful forecasting systems with as little as three lines of code, alleviating the challenges of feature preprocessing, model selection, model tuning, and ease of deployment.
With a simple call to AutoGluon-TimeSeries’s `TimeSeriesPredictor`, AutoGluon follows an intuitive order of priority in fitting models: starting from simple naive baselines and moving to powerful global neural network and boosted tree-based methods, all within the time budget specified by the user. When related time series (time-varying covariates or exogenous variables) or item metadata (static features) are available, AutoGluon-TimeSeries factors them into the forecast. The library also taps into Bayesian optimization for hyperparameter tuning, arriving at the best model configuration by tuning complex models. Finally, AutoGluon-TimeSeries combines the best of statistical and ML-based methods into a model ensemble optimized for the problem at hand.
In this post, we showcase AutoGluon-TimeSeries’s ease of use in quickly building a powerful forecaster.
Get started with AutoGluon-TimeSeries
To start, you need to install AutoGluon, which is easily done with pip on a UNIX shell:
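For example, assuming a recent pip and Python environment:

```bash
pip install autogluon
```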
AutoGluon-TimeSeries introduces the `TimeSeriesDataFrame` class for working with datasets that include multiple related time series (sometimes called a panel dataset). These data frames can be created from so-called long format data frames, which have time series IDs and timestamps arranged into rows. The following is one such data example, taken from the M4 competition [3]. Here, the `item_id` column specifies the unique identifier of a single time series, such as the product ID for daily sales data of multiple products. The `target` column is the value of interest that AutoGluon-TimeSeries will learn to forecast. `weekend` is an extra time-varying covariate we produced to mark if the observation was on the weekend or not.

We can easily produce a new `TimeSeriesDataFrame` from this dataset using the `from_data_frame` constructor. See the following Python code:
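A minimal sketch, assuming the long-format data is loaded from a hypothetical CSV file with `item_id`, `timestamp`, `target`, and `weekend` columns:

```python
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame

# Hypothetical CSV in long format with item_id, timestamp, target, and weekend columns
df = pd.read_csv("m4_daily_sample.csv", parse_dates=["timestamp"])

ts_dataframe = TimeSeriesDataFrame.from_data_frame(
    df,
    id_column="item_id",
    timestamp_column="timestamp",
)
```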
Some time series data has non-time-varying features (static features or item metadata) that can be used in training a forecasting model. For example, the M4 dataset features a category variable for each time series. These can be added to the `TimeSeriesDataFrame` by setting the `static_features` variable with a new data frame. Use the following code:
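For example, a sketch assuming a data frame `static_df` with one row per `item_id` and a `category` column; indexing by `item_id` is an assumption about the expected format.

```python
# static_df has one row per time series with non-time-varying features, such as a "category" column
ts_dataframe.static_features = static_df.set_index("item_id")
```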
Train a TimeSeriesPredictor
Finally, we can call the `TimeSeriesPredictor` to fit a wide array of forecasting models to build an accurate forecasting system. See the following code:
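A minimal sketch of the predictor configuration, assuming the target column is named `target` as in the data frame above:

```python
from autogluon.timeseries import TimeSeriesPredictor

predictor = TimeSeriesPredictor(
    prediction_length=7,                  # forecast 7 periods ahead
    eval_metric="MASE",                   # judge models by mean absolute scaled error
    target="target",                      # column to forecast
    known_covariates_names=["weekend"],   # time-varying covariate known in the future
)
```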
Here, we specify that the `TimeSeriesPredictor` should produce models to forecast the next seven time periods and judge the best models by using mean absolute scaled error (MASE). Moreover, we indicate that the time-varying covariate `weekend` is available in the dataset. We can now fit the predictor object on the `TimeSeriesDataFrame` produced earlier:
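A sketch of that fit call, using the presets and time limit described next:

```python
predictor.fit(
    ts_dataframe,
    presets="medium_quality",
    time_limit=1800,   # stop training after 1,800 seconds
)
```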
Apart from providing the training data, we ask the predictor to use `"medium_quality"` presets. AutoGluon-TimeSeries comes with multiple presets to select subsets of models to consider and how much time to spend tuning them, managing the trade-off between training speed and accuracy. Apart from presets, more experienced users can use a `hyperparameters` argument to precisely specify component models and which hyperparameters to set on them. We also specify a time limit of 1,800 seconds, after which the predictor stops training.

Under the hood, AutoGluon-TimeSeries trains as many models as it can within the specified time frame, starting from naive but powerful baselines and working towards more complex forecasters based on boosted trees and neural network models. By calling `predictor.leaderboard()`, we can see a list of all models it has trained and the accuracy scores and training times for each. Note that every AutoGluon-TimeSeries model reports its errors in a “higher is better” format, which means most forecasting error measures are multiplied by -1 when reported. See the following example:
Forecast with a TimeSeriesPredictor
Finally, we can use the predictor to predict all time series in a `TimeSeriesDataFrame`, 7 days into the future. Note that because we used time-varying covariates that are assumed to be known in the future, these should also be specified at prediction time. See the following code:
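A sketch of the prediction call, assuming `future_known_covariates` holds the `weekend` flag for each item over the next 7 periods:

```python
# known_covariates must cover the forecast horizon for every item in the dataset
predictions = predictor.predict(ts_dataframe, known_covariates=future_known_covariates)
print(predictions.head())  # point forecasts plus quantile columns
```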
By default, AutoGluon-TimeSeries provides both point forecasts and probabilistic (quantile) forecasts of the target value. Probabilistic forecasts are essential in many planning tasks, and they can be used to flexibly compute intervals, enabling downstream tasks such as inventory and capacity planning.
The following is a sample forecast plot demonstrating point forecasts and prediction intervals.
Conclusion
AutoGluon-TimeSeries gives forecasters and data scientists a quick and easy way to build powerful forecasting models. In addition to some of the library’s commonly used features showcased in this post, AutoGluon-TimeSeries features a set of ways to configure forecasts for advanced users. Predictors are also easy to train, deploy, and serve at scale with Amazon SageMaker, using AutoGluon deep learning containers.
For more details on using AutoGluon, examples, tutorials, as well as other tasks AutoGluon tackles such as learning on tabular or multimodal data, visit AutoGluon. To get started using AutoGluon-TimeSeries, check out our quick start tutorial or our in-depth tutorial for a deeper look into all features the library offers. Follow AutoGluon on Twitter, and star us on GitHub to be informed of the latest updates.
For forecasting at scale with dedicated compute and workflows, enterprise-level support, forecast explainability and more, also check out Amazon Forecast.
References
[1] Salinas, David, Valentin Flunkert, Jan Gasthaus, and Tim Januschowski. “DeepAR: Probabilistic forecasting with autoregressive recurrent networks.” International Journal of Forecasting 36.3 (2020): 1181-1191.

[2] Lim, Bryan, Sercan O. Arik, Nicolas Loeff, and Tomas Pfister. “Temporal Fusion Transformers for interpretable multi-horizon time series forecasting.” International Journal of Forecasting 37.4 (2021): 1748-1764.

[3] Makridakis, Spyros, Evangelos Spiliotis, and Vassilios Assimakopoulos. “The M4 Competition: 100,000 time series and 61 forecasting methods.” International Journal of Forecasting 36.1 (2020): 54-74.

About the authors
Caner Turkmen is an Applied Scientist at Amazon Web Services, where he works on problems at the intersection of machine learning and forecasting, in addition to developing AutoGluon-TimeSeries. Before joining AWS, he worked in the management consulting industry as a data scientist, serving the financial services and telecommunications industries on projects across the globe. Caner’s personal research interests span a range of topics, including forecasting, causal inference, and AutoML.
Oleksandr Shchur is an Applied Scientist at Amazon Web Services, where he works on time series forecasting in AutoGluon-TimeSeries. Before joining AWS, he completed a PhD in Machine Learning at the Technical University of Munich, Germany, doing research on probabilistic models for event data. His research interests include machine learning for temporal data and generative modeling.
Nick Erickson is a Senior Applied Scientist at Amazon Web Services. He obtained his master’s degree in Computer Science and Engineering from the University of Minnesota Twin Cities. He is the co-author and lead developer of the open-source AutoML framework AutoGluon. Starting as a personal competition ML toolkit in 2018, Nick continually expanded the capabilities of AutoGluon and joined Amazon AI in 2019 to open-source the project and work full time on advancing the state-of-the-art in AutoML.
How Prime Video distills time series anomalies into actionable alarms
Targeted handling of three distinct types of “special events” dramatically reduces false-alarm rate.
Your guide to AI/ML at AWS re:Invent 2022
AWS re:Invent season is upon us again! Just a few days to go until re:Invent takes place for the 11th year in Las Vegas, Nevada. The Artificial Intelligence and Machine Learning team at AWS has been working hard to offer amazing content, an outstanding AWS DeepRacer experience, and much more. In this post, we give you a sense of how the AI/ML track is organized and highlight a few sessions we think you’ll like.
The technical sessions in the AI/ML track are divided into four areas. First, there are many common use cases that you can address with a combination of AI/ML and other AWS services, such as Intelligent Document Processing, Contact Center Intelligence, and Personalization among others. Second, ML practitioners of all levels will find compelling content on the entire ML lifecycle, such as data preparation, training, inference, MLOps, AutoML, and no-code ML. This year, we have a renewed emphasis on responsible AI. Customers have been looking for more guidance and new tools in this space. And last but never least, we have exciting workshops and activities with AWS DeepRacer—they have become a signature event!
Visit the AWS Village at the Venetian Expo Hall to meet our AI/ML specialists at the AI/ML booth and learn more about AI/ML services and solutions. You can also chat with our AWS Manufacturing experts at the AWS Industries Networking Lounge, in the Caesars Forum Main Hall.
If you’re new to re:Invent, you can attend sessions of the following types:
- Keynotes – Join in-person or virtual, and learn about all the exciting announcements.
- Leadership sessions – Learn from AWS leaders about key topics in cloud computing.
- Breakout sessions – These 60-minute sessions are expected to have broad appeal, are delivered to larger audiences, and will be recorded. If you miss them, you can watch them on demand after re:Invent.
- Chalk talks – 60 minutes of content delivered to smaller audiences with an interactive whiteboarding session. Chalk talks are where discussions happen, and these offer you the greatest opportunity to ask questions or share your opinion.
- Workshops – Hands-on learning opportunities where, in the course of 2 hours, you’ll be able to build a solution to a problem, understand the inner workings of the resulting infrastructure, and see how the services interact. Bring your laptop and be ready to learn!
- Builders’ sessions – These highly interactive 60-minute mini-workshops are conducted in small groups of less than 10 attendees. Some of these appeal to beginners, and others are on specialized topics.
If you have reserved your seat at any of the sessions, great! If not, we always set aside some spots for walk-ins, so make a plan and come to the room early.
To help you plan your agenda for this year’s re:Invent, here are some highlights of the AI/ML track. So buckle up, and start registering for your favorite sessions.
Visit the session catalog to learn about all AI/ML sessions.
AWS Data and Machine Learning Keynote
Swami Sivasubramanian, Vice President of AWS Data and Machine Learning – Keynote
Wednesday November 30 | 8:30 AM – 10:30 AM PST | The Venetian
Join Swami Sivasubramanian, Vice President of AWS Data and Machine Learning on Wednesday, as he reveals the latest AWS innovations that can help you transform your company’s data into meaningful insights and actions for your business, in person or via livestream.
AI/ML Leadership session
AIM217-L (LVL 200) Innovate with AI/ML to transform your business
Wednesday November 30 | 1:00 PM – 2:00 PM PST
Join Dr. Bratin Saha, VP of AI/ML at AWS, for the AI/ML thought-leadership session. Bratin will share how to use AI/ML to innovate in your business in order to disrupt the status quo. You’ll learn how customers Baxter, BMW, and Alexa have used AWS AI/ML services to fuel business profitability and growth, the latest AI/ML trends, and the details of newly launched AWS capabilities.
Breakout sessions
AIM314 (LVL 300) Accelerate your ML journey with Amazon SageMaker low-code tools
Monday November 28 | 10:00 AM – 11:00 AM PST
In this session, learn how low-code tools, including Amazon SageMaker Data Wrangler, Amazon SageMaker Autopilot, and Amazon SageMaker JumpStart, make it easier to experiment faster and bring highly accurate models to production more quickly and efficiently.
AIM204 (LVL 200) Automate insurance document processing with AI
Monday November 28 | 4:00 PM – 5:00 PM PST
The rapid rate of data generation means that organizations that aren’t investing in document automation risk getting stuck with legacy processes that are slow, error-prone, and difficult to scale. In this session, learn how organizations can take advantage of the latest innovations in AI and ML from AWS to improve the efficiency of their document-intensive claims processing use case.
AIM207 (LVL 200) Make better decisions with no-code ML using SageMaker Canvas, feat. Samsung
Wednesday November 30 | 2:30 PM – 3:30 PM PST
Organizations everywhere use ML to accurately predict outcomes and make faster business decisions. In this session, learn how you can use Amazon SageMaker Canvas to access and combine data from a variety of sources, clean data, build ML models to generate predictions with a single click, and share models across your organization to improve productivity.
AIM307 (LVL 300) JPMorgan Chase real-time agent assist for contact center productivity
Wednesday November 30 | 11:30 AM – 12:30 PM PST
Resolving complex customer issues is often time-consuming and requires agents to quickly gather relevant information from knowledge bases to resolve queries accurately. Join this session to learn how JPMorgan Chase built an AWS Contact Center Intelligence (CCI) real-time agent assist solution to support 75 million customers and help 8,500 servicing agents generate next best actions in the shortest time, reducing agent frustration and churn.
AIM321 (LVL 300) Productionize ML workloads using Amazon SageMaker MLOps, feat. NatWest
Wednesday November 30 | 4:45 PM – 5:45 PM PST
Amazon SageMaker provides a breadth of MLOps tools to train, test, troubleshoot, deploy, and govern ML models at scale. In this session, explore SageMaker MLOps features, including SageMaker Pipelines, SageMaker Projects, SageMaker Experiments, SageMaker Model Registry, and SageMaker Model Monitor, and learn how to increase automation and improve the quality of your ML workflows.
AIM319 (LVL 300) Build, manage, and scale ML development with a web-based visual interface
Wednesday November 30 | 3:15 PM – 4:15 PM PST
Amazon SageMaker Studio is an integrated development environment (IDE) for data science and ML. In this session, explore how to use SageMaker Studio to prepare data and build, train, deploy, and manage ML models from a single, web-based visual interface.
Chalk talks
AIM341-R (LVL 300) Transforming responsible AI from theory into practice
Thursday December 1 | 4:15 PM – 5:15 PM PST
The practices of responsible AI can help reduce biased outcomes from models and improve their fairness, explainability, robustness, privacy, and transparency. Walk away from this chalk talk with best practices and hands-on support to guide you in applying responsible AI in your project.
*This chalk talk will be repeated Wednesday November 30 | 7:00 PM – 8:00 PM PST
AIM306-R (LVL 300) Automate content moderation and compliance with AI
Monday November 28 | 12:15 PM – 1:15 PM PST
In this chalk talk, learn how to efficiently moderate high volumes of user-generated content across media types with AI. Discover how to add humans in the moderation loop to verify low-confidence decisions and continuously improve ML models to keep online communities safe and inclusive and lower content moderation costs.
*This chalk talk will be repeated Wednesday November 30 | 9:15 AM – 10:15 AM PST
AIM407-R (LVL 400) Choosing the right ML instance for training and inference on AWS
Wednesday November 30 | 11:30 AM – 12:30 PM PST
This chalk talk guides you through how to choose the right compute instance type on AWS for your deep learning projects. Explore the available options, such as the most performant instance for training, the best instance for prototyping, and the most cost-effective instance for inference deployments.
*This chalk talk will be repeated Wednesday November 30 | 8:30 AM – 9:30 AM PST
AIM328-R (LVL 300) Explain your ML models with Amazon SageMaker Clarify
Tuesday November 29 | 2:00 PM – 3:00 PM PST
Amazon SageMaker Clarify helps organizations understand their model predictions by providing real-time explanations for models deployed on SageMaker endpoints. In this chalk talk, learn how to identify the importance of various features in overall model predictions and for individual inferences using Shapley values and detect any shifts in feature importance over time after a model is deployed to production.
*This chalk talk will be repeated Monday November 28 | 2:30 PM – 3:30 PM PST
Workshops
AIM342 (LVL 300) Advancing responsible AI: Bias assessment and transparency
Wednesday November 30 | 2:30 PM – 4:30 PM PST
Building and operating ML applications responsibly requires an active, consistent approach to prevent, assess, and mitigate bias. This workshop takes you through a computer vision case study in assessing unwanted bias—follow along during the workshop with a Jupyter notebook.
AIM402-R (LVL 400) Extract AI-driven customer insights using Post-Call Analytics
Monday November 28 | 4:00 PM – 6:00 PM PST
Companies are transforming existing contact centers by adding AI/ML to deliver actionable insights and improve automation with existing telephony systems. Join this workshop to learn how to use the AWS Contact Center Intelligence (CCI) Post-Call Analytics solution to derive AI-driven insights from virtually all customer conversations.
*This workshop will be repeated Wednesday November 30 | 9:15 AM – 11:15 AM PST
AIM212-R (LVL 200) Deep learning with Amazon SageMaker, AWS Trainium, and AWS Inferentia
Monday November 28 | 1:00 PM – 3:00 PM PST
Amazon EC2 Trn1 instances, powered by AWS Trainium, and Amazon EC2 Inf1 instances, powered by AWS Inferentia, deliver the best price performance for deep learning training and inference. In this workshop, walk through training a BERT model for natural language processing on Trn1 instances to save up to 50% in training costs over equivalent GPU-based EC2 instances.
*This workshop will be repeated Monday November 28 | 8:30 AM – 10:30 AM PST
AIM312-R (LVL 300) Build a custom recommendation engine in 2 hours with Amazon Personalize
Monday November 28 | 1:00 PM – 3:00 PM PST
In this workshop, learn how to build a customer-specific solution using your own data to deliver personalized experiences that can be integrated into your existing websites, applications, SMS, and email marketing systems using simple APIs.
*This workshop will be repeated Wednesday November 30 | 11:30 AM – 1:30 PM PST
Builders’ sessions
AIM325-R (LVL 300) Build applications faster with an ML-powered coding companion
Tuesday November 29 | 3:30 PM – 4:30 PM PST
Join this builders’ session to get hands-on experience with ML-powered developer tools from AWS. Learn how to accelerate application development with automatic code recommendations from Amazon CodeWhisperer and automate code reviews with Amazon CodeGuru.
*This session will be repeated Thursday December 1 | 12:30 PM – 1:30 PM PST
Make sure to check out the re:Invent content catalog or the AI/ML attendee guide for more AI/ML content at re:Invent.
AWS DeepRacer: Get hands-on with machine learning
Developers of all skill levels can get hands-on with ML at re:Invent by participating in AWS DeepRacer. Learn to build your own ML model from AWS ML experts in one of 11 workshop sessions, featuring guest speakers from JPMorgan Chase and Intel. Compete by racing your own ML model on real championship tracks in both the MGM and the Sands Expo, or hop in the driver’s seat to experience ML fundamentals through the fun of gamified learning with AWS DeepRacer Arcades. Whether in the classroom, on the track, or behind the wheel, AWS DeepRacer is the fastest way to get rolling with ML.
Developers: start your engines! Starting Monday November 28, the top 50 racers from around the world compete in the AWS DeepRacer League Championships presented by Intel, hosted at the AWS DeepRacer Championship Stadium in the Sands Expo. Watch trackside or tune in live on twitch.tv/aws at 3:00 PM PST on Tuesday, November 29, to see the top eight racers battle it out in the semifinals. Cheer on the finalists as they go for their shot at $20,000 in cash prizes and the right to hoist the Championship Cup.
Race on any AWS DeepRacer track on Thursday, December 1, to compete in the 2023 re:Invent Open, where the fastest competitor of the day will claim an all-expenses paid trip back to Vegas to compete in the 2023 AWS DeepRacer Championship Cup.
Attendees who participate in AWS DeepRacer Arcades or open track (non-competitive) racing will also have the chance to win one of six spots in the AWS DeepRacer Winner’s Circle Driving Experience Sweepstakes, where they will race real, full-size exotic cars on a closed track alongside the AWS DeepRacer 2022 Champions in Las Vegas.
Don’t forget to check out the AWS DeepRacer workshops before they fill up to reserve your spot. We can’t wait to see you in Las Vegas!
About the authors
Denis V. Batalov is a 17-year Amazon veteran and holds a PhD in machine learning. Denis has worked on such exciting projects as Search Inside the Book, Amazon Mobile apps, and Kindle Direct Publishing. Since 2013, he has helped AWS customers adopt AI/ML technology as a Solutions Architect. Currently, Denis is a Worldwide Tech Leader for AI/ML, responsible for the functioning of AWS ML Specialist Solutions Architects globally. Denis is a frequent public speaker; you can follow him on Twitter @dbatalov.
Amelie Perkuhn is a Product Marketing Manager on the AI Services team at AWS. She has held various roles within AWS over the past 6 years, and in her current role, she is focused on driving adoption of AI Services including Amazon Kendra. In her spare time, Amelie enjoys the Pacific Northwest with her dog Moxie.
Amazon and UCLA announce fellowship recipients
The UCLA Science Hub fellows fulfill the hub’s mission of researching the societal impact of artificial intelligence.Read More
AlexaTM 20B is now available in Amazon SageMaker JumpStart
Today, we announce the public availability of Amazon’s state-of-the-art Alexa Teacher Model with 20 billion parameters (AlexaTM 20B) through Amazon SageMaker JumpStart, SageMaker’s machine learning hub. AlexaTM 20B is a multilingual large-scale sequence-to-sequence (seq2seq) language model developed by Amazon. You can use AlexaTM 20B for a wide range of industry use cases, from summarizing financial reports to question answering for customer service chatbots. It can be applied even when there are only a few available training examples, or even none at all. AlexaTM 20B outperforms a 175-billion-parameter GPT-3 model on zero-shot learning tasks such as SuperGLUE and shows state-of-the-art performance for multilingual zero-shot tasks such as XNLI.
In this post, we provide an overview of how to deploy and run inference with the AlexaTM 20B model programmatically through JumpStart APIs, available in the SageMaker Python SDK. We show how you can use this model to translate between multiple languages, summarize long-form text, answer questions based on a given context, and generate text that appears indistinguishable from human-written text.
AlexaTM 20B and in-context learning
The Alexa Teacher Model (AlexaTM) program by Amazon Alexa AI is designed to build large-scale, multilingual deep learning models (primarily Transformer-based), aiming to improve generalization and handle data scarcity for downstream tasks. With large-scale pre-training, teacher models can generalize well to learn new tasks from sparse data and help developers improve performance on downstream tasks. AlexaTM 20B has shown competitive performance on common natural language processing (NLP) benchmarks and tasks, such as machine translation, data generation, and summarization.
Using foundation models such as AlexaTM 20B reduces the need for expensive model pre-training and provides a state-of-the-art starting point to develop task models with less effort and less task-specific training data. One of the key abilities of foundation models is that we can teach a model to perform new tasks such as question answering in different languages, with very small amounts of input examples and no fine-tuning or gradient updates required. This is known as in-context learning. With only a few examples of a new task provided as context for inference, the AlexaTM 20B model can transfer knowledge from what has been learned during large-scale pre-training, even across languages. This is called few-shot learning. In some cases, the model can perform well without any training data at all, with only an explanation of what should be predicted. This is called zero-shot learning. For example, let’s say we are using AlexaTM 20B for one-shot natural language generation. The input passed to the model is the training example in the form of attribute-value pairs, along with its corresponding output text narrative. The test example is then appended to form the full input prompt, as shown in the following figure.
To learn more about the model, check out 20B-parameter Alexa model sets new marks in few-shot learning or the original paper.
AlexaTM 20B is made available for non-commercial use and is covered under the Alexa Teacher Model License agreement.
Solution overview
The following sections provide a step-by-step demo on how to deploy the model, run inference, and do in-context-learning to solve few-shot learning tasks.
Note that the following section contains code snippets; the full code with all the steps in this demo is available in the accompanying notebook: In-context-learning with AlexaTM 20B in SageMaker JumpStart.
Deploy the model
To use a large language model in SageMaker, you need an inference script specific to the model, which includes steps like model loading, parallelization, and more. You also need to create end-to-end tests for the scripts, model, and desired instance types to validate that all three work together. JumpStart removes this effort by providing ready-to-use scripts that have been robustly tested.
SageMaker gives you the ability to run Docker containers for training and inference. JumpStart uses these available framework-specific SageMaker Deep Learning Containers (DLCs). We start by fetching the optimized DLC (deploy_image_uri) using the model_id. Then we fetch the model_uri containing the model parameters, along with inference handling scripts and any associated dependencies. Next, we create a model instance in SageMaker and deploy it to a real-time endpoint. See the following code:
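The following is a minimal sketch of this step using the SageMaker Python SDK JumpStart utilities. The model ID shown is an assumption; confirm the exact ID and arguments in the accompanying notebook.

```python
from sagemaker import image_uris, model_uris

# Assumed JumpStart model ID for AlexaTM 20B; confirm in the accompanying notebook
model_id, model_version = "pytorch-textgeneration1-alexa20b", "*"
inference_instance_type = "ml.g4dn.12xlarge"

# Retrieve the framework-specific Deep Learning Container optimized for hosting this model
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,  # inferred automatically from model_id
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)

# Retrieve the model artifacts, inference handling scripts, and associated dependencies
model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)
```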
Deploying AlexaTM 20B requires a GPU-backed instance with at least 50 GB of CPU memory and at least 42 GB of GPU memory. SageMaker provides many such instances that support real-time inference. We tested this solution on three instance types: ml.g4dn.12xlarge, ml.p3.8xlarge, and ml.p3.16xlarge. See the following code:
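As a sketch, any of the tested instance types can be selected here; pick one that is available in your Region.

```python
# Any of the instance types verified above works for real-time inference
inference_instance_type = "ml.g4dn.12xlarge"  # or "ml.p3.8xlarge" / "ml.p3.16xlarge"
```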
Next, we deploy the model to a SageMaker real-time endpoint:
AlexaTM 20B requires 40 GB of disk space in the inference container. An ml.g4dn.12xlarge instance fulfills this requirement. For instance types ml.p3.8xlarge and ml.p3.16xlarge, we attach an Amazon Elastic Block Store (Amazon EBS) volume to handle the large model size. Therefore, we set volume_size = None when deploying on ml.g4dn.12xlarge and volume_size = 256 when deploying on ml.p3.8xlarge or ml.p3.16xlarge.
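A minimal deployment sketch follows, assuming the variables from the previous snippets, a SageMaker execution role, and a recent version of the SageMaker Python SDK; the exact environment variables and timeouts used in the accompanying notebook may differ.

```python
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.session import Session
from sagemaker.utils import name_from_base

aws_role = Session().get_caller_identity_arn()  # or your SageMaker execution role ARN
endpoint_name = name_from_base("jumpstart-example-alexatm20b")

# ml.g4dn.12xlarge has enough local storage; p3 instances need an attached EBS volume
volume_size = None if inference_instance_type == "ml.g4dn.12xlarge" else 256

model = Model(
    image_uri=deploy_image_uri,
    model_data=model_uri,
    role=aws_role,
    predictor_cls=Predictor,
    name=endpoint_name,
)

# Extended timeouts give the container time to download and load the large model artifacts
model_predictor = model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    endpoint_name=endpoint_name,
    volume_size=volume_size,
    model_data_download_timeout=3600,
    container_startup_health_check_timeout=3600,
)
```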
Deploying the model may take up to 10 minutes. After the model is deployed, we can get predictions from it in real time!
Run inference
AlexaTM 20B is a text generation model that, given a partial sequence (a sentence or piece of text), generates the next set of words. The following code snippet gives you a glimpse of how to query the endpoint we deployed and parse the outputs for the auto-completion task. To send requests to a deployed model, we use a JSON dictionary encoded in UTF-8 format. The endpoint response is a JSON object containing a list of generated texts.
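The following helper functions are a minimal sketch of such a query; the payload key (text_inputs) and response key (generated_texts) are assumptions based on the accompanying notebook, so confirm them there.

```python
import json
import boto3

sm_runtime = boto3.client("sagemaker-runtime")

def query_endpoint(text, endpoint_name):
    """Send the input text to the endpoint as a UTF-8 encoded JSON dictionary."""
    payload = {"text_inputs": text}  # assumed payload key; confirm in the notebook
    return sm_runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )

def parse_response(query_response):
    """Parse the JSON response body and return the list of generated texts."""
    model_predictions = json.loads(query_response["Body"].read())
    return model_predictions["generated_texts"]  # assumed response key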
Next, we query the endpoint and parse the response on a sample input text:
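For example, using a hypothetical input sentence:

```python
text = "The chemistry of life is"  # hypothetical sample input

response = query_endpoint(text, endpoint_name)
print(parse_response(response))
```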
AlexaTM 20B currently supports 10 text generation parameters during inference: max_length, num_return_sequences, num_beams, no_repeat_ngram_size, temperature, early_stopping, do_sample, top_k, top_p, and seed. For detailed information on valid values for each parameter and their impact on the output, see the accompanying notebook: In-context-learning with AlexaTM 20B in SageMaker JumpStart.
In-context learning
In-context learning refers to the following: we provide the language model with a prompt, which consists of training input-output pairs that demonstrate the task. We append a test input to the prompt and allow the language model to make predictions by conditioning on the prompt and predicting the next tokens or words. This is a highly effective technique to solve few-shot learning problems, in which we learn a task from a few training samples.
Next, we show how you can use AlexaTM 20B for several 1-shot and zero-shot tasks via in-context learning. Unlike prior sequence-to-sequence models, AlexaTM 20B was trained on causal language modeling in addition to denoising, which makes it a good model for in-context learning.
1-shot text summarization
Text summarization is the task of shortening a piece of text and creating a summary that represents the most important information in the original text. 1-shot text summarization refers to the setting where we learn to summarize the text based on a single training sample. The following code is a text summarization sample from the XSUM dataset:
We use the following prompt for summarization when only one training sample is provided. The generated text from the model is interpreted as the predicted summary of the test article.
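The following sketch shows how such a 1-shot prompt can be assembled and sent to the endpoint. The placeholder strings stand in for the XSUM sample, and the template wording is illustrative; the exact templates are listed in the accompanying notebook.

```python
# Placeholders for the XSUM sample; the template wording is illustrative only
train_article = "[full text of the training article]"
train_summary = "[reference summary of the training article]"
test_article = "[article we want the model to summarize]"

one_shot_prompt = (
    f"article: {train_article}\n"
    f"summary: {train_summary}\n"
    f"article: {test_article}\n"
    f"summary:"
)

predicted_summary = parse_response(query_endpoint(one_shot_prompt, endpoint_name))[0]
print(predicted_summary)
```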
The output is as follows:
1-shot natural language generation
Natural language generation is the task of producing text narratives given the input text. The following is a training sample from the E2E dataset:
We use the following prompt for natural language generation when only one training sample (1-shot) is provided. The generated text from the model is interpreted as the predicted text narrative for the test input (test_inp).
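A sketch of the corresponding prompt construction follows, using illustrative E2E-style attribute-value strings rather than the actual sample; the template wording is an assumption.

```python
# Illustrative E2E-style strings, not the actual sample; template wording is assumed
train_inp = "name[The Punter], food[Indian], priceRange[cheap]"
train_out = "The Punter provides Indian food in the cheap price range."
test_inp = "name[Blue Spice], eatType[coffee shop], area[city centre]"

nlg_prompt = f"{train_inp}\n{train_out}\n\n{test_inp}\n"
predicted_narrative = parse_response(query_endpoint(nlg_prompt, endpoint_name))[0]
print(predicted_narrative)
```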
The output is as follows:
1-shot machine translation
Machine translation is the task of translating text from one language to another. The following example shows a training sample from the WMT19 dataset in which we need to translate from German to English:
We use the following prompt for machine translation when only one training sample (1-shot) is provided. Generated text from the model is interpreted as the translation of the test input (test_inp).
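A sketch of the 1-shot translation prompt follows, using an illustrative German-English sentence pair rather than the actual WMT19 sample; the template wording is an assumption.

```python
# Illustrative sentence pair, not the actual WMT19 sample; template wording is assumed
train_inp = "Das Haus ist wunderbar."
train_out = "The house is wonderful."
test_inp = "Ich arbeite gerne mit Sprachmodellen."

mt_prompt = (
    f"German: {train_inp}\n"
    f"English: {train_out}\n"
    f"German: {test_inp}\n"
    f"English:"
)
predicted_translation = parse_response(query_endpoint(mt_prompt, endpoint_name))[0]
print(predicted_translation)
```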
The output is as follows:
Zero-shot extractive question answering
Extractive question answering is the task of finding the answer to a question from the context paragraph. The following is an example of a context and a question from the SQuAD v2 dataset:
Note that we don’t have any training samples for our task. Instead, we create a dummy question about the last word in the prompt, based on the test_context (dummy-shot). Therefore, we’re actually doing zero-shot extractive question answering.
We use the following prompt for extractive question answering when no training sample is provided. Generated text from the model is interpreted as the answer to the test question.
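A sketch of the dummy-shot prompt construction follows, with placeholder strings for the SQuAD v2 context and question; the template wording is an assumption.

```python
# Placeholders for the SQuAD v2 sample; the dummy-shot template wording is assumed
test_context = "[context paragraph]"
test_question = "[question about the context]"

# Dummy shot: a made-up question about the last word of the context teaches the
# model the expected question/answer format without using any real training sample
last_word = test_context.split()[-1]
qa_prompt = (
    f"{test_context}\n"
    f"question: What is the last word of the passage?\n"
    f"answer: {last_word}\n"
    f"question: {test_question}\n"
    f"answer:"
)

predicted_answer = parse_response(query_endpoint(qa_prompt, endpoint_name))[0]
print(predicted_answer)
```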
The output is as follows:
Prompt engineering
Prompt engineering can sometimes be an art. Even small changes to the prompt template can result in significant changes to the model’s performance on a specific task. The following are a few pieces of advice for writing good prompt templates:
- First, it’s important to remember that the model was trained to learn the structure of real sentences (causal language modeling). As such, it’s best to ensure that your prompt template is grammatically and structurally correct in natural language.
- Second, this particular model benefits from dummy shots to help teach it the structure expected in the answer, as demonstrated above.
- Third, it’s always advised to examine task performance over a variety of candidate prompt templates. Promptsource and Natural Instructions are two open-source frameworks for standardizing prompt templates, and they provide a variety of example prompts used for existing modeling tasks.
Additionally, Appendix B of the AlexaTM 20B paper provides the prompt templates used to generate the results presented in the paper. There is a growing sub-field dedicated to the automatic creation and learning of the best prompts for a task, including both natural language and continuous prompts. This is beyond the scope of this tutorial.
Conclusion
In this post, we showed how to deploy the AlexaTM 20B model on a SageMaker endpoint and run inference. You can use the AlexaTM 20B model for in-context-learning for a variety of few-shot learning tasks. To learn more about AlexaTM 20B, refer to 20B-parameter Alexa model sets new marks in few-shot learning or the original paper.
The authors would like to acknowledge the technical contributions of Maciej Rudnicki, Jakub Debski, Ashish Khetan, Anastasiia Dubinina, Vitaliy Korolev, Karl Albertsen, Saleh Soltan, and Mariusz Momotko toward making this launch possible.
About JumpStart
JumpStart is the machine learning (ML) hub of Amazon SageMaker that offers over 350 pre-trained models, built-in algorithms, and pre-built solution templates to help you get started with ML fast. JumpStart hosts state-of-the-art models from popular model hubs such as TensorFlow, PyTorch, Hugging Face, and MXNet, which support popular ML tasks such as object detection, text classification, and text generation. The ML research community has put a large amount of effort into making a majority of recently developed models publicly available for use. JumpStart aims to help you find the right ML models and algorithms, and immediately start building models. Specifically, JumpStart provides the following benefits:
- Easy access with the UI and SDK – You can access models and algorithms in JumpStart programmatically using the SageMaker Python SDK or through the JumpStart UI in Amazon SageMaker Studio. Currently, AlexaTM 20B is only accessible through the SageMaker Python SDK.
- SageMaker built-in algorithms – JumpStart provides over 350 built-in algorithms and pre-trained models, along with corresponding training scripts (if supported), inference scripts, and example notebooks. Scripts are optimized for each framework and task, and provide features such as GPU support, automatic model tuning, and incremental training. Scripts are also tested against SageMaker instances and features so that you don’t run into compatibility issues.
- Pre-built solutions – JumpStart provides a set of 23 solutions for common ML use cases, such as demand forecasting and industrial and financial applications, which you can deploy with just a few clicks. Solutions are end-to-end ML applications that string together various AWS services to solve a particular business use case. They use AWS CloudFormation templates and reference architectures for quick deployment, which means they’re fully customizable.
- Support – SageMaker provides a range of support, such as maintaining up-to-date versions when new SageMaker features or Deep Learning Container versions are released, and creating documentation on how to use JumpStart contents in a SageMaker environment.
To learn more about JumpStart and how you can use open-source pre-trained models for a variety of other ML tasks, check out the following AWS re:Invent 2020 video.
About the Authors
Dr. Vivek Madan is an Applied Scientist with the Amazon SageMaker JumpStart team. He got his PhD from University of Illinois at Urbana-Champaign and was a Post Doctoral Researcher at Georgia Tech. He is an active researcher in machine learning and algorithm design and has published papers in EMNLP, ICLR, COLT, FOCS, and SODA conferences.
Jack FitzGerald is a senior applied scientist with Alexa AI, where he currently focuses on large language modeling, multilingual text modeling, and machine learning operations.
João Moura is an AI/ML Specialist Solutions Architect at Amazon Web Services. He is mostly focused on NLP use cases and helping customers optimize deep learning model training and deployment. He is also an active proponent of low-code ML solutions and ML-specialized hardware.
June Won is a product manager with SageMaker JumpStart and Built-in Algorithms. He focuses on making ML contents easily discoverable and usable for SageMaker customers.
Pulkit Kapur is the product lead for the Alexa Teacher Model program with Alexa AI, focusing on generalized intelligence and applications of Alexa’s multitask multimodal foundation models.