Customizing your machine translation using Amazon Translate Active Custom Translation

When translating the English phrase “How are you?” to Spanish, would you prefer “¿Cómo estás?” or “¿Cómo está usted?”?

Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Today, we’re excited to introduce Active Custom Translation (ACT), a feature that gives you more control over your machine translation output. You can now influence whether the output reads “¿Cómo estás?” or “¿Cómo está usted?”. To make ACT work, simply provide your translation examples in TMX, TSV, or CSV format to create parallel data (PD), and Amazon Translate uses your PD along with your batch translation job to customize the translation output at runtime. If your PD shows “How are you?” being translated to “¿Cómo está usted?”, ACT knows to customize the translation to “¿Cómo está usted?”.

Today, professional translators use examples of previous translations to provide more customized translations for customers. Similar to professional translators, Amazon Translate can now provide customized translations by learning from your translation examples.

Traditionally, this customization was done by creating a custom translation model, a special-purpose translation engine built using customer data. Building custom translation models is complex, tedious, and expensive. It requires special expertise to prepare the data for training, testing, and validation. Then you build, deploy, and maintain the model by updating it frequently. To save on model training and management costs, you may choose to delay updating your custom translation model, which means your models are often stale, negatively affecting your custom translation experience. Despite all this work, these custom models perform well only when the translation job is within the domain of your data; they tend to perform worse than a generic model when the translation job is outside the domain of your customization data.

Amazon Translate ACT introduces an innovative way of providing customized translation output on the fly with your parallel data, without building a custom translation model. ACT output quality is always up to date with your PD. ACT provides the best translations for jobs both within the domain and outside the domain of PD. For example, if a source sentence isn’t in the domain of the PD, the translation output is still as good as the generic translation with no significant deterioration in translation quality. You no longer need to go through the tedious process of building and retraining custom translation models for each incoming use case. Just update the PD, and the ACT output automatically adapts to the most recent PD, without needing any retraining.

“Innovation is in our DNA. Our customers look to AWS to lead in customization of machine translation. Current custom translation technology is inefficient, cumbersome, and expensive,” says Marcello Federico, Principal Applied Scientist at Amazon Machine Learning, AWS. “Active Custom Translation allows our customers to focus on the value of their latest data and forget about the lifecycle management of custom translation models. We innovated on behalf of the customer to make custom machine translation easy.”

Don’t just take our word for it

Custom.MT implements machine translation for localization groups and translation companies. Konstantin Dranch, Custom.MT co-founder, shares, “Amazon Translate’s ACT is a breakthrough machine translation setup. A manual engine retraining takes 15–16 work hours, that’s why most language teams in the industry update their engines only once a month or once a quarter. With ACT, retraining is continuous and engines improve every day based on edits by human translators. Even before the feature was released to the market, we saw tremendous interest from leading software localization teams. With a higher quality of machine translation, enterprise teams can save millions of USD in manual translations and improve other KPIs, such as international user engagement and time to market.”

Welocalize is a leading global localization and translation company. Senior Manager of AI Deployments at Welocalize Alex Yanishevsky says, “Welocalize produces high-quality translations, so our customers can transform their content and data to grow globally and expand into international markets. Active Custom Translation from Amazon Translate allows us to customize our translations at runtime and provides us with significant flexibility in our production cycles. In addition, we see great business value and engine quality improvement since we can retrain engines frequently without incurring additional hosting or training charges.”

One Hour Translation is a leading professional language services provider. Yair Tal, CEO of One Hour Translation, says, “The customer demand for customized Neural Machine Translation (NMT) is growing every month because of the cost savings. As one of the first to try Amazon Translate ACT, we have found that ACT provides the best translation output for many language pairs. With ACT, training and maintenance is simple and the Translate API integrates with our system seamlessly. Translate’s pay-as-you-translate pricing helps our clients, both big and small, get translation output that is tailored for their needs without paying to train custom models.”

Building an Active Custom Translation job

Active Custom Translation’s capabilities are built right into the Amazon Translate experience. In this post, we walk you through the step-by-step process of using your data and getting a customized machine translated output securely. ACT is now available on batch translation, so first familiarize yourself with how to create a batch translation job.

You need data to customize your translation for terms or phrases that are unique to a specific domain, such as life sciences, law, or finance. You bring in examples of high-quality translations (source sentences and translated target sentences) in your preferred domain as a file in TMX, TSV, or CSV format. This data must be UTF-8 encoded. You use this data to create a PD, and Amazon Translate uses the PD to customize your machine translation. Each PD file can be up to 1 GB in size. You can upload up to 1,000 PD files per account per Region; this limit can be increased upon request. You get free storage for up to 200 GB of parallel data, and you pay the local Amazon Simple Storage Service (Amazon S3) rate for any data stored beyond that.

For our use case, I have my data in TSV format, and the name of my file is Mydata.tsv. I first upload this file to an S3 location (for this post, I store my data in s3://input-s3bucket/Paralleldata/).

The following table summarizes the contents of the file.

en | es
Amazon Translate is a neural machine translation service. | Amazon Translate es un servicio de traducción automática basado en redes neuronales.
Neural machine translation is a form of language translation automation that uses deep learning models. | La traducción automática neuronal es una forma de automatizar la traducción de lenguajes utilizando modelos de aprendizaje profundo.
How are you? | ¿Cómo está usted?

We run this example in the US West (Oregon) Region, us-west-2.

CreateParallelData

Calling the CreateParallelData API creates a PD resource record in our database and asynchronously starts a workflow for processing the PD file and ingesting it into our service.

CLI

The following CLI commands are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

Run the following CLI command:

aws translate create-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${S3_URI},Format=${FORMAT} \
--region ${REGION}

I use Mydata.tsv to create my PD my-parallel-data-1:

aws translate create-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "CREATING"
}

This means that your PD is being created now.

Run aws translate create-parallel-data help for more information.
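
If you’re scripting in Python rather than using the CLI, the equivalent call with the AWS SDK for Python (Boto3) might look like the following minimal sketch (it assumes the same bucket, file, and PD name as above):

import boto3

translate = boto3.client('translate', region_name='us-west-2')

# Create the parallel data resource from the TSV file uploaded to Amazon S3
response = translate.create_parallel_data(
    Name='my-parallel-data-1',
    Description='Example parallel data for Active Custom Translation',
    ParallelDataConfig={
        'S3Uri': 's3://input-s3bucket/Paralleldata/Mydata.tsv',
        'Format': 'TSV'
    }
)
print(response['Name'], response['Status'])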

Console

To use the Amazon Translate console, complete the following steps:

  1. On the Amazon Translate console, under Customization, choose Parallel data.
  2. Choose Create parallel data.

  3. For Name, enter my-parallel-data-1.
  4. For Parallel data location in S3, enter your S3 location (for this post, s3://input-s3bucket/Paralleldata/Mydata.tsv).
  5. For File format, you can choose CSV, TSV, or TMX. For this post, we choose Tab-separated values (.tsv).

Your data is always secure with Amazon Translate. It’s encrypted using an AWS owned encryption key by default. You can encrypt it using a key from your current account or use a key from a different account.

  6. For this post, for Encryption key, we select Use AWS owned key.
  7. Choose Create parallel data.

ListParallelData

Calling the ListParallelData API returns a list of the parallel data that exists, along with its details (it doesn’t include a pre-signed Amazon S3 URL for downloading the data).

CLI

Run the following CLI command:

aws translate list-parallel-data \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataPropertiesList": [
        {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
                "Format": "TSV"
            },
            "ImportedDataSize": 532,
            "ImportedRecordCount": 3,
            "FailedRecordCount": 0,
            "CreatedAt": 1234567890.406,
            "LastUpdatedAt": 1234567890.675
        }
    ]
}

The "Status": "ACTIVE" means your PD is ready for you to use.

Run aws translate list-parallel-data help for more information.

Console

The following screenshot shows the result for list-parallel-data on the Amazon Translate console.

GetParallelData

Calling the GetParallelData API returns details of the named parallel data and a pre-signed Amazon S3 URL for downloading the data.

CLI

Run the following CLI command:

aws translate get-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--region ${REGION}

For example, my code looks like the following:

aws translate get-parallel-data \
--name my-parallel-data-1 \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataProperties": {
        "Name": "my-parallel-data-1",
        "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
        "Status": "ACTIVE",
        "SourceLanguageCode": "en",
        "TargetLanguageCodes": [
            "es"
        ],
        "ParallelDataConfig": {
            "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
            "Format": "TSV"
        },
        "ImportedDataSize": 532,
        "ImportedRecordCount": 3,
        "FailedRecordCount": 0,
        "CreatedAt": 1234567890.406,
        "LastUpdatedAt": 1234567890.675
    },
    "DataLocation": {
        "RepositoryType": "S3",
        "Location": "xxx"
    }
}

“Location” contains the pre-signed Amazon S3 URL for downloading the data.
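
If you want to download the parallel data file programmatically, a minimal Boto3 sketch might look like the following (the local file name is an arbitrary choice):

import boto3
import urllib.request

translate = boto3.client('translate', region_name='us-west-2')

# Fetch the parallel data details, including the pre-signed download URL
pd_details = translate.get_parallel_data(Name='my-parallel-data-1')
presigned_url = pd_details['DataLocation']['Location']

# Download the parallel data file locally; the pre-signed URL expires after a short time
urllib.request.urlretrieve(presigned_url, 'my-parallel-data-1.tsv')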

Run aws translate get-parallel-data help for more information.

Console

On the Amazon Translate console, choose one of the PD files on the Parallel data page.

You’re directed to another page that includes the detail for this parallel data file. The following screenshot shows the details for get-parallel-data.

UpdateParallelData

Calling the UpdateParallelData API replaces the old parallel data with the new one.

CLI

Run the following CLI command:

aws translate update-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${NEW_S3_URI},Format=${FORMAT} \
--region us-west-2

For this post, Mydata1.tsv is my new parallel data. My code looks like the following:

aws translate update-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata1.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "ACTIVE",
    "LatestUpdateAttemptStatus": "UPDATING",
    "LatestUpdateAttemptAt": 1234567890.844
}

The "LatestUpdateAttemptStatus": "UPDATING" means your parallel data is being updated now.

Wait a few minutes and run get-parallel-data again. You can see that the parallel data has been updated, as in the following code:

{
    "ParallelDataProperties": {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata1.tsv",
                "Format": "TSV"
            },
        ...
    }
}

We can see that the parallel data has been updated from Mydata.tsv to Mydata1.tsv.

Run aws translate update-parallel-data help for more information.

Console

On the Amazon Translate console, choose the parallel data file and choose Update.

You can replace the existing parallel data file with a new one by specifying the new Amazon S3 URL.

Creating your first Active Custom Translation job

In this section, we discuss the different ways you can create your ACT job.

StartTextTranslationJob

Calling the StartTextTranslationJob API starts a batch translation job. When you add parallel data to a batch translation job, you create an ACT job. Amazon Translate customizes your ACT output to match the style, tone, and word choices it finds in your PD. ACT is a premium feature, so see Amazon Translate pricing for pricing information. You can specify only one parallel data file to use with a text translation job.

CLI

Run the following command:

aws translate start-text-translation-job \
--input-data-config ContentType=${CONTENT_TYPE},S3Uri=${INPUT_S3_URI} \
--output-data-config S3Uri=${OUTPUT_S3_URI} \
--data-access-role-arn ${DATA_ACCESS_ROLE} \
--source-language-code=${SOURCE_LANGUAGE_CODE} --target-language-codes=${TARGET_LANGUAGE_CODE} \
--parallel-data-names ${PARALLEL_DATA_NAME} \
--region ${REGION} \
--job-name ${JOB_NAME}

For example, my code looks like the following:

aws translate start-text-translation-job \
--input-data-config ContentType=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,S3Uri=s3://input-s3bucket/inputfile/ \
--output-data-config S3Uri=s3://output-s3bucket/Output/ \
--data-access-role-arn arn:aws:iam::123456789012:role/TranslateBatchAPI \
--source-language-code=en --target-language-codes=es \
--parallel-data-names my-parallel-data-1 \
--region us-west-2 \
--job-name ACT1

You get a response like the following code:

{
    "JobId": "4446f95f20c88a4b347449d3671fbe3d",
    "JobStatus": "SUBMITTED"
}

This output means the job has been submitted successfully.

Run aws translate start-text-translation-job help for more information.
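
If you prefer the AWS SDK for Python (Boto3), a rough equivalent of the preceding CLI call, plus a status check, might look like the following sketch (the same bucket names, role, and PD name are assumed):

import boto3

translate = boto3.client('translate', region_name='us-west-2')

# Start a batch translation job that uses the parallel data for ACT
job = translate.start_text_translation_job(
    JobName='ACT1',
    InputDataConfig={
        'S3Uri': 's3://input-s3bucket/inputfile/',
        'ContentType': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
    },
    OutputDataConfig={'S3Uri': 's3://output-s3bucket/Output/'},
    DataAccessRoleArn='arn:aws:iam::123456789012:role/TranslateBatchAPI',
    SourceLanguageCode='en',
    TargetLanguageCodes=['es'],
    ParallelDataNames=['my-parallel-data-1']
)
print(job['JobId'], job['JobStatus'])

# Check on the job later; the status moves from SUBMITTED to IN_PROGRESS to COMPLETED
details = translate.describe_text_translation_job(JobId=job['JobId'])
print(details['TextTranslationJobProperties']['JobStatus'])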

Console

For instructions on running a batch translation job on the Amazon Translate console, see Translating documents, spreadsheets, and presentations in Office Open XML format using Amazon Translate. Choose my-parallel-data-1 as the parallel data to create your first ACT job, ACT1.

Congratulations! You have created your first ACT job. ACT is available in the following Regions:

  • US East (Northern Virginia)
  • US West (Oregon)
  • Europe (Ireland)

Running your Active Custom Translation job

ACT works on asynchronous batch translation for language pairs that have English as either the source or target language.

Now, let’s try to translate the following text from English to Spanish and see how ACT helps to customize the output:

“How are you?” is one of the most common questions you’ll get asked when meeting someone. The most common response is “good”

The following is the output you get when you translate without any customization:

“¿Cómo estás?” es una de las preguntas más comunes que se le harán cuando conozca a alguien. La respuesta más común es “Buena”

The following is the output you get when you translate using ACT with my-parallel-data-1 as the PD:

“¿Cómo está usted?” es una de las preguntas más comunes que te harán cuando te reúnas con alguien. La respuesta más común es “Buena”

Conclusion

Amazon Translate ACT introduces a powerful way of providing personalized translation output with the following benefits:

  • You don’t have to build a custom translation model
  • You only pay for what you translate using ACT
  • There is no additional model building or model hosting cost
  • Your data is always secure and always under your control
  • You get the best machine translation even when your source text is outside the domain of your parallel data
  • You can update your parallel data as often as you need for no additional cost

Try ACT today. Bring your parallel data and start customizing your machine translation output. For more information about Amazon Translate ACT, see Asynchronous Batch Processing.

About the Authors

Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.

Xingyao Wang is a Software Development Engineer for Amazon Translate, AWS’s natural language processing service. She likes to hang out with her cats at home.

Getting started with Amazon Kendra ServiceNow Online connector

Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). To make it simple to search data across multiple content repositories, Amazon Kendra offers a number of native data source connectors to help get your documents easily ingested and indexed.

This post describes how you can use the Amazon Kendra ServiceNow connector. To allow the connector to access your ServiceNow site, you need to know your ServiceNow version, the Amazon Kendra index, the ServiceNow host URL, and the credentials of a user with the ServiceNow admin role attached to it. The ServiceNow credentials needed for the Amazon Kendra ServiceNow connector to work are securely stored in AWS Secrets Manager, and can be entered during the connector setup.

Currently, Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition for building proof of concepts (POCs), and the Amazon Kendra Enterprise Edition. Amazon Kendra connectors work with both these editions.

The Amazon Kendra ServiceNow Online connector indexes Service Catalog items and public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

Prerequisites

To get started, you need the following:

  • The ServiceNow host URL
  • Username and Password of a user with the admin role
  • Your ServiceNow version

The user that you use for the connector needs to have the admin role in ServiceNow. This is defined on ServiceNow’s User Administration page (see the section Insufficient Permissions for more information).

When setting up the ServiceNow connector, we need to define whether our build is London or a different ServiceNow version. To obtain our build name, we can go to the System Diagnostics menu and choose Stats.

In the following screenshot, my build name is Orlando, so I indicate on the connector that my version is Others.

Creating a ServiceNow connector in the Amazon Kendra console

The following section describes the process of deploying an Amazon Kendra index and configuring a ServiceNow connector.

  1. Create your index. For instructions, see Getting started with the Amazon Kendra SharePoint Online connector.

If you already have an index, you can skip this step.

The next step is to set up the data sources. One of the advantages of implementing Amazon Kendra is that you can use a set of pre-built connectors for data sources, such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), Salesforce, ServiceNow, and SharePoint Online, among others.

For this post, we use the ServiceNow connector.

  2. On the Amazon Kendra console, choose Indexes.
  3. Choose MyServiceNowindex.
  4. Choose Add data sources.

  5. Choose ServiceNow Online.

  6. For Name, enter a connector name.
  7. For Description, enter an optional description.
  8. For Tags, you can optionally assign tags to your data source.
  9. Choose Next.

In this next step, we define targets.

  10. For ServiceNow host, enter the host name.
  11. For ServiceNow version, enter your version (for this post, we choose Others).
  12. For IAM role, we can create a new AWS Identity and Access Management (IAM) role or use an existing one.

For more information about the four functions this role performs, see IAM role for ServiceNow data sources.

If you use an existing IAM role, you have to grant it permission to access this secret in Secrets Manager. If you create a new IAM role and a new secret, no further action is required.

  13. Choose Next.

You then need to define ServiceNow authentication details, the content to index, and the synchronization schedule.

The ServiceNow user you provide for the connector needs to have the admin role.

  14. In the Authentication section, for Type of authentication, choose an existing secret or create a new one. For this post, we choose New.
  15. Enter your secret’s name, username, and password.

  16. In the ServiceNow configuration section, we define the content types we need to index: Knowledge articles, Service catalog items, or both.
  17. You also define whether to include the item attachments.

Amazon Kendra only indexes public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

  18. You can include or exclude some file extensions (for example, for Microsoft Word, we have six different types of extensions).

  19. For Frequency, choose Run on demand.

  20. Add field mappings.

Even though this is an optional step, it’s a good idea to add this extra layer of metadata to our documents from ServiceNow. The metadata enables you to improve accuracy through manual tuning, filtering, and faceting. There is no way to add metadata to already ingested documents, so if you want to add metadata later, you need to delete this data source, recreate it with the metadata mappings, and ingest your documents again.

If you map fields through the console when setting up the ServiceNow connector for the first time, these fields are created automatically. If you configure the connector via the API, you need to update your index first and define those new fields.

You can map ServiceNow properties to Amazon Kendra index fields. The following table is the list of fields that we can map.

ServiceNow Field Name Suggested Amazon Kendra Field Name
content _document_body
displayUrl sn_display_url
first_name sn_ka_first_name
kb_category sn_ka_category
kb_category_name _category
kb_knowledge_base sn_ka_knowledge_base
last_name sn_ka_last_name
number sn_kb_number
published sn_ka_publish_date
repItemType sn_repItemType
short_description _document_title
sys_created_by sn_createdBy
sys_created_on _created_at
sys_id sn_sys_id
sys_updated_by sn_updatedBy
sys_updated_on _last_updated_at
url sn_url
user_name sn_ka_user_name
valid_to sn_ka_valid_to
workflow_state sn_ka_workflow_state

Even though there are suggested Amazon Kendra field names, you can map a field to a different name.

The following table summarizes the available service catalog fields.

ServiceNow Field Name Suggested Amazon Kendra Field Name
category sn_sc_category
category_full_name sn_sc_category_full_name
category_name _category
description _document_body
displayUrl sn_display_url
repItemType sn_repItemType
sc_catalogs sn_sc_catalogs
sc_catalogs_name sn_sc_catalogs_name
short_description _document_body
sys_created_by sn_createdBy
sys_created_on _created_at
sys_id sn_sys_id
sys_updated_by sn_updatedBy
sys_updated_on _last_updated_at
title _document_title
url sn_url

For this post, our Amazon Kendra index has a custom index field called MyCustomUsername, which you can use to map the Username field from different data sources. This custom field was created under the index’s facet definition. The following screenshot shows a custom mapping.

  21. Review the settings and choose Create data source.

After your ServiceNow data source is created, you see a banner similar to the following screenshot.

  22. Choose Sync now to start the syncing and document ingestion process.

If everything goes as expected, you can see the status as Succeeded.

Testing

Now that you have synced your ServiceNow site, you can test it on the Amazon Kendra search console.

In my case, my ServiceNow site has the demo examples, so I asked “what is the storage on the ipad 3,” which returned information from a service catalog item.

Creating a ServiceNow connector with Python

We saw how to create an index on the Amazon Kendra console; now we create a new Amazon Kendra index and a ServiceNow connector and sync it by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

My personal preference to test my Python scripts is to spin up an Amazon SageMaker notebook instance, a fully managed ML Amazon Elastic Compute Cloud (Amazon EC2) instance that runs the Jupyter Notebook app. For instructions, see Create an Amazon SageMaker Notebook Instance.

To create an index using the AWS SDK, we need to have the policy AmazonKendraFullAccess attached to the role we use.

Also, Amazon Kendra requires different roles to operate:

  • IAM roles for indexes, which are needed by Amazon Kendra to write to Amazon CloudWatch Logs.
  • IAM roles for data sources, which are needed when we use the CreateDataSource API. These roles require a specific set of permissions depending on the connector we use. Because we use a ServiceNow data source, the role must provide permissions to:
    • Access the secret in Secrets Manager where the ServiceNow Online credentials are stored.
    • Use the AWS Key Management Service (AWS KMS) customer master key (CMK) that Secrets Manager uses to decrypt the credentials.
    • Use the BatchPutDocument and BatchDeleteDocument operations to update the index.

For more information, see IAM access roles for Amazon Kendra.

Our current requirements are:

  • An Amazon SageMaker notebook execution role with permission to create an Amazon Kendra index from the notebook
  • Amazon Kendra IAM role for CloudWatch
  • Amazon Kendra IAM role for ServiceNow connector
  • ServiceNow credentials stored on Secrets Manager

To create an index, we use the following code:

import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print("Creating an index")

description = "<YOUR_INDEX_DESCRIPTION>"
index_name = "<YOUR_NEW_INDEX_NAME>"
role_arn = "<KENDRA_ROLE_WITH_CLOUDWATCH_PERMISSIONS_ROLE>"

try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = role_arn,
        Edition = "DEVELOPER_EDITION",
        Tags = [
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )

    pprint.pprint(index_response)

    index_id = index_response['Id']

    print("Wait for Kendra to create the index.")

    while True:
        # Get the index description
        index_description = kendra.describe_index(
            Id = index_id
        )
        # If the status is no longer CREATING, stop polling
        status = index_description["Status"]
        print("    Creating index. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

except ClientError as e:
    print("%s" % e)

print("Done creating index.")

While our index is being created, we obtain regular updates (every 60 seconds, as set by the time.sleep(60) call) until the process is finished. See the following code:

Creating an index
 {'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
  'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                       'content-type': 'application/x-amz-json-1.1',
                                       'date': 'Wed, 12 Aug 2020 12:58:19 GMT',
                                       'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                       'HTTPStatusCode': 200,
                       'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                       'RetryAttempts': 2}}
 Wait for Kendra to create the index.
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: ACTIVE
 Done creating index

The preceding code indicates that our index has been created and our new index ID is 3311b507-bfef-4e2b-bde9-7c297b1fd13b (your ID will be different from our example code). This information is returned as Id in the response.

Our Amazon Kendra index is up and running now.

If you have metadata attributes associated with your ServiceNow articles, you want to do three things:

  1. Determine the Amazon Kendra attribute name you want for each of your ServiceNow metadata attributes. By default, Amazon Kendra has six reserved fields (_category, _created_at, _file_type, _last_updated_at, _source_uri, and _view_count).
  2. Update the index with the UpdateIndex API call with the Amazon Kendra attribute names.
  3. Map each ServiceNow metadata attribute to each Amazon Kendra metadata attribute.

You can find a table with metadata attributes and the suggested Amazon Kendra fields under step 20 in the previous section.

For this post, I have the metadata attribute UserName associated with my ServiceNow article and I want to map it to the field MyCustomUsername on my index. The following code shows how to add the attribute MyCustomUsername to my Amazon Kendra index. After we create this custom field in our index, we map our field Username from ServiceNow to it. See the following code:

try:
    update_response = kendra.update_index(
        Id = '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
        RoleArn = 'arn:aws:iam::<MY-ACCOUNT-NUMBER>:role/service-role/AmazonKendra-us-east-1-KendraRole',
        DocumentMetadataConfigurationUpdates = [
            {
                'Name': 'MyCustomUsername',
                'Type': 'STRING_VALUE',
                'Search': {
                    'Facetable': True,
                    'Searchable': True,
                    'Displayable': True
                }
            }
        ]
    )
    pprint.pprint(update_response)
except ClientError as e:
    print('%s' % e)

If everything goes well, we receive a 200 response:

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 12 Aug 2020 12:17:07 GMT',
                                      'x-amzn-requestid': '3eba66c9-972b-4757-8d92-37be17c8f8a2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3eba66c9-972b-4757-8d92-37be17c8f8a2',
                      'RetryAttempts': 0}}

We also need the GetSecretValue permission for our secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store your ServiceNow credentials, make sure the role you use has permissions to create and tag secrets in Secrets Manager. The policy should look like the following code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerWritePolicy",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:UntagResource",
                "secretsmanager:CreateSecret",
                "secretsmanager:TagResource"
            ],
            "Resource": "*"
        }
    ]
}

The following code creates a secret in Secrets Manager:

secretsmanager = boto3.client('secretsmanager')

SecretName = "<YOUR_SECRET_NAME>"
ServiceNowCredentials = '{"username": "<YOUR_SERVICENOW_SITE_USERNAME>", "password": "<YOUR_SERVICENOW_SITE_PASSWORD>"}'

try:
    create_secret_response = secretsmanager.create_secret(
        Name = SecretName,
        Description = 'Secret for a ServiceNow data source connector',
        SecretString = ServiceNowCredentials,
        Tags = [
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )
    pprint.pprint(create_secret_response)
except ClientError as e:
    print('%s' % e)

If everything goes well, you get a response with your secret’s ARN:

{'ARN':<YOUR_SECRETS_ARN>,
 'Name': <YOUR_SECRET_NAME>,
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '161',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 14:44:13 GMT',
                                      'x-amzn-requestid': '68c9a153-c08e-42df-9e6d-8b82550bc412'},
                      'HTTPStatusCode': 200,
                      'RequestId': '68c9a153-c08e-42df-9e6d-8b82550bc412',
                      'RetryAttempts': 0},
 'VersionId': 'bee74dab-6beb-4723-a18b-4829d527aad8'}

Now that we have our Amazon Kendra index, our custom field, and our ServiceNow credentials, we can proceed with creating our data source.

To ingest documents from this data source, we need an IAM role with kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM roles for ServiceNow data sources. We use the ARN for this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. It should look like the following code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kendra.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The following code is the policy structure we need:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:region:account ID:secret:secret ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:region:account ID:key/key ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kendra:BatchPutDocument",
                "kendra:BatchDeleteDocument"
            ],
            "Resource": [
                "arn:aws:kendra:<REGION>:<account_ID>:index/<index_ID>"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "kendra.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_NAME>/*"
            ]
        }
    ]
}

Finally, the following code is my role’s ARN:

arn:aws:iam::<MY_ACCOUNT_NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index, and read the secrets to connect to our ServiceNow site.

One detail we can specify when creating a data source is the sync schedule, which indicates how often our index syncs with the data source we create. This schedule is defined on the Schedule key of our request. You can use schedule expressions for rules to define how often you want to sync your data source. For this post, I use the ScheduleExpression 'cron(0 11 * * ? *)', which means that our data source is synced every day at 11:00 AM.
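
The expression uses the standard six-field AWS cron syntax (minutes, hours, day of month, month, day of week, year). A couple of alternative values are shown below purely as illustrations:

# Sync every day at 11:00 UTC (the value used in this post)
ScheduleExpression = 'cron(0 11 * * ? *)'

# Sync every 6 hours, on the hour
# ScheduleExpression = 'cron(0 0/6 * * ? *)'

# Sync every Monday at 08:30 UTC
# ScheduleExpression = 'cron(30 8 ? * MON *)'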

I use the following code. Make sure you match your SiteURL and SecretARN, as well as your IndexID. Additionally, FieldMappings is where you map between the ServiceNow attribute names with the Amazon Kendra index attribute names. I chose the same attribute name in both, but you can call the Amazon Kendra attribute whatever you’d like.

For more details, see  create_data_source(**kwargs).

print('Create a data source')
 
SecretArn= "<YOUR_SERVICENOW_ONLINE_USER_AND_PASSWORD_SECRETS_ARN>"
SiteUrl = "<YOUR_SERVICENOW_SITE_URL>"
DSName= "<YOUR_NEW_DATASOURCE_NAME>"
IndexId= "<YOUR_INDEX_ID>"
DSRoleArn= "<YOUR_DATA_SOURCE_ROLE>"
ScheduleExpression='cron(0 11 * * ? *)'
try:
    datasource_response = kendra.create_data_source(
    Name=DSName,
    IndexId=IndexId,        
    Type='SERVICENOW',
    Configuration={
        'ServiceNowConfiguration': {
            'HostUrl': SiteUrl,
            'SecretArn': SecretArn,
            'ServiceNowBuildVersion': 'OTHERS',
            'KnowledgeArticleConfiguration': 
            {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'content',
                'DocumentTitleFieldName': 'short_description',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'kb_category_name',
                        'IndexFieldName': '_category'
                    },
                    {
                        'DataSourceFieldName': 'sys_created_by',
                        'IndexFieldName': 'MyCustomUsername'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
            'ServiceCatalogConfiguration': {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'description',
                'DocumentTitleFieldName': 'title',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'category_name',
                        'IndexFieldName': '_category'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
        },
    },
    Description='My ServiceNow Datasource',
    RoleArn=DSRoleArn,
    Schedule=ScheduleExpression,
    Tags=[
        {
            'Key': 'Project',
            'Value': 'ServiceNow Test'
        }
    ])
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get index description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=IndexId
        )
        # If status is not CREATING quit
        status = datasource_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)    

except  ClientError as e:
        pprint.pprint('%s' % e)
pprint.pprint(datasource_response)

If everything goes well, we should receive a 200 status response:

Create a DataSource
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}
Waiting for Kendra to create the DataSource.
    Creating index. Status: CREATING
    Creating index. Status: ACTIVE
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}

Even though we have defined a schedule for syncing my data source, we can sync on demand by using the method start_data_source_sync_job:

DSId = "<YOUR_DATA_SOURCE_ID>"
IndexId = "<YOUR_INDEX_ID>"

try:
    ds_sync_response = kendra.start_data_source_sync_job(
        Id = DSId,
        IndexId = IndexId
    )
except ClientError as e:
    print('%s' % e)

pprint.pprint(ds_sync_response)

The response should look like the following code:

{'ExecutionId': '3d11e6ef-3b9e-4283-bf55-f29d0b10e610',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:52:36 GMT',
                                      'x-amzn-requestid': '55d94329-50af-4ad5-b41d-b173f20d8f27'},
                      'HTTPStatusCode': 200,
                      'RequestId': '55d94329-50af-4ad5-b41d-b173f20d8f27',
                      'RetryAttempts': 0}}
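
To check the progress of the sync, we can list the sync jobs for the data source. The following is a minimal sketch that reuses the kendra client, DSId, and IndexId variables defined earlier:

try:
    sync_history = kendra.list_data_source_sync_jobs(
        Id = DSId,
        IndexId = IndexId
    )
    # Each entry includes the execution ID and a status such as SYNCING or SUCCEEDED
    for sync_job in sync_history['History']:
        print(sync_job['ExecutionId'], sync_job['Status'])
except ClientError as e:
    print('%s' % e)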

Testing

Finally, we can query our index. See the following code:

response = kendra.query(
    IndexId = '<YOUR_INDEX_ID>',
    QueryText = 'Is there a service that has 11 9s of durability?'
)
if response['TotalNumberOfResults'] > 0:
    print(response['ResultItems'][0]['DocumentExcerpt']['Text'])
    print("More information: " + response['ResultItems'][0]['DocumentURI'])
else:
    print('No results found, please try a different search term.')
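
Because we mapped the ServiceNow Username field to the custom index field MyCustomUsername, we can also filter queries on that attribute. The following is a sketch; the query text and the username value are made-up examples:

# Query filtered on the custom metadata field defined earlier
filtered_response = kendra.query(
    IndexId = '<YOUR_INDEX_ID>',
    QueryText = 'password reset',
    AttributeFilter = {
        'EqualsTo': {
            'Key': 'MyCustomUsername',
            'Value': {'StringValue': 'System Administrator'}
        }
    }
)
print('Matching results: %d' % filtered_response['TotalNumberOfResults'])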

Common errors

In this section, we discuss errors that may occur, whether using the Amazon Kendra console or the Amazon Kendra API.

You should look at CloudWatch logs and error messages returned in the Amazon Kendra console or via the Amazon Kendra API. The CloudWatch logs help you determine the reason for a particular error, whether you experience it using the console or programmatically.

Common errors when trying to access ServiceNow as a data source are:

  • Insufficient permissions
  • Invalid credentials
  • Secrets Manager error

Insufficient permissions

A common scenario you may come across is when you have the right credentials but your user doesn’t have enough permissions for the Amazon Kendra ServiceNow connector to crawl your knowledge base and service catalog items.

You receive the following error message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Sep 12, 2020, 1:08 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

If you can log in to your ServiceNow instance, make sure that the user you designated for the connector has the admin role.

  1. On your ServiceNow instance, under User Administration, choose Users.

  2. On the users list, choose the user ID of the user you want to use for the connector.

  3. On the Roles tab, verify that your user has the admin role.

  4. If you don’t have that role attached to your user, choose Edit to add it.

  5. On the Amazon Kendra console, on your connector configuration page, choose Sync now.

Invalid credentials

You may encounter an error with the following message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Jul 28, 2020, 3:59 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

To investigate, complete the following steps:

  1. Choose the error message to review the CloudWatch logs.

You’re redirected to CloudWatch Logs Insights.

  2. Choose Run Query to start analyzing the logs.

We can verify our credentials by going to Secrets Manager and reviewing our credentials stored in the secret.

  3. Choose your secret name.

  4. Choose Retrieve secret value.

  5. If your password doesn’t match, choose Edit.

  6. Update the username or password and choose Save.

  7. Go back to your data source in Amazon Kendra, and choose Sync now.

Secrets Manager error

You may encounter an error stating that the customer’s secret can’t be fetched. This may happen if you use an existing secret and the IAM role used for syncing your ServiceNow data source doesn’t have permissions to access the secret.

To address this issue, first we need our secret’s ARN.

  1. On the Secrets Manager console, choose your secret’s name (for this post, AmazonKendra-ServiceNow-demosite).
  2. Copy the secret’s ARN.
  3. On the IAM console, search for the role we use to sync our ServiceNow data source (for this post, AmazonKendra-servicenow-role).
  4. For Permissions, choose Add inline policy.
  5. Following the least privilege principle, for Service, choose Secrets Manager.
  6. For Access Level, choose Read and GetSecretValue.
  7. For Resources, enter our secret’s ARN.

Your settings should look similar to the following screenshot.

  8. Enter a name for your policy.
  9. Choose Create Policy.
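
For reference, the resulting inline policy should resemble the following sketch (the secret ARN is a placeholder; use the ARN you copied earlier):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:<YOUR_SECRET_ID>"
            ]
        }
    ]
}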

After your policy has been created and attached to your data source role, try to sync again.

Conclusion

You have now learned how to ingest the documents from your ServiceNow site into your Amazon Kendra index. We hope this post helps you take advantage of the intelligent search capabilities in Amazon Kendra to find accurate answers from your enterprise content.

For more information about Amazon Kendra, see AWS re:Invent 2019 – Keynote with Andy Jassy on YouTube, Amazon Kendra FAQs, and What is Amazon Kendra?

About the Authors

David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Amazon Augmented AI is now a HIPAA eligible service

Amazon Augmented AI (Amazon A2I) is now a HIPAA eligible service. Amazon A2I makes it easy to build the workflows required for human review of machine learning (ML) predictions. HIPAA eligibility applies to AWS Regions where the service is available, and means you can use Amazon A2I to add human review of protected health information (PHI) to power your healthcare workflows through your private workforce.

If you have a HIPAA Business Associate Addendum (BAA) in place with AWS, you can now start using Amazon A2I for your HIPAA eligible workloads. If you don’t have a BAA in place with AWS, or if you have any other questions about running HIPAA-regulated workloads on AWS, please contact us. For information and best practices about configuring AWS HIPAA eligible services to store, process, and transmit PHI, see the Architecting for HIPAA Security and Compliance on Amazon Web Services whitepaper.

Amazon A2I makes it easy to build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems and managing large numbers of human reviewers. Many healthcare customers like Change Healthcare, EPL Innovative Solutions, and partners like TensorIoT are already exploring new ways to use the power of ML to automate their current workloads and transform how they provide care to patients and use AWS services to help them meet their compliance needs under HIPAA. 

Change Healthcare

Change Healthcare is a leading independent healthcare technology company that provides data and analytics-driven solutions to improve clinical, financial, and patient engagement outcomes in the US healthcare system.

“At Change Healthcare, we help accelerate healthcare’s transformation by innovating to remove inefficiencies, reduce costs, and improve outcomes. We have a robust set of integrated artificial intelligence engines that bring new insights, impact, and innovation to the industry. Critical to our results is enabling human-in-the-loop to understand our data and automate workflows. Amazon A2I makes it easy to build the workflows required for human review of ML predictions. With Amazon A2I becoming HIPAA eligible, we are able to involve the human in the workflow and decision-making process, helping to increase efficiency with the millions of documents we process to create even more value for patients, payers, and providers.”

—Luyuan Fang, Chief AI Officer, Change Healthcare

TensorIoT

TensorIoT was founded on the instinct that the majority of compute is moving to the edge and all things are becoming smarter. TensorIoT is creating solutions to simplify the way enterprises incorporate Intelligent Edge Computing devices, AI, and their data into their day-to-day operations.

“TensorIoT has been working with customers to build document ingestion pipelines since Amazon Textract was in preview. Amazon A2I helps us easily add human-in-the-loop for document workflows, and one of the most frequently requested features from our healthcare customers is the ability to handle protected health information. Now with Amazon A2I being added to HIPAA eligible services, our healthcare customers will also be able to significantly increase the ingestion speed and accuracy of documents to provide insights and drive better outcomes for their doctors and patients.”

—Charles Burden, Head of Business Development, TensorIoT, AWS Advanced Consulting Partner

EPL Innovative Solutions

EPL Innovative Solutions, charged with the mission of “Changing Healthcare,” is an orchestrated effort, based on decades of experience in various healthcare theaters across the nation, to assess systems, identify shortcomings, plan and implement strategies, and lead the process of change for the betterment of the patient, provider, organization, or community experience.

“At EPL Innovative Solutions, we are excited to add Amazon A2I to our revolutionary cloud-based platform to serve healthcare clients that rely on us for HIPAA secure, accurate, and efficient medical coding and auditing services. We needed industry experts to optimize our platform, so we reached out to Belle Fleur Technologies as our AWS Partner for the execution of this solution to allow 100% human-in-the-loop review to meet our industry-leading standards of speed and accuracy.”

—Amanda Donoho, COO, EPL Innovative Solutions

Summary

Amazon A2I helps you automate your human review workloads and is now a HIPAA eligible service. For video presentations, sample Jupyter notebooks, or more information about use cases like document processing, object detection, sentiment analysis, text translation, and others, see Amazon Augmented AI Resources.

About the Author

Anuj Gupta is the Product Manager for Amazon Augmented AI. He is focusing on delivering products that make it easier for customers to adopt machine learning. In his spare time, he enjoys road trips and watching Formula 1.

How DeepMap optimizes their video inference workflow with Amazon SageMaker Processing

Although we might think the world is already sufficiently mapped thanks to global satellite imagery and street views, that mapping is far from complete because much of the world is still uncharted territory. Existing maps are designed for humans and can’t be consumed by autonomous vehicles, which need a very different kind of map with much higher precision.

DeepMap, a Palo Alto startup, is the leading technology provider of HD mapping and localization services for autonomous vehicles. These two services are integrated to provide high-precision localization maps, down to centimeter-level precision. This demands processing a high volume of data to maintain precision and localization accuracy. In addition, road conditions can change minute to minute, so the maps guiding self-driving cars have to update in real time. DeepMap draws on years of experience in mapping server development and uses the latest big data, machine learning (ML), and AI technology to build out its video inferencing and mapping pipeline.

In this post, we describe how DeepMap is revamping their video inference workflow by using Amazon SageMaker Processing, a customizable data processing and model evaluation feature, to streamline their workload by reducing complexity, processing time, and cost. We start out by describing the current challenges DeepMap is facing. Then we go over the proof of concept (POC) implementation and production architecture for the new solution using Amazon SageMaker Processing. Finally, we conclude with the performance improvements they achieved with this solution.

Challenges

DeepMap’s current video inference pipeline needs to process large amounts of video data collected by their cars, which are equipped with cameras and LIDAR laser scanning devices and drive on streets and freeways to collect video and image data. It’s a complicated, multi-step, batch processing workflow. The following diagram shows the high-level architecture view of the video processing and inferencing workflow.


With their previous workflow architecture, DeepMap discovered a scalability issue that increased processing time and cost due to the following reasons:

  • Multiple steps and separate batch processing stages, which could be error-prone and interrupt the workflow
  • Additional storage required in Amazon Simple Storage Service (Amazon S3) for intermediate steps
  • Sequential processing steps that prolonged the total time to complete inference

DeepMap’s infrastructure team recognized the issue and approached the AWS account team for guidance on how to better optimize their workflow using AWS services. The problem was originally presented as a workflow orchestration issue, and a few AWS services for workflow orchestration were proposed and discussed.

However, after a further deep dive into their objectives and requirements, they determined that these services addressed the multi-step coordination issue but not the storage and performance optimization objectives. Also, DeepMap wanted to keep the solution within the Amazon SageMaker ML ecosystem, if possible. After a debrief and further engagement with the Amazon SageMaker product team, a recently released Amazon SageMaker feature—Amazon SageMaker Processing—was proposed as a viable and potentially best-fit solution to the problem.

Amazon SageMaker Processing comes to the rescue

Amazon SageMaker Processing lets you easily run preprocessing, postprocessing, and model evaluation workloads on fully managed infrastructure. Beyond its general data processing capabilities, Amazon SageMaker Processing is particularly attractive for DeepMap's problem at hand because of its flexibility in the following areas:

  • Setting up your customized container environment, also known as bring your own container (BYOC)
  • Custom coding to integrate with other application APIs that reside in your VPCs

These were the key functionalities DeepMap was looking for to redesign and optimize their current inference workflow. They quickly agreed on a proposal to explore Amazon SageMaker Processing and move forward as a proof of concept (POC).

Solution POC

The following diagram shows the POC architecture, which illustrates how a container in Amazon SageMaker Processing can make real-time API calls to a private VPC endpoint. The full architecture of the new video inference workload is depicted in the next section.

 

The POC demonstration includes the following implementation details:

  • Sample source data – Two video files (from car view). The following images show examples of Paris and Beijing streets.


  • Data stores – Two S3 buckets:
    • Source video bucket – s3://sourcebucket/input
    • Target inference result bucket – s3://targetbucket/output
  • Custom container – An AWS pre-built deep learning container based on MXNet, with other needed packages and the pretrained model added.
  • Model – A pre-trained semantic segmentation GluonCV model from the GluonCV model zoo. GluonCV provides implementations of state-of-the-art deep learning algorithms in computer vision. It aims to help engineers, researchers, and students quickly prototype products, validate new ideas, and learn computer vision. The GluonCV model zoo contains six kinds of pretrained models: classification, object detection, segmentation, pose estimation, action recognition, and depth prediction. For this post, we use deeplab_resnet101_citys, which was trained on the Cityscapes dataset and focuses on semantic understanding of urban street scenes, making it suitable for car-view images. The following images are a sample of segmentation inference; we can see the model assigned red for people and blue for cars.


  • Amazon SageMaker Processing environment – Two instances (p3.2xlarge) configured for private access to the VPC API endpoint.
  • Mock API server – A web server in a private VPC mimicking DeepMap’s video indexing APIs. When invoked, it responds with a “Hello, Builders!” message.
  • Custom processing script – An API call to the mock API endpoint in the private VPC to extract frames from the videos, perform segmentation model inference on the frames, and store the results.

Amazon SageMaker Processing launches the instances you specified, downloads the container image and datasets, runs your script, and uploads the results to the S3 bucket automatically. We use the Amazon SageMaker Python SDK to launch the processing job. See the following code:

from sagemaker.network import NetworkConfig
from sagemaker.processing import (ProcessingInput, ProcessingOutput,
                                  ScriptProcessor)

instance_count = 2

# This network_config enables VPC mode so that the processing instances can access
# resources inside your VPC. Replace with your own security group and subnet IDs, e.g.:
# security_group_ids = ['YOUR_SECURITY_GROUP_ID']
# subnets = ['YOUR_SUBNET_ID1', 'YOUR_SUBNET_ID2']
security_group_ids = vpc_security_group_ids
subnets = vpc_subnets

network_config = NetworkConfig(enable_network_isolation=False, 
                               security_group_ids=security_group_ids, 
                               subnets=subnets)

video_formats = [".mp4", ".avi"]
image_width = 1280
image_height = 720
frame_time_interval = 1000

script_processor = ScriptProcessor(command=['python3'],
                image_uri=processing_repository_uri,
                role=role,
                instance_count=instance_count,
                instance_type='ml.p3.2xlarge',
                network_config=network_config)

# with S3 shard key
script_processor.run(code='processing.py',
                      inputs=[ProcessingInput(
                        source=input_data,
                        destination='/opt/ml/processing/input_data',
                        s3_data_distribution_type='ShardedByS3Key')],
                      outputs=[ProcessingOutput(destination=output_data,
                                                source='/opt/ml/processing/output_data',
                                                s3_upload_mode = 'Continuous')],
                      arguments=['--api_server_address', vpc_api_server_address,
                                '--video_formats', "".join(video_formats),
                                '--image_width', str(image_width),
                                '--image_height', str(image_height),
                                '--frame_time_interval', str(frame_time_interval)]
                     )
script_processor_job_description = script_processor.jobs[-1].describe()
print(script_processor_job_description)

We use the ShardedByS3Key mode for s3_data_distribution_type to leverage the Amazon SageMaker feature that shards objects by Amazon S3 prefix, so each instance receives 1/N of the total objects for faster parallel processing. Because this video inference job is just one part of DeepMap's entire map processing workflow, s3_upload_mode is set to Continuous to streamline the subsequent processing tasks. For the complete POC sample code, see the GitHub repo.
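
The processing script itself (processing.py) isn't shown in this post. The following is a minimal sketch of what such a script could look like; the argument names mirror the ones passed above, the directory paths are the standard Amazon SageMaker Processing mount points, and the placeholder write stands in for the frame extraction and GluonCV inference that the real workload performs.

import argparse
import os
import urllib.request

INPUT_DIR = "/opt/ml/processing/input_data"    # populated by SageMaker Processing from S3
OUTPUT_DIR = "/opt/ml/processing/output_data"  # uploaded continuously to the S3 output bucket

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--api_server_address", type=str)
    parser.add_argument("--video_formats", type=str)
    parser.add_argument("--image_width", type=int)
    parser.add_argument("--image_height", type=int)
    parser.add_argument("--frame_time_interval", type=int)
    return parser.parse_args()

def main():
    args = parse_args()

    # Call the API endpoint in the private VPC (the POC mock server answers "Hello, Builders!")
    with urllib.request.urlopen(f"http://{args.api_server_address}/") as response:
        print("API response:", response.read().decode("utf-8"))

    os.makedirs(OUTPUT_DIR, exist_ok=True)
    for filename in os.listdir(INPUT_DIR):
        if not filename.endswith((".mp4", ".avi")):
            continue
        # Placeholder: extract frames, run segmentation inference, and write the results
        with open(os.path.join(OUTPUT_DIR, filename + ".result.txt"), "w") as f:
            f.write(f"processed {filename}\n")

if __name__ == "__main__":
    main()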

The POC was successfully completed, and the positive results demonstrated the capability and flexibility of Amazon SageMaker Processing. It met the following requirements for DeepMap:

  • Dynamically invoke an API in a private VPC endpoint for real-time custom processing needs
  • Reduce the unnecessary intermediate storage for video frames

Production solution architecture

With the positive results from the demonstration of the POC, DeepMap’s team decided to re-architect their current video process and inference workflow by using Amazon SageMaker Processing. The following diagram depicts the high-level architecture of the new workflow.


The DeepMap team initiated a project to implement this new architecture. The initial production development setting is as follows:

  • Data source – Camera streams (30fps) collected from the cars are chopped and stored as 1-second h264 encoded video clips. All video clips are stored in the source S3 buckets.
  • Video processing – Within a video clip (of 30 frames in total), only a fraction of key frames are useful for map making. The relevant key frame information is stored in DeepMap's video metadata database. The video processing code runs in an Amazon SageMaker Processing container and calls a video indexing API via a VPC private endpoint to retrieve the relevant key frames for inferencing.
  • Deep learning inference – The deep learning inference code queries the key frame information from the database, decodes the key frames in memory, and applies the deep learning model using the semantic segmentation algorithm to produce the results, which are stored in the S3 result bucket. The inference code also runs within the Amazon SageMaker Processing custom containers.
  • Testing example – We use a video clip of a collected road scene in .h264 format (000462.h264). Key frame metadata information about the video clip is stored in the database. The following is an excerpt of the key frame metadata information dumped from the database:
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4686
      }
      camera_video_data {
        stream_index: 13862
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4687
      }
      camera_video_data {
        stream_index: 13864
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }

A relevant key frame is returned from the video index API call for the subsequent inference task (such as the following image).

The deep learning inference is performed using the semantic segmentation algorithm running in Amazon SageMaker Processing to determine the proper lane line from the key frame. Using the preceding image as input, we receive the following output.

Performance improvements

As of this writing, DeepMap has already seen the expected performance improvements using the newly optimized workflow and has been able to achieve the following:

  • Streamline and reduce the complexity of the current video-to-image preprocessing workflow. The real-time video indexing API call has reduced two steps to one.
  • Reduce the total time for video preprocessing and image DL inferencing. Through the streamlined process, they can now run decoding and deep learning inference on different processors (CPU and GPU) in different threads, potentially saving 100% of the preprocessing time (as long as the inference takes longer than the video decoding, which is true in most cases).
  • Reduce the intermediate storage space needed to store the images for the inference job. Each camera frame (1920×1200, encoded as JPEG) takes 500 KB to store, but a 1-second video clip (x264 encoded) with 30 continuous frames takes less than 2 MB of storage (thanks to the video encoding). So, the storage reduction rate is about (1 – 2 MB / (500 KB × 30)) ≈ 85%.

The following table summarizes the overall improvements of the new optimized workflow.

| Measurements | Before | After | Performance Improvements |
| --- | --- | --- | --- |
| Processing steps | Two steps | One step | 50% simpler workflow |
| Processing time | Video preprocessing to extract key frames | Parallel processing (with multiple threads in Amazon SageMaker Processing containers) | 100% reduction of video preprocessing time |
| Storage | Intermediate S3 buckets for preprocessed video frames | None (in-memory) | 85% reduction |
| Compute | Separate compute resources for video preprocessing using Amazon Elastic Compute Cloud (Amazon EC2) | None (running in the Amazon SageMaker Processing container) | 100% reduction of video preprocessing compute resources |

Conclusion

In this post, we described how DeepMap used the new Amazon SageMaker Processing capability to redesign their video inference workflow to achieve a more streamlined workflow. Not only did they save on storage costs, they also improved their total processing time.

Their successful use case also demonstrates the flexibility and scalability of Amazon SageMaker Processing, which can help you build more scalable ML processing and inferencing workloads. For more information about integrating Amazon SageMaker Processing, see Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation. For more information about using services such as Step Functions to build more efficient ML workflows, see Building machine learning workflows with Amazon SageMaker Processing jobs and AWS Step Functions.

Try out Amazon SageMaker Processing today to further optimize your ML workloads.

 


About the Authors

Frank Wang is a Startup Senior Solutions Architect at AWS. He has worked with a wide range of customers with focus on Independent Software Vendors (ISVs) and now startups. He has several years of engineering leadership, software architecture, and IT enterprise architecture experiences, and now focuses on helping customers through their cloud journey on AWS.

 

 

Shishuai Wang is an ML Specialist Solutions Architect working with the AWS WWSO team. He works with AWS customers to help them adopt machine learning on a large scale. He enjoys watching movies and traveling around the world.

 

 

 

Yu Zhang is a staff software engineer and the technical lead of the Deep Learning platform at DeepMap. His research interests include Large-Scale Image Concept Learning, Image Retrieval, and Geo and Climate Informatics.

 

 

 

Tom Wang is a founding team member and Director of Engineering at DeepMap, managing their cloud infrastructure, backend services, and map pipelines. Tom has 20+ years of industry experience in database storage systems, distributed big data processing, and map data infrastructure. Prior to DeepMap, Tom was a tech lead at Apple Maps and key contributor to Apple’s map data infrastructure and map data validation platform. Tom holds an MS degree in computer science from the University of Wisconsin-Madison.

Read More

Automating business processes with machine learning in the COVID-19 pandemic

Automating business processes with machine learning in the COVID-19 pandemic

COVID-19 has changed our world significantly. All of this change has been almost instantaneous, forcing companies to pivot quickly and find new ways to operate. Automation is playing an increasingly important role to help companies adjust. The ability to automate business processes with machine learning (ML) is unlocking new efficiencies and allowing companies to move faster where they might have otherwise been stuck using antiquated systems. What might have previously taken an organization years is now happening in weeks. In this post, we discuss how AWS customers are applying ML in areas such as document processing and forecasting to quickly respond to the challenges at hand.

Automating document processing

The ability to automate document processing remotely has proven essential as companies face new challenges in this pandemic. Demand for services like loan processing and grocery delivery has spiked in areas that no one could have predicted, and the ability to quickly respond to those demands remains vital.

In April 2020, the US federal government announced the Paycheck Protection Program (PPP) to provide small businesses with funds to cover up to 8 weeks of payroll, mortgage, rent, and utility expenses. With phenomenal demand and over $349 billion allocated in just the first 13 days of the program, small business owners were scrambling to qualify.

BlueVine, a fintech company that provides small business banking, used their technology and engineering expertise to help process billions in loans. They chose Amazon Textract, a fully managed ML service that automatically extracts text and data from documents, to help automate the loan application process. In just a few days, they were up and running, analyzing tens of thousands of pages with high accuracy. In just 4 months, they were able to serve more than 155,000 small businesses with over $4.5 billion in loans. They delivered services to those who needed it most, with 68% of loans going to customers with fewer than 10 employees and 90% of loans under $60,000—serving small businesses struggling to remain afloat. BlueVine worked closely with DoorDash as their strategic partner to serve many stressed small independent restaurants, and simplify and accelerate the loan process. BlueVine used ML to automate loan application processing and scale quickly to meet the unprecedented demand. The company estimates they helped save 470,000 jobs as a result of their efforts.

Other areas of the economy were also experiencing unprecedented demand and needed to staff up quickly. However, it was a challenge to process new hire employment paperwork at the rate required. A typical PDF form has about 50 form fields; to recreate it as a digital form, the customer had to drag and drop data to the right location on each form—a particularly time-consuming task. Enter HelloSign, a Dropbox company that automates the signature process.  HelloWorks is a HelloSign product that turns PDFs into mobile friendly forms. It uses Amazon Textract to automate document processing and save customers hundreds of hours. A popular on-demand grocery delivery service was able to onboard millions of shoppers using HelloWorks in a few weeks. HelloWorks helped the company scale their onboarding paperwork quickly by automating document processing with ML. An NY-based urgent care started to use HelloSign to register new patients. An ambulance service started using HelloWorks to send out COVID-19 test applications. What’s more, this was all happening online. As organizations continue to limit in-person interactions, demand surged for HelloWorks with users creating 3x more forms than they used to. With Textract, HelloWorks was able to automate the process and automatically create all of the fields and components, saving customers time and keeping them safe.

Forecasting the pandemic

Forecasting is a growing challenge as supply chains and demand have been disrupted on a global scale. Amazon Forecast, a fully managed service that uses ML to deliver highly accurate forecasts, is helping customers deliver forecasts for everything from product demand to financial performance. Many forecasting tools only look at a historical series of data to predict the future, with the assumption that the future is determined by the past. When faced with irregular trends, this approach falters, as demonstrated by the difficulty companies have had since the beginning of the COVID-19 pandemic in developing models that accurately capture the complexities of the real world. With Amazon Forecast, you can integrate irregular trends and a multitude of other variables—on top of your historical series of data—to deliver the most accurate forecast possible with no ML experience required.

One of the largest challenges when it comes to forecasting has been understanding the projection of the disease itself. How quickly will it spread? When will it spike next? How many hospital beds will be needed to accommodate that spike? Forecasting models have the potential to assess disease trends and the course of the COVID-19 pandemic. However, the nature of the COVID-19 time-series makes forecasting extremely challenging, given the variations we’ve observed in disease spread across multiple communities and populations. COVID-19 remains a relatively unknown disease with no historic data to predict trends, such as seasonality and vulnerable sections of the population.

To better understand and forecast the disease, Rackspace Technology, University of California Irvine, Scientific Systems, and Plan4Co have come together to introduce a new COVID-19 forecasting model to deliver greater accuracy using Amazon Forecast. The team of medical, academic, data science modeling, and forecasting experts worked together to use Amazon Forecast DeepAR+ to incorporate related time-series to build more powerful forecasting models. Their model used deep learning to learn patterns between related time-series, like mobility data, and the target time-series. As a result, the model outperformed other approaches, such as those provided by the well-known IHME model.

With Amazon Forecast, the team was able to preprocess the time-series, train dozens of models quickly, compare model performance, and quantify the best forecasts. These forecasts can be developed on a daily and weekly basis, now available for countries, states, counties, and zip-codes. This information can, for example, help forecast what new cases will be in the short-term and long-term by learning from real-world data, like time to peak and rate of transmission. This information is critical because government agencies frequently use the occurrence of new cases in a population over a specified period of time to help determine when it’s safe to re-open sectors of the economy.
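
The team's exact pipeline isn't public, but the following generic sketch illustrates the approach described above: training a DeepAR+ predictor on a dataset group that combines a target time series (daily case counts) with a related time series (such as mobility data). The resource names, forecast horizon, and frequency are illustrative assumptions.

import boto3

forecast = boto3.client("forecast")

# Hypothetical dataset group that already contains a TARGET_TIME_SERIES dataset
# (daily COVID-19 case counts) and a RELATED_TIME_SERIES dataset (mobility data)
dataset_group_arn = "arn:aws:forecast:us-east-1:123456789012:dataset-group/covid19_demo"

# Train a DeepAR+ predictor that learns patterns across the related and target series
response = forecast.create_predictor(
    PredictorName="covid19_deepar_plus_demo",
    AlgorithmArn="arn:aws:forecast:::algorithm/Deep_AR_Plus",
    ForecastHorizon=14,  # predict 14 days ahead
    PerformAutoML=False,
    InputDataConfig={"DatasetGroupArn": dataset_group_arn},
    FeaturizationConfig={"ForecastFrequency": "D"},
)
print(response["PredictorArn"])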

Conclusion

As the pandemic continues, new challenges will inevitably arise. When time was of the essence, these organizations looked to ML technology and automation to serve their customers’ needs and find new ways to operate. This use of new technology will not only help them respond to the pandemic today, but also set them up to thrive in the future.

To learn about other ways AWS is working toward solutions in the COVID-19 pandemic check out Introducing the COVID-19 Simulator and Machine Learning Toolkit for Predicting COVID-19 Spread and Intelligently connect to customers using machine learning in the COVID-19 pandemic.

 


About the Author

Taha A. Kass-Hout, MD, MS, is director of machine learning and chief medical officer at Amazon Web Services (AWS). Taha received his medical training at Beth Israel Deaconess Medical Center, Harvard Medical School, and during his time there, was part of the BOAT clinical trial. He holds a doctor of medicine and master’s of science (bioinformatics) from the University of Texas Health Science Center at Houston.

Read More

Helping small businesses deliver personalized experiences with the Amazon Personalize extension for Magento

Helping small businesses deliver personalized experiences with the Amazon Personalize extension for Magento

This is a guest post by Jeff Finkelstein, founder of Customer Paradigm, a full-service interactive media firm and Magento solutions partner.

Many small retailers use Magento, an open-source ecommerce platform, to create websites or mobile applications to sell their products online. Personalization is key to creating high-quality ecommerce experiences, but small businesses often lack access to resources required to implement a scalable, sophisticated personalization solution—especially one that is powered by machine learning (ML). An ML-based solution results in better end-user engagement, conversion, and increased sales compared to traditional rudimentary techniques like static rules. With the Amazon Personalize extension for Magento, we’re now making the same ML personalization technology used by Amazon.com accessible to small businesses that use Magento.

This is the case with Hoopologie, a small hula hoop supply company based in Boulder, Colorado. Founder Melina Rider started the business in 2013 and has an extensive product line of over 1,000 SKUs. For Hoopologie, like many other small ecommerce merchants, creating sophisticated personalized experiences has been out of reach. Hiring ML experts to implement an ML solution isn’t affordable, and using a rules-based system requires a lot of manual maintenance, limiting scale and performance.

“Creating personalized recommendations for every type of user on the site would take us hours and hours every month. It’s not affordable, and I’d rather have our limited staff help our customers in other ways,” Rider says.

Hoopologie uses Magento to power their website, and has seen an increase of 40.5% in sales and an average order value increase of just over $50 per order by using the Amazon Personalize extension for Magento. In this post, we show you how to implement the Amazon Personalize extension for Magento to start creating ML-powered personalized experiences for your customers to improve engagement, conversion, and revenue.

Amazon Personalize for Magento

Amazon Personalize is a fully managed ML service that leverages over 20 years of experience at Amazon.com to help you deliver personalized experiences faster. The Amazon Personalize extension allows Magento merchants to take advantage of the benefits of Amazon Personalize and use its algorithms to power a Magento store’s product recommendations. All data and ML models are stored privately and securely in the merchant’s AWS account.

It’s easy to install the extension on your site, create an AWS account, and authorize the extension to access your AWS account. For instructions, see Amazon Personalize for Magento 2: Installation & Configuration Instructions. After you complete those steps, you see the Amazon Personalize extension in your Magento admin area, under Stores, Configuration.

Entering configuration details

For instructions on configuring the extension, watch the video clip Amazon Personalize for Magento 2: Extension Configuration on YouTube.

On the configuration page, you can provide all the values that tie your Magento site into Amazon Personalize in your AWS account: 

  1. For License Key, enter the license key that you received for the extension.

If you installed the extension but don’t have a license key, you can start a 15-day free trial.

If your license is active, the License Active field shows as yes.

  2. If the license isn't activated, add a valid access key.
  3. For Module Enabled, you can enable or disable the module from the admin area.

This is helpful if you need to troubleshoot a site, or if you have a copy of the site on a test server.

When your Amazon Personalize campaign (an ML-powered recommender trained on your data) is active in Amazon Personalize, Campaign Active shows as yes.

The system automatically uploads and ingests the data and trains the model. This is a read-only display; it isn't something you can change to turn the campaign on or off.

  4. For File owner home directory, enter a file directory outside your web root (such as ../../keys).

Amazon’s security requirements mandate that your AWS credentials not be stored in the Magento database. Instead, your keys need to be stored in a directory outside the web root, usually up a level or two from where your Magento site is stored in your file system.

  5. For AWS Region, enter the Region where you want the extension to access Amazon Personalize.

Because Amazon Personalize isn't available in all Regions, be sure to enter the code of a Region where Amazon Personalize is available and that is geographically closest to your Magento server.

  6. For AWS Account Number, enter your AWS account number.
  7. For Access Key, enter the access key ID for the user that you created in the authorization part of the setup.
  8. For Secret Key, enter the secret access key for the user that you created in the authorization part of the setup.

 

For instructions on finding your AWS account number, access key, and secret key, watch the video clip Amazon Personalize for Magento 2: Creating IAM User in AWS on YouTube.

  9. Choose Save Config.

This saves the access information and writes the access key and secret key to a file outside the web root.

Starting training

To begin the training process, choose Start Process.

Make sure that your Magento cron is running. If it’s not running, it’s highly likely that the process won’t move from one step to the next.

The process includes the following high-level steps:

  • Exporting historical data from your Magento site into CSV files.
  • Creating a private Amazon Simple Storage Service (Amazon S3) bucket in your AWS account to stage CSV files.
  • Uploading the CSV files to the S3 bucket in your AWS account.
  • Instructing Amazon Personalize to create a custom solution and campaign based on your data. The result is a private ML model hosted in your AWS account.

The following screenshot shows the progress tracker of the individual steps.

You can restart the process at any time by choosing Reset Process. This process starts at the beginning, and may incur additional data upload and training costs.
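
Behind the scenes, these steps correspond to standard Amazon Personalize API calls. The following sketch (using boto3) shows roughly what that sequence looks like; the resource names and recipe choice are illustrative assumptions, not the extension's actual internals.

import boto3

personalize = boto3.client("personalize")

# Hypothetical dataset group whose interactions dataset has already been imported
# from the CSV files the extension staged in S3
dataset_group_arn = "arn:aws:personalize:us-east-1:123456789012:dataset-group/magento-demo"

# Train a solution version (the private ML model) on the imported data
solution = personalize.create_solution(
    name="magento-demo-solution",
    datasetGroupArn=dataset_group_arn,
    recipeArn="arn:aws:personalize:::recipe/aws-user-personalization",  # example recipe
)
solution_version = personalize.create_solution_version(solutionArn=solution["solutionArn"])

# Once the solution version is ACTIVE, deploy it as a campaign that serves recommendations
campaign = personalize.create_campaign(
    name="magento-demo-campaign",
    solutionVersionArn=solution_version["solutionVersionArn"],
    minProvisionedTPS=1,
)
print(campaign["campaignArn"])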

A/B split testing

To evaluate the effectiveness of the Amazon Personalize campaign created on your Magento store, we built in an A/B split testing system. A/B testing allows you to expose two subsets of your users to two variations of a user experience on your site and measure which variation results in the highest conversion rate. Control is default Magento; Test is personalized recommendations from your Amazon Personalize campaign.

  1. For the system to be active, make sure that Enabled is set to Yes.

If Enabled is set to No, Magento doesn’t use the extension.

  2. For Set Percentage, choose from the following A/B split test options:
    1. 100% Control / 0% test (Amazon Personalize isn’t used at all)
    2. 75% Control / 25% test (Amazon Personalize is used 25% of the time)
    3. 50% Control / 50% test (Amazon Personalize is used 50% of the time)
    4. 25% Control / 75% test (Amazon Personalize is used 75% of the time)
    5. 10% Control / 90% test (Amazon Personalize is used 90% of the time)
    6. 0% Control / 100% test (Amazon Personalize is used 100% of the time)

  3. Choose Save Config to save the settings.

Next you allow Amazon Personalize to train. Depending on the amount of historical data in your Magento system, this process can take a few hours to complete. You can revisit this page to see the progress of the training.

After you complete this process the first time, you can retrain new versions of the model while continuing to use the active campaign to provide product recommendations. To contain costs and create the most relevant dataset for your site, we’ve limited historical data to the previous 6 months. (Interaction data older than 6 months provides less value in making relevant product recommendations.)

After the Amazon Personalize system is enabled, the system automatically does the following:

  • Displays personalized product recommendations to your end-users.
  • Adds a “We also suggest” list of recommended products to your product page.
  • Automatically adds a Google Analytics tag (if your site uses Google Analytics) for any order that was placed on the site when Amazon Personalize was active. This uses the existing Google Analytics system. If you’re not using Google Analytics, this step is omitted.
  • Adds a field to the Magento database that indicates if an order was placed when Amazon Personalize was active.
  • Adds real-time data interaction indicators, allowing Amazon Personalize to learn from users when they add a product to their cart, wishlist, or complete a purchase.

To add Amazon Personalize to additional pages, choose Content, Pages, Select a Page.

For help troubleshooting, see Amazon Personalize Extension for Magento 2.

Conclusion

In just a few easy steps, you can add the Amazon Personalize extension for Magento to an ecommerce site. Hoopologie started testing the Amazon Personalize extension in January 2020. They began by using an A/B split test to show the personalized recommendations to 50% of their users. After 6 weeks, users seeing personalized recommendations spent $50.02 more per transaction, and overall revenue from users in the test was up by 42%. Hoopologie continues to use the Amazon Personalize extension to create personalized experiences for their customers. They intend to expand usage by continuously evaluating the system, retraining as new products are added, and adding personalization to additional areas of the site.

Advanced personalization is no longer beyond the reach of ecommerce merchants using Magento. The results from Hoopologie are a testament to the positive impact of implementing a scalable, sophisticated personalization solution powered by ML. Learn more about the Amazon Personalize extension for Magento from Customer Paradigm and enjoy a complimentary 15-day trial.

 


About the Author

Jeff Finkelstein is the founder of Customer Paradigm based in Boulder, Colorado. Founded in 2002, their team has completed more than 12,600 web development and marketing projects for ecommerce and other clients throughout the world. The Customer Paradigm team previously built an integration between Magento and Amazon Marketplace, which was incorporated into the core Magento framework in 2017. More information can be found at https://www.CustomerParadigm.com.

Read More

Detecting hidden but non-trivial problems in transfer learning models using Amazon SageMaker Debugger

Detecting hidden but non-trivial problems in transfer learning models using Amazon SageMaker Debugger

Rapid development of deep learning technology has produced an abundance of open-sourced, pre-trained models in computer vision and natural language processing. As a result, transfer learning has become a popular approach in deep learning. Transfer learning is a machine learning technique where a model pre-trained on one task is fine-tuned on a new task. Given the significant compute and time resources required to develop neural network models, adapting a pre-trained model to new data is compelling in business applications. If you’re new to deep learning, transfer learning is also a good starting point because you don’t have to build a model from scratch. For deep learning beginners, one question you may have is, how do I systematically examine model predictions to see what mistakes were made now that my data is in the form of pictures or text?

In this post, we show you an end-to-end example of doing transfer learning by using Amazon SageMaker Debugger to detect hidden problems that may have serious consequences. Debugger doesn’t incur additional costs if you’re running training on Amazon SageMaker. Moreover, you can enable the built-in rules with just a few lines of code when you call the Amazon SageMaker estimator function. For our use case, we do transfer learning using a ResNet model to recognize German traffic signs [1].

In this post, we focus on issues that occur during training. For more information about using Debugger for inference and explainability, see Detecting and analyzing incorrect model predictions with Amazon SageMaker Model Monitor and Debugger.

Setting up a transfer learning training job

For our use case, we want to adapt a pre-trained computer vision model to recognize traffic signs in Germany. We use the GTSRB dataset [1] for this new task. You can find the notebook and training script in the GitHub repo.

Applying preprocessing on the dataset

We first apply some typical preprocessing for a ResNet model on our dataset (see the complete notebook for where to download the dataset). To improve model generalization, we apply data augmentation (RandomResizedCrop and RandomHorizontalFlip). These operations ensure that an image looks different in each epoch. Lastly, we normalize the data: because the model has been pre-trained on the ImageNet dataset, we apply the same preprocessing and normalization (subtract the mean and divide by the standard deviation of the ImageNet dataset). See the following code:

from torchvision import datasets, models, transforms

# Define pre-processing
train_transform =  transforms.Compose([
                                        transforms.RandomResizedCrop(224),
                                        transforms.RandomHorizontalFlip(),
                                        transforms.ToTensor(),
                                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                                    ])

We use PyTorch's ImageFolder function, which takes a local folder, loads all images located in the subdirectories, and encodes each directory name as a label. Next we specify the dataloader, which takes the batch size and the dataset. We use the dataloader during training to provide new batches in each iteration. See the following code:

# Apply the pre-processing to the training dataset
dataset = datasets.ImageFolder(root='GTSRB/Training', transform=train_transform)
train_dataloader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)

For the validation dataset, we don’t apply data augmentation and only resize images to the appropriate size:

# Apply the pre-processing to validation dataset
val_transform = transforms.Compose([
                                        transforms.Resize(256),
                                        transforms.CenterCrop(224),
                                        transforms.ToTensor(),
                                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                                        ])

dataset_test = datasets.ImageFolder(root='GTSRB/Final_Test', transform=val_transform)
val_dataloader = torch.utils.data.DataLoader(dataset_test, batch_size=64, shuffle=False)

Loading a pre-trained ResNet model

Because we have a limited variety of traffic signs, we pick a simpler ResNet model for this task: resnet18. You can load a ResNet18 from the PyTorch model zoo with pre-trained weights using just one line of code:

#get pretrained ResNet model
model = models.resnet18(pretrained=True)

The model has been pre-trained on the ImageNet dataset, which consists of 1,000 image classes. For our use case, we fine-tune it on a dataset that only has 43 classes. We adjust the last layer, which is a fully connected Linear layer:

#traffic sign dataset has 43 classes
nfeatures = model.fc.in_features
model.fc = torch.nn.Linear(nfeatures, 43)

Because we train a multi-classification model, we use the cross entropy loss function:

#loss for multi label classification
loss_function = torch.nn.CrossEntropyLoss()

Next we specify the optimizer that takes the model parameters and learning rate. Here we use the stochastic gradient descent optimizer:

# optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

Defining the training loop

The following code block defines the training loop. We iterate over 10 epochs, perform the forward and backward passes, and update the model parameters.

for epoch in range(10):  # loop over the entire dataset 10 times

    epoch_loss = 0.0

    for data in train_dataloader:

        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward
        outputs = model(inputs)

        # compute loss
        loss = loss_function(outputs, labels)

        # backward pass
        loss.backward()

        # optimize
        optimizer.step()

        # get predictions
        _, preds = torch.max(outputs, 1)

        # statistics
        epoch_loss += loss.item()

    print('Epoch {}/{} Loss: {:.4f}'.format(epoch, 10, epoch_loss))

If you just run the preceding code, the training runs only on your Amazon SageMaker notebook instance. To make the most of Amazon SageMaker, you want to use the pre-built AWS Deep Learning Containers, which come with optimized performance and let you access the full feature set of Debugger at no additional cost. By running on Amazon SageMaker, we can easily train our models at scale. Most deep learning models are trained on GPUs due to the computational intensity. With Amazon SageMaker, GPU instances are automatically created and torn down after training completes, so you only pay for the time the resources are used.

Making a training script compatible with Amazon SageMaker

To run training on Amazon SageMaker, you need to change the location variable in your pre-processing code to the generic Amazon SageMaker environment variables. When Amazon SageMaker spins up the training instance, it automatically downloads the training and validation data from Amazon Simple Storage Service (Amazon S3) into a local folder on the training instance. We can retrieve the local path with os.environ['SM_CHANNEL_TRAIN'] and os.environ['SM_CHANNEL_TEST']:

# update environment variable for training and testing data sets
dataset = datasets.ImageFolder(os.environ['SM_CHANNEL_TRAIN'], transform=train_transform)
dataset_test = datasets.ImageFolder(os.environ['SM_CHANNEL_TEST'], transform=val_transform)

After the change, you should save all the model code as a separate script called train.py.

Uploading data to the S3 bucket

As mentioned in the previous step, Amazon SageMaker automatically downloads training and validation data into the training instance. We need to upload the data to Amazon S3 first. You can find detailed instructions on how to do that in the notebook.
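
For reference, a minimal way to do the upload with the Amazon SageMaker Python SDK is shown below; the local folder names match the ones used in the preprocessing code, and the bucket is assumed to be the session's default bucket.

import sagemaker

session = sagemaker.Session()
bucket = session.default_bucket()  # or your own bucket name

# Upload the local training and validation folders to S3 under the train/ and test/ prefixes
train_s3_uri = session.upload_data(path="GTSRB/Training", bucket=bucket, key_prefix="train")
test_s3_uri = session.upload_data(path="GTSRB/Final_Test", bucket=bucket, key_prefix="test")
print(train_s3_uri, test_s3_uri)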

Setting up Debugger

Now that we have defined the training script and uploaded the data, we’re ready to start training on Amazon SageMaker. We run the training with several Debugger built-in rules enabled. Via the Amazon SageMaker Python SDK and the rule_configs module, we can select any of the 20 available built-in rules, which run at no additional cost. For demonstration purposes, we select loss_not_decreasing, class_imbalance and dead_relu. We can configure several parameters for these rules: for instance, most rules take a threshold parameter that can be adjusted to define when a rule should trigger. We can also define the set of tensors the rules should run on.

The class imbalance rule takes the inputs into the loss function and counts the number of samples per class that the model has seen throughout training. To create the rule, we specify rule_configs.class_imbalance() and the rule runs on the inputs of the loss function. To fine-tune the model, we use the cross entropy loss function, which takes predictions and labels and outputs a loss value. See the following code:

from sagemaker.debugger import Rule, CollectionConfig, rule_configs
class_imbalance_rule = Rule.sagemaker(base_config=rule_configs.class_imbalance(),
                                    rule_parameters={"labels_regex": "CrossEntropyLoss_input_1"}
                                    )

Next we define the loss_not_decreasing rule. It determines if the training or validation loss is decreasing and raises an issue if the loss has not decreased by a certain percentage in the last few iterations. In contrast to the previous rule, this rule runs on the outputs of the loss function (CrossEntropyLoss_output_0). See the following code:

loss_not_decreasing_rule = Rule.sagemaker(base_config=rule_configs.loss_not_decreasing(),
                             rule_parameters={"tensor_regex": "CrossEntropyLoss_output_0",
                                             "mode": "TRAIN"})

The dead_relu rule identifies how many rectified linear unit (ReLU) activations are outputting zero values. ReLU is a non-linear activation function used in many state-of-the-art models. It increases linearly for increasing positive values and outputs zero otherwise. A model can suffer from the dying ReLU problem, where the gradients become zero due to the activation output being zero. If the majority of ReLU activations output zero values, the model can’t effectively learn because weights are no longer getting updated. We instantiate the rule by specifying rule_configs.dead_relu(), and the rule runs on all tensors that captured outputs from ReLU activations:

dead_relu_rule = Rule.sagemaker(base_config=rule_configs.dead_relu(),
                                rule_parameters={"tensor_regex": "relu_output"})

To record additional tensors, we can specify a debugger hook configuration. We can either use default collections such as weights and gradients or define our own custom collection. The following collection saves model inputs and loss function inputs and outputs. We just need to specify a regular expression of tensor names. We save the tensors every 500 steps, and a step is one forward and backward pass. So we get tensors for step 0, 500, 1,000, and so on. See the following code:

from sagemaker.debugger import DebuggerHookConfig, CollectionConfig

debugger_hook_config = DebuggerHookConfig(
      collection_configs=[ 
          CollectionConfig(
                name="custom_collection",
                parameters={ "include_regex": "*ResNet_input|*CrossEntropyLoss",
                             "save_interval": "500" })])

For a full list of collections and rules Debugger offers, see List of Debugger Built-in Rules. Debugger captures the tensor collections you specified throughout the training steps and automatically analyzes them against the rules.

Calling the Amazon SageMaker training API

We define the PyTorch estimator that takes the separate training script we saved earlier and specify the instance type that Amazon SageMaker creates for us. To run the training with Debugger and built-in rules, we only have to pass the list of rules and the debugger hook configuration:

from sagemaker.pytorch import PyTorch

pytorch_estimator = PyTorch(entry_point='train.py',
                            role=sagemaker.get_execution_role(),
                            train_instance_type='ml.p2.xlarge',
                            train_instance_count=1,
                            framework_version='1.6.0',
                            py_version='py3',
                            debugger_hook_config=debugger_hook_config,
                            rules=[class_imbalance_rule, dead_relu_rule, loss_not_decreasing_rule]
                           )

Now we start the training on Amazon SageMaker by calling fit(). The function takes a dictionary that specifies the location of the training and validation data in Amazon S3. The keys of the dictionary are the name of data channels Amazon SageMaker creates in the training instance. See the following code:

pytorch_estimator.fit(inputs={'train': 's3://{}/train'.format(bucket), 
                              'test': 's3://{}/test'.format(bucket)}, 
                      wait=True)

While the training is in progress, we can monitor the rule status in real time in Amazon SageMaker Studio. It turns out that the loss_not_decreasing and class_imbalance rules are triggered. The training runs for 10 epochs and reaches a final test accuracy of 96.3%.
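
Outside of Studio, you can also poll the rule status programmatically from the notebook, for example:

# Check the evaluation status of the attached Debugger rules
for summary in pytorch_estimator.latest_training_job.rule_job_summary():
    print(summary["RuleConfigurationName"], "-", summary["RuleEvaluationStatus"])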

This seems good, but why were the rules triggered? Let’s dive into the data Debugger captured to find out the root causes.

Using SageMaker Debugger rules and data to uncover hidden problems

In this section, we investigate the data to find any hidden problems, create custom rules to fix the model, and rerun the training.

Inspecting loss_not_decreasing

We use Debugger to investigate what triggered the loss_not_decreasing rule. We use the smdebug library, which provides all the functionalities to read and access Debugger data. First we create a trial object that takes the path where the Debugger data is stored as input. This can either be a local or Amazon S3 path. With just a few lines of code, we can retrieve and visualize the loss values as training is still in progress. With trial.steps(), we retrieve the number of recorded steps: a step is one forward and backward pass. We can also specify a mode to retrieve data from the training (modes.TRAIN) or validation phase (modes.EVAL). Debugger’s default sampling interval is 500, so we get loss values for step 0, 500, 1,000, and so on.

To access the loss values, we pass the name of the loss into the trial.tensor() function. The cross entropy loss function we picked measures the performance of a multi-classification model. It takes two inputs: the model outputs and ground truth labels. We can access its outputs via trial.tensor('CrossEntropyLoss_output_0').values():

trial.tensor('CrossEntropyLoss_output_0').values(mode=modes.TRAIN)

{0: array(3.9195325, dtype=float32),
 500: array(0.8488243, dtype=float32),
 1000: array(0.54870504, dtype=float32),
 1500: array(0.25874993, dtype=float32),
 2000: array(0.20406848, dtype=float32),
 2500: array(0.29052508, dtype=float32),
 3000: array(0.18074727, dtype=float32),
 3500: array(0.1956144, dtype=float32),
 4000: array(0.2597512, dtype=float32)}

This code returns a dictionary in which the keys are the step numbers and the values are the loss values. We can now easily visualize the loss values as training is still in progress. See the following code:

import matplotlib.pyplot as plt
from smdebug import modes
from smdebug.trials import create_trial

# Create a trial that reads the Debugger data saved by the training job
path = pytorch_estimator.latest_job_debugger_artifacts_path()
trial = create_trial(path)

plt.ylabel('Train Loss')
plt.xlabel('Steps')
plt.plot(trial.steps(mode=modes.TRAIN),
         list(trial.tensor('CrossEntropyLoss_output_0').values(mode=modes.TRAIN).values()))
plt.show()

The blue curve in the following graph shows that the default training configuration ran the training for too long. Instead of training for 4,000 steps, early_stopping should have been applied after 1,000 steps. We can use Debugger to enable auto-termination, which stops the training when a rule triggers. For our use case, doing so reduces compute time by more than half (orange curve).

Debugger can auto-terminate training jobs. Metrics are sent to Amazon CloudWatch, so you can set up a CloudWatch alarm and AWS Lambda function that stops a training job if a rule triggers.
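
A minimal sketch of such a Lambda function follows. It assumes the event you wire up (through a CloudWatch alarm or an EventBridge rule) carries the name of the training job to stop; how the event is structured depends on your setup.

import boto3

sm_client = boto3.client("sagemaker")

def lambda_handler(event, context):
    # Assumption: the event payload includes the training job name
    training_job_name = event["training_job_name"]

    status = sm_client.describe_training_job(
        TrainingJobName=training_job_name)["TrainingJobStatus"]
    if status == "InProgress":
        # Stop the job; the Debugger data collected so far remains available in S3
        sm_client.stop_training_job(TrainingJobName=training_job_name)
        return f"Stopping {training_job_name}"
    return f"{training_job_name} is already {status}"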

For more information about how the auto-termination feature helped one customer reduce compute costs by 70%, see Autodesk optimizes visual similarity search model in Fusion 360 with Amazon SageMaker Debugger.

Inspecting class_imbalance

Real-world datasets are often imbalanced and noisy. If the model training doesn’t account for these factors, it produces a model that has low or no predictive power for the classes with few samples. You can address this in different ways, such as during data-loading, when more samples can be drawn from the under-represented classes, or you can adjust the loss function to assign a higher penalty to incorrect predictions using class weights.
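
For example, one common way to apply class weights (an alternative to the weighted re-sampling we use later in this post) is to pass a per-class weight tensor to the loss function. The sketch below assumes dataset is the ImageFolder defined earlier and uses inverse class frequency as the weight:

import torch
from collections import Counter

# Count how many training samples each of the 43 classes has
class_counts = Counter(dataset.targets)

# Give under-represented classes a proportionally higher penalty in the loss
class_weights = torch.tensor([1.0 / class_counts.get(c, 1) for c in range(43)],
                             dtype=torch.float32)
weighted_loss_function = torch.nn.CrossEntropyLoss(weight=class_weights)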

To investigate the class imbalance issue, we retrieve the inputs of the loss function (previously we retrieved the outputs). The loss function takes the model predictions and the ground truth labels as inputs. We use the latter (CrossEntropyLoss_input_1) to count the number of samples the model has seen during training:

from collections import Counter

labels = []
for step in trial.steps(mode=modes.TRAIN):
    labels.extend(trial.tensor("CrossEntropyLoss_input_1").value(step, mode=modes.TRAIN))

# Count how many samples per class the model has seen during training
label_counts = Counter(labels)
plt.bar(list(label_counts.keys()), list(label_counts.values()))

The following visualization shows a high imbalance and several classes with fewer than a hundred samples.

To fix the class imbalance issue, we change the default configuration of the dataloader to take class weights into account and draw more samples from classes with fewer samples. Therefore, we define a WeightedRandomSampler:

# Per-sample weights: here we assume the inverse frequency of each sample's class
# (dataset.targets holds the class index of every image in the ImageFolder dataset)
class_counts = Counter(dataset.targets)
weights = [1.0 / class_counts[label] for label in dataset.targets]

sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))

train_dataloader = torch.utils.data.DataLoader(dataset,
                                               batch_size=64,
                                               sampler=sampler)

During training, the dataloader now draws more samples from classes with lower counts. Class imbalance can lead to a model that performs well on classes with many samples but poorly on classes with fewer samples. Because we trained the model without the WeightedRandomSampler, let's see which classes had particularly low accuracy by looking at the confusion matrix.

Visualizing the confusion matrix in real time

To evaluate the performance of our model, we retrieve labels and predictions and create the confusion matrix:

import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

predictions = []
labels = []
for step in trial.steps(mode=modes.EVAL):
    predictions.extend(np.argmax(trial.tensor("CrossEntropyLoss_input_0").value(step, mode=modes.EVAL), axis=1))
    labels.extend(trial.tensor("CrossEntropyLoss_input_1").value(step, mode=modes.EVAL))

cm = confusion_matrix(labels, predictions)
fig, ax = plt.subplots()
sns.heatmap(cm, ax=ax, cbar=True)

Each row in the matrix corresponds to the actual class, and the column indicates the predicted classes. For example, the first row shows class 0 and how often it was predicted as class 0, class 1, class 2, and so on. Ideally, we want high counts on the diagonal, because these are correctly predicted classes. Elements not on the diagonal are incorrect predictions. The confusion matrix helps us determine if particular classes in our dataset get confused more often with each other. This can happen, for instance, because samples from two different classes may be very similar. Debugger also provides a confusion built-in rule that computes the confusion matrix while the training is in progress and triggers if the ratio of data on-diagonal values and off-diagonal values exceeds a pre-defined threshold.
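
If you want this check to run automatically while the job is training, you can enable the built-in rule the same way as the other rules; as with those rules, rule_parameters can be used to point it at the tensors holding labels and predictions and to adjust its thresholds. A minimal sketch:

from sagemaker.debugger import Rule, rule_configs

# Built-in confusion rule with default parameters; add rule_parameters to customize
confusion_rule = Rule.sagemaker(base_config=rule_configs.confusion())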

The following image shows that in most cases our model is predicting the correct classes, but there are a few outliers. You can use Debugger to look more closely into those outliers.

Inspecting incorrect model predictions

To find out what is causing those outliers in the confusion matrix, we investigate the examples on which the model made false predictions. To do this analysis, we take both inputs to the loss function into account: CrossEntropyLoss_input_0 contains the model predictions, and CrossEntropyLoss_input_1 contains the labels. We also retrieve the model inputs, ResNet_input_0, which contains the input images. We perform the analysis on data recorded during the validation phase, so we specify mode=modes.EVAL.

We iterate over the predictions and model inputs saved by Debugger and select those where the label and prediction do not match. Then we plot the predictions and corresponding images:

for step in trial.steps(mode=modes.EVAL):

    predictions = np.argmax(trial.tensor('CrossEntropyLoss_input_0').value(step, mode=modes.EVAL), axis=1)
    labels = trial.tensor('CrossEntropyLoss_input_1').value(step, mode=modes.EVAL)
    images = trial.tensor('ResNet_input_0').value(step, mode=modes.EVAL)

    for prediction, label, image in zip(predictions, labels, images):
        if prediction != label:
            # signnames maps class IDs to human-readable sign names
            print(f"Predicted: '{signnames[prediction]}' Groundtruth: '{signnames[label]}' ")
            # transpose from CHW to HWC so matplotlib can display the image
            plt.imshow(np.transpose(image, (1, 2, 0)))
            plt.show()

The following images show the result of the code segment. The analysis reveals that the model is often confused by traffic signs that involve a direction. Clearly this is a severe model bug despite the model achieving a decent test accuracy of 96.3%.

The root cause is the data augmentation pipeline, which performs a random horizontal flip on the training data. This augmentation step is typically used when ResNet models are trained from scratch, but it causes a problem in our use case, where the dataset contains similar classes whose images differ only in their direction.
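
The fix, which we apply in the rerun described later, is simply to drop the flip from the training transform. A sketch of the revised pipeline:

from torchvision import transforms

# Revised training transform: keep the random crop, but drop RandomHorizontalFlip,
# which mirrors direction-sensitive signs into a different class
train_transform = transforms.Compose([
                                        transforms.RandomResizedCrop(224),
                                        transforms.ToTensor(),
                                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                                    ])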

Running training with a custom rule

With Debugger, we can easily write a custom rule that checks how often the model confused directions. For example, we take the class "Dangerous curve to left" (class 19) and count how often it was mistaken for "Dangerous curve to right" (class 20), or vice versa. We just need to implement the function invoke_at_step, which Debugger calls every time data for a new step is available. As before, we access the inputs to the loss function, check whether class 19 or 20 is present, and count how often one was mistaken for the other. If this happens more than 10 times, the rule triggers. See the following code:

import numpy as np
from smdebug.rules.rule import Rule

class MyCustomRule(Rule):
    def __init__(self, base_trial):
        super().__init__(base_trial)
        self.counter = 0

    def invoke_at_step(self, step):

        predictions = np.argmax(self.base_trial.tensor('CrossEntropyLoss_input_0').value(step), axis=1)
        labels = self.base_trial.tensor('CrossEntropyLoss_input_1').value(step)

        for prediction, label in zip(predictions, labels):

            if prediction == 19 and label == 20 or prediction == 20 and label == 19:
                self.counter += 1
                if self.counter > 10:
                    self.logger.info(f'Found {self.counter} cases where class 19 was mistaken as class 20 and vice versa')
                    return True

        return False

We can easily test and run the custom rule locally by creating the trial object and invoking the rule on the data:

from smdebug.rules import invoke_rule
from smdebug.exceptions import *

rule = MyCustomRule(trial)
try:
    invoke_rule(rule, raise_eval_cond=True)
except RuleEvaluationConditionMet as e:
    print(e)

Running the code cell in the notebook gives the following output:

[2020-10-18 18:51:24.588 28f0f34b9e29:12513 INFO rule_invoker.py:15] Started execution of rule MyCustomRule at step 0
[2020-10-18 18:53:11.846 28f0f34b9e29:12513 INFO <ipython-input-69-cae132ce9a97>:19] Found 11 cases where class 19 was mistaken as class 20 and vice versa
Evaluation of the rule MyCustomRule at step 1812 resulted in the condition being met

The rule triggered at step 1812. After testing the rule locally, we can run it as part of our Amazon SageMaker training job. First we need to save the rule in a separate file (my_custom_rule.py) and then define the following configuration, where we indicate which instance type the rule should run on:

from sagemaker.debugger import Rule, CollectionConfig

custom_rule = Rule.custom(
    name='MyCustomRule',
    image_uri='759209512951.dkr.ecr.us-west-2.amazonaws.com/sagemaker-debugger-rule-evaluator:latest', 
    instance_type='ml.t3.medium',     
    source='my_custom_rule.py',
    volume_size_in_gb=10, 
    rule_to_invoke='MyCustomRule',     
)

After we define the configuration, we add the custom_rule to the list of rules in the estimator object.
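
For reference, the following is a minimal sketch of what that estimator definition might look like. The entry point name, execution role, instance settings, and the choice of built-in rules are illustrative assumptions, not the exact configuration used for this training job:

from sagemaker.debugger import Rule, rule_configs
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point='train.py',               # illustrative training script name
    role=role,                            # your SageMaker execution role
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    framework_version='1.6.0',
    py_version='py3',
    rules=[
        Rule.sagemaker(rule_configs.class_imbalance()),      # built-in rules
        Rule.sagemaker(rule_configs.loss_not_decreasing()),
        custom_rule,                                         # the custom rule defined above
    ],
)

estimator.fit()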

Fixing the model and rerunning training

Now that Debugger has helped us identify some critical issues in our model, we apply the fixes and rerun the training. As mentioned before, weighted re-sampling allows us to fix the class imbalance problem. We also change the data augmentation pipeline and remove the horizontal flip. We reduce the number of epochs from 10 to 3, because we have seen that the loss doesn’t decrease after roughly 1,000 iterations.
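
The following is a minimal sketch of what these two fixes could look like in PyTorch. The train_dataset variable and the specific transforms are assumptions about the training script rather than the exact code of the original job:

import numpy as np
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler
from torchvision import transforms

# Weighted re-sampling: weight each sample inversely to its class frequency so that
# rare classes are drawn roughly as often as common ones.
labels = np.array([label for _, label in train_dataset])
class_counts = np.bincount(labels)
sample_weights = 1.0 / class_counts[labels]

sampler = WeightedRandomSampler(
    weights=torch.as_tensor(sample_weights, dtype=torch.double),
    num_samples=len(sample_weights),
    replacement=True,
)
train_loader = DataLoader(train_dataset, batch_size=64, sampler=sampler)

# Updated augmentation pipeline: RandomHorizontalFlip is removed because flipping
# direction-sensitive traffic signs changes their meaning.
train_transform = transforms.Compose([
    transforms.Resize((64, 64)),   # illustrative input size
    transforms.ToTensor(),
])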

With Debugger, we can now compare data from different training jobs and see if the issues persist or not. We just need to create a new trial object, read data from both trials, and compare their tensors:

trial1 = create_trial("s3://bucket/training-job/debug-output")
trial2 = create_trial("s3://bucket/improved-training-job/debug-output")
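
For example, to compare how balanced the classes are in the two jobs, a sketch like the following could count the labels recorded in each trial. The tensor name follows the loss-input naming used earlier in this post, and only the steps that Debugger saved are counted:

from collections import Counter
from smdebug import modes

def label_counts(trial):
    # Count how often each class label appears in the recorded training batches
    counts = Counter()
    tensor = trial.tensor('CrossEntropyLoss_input_1')
    for step in tensor.steps(mode=modes.TRAIN):
        counts.update(tensor.value(step, mode=modes.TRAIN).astype(int).tolist())
    return counts

counts_original = label_counts(trial1)
counts_improved = label_counts(trial2)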

The following visualization shows the label counts for the original training job and the one where we applied weighted re-sampling (orange). We see that there is no longer a class imbalance issue, and the model sees roughly the same number of instances per class.

We run the training with the same built-in rules as before and add our own custom rule. We monitor the status of the rules and can now see that none of them trigger.

Summary

In this post, we showed an end-to-end example of how to use Amazon SageMaker Debugger to automatically find, inspect, and fix issues in training a deep neural network.

As state-of-the-art models grow in size and complexity, finding issues early in the prototyping phase is critical to save time and costs. Model bugs may not always be obvious, and as we have shown in this post, a suboptimal model may still achieve good overall accuracy.

In our use case, we not only found critical bugs, but also reduced training time by a factor of two and improved model performance.

There is no extra cost for running Debugger built-in rules, and you benefit from having them enabled because you may discover non-obvious model issues. If you want to learn more about Debugger, check out the Amazon SageMaker Debugger documentation.


About the Authors

Nathalie Rauschmayr is an Applied Scientist at AWS, where she helps customers develop deep learning applications.

Lu Huang is a Senior Product Manager on the AWS Deep Engine team, managing SageMaker Debugger.

Satadal Bhattacharjee is Principal Product Manager at AWS AI. He leads the machine learning engine PM team, working on projects such as SageMaker and optimizing machine learning frameworks such as TensorFlow, PyTorch, and MXNet.

Read More

Using Transformers to create music in AWS DeepComposer Music studio

AWS DeepComposer provides a creative and hands-on experience for learning generative AI and machine learning (ML). We recently launched a Transformer-based model that iteratively extends your input melody by up to 20 seconds. This newly created extension uses the style and musical motifs found in your input melody and creates additional notes that sound like they come from the original. In this post, we show you how the Transformer model extends the duration of your existing compositions. You can create new and interesting musical scores by using various parameters, including the Edit melody feature.

Introduction to Transformers

The Transformer is a recent deep learning model for use with sequential data such as text, time series, music, and genomes. Whereas older sequence models such as recurrent neural networks (RNNs) or long short-term memory networks (LSTMs) process data sequentially, the Transformer processes data in parallel. This allows it to process massive amounts of available training data by using powerful GPU-based compute resources.

Furthermore, traditional RNNs and LSTMs can have difficulty modeling the long-term dependencies of a sequence because they can forget earlier parts of the sequence. Transformers use an attention mechanism to overcome this memory shortcoming by directing each step of the output sequence to pay “attention” to relevant parts of the input sequence. For example, when a Transformer-based conversational AI model is asked “How is the weather now?” and the model replies “It is warm and sunny today,” the attention mechanism guides the model to focus on the word “weather” when answering with “warm” and “sunny,” and to focus on “now” when answering with “today.” This is different from traditional RNNs and LSTMs, which process sentences from left to right and forget the context of each word as the distance between the words increases.

Training a Transformer model to generate music

To work with musical datasets, the first step is to convert the data into a sequence of tokens. Each token represents a distinct musical event in the score. A token might represent something like the timestamp for when a note is struck, or the pitch of a note. The relationship between these tokens and a musical score is similar to the relationship between words and a sentence or paragraph: tokens in music can represent notes or other musical features, just as tokens in language can represent words or punctuation. This differs from previous models supported by AWS DeepComposer, such as GAN and AR-CNN, which treat music generation as an image generation problem.
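
As a toy illustration of this idea (not the actual DeepComposer tokenizer, which isn't described in detail here), each note event could be encoded as a pair of tokens, one for the time that passes before the note and one for its pitch:

# Hypothetical token encoding for three notes of a C major arpeggio
notes = [
    {"wait_beats": 0.0, "pitch": 60},   # C4 struck immediately
    {"wait_beats": 0.5, "pitch": 64},   # E4 half a beat later
    {"wait_beats": 0.5, "pitch": 67},   # G4 another half beat later
]

tokens = []
for note in notes:
    tokens.append(f"TIME_SHIFT<{note['wait_beats']}>")
    tokens.append(f"NOTE_ON<{note['pitch']}>")

print(tokens)
# ['TIME_SHIFT<0.0>', 'NOTE_ON<60>', 'TIME_SHIFT<0.5>', 'NOTE_ON<64>', 'TIME_SHIFT<0.5>', 'NOTE_ON<67>']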

These sequences of tokens are then used to train the Transformer model. During training, the model attempts to learn a distribution that matches the underlying distribution of the training dataset. During inference, the model generates a sequence of tokens by sampling from the distribution learned during training. The new musical score is created by turning the sequence of tokens back into music. Music Transformer and MuseNet are examples of other algorithms that use the Transformer architecture for music generation.
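
Inference can be pictured as a simple loop that repeatedly asks the model for a distribution over the next token and samples from it. The sketch below assumes a hypothetical model function that returns such a distribution; it stands in for the trained network and is not part of the AWS DeepComposer API:

import numpy as np

def extend_melody(model, input_tokens, max_new_tokens=200):
    # Start from the tokens of the input melody and append sampled tokens one at a time
    tokens = list(input_tokens)
    for _ in range(max_new_tokens):
        probs = model(tokens)                       # probability of each token in the vocabulary
        next_token = np.random.choice(len(probs), p=probs)
        tokens.append(next_token)
    return tokens

# Dummy stand-in model: a uniform distribution over a 16-token vocabulary
uniform_model = lambda tokens: np.full(16, 1.0 / 16)
print(extend_melody(uniform_model, input_tokens=[3, 7, 3], max_new_tokens=5))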

In AWS DeepComposer, we use the TransformerXL architecture to generate music because it’s capable of capturing long-term dependencies that are 4.5 times longer than a traditional Transformer. Furthermore, it has been shown to be 18 times faster than a traditional Transformer during inference. This means that AWS DeepComposer can provide you with higher quality musical compositions at lower latency when generating new compositions.

Extending your input melody using AWS DeepComposer

The Transformers technique extends your input melody by up to 20 seconds. The following screenshot shows the view of your input melody on the AWS DeepComposer console.

To extend your input melody, complete the following steps: 

  1. On the AWS DeepComposer console, in the navigation pane, choose Music studio.
  2. Choose the arrow next to Input melody to expand that section.
  3. For Sample track, choose a melody specifically recommended for the Transformers technique.

These options represent the kinds of complex classical-genre melodies that will work best with the Transformer technique. You can also import a MIDI file or create your own melody using a MIDI keyboard.

  4. Under Generative AI technique, for Model parameters, choose Transformers.

The available model, TransformerXLClassical, is preselected.

  5. Under Advanced parameters, for Model parameters, you have seven parameters that you can adjust (more details about these parameters are provided in the next section of this post).
  6. Choose Extend input melody.
  7. To listen to your new composition, choose Play (►).

This model works by extending your input melody by up to 20 seconds.

  8. After performing inference, you can use the Edit melody tool to add or remove notes, or change the pitch or the length of notes in the track generated.
  9. You can repeat these steps to create compositions up to 2 minutes long.

The following compositions were created using the TransformerXLClassical model in AWS DeepComposer:

Beethoven:

Mozart:

Bach:

In the next section of this post, we look at how different inference parameters affect your output and how we can effectively use these parameters to create interesting and diverse music.

Configuring the advanced parameters for Transformers

In AWS DeepComposer Music studio, you can choose from seven different Advanced parameters that can be used to change how your extended melody is created:

  • Sampling technique and sampling threshold
  • Creative risk
  • Input duration
  • Track extension duration
  • Maximum rest time
  • Maximum note length

Sampling technique and sampling threshold

You have three sampling techniques to choose from: TopK, Nucleus, and Random. You can also set the Sampling threshold value for your chosen technique. We discuss each technique below and provide examples of how it affects the output.

TopK sampling

When you choose the TopK sampling technique, the model chooses the K-tokens that have the highest probability of occurring. To set the value for K, change the Sampling threshold.

If your sampling threshold is set high, the number of available tokens (K) is large. A large number of available tokens means the model can choose from a wider variety of musical tokens. In your extended melody, this means the generated notes are likely to be more diverse, but it comes at the cost of potentially creating less coherent music.

On the other hand, if you choose a threshold value that is too low, the model is limited to choosing from a smaller set of tokens (that the model believes has a higher probability of being correct). In your extended melody, you might notice less musical diversity and more repetitive results. 

Nucleus sampling

At a high level, Nucleus sampling is very similar to TopK. Setting a higher sampling threshold allows for more diversity at the cost of coherence or consistency. There is a subtle difference between the two approaches, however: Nucleus sampling chooses the highest-probability tokens whose probabilities sum to the value set for the sampling threshold. This is done by sorting the probabilities from greatest to least and calculating a cumulative sum for each token.

For example, we might have six musical tokens with the probabilities {0.3, 0.3, 0.2, 0.1, 0.05, 0.05}. If we choose TopK with a sampling threshold equal to 0.5, we choose three tokens (six total musical tokens * 0.5). Then we sample between the tokens with probabilities equal to 0.3, 0.3, and 0.2. If we choose Nucleus sampling with a 0.5 sampling threshold, we only sample between two tokens {0.3, 0.3} as the cumulative probability (0.6) exceeds the threshold (0.5).
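
The following sketch reproduces this worked example in NumPy, under the assumption (taken from the description above) that the TopK threshold selects a fraction of the vocabulary and the Nucleus threshold is a cumulative probability cutoff:

import numpy as np

probs = np.array([0.3, 0.3, 0.2, 0.1, 0.05, 0.05])   # toy output distribution

def topk_candidates(probs, threshold):
    # Keep the K most likely tokens, where K is a fraction of the vocabulary size
    k = max(1, int(len(probs) * threshold))
    return np.sort(np.argsort(probs)[::-1][:k])

def nucleus_candidates(probs, threshold):
    # Keep the smallest set of most likely tokens whose cumulative probability exceeds the threshold
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), threshold) + 1
    return np.sort(order[:cutoff])

print(topk_candidates(probs, 0.5))     # [0 1 2] -> three tokens, as in the example
print(nucleus_candidates(probs, 0.5))  # [0 1]   -> two tokens, as in the example

In a typical implementation, the probabilities of the surviving candidates are then renormalized and the next token is sampled from them.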

Random sampling

Random sampling is the most basic sampling technique. With random sampling, the model is free to choose among all available tokens and samples directly from the output distribution. The output of this sampling technique is identical to that of TopK or Nucleus sampling when the sampling threshold is set to 1. The following are some audio clips generated using different sampling thresholds paired with the TopK sampling technique.

The following audio uses TopK and a Sampling threshold equal to 0.1:

Notice how the notes quickly start to form a pattern.

The following audio uses TopK and a Sampling threshold equal to 0.9: 

You can decide which one sounds better, but did you hear the difference?

The notes are very diverse, but as a whole the notes lose coherence and sound somewhat random at times. This general trend holds for Nucleus sampling as well, but the results differ from TopK depending on the shape of the output distribution. Play around and see for yourself!

Creative risk

Creative risk is a parameter used to control the randomness of predictions. A low creative risk makes the model more confident but also more conservative in its samples (it’s less likely to sample from unlikely candidate tokens). On the other hand, a high creative risk produces a softer (flatter) probability distribution over the list of musical tokens, so the model takes more risks in its samples (it’s more likely to sample from unlikely candidate tokens), resulting in more diversity and probably more mistakes. Mistakes might include creating longer or shorter notes, longer or shorter periods of rest in the generated melody, or adding wrong notes to the generated melody.
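
This description matches the temperature scaling commonly used when sampling from sequence models; the sketch below illustrates that interpretation and is an assumption about how such a control typically works rather than the exact AWS DeepComposer code:

import numpy as np

def next_token_distribution(logits, creative_risk):
    # Divide the logits by the creative risk (temperature) before the softmax:
    # values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = logits / creative_risk
    exp = np.exp(scaled - np.max(scaled))   # subtract the max for numerical stability
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, 0.1])
print(next_token_distribution(logits, 0.5))   # low risk: confident, conservative sampling
print(next_token_distribution(logits, 2.0))   # high risk: flatter, more diverse (and riskier) sampling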

Input duration

This parameter tells the model what portion of the input melody to use during inference. The portion used is defined as the number of seconds selected counting backwards from the end of the input track. When extending the melody, the model conditions the output it generates based on the portion of the input melody you provide. For example, if you choose 5 seconds as the input duration, the model only uses the last 5 seconds of the input melody for conditioning and ignores the remaining portion when performing inference. The following audio clips were generated using different input durations.

The following audio has an input duration of 5 seconds:

The following audio has an input duration of 30 seconds:

The output conditioned on 30 seconds of input draws more inspiration from the input melody.

Track extension duration

When extending the melody, the Transformer continuously generates tokens until the generated portion reaches the track extension duration you selected. The reason the model sometimes generates less than the value you selected is that it generates values in terms of tokens, not time. Tokens, however, can represent different lengths of time. For example, a token could represent a note duration of 0.1 seconds or 1 second, depending on what the model thinks is appropriate. Either way, that token takes the same amount of runtime for the model to generate. Because the model can generate hundreds of tokens, this difference adds up. To avoid extreme runtime latencies, the model sometimes stops before generating your entire requested extension.

Maximum rest time

During inference, the Transformers model can create musical artifacts. Changing the value of maximum rest time limits the periods of silence, in seconds, the model can generate while performing inference.

Maximum note length

Changing the value of maximum note length limits the amount of time a single note can be held for while performing inference. The following audio clips are some example tracks generated using different maximum rest time and maximum note length values.

In the first example audio, we set the maximum note length to 10 seconds.

In the second sample, we set it to 1 second, but set the maximum rest period to 11 seconds.

In the third sample, we set the maximum note length to 1 second and maximum rest period to 2 seconds.

The first sample contains extremely long notes. The second sample doesn’t contain long notes, but contains many long gaps of silence in the music. On the other hand, the third sample contains both shorter notes and shorter gaps.

Creating compositions using different AWS DeepComposer techniques

What’s great about AWS DeepComposer is that you can mix and match the new Transformers technique with the other techniques it offers, such as the AR-CNN and GAN techniques.

To create a sample, we completed the following steps: 

  1. Choose the sample melody Pathétique.
  2. Use the Transformers technique to extend the melody.

For this track, we extended the melody to 11 bars. Transformers tries to extend the melody up to the value you choose for the extension duration.

  3. The AR-CNN and GAN techniques only work with eight bars of input, so we use the Edit melody feature to cut the track down to eight bars.

  4. Use the AR-CNN technique to fill in notes and enhance the melody.

For this post, we set Sampling iterations equal to 100.

  5. We use the GAN technique, paired with the MuseGAN algorithm and the Rock model, to generate accompaniments.

The following audio is our final output:

We think the output sounds pretty impressive. What do you think? Play around and see what kind of composition you can create yourself!

Conclusion

You’ve now learned about the Transformer model and how AWS DeepComposer uses it to extend your input melody. You should also have a better understanding of how each parameter for the Transformers technique affects the characteristics of your composition.

To continue exploring AWS DeepComposer, consider some of the following:

  • Choose a different input melody. You can try importing a track or recording your own.
  • Use the Edit melody feature to assist your AI or correct mistakes.
  • Try feeding the output of the AR-CNN model into the Transformers model.
  • Iteratively extend your melody to create a musical composition up to 2 minutes long.

Although you don’t need a physical device to experiment with AWS DeepComposer, you can take advantage of a limited-time offer and purchase the AWS DeepComposer keyboard at a special price of $79.20 (20% off) on Amazon.com. The pricing includes the keyboard and a 3-month free trial of AWS DeepComposer.

We’re excited for you to try out various combinations to generate your creative musical piece. Start composing in the AWS DeepComposer Music Studio now!

About the Authors

Rahul Suresh is an Engineering Manager with the AWS AI org, where he works on AI-based products that make machine learning accessible to all developers. Prior to joining AWS, Rahul was a Senior Software Developer at Amazon Devices and helped launch highly successful smart home products. Rahul is passionate about building machine learning systems at scale and is always looking to get these advanced technologies into the hands of customers. In addition to his professional career, Rahul is an avid reader and a history buff.

Wayne Chi is an ML Engineer and AI Researcher at AWS. He works on researching interesting machine learning problems to teach new developers and then bringing those ideas into production. Prior to joining AWS, he was a Software Engineer and AI Researcher at NASA JPL, where he worked on AI planning and scheduling systems for the Mars 2020 Rover (Perseverance). In his spare time he enjoys playing tennis, watching movies, and learning more about AI.

Liang Li is an AI Researcher at AWS, where she works on AI-based products that bring cutting-edge deep learning ideas to developers. Prior to joining AWS, Liang graduated from the University of Tennessee, Knoxville with a Ph.D. in EE, and she has been focusing on ML projects since graduation. In her spare time, she enjoys cooking and hiking.

Suri Yaddanapudi is an AI Researcher and ML Engineer at AWS. He works on researching and implementing modern machine learning algorithms across different domains and teaching them to customers in a fun way. Prior to joining AWS, Suri graduated with his Ph.D. degree from the University of Cincinnati, where his thesis focused on applying AI techniques to drug repurposing. In his spare time, he enjoys reading, watching anime, and playing futsal.

Aashiq Muhamed is an AI Researcher and ML Engineer at AWS. He believes that AI can change the world and that democratizing AI is key to making this happen. At AWS, he works on creating meaningful AI products and translating ideas from academia into industry. Prior to joining AWS, he was a graduate student at Stanford where he worked on model reduction in robotics, learning and control. In his spare time he enjoys playing the violin and thinking about healthcare on MARS.

Patrick L. Cavins is a Programmer Writer for DeepComposer and DeepLens. Previously, he worked in radiochemistry using isotopically labelled compounds to study how plants communicate. In his spare time, he enjoys skiing, playing the piano, and writing.

Maryam Rezapoor is a Senior Product Manager with AWS AI Devices team. As a former biomedical researcher and entrepreneur, she finds her passion in working backward from customers’ needs to create new impactful solutions. Outside of work, she enjoys hiking, photography, and gardening.

Read More