Improving speech-to-text transcripts from Amazon Transcribe using custom vocabularies and Amazon Augmented AI
Businesses and organizations are increasingly using video and audio content for a variety of functions, such as advertising, customer service, media post-production, employee training, and education. As the volume of multimedia content generated by these activities proliferates, businesses are demanding high-quality transcripts of video and audio to organize files, enable text queries, and improve accessibility to audiences who are deaf or hard of hearing (466 million with disabling hearing loss worldwide) or language learners (1.5 billion English language learners worldwide).
Traditional speech-to-text transcription methods typically involve manual, time-consuming, and expensive human labor. Powered by machine learning (ML), Amazon Transcribe is a speech-to-text service that delivers high-quality, low-cost, and timely transcripts for business use cases and developer applications. In the case of transcribing domain-specific terminologies in fields such as legal, financial, construction, higher education, or engineering, the custom vocabularies feature can improve transcription quality. To use this feature, you create a list of domain-specific terms and reference that vocabulary file when running transcription jobs.
This post shows you how to use Amazon Augmented AI (Amazon A2I) to help generate this list of domain-specific terms by sending low-confidence predictions from Amazon Transcribe to humans for review. We measure the word error rate (WER) of transcriptions and number of correctly-transcribed terms to demonstrate how to use custom vocabularies to improve transcription of domain-specific terms in Amazon Transcribe.
To complete this use case, use the notebook A2I-Video-Transcription-with-Amazon-Transcribe.ipynb on the Amazon A2I Sample Jupyter Notebook GitHub repo.

Example of mis-transcribed annotation of the technical term, “an EC2 instance”. This term was transcribed as “Annecy two instance”.

Example of correctly transcribed annotation of the technical term “an EC2 instance” after using Amazon A2I to build an Amazon Transcribe custom vocabulary and re-transcribing the video.
This walkthrough focuses on transcribing video content. You can modify the code provided to use audio files (such as MP3 files) by doing the following:
- Upload audio files to your Amazon Simple Storage Service (Amazon S3) bucket and use them in place of the video files provided.
- Modify the button text and instructions in the worker task template provided in this walkthrough and tell workers to listen to and transcribe audio clips.
Solution overview
The following diagram presents the solution architecture.
We briefly outline the steps of the workflow as follows:
- Perform initial transcription. You transcribe a video about Amazon SageMaker, which contains multiple mentions of technical ML and AWS terms. When using Amazon Transcribe out of the box, you may find that some of these technical mentions are mis-transcribed. You generate a distribution of confidence scores to see the number of terms that Amazon Transcribe has difficulty transcribing.
- Create human review workflows with Amazon A2I. After you identify words with low-confidence scores, you can send them to a human to review and transcribe using Amazon A2I. You can make yourself a worker on your own private Amazon A2I work team and send the human review task to yourself so you can preview the worker UI and tools used to review video clips.
- Build custom vocabularies using A2I results. You can parse the human-transcribed results collected from Amazon A2I to extract domain-specific terms and use these terms to create a custom vocabulary table.
- Improve transcription using custom vocabulary. After you generate a custom vocabulary, you can call Amazon Transcribe again to get improved transcription results. You evaluate and compare the before and after performances using an industry standard called word error rate (WER).
Prerequisites
Before beginning, you need the following:
- An AWS account.
- An S3 bucket. Provide its name in BUCKET in the notebook. The bucket must be in the same Region as this Amazon SageMaker notebook instance.
- An AWS Identity and Access Management (IAM) execution role with required permissions. The notebook automatically uses the role you used to create your notebook instance (see the next item in this list). Add the following permissions to this IAM role:
  - Attach the managed policies AmazonAugmentedAIFullAccess and AmazonTranscribeFullAccess.
  - When you create your role, you specify Amazon S3 permissions. You can either allow that role to access all your resources in Amazon S3, or you can specify particular buckets. Make sure that your IAM role has access to the S3 bucket that you plan to use in this use case. This bucket must be in the same Region as your notebook instance.
- An active Amazon SageMaker notebook instance. For more information, see Create a Notebook Instance. Open your notebook instance and upload the notebook A2I-Video-Transcription-with-Amazon-Transcribe.ipynb.
- A private work team. A work team is a group of people that you select to review your documents. You can choose to create a work team from a workforce, which is made up of workers engaged through Amazon Mechanical Turk, vendor-managed workers, or your own private workers that you invite to work on your tasks. Whichever workforce type you choose, Amazon A2I takes care of sending tasks to workers. For this post, you create a work team using a private workforce and add yourself to the team to preview the Amazon A2I workflow. For instructions, see Create a Private Workforce. Record the ARN of this work team—you need it in the accompanying Jupyter notebook.
To understand this use case, the following are also recommended:
- Basic understanding of AWS services like Amazon Transcribe, its features such as custom vocabularies, and the core components and workflow Amazon A2I uses.
- The notebook uses the AWS SDK for Python (Boto3) to interact with these services.
- Familiarity with Python and NumPy.
- Basic familiarity with Amazon S3.
Getting started
After you complete the prerequisites, you’re ready to deploy this solution entirely on an Amazon SageMaker Jupyter notebook instance. Follow along in the notebook for the complete code.
To start, follow the Setup code cells to set up AWS resources and dependencies and upload the provided sample MP4 video files to your S3 bucket. For this use case, we analyze videos from the official AWS playlist on introductory Amazon SageMaker videos, also available on YouTube. The notebook walks through transcribing and viewing Amazon A2I tasks for a video about Amazon SageMaker Jupyter Notebook instances. In Steps 3 and 4, we analyze results for a larger dataset of four videos. The following table outlines the videos that are used in the notebook, and how they are used.
Video # | Video Title | File Name | Function
1 | Fully-Managed Notebook Instances with Amazon SageMaker – a Deep Dive | Fully-Managed Notebook Instances with Amazon SageMaker – a Deep Dive.mp4 | Perform the initial transcription and view sample Amazon A2I jobs in Steps 1 and 2; build a custom vocabulary in Step 3
2 | Built-in Machine Learning Algorithms with Amazon SageMaker – a Deep Dive | Built-in Machine Learning Algorithms with Amazon SageMaker – a Deep Dive.mp4 | Test transcription with the custom vocabulary in Step 4
3 | Bring Your Own Custom ML Models with Amazon SageMaker | Bring Your Own Custom ML Models with Amazon SageMaker.mp4 | Build a custom vocabulary in Step 3
4 | Train Your ML Models Accurately with Amazon SageMaker | Train Your ML Models Accurately with Amazon SageMaker.mp4 | Test transcription with the custom vocabulary in Step 4
In Step 4, we refer to videos 1 and 3 as the in-sample videos, meaning the videos used to build the custom vocabulary. Videos 2 and 4 are the out-sample videos, meaning videos that our workflow hasn’t seen before and are used to test how well our methodology can generalize to (identify technical terms from) new videos.
Feel free to experiment with additional videos downloaded by the notebook, or your own content.
Step 1: Performing the initial transcription
Our first step is to look at the performance of Amazon Transcribe without custom vocabulary or other modifications and establish a baseline of accuracy metrics.
Use the transcribe function to start a transcription job. You use the vocab_name parameter later to specify custom vocabularies; it currently defaults to None. See the following code:
transcribe(job_names[0], folder_path+all_videos[0], BUCKET)
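The transcribe helper is defined in the notebook and wraps the Amazon Transcribe StartTranscriptionJob API. The following is only a rough sketch of what such a helper might look like (the notebook's actual implementation may differ in details such as media format handling and output paths):
import boto3

def transcribe(job_name, media_file, bucket, vocab_name=None):
    # Start an Amazon Transcribe job for a media file stored in S3 (sketch)
    transcribe_client = boto3.client("transcribe")
    kwargs = {}
    if vocab_name is not None:
        # Attach a custom vocabulary when one is supplied
        kwargs["Settings"] = {"VocabularyName": vocab_name}
    transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": f"s3://{bucket}/{media_file}"},
        MediaFormat="mp4",
        LanguageCode="en-US",
        OutputBucketName=bucket,  # the job writes <job_name>.json to this bucket
        **kwargs,
    )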
Wait until the transcription job displays COMPLETED. A transcription job for a 10–15-minute video typically takes up to 5 minutes.
When the transcription job is complete, the results are stored in an output JSON file called YOUR_JOB_NAME.json in your specified BUCKET. Use the get_transcript_text_and_timestamps function to parse this output and return several useful data structures. After calling this, all_sentences_and_times contains, for each transcribed video, a list of objects containing sentences with their start time, end time, and confidence score. To save those to a text file for later use, enter the following code:
file0 = open("originaltranscript.txt", "w")
for tup in sentences_and_times_1:
    file0.write(tup['sentence'] + "\n")
file0.close()
To look at the distribution of confidence scores, enter the following code:
from matplotlib import pyplot as plt
plt.style.use('ggplot')
flat_scores_list = all_scores[0]
plt.xlim([min(flat_scores_list)-0.1, max(flat_scores_list)+0.1])
plt.hist(flat_scores_list, bins=20, alpha=0.5)
plt.title('Plot of confidence scores')
plt.xlabel('Confidence score')
plt.ylabel('Frequency')
plt.show()
The following graph illustrates the distribution of confidence scores.
Next, we filter out the high confidence scores to take a closer look at the lower ones.
You can experiment with different thresholds to see how many words fall below that threshold. For this use case, we use a threshold of 0.4, which corresponds to 16 words below this threshold. Sequences of words with a term under this threshold are sent to human review.
As you experiment with different thresholds and observe the number of tasks it creates in the Amazon A2I workflow, you can see a tradeoff between the number of mis-transcriptions you want to catch and the amount of time and resources you’re willing to devote to corrections. In other words, using a higher threshold captures a greater percentage of mis-transcriptions, but it also increases the number of false positives—low-confidence transcriptions that don’t actually contain any important technical term mis-transcriptions. The good news is that you can use this workflow to quickly experiment with as many different threshold values as you’d like before sending it to your workforce for human review. See the following code:
THRESHOLD = 0.4
# Filter scores that are less than THRESHOLD
all_bad_scores = [i for i in flat_scores_list if i < THRESHOLD]
print(f"There are {len(all_bad_scores)} words that have confidence score less than {THRESHOLD}")
plt.xlim([min(all_bad_scores)-0.1, max(all_bad_scores)+0.1])
plt.hist(all_bad_scores, bins=20, alpha=0.5)
plt.title(f'Plot of confidence scores less than {THRESHOLD}')
plt.xlabel('Confidence score')
plt.ylabel('Frequency')
plt.show()
You get the following output:
There are 16 words that have confidence score less than 0.4
The following graph shows the distribution of confidence scores less than 0.4.
As you experiment with different thresholds, you can see a number of words classified with low confidence. As we see later, terms that are specific to highly technical domains are more difficult to automatically transcribe in general, so it’s important that we capture these terms and incorporate them into our custom vocabulary.
Step 2: Creating human review workflows with Amazon A2I
Our next step is to create a human review workflow (or flow definition) that sends low confidence scores to human reviewers and retrieves the corrected transcription they provide. The accompanying Jupyter notebook contains instructions for the following steps:
- Create a workforce of human workers to review predictions. For this use case, creating a private workforce enables you to send Amazon A2I human review tasks to yourself so you can preview the worker UI.
- Create a work task template that is displayed to workers for every task. The template is rendered with input data you provide, instructions to workers, and interactive tools to help workers complete your tasks.
- Create a human review workflow, also called a flow definition. You use the flow definition to configure details about your human workforce and the human tasks they are assigned.
- Create a human loop to start the human review workflow, sending data for human review as needed. In this example, you use a custom task type and start human loop tasks using the Amazon A2I Runtime API. Each time StartHumanLoop is called, a task is sent to human reviewers.
In the notebook, you create a human review workflow using the AWS Python SDK (Boto3) function create_flow_definition. You can also create human review workflows on the Amazon SageMaker console.
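As a rough sketch, and assuming placeholder names and ARNs (the notebook derives these values for you, including the work team ARN you recorded earlier), the create_flow_definition call looks something like the following:
import boto3

sagemaker_client = boto3.client("sagemaker")

create_workflow_definition_response = sagemaker_client.create_flow_definition(
    FlowDefinitionName="transcription-review-flow",  # placeholder; must be unique in your account
    RoleArn="arn:aws:iam::111122223333:role/YourSageMakerExecutionRole",  # placeholder
    HumanLoopConfig={
        "WorkteamArn": "arn:aws:sagemaker:us-east-1:111122223333:workteam/private-crowd/your-team",  # placeholder
        "HumanTaskUiArn": "arn:aws:sagemaker:us-east-1:111122223333:human-task-ui/transcription-ui",  # placeholder; see the next section
        "TaskTitle": "Transcribe the video clip",
        "TaskDescription": "Listen to the video clip and correct the transcription",
        "TaskCount": 1,  # send each clip to one worker
    },
    OutputConfig={"S3OutputPath": "s3://" + BUCKET + "/a2i-results"},  # placeholder output prefix
)
flowDefinitionArn = create_workflow_definition_response["FlowDefinitionArn"]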
Setting up the worker task UI
Amazon A2I uses Liquid, an open-source template language that you can use to insert data dynamically into HTML files.
In this use case, we want each task to enable a human reviewer to watch a section of the video where low confidence words appear and transcribe the speech they hear. The HTML template consists of three main parts:
- A video player with a replay button that only allows the reviewer to play the specific subsection
- A form for the reviewer to type and submit what they hear
- Logic written in JavaScript to give the replay button its intended functionality
The following code is the template you use:
<head>
<style>
h1 {
color: black;
font-family: verdana;
font-size: 150%;
}
</style>
</head>
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
<video id="this_vid">
<source src="{{ task.input.filePath | grant_read_access }}"
type="audio/mp4">
Your browser does not support the audio element.
</video>
<br />
<br />
<crowd-button onclick="onClick(); return false;"><h1> Click to play video section!</h1></crowd-button>
<h3>Instructions</h3>
<p>Transcribe the audio clip </p>
<p>Ignore "umms", "hmms", "uhs" and other non-textual phrases. </p>
<p>The original transcript is <strong>"{{ task.input.original_words }}"</strong>. If the text matches the audio, you can copy and paste the same transcription.</p>
<p>Ignore "umms", "hmms", "uhs" and other non-textual phrases.
If a word is cut off in the beginning or end of the video clip, you do NOT need to transcribe that word.
You also do NOT need to transcribe punctuation at the end of clauses or sentences.
However, apostrophes and punctuation used in technical terms should still be included, such as "Denny's" or "file_name.txt"</p>
<p><strong>Important:</strong> If you encounter a technical term that has multiple words,
please <strong>hyphenate</strong> those words together. For example, "k nearest neighbors" should be transcribed as "k-nearest-neighbors."</p>
<p>Click the space below to start typing.</p>
<!-- Collects the worker's transcription; the name must match the key parsed from the Amazon A2I output -->
<crowd-text-area name="transcription"></crowd-text-area>
<full-instructions header="Transcription Instructions">
<h2>Instructions</h2>
<p>Click the play button and listen carefully to the audio clip. Type what you hear in the box
below. Replay the clip by clicking the button again, as many times as needed.</p>
</full-instructions>
</crowd-form>
<script>
var video = document.getElementById('this_vid');
video.onloadedmetadata = function() {
video.currentTime = {{ task.input.start_time }};
};
function onClick() {
video.pause();
video.currentTime = {{ task.input.start_time }};
video.play();
video.ontimeupdate = function () {
if (video.currentTime >= {{ task.input.end_time }}) {
video.pause()
}
}
}
</script>
The {{ task.input.filePath | grant_read_access }} field allows you to grant access to and display a video to workers using a path to the video’s location in an S3 bucket. To prevent the reviewer from navigating to irrelevant sections of the video, the controls parameter is omitted from the video tag and a single replay button is included to control which section can be replayed.
Under the video player, the <crowd-text-area> HTML tag creates a submission form that your reviewer uses to type and submit.
At the end of the HTML snippet, the section enclosed by the <script> tag contains the JavaScript logic for the replay button. The {{ task.input.start_time }} and {{ task.input.end_time }} fields allow you to inject the start and end times of the video subsection you want transcribed for the current task.
You create a worker task template using the AWS Python SDK (Boto3) function create_human_task_ui. You can also create a human task template on the Amazon SageMaker console.
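A minimal sketch of that call, assuming the HTML template above is stored in a Python string named template and using a placeholder UI name:
import boto3

sagemaker_client = boto3.client("sagemaker")

create_ui_response = sagemaker_client.create_human_task_ui(
    HumanTaskUiName="transcription-worker-ui",  # placeholder; must be unique in your account
    UiTemplate={"Content": template},  # the Liquid/HTML worker template shown above
)
humanTaskUiArn = create_ui_response["HumanTaskUiArn"]  # referenced by the flow definition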
Creating human loops
After setting up the flow definition, we’re ready to use Amazon Transcribe and initiate human loops. While iterating through the list of transcribed words and their confidence scores, we create a human loop whenever the confidence score is below some threshold, CONFIDENCE_SCORE_THRESHOLD. A human loop is just a human review task that allows workers to review the clips of the video that Amazon Transcribe had difficulty with.
An important thing to consider is how we deal with a low-confidence word that is part of a phrase that was also mis-transcribed. To handle these cases, you use a function that gets the sequence of words centered about a given index, and the sequence’s starting and ending timestamps. See the following code:
def get_word_neighbors(words, index):
"""
gets the words transcribe found at most 3 away from the input index
Returns:
list: words at most 3 away from the input index
int: starting time of the first word in the list
int: ending time of the last word in the list
"""
i = max(0, index - 3)
j = min(len(words) - 1, index + 3)
return words[i: j + 1], words[i]["start_time"], words[j]["end_time"]
For every word we encounter with low confidence, we send its associated sequence of neighboring words for human review. See the following code:
human_loops_started = []
CONFIDENCE_SCORE_THRESHOLD = THRESHOLD
i = 0
for obj in confidences_1:
    word = obj["content"]
    neighbors, start_time, end_time = get_word_neighbors(confidences_1, i)
    # Our condition for when we want to engage a human for review
    if obj["confidence"] < CONFIDENCE_SCORE_THRESHOLD:
        # get the original sequence of words
        sequence = ""
        for block in neighbors:
            sequence += block['content'] + " "

        humanLoopName = str(uuid.uuid4())
        # "initialValue": word,
        inputContent = {
            "filePath": job_uri_s3,
            "start_time": start_time,
            "end_time": end_time,
            "original_words": sequence
        }
        start_loop_response = a2i.start_human_loop(
            HumanLoopName=humanLoopName,
            FlowDefinitionArn=flowDefinitionArn,
            HumanLoopInput={
                "InputContent": json.dumps(inputContent)
            }
        )
        human_loops_started.append(humanLoopName)
        # print(f'Confidence score of {obj["confidence"]} is less than the threshold of {CONFIDENCE_SCORE_THRESHOLD}')
        # print(f'Starting human loop with name: {humanLoopName}')
        # print(f'Sending words from times {start_time} to {end_time} to review')
        print(f'The original transcription is "{sequence}" \n')
    i = i + 1
For the first video, you should see output that looks like the following code:
========= Fully-Managed Notebook Instances with Amazon SageMaker - a Deep Dive.mp4 =========
The original transcription is "show up Under are easy to console "
The original transcription is "And more cores see is compute optimized "
The original transcription is "every version of Annecy two instance is "
The original transcription is "distributing data sets wanted by putt mode "
The original transcription is "onto your EBS volumes And again that's "
The original transcription is "of those example No books are open "
The original transcription is "the two main ones markdown is gonna "
The original transcription is "I started using Boto three but I "
The original transcription is "absolutely upgrade on bits fun because you "
The original transcription is "That's the python Asi que We're getting "
The original transcription is "the Internet s Oh this is from "
The original transcription is "this is from Sarraf He's the author "
The original transcription is "right up here then the title of "
The original transcription is "but definitely use Lambda to turn your "
The original transcription is "then edit your ec2 instance or the "
Number of tasks sent to review: 15
As you’re completing tasks, you should see these mis-transcriptions with the associated video clips. See the following screenshot.
Human loop statuses that are complete display Completed. It’s not required to complete all human review tasks before continuing. Having 3–5 finished tasks is typically sufficient to see how technical terms can be extracted from the results. See the following code:
completed_human_loops = []
for human_loop_name in human_loops_started:
    resp = a2i.describe_human_loop(HumanLoopName=human_loop_name)
    print(f'HumanLoop Name: {human_loop_name}')
    print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
    print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
    print('\n')
    if resp["HumanLoopStatus"] == "Completed":
        completed_human_loops.append(resp)
When all tasks are complete, Amazon A2I stores results in your S3 bucket and sends an Amazon CloudWatch event (you can check for these on your AWS Management Console). Your results should be available in the S3 bucket OUTPUT_PATH when all work is complete. You can print the results with the following code:
import re
import pprint
pp = pprint.PrettyPrinter(indent=4)
for resp in completed_human_loops:
    splitted_string = re.split('s3://' + BUCKET + '/', resp['HumanLoopOutput']['OutputS3Uri'])
    output_bucket_key = splitted_string[1]
    response = s3.get_object(Bucket=BUCKET, Key=output_bucket_key)
    content = response["Body"].read()
    json_output = json.loads(content)
    pp.pprint(json_output)
    print('\n')
Step 3: Building a custom vocabulary using Amazon A2I results
You can parse the corrected transcriptions from your human reviewers to identify the domain-specific terms you want to add to a custom vocabulary. To get a list of all human-reviewed words, enter the following code:
corrected_words = []
for resp in completed_human_loops:
    splitted_string = re.split('s3://' + BUCKET + '/', resp['HumanLoopOutput']['OutputS3Uri'])
    output_bucket_key = splitted_string[1]
    response = s3.get_object(Bucket=BUCKET, Key=output_bucket_key)
    content = response["Body"].read()
    json_output = json.loads(content)
    # add the human-reviewed answers split by spaces
    corrected_words += json_output['humanAnswers'][0]['answerContent']['transcription'].split(" ")
We want to parse through these words and look for uncommon English words. An easy way to do this is to use a large English corpus and verify if our human-reviewed words exist in this corpus. In this use case, we use an English-language corpus from Natural Language Toolkit (NLTK), a suite of open-source, community-driven libraries for natural language processing research. See the following code:
# Create dictionary of English words
# Note that this corpus of words is not 100% exhaustive
import nltk
nltk.download('words')
from nltk.corpus import words
my_dict=set(words.words())
word_set = set([])
for word in remove_contractions(corrected_words):
    if word:
        if word.lower() not in my_dict:
            if word.endswith('s') and word[:-1] in my_dict:
                print("")
            elif word.endswith("'s") and word[:-2] in my_dict:
                print("")
            else:
                word_set.add(word)

for word in word_set:
    print(word)
The words you find may vary depending on which videos you’ve transcribed and what threshold you’ve used. The following code is an example of output from the Amazon A2I results of the first and third videos from the playlist (see the Getting Started section earlier):
including
machine-learning
grabbing
amazon
boto3
started
t3
called
sarab
ecr
using
ebs
internet
jupyter
distributing
opt/ml
optimized
desktop
tokenizing
s3
sdk
encrypted
relying
sagemaker
datasets
upload
iam
gonna
managing
wanna
vpc
managed
mars.r
ec2
blazingtext
With these technical terms, you can now more easily manually create a custom vocabulary of those terms that we want Amazon Transcribe to recognize. You can use a custom vocabulary table to tell Amazon Transcribe how each technical term is pronounced and how it should be displayed. For more information on custom vocabulary tables, see Create a Custom Vocabulary Using a Table.
As you process additional videos on the same topic, you can keep updating this list, and the number of new technical terms you need to add will likely decrease with each new video.
We built a custom vocabulary (see the following code) using parsed Amazon A2I results from the first and third videos with a 0.5 THRESHOLD confidence value. You can use this vocabulary for the rest of the notebook:
finalized_words=[['Phrase','IPA','SoundsLike','DisplayAs'], # This top line denotes the column headers of the text file.
['machine-learning','','','machine learning'],
['amazon','','am-uh-zon','Amazon'],
['boto-three','','boe-toe-three','Boto3'],
['T.-three','','tee-three','T3'],
['Sarab','','suh-rob','Sarab'],
['E.C.R.','','ee-see-are','ECR'],
['E.B.S.','','ee-bee-ess','EBS'],
['jupyter','','joo-pih-ter','Jupyter'],
['opt-M.L.','','opt-em-ell','/opt/ml'],
['desktop','','desk-top','desktop'],
['S.-Three','','ess-three','S3'],
['S.D.K.','','ess-dee-kay','SDK'],
['sagemaker','','sage-may-ker','SageMaker'],
['mars-dot-r','','mars-dot-are','mars.R'],
['I.A.M.','','eye-ay-em','IAM'],
['V.P.C.','','','VPC'],
['E.C.-Two','','ee-see-too','EC2'],
['blazing-text','','','BlazingText'],
]
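Amazon Transcribe expects the vocabulary table as a tab-delimited text file in Amazon S3. The following is a minimal sketch of that step, assuming a hypothetical file name of custom_vocab.txt (the notebook performs the equivalent step for you):
import boto3

custom_vocab_file_name = "custom_vocab.txt"  # hypothetical file name

# Write the table rows (header first) as tab-delimited lines
with open(custom_vocab_file_name, "w") as f:
    for row in finalized_words:
        f.write("\t".join(row) + "\n")

# Upload the vocabulary file to the bucket used throughout this notebook
s3_client = boto3.client("s3")
s3_client.upload_file(custom_vocab_file_name, BUCKET, custom_vocab_file_name)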
After saving your custom vocabulary table to a text file and uploading it to an S3 bucket, create your custom vocabulary with a specified name so Amazon Transcribe can use it:
# The name of your custom vocabulary must be unique!
vocab_improved='sagemaker-custom-vocab'
transcribe = boto3.client("transcribe")
response = transcribe.create_vocabulary(
VocabularyName=vocab_improved,
LanguageCode='en-US',
VocabularyFileUri='s3://' + BUCKET + '/' + custom_vocab_file_name
)
pp.pprint(response)
Wait until the VocabularyState displays READY before continuing. This typically takes up to a few minutes. See the following code:
# Wait for the status of the vocab you created to finish
while True:
    response = transcribe.get_vocabulary(
        VocabularyName=vocab_improved
    )
    status = response['VocabularyState']
    if status in ['READY', 'FAILED']:
        print(status)
        break
    print("Not ready yet...")
    time.sleep(5)
Step 4: Improving transcription using custom vocabulary
After you create your custom vocabulary, you can call your transcribe function to start another transcription job, this time with your custom vocabulary. See the following code:
job_name_custom_vid_0='AWS-custom-0-using-' + vocab_improved + str(time_now)
job_names_custom = [job_name_custom_vid_0]
transcribe(job_name_custom_vid_0, folder_path+all_videos[0], BUCKET, vocab_name=vocab_improved)
Wait for the status of your transcription job to display COMPLETED again.
Write the new transcripts to new .txt files with the following code:
# Save the improved transcripts
i = 1
for list_ in all_sentences_and_times_custom:
    file = open(f"improved_transcript_{i}.txt", "w")
    for tup in list_:
        file.write(tup['sentence'] + "\n")
    file.close()
    i = i + 1
Results and analysis
Up to this point, you may have completed this use case with a single video. The remainder of this post refers to the four videos that we used to analyze the results of this workflow. For more information, see the Getting Started section at the beginning of this post.
To analyze metrics on a larger sample size for this workflow, we generated a ground truth transcript in advance, a transcription before the custom vocabulary, and a transcription after the custom vocabulary for each video in the playlist.
The first and third videos are the in-sample videos used to build the custom vocabulary you saw earlier. The second and fourth videos are used as out-sample videos to test Amazon Transcribe again after building the custom vocabulary. Run the associated code blocks to download these transcripts.
Comparing word error rates
The most common metric for speech recognition accuracy is the word error rate (WER), defined as WER = (S + D + I) / N, where S, D, and I are the number of substitution, deletion, and insertion operations, respectively, needed to get from the output transcript to the ground truth, and N is the total number of words in the ground truth. This can be broadly interpreted as the proportion of transcription errors relative to the number of words that were actually said.
We use a lightweight open-source Python library called JiWER for calculating WER between transcripts. See the following code:
!pip install jiwer
from jiwer import wer
import jiwer
For more information, see JiWER: Similarity measures for automatic speech recognition evaluation.
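The notebook handles loading and normalizing the transcripts; the following is a minimal sketch of the comparison itself, assuming hypothetical file names for the hand-made ground truth transcripts:
from jiwer import wer

# Hypothetical file names; the ground truth transcript was prepared by hand
with open("ground_truth_transcript_1.txt") as f:
    ground_truth = f.read()
with open("originaltranscript.txt") as f:
    baseline_transcript = f.read()
with open("improved_transcript_1.txt") as f:
    improved_transcript = f.read()

baseline_wer = wer(ground_truth, baseline_transcript)
improved_wer = wer(ground_truth, improved_transcript)

print(f"The baseline WER (before using custom vocabularies) is {baseline_wer:.2%}.")
print(f"The WER (after using custom vocabularies) is {improved_wer:.2%}.")
print(f"The percentage change in WER score is {100 * (improved_wer - baseline_wer) / baseline_wer:.1f}%.")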
We calculate our metrics for the in-sample videos (the videos that were used to build the custom vocabulary). Using the code from the notebook, we get the following output:
===== In-sample videos =====
Processing video #1
The baseline WER (before using custom vocabularies) is 5.18%.
The WER (after using custom vocabularies) is 2.62%.
The percentage change in WER score is -49.4%.
Processing video #3
The baseline WER (before using custom vocabularies) is 11.94%.
The WER (after using custom vocabularies) is 7.84%.
The percentage change in WER score is -34.4%.
Calculating our metrics for the out-sample videos (the videos that Amazon Transcribe hasn’t seen before) gives the following output:
===== Out-sample videos =====
Processing video #2
The baseline WER (before using custom vocabularies) is 7.55%.
The WER (after using custom vocabularies) is 6.56%.
The percentage change in WER score is -13.1%.
Processing video #4
The baseline WER (before using custom vocabularies) is 10.91%.
The WER (after using custom vocabularies) is 8.98%.
The percentage change in WER score is -17.6%.
Reviewing the results
The following table summarizes the changes in WER scores.
If we consider absolute WER scores, the initial WER of 5.18%, for instance, might be sufficiently low for some use cases—that’s only around 1 in 20 words that are mis-transcribed! However, this rate can be insufficient for other purposes, because domain-specific terms are often the least common words spoken (relative to frequent words such as “to,” “and,” or “I”) but the most commonly mis-transcribed. For applications like search engine optimization (SEO) and video organization by topic, you may want to ensure that these technical terms are transcribed correctly. In this section, we look at how our custom vocabulary impacted the transcription rates of several important technical terms.
Metrics for specific technical terms
For this post, ground truth refers to the true transcript that was transcribed by hand, original transcript refers to the transcription before applying the custom vocabulary, and new transcript refers to the transcription after applying the custom vocabulary.
In-sample videos
The following table shows the transcription rates for video 1.
The following table shows the transcription rates for video 3.
Out-sample videos
The following table shows the transcription rates for video 2.
The following table shows the transcription rates for video 4.
Using custom vocabularies resulted in an increase of 80 percentage points or more in the number of correctly transcribed technical terms. A majority of the time, using a custom vocabulary resulted in 100% accuracy in transcribing these domain-specific terms. It looks like using custom vocabularies was worth the effort after all!
Cleaning up
To avoid incurring unnecessary charges, delete resources when not in use, including your S3 bucket, human review workflow, transcription job, and Amazon SageMaker notebook instance. For instructions, see the following, respectively:
- How do I delete an S3 Bucket?
- Delete a Flow Definition
- DeleteTranscriptionJob
- Cleanup: SageMaker Resources
Conclusion
In this post, you saw how you can use Amazon A2I human review workflows and Amazon Transcribe custom vocabularies to improve automated video transcriptions. This walkthrough allows you to quickly identify domain-specific terms and use them to build a custom vocabulary so that future mentions of these terms are transcribed with greater accuracy, at scale. Transcribing key technical terms correctly may be important for SEO, enabling highly specific textual queries, and grouping large quantities of video or audio files by technical terms.
The full proof-of-concept Jupyter notebook can be found in the GitHub repo. For video presentations, sample Jupyter notebooks, and more information about use cases like document processing, content moderation, sentiment analysis, object detection, text translation, and more, see Amazon Augmented AI Resources.
About the Authors
Jasper Huang is a Technical Writer Intern at AWS and a student at the University of Pennsylvania pursuing a BS and MS in computer science. His interests include cloud computing, machine learning, and how these technologies can be leveraged to solve interesting and complex problems. Outside of work, you can find Jasper playing tennis, hiking, or reading about emerging trends.
Talia Chopra is a Technical Writer at AWS specializing in machine learning and artificial intelligence. She works with multiple teams in AWS to create technical documentation and tutorials for customers using Amazon SageMaker, MXNet, and AutoGluon. In her free time, she enjoys meditating, studying machine learning, and taking walks in nature.
This month in AWS Machine Learning: July 2020 edition
Every day there is something new going on in the world of AWS Machine Learning—from launches to new use cases like posture detection to interactive trainings like the AWS Power Hour: Machine Learning on Twitch. We’re packaging some of the not-to-miss information from the ML Blog and beyond for easy perusing each month. Check back at the end of each month for the latest roundup.

See use case section for how to build a posture tracker project with AWS DeepLens
Launches
As models become more sophisticated, AWS customers are increasingly applying machine learning (ML) prediction to video content, whether that’s in media and entertainment, autonomous driving, or more. At AWS, we had the following exciting July launches:
- On July 9, we announced that SageMaker Ground Truth now supports video labeling. The National Football League (NFL) has already put this new feature to work to develop labels for training a computer vision system that tracks all 22 players as they move on the field during plays. Amazon SageMaker Ground Truth reduced the timeline for developing a high-quality labeling dataset by more than 80%.
- On July 13, we launched the availability of AWS DeepRacer Evo and Sensor Kit for purchase. AWS DeepRacer Evo is available for a limited-time, discounted price of $399, a savings of $199 off the regular bundle price of $598, and the AWS DeepRacer Sensor Kit is available for $149, a savings of $100 off the regular price of $249. AWS DeepRacer is a fully autonomous 1/18th scale race car powered by reinforcement learning (RL) that gives ML developers of all skill levels the opportunity to learn and build their ML skills in a fun and competitive way. AWS DeepRacer Evo includes new features and capabilities to help you learn more about ML through the addition of sensors that enable object avoidance and head-to-head racing. Both items are available on Amazon.com for shipping in the US only.
- On July 23, we announced that Contact Lens for Amazon Connect is now generally available. Contact Lens is a set of capabilities for Amazon Connect enabled by ML that gives contact centers the ability to understand the sentiment, trends, and compliance of customer conversations to improve their experience and identify crucial feedback.
- As of July 28, Amazon Fraud Detector is now generally available. Amazon Fraud Detector is a fully managed service that makes it easy to identify potentially fraudulent online activities such as online payment fraud and the creation of fake accounts. It uses your data, ML, and more than 20 years of fraud detection expertise from Amazon to automatically identify potentially fraudulent online activity so you can catch more fraud faster.
- Develop your own custom genre model to create AI-generated tunes in our latest AWS DeepComposer Chartbusters Challenge, Spin the Model. Submit your entries by 8/23 & see if you can top the #AI charts on SoundCloud for a chance to win some great prizes.
Use cases
Get ideas and architectures from AWS customers, partners, ML Heroes, and AWS experts on how to apply ML to your use case:
- AWS Machine Learning Hero Cyrus Wong shares how he developed Callouts, a simple, consistent, and scalable solution for educators to communicate with students using Amazon Connect and Amazon Lex. Learn how you can build your own scalable outbound call engine.
- The products Atlassian builds have hundreds of developers working on them, composed of a mixture of monolithic applications and microservices. When an incident occurs, it can be hard to diagnose the root cause due to the high rate of change within the code bases. Learn how implementing Amazon CodeGuru Profiler has enabled Atlassian developers to own and take action on performance engineering.
- Given the shift in customer trends to audio consumption, Amazon Polly launched a new speaking style focusing on the publishing industry: the Newscaster speaking style. Learn how the Newscaster voice was built and how you can use the Newscaster voice with your content in a few simple steps.
- Personalized recommendations can help improve customer engagement and conversion. Learn how Pulselive, a digital media sports technology company, increased video consumption by 20% with Amazon Personalize.
- Working from home can be a big change to your ergonomic setup, which can make it hard for you to keep a healthy posture and take frequent breaks throughout the day. To help you maintain good posture and have fun with ML in the process, you can build a posture tracker project with AWS DeepLens, the AWS programmable video camera for developers to learn ML.
Explore more ML stories
Want more news about developments in ML? Check out the following stories:
- Formula 1 Pit Strategy Battle – Take a deep dive into how the Amazon ML Solutions Lab and Professional Services Teams worked with Formula 1 to build a real-time race strategy prediction application using AWS technology that brings pit wall decisions to the viewer, and resulted in the Pit Strategy Battle graphic. You can also learn how a serverless architecture can provide ML predictions with minimal latency across the globe, and how to get started on your own ML journey.
- Fairness in AI – At the seventh Workshop on Automated Machine Learning (AutoML) at the International Conference on Machine Learning, Amazon researchers won a best paper award for the paper “Fair Bayesian Optimization.” The paper addresses the problem of ensuring the fairness of AI systems, a topic that has drawn increasing attention in recent years. Learn more about the research findings at Amazon.Science.
Mark your calendars
Join us for the following exciting ML events:
- Have fun while learning how to build, train, and deploy ML models with Amazon SageMaker Fridays. Join our expert ML Specialists Emily Webber and Alex McClure for a live session on Twitch. Register now!
- AWS Power Hour: Machine Learning is a weekly, live-streamed program that premiered Thursday, July 23, at 7:00 p.m. EST and will air at that time every Thursday for 7 weeks.
Also, if you missed it, see the Amazon Augmented AI (Amazon A2I) Tech Talk to learn how you can implement human reviews to review your ML predictions from Amazon Textract, Amazon Rekognition, Amazon Comprehend, Amazon SageMaker, and other AWS AI/ML services.
See you next month for more on AWS ML!
About the author
Laura Jones is a product marketing lead for AWS AI/ML where she focuses on sharing the stories of AWS’s customers and educating organizations on the impact of machine learning. As a Florida native living and surviving in rainy Seattle, she enjoys coffee, attempting to ski, and spending time in the great outdoors.
Enhancing recommendation filters by filtering on item metadata with Amazon Personalize
We’re pleased to announce enhancements to recommendation filters in Amazon Personalize, which give you greater control over the recommendations your users receive by allowing you to exclude or include items based on criteria that you define. For example, when recommending products for your e-retail store, you can exclude unavailable items from recommendations. If you’re recommending videos to users, you can choose to only recommend premium content if the user is in a particular subscription tier. You typically address this by writing custom code to implement your business rules, but you can now save time and streamline your architectures by using recommendation filters in Amazon Personalize.
Based on over 20 years of personalization experience, Amazon Personalize enables you to improve customer engagement by powering personalized product and content recommendations and targeted marketing promotions. Amazon Personalize uses machine learning (ML) to create high-quality recommendations for your websites and applications. You can get started without any prior ML experience using simple APIs to easily build sophisticated personalization capabilities in just a few clicks. Amazon Personalize processes and examines your data, identifies what is meaningful, automatically picks the right ML algorithm, and trains and optimizes a custom model based on your data. All of your data is encrypted to be private and secure, and is only used to create recommendations for your users.
Setting up and using recommendation filters is simple, taking only a few minutes to define and deploy your custom business rules with a real-time campaign. You can use the Amazon Personalize console or API to create a filter with your business logic using the Amazon Personalize domain specific language (DSL). You can apply this filter while querying for real-time recommendations using the GetRecommendations or GetPersonalizedRanking API, or while generating recommendations in batch mode through a batch inference job.
This post walks you through setting up and using item and user metadata-based recommendation filters in Amazon Personalize.
Prerequisites
To define and apply filters, you first need to set up the following Amazon Personalize resources. For instructions on the Amazon Personalize console, see Getting Started (Console).
- Create a dataset group.
- Create an Interactions dataset using the following schema and import data using the interactions-100k.csv data file:

{
  "type": "record",
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "EVENT_VALUE", "type": [ "null", "float" ] },
    { "name": "TIMESTAMP", "type": "long" },
    { "name": "EVENT_TYPE", "type": "string" }
  ],
  "version": "1.0"
}

- Create an Items dataset using the following schema and import data using the csv data file:

{
  "type": "record",
  "name": "Items",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "ITEM_ID", "type": "string" },
    { "name": "GENRE", "type": "string", "categorical": true }
  ],
  "version": "1.0"
}

- Create a solution using any recipe. In this post, we use the aws-hrnn recipe.
- Create a campaign.
Creating your filter
Now that you have set up your Amazon Personalize resources, you can define and test custom filters.
Filter expression language
Amazon Personalize uses its own DSL called filter expressions to determine which items to exclude or include in a set of recommendations. Filter expressions are scoped to a dataset group. You can only use them to filter results for solution versions (an Amazon Personalize model trained using your datasets in the dataset group) or campaigns (a deployed solution version for real-time recommendations). Amazon Personalize can filter items based on user-item interaction, item metadata, or user metadata datasets.
The following are some examples of filter expressions by item:
- To remove all items in the "Comedy" genre, use the following filter expression:

EXCLUDE ItemId WHERE item.genre in ("Comedy")

- To include items with a "number of downloads" less than 20, use the following filter expression:

INCLUDE ItemId WHERE item.number_of_downloads < 20
The following are some examples of filter expressions by interaction:
- To remove items that have been clicked or streamed by a user, use the following filter expression:
EXCLUDE ItemId WHERE interaction.event_type in ("click", "stream")
- To include items that a user has interacted with in any way, use the following filter expression:
INCLUDE ItemId WHERE interactions.event_type in ("*")
You can also filter by user:
- To exclude items where the number of downloads is less than 20 if the current user’s age is over 18 but less than 30, use the following filter expression:
EXCLUDE ItemId WHERE item.number_of_downloads < 20 IF CurrentUser.age > 18 AND CurrentUser.age < 30
You can also chain multiple expressions together, allowing you to pass the result of one expression to another in the same filter using a pipe ( | ) to separate them:
- The following filter expression example includes two expressions. The first expression includes items in the "Comedy" genre, and the result of this filter is passed to another expression that excludes items with the description "classic":

INCLUDE ItemId WHERE item.genre IN ("Comedy") | EXCLUDE ItemId WHERE item.description IN ("classic")
For more information, see Datasets and Schemas. For more information about filter definition DSL, see Filtering Recommendations.
Creating a filter on the console
You can use the preceding DSL to create a custom filter on the Amazon Personalize console. To create a filter, complete the following steps:
- On the Amazon Personalize console, choose Filters.
- Choose Create filter.
- For Filter name, enter the name for your filter.
- For Expression, select Build expression.
Alternatively, you can add your expression manually.
- To chain additional expressions with your filter, choose +.
- To add additional filter expressions, choose Add expression.
- Choose Finish.
Creating a filter takes you to a page containing detailed information about your filter. You can view more information about your filter, including the filter ARN and the corresponding filter expression you created. You can also delete filters on this page or create more filters from the summary page.
You can also create filters via the createFilter API in Amazon Personalize. For more information, see Filtering Recommendations.
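A minimal sketch of that call (the filter name and dataset group ARN are placeholders; the expression excludes the "Action" genre used in the example that follows):
import boto3

personalize = boto3.client("personalize")

create_filter_response = personalize.create_filter(
    name="exclude-action-genre",  # placeholder filter name
    datasetGroupArn="arn:aws:personalize:us-west-2:000000000000:dataset-group/test-dataset-group",  # placeholder
    filterExpression='EXCLUDE ItemId WHERE item.genre in ("Action")',
)
filter_arn = create_filter_response["filterArn"]  # pass this ARN when requesting recommendations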
Applying your filter to real-time recommendations on the console
The Amazon Personalize console allows you to spot-check real-time recommendations on the Campaigns page. From this page, you can test your filters while retrieving recommendations for a specific user on demand. To do so, navigate to the Campaigns tab; this should be in the same dataset group that you used to create the filter. You can then test the impact of applying the filter on the recommendations.
Recommendations without a filter
The following screenshot shows recommendations returned with no filter applied.
Recommendations with a filter
The following screenshot shows results after you remove the "Action" genre from the recommendations by applying the filter we previously defined.
If we investigate the Items dataset provided in this example, item 546 is in the genre "Action" and we excluded the "Action" genre in our filter.
This information tells us that item 546 should be excluded from recommendations. The results show that the filter removed items in the genre "Action" from the recommendations.
Applying your filter to batch recommendations on the console
To apply a filter to batch recommendations on the console, follow the same process as real-time recommendations. On the Create batch inference job page, choose the filter name to apply a previously created filter to your batch recommendations.
Applying your filter to real-time recommendations through the SDK
You can also apply filters to recommendations that are served through your SDK or APIs by supplying the filterArn as an additional and optional parameter to your GetRecommendations calls. Use "filterArn" as the parameter key and supply the filterArn as a string for the value. filterArn is a unique identifying key that the CreateFilter API call returns. You can also find a filter’s ARN on the filter’s detailed information page.
The following example code is a request body for the GetRecommendations
API that applies a filter to a recommendation:
{
"campaignArn": "arn:aws:personalize:us-west-2:000000000000:campaign/test-campaign",
"userId": "1",
"itemId": "1",
"numResults": 5,
"filterArn": "arn:aws:personalize:us-west-2:000000000000:filter/test-filter"
}
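Equivalently, with the AWS SDK for Python (Boto3), the request might look like the following (the ARNs are placeholders):
import boto3

personalize_runtime = boto3.client("personalize-runtime")

response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-west-2:000000000000:campaign/test-campaign",  # placeholder
    userId="1",
    numResults=5,
    filterArn="arn:aws:personalize:us-west-2:000000000000:filter/test-filter",  # placeholder
)
for item in response["itemList"]:
    print(item["itemId"])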
Applying your filter to batch recommendations through the SDK
To apply filters to batch recommendations when using an SDK, you provide the filterArn in the request body as an optional parameter. Use "filterArn" as the key and the filterArn as the value.
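A sketch of that request through Boto3, with placeholder names, paths, and ARNs:
import boto3

personalize = boto3.client("personalize")

personalize.create_batch_inference_job(
    jobName="filtered-batch-recommendations",  # placeholder
    solutionVersionArn="arn:aws:personalize:us-west-2:000000000000:solution/test-solution/1",  # placeholder
    filterArn="arn:aws:personalize:us-west-2:000000000000:filter/test-filter",  # placeholder
    jobInput={"s3DataSource": {"path": "s3://your-bucket/batch-input/users.json"}},  # placeholder path
    jobOutput={"s3DataDestination": {"path": "s3://your-bucket/batch-output/"}},  # placeholder path
    roleArn="arn:aws:iam::000000000000:role/PersonalizeBatchRole",  # placeholder
)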
Summary
Customizable recommendation filters in Amazon Personalize allow you to fine-tune recommendations to provide more tailored experiences that improve customer engagement and conversion according to your business needs without having to implement post-processing logic on your own. For more information about optimizing your user experience with Amazon Personalize, see What Is Amazon Personalize?
About the Author
Matt Chwastek is a Senior Product Manager for Amazon Personalize. He focuses on delivering products that make it easier to build and use machine learning solutions. In his spare time, he enjoys reading and photography.
Code-free machine learning: AutoML with AutoGluon, Amazon SageMaker, and AWS Lambda
One of AWS’s goals is to put machine learning (ML) in the hands of every developer. With the open-source AutoML library AutoGluon, deployed using Amazon SageMaker and AWS Lambda, we can take this a step further, putting ML in the hands of anyone who wants to make predictions based on data—no prior programming or data science expertise required.
AutoGluon automates ML for real-world applications involving image, text, and tabular datasets. AutoGluon trains multiple ML models to predict a particular feature value (the target value) based on the values of other features for a given observation. During training, the models learn by comparing their predicted target values to the actual target values available in the training data, using appropriate algorithms to improve their predictions accordingly. When training is complete, the resulting models can predict the target feature values for observations they have never seen before, even if you don’t know their actual target values.
AutoGluon automatically applies a variety of techniques to train models on data with a single high-level API call—you don’t need to build models manually. Based on a user-configurable evaluation metric, AutoGluon automatically selects the highest-performing combination, or ensemble, of models. For more information about how AutoGluon works, see Machine learning with AutoGluon, an open source AutoML library.
To get started with AutoGluon, see the AutoGluon GitHub repo. For more information about trying out sophisticated AutoML solutions in your applications, see the AutoGluon website. Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy ML models efficiently. AWS Lambda lets you run code without provisioning or managing servers, can be triggered automatically by other AWS services like Amazon Simple Storage Service (Amazon S3), and allows you to build a variety of real-time data processing systems.
With AutoGluon, you can achieve state-of-the-art predictive performance on new observations with as few as three lines of Python code. In this post, we achieve the same results with zero lines of code—making AutoML accessible to non-developers—by using AWS services to deploy a pipeline that trains ML models and makes predictions on tabular data using AutoGluon. After deploying the pipeline in your AWS account, all you need to do to get state-of-the-art predictions on your data is upload it to an S3 bucket with a provided AutoGluon package.
The code-free ML pipeline
The pipeline starts with an S3 bucket, which is where you upload the training data that AutoGluon uses to build your models, the testing data you want to make predictions on, and a pre-made package containing a script that sets up AutoGluon. After you upload the data to Amazon S3, a Lambda function kicks off an Amazon SageMaker model training job that runs the pre-made AutoGluon script on the training data. When the training job is finished, AutoGluon’s best-performing model makes predictions on the testing data, and these predictions are saved back to the same S3 bucket. The following diagram illustrates this architecture.
Deploying the pipeline with AWS CloudFormation
You can deploy this pipeline automatically in an AWS account using a pre-made AWS CloudFormation template. To get started, complete the following steps:
- Choose the AWS Region in which you’d like to deploy the template: Northern Virginia, Oregon, Ireland, or Sydney. If you’d like to deploy it in another Region, download the template from GitHub and upload it to AWS CloudFormation yourself.
- Sign in to the AWS Management Console.
- For Stack name, enter a name for your stack (for example, code-free-automl-stack).
- For BucketName, enter a unique name for your S3 bucket (for example, code-free-automl-yournamehere).
- For TrainingInstanceType, enter your compute instance.
This parameter controls the instance type Amazon SageMaker model training jobs use to run AutoGluon on your data. AutoGluon is optimized for the m5 instance type, and 50 hours of Amazon SageMaker training time with the m5.xlarge instance type are included as part of the AWS Free Tier. We recommend starting there and adjusting the instance type up or down based on how long your initial job takes and how quickly you need the results.
- Select the IAM creation acknowledgement checkbox and choose Create stack.
- Continue with the AWS CloudFormation wizard until you arrive at the Stacks page.
It takes a moment for AWS CloudFormation to create all the pipeline’s resources. When you see the CREATE_COMPLETE status (you may need to refresh the page), the pipeline is ready for use.
- To see all the components shown in the architecture, choose the Resources tab.
- To navigate to the S3 bucket, choose the corresponding link.
Before you can use the pipeline, you have to upload the pre-made AutoGluon package to your new S3 bucket.
- Create a folder called source in that bucket.
- Upload the sourcedir.tar.gz package there; keep the default object settings.
Your pipeline is now ready for use!
Preparing the training data
To prepare your training data, go back to the root of the bucket (where you see the source folder) and make a new directory called data; this is where you upload your data.
Gather the data you want your models to learn from (the training data). The pipeline is designed to make predictions for tabular data, the most common form of data in real-world applications. Think of it like a spreadsheet; each column represents the measurement of some variable (feature value), and each row represents an individual data point (observation).
For each observation, your training dataset must include columns for explanatory features and the target column containing the feature value you want your models to predict.
Store the training data in a CSV file called <Name>_train.csv, where <Name> can be replaced with anything.
Make sure that the header name of the desired target column (the value of the very first row of the column) is set to target so AutoGluon recognizes it. See the following screenshot of an example dataset.
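As an illustration, a hypothetical <Name>_train.csv might contain rows like the following, with explanatory feature columns followed by the target column:
age,workclass,education,hours_per_week,target
39,State-gov,Bachelors,40,<=50K
50,Self-emp,Bachelors,13,<=50K
38,Private,HS-grad,60,>50K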
Preparing the test data
You also need to provide the testing data you want to make predictions for. If this dataset already contains values for the target column, you can compare these actual values to your model’s predictions to evaluate the quality of the model.
Store the testing dataset in another CSV file called <Name>_test.csv
, replacing <Name> with the same string you chose for the corresponding training data.
Make sure that the column names match those of <Name>_train.csv
, including naming the target column target
(if present).
Upload the <Name>_train.csv
and <Name>_test.csv
files to the data folder you made earlier in your S3 bucket.
The code-free ML pipeline kicks off automatically when the upload is finished.
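If you prefer to upload from a script rather than the console, a minimal boto3 sketch like the following does the same thing; the bucket name and file names are examples only and should match what you chose earlier:

import boto3

s3 = boto3.client("s3")
bucket = "code-free-automl-yournamehere"  # the bucket name you entered at stack creation

# Upload the training and testing files into the data/ prefix;
# keep the <Name>_train.csv / <Name>_test.csv naming convention.
s3.upload_file("churn_train.csv", bucket, "data/churn_train.csv")
s3.upload_file("churn_test.csv", bucket, "data/churn_test.csv")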
Training the model
When the training and testing dataset files are uploaded to Amazon S3, the bucket emits an event notification that automatically triggers the Lambda function. This function launches the Amazon SageMaker training job that uses AutoGluon to train an ensemble of ML models. You can view the job’s status on the Amazon SageMaker console, in the Training jobs section (see the following screenshot).
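For reference, a heavily simplified sketch of what such a trigger function could look like follows. The container image URI, role ARN, and environment variable names are placeholders, and the function actually deployed by the CloudFormation template may differ in its details:

import os
import time
import boto3

sm = boto3.client("sagemaker")

def lambda_handler(event, context):
    # The S3 upload event tells us which bucket received new data.
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    job_name = "autogluon-" + time.strftime("%Y%m%d-%H%M%S")

    sm.create_training_job(
        TrainingJobName=job_name,
        AlgorithmSpecification={
            # Placeholder: the stack uses a prebuilt container that runs sourcedir.tar.gz.
            "TrainingImage": os.environ["TRAINING_IMAGE_URI"],
            "TrainingInputMode": "File",
        },
        RoleArn=os.environ["SAGEMAKER_ROLE_ARN"],
        InputDataConfig=[{
            "ChannelName": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": f"s3://{bucket}/data/",
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        OutputDataConfig={"S3OutputPath": f"s3://{bucket}/results/"},
        ResourceConfig={
            "InstanceType": os.environ.get("TRAINING_INSTANCE_TYPE", "ml.m5.xlarge"),
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        StoppingCondition={"MaxRuntimeInSeconds": 24 * 60 * 60},
    )
    return {"statusCode": 200, "body": job_name}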
Performing inference
When the training job is complete, the best-performing model or weighted combination of models (as determined by AutoGluon) is used to compute predictions for the target feature value of each observation in the testing dataset. These predictions are automatically stored in a new directory within a results
directory in your S3 bucket, with the filename <Name>_test_predictions.csv
.
AutoGluon produces other useful output files, such as <Name>_leaderboard.csv
(a ranking of each individual model trained by AutoGluon and its predictive performance) and <Name>_model_performance.txt
(an extended list of metrics corresponding to the best-performing model). All these files are available for download to your local machine from the Amazon S3 console (see the following screenshot).
Extensions
The trained model artifact from AutoGluon’s best-performing model is also saved in the output folder (see the following screenshot).
You can extend this solution by deploying that trained model as an Amazon SageMaker inference endpoint to make predictions on new data in real time or by running an Amazon SageMaker batch transform job to make predictions on additional testing data files. For more information, see Work with Existing Model Data and Training Jobs.
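As a rough sketch of the real-time option, and assuming you have located the trained model artifact in Amazon S3 and an AutoGluon-compatible inference container image, deployment with the SageMaker Python SDK could look like the following; every value shown is a placeholder:

from sagemaker.model import Model

# Placeholders: point these at your trained artifact, a compatible inference container,
# and a SageMaker execution role in your account.
model = Model(
    image_uri="<inference-container-image-uri>",
    model_data="s3://code-free-automl-yournamehere/results/<job-name>/output/model.tar.gz",
    role="<sagemaker-execution-role-arn>",
)

# Creates a real-time HTTPS endpoint backed by one ml.m5.xlarge instance.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")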
You can also reuse this automated pipeline with custom model code by replacing the AutoGluon sourcedir.tar.gz
package we prepared for you in the source
folder. If you unzip that package and look at the Python script inside, you can see that it simply runs AutoGluon on your data. You can adjust some of the parameters defined there to better match your use case. That script and all the other resources used to set up this pipeline are freely available in this GitHub repository.
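As a rough illustration of what that script boils down to, the following sketch uses the current AutoGluon tabular API with hypothetical file names; the packaged script may use an older interface and pass additional parameters:

import pandas as pd
from autogluon.tabular import TabularPredictor

train_data = pd.read_csv("churn_train.csv")   # hypothetical file names
test_data = pd.read_csv("churn_test.csv")

# AutoGluon trains and ensembles several models, using the column named "target" as the label.
predictor = TabularPredictor(label="target").fit(train_data)

predictions = predictor.predict(test_data)
predictions.to_csv("churn_test_predictions.csv", index=False)

# Rough equivalent of the leaderboard file produced by the pipeline.
print(predictor.leaderboard())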
Cleaning up
The pipeline doesn’t cost you anything more to leave up in your account because it only uses fully managed compute resources on demand. However, if you want to clean it up, simply delete all the files in your S3 bucket and delete the launched CloudFormation stack. Make sure to delete the files first; AWS CloudFormation doesn’t automatically delete an S3 bucket with files inside.
To delete the files from your S3 bucket, on the Amazon S3 console, select the files and choose Delete from the Actions drop-down menu.
To delete the CloudFormation stack, on the AWS CloudFormation console, choose the stack and choose Delete.
In the confirmation window, choose Delete stack.
Conclusion
In this post, we demonstrated how to train ML models and make predictions without writing a single line of code—thanks to AutoGluon, Amazon SageMaker, and AWS Lambda. You can use this code-free pipeline to leverage the power of ML without any prior programming or data science expertise.
If you’re interested in getting more guidance on how you can best use ML in your organization’s products and processes, you can work with the Amazon ML Solutions Lab. The Amazon ML Solutions Lab pairs your team with Amazon ML experts to prepare data, build and train models, and put models into production. It combines hands-on educational workshops with brainstorming sessions and advisory professional services to help you work backward from business challenges, and go step-by-step through the process of developing ML-based solutions. At the end of the program, you can take what you have learned through the process and use it elsewhere in your organization to apply ML to business opportunities.
About the Authors
Abhi Sharma is a deep learning architect on the Amazon ML Solutions Lab team, where he helps AWS customers in a variety of industries leverage machine learning to solve business problems. He is an avid reader, frequent traveler, and driving enthusiast.
Ryan Brand is a Data Scientist in the Amazon Machine Learning Solutions Lab. He has specific experience in applying machine learning to problems in healthcare and the life sciences, and in his free time he enjoys reading history and science fiction.
Tatsuya Arai Ph.D. is a biomedical engineer turned deep learning data scientist on the Amazon Machine Learning Solutions Lab team. He believes in the true democratization of AI and that the power of AI shouldn’t be exclusive to computer scientists or mathematicians.
Announcing the AWS DeepComposer Chartbusters Spin the Model challenge
Whether your jam is reggae, hip-hop, or electronic, you can get creative and enter the latest AWS DeepComposer Chartbusters challenge! The Spin the Model challenge launches today and is open until August 23, 2020. AWS DeepComposer gives developers a creative way to get started with machine learning. Chartbusters is a monthly challenge where you can use AWS DeepComposer to create original compositions and compete to top the charts and win prizes.
To participate in the challenge, you first need to train a model and create a composition using your dataset and the Amazon SageMaker notebook. You don’t need a physical keyboard to participate. Next, you import the composition on the AWS DeepComposer console and submit it to SoundCloud. When you submit a composition, AWS DeepComposer automatically adds it to the Spin the Model challenge playlist on SoundCloud.
You can use the A deep dive into training an AR-CNN model learning capsule available on the AWS DeepComposer console to learn the concepts to train a model. To access the learning capsule, sign in to the AWS DeepComposer console and choose learning capsules in the navigation pane. Choose A deep dive into training an AR-CNN model to begin learning.
Training a model
We have provided a sample notebook to create a custom model. To use the notebook, first create the Amazon SageMaker notebook instance.
- On the Amazon SageMaker console, under Notebook, choose Notebook instances.
- Choose Create notebook instance.
- For Notebook instance type, choose ml.c5.4xlarge.
- For IAM role, choose a new or existing role.
- For Root access, select Enable.
- For Encryption key, choose No Custom Encryption.
- For Repository, choose Clone a public Git repository to this notebook instance only.
- For Git repository URL, enter
https://github.com/aws-samples/aws-deepcomposer-samples
. - Choose Create notebook instance.
- Select your notebook instance and choose Open Jupyter.
- In the ar-cnn folder, choose
AutoRegressiveCNN.ipynb
.
You’re likely prompted to choose a kernel.
- From the drop-down menu, choose conda_tensorflow_p36.
- Choose Set Kernel.
This notebook contains instructions and code to create and train a custom model from scratch.
- To run the code cells, choose the code cell you want to run and choose Run.
If the kernel has an empty circle, it means it’s available and ready to run the code.
If the kernel has a filled circle, it means the kernel is busy. Wait for it to become available before you run the next line of code.
- Provide the path for your dataset in the dataset summary section. Replace the current
data_dir
path with your dataset directory.
Your dataset should be in the .mid format.
- After you provide the dataset directory path, you can experiment by changing the hyperparameters to train the model.
Training a model typically takes 5 hours or longer, depending on the dataset size and hyperparameter choices.
- After you train the model, create a composition by using the code in the Inference section.
You can use the sample input MIDI files provided in the GitHub repo to generate a composition. Alternatively, you can play the input melody in AWS DeepComposer and download the melody to create a new composition.
- After you create your composition, download it by navigating to the /outputs folder and choosing the file to download.
Submitting your composition
You can now import your composition in AWS DeepComposer. This step is necessary to submit the composition to the Spin the Model Chartbusters challenge.
- On the AWS DeepComposer console, choose Input melody.
- For Source of input melody, choose Imported track.
- For Imported track, choose Choose file to upload the file.
- Use the AR-CNN algorithm to further enhance the input melody.
- To submit your composition for the challenge, choose Chartbusters in the navigation pane.
- Choose Submit a composition.
- Choose your composition from the drop-down menu.
- Provide a track name for your composition and choose Submit.
AWS DeepComposer submits your composition to the Spin the Model playlist on SoundCloud. You can choose Vote on SoundCloud on the console to review and listen to other submissions for the challenge.
Conclusion
Congratulations! You have submitted your entry for the AWS DeepComposer Chartbusters challenge. Invite your friends and family to listen to and like your composition!
Learn more about AWS DeepComposer Chartbusters at https://aws.amazon.com/deepcomposer/chartbusters.
About the Author
Jyothi Nookula is a Principal Product Manager for AWS AI devices. She loves to build products that delight her customers. In her spare time, she loves to paint and host charity fund raisers for her art exhibitions.
Announcing the winner for the AWS DeepComposer Chartbusters Bach to the Future challenge
We are excited to announce the top 10 compositions and the winner for the AWS DeepComposer Chartbusters Bach to the Future challenge. AWS DeepComposer gives developers a creative way to get started with machine learning. Chartbusters is a monthly challenge where you can use AWS DeepComposer to create original compositions and compete to top the charts and win prizes. The first challenge, Bach to the Future, required developers to use a new generative AI algorithm provided on the AWS DeepComposer console to create compositions in the style of Bach. It was an intense competition with high-quality submissions, making it a good challenge for our judges to select the chart-toppers!
Top 10 compositions
First, we shortlisted the top 20 compositions based on the total number of customer likes and plays on SoundCloud. Then, our human experts (Mike Miller and Gillian Armstrong) and the AWS DeepComposer AI judge evaluated the compositions for musical quality, creativity, and emotional resonance to select the top 10 ranked compositions.
The winner for the Bach to the Future challenge is… (cue drum roll) Catherine Chui! You can listen to the winning composition on SoundCloud. The top 10 compositions for the Bach to the Future challenge are:
You can listen to the playlist featuring the top 10 compositions on SoundCloud or on the AWS DeepComposer console.
The winner, Catherine Chui, will receive an AWS DeepComposer Chartbusters gold record. Catherine will be telling the story of how she created this tune and the experience of getting hands on with AWS DeepComposer in an upcoming post, right here on the AWS ML Blog.
Congratulations, Catherine Chui!
It’s time to move onto the next Chartbusters challenge — Spin the Model. The challenge launches today and is open until August 23, 2020. For more information about the competition and how to participate, see Announcing the AWS DeepComposer Chartbusters Spin the Model challenge.
About the Author
Jyothi Nookula is a Principal Product Manager for AWS AI devices. She loves to build products that delight her customers. In her spare time, she loves to paint and host charity fund raisers for her art exhibitions.
Create a multi-region Amazon Lex bot with Amazon Connect for high availability
AWS customers rely on Amazon Lex bots to power self-service conversational experiences in Amazon Connect over the telephone and other channels. With Amazon Lex, callers (or customers, in Amazon Connect terminology) can get their questions conveniently answered regardless of agent availability. What architecture patterns can you use to make a bot resilient to service availability issues? In this post, we describe a cross-Region approach that yields higher availability by deploying Amazon Lex bots in multiple Regions.
Architecture overview
In this solution, Amazon Connect flows can achieve business continuity with minimal disruptions in the event of service availability issues with Amazon Lex. The architecture pattern uses the following components:
- Two Amazon Lex bots, each in a different Region.
- An Amazon Connect flow integrated with both bots, which invokes the bot in the Region returned by the region check AWS Lambda function.
- A Lambda function to check the health of the bot.
- A Lambda function to read the Amazon DynamoDB table for the primary bot’s Region for a given Amazon Connect Region.
- A DynamoDB table to store a Region mapping between Amazon Connect and Amazon Lex. The health check function updates this table. The region check function reads this table for the most up-to-date primary Region mapping for Amazon Connect and Amazon Lex.
The goal of having identical Amazon Lex Bots in two Regions is to bring up the bot in the secondary Region and make it the primary in the event of an outage in the primary Region.
Multi-region pattern for Amazon Lex
The next two sections describe how an Amazon Connect flow integrated with an Amazon Lex bot can recover quickly in case of a service failure or outage in the primary Region and start servicing calls using Amazon Lex in the secondary Region.
The health check function calls one of two Amazon Lex Runtime API operations, PutSession or PostText, depending on the TEST_METHOD Lambda environment variable. You can choose either one based on your preference and use case. The PutSession API call doesn’t incur any extra Amazon Lex costs, but it doesn’t test any natural language understanding (NLU) features of Amazon Lex. The PostText API allows you to check the NLU functionality of Amazon Lex, but includes a minor cost.
The health check function updates the lexRegion column of the DynamoDB table (lexDR) with the Region name in which the test passed. If the health check passes the test in the primary Region, lexRegion gets updated with the name of the primary Region. If the health check fails, the function issues a call to the corresponding Runtime API based on the TEST_METHOD environment variable in the secondary Region. If the test succeeds, the lexRegion column in the DynamoDB table gets updated to the secondary Region; otherwise, it gets updated with err, which indicates both Regions have an outage.
On every call that Amazon Connect receives, it issues a region check function call to get the active Amazon Lex Region for that particular Amazon Connect Region. The primary Region returned by the region check function is the last entry written to the DynamoDB table by the health check function. Amazon Connect invokes the respective Get Customer Input Block configured with the Amazon Lex bot in the Region returned by the region check function. If the function returns the same Region as the Amazon Connect Region, it indicates that the health check has passed, and Amazon Connect calls the Amazon Lex bot in its same Region. If the function returns the secondary Region, Amazon Connect invokes the bot in the secondary Region.
Deploying Amazon Lex bots
You need to create an identical bot in both your primary and secondary Region. In this blog post, we selected us-east-1 as the primary and us-west-2 secondary Region. Begin by creating the bot in your primary Region, us-east-1.
- On the Amazon Lex console, click Create.
- In the Try a Sample section, select OrderFlowers.
- For COPPA, select No.
- Leave all other settings at their default values and click Create.
- The bot is created and will start to build automatically.
- After your bot is built (in 1–2 minutes), choose Publish.
- Create an alias with the name
ver_one
.
Repeat the above steps for us-west-2
. You should now have a working Amazon Lex bot in both us-east-1
and us-west-2
.
Creating a DynamoDB table
Make sure your AWS Region is us-east-1
.
- On the DynamoDB console, choose Create.
- For Table name, enter
lexDR
. - For Primary key, enter
connectRegion
with type String. - Leave everything else at their default and choose Create.
- On the Items tab, choose Create item.
- Set the connectRegion value to us-east-1, and append a new column of type String called lexRegion with its value set to us-east-1.
- Click Save.
Creating IAM roles for Lambda functions
In this step, you create an AWS Identity and Access Management (IAM) role for both Lambda functions to use.
- On the IAM console, click on Access management and select Policies.
- Click on Create Policy.
- Click on JSON.
- Paste the following custom IAM policy, which allows read and write access to the DynamoDB table lexDR. Replace the "xxxxxxxxxxxx" in the policy definition with your AWS account number.

{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "VisualEditor0",
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:UpdateItem"],
        "Resource": "arn:aws:dynamodb:us-east-1:xxxxxxxxxxxx:table/lexDR"
    }]
}
- Click on Review Policy.
- Give it the name DynamoDBReadWrite and click on Create Policy.
- On the IAM console, click on Roles under Access management and then click on Create Role.
- Select Lambda for the service and click Next.
- Attach the following permissions policies:
AWSLambdaBasicExecutionRole
AmazonLexRunBotsOnly
DynamoDBReadWrite
- Click Next: Tags. Skip the Tags page by clicking Next: Review.
- Name the role
lexDRRole
. Click Save.
Deploying the region check function
You first create a Lambda function to read from the DynamoDB table to decide which Amazon Lex bot is in the same Region as the Amazon Connect instance. This function is later called by Amazon Connect or your application that’s using the bot.
- On the Lambda console, choose Create function.
- For Function name, enter
lexDRGetRegion
. - For Runtime, choose Python 3.8.
- Under Permissions, choose Use an existing role.
- Choose the role
lexDRRole
. - Choose Create function.
- In the Lambda code editor, enter the following code (downloaded from lexDRGetRegion.zip):
import json
import boto3
import os
import logging

dynamo_client = boto3.client('dynamodb')
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

def getCurrentPrimaryRegion(key):
    result = dynamo_client.get_item(
        TableName=os.environ['TABLE_NAME'],
        Key={"connectRegion": {"S": key}}
    )
    logger.debug(result['Item']['lexRegion']['S'])
    return result['Item']['lexRegion']['S']

def lambda_handler(event, context):
    logger.debug(event)
    region = event["Details"]["Parameters"]["region"]
    return {
        'statusCode': 200,
        'primaryCode': getCurrentPrimaryRegion(region)
    }
- In the Environment variables section, choose Edit.
- Add an environment variable with Key as
TABLE_NAME
and Value aslexDR
. - Click Save to save the environment variable.
- Click Save to save the Lambda function.
Deploying the health check function
Create another Lambda function in us-east-1
to implement the health check functionality.
- On the Lambda console, choose Create function.
- For Function name, enter
lexDRTest
. - For Runtime, choose Python 3.8.
- Under Permissions, choose Use an existing role.
- Choose
lexDRRole
. - Choose Create function.
- In the Lambda code editor, enter the following code (downloaded from lexDRTest.zip):
import json
import boto3
import sys
import os

dynamo_client = boto3.client('dynamodb')
primaryRegion = os.environ['PRIMARY_REGION']
secondaryRegion = os.environ['SECONDARY_REGION']
tableName = os.environ['TABLE_NAME']
primaryRegion_client = boto3.client('lex-runtime', region_name=primaryRegion)
secondaryRegion_client = boto3.client('lex-runtime', region_name=secondaryRegion)

def getCurrentPrimaryRegion():
    result = dynamo_client.get_item(
        TableName=tableName,
        Key={'connectRegion': {'S': primaryRegion}}
    )
    return result['Item']['lexRegion']['S']

def updateTable(region):
    result = dynamo_client.update_item(
        TableName=tableName,
        Key={'connectRegion': {'S': primaryRegion}},
        UpdateExpression='set lexRegion = :region',
        ExpressionAttributeValues={':region': {'S': region}}
    )

#SEND MESSAGE/PUT SESSION ENV VA
def put_session(botname, botalias, user, region):
    print(region, botname, botalias)
    client = primaryRegion_client
    if region == secondaryRegion:
        client = secondaryRegion_client
    try:
        response = client.put_session(botName=botname, botAlias=botalias, userId=user)
        if (response['ResponseMetadata'] and response['ResponseMetadata']['HTTPStatusCode']
                and response['ResponseMetadata']['HTTPStatusCode'] != 200) or (not response['sessionId']):
            return 501
        else:
            if getCurrentPrimaryRegion() != region:
                updateTable(region)
            return 200
    except:
        print('ERROR: {}', sys.exc_info()[0])
        return 501

def send_message(botname, botalias, user, region):
    print(region, botname, botalias)
    client = primaryRegion_client
    if region == secondaryRegion:
        client = secondaryRegion_client
    try:
        message = os.environ['SAMPLE_UTTERANCE']
        expectedOutput = os.environ['EXPECTED_RESPONSE']
        response = client.post_text(botName=botname, botAlias=botalias, userId=user, inputText=message)
        if response['message'] != expectedOutput:
            print('ERROR: Expected_Response=Success, Response_Received=' + response['message'])
            return 500
        else:
            if getCurrentPrimaryRegion() != region:
                updateTable(region)
            return 200
    except:
        print('ERROR: {}', sys.exc_info()[0])
        return 501

def lambda_handler(event, context):
    print(event)
    botName = os.environ['BOTNAME']
    botAlias = os.environ['BOT_ALIAS']
    testUser = os.environ['TEST_USER']
    testMethod = os.environ['TEST_METHOD']
    if testMethod == 'send_message':
        primaryRegion_response = send_message(botName, botAlias, testUser, primaryRegion)
    else:
        primaryRegion_response = put_session(botName, botAlias, testUser, primaryRegion)
    if primaryRegion_response != 501:
        primaryRegion_client.delete_session(botName=botName, botAlias=botAlias, userId=testUser)
    if primaryRegion_response != 200:
        if testMethod == 'send_message':
            secondaryRegion_response = send_message(botName, botAlias, testUser, secondaryRegion)
        else:
            secondaryRegion_response = put_session(botName, botAlias, testUser, secondaryRegion)
        if secondaryRegion_response != 501:
            secondaryRegion_client.delete_session(botName=botName, botAlias=botAlias, userId=testUser)
        if secondaryRegion_response != 200:
            updateTable('err')
    #deleteSessions(botName, botAlias, testUser)
    return {'statusCode': 200, 'body': 'Success'}
- In the Environment variables section, choose Edit, and add the following environment variables:
- BOTNAME –
OrderFlowers
- BOT_ALIAS –
ver_one
- SAMPLE_UTTERANCE –
I would like to order some flowers.
(The example utterance you want to use to send a message to the bot.) - EXPECTED_RESPONSE –
What type of flowers would you like to order?
(The expected response from the bot when it receives the above sample utterance.) - PRIMARY_REGION –
us-east-1
- SECONDARY_REGION –
us-west-2
- TABLE_NAME –
lexDR
- TEST_METHOD –
put_session
or send_message
- send_message: This method calls the Lex Runtime post_text function, which takes an utterance and maps it to one of the trained intents. post_text tests the natural language understanding capability of Amazon Lex, but you incur a small charge of $0.00075 per request.
- put_session: This method calls the Lex Runtime put_session function, which creates a new session for the user. put_session does NOT test the natural language understanding capability of Amazon Lex.
- TEST_USER –
test
- BOTNAME –
- Click Save to save the environment variable.
- In the Basic Settings section, update the Timeout value to 15 seconds.
- Click Save to save the Lambda function.
Creating an Amazon CloudWatch rule
To trigger the health check function to run every 5 minutes, you create an Amazon CloudWatch rule.
- On the CloudWatch console, under Events, choose Rules.
- Choose Create rule.
- Under Event Source, change the option to Schedule.
- For Fixed rate of, enter 5 minutes.
- Under Targets, choose Add target.
- Choose Lambda function as the target.
- For Function, choose
lexDRTest
. - Under Configure input, choose Constant(JSON text), and enter
{}
- Choose Configure details.
- Under Rule definition, for Name, enter
lexHealthCheckRule
. - Choose Create rule.
You should now have a lexHealthCheckRule
CloudWatch rule scheduled to invoke your lexDRTest
function every 5 minutes. This checks if your primary bot is healthy and updates the DynamoDB table accordingly.
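If you prefer to script this step instead of using the console, a boto3 sketch along the following lines creates the same schedule; the account ID and function ARN are placeholders:

import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

function_arn = "arn:aws:lambda:us-east-1:xxxxxxxxxxxx:function:lexDRTest"  # replace with your function ARN

# Create (or update) the scheduled rule.
rule_arn = events.put_rule(
    Name="lexHealthCheckRule",
    ScheduleExpression="rate(5 minutes)",
    State="ENABLED",
)["RuleArn"]

# Allow CloudWatch Events to invoke the function, then attach it as the rule target.
lambda_client.add_permission(
    FunctionName="lexDRTest",
    StatementId="lexHealthCheckRulePermission",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule_arn,
)
events.put_targets(
    Rule="lexHealthCheckRule",
    Targets=[{"Id": "lexDRTestTarget", "Arn": function_arn, "Input": "{}"}],
)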
Creating your Amazon Connect instance
You now create an Amazon Connect instance to test the multi-region pattern for the bots in the same Region where you created the lexDRTest
function.
- Create an Amazon Connect instance if you don’t already have one.
- On the Amazon Connect console, choose the instance alias where you want the Amazon Connect flow to be.
- Choose Contact flows.
- Under Amazon Lex, select the OrderFlowers bot from us-east-1 and click Add Lex Bot.
- Select the OrderFlowers bot from us-west-2 and click Add Lex Bot.
- Under AWS Lambda, select lexDRGetRegion and click Add Lambda Function.
- Log in to your Amazon Connect instance by clicking Overview in the left panel and clicking the login link.
- Click Routing in the left panel, and then click Contact Flows in the drop down menu.
- Click the Create Contact Flow button.
- Click the down arrow button next to the Save button, and click on Import Flow.
- Download the contact flow Flower DR Flow. Upload this file in the Import Flow dialog.
- In the contact flow, click on the Invoke AWS Lambda Function block, which opens a properties panel on the right.
- Select the lexDRGetRegion function and click Save.
- Click on the Publish button to publish the contact flow.
Associating a phone number with the contact flow
Next, you will associate a phone number with your contact flow, so you can call in and test the OrderFlowers bot.
- Click on the Routing option in the left navigation panel.
- Click on Phone Numbers.
- Click on Claim Number.
- Select your country code and select a Phone Number.
- In the Contact flow/IVR select box, select the contact flow Flower DR Flow imported in the earlier step.
- Wait a few minutes, and then call the number to interact with the OrderFlowers bot.
Testing your integration
To test this solution, you can simulate a failure in the us-east-1 Region by implementing the following:
- Open the Amazon Lex console in the us-east-1 Region.
- Select the OrderFlowers bot.
- Click on Settings.
- Delete the bot alias ver_one.
When the health check runs the next time, it tries to communicate with the Amazon Lex bot in the us-east-1 Region. That call fails to get a successful response because the bot alias no longer exists, so the function then calls the bot in the secondary Region, us-west-2. Upon receiving a successful response, it updates the lexRegion column in the lexDR DynamoDB table with us-west-2.
After this, all subsequent calls to Connect in us-east-1 will start interacting with the Lex Bot in us-west-2. This automatic switch over demonstrates how this architectural pattern can help achieve business continuity in the event of a service failure.
Between the time you delete the bot alias and the next health check run, calls to Amazon Connect receive a failure. However, after the health check runs, business operations resume automatically. The shorter the interval between health check runs, the shorter the outage. You can change this interval by editing the Amazon CloudWatch rule, lexHealthCheckRule.
To make the health check pass in us-east-1
again, recreate the ver_one
alias of the OrderFlowers
bot in us-east-1
.
Cleanup
To avoid incurring any charges in the future, delete all the resources created above.
- Amazon Lex bot
OrderFlowers
created inus-east-1
andus-west-2
- The Cloudwatch rule
lexHealthCheckRule
- The DynamoDB Table
lexDR
- The Lambda functions
lexDRTest
andlexDRGetRegion
- Delete the IAM role
lexDRRole
- Delete the Contact Flow
Flower DR Flow
Conclusion
Coupled with Amazon Lex for self-service, Amazon Connect allows you to easily create intuitive customer service experiences. This post offers a multi-region approach for high availability so that, if a bot or the supporting fulfillment APIs are under pressure in one Region, resources from a different Region can continue to serve customer demand.
About the Authors
Shanthan Kesharaju is a Senior Architect in the AWS ProServe team. He helps our customers with their Conversational AI strategy, architecture, and development. Shanthan has an MBA in Marketing from Duke University, an MS in Management Information Systems from Oklahoma State University, and a Bachelor of Technology from Kakatiya University in India. He is also currently pursuing his third master’s degree, in Analytics, from Georgia Tech.
Soyoung Yoon is a Conversational AI Architect at AWS Professional Services, where she works with customers across multiple industries to develop specialized conversational assistants that help these customers provide their users with faster and more accurate information through natural language. Soyoung holds an M.S. and a B.S. in Electrical and Computer Engineering from Carnegie Mellon University.
Optimizing your engagement marketing with personalized recommendations using Amazon Personalize and Braze
Today’s marketer has a wide array of channels to communicate with their customers. However, sending the right message to the right customer on the right channel at the right time remains the preeminent challenge marketers face. In this post, I show you how to combine Braze, a customer engagement platform built on AWS for today’s on-demand, always-connected customers, and Amazon Personalize to meet this challenge and deliver experiences that surprise and delight your customers.
Braze makes it easy to organize your customers into audiences, which update in real-time, based on their behavior and profile traits. Messaging campaigns are created to target audiences through messaging channels such as email, SMS, and push notifications. Multi-step and multi-channel engagement journeys can also be designed using Braze Canvas. Campaigns and Canvases are triggered manually, based on a schedule, or even due to customer actions. However, your ability to personalize messages sent to customers is limited to what is available in their profile. Including product and content recommendations based on the learned interests of each customer as they engage with your web and mobile application is needed to truly personalize each message.
Amazon Personalize is an AWS service that uses machine learning algorithms to create recommender systems based on the behavioral data of your customers. The recommenders are private to your AWS account and based only on the data you provide. Through the Braze Connected Content feature, you are able to connect Braze to the same Amazon Personalize recommenders used to power recommendations in your web and mobile application. Since Amazon Personalize is able to adjust recommendations for each customer based on their behavior in real-time, the messages sent through Braze reflect their current preferences and intent.
Overview of solutions
I present two architectures in this post: one that uses the real-time capabilities of Braze and Amazon Personalize, and another that trades some of the freshness of real-time recommendations for a more cost-effective batch approach. The approach you select should match the goals of your engagement strategy and the scale of your messaging needs. Fortunately, the features and integration options of Braze and Amazon Personalize provide the flexibility to suit your operational requirements.
Real-time integration
We start with a real-time integration architecture. The following diagram depicts the relevant components of a sample ecommerce application in which you use Amazon Personalize to provide machine learning (ML)-powered recommenders, referred to as solutions. The primary data used to build solutions is user-item interaction history. For an ecommerce application, this includes events such as viewing a product, adding a product to a shopping cart, and purchasing a product. When rich metadata on events, items, and users is available, you can incorporate it to further improve the relevance of recommendations from the recommender. Examples of metadata include device type, location, and season for events; category, genre, and price point for items; and users’ age, gender, and subscription tier. After you create solutions, you can create autoscaling API endpoints called campaigns with just a few clicks to retrieve personalized recommendations.
Later in this post, I show you how to deploy this application in your AWS account. A self-guided workshop is also packaged with the application that you use to walk through sending personalized emails with Braze.
Our example ecommerce application retrieves personalized recommendations from a Recommendations microservice that appends the recommended item IDs from Amazon Personalize with rich product information from a Products microservice. As users engage with the application and indicate interest by viewing a product, adding a product to their shopping cart, or purchasing a product, events representing these actions are streamed to Amazon Personalize via the AWS Amplify JavaScript client library where Amazon Personalize automatically adjusts recommendations in real time based on user activity.
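The sample application streams these events from the browser with the Amplify JavaScript library; a rough server-side equivalent, sketched below in Python with a hypothetical tracking ID, user ID, and item ID, uses the Amazon Personalize Events API:

import json
import boto3
from datetime import datetime

personalize_events = boto3.client("personalize-events")

# All IDs here are placeholders; the Retail Demo Store wires these up for you.
personalize_events.put_events(
    trackingId="<event-tracker-tracking-id>",
    userId="42",
    sessionId="session-42",
    eventList=[{
        "sentAt": datetime.now(),
        "eventType": "ProductViewed",
        "properties": json.dumps({"itemId": "2"}),
    }],
)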
With personalization built into the application, you can connect Amazon Personalize with Braze to deliver personalized recommendations through outbound engagement channels such as email, SMS, and push notifications.
Braze allows you to create message templates that use the Liquid templating language to substitute placeholders in your template with values from a customer’s profile or even from an external resource. In the real-time architecture, we use the Recommendations microservice from the sample application as the external resource and Braze Connected Content as the feature to retrieve personalized recommendations to include in your message templates. The following Connected Content Liquid tag, placed at the top of your message, illustrates how to call the Recommendations service from Braze to retrieve recommendations for a user:
{% connected_content http://<RecommendationsServiceHostName>/recommendations?userID={{${user_id}}}&fullyQualifyImageUrls=1&numResults=4 :save result %}
The tag has the following elements:
- Liquid tags are framed within {
%
and%
} This allows you to embed tags and expressions inside message templates that may also contain text or HTML. - The tag type is declared just after the start of the tag. In this case,
connected_content
is the tag type. For the full list of supported tags, see Personalization Using Liquid Tags. - You next define a fully-qualified URL to the HTTP resource that Connected Content calls for each user. You replace
<RecommendationsServiceHostName>
with the host name for the Elastic Load Balancer for the Recommendations service in your deployment of the sample application. - The Recommendations service provides a few resources for different personalization features. The resource for user recommendations is accessed from the
/recommendations
path. - The query string parameters come next. The user is identified by the
userID
parameter, and the{{${user_id}}}
expression instructs Braze to interpolate the user’s ID for each call to the service. - The last two query string parameters,
fullyQualifyImageUrls=1
andnumResults=4
, tell the Recommendations service that we want the product image URLs to be fully qualified so they can be displayed in the user’s email client and, in this case, to only return the top four recommendations, respectively. - The
:save result
expression tells Braze to assign the JSON response from the Recommendations service to a template variable named result. With the response saved, you can then access elements of the response using Liquid tags in the rest of the template.
The following code shows the format of a response from the Recommendations service:
[
{
"product": {
"id": "2",
"url": "http://recs.cloudfront.net/#/product/2",
"sk": "",
"name": "Striped Shirt",
"category": "apparel",
"style": "shirt",
"description": "A classic look for the summer season.",
"price": 9.99,
"image": "http://recs.cloudfront.net/images/apparel/1.jpg",
"featured": "true"
}
},
{
"product": {
"id": "1",
"url": "http://recs.cloudfront.net/#/product/1",
"sk": "",
"name": "Black Leather Backpack",
"category": "accessories",
"style": "bag",
"description": "Our handmade leather backpack will look great at the office or out on the town.",
"price": 109.99,
"image": "http://recs.cloudfront.net/images/accessories/1.jpg",
"featured": "true"
}
},
...
]
For brevity, the preceding code only shows the first two recommended products. Several product attributes are available that you can use in the Braze message template to represent each recommendation. To access a specific element of an array or list as we have here, you can use array subscripting notation in your Liquid tag. For example, the following tag interpolates the product name for the first recommended product in the response. For the preceding sample response, the tag resolves to “Striped Shirt”:
{{result[0].product.name}}
When you combine the information in the personalized recommendation response from the Recommendations service with Liquid tags, the possibilities for building message designs are endless. The following code is a simplified example of how you could display a product recommendation in an HTML email template:
<table>
<tr>
<td>
<a href="{{result[0].product.url}}" target="_blank">
<img src="{{result[0].product.image}}" width="200" alt="{{result[0].product.name}}" />
</a>
</td>
<td>
<h2>{{result[0].product.name}}</h2>
<p>{{result[0].product.description}}</p>
<p>Only <strong>$ {{result[0].product.price}}</strong>!</p>
<a class="button" href="{{result[0].product.url}}">Buy Now</a>
</td>
</tr>
</table>
Batch integration
The batch integration architecture replaces the use of the Braze Connected Content feature with an Amazon Personalize batch recommendations job that is used to push attribute updates to Braze. Batch recommendations involve creating a file in an Amazon Simple Storage Service (Amazon S3) bucket that includes the users who you wish to generate recommendations for. A reference to this file is then used to submit a job to Amazon Personalize to generate recommendations for each user in the file and output the results to another Amazon S3 file of your choosing. You can use the output of the batch recommendations job to associate personalized recommendations with user profiles in Braze as custom attributes. The Liquid tags in the message templates we saw earlier are changed to access the recommendations as custom attributes from the user profile rather than the Connected Content response.
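As a rough sketch, and assuming an input file of JSON lines such as {"userId": "42"} already sits in Amazon S3, submitting the batch recommendations job with boto3 could look like the following; the ARNs, bucket, and paths are placeholders:

import boto3

personalize = boto3.client("personalize")

personalize.create_batch_inference_job(
    jobName="braze-email-recommendations",
    solutionVersionArn="arn:aws:personalize:us-east-1:xxxxxxxxxxxx:solution/my-solution/<version>",
    jobInput={"s3DataSource": {"path": "s3://my-braze-batch-bucket/input/users.json"}},
    jobOutput={"s3DataDestination": {"path": "s3://my-braze-batch-bucket/output/"}},
    roleArn="arn:aws:iam::xxxxxxxxxxxx:role/PersonalizeBatchRole",
)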
As noted earlier, the trade-off you’re making with the batch approach is sacrificing the freshness of real-time recommendations for a more cost-effective solution. Because batch recommendations don’t require an Amazon Personalize campaign, the additional requests from Connected Content to your campaign for each user are eliminated. For Braze campaigns that target extremely large segments, this can result in a significant reduction in requests. Furthermore, if you don’t need an Amazon Personalize campaign for other purposes or you’re creating an Amazon Personalize solution dedicated to email personalization, you can forego creating a campaign entirely.
The following diagram illustrates one of the many possible approaches to designing a batch architecture. The web application components from the real-time architecture still apply; they are excluded from this diagram for brevity.
You use Amazon CloudWatch Events to periodically trigger an AWS Lambda function that builds an input file for an Amazon Personalize batch recommendations job. When the batch recommendations job is complete, another Lambda function processes the output file, decorates the recommended items with rich product information, and enqueues user update events in Amazon Kinesis Data Streams. Finally, another Lambda function consumes the stream’s events and uses the Braze User API to update user profiles.
The use of a Kinesis data stream provides a few key benefits, including decoupling the batch job from the transactional Braze user update process and the ability to pause, restart, and replay user update events.
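A minimal sketch of the last Lambda function in that chain, which consumes the Kinesis records and pushes custom attributes to Braze through its User Track REST endpoint, might look like the following; the REST endpoint host, API key handling, record payload shape, and attribute names are all assumptions, so adapt them to your Braze instance:

import base64
import json
import os
import urllib.request

BRAZE_ENDPOINT = "https://rest.iad-01.braze.com/users/track"  # assumption: use your instance's REST endpoint
BRAZE_API_KEY = os.environ["BRAZE_API_KEY"]

def lambda_handler(event, context):
    attributes = []
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # Hypothetical payload shape: {"user_id": "42", "recommendations": [{"name": ..., "url": ...}, ...]}
        attributes.append({
            "external_id": payload["user_id"],
            "recommended_product_names": [r["name"] for r in payload["recommendations"]],
            "recommended_product_urls": [r["url"] for r in payload["recommendations"]],
        })

    body = json.dumps({"attributes": attributes}).encode("utf-8")
    request = urllib.request.Request(
        BRAZE_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {BRAZE_API_KEY}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return {"statusCode": response.status}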
Real-time integration walkthrough
You implement the real-time integration in the Retail Demo Store sample ecommerce application. In this post, we walk you through the process of deploying this project in your AWS account and describe how to launch the self-guided Braze workshop bundled with the application.
You complete the following steps:
- Deploy the Retail Demo Store project to your AWS account using the supplied AWS CloudFormation templates (25–30 minutes).
- Build Amazon Personalize solutions and campaigns that provide personalized recommendations (2 hours).
- Import users into Braze and build a Braze campaign that uses Connected Content to retrieve personalized recommendations from Amazon Personalize (1 hour).
- Clean up resources.
Prerequisites
For this walkthrough, you need the following prerequisites:
- An AWS account
- A user in your AWS account with the necessary privileges to deploy the project
- A Braze account
If you don’t have a Braze account, please contact your Braze representative. We also assume that you have completed at least the Getting Started with Braze LAB course.
Step 1: Deploying the Retail Demo Store to your AWS account
From the following table, choose Launch Stack in the Region of your choice. This list of Regions doesn’t represent all possible Regions where you can deploy the project, just the Regions currently configured for deployment.
Region | Launch |
US East (N. Virginia) | ![]() |
US West (Oregon) | ![]() |
Europe (Ireland) | ![]() |
Accept all the default template parameter values and launch the template. The deployment of the project’s resources takes 25–30 minutes.
Step 2: Building Amazon Personalize campaigns
Before you can provide personalized product recommendations, you first need to train the ML models and provision the inference endpoints in Amazon Personalize that you need to retrieve recommendations. The CloudFormation template deployed in Step 1 includes an Amazon SageMaker notebook instance that provides a Jupyter notebook with detailed step-by-step instructions. The notebook takes approximately 2 hours to complete.
- Sign in to the AWS account where you deployed the CloudFormation template in Step 1.
- On the Amazon SageMaker console, choose Notebook instances.
- If you don’t see the
RetailDemoStore
notebook instance, make sure you’re in the same Region where you deployed the project. - To access the notebook instance, choose Open Jupyter or Open JupyterLab.
- When the Jupyter web interface is loaded for the notebook instance, choose the
workshop/1-Personalization/1.1-Personalize.ipynb
.
The notebooks are organized in a directory structure, so you may have to choose the workshop
folder to see the notebook subdirectories.
- When you have the
1.1-Personalize
notebook open, step through the workshop by reading and running each cell.
You can choose Run from the Jupyter toolbar to sequentially run the code in the cells.
Step 3: Sending personalized messages from Braze
With the Amazon Personalize solutions and campaigns to produce personalized recommendations in place, you can now import users into your Braze account, build a messaging template that uses Braze Connected Content to retrieve recommendations from Amazon Personalize, and build a Braze campaign to send targeted emails to your users.
Similar to the Personalization workshop in Step 1, the Braze messaging workshop steps you through the process. This notebook takes approximately 1 hour to complete.
- If necessary, repeat the instructions in Step 1 to open a Jupyter or JupyterLab browser window from the Amazon SageMaker notebook instance in your Retail Demo Store deployment.
- When the Jupyter web interface is loaded for the notebook instance, choose the
workshop/4-Messaging/4.2-Braze.ipynb
notebook.
As with before, you may have to choose the workshop
folder to see the notebook subdirectories.
- When you have the
4.2-Braze
notebook open, step through the workshop by reading and running each cell.
Step 4: Cleaning up
To avoid incurring future charges, delete the resources the Retail Demo Store project created by deleting the CloudFormation stack you used during deployment. For more information about the source code for this post and the full Retail Demo Store project, see the GitHub repo.
Conclusion
As marketers compete for the attention of customers through outbound messaging, there is increasing pressure to effectively target the right users, at the right time, on the right channel, and with the right messaging. Braze provides the solution to the first three challenges. You can solve the final challenge with Braze Connected Content and Amazon Personalize, and deliver highly personalized product and content recommendations that reflect each customer’s current interests.
How are you using outbound messaging to reach your customers? Is there an opportunity to increase engagement with your customers with more relevant and personalized content?
About Braze
Braze is an AWS Advanced Technology Partner and holder of the AWS Digital Customer Experience and Retail competencies. Top global brands such as ABC News, Urban Outfitters, Rakuten, and Gap are sending tens of billions of messages per month to over 2 billion monthly active users with Braze.
About the Author
James Jory is a Solutions Architect in Applied AI with AWS. He has a special interest in personalization and recommender systems and a background in ecommerce, marketing technology, and customer data analytics. In his spare time, he enjoys camping and auto racing simulation.
Translating documents, spreadsheets, and presentations in Office Open XML format using Amazon Translate
Now you can translate .docx, .xlsx, and .pptx documents using Amazon Translate.
Every organization creates documents, spreadsheets, and presentations to communicate and share information with a large group and keep records for posterity. These days, we interact with people who don’t share the same language as ours. The need for translating such documents has become even more critical in a globally interconnected world. Some large organizations hire a team of professional translators to help with document translation, which involves a lot of time and overhead cost. Multiple tools are available online that enable you to copy and paste text to get the translated equivalent in the language of your choice, but there are few secure and easy methods that allow for native support of translating such documents while keeping formatting intact.
Amazon Translate now supports translation of Office Open XML documents in DOCX, PPTX, and XLSX format. Amazon Translate is a fully managed neural machine translation service that delivers high-quality and affordable language translation in 55 languages. For the full list of languages, see Supported Languages and Language Codes. The document translation feature is available wherever batch translation is available. For more information, see Asynchronous Batch Processing.
In this post, we walk you through a step-by-step process to translate documents on the AWS Management Console. You can also access the Amazon Translate BatchTranslation API for document translation via the AWS Command Line Interface (AWS CLI) or the AWS SDK.
Solution overview
This post walks you through the following steps:
- Create an AWS Identity and Access Management (IAM) role that can access your Amazon Simple Storage Service (Amazon S3) buckets.
- Sort your documents by file type and language.
- Perform the batch translation.
Creating an IAM role to access your S3 buckets
In this post, we create a role that has access to all the S3 buckets in your account to translate documents, spreadsheets, and presentations. You provide this role to Amazon Translate to let the service access your input and output S3 locations. For more information, see AWS Identity and Access Management Documentation.
- Sign in to your personal AWS account.
- On the IAM console, under Access management, choose Roles.
- Choose Create role.
- Choose Another AWS account.
- For Account ID, enter your ID.
- Go to the next page.
- For Filter policies, search and add the
AmazonS3FullAccess
policy.
- Go to the next page.
- Enter a name for the role, for example,
TranslateBatchAPI
. - Go to the role you just created.
- On the Trust relationships tab, choose Edit trust relationship.
- Enter the following service principals:
"Service": [ "translate.aws.internal", "translate.amazonaws.com" ],
For example, see the following screenshot.
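Assembled into a full trust policy document, the trust relationship should look roughly like the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "translate.aws.internal",
                    "translate.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}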
Sorting your documents
Amazon Translate batch translation works on documents stored in a folder inside an S3 bucket. Batch translation doesn’t work if the file is saved in the root of the S3 bucket. Batch translation also doesn’t support translation of nested files. So you first need to upload the documents you wish to translate in a folder inside an S3 bucket. Sort the documents such that the folders contain files of the same type (DOCX, PPTX, XLSX) and are in the same language. If you have multiple documents of different file types that you need to translate, sort the files such that each Amazon S3 prefix has only one type of document format written in one language.
- On the Amazon S3 console, choose Create bucket.
- Walk through the steps to create your buckets.
For this post, we create two buckets: input-translate-bucket
and output-translate-bucket
.
The buckets contain the following folders for each file type:
docx
pptx
xlsx
Performing batch translation
To implement your batch translation, complete the following steps:
- On the Amazon Translate console, choose Batch Translation.
- Choose Create job.
For this post, we walk you through translating documents in DOCX format.
- For Name, enter
BatchTranslation
. - For Source language, choose En.
- For Target language, choose Es.
- For Input S3 location, enter
s3://input-translate-bucket/docx/
. - For File format, choose docx.
- For Output S3 location, enter
s3://output-translate-bucket/
. - For Access permissions, select Use an existing IAM role.
- For IAM role, enter
TranslateBatchAPI
.
Because this is an asynchronous translation, the translation begins after the machine resource for the translation is allocated. This can take up to 15 minutes. For more information about performing batch translation jobs, see Starting a Batch Translation Job.
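If you prefer the AWS SDK over the console, a boto3 sketch of the equivalent call is shown below; the bucket names, role ARN, and job name are placeholders from this walkthrough, and the ContentType shown is the one for DOCX files:

import boto3

translate = boto3.client("translate")

translate.start_text_translation_job(
    JobName="BatchTranslation",
    InputDataConfig={
        "S3Uri": "s3://input-translate-bucket/docx/",
        "ContentType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    },
    OutputDataConfig={"S3Uri": "s3://output-translate-bucket/"},
    DataAccessRoleArn="arn:aws:iam::xxxxxxxxxxxx:role/TranslateBatchAPI",
    SourceLanguageCode="en",
    TargetLanguageCodes=["es"],
)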
The following screenshot shows the details of your BatchTranslation
job.
When the translation is complete, you can find the output in a folder in your S3 bucket. See the following screenshot.
Conclusion
In this post, we discussed implementing asynchronous batch translation to translate documents in DOCX format. You can repeat the same procedure for spreadsheets and presentations. The translation is simple and you pay only for the number of characters (including spaces) you translate in each format. You can start translating office documents today in all Regions where batch translation is supported. If you’re new to Amazon Translate, try out the Free Tier, which offers 2 million characters per month for the first 12 months, starting from your first translation request.
About the Author
Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.