How Xpertal is creating the Contact Center of the future with Amazon Lex

This is a joint blog post by AWS Solutions Architects Jorge Alfaro Hidalgo and Mauricio Zajbert, and Chester Perez, the Contact Center Manager at Xpertal.

Fomento Económico Mexicano, S.A.B. de C.V., or FEMSA, is a Mexican multinational beverage and retail company headquartered in Monterrey, Mexico. Xpertal Global Services is FEMSA’s service unit that offers consulting, IT, back-office transactional, and consumable procurement services to the rest of FEMSA’s business units. Xpertal operates a Contact Center that serves as an internal help desk for employees, with 150 agents handling 4 million calls per year. Their goal is to automate the majority of calls with a chatbot by 2023 and escalate only the complex queries that require human intervention to live agents.

The contact center started this transformation 2 years ago with Robotic Process Automation (RPA) solutions, and it has already been a big success. They’ve automated repetitive tasks such as password resets and doubled the number of requests serviced with the same number of agents.

This technology was helpful, but to reach the next level of automation, they needed systems that could naturally emulate human interactions. This is where Amazon AI has been helpful. As part of this journey, Xpertal started exploring Amazon Lex, a service for creating self-service virtual agents, to improve call response times. They also used other AI services, such as Amazon Comprehend, Amazon Polly, and Amazon Connect, to automate other parts of the contact center.

In the first stage, Xpertal used Amazon Comprehend, a natural language processing service, to automatically classify support request emails and route them to the proper resolution teams. This process used to take 4 hours to perform manually and was reduced to 15 minutes with Amazon Comprehend.

Next, Xpertal started building bots in US Spanish with Amazon Lex for multiple internal websites. They’ve been able to optimize each bot to fit each business unit’s needs and integrate it with Amazon Connect, an omni-channel cloud contact center, so the Lex bots can also help resolve employee calls coming into the Contact Center. With today’s launch of Latin American Spanish, they are excited to migrate and create an even more localized experience for their employees.

It was easy to integrate Amazon Lex with Amazon Connect and other third-party collaboration tools used within FEMSA to achieve an omni-channel system for support requests. Some of the implemented channels include email, phone, collaboration tools, and internal corporate websites.

In addition, Amazon Lex has been integrated with a diverse set of information sources within FEMSA to create a virtual help desk that enables their employees to find answers faster. These information sources include their CRM and internal ticketing systems. It’s now possible for users to easily chat with the help desk system to create support tickets or get a status update. They’ve also been able to build these types of conversational interactions with other systems and databases to provide more natural responses to users.

The following diagram shows the solution architecture for Xpertal’s Contact Center.

This architecture allows calls coming into the Contact Center to be routed to Amazon Connect. Amazon Connect then invokes Amazon Lex to identify the caller’s need. Subsequently, Amazon Lex uses AWS Lambda to interact with applications’ databases to either fulfill the user’s need by retrieving the information needed or creating a ticket to escalate the user’s request to the appropriate support team.
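As an illustration of that flow (Xpertal’s actual integration isn’t shown in this post), a Lex fulfillment function for this kind of routing might look like the following minimal Python sketch; the intent names and back-end lookups are hypothetical placeholders:

# Hypothetical AWS Lambda fulfillment hook for a Lex (V1) bot behind Amazon Connect.
def lambda_handler(event, context):
    intent = event["currentIntent"]["name"]
    slots = event["currentIntent"]["slots"]

    if intent == "TicketStatus":
        # A real implementation would query the internal ticketing system here.
        message = f"Ticket {slots.get('TicketId')} is currently in progress."
    elif intent == "CreateTicket":
        # A real implementation would open a case and return its ID here.
        message = "I've created a ticket and notified the support team."
    else:
        # Escalate anything else to a live agent via the Connect contact flow.
        message = "Let me transfer you to an agent who can help."

    # Response shape expected by Lex V1 fulfillment code hooks.
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message},
        }
    }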

In addition, all customer calls are recorded and transcribed with Amazon Transcribe for post-call analytics to identify improvement areas and usage trends. The Xpertal team is effectively able to track user interactions. By analyzing the user utterances the bot didn’t understand, the team is able to monitor the solution’s effectiveness and continuously improve containment rates.

Xpertal’s Contact Center Manager, Chester Perez, shares, “Our goal is to keep evolving as an organization and find better ways to deliver our products and improve customer satisfaction. Our talented internal team developed various initiatives focused on bringing more intelligence and automation into our internal contact center to provide self-service capabilities, improve call deflection rates, reduce call wait times, and increase agent productivity. With Amazon Lex’s easy to use interface, our Contact Center team was able to create bots after a 1-hour training session. Thanks to AWS AI services, we can finally focus on how to apply the technology for our users’ benefit and not on what’s behind it.”

Summary

AWS has been working with a variety of customers, such as Xpertal, to find ways for AI services like Amazon Lex to boost self-service capabilities in Spanish, leading to better call containment and improving overall contact center productivity and customer experience.

Get started with this “How to create a virtual call center agent with Amazon Lex” tutorial with any of our localized Spanish language choices. Amazon Lex now offers Spanish, US Spanish, and LATAM Spanish. Depending on your contact center goals, learn more about Amazon Connect’s omni-channel, cloud-based contact center or bring your own telephony (BYOT) with AWS Contact Center Intelligence.

 


About the Authors

Chester Perez has 17 years of experience with the FEMSA group, contributing in the areas of development, infrastructure, and architecture. He has also designed and built teams that provide specialized support and a data center center of excellence. He is currently Manager of the Contact Center at Xpertal, where his main challenge is to improve the quality and efficiency of the service it provides by transforming the area through technology and internal talent.

 

Jorge Alfaro Hidalgo is an Enterprise Solutions Architect at AWS Mexico with more than 20 years of experience in the IT industry. He is passionate about helping enterprises on their AWS cloud journey, building innovative solutions to achieve their business objectives.

Mauricio Zajbert has more than 30 years of experience in the IT industry and is a fully recovered infrastructure professional. He’s currently Solutions Architecture Manager for Enterprise accounts at AWS Mexico, leading a team that helps customers in their cloud journey. He’s lived through several technology waves and deeply believes none has offered the benefits of the cloud.

Read More

Announcing the launch of Amazon Comprehend Events

Every day, financial organizations need to analyze news articles, SEC filings, and press releases, as well as track financial events such as bankruptcy announcements, changes in executive leadership at companies, and announcements of mergers and acquisitions. They want to accurately extract the key data points and associations among various people and organizations mentioned within an announcement to update their investment models in a timely manner. Traditional natural language processing services can extract entities such as people, organizations and locations from text, but financial analysts need more. They need to understand how these entities relate to each other in the text.

Today, Amazon Comprehend is launching Comprehend Events, a new API for event extraction from natural language text documents. With this launch, you can use Comprehend Events to extract granular details about real-world events and associated entities expressed in unstructured text. This new API allows you to answer who-what-when-where questions over large document sets, at scale and without prior NLP experience.

This post gives an overview of the NLP capabilities that Comprehend Events supports, along with suggestions for processing and analyzing documents with this feature. We’ll close with a discussion of several solutions that use Comprehend Events, such as knowledge base population, semantic search, and document triage, all of which can be developed with companion AWS services for storing, visualizing, and analyzing the predictions made by Comprehend Events.

Comprehend Events overview

The Comprehend Events API, under the hood, converts unstructured text into structured data that answers who-what-when-where-how questions. Comprehend Events lets you extract the event structure from a document, distilling pages of text down to easily processed data for consumption by your AI applications or graph visualization tools. In the following figure, an Amazon press release announcing the 2017 acquisition of Whole Foods Market, Inc. is rendered as a graph showing the core semantics of the acquisition event, as well as the status of Whole Foods’ CEO post merger.

 

Amazon (AMZN) today announced that they will acquire Whole Foods Market (WFM) for $42 per share in an all-cash transaction valued at approximately $13.7 billion, including Whole Foods Market’s net debt. Whole Foods Market will continue to operate stores under the Whole Foods Market brand and source from trusted vendors and partners around the world. John Mackey will remain as CEO of Whole Foods Market and Whole Foods Market’s headquarters will stay in Austin, Texas.

From: Amazon Press Center Release Archive


The Comprehend Events API returns a variety of insights into the event semantics of a document:

  • Extracted event triggers – Which events took place. In our example, CORPORATE_ACQUISITION and EMPLOYMENT events were detected. Not shown in the preceding figure, the API also returns which words in the text indicate the occurrence of the event; for example, the words “acquire” and “transaction” in the context of the document indicate that a CORPORATE_ACQUISITION took place.
  • Extracted entity mentions – Which words in the text indicate which entities are involved in the event, including named entities such as “Whole Foods Market” and common nouns such as “today.” The API also returns the type of the entity detected, for example ORGANIZATION for “Whole Foods Market.”
  • Event argument role (also known as slot filling) – Which entities play which roles in which events; for example Amazon is an INVESTOR in the acquisition event.
  • Groups of coreferential event triggers – Which triggers in the document refer to the same event. The API also groups triggers such as “transaction” and “acquire” around the CORPORATE_ACQUISITION event (not shown above).
  • Groups of coreferential entity mentions – Which mentions in the document refer to the same entity. For example, the API returns the grouping of “Amazon” with “they” as a single entity (not shown above).

At the time of launch, Comprehend Events is available as an asynchronous API supporting extraction of a fixed set of event types in the finance domain. This domain includes a variety of event types (such as CORPORATE_ACQUISITION and IPO), both standard and novel entity types (such as PER and ORG vs. STOCK_CODE and MONETARY_VALUE), and the argument roles that can connect them (such as INVESTOR, OFFERING_DATE, or EMPLOYER). For the complete ontology, see the Detect Events API documentation.

To demonstrate the functionality of the feature, we’ll show you how to process a small set of sample documents, using both the Amazon Comprehend console and the Python SDK.

Formatting documents for processing

The first step is to transform raw documents into a suitable format for processing. Comprehend Events imposes a few requirements on document size and composition:

  • Individual documents must be UTF-8 encoded and no more than 10 KB in length. As a best practice, we recommend segmenting larger documents at logical boundaries (section headers) or performing sentence segmentation with existing open-source tools.
  • For best performance, markup (such as HTML), tabular material, and other non-prose spans of text should be removed from documents. The service is intended to process paragraphs of unstructured text.
  • A single job must not contain more than 50 MB of data. Larger datasets must be divided into smaller sets of documents for parallel processing. The different document format modes also impose size restrictions:
    • One document per file (ODPF) – A maximum of 5,000 files in a single Amazon Simple Storage Service (Amazon S3) location.
    • One document per line (ODPL) – A maximum of 5,000 lines in a single text file. Newline characters (\n, \r, \r\n) should be replaced with other whitespace characters within a given document.

For this post, we use a set of 117 documents sampled from Amazon’s Press Center: sample_finance_dataset.txt. The documents are formatted as a single ODPL text file and already conform to the preceding requirements. To implement this solution on your own, just upload the text file to an S3 bucket in your account before continuing with the following steps.
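If your own documents start out as individual text files, the following minimal Python sketch shows one way to assemble an ODPL file that respects these requirements (the input folder name is a placeholder):

import glob

MAX_DOC_BYTES = 10_000  # individual documents must stay under 10 KB

with open("my_odpl_dataset.txt", "w", encoding="utf-8") as odpl:
    for path in sorted(glob.glob("raw_documents/*.txt")):  # placeholder input folder
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # Collapse newlines and extra whitespace so each document occupies one line.
        one_line = " ".join(text.split())
        if len(one_line.encode("utf-8")) <= MAX_DOC_BYTES:
            odpl.write(one_line + "\n")
        else:
            # Larger documents should instead be segmented at logical boundaries.
            print(f"Skipping {path}: exceeds the 10 KB document limit")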

Job creation option 1: Using the Amazon Comprehend console

Creating a new Events labeling job takes only a few minutes.

  1. On the Amazon Comprehend console, choose Analysis jobs.
  2. Choose Create job.
  3. For Name, enter a name (for this post, we use events-test-job).
  4. For Analysis type, choose Events.
  5. For Language, choose English.
  6. For Target event types, choose your types of events (for example, Corporate acquisition).
  7. In the Input data section, for S3 location, enter the location of the sample ODPL file you uploaded earlier.
  8. In the Output data section, for S3 location, enter a location for the event output.
  9. For IAM role, choose to use an existing AWS Identity and Access Management (IAM) role or create a new one.
  10. Choose Create job.

A new job appears in the Analysis jobs queue.

Job creation option 2: Using the SDK

Alternatively, you can perform these same steps with the Python SDK. First, we specify Comprehend Events job parameters, just as we would with any other Amazon Comprehend feature. See the following code:

# Imports used throughout this walkthrough
import json
import uuid
from time import sleep

import boto3
import smart_open

# Client and session information
session = boto3.Session()
comprehend_client = session.client(service_name="comprehend")

# Constants for S3 bucket and input data file.
bucket = "comprehend-events-blogpost-us-east-1"
filename = 'sample_finance_dataset.txt'
input_data_s3_path = f's3://{bucket}/' + filename
output_data_s3_path = f's3://{bucket}/'

# IAM role with access to Comprehend and specified S3 buckets
job_data_access_role = 'arn:aws:iam::xxxxxxxxxxxxx:role/service-role/AmazonComprehendServiceRole-test-events-role'

# Other job parameters
input_data_format = 'ONE_DOC_PER_LINE'
job_uuid = uuid.uuid1()
job_name = f"events-job-{job_uuid}"
event_types = ["BANKRUPTCY", "EMPLOYMENT", "CORPORATE_ACQUISITION", 
               "INVESTMENT_GENERAL", "CORPORATE_MERGER", "IPO",
               "RIGHTS_ISSUE", "SECONDARY_OFFERING", "SHELF_OFFERING",
               "TENDER_OFFERING", "STOCK_SPLIT"]

Next, we use the start_events_detection_job API endpoint to start the analysis of the input data file and capture the job ID, which we use later to poll and retrieve results:

# Begin the inference job
response = comprehend_client.start_events_detection_job(
    InputDataConfig={'S3Uri': input_data_s3_path,
                     'InputFormat': input_data_format},
    OutputDataConfig={'S3Uri': output_data_s3_path},
    DataAccessRoleArn=job_data_access_role,
    JobName=job_name,
    LanguageCode='en',
    TargetEventTypes=event_types
)

# Get the job ID
events_job_id = response['JobId']

An asynchronous Comprehend Events job typically takes a few minutes for a small number of documents and up to several hours for lengthier inference tasks. For our sample dataset, inference should take approximately 20 minutes. It’s helpful to poll the API using the describe_events_detection_job endpoint. When the job is complete, the API returns a JobStatus of COMPLETED. See the following code:

# Get current job status
job = comprehend_client.describe_events_detection_job(JobId=events_job_id)

# Loop until job is completed
waited = 0
timeout_minutes = 30
while job['EventsDetectionJobProperties']['JobStatus'] != 'COMPLETED':
    sleep(60)
    waited += 60
    assert waited//60 < timeout_minutes, "Job timed out after %d seconds." % waited
    job = comprehend_client.describe_events_detection_job(JobId=events_job_id)

Finally, we collect the Events inference output from Amazon S3 and convert to a list of dictionaries, each of which contains the predictions for a given document:

# The output filename is the input filename + ".out"
output_data_s3_file = job['EventsDetectionJobProperties']['OutputDataConfig']['S3Uri'] + filename + '.out'

# Load the output into a list of result dictionaries, one per document
results = []
with smart_open.open(output_data_s3_file) as fi:
    results.extend([json.loads(line) for line in fi.readlines() if line])

The Comprehend Events API output schema

When complete, the output is written to Amazon S3 in JSON lines format, with each line encoding all the event extraction predictions for a single document. Our output schema includes the following information:

  • Comprehend Events system output contains separate objects for entities and events, each organized into groups of coreferential objects.
  • The API output includes the text, character offset, and type of each entity mention and trigger.
  • Event argument roles are linked to entity groups by an EntityIndex.
  • Confidence scores for classification tasks are given as Score. Confidence of entity and trigger group membership is given with GroupScore.
  • Two additional fields, File and Line, are present as well, allowing you to track document provenance.

The following Comprehend Events API output schema represents entities as lists of mentions and events as lists of triggers and arguments:

{ 
    "Entities": [
        {
            "Mentions": [
                {
                    "BeginOffset": number,
                    "EndOffset": number,
                    "Score": number,
                    "GroupScore": number,
                    "Text": "string",
                    "Type": "string"
                }, ...
            ]
        }, ...
    ],
    "Events": [
        {
            "Type": "string",
            "Arguments": [
                {
                    "EntityIndex": number,
                    "Role": "string",
                    "Score": number
                }, ...
            ],
            "Triggers": [
                {
                    "BeginOffset": number,
                    "EndOffset": number,
                    "Score": number,
                    "Text": "string",
                    "GroupScore": number,
                    "Type": "string"
                }, ...
            ]
        }, ...
    ],
    "File": "string",
    "Line": "string"
}

Analyzing Events output

The API output encodes all the semantic relationships necessary to immediately produce several useful visualizations of any given document. We walk through a few such depictions of the data in this section, referring you to the Amazon SageMaker Jupyter notebook accompanying this post for the working Python code necessary to produce them. We use the press release about Amazon’s acquisition of Whole Foods mentioned earlier in this post as an example.

Visualizing entity and trigger spans

As with any sequence labeling task, one of the simplest visualizations for Comprehend Events output is highlighting triggers and entity mentions, along with their respective tags. For this post, we use displaCy’s ability to render custom tags. In the following visualization, we see some of the usual range of entity types detected by NER systems (PERSON, ORGANIZATION), as well as finance-specific ones, such as STOCK_CODE and MONETARY_VALUE. Comprehend Events detects non-named entities (common nouns and pronouns) as well as named ones. In addition to entities, we also see tagged event triggers, such as “merger” (CORPORATE_MERGER) and “acquire” (CORPORATE_ACQUISITION).
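The working code lives in the accompanying notebook; the following simplified sketch shows roughly how the API output can be adapted to displaCy’s manual entity-rendering mode (it assumes the results and filename variables from the earlier SDK steps and ignores overlapping spans):

from spacy import displacy

# Recover the source text for one document from the ODPL input file.
with open(filename, encoding="utf-8") as f:
    raw_lines = f.readlines()

doc = results[0]
raw_text = raw_lines[int(doc["Line"])]

# Collect entity mentions and event triggers as displaCy entity spans.
spans = []
for entity in doc["Entities"]:
    for mention in entity["Mentions"]:
        spans.append({"start": mention["BeginOffset"], "end": mention["EndOffset"],
                      "label": mention["Type"]})
for event in doc["Events"]:
    for trigger in event["Triggers"]:
        spans.append({"start": trigger["BeginOffset"], "end": trigger["EndOffset"],
                      "label": trigger["Type"]})

render_data = [{"text": raw_text,
                "ents": sorted(spans, key=lambda s: s["start"]),
                "title": None}]
displacy.render(render_data, style="ent", manual=True, jupyter=True)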

Graphing event structures

Highlighting tagged spans is informative because it localizes system predictions about entity and event types in the text. However, it doesn’t show the most informative thing about the output: the predicted argument role associations among events and entities. The following plot depicts the event structure of the document as a semantic graph. In the graph, vertices are entity mentions and triggers; edges are the argument roles held by the entities in relation to the triggers. For simple renderings of a small number of events, we recommend common open-source tools such as networkx and pyvis, which we used to produce this visualization. For larger graphs, and graphs of large numbers of documents, we recommend a more robust solution for graph storage, such as Amazon Neptune.

Tabulating event structures

Lastly, you can always render the event structure produced by the API as a flat table, indicating, for example, the argument roles of the various participants in each event, as in the following table. The table demonstrates how Comprehend Events groups entity mentions and triggers into coreferential groups. You can use these textual mention groups to verify and analyze system predictions.
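The following sketch, simplified relative to the notebook, shows one way to assemble such a table from the API output with pandas:

import pandas as pd

doc = results[0]  # Events output for one document, loaded earlier

rows = []
for event in doc["Events"]:
    # Use the first trigger's text as a readable label for the event.
    trigger_text = event["Triggers"][0]["Text"]
    for argument in event["Arguments"]:
        # EntityIndex points into the document's list of entity groups.
        entity = doc["Entities"][argument["EntityIndex"]]
        mentions = ", ".join(m["Text"] for m in entity["Mentions"])
        rows.append({"event_type": event["Type"],
                     "trigger": trigger_text,
                     "role": argument["Role"],
                     "entity_mentions": mentions,
                     "score": argument["Score"]})

events_df = pd.DataFrame(rows)
print(events_df.head())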

Setting up the Comprehend Events AWS CloudFormation stack

You can quickly try out this example for yourself by deploying our sample code into your own account from the provided AWS CloudFormation template. We’ve included all the necessary steps in a Jupyter notebook, so you can easily walk through creating the preceding visualizations and see how it all works. From there, you can easily modify it to run over other custom datasets, modify the results, ingest them into other systems, and build upon the solution. Complete the following steps:

  1. Choose Launch Stack:

  2. After the template loads in the AWS CloudFormation console, choose Next.
  3. For Stack name, enter a name for your deployment.
  4. Choose Next.
  5. Choose Next on the following page.
  6. Select the check box acknowledging this template will create IAM resources.

This allows the SageMaker notebook instance to talk with Amazon S3 and Amazon Comprehend.

  7. Choose Create stack.
  8. When stack creation is complete, browse to your notebook instances on the SageMaker console.

A new instance is already loaded with the example data and Jupyter notebook.

  9. Choose Open Jupyter for the comprehend-events-blog notebook.

The data and notebook are already loaded on the instance. This was done through a SageMaker lifecycle configuration.

  10. Choose the notebooks folder.
  11. Choose the comprehend_events_finance_tutorial.ipynb notebook.
  12. Step through the notebook to try Comprehend Events out yourself.

Applications using Comprehend Events

We have applied Comprehend Events to a small set of documents and demonstrated how to visualize the event structures found in a sample document. The power of Comprehend Events, however, lies in its ability to extract and structure business-relevant facts from large collections of unstructured documents. In this section, we discuss a few potential solutions that you could build on top of the foundation provided by Comprehend Events.

Knowledge graph construction

Business and financial services analysts need to visually explore event-based relationships among corporate entities and identify potential patterns over large collections of data. Without a tool like Comprehend Events, you have to manually identify entities and events of interest in documents and manually enter them into network visualization tools for tracking. Comprehend Events allows you to populate knowledge graphs over large collections of data. You can store and search these graphs, for example, in Neptune, and explore them using network visualization tools without expensive manual extraction.

Semantic search

Analysts also need to find documents in which actors of interest participate in events of interest (at places, at times). The most common approach to this task involves enterprise search: using complex Boolean queries to find co-occurring strings that typically match your desired search patterns. Natural language is rich and highly variable, however, and even the best searches often miss key details in unstructured text. Comprehend Events allows you to populate a search index with event-argument associations, enriching free text search with extracted event data. You can process collections of documents with Comprehend Events, index the documents in Amazon Elasticsearch Service (Amazon ES) with the extracted event data, and enable field-based search over event-argument tuples in downstream applications.
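As a rough illustration only (the endpoint, index name, and field layout are hypothetical, and authentication setup is omitted), event-argument tuples could be indexed alongside document provenance with the elasticsearch Python client as follows:

from elasticsearch import Elasticsearch

es = Elasticsearch("https://my-es-domain.example.com:443")  # hypothetical Amazon ES endpoint

doc = results[0]  # Events output for one document, loaded earlier

# Build a simple searchable representation: event/role/entity tuples plus provenance.
event_tuples = []
for event in doc["Events"]:
    for argument in event["Arguments"]:
        mentions = [m["Text"] for m in doc["Entities"][argument["EntityIndex"]]["Mentions"]]
        event_tuples.append({"event_type": event["Type"],
                             "role": argument["Role"],
                             "entity": mentions[0] if mentions else None})

es.index(index="press-releases",
         body={"file": doc["File"], "line": doc["Line"], "events": event_tuples})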

Document triage

An additional application of Comprehend Events is simple filtration of large text collections for events of interest. This task is typically performed with a tool such as Amazon Comprehend custom classification, but that requires hundreds or thousands of annotated training documents to produce a custom model. Comprehend Events allows developers without such training data to process a large collection of documents and detect the financial events found in the event taxonomy. You can simply process batches of documents with the asynchronous API and route documents matching pre-defined event patterns to downstream applications.
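For example, a simple routing pass over the batch output might look like the following sketch, which reuses the output location from the earlier job (the routing action is a placeholder print):

import json

import smart_open

EVENTS_OF_INTEREST = {"BANKRUPTCY", "CORPORATE_ACQUISITION"}

with smart_open.open(output_data_s3_file) as f:
    for line in f:
        doc = json.loads(line)
        detected = {event["Type"] for event in doc.get("Events", [])}
        matched = detected & EVENTS_OF_INTEREST
        if matched:
            # Replace this print with a call to your downstream application.
            print(f"Routing line {doc['Line']} of {doc['File']}: {sorted(matched)}")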

Conclusion

This post has demonstrated the application and utility of Comprehend Events for information processing in the finance domain. This new feature gives you the ability to enrich your applications with close semantic analysis of financial events from unstructured text, all without any NLP model training or tuning. For more information, check out our documentation, or try the preceding walkthrough for yourself on the console, in our Jupyter notebook via AWS CloudFormation, or on GitHub. We’re excited to hear your comments and questions in the comments section!

 


About the Authors

Graham Horwood is a data scientist at Amazon AI. His work focuses on natural language processing technologies for customers in the public and commercial sectors.

Ben Snively is an AWS Public Sector Specialist Solutions Architect. He works with government, non-profit, and education customers on big data/analytical and AI/ML projects, helping them build solutions using AWS.

Sameer Karnik is a Sr. Product Manager leading product for Amazon Comprehend, AWS’s natural language processing service.

Read More

Bringing your own R environment to Amazon SageMaker Studio

Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). With a single click, data scientists and developers can quickly spin up SageMaker Studio notebooks to explore datasets and build models. On October 27, 2020, Amazon released a custom images feature that allows you to launch SageMaker Studio notebooks with your own images.

SageMaker Studio notebooks provide a set of built-in images for popular data science and ML frameworks and compute options to run notebooks. The built-in SageMaker images contain the Amazon SageMaker Python SDK and the latest version of the backend runtime process, also called kernel. With the custom images feature, you can register custom built images and kernels, and make them available to all users sharing a SageMaker Studio domain. You can start by cloning and extending one of the example Docker files provided by SageMaker, or build your own images from scratch.

This post focuses on adding a custom R image to SageMaker Studio so you can build and train your R models with SageMaker. After attaching the custom R image, you can select the image in Studio and use R to access the SDKs using the RStudio reticulate package. For more information about R on SageMaker, see Coding with R on Amazon SageMaker notebook instances and R User Guide to Amazon SageMaker.

You can create images and image versions and attach image versions to your domain using the SageMaker Studio Control Panel, the AWS SDK for Python (Boto3), and the AWS Command Line Interface (AWS CLI)—for more information about CLI commands, see AWS CLI Command Reference. This post explains both AWS CLI and SageMaker console UI methods to attach and detach images to a SageMaker Studio domain.

Prerequisites

Before getting started, you need to meet the following prerequisites:

Creating your Dockerfile

Before attaching your image to Studio, you need to build a Docker image using a Dockerfile. You can build a customized Dockerfile using base images or other Docker image repositories, such as Jupyter Docker-stacks repository, and use or revise the ones that fit your specific need.

SageMaker maintains a repository of sample Docker images that you can use for common use cases (including R, Julia, Scala, and TensorFlow). This repository contains examples of Docker images that are valid custom images for Jupyter KernelGateway Apps in SageMaker Studio. These custom images enable you to bring your own packages, files, and kernels for use within SageMaker Studio.

For more information about the specifications that apply to the container image that is represented by a SageMaker image version, see Custom SageMaker image specifications.

For this post, we use the sample R Dockerfile. This Dockerfile takes the base Python 3.6 image and installs R system library prerequisites, conda via Miniconda, and R packages and Python packages that are usable via reticulate. You can create a file named Dockerfile using the following script and copy it to your installation folder. You can customize this Dockerfile for your specific use case and install additional packages.

# This project is licensed under the terms of the Modified BSD License 
# (also known as New or Revised or 3-Clause BSD), as follows:

#    Copyright (c) 2001-2015, IPython Development Team
#    Copyright (c) 2015-, Jupyter Development Team

# All rights reserved.

FROM python:3.6

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

# Setup the "sagemaker-user" user with root privileges.
RUN \
    apt-get update && \
    apt-get install -y sudo && \
    useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
    chmod g+w /etc/passwd && \
    echo "${NB_USER}    ALL=(ALL)    NOPASSWD:    ALL" >> /etc/sudoers && \
    # Prevent apt-get cache from being persisted to this layer.
    rm -rf /var/lib/apt/lists/*

USER $NB_UID

# Make the default shell bash (vs "sh") for a better Jupyter terminal UX
ENV SHELL=/bin/bash \
    NB_USER=$NB_USER \
    NB_UID=$NB_UID \
    NB_GID=$NB_GID \
    HOME=/home/$NB_USER \
    MINICONDA_VERSION=4.6.14 \
    CONDA_VERSION=4.6.14 \
    MINICONDA_MD5=718259965f234088d785cad1fbd7de03 \
    CONDA_DIR=/opt/conda \
    PATH=$CONDA_DIR/bin:${PATH}

# Heavily inspired from https://github.com/jupyter/docker-stacks/blob/master/r-notebook/Dockerfile

USER root

# R system library pre-requisites
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    fonts-dejavu \
    unixodbc \
    unixodbc-dev \
    r-cran-rodbc \
    gfortran \
    gcc && \
    rm -rf /var/lib/apt/lists/* && \
    mkdir -p $CONDA_DIR && \
    chown -R $NB_USER:$NB_GID $CONDA_DIR && \
    # Fix for devtools https://github.com/conda-forge/r-devtools-feedstock/issues/4
    ln -s /bin/tar /bin/gtar

USER $NB_UID

ENV PATH=$CONDA_DIR/bin:${PATH}

# Install conda via Miniconda
RUN cd /tmp && \
    curl --silent --show-error --output miniconda-installer.sh https://repo.anaconda.com/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
    echo "${MINICONDA_MD5} *miniconda-installer.sh" | md5sum -c - && \
    /bin/bash miniconda-installer.sh -f -b -p $CONDA_DIR && \
    rm miniconda-installer.sh && \
    conda config --system --prepend channels conda-forge && \
    conda config --system --set auto_update_conda false && \
    conda config --system --set show_channel_urls true && \
    conda install --quiet --yes conda="${CONDA_VERSION%.*}.*" && \
    conda update --all --quiet --yes && \
    conda clean --all -f -y && \
    rm -rf /home/$NB_USER/.cache/yarn


# R packages and Python packages that are usable via "reticulate".
RUN conda install --quiet --yes \
    'r-base=4.0.0' \
    'r-caret=6.*' \
    'r-crayon=1.3*' \
    'r-devtools=2.3*' \
    'r-forecast=8.12*' \
    'r-hexbin=1.28*' \
    'r-htmltools=0.4*' \
    'r-htmlwidgets=1.5*' \
    'r-irkernel=1.1*' \
    'r-rmarkdown=2.2*' \
    'r-rodbc=1.3*' \
    'r-rsqlite=2.2*' \
    'r-shiny=1.4*' \
    'r-tidyverse=1.3*' \
    'unixodbc=2.3.*' \
    'r-tidymodels=0.1*' \
    'r-reticulate=1.*' \
    && \
    pip install --quiet --no-cache-dir \
    'boto3>1.0,<2.0' \
    'sagemaker>2.0,<3.0' && \
    conda clean --all -f -y

WORKDIR $HOME
USER $NB_UID

Setting up your installation folder

You need to create a folder on your local machine and add the following files in that folder:

.
├── Dockerfile
├── app-image-config-input.json
├── create-and-attach-image.sh
├── create-domain-input.json
└── default-user-settings.json

In the following scripts, the Amazon Resource Names (ARNs) should have a format similar to:

arn:partition:service:region:account-id:resource-id
arn:partition:service:region:account-id:resource-type/resource-id
arn:partition:service:region:account-id:resource-type:resource-id
  1. Dockerfile is the Dockerfile that you created in the previous step.
  2. Create a file named app-image-config-input.json with the following content:
    {
        "AppImageConfigName": "custom-r-image-config",
        "KernelGatewayImageConfig": {
            "KernelSpecs": [
                {
                    "Name": "ir",
                    "DisplayName": "R (Custom R Image)"
                }
            ],
            "FileSystemConfig": {
                "MountPath": "/home/sagemaker-user",
                "DefaultUid": 1000,
                "DefaultGid": 100
            }
        }
    }

  3. Create a file named default-user-settings.json with the following content. If you’re adding multiple custom images, add to the list of CustomImages.
    {
      "DefaultUserSettings": {
        "KernelGatewayAppSettings": {
          "CustomImages": [
              {
                       "ImageName": "custom-r",
                       "AppImageConfigName": "custom-r-image-config"
                    }
                ]
            }
        }
    }

  4. Create one last file in your installation folder named create-and-attach-image.sh using the following bash script. The script runs the following in order:
    1. Creates a repository named smstudio-custom in Amazon ECR and logs in to that repository
    2. Builds an image using the Dockerfile and tags the image with r
    3. Pushes the image to Amazon ECR
    4. Creates an image for SageMaker Studio and attaches the Amazon ECR image to that image
    5. Creates an AppImageConfig for this image using app-image-config-input.json
      # Replace with your AWS account ID and your Region, e.g. us-east-1, us-west-2
      ACCOUNT_ID=<AWS ACCOUNT ID>
      REGION=<STUDIO DOMAIN REGION>
      
      # Create a repository in Amazon ECR, and then log in to the ECR repository
      aws --region ${REGION} ecr create-repository --repository-name smstudio-custom
      aws ecr --region ${REGION} get-login-password | docker login --username AWS \
          --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom
      
      # Build the docker image and push to Amazon ECR (modify image tags and name as required)
      docker build . -t smstudio-r -t ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:r
      docker push ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:r
      
      # Using with SageMaker Studio
      ## Create SageMaker Image with the image in ECR (modify image name as required)
      ROLE_ARN="<YOUR EXECUTION ROLE ARN>"
      
      aws sagemaker create-image \
          --region ${REGION} \
          --image-name custom-r \
          --role-arn ${ROLE_ARN}
      
      aws sagemaker create-image-version \
          --region ${REGION} \
          --image-name custom-r \
          --base-image ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:r
      
      ## Create AppImageConfig for this image (modify AppImageConfigName and 
      ## KernelSpecs in app-image-config-input.json as needed)
      ## Note that 'file://' is required in the file path
      aws sagemaker create-app-image-config \
          --region ${REGION} \
          --cli-input-json file://app-image-config-input.json

Updating an existing SageMaker Studio domain with a custom image

If you already have a Studio domain, you don’t need to create a new domain, and can easily update your existing domain by attaching the custom image. You can do this either using the AWS CLI for Amazon SageMaker or the SageMaker Studio Control Panel (which we discuss in the following sections). Before going to the next steps, make sure your domain is in Ready status, and get your Studio domain ID from the Studio Control Panel. The domain ID should be in d-xxxxxxxx format.
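If you’d rather look up the domain ID programmatically than in the Studio Control Panel, a quick sketch with the AWS SDK for Python (Boto3) follows:

import boto3

sagemaker_client = boto3.client("sagemaker")

# List Studio domains in the current Region with their IDs and status.
for domain in sagemaker_client.list_domains()["Domains"]:
    print(domain["DomainId"], domain["DomainName"], domain["Status"])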

Using the AWS CLI for SageMaker

In the terminal, navigate to your installation folder and run the following command to make the bash script executable:

chmod +x create-and-attach-image.sh

Then execute the following command in the terminal:

./create-and-attach-image.sh

After you successfully run the bash script, update your existing domain by executing the following command in the terminal. Make sure you provide your domain ID and Region.

aws sagemaker update-domain --domain-id <DOMAIN_ID> \
    --region <REGION_ID> \
    --cli-input-json file://default-user-settings.json

After executing this command, your domain status shows as Updating for a few seconds and then shows as Ready again. You can now open Studio.
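If you prefer the AWS SDK for Python (Boto3) over the AWS CLI, a roughly equivalent update looks like the following sketch (the domain ID is a placeholder):

import boto3

sagemaker_client = boto3.client("sagemaker")

# Same settings as default-user-settings.json, expressed as a Python dict.
sagemaker_client.update_domain(
    DomainId="d-xxxxxxxx",  # replace with your Studio domain ID
    DefaultUserSettings={
        "KernelGatewayAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "custom-r",
                    "AppImageConfigName": "custom-r-image-config",
                }
            ]
        }
    },
)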

When in the Studio environment, you can use the Launcher to launch a new activity, and should see the custom-r (latest) image listed in the dropdown menu under Select a SageMaker image to launch your activity.

Using the SageMaker console

Alternatively, you can update your domain by attaching the image via the SageMaker console. The image that you created is listed on the Images page on the console.

  1. To attach this image to your domain, on the SageMaker Studio Control Panel, under Custom images attached to domain, choose Attach image.
  2. For Image source, choose Existing image.
  3. Choose an existing image from the list.
  4. Choose a version of the image from the list.
  5. Choose Next.
  6. Choose the IAM role. For more information, see Create a custom SageMaker image (Console).
  7. Choose Next.
  8. Under Studio configuration, enter or change the following settings. For information about getting the kernel information from the image, see DEVELOPMENT in the SageMaker Studio Custom Image Samples GitHub repo.
    1. For EFS mount path, enter the path within the image to mount the user’s Amazon Elastic File System (Amazon EFS) home directory.
    2. For Kernel name, enter the name of an existing kernel in the image.
    3. (Optional) For Kernel display name, enter the display name for the kernel.
    4. Choose Add kernel.
    5. (Optional) For Configuration tags, choose Add new tag and add a configuration tag.

For more information, see the Kernel discovery and User data sections of Custom SageMaker image specifications.

  9. Choose Submit.
  10. Wait for the image version to be attached to the domain.

While attaching, your domain status is in Updating. When attached, the version is displayed in the Custom images list and briefly highlighted, and your domain status shows as Ready.

The SageMaker image store automatically versions your images. You can select a pre-attached image and choose Detach to detach the image and all versions, or choose Attach image to attach a new version. There is no limit to the number of versions per image or the ability to detach images.

Using a custom image to create notebooks

When you’re done updating your Studio domain with the custom image, you can use that image to create new notebooks. To do so, choose your custom image from the list of images in the Launcher. In this example, we use custom-r. This shows the list of kernels that you can use to create notebooks. Create a new notebook with the R kernel.

If this is the first time you’re using this kernel to create a notebook, it may take about a minute to start the kernel, and the Kernel Starting message appears on the lower left corner of your Studio. You can write R scripts while the kernel is starting but can only run your script after your kernel is ready. The notebook is created with a default ml.t3.medium instance attached to it. You can see R (Custom R Image) kernel and the instance type on the upper right corner of the notebook. You can change ML instances on the fly in SageMaker Studio. You can also right-size your instances for different workloads. For more information, see Right-sizing resources and avoiding unnecessary costs in Amazon SageMaker.

To test the kernel, enter the following sample R script in the first cell and run the script. This script tests multiple aspects, including importing libraries, creating a SageMaker session, getting the IAM role, and importing data from public repositories.

The abalone dataset in this post is from Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science (http://archive.ics.uci.edu/ml/datasets/Abalone).

# Simple script to test R Kernel in SageMaker Studio

# Import reticulate, readr and sagemaker libraries
library(reticulate)
library(readr)
sagemaker <- import('sagemaker')

# Create a sagemaker session
session <- sagemaker$Session()

# Get execution role
role_arn <- sagemaker$get_execution_role()

# Read a csv file from UCI public repository
# Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. 
# Irvine, CA: University of California, School of Information and Computer Science
data_file <- 'http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data'

# Copy data to a dataframe, rename columns, and show dataframe head
abalone <- read_csv(file = data_file, col_names = FALSE, col_types = cols())
names(abalone) <- c('sex', 'length', 'diameter', 'height', 'whole_weight', 'shucked_weight', 'viscera_weight', 'shell_weight', 'rings')
head(abalone)

If the image is set up properly and the kernel is running, the output should look like the following screenshot.

Listing, detaching, and deleting custom images

If you want to see the list of custom images attached to your Studio, you can either use the AWS CLI or go to SageMaker console to view the attached image in the Studio Control Panel.

Using the AWS CLI for SageMaker

To view your list of custom images via the AWS CLI, enter the following command in the terminal (provide the Region in which you created your domain):

aws sagemaker list-images --region <region-id> 

The response includes the details for the attached custom images:

{
    "Images": [
        {
            "CreationTime": "xxxxxxxxxxxx",
            "ImageArn": "arn:aws:sagemaker:us-east-2:XXXXXXX:image/custom-r",
            "ImageName": "custom-r",
            "ImageStatus": "CREATED",
            "LastModifiedTime": "xxxxxxxxxxxxxx"
        },
        ....
    ]
}

If you want to detach or delete an attached image, you can do it on the SageMaker Studio Control Panel (see Detach a custom SageMaker image). Alternatively, use the custom image name from your default-user-settings.json file and rerun the following command to update the domain by detaching the image:

aws sagemaker update-domain --domain-id <YOUR DOMAIN ID> \
    --cli-input-json file://default-user-settings.json

Then, delete the app image config:

aws sagemaker delete-app-image-config \
    --app-image-config-name custom-r-image-config

Delete the SageMaker image, which also deletes all image versions. The container images in Amazon ECR that are represented by the image versions are not deleted.

aws sagemaker delete-image \
    --region <region-id> \
    --image-name custom-r

After you delete the image, it is no longer listed under custom images in SageMaker Studio. For more information, see Clean up resources.

Using the SageMaker console

You can also detach (and delete) images from your domain via the Studio Control Panel UI. To do so, under Custom images attached to domain, select the image and choose Detach. You have the option to also delete all versions of the image from your domain. This detaches the image from the domain.

Getting logs in Amazon CloudWatch

You can also get access to SageMaker Studio logs in Amazon CloudWatch, which you can use for troubleshooting your environment. The logs are captured under the /aws/sagemaker/studio namespace.

To access the logs, on the CloudWatch console, choose CloudWatch Logs. On the Log groups page, enter the namespace to see logs associated with the Jupyter server and the kernel gateway.
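You can also list these log groups with the AWS SDK for Python (Boto3); the following small sketch assumes the default namespace (exact log group names vary by app type):

import boto3

logs_client = boto3.client("logs")

# List Studio-related log groups under the /aws/sagemaker/studio namespace.
paginator = logs_client.get_paginator("describe_log_groups")
for page in paginator.paginate(logGroupNamePrefix="/aws/sagemaker/studio"):
    for group in page["logGroups"]:
        print(group["logGroupName"])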

For more information, see Log Amazon SageMaker Events with Amazon CloudWatch.

Conclusion

This post outlined the process of attaching a custom Docker image to your Studio domain to extend Studio’s built-in images. We discussed how you can update an existing domain with a custom image using either the AWS CLI for SageMaker or the SageMaker console. We also explained how you can use the custom image to create notebooks with custom kernels.

For more information, see the following resources:


About the Authors

Nick Minaie is an Artificial Intelligence and Machine Learning (AI/ML) Specialist Solution Architect, helping customers on their journey to well-architected machine learning solutions at scale. In his spare time, Nick enjoys family time, abstract painting, and exploring nature.

Sam Liu is a product manager at Amazon Web Services (AWS). His current focus is the infrastructure and tooling of machine learning and artificial intelligence. Beyond that, he has 10 years of experience building machine learning applications in various industries. In his spare time, he enjoys making short videos for technical education or animal protection.

Read More

Building natural conversation flows using context management in Amazon Lex


Understanding the direction and context of an ever-evolving conversation is beneficial to building natural, human-like conversational interfaces. Being able to classify utterances as the conversation develops requires managing context across multiple turns. Consider a caller who asks their financial planner for insights regarding their monthly expenses: “What were my expenses this year?” They may also ask for more granular information, such as “How about for last month?” As the conversation progresses, the bot needs to understand if the context is changing and adjust its responses accordingly.

Amazon Lex is a service for building conversational interfaces using voice and text. Previously, you had to write code to manage context via session attributes. Depending on the intent, the code had to orchestrate the invocation of the next intent. As conversation complexity and the intent count increased, managing this orchestration became more cumbersome.

Starting today, Amazon Lex supports context management natively, so you can manage the context directly without the need for custom code. As initial prerequisite intents are filled, you can create contexts to invoke related intents. This simplifies bot design and expedites the creation of conversational experiences.

Use case

This post uses the following conversation to model a bot for financial planning:

User:    What was my income in August?
Agent:  Your income in August was $2345.
User:    Ok. How about September?
Agent:  Your income in September was $4567.
User:    What were my expenses in July?
Agent:  Your expenses for July were $123.
User:    Ok thanks. 

Building the Amazon Lex bot FinancialPlanner

In this post, we build an Amazon Lex bot called FinancialPlanner, which is available for download. Complete the following steps:

  1. Create the following intents:
    1. ExpensesIntent – Elicits information, such as account ID and period, and provides expenses detail
    2. IncomeIntent – Elicits information, such as account ID and period, and provides income detail
    3. ExpensesFollowup – Invoked after expenses intent to respond to a follow-up query, such as “How about [expenses] last month?”
    4. IncomeFollowup – Invoked after income intent to respond to a follow-up query about income, such as “How about [income] last month?”
    5. Fallback – Captures any input that the bot can’t process by the configured intents
  2. Set up context tags for the expenses intents.

The context management feature defines input and output context tags that the bot developer can set. You use these tags to manage the conversation flow. For our use case, we set expenses as the output context tag in ExpensesIntent. We also use it as the input context for ExpensesFollowupIntent. We can also configure the output tag with a timeout, measured in conversation turns or in seconds since the initial intent was invoked.

The following screenshot shows the Context configuration section on the Amazon Lex console.

The following screenshot shows the specific parameters for the expenses tag.

  3. Set up context tags for the income intents.

Similar to expenses, we now set the context for income intents. For IncomeIntent, set the output context tag as income. We use this context as the input context for IncomeFollowupIntent.

  4. Build the bot and test it on the Amazon Lex console.

To test the bot, provide the input “What were my expenses this year” followed by “How about last month?” For the second request, the bot selects ExpensesFollowupIntent because the expenses context is active. Alternatively, if you start with “What was my income this year?” followed by “How about last year?”, the bot invokes the IncomeFollowupIntent because the income context is active.

The following screenshot illustrates how the context tags are used to invoke the appropriate intent.

You can configure the behavior of the context attributes by editing the threshold. The number of turns sets the limit on interactions with the bot, and the number of seconds is measured from when the context was originally set. As long as the intent with the output tag occurs before the turn- or time-based timeout, the user can invoke the intent based on the input context.
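Although this post configures contexts on the Amazon Lex console, you can set the same tags through the model-building API. The following rough sketch uses the Lex (V1) put_intent operation with most other intent fields omitted; it assumes the intents don’t already exist (updating an existing intent additionally requires its current checksum):

import boto3

lex_models = boto3.client("lex-models")

# The initial intent sets an output context that stays active for 5 turns or 90 seconds.
lex_models.put_intent(
    name="ExpensesIntent",
    sampleUtterances=["What were my expenses this year"],
    outputContexts=[{"name": "expenses", "timeToLiveInSeconds": 90, "turnsToLive": 5}],
    fulfillmentActivity={"type": "ReturnIntent"},
)

# The follow-up intent is only eligible while the expenses context is active.
lex_models.put_intent(
    name="ExpensesFollowup",
    sampleUtterances=["How about last month"],
    inputContexts=[{"name": "expenses"}],
    fulfillmentActivity={"type": "ReturnIntent"},
)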

Along with the context management feature, you can also set default slot values. You can set the slots to populate from a context, a session attribute, or a value. In our sample bot model, the {month} slot in ExpensesIntent is set to August as the default slot value.

Conclusion

With the new Amazon Lex context management feature, you can easily orchestrate when to enable intents based on prior intents, and pass specific user data values from one intent to another. This capability allows you to create sophisticated, multi-turn conversational experiences without having to write custom code. Context carry-over, along with default slot values, simplifies bot development and allows you to easily create more natural, conversational user experiences. For more information, see Setting Intent Context documentation.

 


About the Authors

Blake DeLee is a Rochester, NY-based conversational AI consultant with AWS Professional Services. He has spent five years in the field of conversational AI and voice, and has experience bringing innovative solutions to dozens of Fortune 500 businesses. Blake draws on a wide-ranging career in different fields to build exceptional chatbot and voice solutions.

 

As a Product Manager on the Amazon Lex team, Harshal Pimpalkhute spends his time trying to get machines to engage (nicely) with humans.

Esther Lee is a Product Manager for AWS Language AI Services. She is passionate about the intersection of technology and education. Out of the office, Esther enjoys long walks along the beach, dinners with friends and friendly rounds of Mahjong.

Read More

Customizing your machine translation using Amazon Translate Active Custom Translation

When translating the English phrase “How are you?” to Spanish, would you prefer to use “¿Cómo estás?” or “¿Cómo está usted?” instead?

Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Today, we’re excited to introduce Active Custom Translation (ACT), a feature that gives you more control over your machine translation output. You can now influence what machine translation output you would like to get between “¿Cómo estás?” or “¿Cómo está usted?”. To make ACT work, simply provide your translation examples in TMX, TSV, or CSV format to create parallel data (PD), and Amazon Translate uses your PD along with your batch translation job to customize the translation output at runtime. If you have PD that shows “How are you?” being translated to “¿Cómo está usted?”, ACT knows to customize the translation to “¿Cómo está usted?”.

Today, professional translators use examples of previous translations to provide more customized translations for customers. Similar to professional translators, Amazon Translate can now provide customized translations by learning from your translation examples.

Traditionally, this customization was done by creating a custom translation model—a specific-purpose translation engine built using customer data. Building custom translation models is complex, tedious, and expensive. It requires special expertise to prepare the data for training, testing, and validation. Then you build, deploy, and maintain the model by updating it frequently. To save on model training and management costs, you may choose to delay updating your custom translation model, which means your models are always stale—negatively affecting your custom translation experience. In spite of all this work, these custom models perform well when the translation job is within the domain of your data. However, they tend to perform worse than a generic model when the translation job is outside the domain of your customization data.

Amazon Translate ACT introduces an innovative way of providing customized translation output on the fly with your parallel data, without building a custom translation model. ACT output quality is always up to date with your PD. ACT provides the best translations for jobs both within the domain and outside the domain of PD. For example, if a source sentence isn’t in the domain of the PD, the translation output is still as good as the generic translation with no significant deterioration in translation quality. You no longer need to go through the tedious process of building and retraining custom translation models for each incoming use case. Just update the PD, and the ACT output automatically adapts to the most recent PD, without needing any retraining.

“Innovation is in our DNA. Our customers look to AWS to lead in customization of machine translation. Current custom translation technology is inefficient, cumbersome, and expensive,” says Marcello Federico, Principal Applied Scientist at Amazon Machine Learning, AWS. “Active Custom Translation allows our customers to focus on the value of their latest data and forget about the lifecycle management of custom translation models. We innovated on behalf of the customer to make custom machine translation easy.”

Don’t just take our word for it

Custom.MT implements machine translation for localization groups and translation companies. Konstantin Dranch, Custom.MT co-founder, shares, “Amazon Translate’s ACT is a breakthrough machine translation setup. A manual engine retraining takes 15–16 work hours, that’s why most language teams in the industry update their engines only once a month or once a quarter. With ACT, retraining is continuous and engines improve every day based on edits by human translators. Even before the feature was released to the market, we saw tremendous interest from leading software localization teams. With a higher quality of machine translation, enterprise teams can save millions of USD in manual translations and improve other KPIs, such as international user engagement and time to market.”

Welocalize is a leading global localization and translation company. Senior Manager of AI Deployments at Welocalize Alex Yanishevsky says, “Welocalize produces high-quality translations, so our customers can transform their content and data to grow globally and expand into international markets. Active Custom Translation from Amazon Translate allows us to customize our translations at runtime and provides us with significant flexibility in our production cycles. In addition, we see great business value and engine quality improvement since we can retrain engines frequently without incurring additional hosting or training charges.”

One Hour Translation is a leading professional language services provider. Yair Tal, CEO of One Hour Translation, says, “The customer demand for customized Neural Machine Translation (NMT) is growing every month because of the cost savings. As one of the first to try Amazon Translate ACT, we have found that ACT provides the best translation output for many language pairs. With ACT, training and maintenance is simple and the Translate API integrates with our system seamlessly. Translate’s pay-as-you-translate pricing helps our clients, both big and small, get translation output that is tailored for their needs without paying to train custom models.”

Building an Active Custom Translation job

Active Custom Translation’s capabilities are built right into the Amazon Translate experience. In this post, we walk you through the step-by-step process of using your data and getting a customized machine translated output securely. ACT is now available on batch translation, so first familiarize yourself with how to create a batch translation job.

You need data to customize your translation for terms or phrases that are unique to a specific domain, such as life sciences, law, or finance. You bring in examples of high-quality translations (source sentence and translated target sentence) in your preferred domain as a file in TMX, TSV, or CSV format. This data must be UTF-8 encoded. You use this data to create a PD, and Amazon Translate uses the PD to customize your machine translation. Each PD can be up to 1 GB in size. You can upload up to 1,000 PD per account per Region; this limit can be increased upon request. You get free storage for parallel data for up to 200 GB, and you pay the Amazon Simple Storage Service (Amazon S3) rate in your Region for excess data stored.

For our use case, I have my data in TSV format, and the name of my file is Mydata.tsv. I first upload this file to an S3 location (for this post, I store my data in s3://input-s3bucket/Paralleldata/).

The following table summarizes the contents of the file.

en | es
Amazon Translate is a neural machine translation service. | Amazon Translate es un servicio de traducción automática basado en redes neuronales.
Neural machine translation is a form of language translation automation that uses deep learning models. | La traducción automática neuronal es una forma de automatizar la traducción de lenguajes utilizando modelos de aprendizaje profundo.
How are you? | ¿Cómo está usted?
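
If you prefer to prepare this file programmatically, the following is a minimal sketch that writes the same sentence pairs as a UTF-8, tab-separated file (with the language codes as the header row) and uploads it to the S3 prefix used in this post. The use of the csv module here is just one convenient way to produce a valid TSV.

import csv

import boto3

# Example sentence pairs (source English, target Spanish), matching the table above
rows = [
    ("Amazon Translate is a neural machine translation service.",
     "Amazon Translate es un servicio de traducción automática basado en redes neuronales."),
    ("How are you?", "¿Cómo está usted?"),
]

# Write a UTF-8 encoded, tab-separated file; the first row holds the language codes
with open("Mydata.tsv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["en", "es"])
    writer.writerows(rows)

# Upload the file to the S3 location used for the parallel data in this post
boto3.client("s3").upload_file("Mydata.tsv", "input-s3bucket", "Paralleldata/Mydata.tsv")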

We run this example in the US West (Oregon) Region, us-west-2.

CreateParallelData

Calling the CreateParallelData API creates a PD resource record in our database and asynchronously starts a workflow for processing the PD file and ingesting it into our service.

CLI

The following CLI commands are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

Run the following CLI command:

aws translate create-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${S3_URI},Format=${FORMAT} \
--region ${REGION}

I use Mydata.tsv to create my PD my-parallel-data-1:

aws translate create-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "CREATING"
}

This means that your PD is being created now.

Run aws translate create-parallel-data help for more information.
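
If you work from Python rather than the CLI, you can do the same thing with the AWS SDK for Python (Boto3). The following is a minimal sketch that creates the same PD and then polls get_parallel_data until it leaves the CREATING state; the 30-second polling interval is arbitrary.

import time
import uuid

import boto3

translate = boto3.client("translate", region_name="us-west-2")

# Create the parallel data resource from the TSV file uploaded earlier
translate.create_parallel_data(
    Name="my-parallel-data-1",
    ParallelDataConfig={
        "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
        "Format": "TSV",
    },
    ClientToken=str(uuid.uuid4()),
)

# Poll until the parallel data leaves the CREATING state (it should end up ACTIVE)
while True:
    status = translate.get_parallel_data(Name="my-parallel-data-1")[
        "ParallelDataProperties"]["Status"]
    print("Parallel data status:", status)
    if status != "CREATING":
        break
    time.sleep(30)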

Console

To use the Amazon Translate console, complete the following steps:

  1. On the Amazon Translate console, under Customization, choose Parallel data.
  2. Choose Create parallel data.

  1. For Name, insert my-parallel-data-1.
  2. For Parallel data location in S3, enter your S3 location (for this post, s3://input-s3bucket/Paralleldata/Mydata.tsv).
  3. For File format, you can choose CSV, TSV, or TMX. For this post, we choose Tab-separated values (.tsv).

Your data is always secure with Amazon Translate. It’s encrypted using an AWS owned encryption key by default. You can encrypt it using a key from your current account or use a key from a different account.

  1. For this post, for Encryption key, we select Use AWS owned key.
  2. Choose Create parallel data.

ListParallelData

Calling the ListParallelData API returns a list of the PD that exist and their details (it doesn’t include a pre-signed Amazon S3 URL for downloading the data).

CLI

Run the following CLI command:

aws translate list-parallel-data \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataPropertiesList": [
        {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
                "Format": "TSV"
            },
            "ImportedDataSize": 532,
            "ImportedRecordCount": 3,
            "FailedRecordCount": 0,
            "CreatedAt": 1234567890.406,
            "LastUpdatedAt": 1234567890.675
        }
    ]
}

The "Status": "ACTIVE" means your PD is ready for you to use.

Run aws translate list-parallel-data help for more information.

Console

The following screenshot shows the result for list-parallel-data on the Amazon Translate console.

GetParallelData

Calling the GetParallelData API returns details of the named parallel data and a pre-signed Amazon S3 URL for downloading the data.

CLI

Run the following CLI command:

aws translate get-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--region ${REGION}

For example, my code looks like the following:

aws translate get-parallel-data \
--name my-parallel-data-1 \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataProperties": {
        "Name": "my-parallel-data-1",
        "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
        "Status": "ACTIVE",
        "SourceLanguageCode": "en",
        "TargetLanguageCodes": [
            "es"
        ],
        "ParallelDataConfig": {
            "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
            "Format": "TSV"
        },
        "ImportedDataSize": 532,
        "ImportedRecordCount": 3,
        "FailedRecordCount": 0,
        "CreatedAt": 1234567890.406,
        "LastUpdatedAt": 1234567890.675
    },
    "DataLocation": {
        "RepositoryType": "S3",
        "Location": "xxx"
    }
}

“Location” contains the pre-signed Amazon S3 URL for downloading the data.

Run aws translate get-parallel-data help for more information.

Console

On the Amazon Translate console, choose one of the PD files on the Parallel data page.

You’re directed to another page that includes the detail for this parallel data file. The following screenshot shows the details for get-parallel-data.

UpdateParallelData

Calling the UpdateParallelData API replaces the old parallel data with the new one.

CLI

Run the following CLI command:

aws translate update-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${NEW_S3_URI},Format=${FORMAT} \
--region us-west-2

For this post, Mydata1.tsv is my new parallel data. My code looks like the following:

aws translate update-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata1.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "ACTIVE",
    "LatestUpdateAttemptStatus": "UPDATING",
    "LatestUpdateAttemptAt": 1234567890.844
}

The "LatestUpdateAttemptStatus": "UPDATING" means your parallel data is being updated now.

Wait for a few minutes and run get-parallel-data again. You can see that the parallel data has been updated, as in the following code:

{
    "ParallelDataProperties": {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata1.tsv",
                "Format": "TSV"
            },
        ...
    }
}

We can see that the parallel data has been updated from Mydata.tsv to Mydata1.tsv.

Run aws translate update-parallel-data help for more information.

Console

On the Amazon Translate console, choose the parallel data file and choose Update.

You can replace the existing parallel data file with a new one by specifying the new Amazon S3 URL.

Creating your first Active Custom Translation job

In this section, we discuss the different ways you can create your ACT job.

StartTextTranslationJob

Calling the StartTextTranslationJob starts a batch translation. When you add parallel data to a batch translation job, you create an ACT job. Amazon Translate customizes your ACT output to match the style, tone, and word choices it finds in your PD. ACT is a premium product, so see Amazon Translate pricing for pricing information. You can only specify one parallel data file to use with the text translation job.

CLI

Run the following command:

aws translate start-text-translation-job \
--input-data-config ContentType=${CONTENT_TYPE},S3Uri=${INPUT_S3_URI} \
--output-data-config S3Uri=${OUTPUT_S3_URI} \
--data-access-role-arn ${DATA_ACCESS_ROLE} \
--source-language-code=${SOURCE_LANGUAGE_CODE} --target-language-codes=${TARGET_LANGUAGE_CODE} \
--parallel-data-names ${PARALLEL_DATA_NAME} \
--region ${REGION} \
--job-name ${JOB_NAME}

For example, my code looks like the following:

aws translate start-text-translation-job \
--input-data-config ContentType=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,S3Uri=s3://input-s3bucket/inputfile/ \
--output-data-config S3Uri=s3://output-s3bucket/Output/ \
--data-access-role-arn arn:aws:iam::123456789012:role/TranslateBatchAPI \
--source-language-code=en --target-language-codes=es \
--parallel-data-names my-parallel-data-1 \
--region us-west-2 \
--job-name ACT1

You get a response like the following code:

{
    "JobId": "4446f95f20c88a4b347449d3671fbe3d",
    "JobStatus": "SUBMITTED"
}

This output means the job has been submitted successfully.

Run aws translate start-text-translation-job help for more information.
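
As a rough Boto3 equivalent of the preceding CLI command, the following sketch starts the same ACT job and then polls describe_text_translation_job until the job finishes; the polling loop is only illustrative.

import time
import uuid

import boto3

translate = boto3.client("translate", region_name="us-west-2")

# Start the same batch translation job, customized with the parallel data
job = translate.start_text_translation_job(
    JobName="ACT1",
    InputDataConfig={
        "S3Uri": "s3://input-s3bucket/inputfile/",
        "ContentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    },
    OutputDataConfig={"S3Uri": "s3://output-s3bucket/Output/"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/TranslateBatchAPI",
    SourceLanguageCode="en",
    TargetLanguageCodes=["es"],
    ParallelDataNames=["my-parallel-data-1"],
    ClientToken=str(uuid.uuid4()),
)

# Poll the job until it reaches a terminal state
while True:
    props = translate.describe_text_translation_job(JobId=job["JobId"])[
        "TextTranslationJobProperties"]
    print("Job status:", props["JobStatus"])
    if props["JobStatus"] in ("COMPLETED", "COMPLETED_WITH_ERROR", "FAILED", "STOPPED"):
        break
    time.sleep(60)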

Console

For instructions on running a batch translation job on the Amazon Translate console, see Translating documents, spreadsheets, and presentations in Office Open XML format using Amazon Translate. Choose my-parallel-data-1 as the parallel data to create your first ACT job, ACT1.

Congratulations! You have created your first ACT job. ACT is available in the following Regions:

  • US East (Northern Virginia)
  • US West (Oregon)
  • Europe (Ireland)

Running your Active Custom Translation job

ACT works on asynchronous batch translation for language pairs that have English as either the source or target language.

Now, let’s try to translate the following text from English to Spanish and see how ACT helps to customize the output:

“How are you?” is one of the most common questions you’ll get asked when meeting someone. The most common response is “good”

The following is the output you get when you translate without any customization:

“¿Cómo estás?” es una de las preguntas más comunes que se le harán cuando conozca a alguien. La respuesta más común es “Buena”

The following is the output you get when you translate using ACT with my-parallel-data-1 as the PD:

“¿Cómo está usted?” es una de las preguntas más comunes que te harán cuando te reúnas con alguien. La respuesta más común es “Buena”

Conclusion

Amazon Translate ACT introduces a powerful way of providing personalized translation output with the following benefits:

  • You don’t have to build a custom translation model
  • You only pay for what you translate using ACT
  • There is no additional model building or model hosting cost
  • Your data is always secure and always under your control
  • You get the best machine translation even when your source text is outside the domain of your parallel data
  • You can update your parallel data as often as you need for no additional cost

Try ACT today. Bring your parallel data and start customizing your machine translation output. For more information about Amazon Translate ACT, see Asynchronous Batch Processing.

About the Authors

Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.

 

 

 

Xingyao Wang is a Software Development Engineer for Amazon Translate, AWS’s natural language processing service. She likes to hang out with her cats at home.

Getting started with Amazon Kendra ServiceNow Online connector

Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). To make it simple to search data across multiple content repositories, Amazon Kendra offers a number of native data source connectors to help get your documents easily ingested and indexed.

This post describes how you can use the Amazon Kendra ServiceNow connector. To allow the connector to access your ServiceNow site, you need to know your ServiceNow version, the Amazon Kendra index, the ServiceNow host URL, and the credentials of a user with the ServiceNow admin role attached to it. The ServiceNow credentials needed for the Amazon Kendra ServiceNow connector to work are securely stored in AWS Secrets Manager, and can be entered during the connector setup.

Currently, Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition for building proof of concepts (POCs), and the Amazon Kendra Enterprise Edition. Amazon Kendra connectors work with both these editions.

The Amazon Kendra ServiceNow Online connector indexes Service Catalog items and public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

Prerequisites

To get started, you need the following:

  • The ServiceNow host URL
  • The username and password of a user with the admin role
  • Your ServiceNow version

The user that you use for the connector needs to have the admin role in ServiceNow. This is defined on ServiceNow’s User Administration page (see the section Insufficient Permissions for more information).

When setting up the ServiceNow connector, we need to define whether our build is London or a different ServiceNow version. To obtain our build name, we can go to the System Diagnostics menu and choose Stats.

In the following screenshot, my build name is Orlando, so I indicate on the connector that my version is Others.

Creating a ServiceNow connector in the Amazon Kendra console

The following section describes the process of deploying an Amazon Kendra index and configuring a ServiceNow connector.

  1. Create your index. For instructions, see Getting started with the Amazon Kendra SharePoint Online connector.

If you already have an index, you can skip this step.

The next step is to set up the data sources. One of the advantages of implementing Amazon Kendra is that you can use a set of pre-built connectors for data sources, such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), Salesforce, ServiceNow, and SharePoint Online, among others.

For this post, we use the ServiceNow connector.

  1. On the Amazon Kendra console, choose Indexes.
  2. Choose MyServiceNowindex.
  3. Choose Add data sources.

  1. Choose ServiceNow Online.

  1. For Name, enter a connector name.
  2. For Description, enter an optional description.
  3. For Tags, you can optionally assign tags to your data source.
  4. Choose Next.

In this next step, we define targets.

  1. For ServiceNow host, enter the host name.
  2. For ServiceNow version, enter your version (for this post, we choose Others).
  3. For IAM role, we can create a new AWS Identity and Access Management (IAM) role or use an existing one.

For more information, see IAM role for ServiceNow data sources.

This role needs permissions to access the secret in Secrets Manager and to update your index; the required permissions are described in detail later in this post.

If you use an existing IAM role, you have to grant it permissions to access this secret in Secrets Manager. If you create a new IAM role and a new secret, no further action is required.

  1. Choose Next.

You then need to define ServiceNow authentication details, the content to index, and the synchronization schedule.

The ServiceNow user you provide for the connector needs to have the admin role.

  1. In the Authentication section, for Type of authentication, choose an existing secret or create a new one. For this post, we choose New.
  2. Enter your secret’s name, username, and password.

  1. In the ServiceNow configuration section, we define the content types we need to index: Knowledge articles, Service catalog items, or both.
  2. You also define whether to include the item attachments.

Amazon Kendra only indexes public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

  1. You can include or exclude some file extensions (for example, for Microsoft Word, we have six different types of extensions).

  1. For Frequency, choose Run on demand.

  1. Add field mappings.

Even though this is an optional step, it’s a good idea to add this extra layer of metadata to our documents from ServiceNow. This metadata enables you to improve accuracy through manual tuning, filtering, and faceting. There is no way to add metadata to already ingested documents, so if you want to add metadata later, you need to delete this data source and recreate a data source with metadata and ingest your documents again.

If you map fields through the console when setting up the ServiceNow connector for the first time, these fields are created automatically. If you configure the connector via the API, you need to update your index first and define those new fields.

You can map ServiceNow properties to Amazon Kendra index fields. The following table is the list of fields that we can map.

ServiceNow Field Name Suggested Amazon Kendra Field Name
content _document_body
displayUrl sn_display_url
first_name sn_ka_first_name
kb_category sn_ka_category
kb_category_name _category
kb_knowledge_base sn_ka_knowledge_base
last_name sn_ka_last_name
number sn_kb_number
published sn_ka_publish_date
repItemType sn_repItemType
short_description _document_title
sys_created_by sn_createdBy
sys_created_on _created_at
sys_id sn_sys_id
sys_updated_by sn_updatedBy
sys_updated_on _last_updated_at
url sn_url
user_name sn_ka_user_name
valid_to sn_ka_valid_to
workflow_state sn_ka_workflow_state

Even though these are the suggested Amazon Kendra field names, you can map a ServiceNow field to a different index field name.

The following table summarizes the available service catalog fields.

ServiceNow Field Name Suggested Amazon Kendra Field Name
category sn_sc_category
category_full_name sn_sc_category_full_name
category_name _category
description _document_body
displayUrl sn_display_url
repItemType sn_repItemType
sc_catalogs sn_sc_catalogs
sc_catalogs_name sn_sc_catalogs_name
short_description _document_body
sys_created_by sn_createdBy
sys_created_on _created_at
sys_id sn_sys_id
sys_updated_by sn_updatedBy
sys_updated_on _last_updated_at
title _document_title
url sn_url

For this post, our Amazon Kendra index has a custom index field called MyCustomUsername, which you can use to map the Username field from different data sources. This custom field was created under the index’s facet definition. The following screenshot shows a custom mapping.

  1. Review the settings and choose Create data source.

After your ServiceNow data source is created, you see a banner similar to the following screenshot.

  1. Choose Sync now to start the syncing and document ingestion process.

If everything goes as expected, you can see the status as Succeeded.

Testing

Now that you have synced your ServiceNow site, you can test it on the Amazon Kendra search console.

In my case, my ServiceNow site has the demo examples, so I searched for “what is the storage on the iPad 3,” which returned information from a service catalog item.

Creating a ServiceNow connector with Python

We saw how to create an index on the Amazon Kendra console; now we create a new Amazon Kendra index and a ServiceNow connector and sync it by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

My personal preference for testing Python scripts is to spin up an Amazon SageMaker notebook instance, a fully managed Amazon Elastic Compute Cloud (Amazon EC2) instance for ML that runs the Jupyter Notebook app. For instructions, see Create an Amazon SageMaker Notebook Instance.

To create an index using the AWS SDK, we need to have the policy AmazonKendraFullAccess attached to the role we use.

Also, Amazon Kendra requires different roles to operate:

  • IAM roles for indexes, which are needed by Amazon Kendra to write to Amazon CloudWatch Logs.
  • IAM roles for data sources, which are needed when we use the CreateDataSource API. These roles require a specific set of permissions depending on the connector we use. Because we use a ServiceNow data source, the role must provide permissions to:
    • Access Secrets Manager, where the ServiceNow Online credentials are stored.
    • Use the AWS Key Management Service (AWS KMS) customer master key (CMK) that Secrets Manager uses to decrypt the credentials.
    • Use the BatchPutDocument and BatchDeleteDocument operations to update the index.

For more information, see IAM access roles for Amazon Kendra.

Our current requirements are:

  • Amazon SageMaker Notebooks execution role with permission to create an Amazon Kendra index using an Amazon SageMaker notebook
  • Amazon Kendra IAM role for CloudWatch
  • Amazon Kendra IAM role for ServiceNow connector
  • ServiceNow credentials stored on Secrets Manager

To create an index, we use the following code:

import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print("Creating an index")

description = <YOUR_INDEX_DESCRIPTION>
index_name = <YOUR_NEW_INDEX_NAME>
role_arn = <KENDRA_ROLE_WITH_CLOUDWATCH_PERMISSIONS_ROLE>

try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = role_arn,
        Edition = "DEVELOPER_EDITION",
        Tags=[
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )

    pprint.pprint(index_response)

    index_id = index_response['Id']

    print("Wait for Kendra to create the index.")

    while True:
        # Get the index description
        index_description = kendra.describe_index(
            Id = index_id
        )
        # If the status is no longer CREATING, quit
        status = index_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)

except ClientError as e:
    print("%s" % e)

print("Done creating index.")

While our index is being created, we obtain regular updates (every 60 seconds, per the time.sleep(60) call) until the process is finished. See the following code:

Creating an index
 {'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
  'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                       'content-type': 'application/x-amz-json-1.1',
                                       'date': 'Wed, 12 Aug 2020 12:58:19 GMT',
                                       'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                       'HTTPStatusCode': 200,
                       'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                       'RetryAttempts': 2}}
 Wait for Kendra to create the index.
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: ACTIVE
 Done creating index

The preceding code indicates that our index has been created and our new index ID is 3311b507-bfef-4e2b-bde9-7c297b1fd13b (your ID is different from our example code). This information is included as ID in the response.

Our Amazon Kendra index is up and running now.

If you have metadata attributes associated with your ServiceNow articles, you want to do three things:

  1. Determine the Amazon Kendra attribute name you want for each of your ServiceNow metadata attributes. By default, Amazon Kendra has six reserved fields (_category, _created_at, _file_type, _last_updated_at, _source_uri, and _view_count).
  2. Update the index with the UpdateIndex API call with the Amazon Kendra attribute names.
  3. Map each ServiceNow metadata attribute to each Amazon Kendra metadata attribute.

You can find tables with the ServiceNow metadata attributes and the suggested Amazon Kendra field names in the previous section.

For this post, I have the metadata attribute UserName associated with my ServiceNow article and I want to map it to the field MyCustomUsername on my index. The following code shows how to add the attribute MyCustomUsername to my Amazon Kendra index. After we create this custom field in our index, we map our field Username from ServiceNow to it. See the following code:

try:
    update_response = kendra.update_index(
        Id='3311b507-bfef-4e2b-bde9-7c297b1fd13b',
        RoleArn='arn:aws:iam::<MY-ACCOUNT-NUMBER>:role/service-role/AmazonKendra-us-east-1-KendraRole',
        DocumentMetadataConfigurationUpdates=[
            {
                'Name': <MY_CUSTOM_FIELD_NAME>,
                'Type': 'STRING_VALUE',
                'Search': {
                    'Facetable': True,
                    'Searchable': True,
                    'Displayable': True
                }
            }
        ]
    )
except ClientError as e:
    print('%s' % e)
pprint.pprint(update_response)

If everything goes well, we receive a 200 response:

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 12 Aug 2020 12:17:07 GMT',
                                      'x-amzn-requestid': '3eba66c9-972b-4757-8d92-37be17c8f8a2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3eba66c9-972b-4757-8d92-37be17c8f8a2',
                      'RetryAttempts': 0}}

We also need the secretsmanager:GetSecretValue permission for the secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store your ServiceNow credentials, make sure the role you use has permissions to create and tag secrets in Secrets Manager. The policy should look like the following code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerWritePolicy",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:UntagResource",
                "secretsmanager:CreateSecret",
                "secretsmanager:TagResource"
            ],
            "Resource": "*"
        }
    ]
}

The following code creates a secret in Secrets Manager:

secretsmanager = boto3.client('secretsmanager')

SecretName = <YOURSECRETNAME>
# The secret is stored as a JSON string with username and password keys
ServiceNowCredentials = '{"username": "<YOUR_SERVICENOW_SITE_USERNAME>", "password": "<YOUR_SERVICENOW_SITE_PASSWORD>"}'

try:
    create_secret_response = secretsmanager.create_secret(
        Name=SecretName,
        Description='Secret for a ServiceNow data source connector',
        SecretString=ServiceNowCredentials,
        Tags=[
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )
except ClientError as e:
    print('%s' % e)

pprint.pprint(create_secret_response)

If everything goes well, you get a response with your secret’s ARN:

{'ARN':<YOUR_SECRETS_ARN>,
 'Name': <YOUR_SECRET_NAME>,
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '161',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 14:44:13 GMT',
                                      'x-amzn-requestid': '68c9a153-c08e-42df-9e6d-8b82550bc412'},
                      'HTTPStatusCode': 200,
                      'RequestId': '68c9a153-c08e-42df-9e6d-8b82550bc412',
                      'RetryAttempts': 0},
 'VersionId': 'bee74dab-6beb-4723-a18b-4829d527aad8'}

Now that we have our Amazon Kendra index, our custom field, and our ServiceNow credentials, we can proceed with creating our data source.

To ingest documents from this data source, we need an IAM role with the kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM roles for ServiceNow data sources. We use the ARN for this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. It should look like the following code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kendra.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]

The following code is the policy structure we need:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:region:account ID:secret:secret ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:region:account ID:key/key ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kendra:BatchPutDocument",
                "kendra:BatchDeleteDocument"
            ],
            "Resource": [
                "arn:aws:kendra:<REGION>:<account_ID>:index/<index_ID>"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "kendra.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_NAME>/*"
            ]
        }
    ]
}

Finally, the following code is my role’s ARN:

arn:aws:iam::<MY_ACCOUNT_NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index, and read the secrets to connect to our ServiceNow site.

One detail we can specify when creating a data source is the sync schedule, which indicates how often our index syncs with the data source we create. This schedule is defined on the Schedule key of our request. You can use schedule expressions for rules to define how often you want to sync your data source. For this post, I use the ScheduleExpression 'cron(0 11 * * ? *)', which means that our data source is synced every day at 11:00 AM.

I use the following code. Make sure you match your SiteURL and SecretARN, as well as your IndexID. Additionally, FieldMappings is where you map between the ServiceNow attribute names with the Amazon Kendra index attribute names. I chose the same attribute name in both, but you can call the Amazon Kendra attribute whatever you’d like.

For more details, see create_data_source(**kwargs).

print('Create a data source')
 
SecretArn= "<YOUR_SERVICENOW_ONLINE_USER_AND_PASSWORD_SECRETS_ARN>"
SiteUrl = "<YOUR_SERVICENOW_SITE_URL>"
DSName= "<YOUR_NEW_DATASOURCE_NAME>"
IndexId= "<YOUR_INDEX_ID>"
DSRoleArn= "<YOUR_DATA_SOURCE_ROLE>"
ScheduleExpression='cron(0 11 * * ? *)'
try:
    datasource_response = kendra.create_data_source(
    Name=DSName,
    IndexId=IndexId,        
    Type='SERVICENOW',
    Configuration={
        'ServiceNowConfiguration': {
            'HostUrl': SiteUrl,
            'SecretArn': SecretArn,
            'ServiceNowBuildVersion': 'OTHERS',
            'KnowledgeArticleConfiguration': 
            {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'content',
                'DocumentTitleFieldName': 'short_description',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'kb_category_name',
                        'IndexFieldName': '_category'
                    },
                    {
                        'DataSourceFieldName': 'sys_created_by',
                        'IndexFieldName': 'MyCustomUsername'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
            'ServiceCatalogConfiguration': {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'description',
                'DocumentTitleFieldName': 'title',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'category_name',
                        'IndexFieldName': '_category'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
        },
    },
    Description='My ServiceNow Datasource',
    RoleArn=DSRoleArn,
    Schedule=ScheduleExpression,
    Tags=[
        {
            'Key': 'Project',
            'Value': 'ServiceNow Test'
        }
    ])
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get index description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=IndexId
        )
        # If status is not CREATING quit
        status = datasource_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)    

except  ClientError as e:
        pprint.pprint('%s' % e)
pprint.pprint(datasource_response)

If everything goes well, we should receive a 200 status response:

Create a DataSource
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}
Waiting for Kendra to create the DataSource.
    Creating index. Status: CREATING
    Creating index. Status: ACTIVE
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}
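
If you later want to change the sync schedule without recreating the data source, you can pass a new Schedule to the update_data_source API. The following is a minimal sketch; the placeholders and the every-6-hours cron expression are only examples.

import boto3

kendra = boto3.client("kendra")

# Replace the sync schedule of an existing data source without recreating it;
# the placeholders and the cron expression below are only examples
kendra.update_data_source(
    Id="<YOUR_DATA_SOURCE_ID>",
    IndexId="<YOUR_INDEX_ID>",
    Schedule="cron(0 */6 * * ? *)",
)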

Even though we have defined a schedule for syncing our data source, we can sync on demand by using the start_data_source_sync_job method:

DSId=<YOUR DATA SOURCE ID>
IndexId=<YOUR INDEX ID>
 
try:
    ds_sync_response = kendra.start_data_source_sync_job(
    Id=DSId,
    IndexId=IndexId
)
except  ClientError as e:
        print('%s' % e)  
        
pprint.pprint(ds_sync_response)

The response should look like the following code:

{'ExecutionId': '3d11e6ef-3b9e-4283-bf55-f29d0b10e610',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:52:36 GMT',
                                      'x-amzn-requestid': '55d94329-50af-4ad5-b41d-b173f20d8f27'},
                      'HTTPStatusCode': 200,
                      'RequestId': '55d94329-50af-4ad5-b41d-b173f20d8f27',
                      'RetryAttempts': 0}}
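
To track the sync we just started, we can poll the list_data_source_sync_jobs API until our run reaches a terminal status. The following sketch reuses the kendra client and the DSId, IndexId, and ds_sync_response variables from the preceding code; the 60-second interval is arbitrary.

import time

# Poll until the sync job we just started reaches a terminal status
execution_id = ds_sync_response["ExecutionId"]
while True:
    history = kendra.list_data_source_sync_jobs(Id=DSId, IndexId=IndexId)["History"]
    job = next((j for j in history if j["ExecutionId"] == execution_id), None)
    status = job["Status"] if job else "PENDING"  # PENDING: not yet listed in the history
    print("Sync status:", status)
    if status in ("SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"):
        break
    time.sleep(60)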

Testing

Finally, we can query our index. See the following code:

response = kendra.query(
    IndexId=<YOUR_INDEX_ID>,
    QueryText='Is there a service that has 11 9s of durability?')
if response['TotalNumberOfResults'] > 0:
    print(response['ResultItems'][0]['DocumentExcerpt']['Text'])
    print("More information: "+response['ResultItems'][0]['DocumentURI'])
else:
    print('No results found, please try a different search term.')
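
Because we mapped sys_created_by to the custom field MyCustomUsername, we can also narrow a query to documents created by a specific user with an AttributeFilter. The following sketch reuses the kendra client from the earlier code; the query text and username are hypothetical.

# Narrow the query to documents created by a specific user, using the custom
# attribute mapped from ServiceNow's sys_created_by field; the query text and
# username below are hypothetical
response = kendra.query(
    IndexId="<YOUR_INDEX_ID>",
    QueryText="How do I request a new laptop?",
    AttributeFilter={
        "EqualsTo": {
            "Key": "MyCustomUsername",
            "Value": {"StringValue": "System Administrator"},
        }
    },
)
for item in response["ResultItems"]:
    print(item["DocumentTitle"]["Text"], "-", item["DocumentURI"])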

Common errors

In this section, we discuss errors that may occur, whether using the Amazon Kendra console or the Amazon Kendra API.

You should look at CloudWatch logs and error messages returned in the Amazon Kendra console or via the Amazon Kendra API. The CloudWatch logs help you determine the reason for a particular error, whether you experience it using the console or programmatically.

Common errors when trying to access ServiceNow as a data source are:

  • Insufficient permissions
  • Invalid credentials
  • Secrets Manager error

Insufficient permissions

A common scenario you may come across is when you have the right credentials but your user doesn’t have enough permissions for the Amazon Kendra ServiceNow connector to crawl your knowledge base and service catalog items.

You receive the following error message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Sep 12, 2020, 1:08 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

If you can log in to your ServiceNow instance, make sure that the user you designated for the connector has the admin role.

  1. On your ServiceNow instance, under User Administration, choose Users.

  1. On the users list, choose the user ID of the user you want to use for the connector.

  1. On the Roles tab, verify that your user has the admin role.

  1. If you don’t have that role attached to your user, choose Edit to add it.

  1. On the Amazon Kendra console, on your connector configuration page, choose Sync now.

Invalid credentials

You may encounter an error with the following message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Jul 28, 2020, 3:59 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

To investigate, complete the following steps:

  1. Choose the error message to review the CloudWatch logs.

You’re redirected to CloudWatch Logs Insights.

  1. Choose Run Query to start analyzing the logs.

We can verify our credentials by going to Secrets Manager and reviewing our credentials stored in the secret.

  1. Choose your secret name.

  1. Choose Retrieve secret value.

  1. If your password doesn’t match, choose Edit, update the username or password, and choose Save.

  1. Go back to your data source in Amazon Kendra, and choose Sync now.

Secrets Manager error

You may encounter an error stating that the customer’s secret can’t be fetched. This may happen if you use an existing secret and the IAM role used for syncing your ServiceNow data source doesn’t have permissions to access the secret.

To address this issue, first we need our secret’s ARN.

  1. On the Secrets Manager console, choose your secret’s name (for this post, AmazonKendra-ServiceNow-demosite).
  2. Copy the secret’s ARN.
  3. On the IAM console, search for the role we use to sync our ServiceNow data source (for this post, AmazonKendra-servicenow-role).
  4. For Permissions, choose Add inline policy.
  5. Following the least privilege principle, for Service, choose Secrets Manager.
  6. For Access Level, choose Read and GetSecretValue.
  7. For Resources, enter our secret’s ARN.

Your settings should look similar to the following screenshot.

  1. Enter a name for your policy.
  2. Choose Create Policy.

After your policy has been created and attached to your data source role, try to sync again.
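
If you prefer to attach this inline policy programmatically instead of through the console, one option is the IAM put_role_policy API. In the following sketch, the policy name is hypothetical, and the role name and secret ARN placeholder mirror the examples in this post.

import json

import boto3

iam = boto3.client("iam")

# Attach an inline policy that only allows reading the ServiceNow secret;
# the policy name is hypothetical, the role name and secret ARN placeholder
# mirror the examples in this post
secret_arn = "<YOUR_SECRETS_ARN>"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [secret_arn],
        }
    ],
}

iam.put_role_policy(
    RoleName="AmazonKendra-servicenow-role",
    PolicyName="KendraServiceNowSecretAccess",
    PolicyDocument=json.dumps(policy),
)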

Conclusion

You have now learned how to ingest the documents from your ServiceNow site into your Amazon Kendra index. We hope this post helps you take advantage of the intelligent search capabilities in Amazon Kendra to find accurate answers from your enterprise content.

For more information about Amazon Kendra, see AWS re:Invent 2019 – Keynote with Andy Jassy on YouTube, Amazon Kendra FAQs, and What is Amazon Kendra?

 


About the Authors

David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.

 

 

 

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Amazon Augmented AI is now a HIPAA eligible service

Amazon Augmented AI (Amazon A2I) is now a HIPAA eligible service. Amazon A2I makes it easy to build the workflows required for human review of machine learning (ML) predictions. HIPAA eligibility applies to AWS Regions where the service is available and means you can use Amazon A2I to add human review of protected health information (PHI) to power your healthcare workflows through your private workforce.

If you have a HIPAA Business Associate Addendum (BAA) in place with AWS, you can now start using Amazon A2I for your HIPAA eligible workloads. If you don’t have a BAA in place with AWS, or if you have any other questions about running HIPAA-regulated workloads on AWS, please contact us. For information and best practices about configuring AWS HIPAA eligible services to store, process, and transmit PHI, see the Architecting for HIPAA Security and Compliance on Amazon Web Services whitepaper.

Amazon A2I makes it easy to build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems and managing large numbers of human reviewers. Many healthcare customers like Change Healthcare, EPL Innovative Solutions, and partners like TensorIoT are already exploring new ways to use the power of ML to automate their current workloads and transform how they provide care to patients and use AWS services to help them meet their compliance needs under HIPAA. 

Change Healthcare

Change Healthcare is a leading independent healthcare technology company that provides data and analytics-driven solutions to improve clinical, financial, and patient engagement outcomes in the US healthcare system.

“At Change Healthcare, we help accelerate healthcare’s transformation by innovating to remove inefficiencies, reduce costs, and improve outcomes. We have a robust set of integrated artificial intelligence engines that bring new insights, impact, and innovation to the industry. Critical to our results is enabling human-in-the-loop to understand our data and automate workflows. Amazon A2I makes it easy to build the workflows required for human review of ML predictions. With Amazon A2I becoming HIPAA eligible, we are able to involve the human in the workflow and decision-making process, helping to increase efficiency with the millions of documents we process to create even more value for patients, payers, and providers.”

—Luyuan Fang, Chief AI Officer, Change Healthcare

TensorIoT

TensorIoT was founded on the instinct that the majority of compute is moving to the edge and all things are becoming smarter. TensorIoT is creating solutions to simplify the way enterprises incorporate Intelligent Edge Computing devices, AI, and their data into their day-to-day operations.

“TensorIoT has been working with customers to build document ingestion pipelines since Amazon Textract was in preview. Amazon A2I helps us easily add human-in-the-loop for document workflows, and one of the most frequently requested features from our healthcare customers is the ability to handle protected health information. Now with Amazon A2I being added to HIPAA eligible services, our healthcare customers will also be able to significantly increase the ingestion speed and accuracy of documents to provide insights and drive better outcomes for their doctors and patients.”

—Charles Burden, Head of Business Development, TensorIoT, AWS Advanced Consulting Partner

EPL Innovative Solutions

EPL Innovative Solutions, charged with the mission of “Changing Healthcare,” is an orchestrated effort, based on decades of experience in various healthcare theaters across the nation, to assess systems, identify shortcomings, plan and implement strategies, and lead the process of change for the betterment of the patient, provider, organization, or community experience.

“At EPL Innovative Solutions, we are excited to add Amazon A2I to our revolutionary cloud-based platform to serve healthcare clients that rely on us for HIPAA secure, accurate, and efficient medical coding and auditing services. We needed industry experts to optimize our platform, so we reached out to Belle Fleur Technologies as our AWS Partner for the execution of this solution to allow 100% human-in-the-loop review to meet our industry-leading standards of speed and accuracy.”

—Amanda Donoho, COO, EPL Innovative Solutions

Summary

Amazon A2I helps you automate your human review workloads and is now a HIPAA eligible service. For video presentations, sample Jupyter notebooks, or more information about use cases like document processing, object detection, sentiment analysis, text translation, and others, see Amazon Augmented AI Resources.

 


About the Author

Anuj Gupta is the Product Manager for Amazon Augmented AI. He is focusing on delivering products that make it easier for customers to adopt machine learning. In his spare time, he enjoys road trips and watching Formula 1.

How DeepMap optimizes their video inference workflow with Amazon SageMaker Processing

Although we might think the world is already sufficiently mapped by the advent of global satellite images and street views, it’s far from complete because much of the world is still uncharted territory. Maps are designed for humans, and can’t be consumed by autonomous vehicles, which need a very different technology of maps with much higher precision.

DeepMap, a Palo Alto startup, is the leading technology provider of HD mapping and localization services for autonomous vehicles. These two services are integrated to provide high-precision localization maps, down to centimeter-level precision. This demands processing a high volume of data to maintain precision and localization accuracy. In addition, road conditions can change minute to minute, so the maps guiding self-driving cars have to update in real time. DeepMap has accumulated years of experience in mapping server development and uses the latest big data, machine learning (ML), and AI technology to build out their video inferencing and mapping pipeline.

In this post, we describe how DeepMap is revamping their video inference workflow by using Amazon SageMaker Processing, a customizable data processing and model evaluation feature, to streamline their workload by reducing complexity, processing time, and cost. We start out by describing the current challenges DeepMap is facing. Then we go over the proof of concept (POC) implementation and production architecture for the new solution using Amazon SageMaker Processing. Finally, we conclude with the performance improvements they achieved with this solution.

Challenges

DeepMap’s current video inference pipeline needs to process large amounts of video data collected by their cars, which are equipped with cameras and LIDAR laser scanning devices and drive on streets and freeways to collect video and image data. It’s a complicated, multi-step, batch processing workflow. The following diagram shows the high-level architecture view of the video processing and inferencing workflow.


With their previous workflow architecture, DeepMap discovered a scalability issue that increased processing time and cost due to the following reasons:

  • Multiple steps and separate batch processing stages, which could be error-prone and interrupt the workflow
  • Additional storage required in Amazon Simple Storage Service (Amazon S3) for intermediate steps
  • Sequential processing steps that prolonged the total time to complete inference

DeepMap’s infrastructure team recognized the issue and approached the AWS account team for guidance on how to better optimize their workflow using AWS services. The problem was originally presented as a workflow orchestration issue, and a few AWS services for workflow orchestration were proposed and discussed.

However, after a further deep dive into their objectives and requirements, they determined that these services addressed the multiple steps coordination issue but not the storage and performance optimization objectives. Also, DeepMap wanted to keep the solution in the realm of the Amazon SageMaker ML ecosystem, if possible. After a debrief and further engagement with the Amazon SageMaker product team, a recently released Amazon SageMaker feature—Amazon SageMaker Processing—was proposed as a viable and potentially best fit solution for the problem.

Amazon SageMaker Processing comes to the rescue

Amazon SageMaker Processing lets you easily run the preprocessing, postprocessing, and model evaluation workloads on a fully managed infrastructure. Besides the full set of additional data and processing capabilities, Amazon SageMaker Processing is particularly attractive and promising for DeepMap’s problem at hand because of its flexibility in the following areas:

  • Setting up your customized container environment, also known as bring your own container (BYOC)
  • Custom coding to integrate with other application APIs that reside in your VPCs

These were the key functionalities DeepMap was looking for to redesign and optimize their current inference workflow. They quickly agreed on a proposal to explore Amazon SageMaker Processing and move forward as a proof of concept (POC).

Solution POC

The following diagram shows the POC architecture, which illustrates how a container in Amazon SageMaker Processing can make real-time API calls to a private VPC endpoint. The full architecture of the new video inference workload is depicted in the next section.

 

The POC demonstration includes the following implementation details:

  • Sample source data – Two video files (from car view). The following images show examples of Paris and Beijing streets.


  • Data stores – Two S3 buckets:
    • Source video bucket – s3://sourcebucket/input
    • Target inference result bucket – s3://targetbucket/output
  • Custom container – An AWS pre-built deep learning container based on MXNET with other needed packages and the pretrained model.
  • Model – A pre-trained semantic segmentation GluonCV model from the GluonCV model zoo. GluonCV provides implementations of state-of-the-art deep learning algorithms in computer vision. It aims to help engineers, researchers, and students quickly prototype products, validate new ideas, and learn computer vision. The GluonCV model zoo contains six kinds of pretrained models: classification, object detection, segmentation, pose estimation, action recognition, and depth prediction. For this post, we use deeplab_resnet101_citys, which was trained with the Cityscapes dataset and focuses on semantic understanding of urban street scenes, so this model is suitable for car view images (see the inference sketch after this list). The following images are a sample of segmentation inference; we can see the model assigned red for people and blue for cars.


  • Amazon SageMaker Processing environment – Two instances (p3.2xlarge) configured for private access to the VPC API endpoint.
  • Mock API server – A web server in a private VPC mimicking DeepMap’s video indexing APIs. When invoked, it responds with a “Hello, Builders!” message.
  • Custom processing script – An API call to the mock API endpoint in the private VPC to extract frames from the videos, perform segmentation model inference on the frames, and store the results.
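
The following is a minimal sketch of how the pretrained model referenced above can be loaded from the GluonCV model zoo and applied to a single frame; frame.jpg is a hypothetical extracted frame, and the actual processing script wraps this kind of logic with the mock API call and the S3 output handling described in the list.

import mxnet as mx
from gluoncv import model_zoo
from gluoncv.data.transforms.presets.segmentation import test_transform

ctx = mx.cpu()  # a GPU context such as mx.gpu(0) would be used on the p3.2xlarge instances

# Load the pretrained Cityscapes segmentation model from the GluonCV model zoo
model = model_zoo.get_model("deeplab_resnet101_citys", pretrained=True, ctx=ctx)

# Read one extracted frame, normalize it, and run segmentation on it
img = mx.image.imread("frame.jpg")  # hypothetical frame extracted from a sample video
img = test_transform(img, ctx)
output = model.predict(img)

# Per-pixel class IDs (for example, person or car) for the frame
predictions = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()
print(predictions.shape)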

Amazon SageMaker Processing launches the instances you specified, downloads the container image and datasets, runs your script, and uploads the results to the S3 bucket automatically. We use the Amazon SageMaker Python SDK to launch the processing job. See the following code:

from sagemaker.network import NetworkConfig
from sagemaker.processing import (ProcessingInput, ProcessingOutput,
                                  ScriptProcessor)

instance_count = 2
"""
This network_config is for Enable VPC mode, which means the processing instance could access resources within vpc
change to your security_group_id and subnets_id
security_group_ids = ['YOUR_SECURITY_GROUP_ID']
subnets = ["YOUR_SUBNETS_ID1","YOUR_SUBNETS_ID2"]
"""
security_group_ids = vpc_security_group_ids
subnets = vpc_subnets

network_config = NetworkConfig(enable_network_isolation=False, 
                               security_group_ids=security_group_ids, 
                               subnets=subnets)

video_formats = [".mp4", ".avi"]
image_width = 1280
image_height = 720
frame_time_interval = 1000

script_processor = ScriptProcessor(command=['python3'],
                image_uri=processing_repository_uri,
                role=role,
                instance_count=instance_count,
                instance_type='ml.p3.2xlarge',
                network_config=network_config)

# with S3 shard key
script_processor.run(code='processing.py',
                      inputs=[ProcessingInput(
                        source=input_data,
                        destination='/opt/ml/processing/input_data',
                        s3_data_distribution_type='ShardedByS3Key')],
                      outputs=[ProcessingOutput(destination=output_data,
                                                source='/opt/ml/processing/output_data',
                                                s3_upload_mode = 'Continuous')],
                      arguments=['--api_server_address', vpc_api_server_address,
                                '--video_formats', "".join(video_formats),
                                '--image_width', str(image_width),
                                '--image_height', str(image_height),
                                '--frame_time_interval', str(frame_time_interval)]
                     )
script_processor_job_description = script_processor.jobs[-1].describe()
print(script_processor_job_description)

We use ShardedByS3Key for the s3_data_distribution_type so that Amazon SageMaker shards the input objects by Amazon S3 key and each instance receives roughly 1/N of the total objects for faster parallel processing. For example, with 100 video clips under the input prefix and two instances, each instance downloads roughly 50 clips. Because this video inference job is just one part of DeepMap's entire map processing workflow, s3_upload_mode is set to Continuous so that results are uploaded as they are produced and subsequent processing tasks can start sooner. For the complete POC sample code, see the GitHub repo.
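
For orientation, the following is a minimal sketch of how a processing script can pick up these arguments and the container-local paths. The actual processing.py is in the GitHub repo, so treat the argument handling here (including splitting the concatenated video format string) as an assumption.

import argparse
import os

# Container-local paths that ProcessingInput/ProcessingOutput map to Amazon S3
INPUT_DIR = '/opt/ml/processing/input_data'
OUTPUT_DIR = '/opt/ml/processing/output_data'

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--api_server_address', type=str, required=True)
    parser.add_argument('--video_formats', type=str, default='.mp4.avi')
    parser.add_argument('--image_width', type=int, default=1280)
    parser.add_argument('--image_height', type=int, default=720)
    parser.add_argument('--frame_time_interval', type=int, default=1000)
    args = parser.parse_args()

    # The launcher concatenates the extensions (".mp4.avi"), so split them back apart.
    formats = tuple('.' + ext for ext in args.video_formats.split('.') if ext)

    # Each instance only sees its own shard of the input objects (ShardedByS3Key).
    videos = [os.path.join(INPUT_DIR, f)
              for f in sorted(os.listdir(INPUT_DIR)) if f.lower().endswith(formats)]

    os.makedirs(OUTPUT_DIR, exist_ok=True)
    for video_path in videos:
        # Extract frames (calling the VPC API at args.api_server_address), run
        # segmentation, and write results under OUTPUT_DIR (uploaded continuously to S3).
        ...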

The POC was successfully completed, and the positive results demonstrated the capability and flexibility of Amazon SageMaker Processing. It met the following requirements for DeepMap:

  • Dynamically invoke an API in a private VPC endpoint for real-time custom processing needs
  • Reduce the unnecessary intermediate storage for video frames

Production solution architecture

With the positive results from the demonstration of the POC, DeepMap’s team decided to re-architect their current video process and inference workflow by using Amazon SageMaker Processing. The following diagram depicts the high-level architecture of the new workflow.


The DeepMap team initiated a project to implement this new architecture. The initial production setup is as follows:

  • Data source – Camera streams (30 fps) collected from the cars are split into 1-second H.264-encoded video clips. All video clips are stored in the source S3 buckets.
  • Video processing – Within a video clip (30 frames in total), only a fraction of the frames (the key frames) are useful for map making. The relevant key frame information is stored in DeepMap's video metadata database. The video processing code runs in an Amazon SageMaker Processing container and calls a video indexing API via a VPC private endpoint to retrieve the relevant key frames for inference.
  • Deep learning inference – The deep learning inference code queries the key frame information from the database, decodes the key frames in memory, applies the semantic segmentation model, and stores the output in the S3 result bucket. The inference code also runs within the Amazon SageMaker Processing custom containers (a minimal sketch of this decode-and-infer step follows the testing example below).
  • Testing example – We use a video clip of a collected road scene in .h264 format (000462.h264). Key frame metadata about the clip is stored in the database. The following is an excerpt of the key frame metadata dumped from the database:
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4686
      }
      camera_video_data {
        stream_index: 13862
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4687
      }
      camera_video_data {
        stream_index: 13864
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }

The video index API call returns a relevant key frame for the subsequent inference task (shown in the following image).

Deep learning inference is then performed using the semantic segmentation model running in Amazon SageMaker Processing to determine the proper lane lines in the key frame. Using the preceding image as input, we receive the following output.
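
The following is a minimal sketch of this in-memory decode-and-infer step, assuming a hypothetical video indexing endpoint and PyAV for H.264 decoding. DeepMap's production code differs; the endpoint path, response fields, and helper names here are illustrative only.

import io
import av                     # PyAV, for in-memory H.264 decoding
import boto3
import mxnet as mx
import requests
from gluoncv import model_zoo
from gluoncv.data.transforms.presets.segmentation import test_transform

s3 = boto3.client('s3')
ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()
model = model_zoo.get_model('deeplab_resnet101_citys', pretrained=True, ctx=ctx)

def segment_key_frames(api_server_address, bucket, key):
    """Decode the key frames of one 1-second clip in memory and run segmentation."""
    # Hypothetical video indexing call: which frames of this clip are key frames.
    resp = requests.get(f'http://{api_server_address}/keyframes',
                        params={'video_path': f's3://{bucket}/{key}'})
    key_frame_indexes = set(resp.json()['key_frame_indexes'])  # assumed response shape

    # Stream the H.264 clip from S3 and decode it without writing frames to disk.
    clip_bytes = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    container = av.open(io.BytesIO(clip_bytes), format='h264')

    results = []
    for i, frame in enumerate(container.decode(video=0)):
        if i not in key_frame_indexes:
            continue
        rgb = frame.to_ndarray(format='rgb24')                   # decoded frame, in memory
        img = test_transform(mx.nd.array(rgb, dtype='uint8'), ctx)  # GluonCV preprocessing
        output = model.predict(img)
        results.append(mx.nd.argmax(output, 1).asnumpy().squeeze())
    return results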

Performance improvements

As of this writing, DeepMap has already seen the expected performance improvements using the newly optimized workflow and has been able to achieve the following:

  • Streamline and reduce the complexity of the video-to-image preprocessing workflow. The real-time video indexing API call reduces two steps to one.
  • Reduce the total time for video preprocessing and image deep learning inference. With the streamlined process, they can now run decoding and deep learning inference on different processors (CPU and GPU) in different threads, potentially saving 100% of the video preprocessing time (as long as inference takes longer than video decoding, which is true in most cases).
  • Reduce the intermediate storage needed to hold images for the inference job. Each camera frame (1920×1200, JPEG-encoded) takes about 500 KB to store, whereas a 1-second video clip (x264-encoded) containing 30 continuous frames takes less than 2 MB (thanks to the video encoding). The storage reduction is therefore about (1 – 2 MB / (500 KB × 30)) ≈ 85%.

The following table summarizes the overall improvements of the new optimized workflow.

Measurements | Before | After | Performance Improvements
Processing steps | Two steps | One step | 50% simpler workflow
Processing time | Video preprocessing to extract key frames | Parallel processing (with multiple threads in Amazon SageMaker Processing containers) | 100% reduction of video preprocessing time
Storage | Intermediate S3 buckets for preprocessed video frames | None (in-memory) | 85% reduction
Compute | Separate compute resources for video preprocessing using Amazon Elastic Compute Cloud (Amazon EC2) | None (running in the Amazon SageMaker Processing container) | 100% reduction of video preprocessing compute resources

Conclusion

In this post, we described how DeepMap used the new Amazon SageMaker Processing capability to redesign their video inference workflow and achieve a more streamlined process. Not only did they save on storage costs, but they also reduced their total processing time.

Their successful use case also demonstrates the flexibility and scalability of Amazon SageMaker Processing, which can help you build more scalable ML processing and inferencing workloads. For more information about integrating Amazon SageMaker Processing, see Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation. For more information about using services such as Step Functions to build more efficient ML workflows, see Building machine learning workflows with Amazon SageMaker Processing jobs and AWS Step Functions.

Try out Amazon SageMaker Processing today to further optimize your ML workloads.

About the Authors

Frank Wang is a Startup Senior Solutions Architect at AWS. He has worked with a wide range of customers with focus on Independent Software Vendors (ISVs) and now startups. He has several years of engineering leadership, software architecture, and IT enterprise architecture experiences, and now focuses on helping customers through their cloud journey on AWS.

Shishuai Wang is an ML Specialist Solutions Architect working with the AWS WWSO team. He works with AWS customers to help them adopt machine learning on a large scale. He enjoys watching movies and traveling around the world.

Yu Zhang is a staff software engineer and the technical lead of the Deep Learning platform at DeepMap. His research interests include Large-Scale Image Concept Learning, Image Retrieval, and Geo and Climate Informatics.

Tom Wang is a founding team member and Director of Engineering at DeepMap, managing their cloud infrastructure, backend services, and map pipelines. Tom has 20+ years of industry experience in database storage systems, distributed big data processing, and map data infrastructure. Prior to DeepMap, Tom was a tech lead at Apple Maps and key contributor to Apple’s map data infrastructure and map data validation platform. Tom holds an MS degree in computer science from the University of Wisconsin-Madison.
