Customizing your machine translation using Amazon Translate Active Custom Translation

When translating the English phrase “How are you?” to Spanish, would you prefer “¿Cómo estás?” or “¿Cómo está usted?”?

Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Today, we’re excited to introduce Active Custom Translation (ACT), a feature that gives you more control over your machine translation output. You can now influence whether the output you get is “¿Cómo estás?” or “¿Cómo está usted?”. To use ACT, simply provide your translation examples in TMX, TSV, or CSV format to create parallel data (PD), and Amazon Translate uses your PD along with your batch translation job to customize the translation output at runtime. If your PD shows “How are you?” translated as “¿Cómo está usted?”, ACT knows to customize the translation to “¿Cómo está usted?”.

Today, professional translators use examples of previous translations to provide more customized translations for customers. Similar to professional translators, Amazon Translate can now provide customized translations by learning from your translation examples.

Traditionally, this customization was done by creating a custom translation model: a special-purpose translation engine built using customer data. Building custom translation models is complex, tedious, and expensive. It requires special expertise to prepare the data for training, testing, and validation, and then you build, deploy, and maintain the model by updating it frequently. To save on model training and management costs, you may choose to delay updating your custom translation model, which means your models are always stale, negatively affecting your custom translation experience. In spite of all this work, these custom models perform well only when the translation job is within the domain of your data; they tend to perform worse than a generic model when the translation job is outside the domain of your customization data.

Amazon Translate ACT introduces an innovative way of providing customized translation output on the fly with your parallel data, without building a custom translation model. ACT output quality is always up to date with your PD. ACT provides the best translations for jobs both within the domain and outside the domain of PD. For example, if a source sentence isn’t in the domain of the PD, the translation output is still as good as the generic translation with no significant deterioration in translation quality. You no longer need to go through the tedious process of building and retraining custom translation models for each incoming use case. Just update the PD, and the ACT output automatically adapts to the most recent PD, without needing any retraining.

“Innovation is in our DNA. Our customers look to AWS to lead in customization of machine translation. Current custom translation technology is inefficient, cumbersome, and expensive,” says Marcello Federico, Principal Applied Scientist at Amazon Machine Learning, AWS. “Active Custom Translation allows our customers to focus on the value of their latest data and forget about the lifecycle management of custom translation models. We innovated on behalf of the customer to make custom machine translation easy.”

Don’t just take our word for it

Custom.MT implements machine translation for localization groups and translation companies. Konstantin Dranch, Custom.MT co-founder, shares, “Amazon Translate’s ACT is a breakthrough machine translation setup. A manual engine retraining takes 15–16 work hours, that’s why most language teams in the industry update their engines only once a month or once a quarter. With ACT, retraining is continuous and engines improve every day based on edits by human translators. Even before the feature was released to the market, we saw tremendous interest from leading software localization teams. With a higher quality of machine translation, enterprise teams can save millions of USD in manual translations and improve other KPIs, such as international user engagement and time to market.”

Welocalize is a leading global localization and translation company. Senior Manager of AI Deployments at Welocalize Alex Yanishevsky says, “Welocalize produces high-quality translations, so our customers can transform their content and data to grow globally and expand into international markets. Active Custom Translation from Amazon Translate allows us to customize our translations at runtime and provides us with significant flexibility in our production cycles. In addition, we see great business value and engine quality improvement since we can retrain engines frequently without incurring additional hosting or training charges.”

One Hour Translation is a leading professional language services provider. Yair Tal, CEO of One Hour Translation, says, “The customer demand for customized Neural Machine Translation (NMT) is growing every month because of the cost savings. As one of the first to try Amazon Translate ACT, we have found that ACT provides the best translation output for many language pairs. With ACT, training and maintenance is simple and the Translate API integrates with our system seamlessly. Translate’s pay-as-you-translate pricing helps our clients, both big and small, get translation output that is tailored for their needs without paying to train custom models.”

Building an Active Custom Translation job

Active Custom Translation’s capabilities are built right into the Amazon Translate experience. In this post, we walk you through the step-by-step process of using your data and getting a customized machine translated output securely. ACT is now available on batch translation, so first familiarize yourself with how to create a batch translation job.

You need data to customize your translation for terms or phrases that are unique to a specific domain, such as life sciences, law, or finance. You bring examples of high-quality translations (source sentences and translated target sentences) in your preferred domain as a UTF-8 encoded file in TMX, TSV, or CSV format. You use this data to create a PD, and Amazon Translate uses the PD to customize your machine translation. Each PD can be up to 1 GB in size. You can upload up to 1,000 PD files per account per Region; this 1,000 parallel data limit can be increased upon request. You get free storage for parallel data for up to 200 GB. You pay the local Amazon Simple Storage Service (Amazon S3) rate for excess data stored.

For our use case, I have my data in TSV format, and the name of my file is Mydata.tsv. I first upload this file to an S3 location (for this post, I store my data in s3://input-s3bucket/Paralleldata/).

The following table summarizes the contents of the file.

en | es
Amazon Translate is a neural machine translation service. | Amazon Translate es un servicio de traducción automática basado en redes neuronales.
Neural machine translation is a form of language translation automation that uses deep learning models. | La traducción automática neuronal es una forma de automatizar la traducción de lenguajes utilizando modelos de aprendizaje profundo.
How are you? | ¿Cómo está usted?
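
If you prefer to generate this file programmatically, the following minimal Python sketch writes the same three sentence pairs as a UTF-8 encoded TSV file named Mydata.tsv:

import csv

# Example sentence pairs from the table above (source English, target Spanish).
pairs = [
    ("Amazon Translate is a neural machine translation service.",
     "Amazon Translate es un servicio de traducción automática basado en redes neuronales."),
    ("Neural machine translation is a form of language translation automation that uses deep learning models.",
     "La traducción automática neuronal es una forma de automatizar la traducción de lenguajes utilizando modelos de aprendizaje profundo."),
    ("How are you?", "¿Cómo está usted?"),
]

# Write a UTF-8 encoded TSV with a header row of language codes.
with open("Mydata.tsv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["en", "es"])
    writer.writerows(pairs)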

We run this example in the US West (Oregon) Region, us-west-2.

CreateParallelData

Calling the CreateParallelData API creates a PD resource record in our database and asynchronously starts a workflow for processing the PD file and ingesting it into our service.

CLI

The following CLI commands are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

Run the following CLI command:

aws translate create-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${S3_URI},Format=${FORMAT} \
--region ${REGION}

I use Mydata.tsv to create my PD my-parallel-data-1:

aws translate create-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "CREATING"
}

This means that your PD is being created now.

Run aws translate create-parallel-data help for more information.

Console

To use the Amazon Translate console, complete the following steps:

  1. On the Amazon Translate console, under Customization, choose Parallel data.
  2. Choose Create parallel data.

  3. For Name, insert my-parallel-data-1.
  4. For Parallel data location in S3, enter your S3 location (for this post, s3://input-s3bucket/Paralleldata/Mydata.tsv).
  5. For File format, you can choose CSV, TSV, or TMX. For this post, we choose Tab-separated values (.tsv).

Your data is always secure with Amazon Translate. It’s encrypted using an AWS owned encryption key by default. You can encrypt it using a key from your current account or use a key from a different account.

  6. For this post, for Encryption key, we select Use AWS owned key.
  7. Choose Create parallel data.
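
Besides the CLI and the console, you can also create the parallel data with the AWS SDK for Python (Boto3). The following is a minimal sketch that mirrors the example above and polls until the resource leaves the CREATING state; the Region, bucket, file, and resource names are the ones used in this post:

import time
import boto3

translate = boto3.client("translate", region_name="us-west-2")

response = translate.create_parallel_data(
    Name="my-parallel-data-1",
    ParallelDataConfig={
        "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
        "Format": "TSV",
    },
)
print(response["Status"])  # CREATING

# Poll until the parallel data leaves the CREATING state.
while True:
    properties = translate.get_parallel_data(Name="my-parallel-data-1")["ParallelDataProperties"]
    print("Status:", properties["Status"])
    if properties["Status"] != "CREATING":
        break
    time.sleep(30)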

ListParallelData

Calling the ListParallelData API returns a list of the parallel data that exists, along with its details (it doesn’t include a pre-signed Amazon S3 URL for downloading the data).

CLI

Run the following CLI command:

aws translate list-parallel-data \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataPropertiesList": [
        {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
                "Format": "TSV"
            },
            "ImportedDataSize": 532,
            "ImportedRecordCount": 3,
            "FailedRecordCount": 0,
            "CreatedAt": 1234567890.406,
            "LastUpdatedAt": 1234567890.675
        }
    ]
}

The "Status": "ACTIVE" means your PD is ready for you to use.

Run aws translate list-parallel-data help for more information.

Console

The following screenshot shows the result of list-parallel-data on the Amazon Translate console.

GetParallelData

Calling the GetParallelData API returns details of the named parallel data and a pre-signed Amazon S3 URL for downloading the data.

CLI

Run the following CLI command:

aws translate get-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--region ${REGION}

For example, my code looks like the following:

aws translate get-parallel-data \
--name my-parallel-data-1 \
--region us-west-2

You get a response like the following code:

{
    "ParallelDataProperties": {
        "Name": "my-parallel-data-1",
        "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
        "Status": "ACTIVE",
        "SourceLanguageCode": "en",
        "TargetLanguageCodes": [
            "es"
        ],
        "ParallelDataConfig": {
            "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata.tsv",
            "Format": "TSV"
        },
        "ImportedDataSize": 532,
        "ImportedRecordCount": 3,
        "FailedRecordCount": 0,
        "CreatedAt": 1234567890.406,
        "LastUpdatedAt": 1234567890.675
    },
    "DataLocation": {
        "RepositoryType": "S3",
        "Location": "xxx"
    }
}

“Location” contains the pre-signed Amazon S3 URL for downloading the data.

Run aws translate get-parallel-data help for more information.

Console

On the Amazon Translate console, choose one of the PD files on the Parallel data page.

You’re directed to another page that includes the detail for this parallel data file. The following screenshot shows the details for get-parallel-data.
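
If you’re scripting this step, the equivalent Boto3 call is get_parallel_data. The following is a minimal sketch that also downloads the data through the pre-signed URL returned in DataLocation; the resource name and Region are the ones used in this post:

import urllib.request
import boto3

translate = boto3.client("translate", region_name="us-west-2")

details = translate.get_parallel_data(Name="my-parallel-data-1")
print(details["ParallelDataProperties"]["Status"])

# DataLocation.Location is a pre-signed Amazon S3 URL for downloading the data.
presigned_url = details["DataLocation"]["Location"]
with urllib.request.urlopen(presigned_url) as response:
    print(response.read().decode("utf-8"))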

UpdateParallelData

Calling the UpdateParallelData API replaces the old parallel data with the new one.

CLI

Run the following CLI command:

aws translate update-parallel-data \
--name ${PARALLEL_DATA_NAME} \
--parallel-data-config S3Uri=${NEW_S3_URI},Format=${FORMAT} \
--region ${REGION}

For this post, Mydata1.tsv is my new parallel data. My code looks like the following:

aws translate update-parallel-data \
--name my-parallel-data-1 \
--parallel-data-config S3Uri=s3://input-s3bucket/Paralleldata/Mydata1.tsv,Format=TSV \
--region us-west-2

You get a response like the following code:

{
    "Name": "my-parallel-data-1",
    "Status": "ACTIVE",
    "LatestUpdateAttemptStatus": "UPDATING",
    "LatestUpdateAttemptAt": 1234567890.844
}

The "LatestUpdateAttemptStatus": "UPDATING" means your parallel data is being updated now.

Wait a few minutes and run get-parallel-data again. You can see that the parallel data has been updated, as in the following code:

{
    "ParallelDataProperties": {
            "Name": "my-parallel-data-1",
            "Arn": "arn:aws:translate:us-west-2:123456789012:parallel-data/my-parallel-data-1",
            "Status": "ACTIVE",
            "SourceLanguageCode": "en",
            "TargetLanguageCodes": [
                "es"
            ],
            "ParallelDataConfig": {
                "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata1.tsv",
                "Format": "TSV"
            },
        ...
    }
}

We can see that the parallel data has been updated from Mydata.tsv to Mydata1.tsv.

Run aws translate update-parallel-data help for more information.

Console

On the Amazon Translate console, choose the parallel data file and choose Update.

You can replace the existing parallel data file with the new one by specifying the new Amazon S3 URL.
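
With Boto3, the same update looks like the following minimal sketch, which polls LatestUpdateAttemptStatus until the update finishes; the file and resource names match the CLI example:

import time
import boto3

translate = boto3.client("translate", region_name="us-west-2")

response = translate.update_parallel_data(
    Name="my-parallel-data-1",
    ParallelDataConfig={
        "S3Uri": "s3://input-s3bucket/Paralleldata/Mydata1.tsv",
        "Format": "TSV",
    },
)
print(response["LatestUpdateAttemptStatus"])  # UPDATING

# Poll until the latest update attempt finishes.
while True:
    properties = translate.get_parallel_data(Name="my-parallel-data-1")["ParallelDataProperties"]
    status = properties.get("LatestUpdateAttemptStatus")
    print("Update status:", status)
    if status != "UPDATING":
        break
    time.sleep(30)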

Creating your first Active Custom Translation job

In this section, we discuss the different ways you can create your ACT job.

StartTextTranslationJob

Calling the StartTextTranslationJob API starts a batch translation job. When you add parallel data to a batch translation job, you create an ACT job. Amazon Translate customizes your ACT output to match the style, tone, and word choices it finds in your PD. ACT is a premium product, so see Amazon Translate pricing for pricing information. You can specify only one parallel data file to use with a text translation job.

CLI

Run the following command:

aws translate start-text-translation-job \
--input-data-config ContentType=${CONTENT_TYPE},S3Uri=${INPUT_S3_URI} \
--output-data-config S3Uri=${OUTPUT_S3_URI} \
--data-access-role-arn ${DATA_ACCESS_ROLE} \
--source-language-code=${SOURCE_LANGUAGE_CODE} --target-language-codes=${TARGET_LANGUAGE_CODE} \
--parallel-data-names ${PARALLEL_DATA_NAME} \
--region ${REGION} \
--job-name ${JOB_NAME}

For example, my code looks like the following:

aws translate start-text-translation-job \
--input-data-config ContentType=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,S3Uri=s3://input-s3bucket/inputfile/ \
--output-data-config S3Uri=s3://output-s3bucket/Output/ \
--data-access-role-arn arn:aws:iam::123456789012:role/TranslateBatchAPI \
--source-language-code=en --target-language-codes=es \
--parallel-data-names my-parallel-data-1 \
--region us-west-2 \
--job-name ACT1

You get a response like the following code:

{
    "JobId": "4446f95f20c88a4b347449d3671fbe3d",
    "JobStatus": "SUBMITTED"
}

This output means the job has been submitted successfully.

Run aws translate start-text-translation-job help for more information.

Console

For instructions on running a batch translation job on the Amazon Translate console, see Translating documents, spreadsheets, and presentations in Office Open XML format using Amazon Translate. Choose my-parallel-data-1 as the parallel data to create your first ACT job, ACT1.
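
If you drive batch jobs from code, the following Boto3 sketch starts the same ACT job and polls it until it reaches a terminal state; the S3 locations, role ARN, and job name are taken from the CLI example above:

import time
import boto3

translate = boto3.client("translate", region_name="us-west-2")

job = translate.start_text_translation_job(
    JobName="ACT1",
    InputDataConfig={
        "S3Uri": "s3://input-s3bucket/inputfile/",
        "ContentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    },
    OutputDataConfig={"S3Uri": "s3://output-s3bucket/Output/"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/TranslateBatchAPI",
    SourceLanguageCode="en",
    TargetLanguageCodes=["es"],
    ParallelDataNames=["my-parallel-data-1"],
)
print(job["JobStatus"])  # SUBMITTED

# Poll until the batch translation job reaches a terminal state.
while True:
    properties = translate.describe_text_translation_job(JobId=job["JobId"])["TextTranslationJobProperties"]
    print("Job status:", properties["JobStatus"])
    if properties["JobStatus"] in ("COMPLETED", "COMPLETED_WITH_ERROR", "FAILED", "STOPPED"):
        break
    time.sleep(60)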

Congratulations! You have created your first ACT job. ACT is available in the following Regions:

  • US East (Northern Virginia)
  • US West (Oregon)
  • Europe (Ireland)

Running your Active Custom Translation job

ACT works on asynchronous batch translation for language pairs that have English as either the source or target language.

Now, let’s try to translate the following text from English to Spanish and see how ACT helps to customize the output:

“How are you?” is one of the most common questions you’ll get asked when meeting someone. The most common response is “good”

The following is the output you get when you translate without any customization:

“¿Cómo estás?” es una de las preguntas más comunes que se le harán cuando conozca a alguien. La respuesta más común es “Buena”

The following is the output you get when you translate using ACT with my-parallel-data-1 as the PD:

“¿Cómo está usted?” es una de las preguntas más comunes que te harán cuando te reúnas con alguien. La respuesta más común es “Buena”

Conclusion

Amazon Translate ACT introduces a powerful way of providing personalized translation output with the following benefits:

  • You don’t have to build a custom translation model
  • You only pay for what you translate using ACT
  • There is no additional model building or model hosting cost
  • Your data is always secure and always under your control
  • You get the best machine translation even when your source text is outside the domain of your parallel data
  • You can update your parallel data as often as you need for no additional cost

Try ACT today. Bring your parallel data and start customizing your machine translation output. For more information about Amazon Translate ACT, see Asynchronous Batch Processing.

About the Authors

Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.

Xingyao Wang is a Software Development Engineer for Amazon Translate, AWS’s natural language processing service. She likes to hang out with her cats at home.


Find your inner poet with help from America’s greats

Behold! the living thrilling lines

That course the blood like madd’ning wines,

And leap with scintillating spray

Across the guards of ecstasy.

The flame that lights the lurid spell

Springs from the soul’s artesian well,

Its fairy filament of art

Entwines the fragments of a heart.

Poetry by Georgia Douglas Johnson

When you write the living thrilling lines of a poem, you put yourself into each verse. Whether you’re writing for family, friends or an audience of thousands, each poem carries a part of you. When composing such a poem, each line is carefully crafted, which requires a lot of creative energy. Verse by Verse can help get those creative juices flowing: it’s our experiment using AI to augment the creative process of composing a poem. It will offer ideas that you can use, alter, or reject as you see fit. Verse by Verse is a creative helper, an inspiration—not a replacement. Here’s how it works.

Your muses

Using Verse by Verse, you can compose a poem with suggestions coming from some of America’s classic poets: Dickinson, Whitman, Poe, Wheatley, Longfellow and others. In order to make this possible, we’ve trained AI systems that provide suggestions in the style of each individual poet to act as your muses while you compose a poem of your own.

Poets featured in this tool

Composing

After choosing which poets to act as your muses and the structure of your poem, you can begin composing. Once you’ve written the first line of verse, Verse by Verse will start to suggest possible next verses.

Writing a poem

We give you full control of this creative process. You can choose to continue writing your own verses, use one of the suggestions, or even edit one of the suggestions to make it more personal. Once you’re satisfied with your poem, give it a title and finalize it. We give you two options: copy the text itself, or download the poem as an image. In either case, you can easily save the poem and share it with others.

Finished poem

Verse suggestions

Verse by Verse’s suggestions are not the original lines of verse the poets had written, but novel verses generated to sound like lines of verse the poets could have written. We did this by first training our generative models on a large collection of classic poetry, then fine-tuning the models on each individual poet’s body of work to try to capture their style of writing.

Additionally, to be able to suggest relevant verses, the system was trained to have a general semantic understanding of what lines of verse would best follow a previous line of verse. So even if you write on topics not commonly seen in classic poetry, the system will try its best to make suggestions that are relevant.

Get writing

Verse by Verse can be used as a tool for inspiration, offering suggestions for ways of writing you may have never thought of. You can use it as an aid to learn about these various poets and the styles that they wrote in.

Have fun, and see where it takes you—perhaps down the road less traveled.


Videos from the TensorFlow User Group Summit in India

Posted by Siddhant Agarwal and Biswajeet Mallik, Program Managers

Logo of TFUG India Summit

TensorFlow has a strong developer community in India with 13 TensorFlow User Groups and 20+ Google Developer Experts. In September, these groups came together to organise the “TFUG India Summit”, a 4-day online event with four tracks. You can check out the recordings for these talks below.


Learning from Language Explanations

Imagine you’re a machine learning practitioner and you want to solve some classification problem, like classifying groups of colored squares as being either 1s or 0s. Here’s what you would typically do: collect a large dataset of examples, label the data, and train a classifier:

But humans don’t learn like this. We have a very powerful and intuitive mechanism for communicating information about the world – language!

With just the phrase at least 2 red squares, we’ve summarized the entire dataset presented above in a much more efficient manner.

Language is a crucial medium for human learning: we use it to convey beliefs about the world, teach others, and describe things that are hard to experience directly. Thus, language ought to be a simple and effective way to supervise machine learning models. Yet past approaches to learning from language have struggled to scale up to the general tasks targeted by modern deep learning systems and the freeform language explanations used in these domains. In two short papers presented at ACL 2020 this year, we use deep neural models to learn from language explanations to help tackle a variety of challenging tasks in natural language processing (NLP) and computer vision.

What’s the challenge?

Given that language is such an intuitive interface for humans to teach others, why is it so hard to use language for machine learning?

The principal challenge is the grounding problem: understanding language explanations in the context of other inputs. Building models that can understand rich and ambiguous language is tricky enough, but building models that can relate language to the surrounding world is even more challenging. For instance, given the explanation at least two red squares, a model must not only understand the terms red and square, but also how they refer to particular parts of (often complex) inputs.

Past work (1, 2, 3) has relied on semantic parsers, which convert natural language statements (e.g. at least two red squares) to formal logical representations (e.g. Count(Square AND Red) >= 2). If we can easily check whether explanations apply to our inputs by executing these logical formulas, we can use our explanations as features to train our model. However, semantic parsers only work on simple domains where we can hand-engineer a logical grammar of explanations we might expect to see. They struggle to handle richer and vaguer language or to scale up to more complex inputs, such as images.

Fortunately, modern deep neural language models such as BERT are beginning to show promise at solving many language understanding tasks. Our papers propose to alleviate the grounding problem by using neural language models that are either trained to ground language explanations in the domain of interest, or come pre-trained with general-purpose “knowledge” that can be used to interpret explanations. We will show that these neural models allow us to learn from richer and more diverse language for more challenging settings.

Representation Engineering with Natural Language Explanations

In our first paper, we examine how to build text classifiers with language explanations. Consider the task of relation extraction, where we are given a short paragraph and must identify whether two people mentioned in the paragraph are married. While state-of-the-art NLP models can likely solve this task from data alone, humans might use language to describe ways to tell whether two people are married—for example, people who go on honeymoons are typically married. Can such language explanations be used to train better classifiers?

In the same way that we might take an input and extract features (e.g. the presence of certain words) to train a model, we can use explanations to provide additional features. For example, knowing that honeymoons are relevant for this task, if we can create a honeymoon feature that reliably activates whenever the two people in a paragraph are described as going on a honeymoon, this should be useful signal for training a better model.

But creating such features requires some sort of explanation interpretation mechanism that tells us whether an explanation is true for an input. Semantic parsers are one such tool: given an input and the explanation went on honeymoon, we could parse this explanation into a logical form which, when run on the input, produces 1 if the word honeymoon appears between the two people mentioned. But what about a vaguer explanation like they are in love? How can we parse this?

While semantic parsing is efficient and accurate in small domains, it can be overly brittle, as it can only interpret explanations which adhere to a fixed set of grammatical rules and functions that we must specify in advance (e.g. contains and extract_text). Instead, we turn to the soft reasoning capabilities of BERT, a neural language model. BERT is particularly effective at the task of textual entailment: determining whether a sentence implies or contradicts another sentence (e.g. does She ate pizza imply that She ate food? Yes!). In our proposed ExpBERT model, we take a BERT model trained for textual entailment, and instead ask it to identify whether a paragraph in our task entails an explanation. The features produced by BERT during this process replace the indicator features produced by the semantic parser above.
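
To make the recipe concrete, the following is a small, self-contained sketch of ExpBERT-style features; the listed explanations are illustrative, and the keyword-overlap entailment scorer is only a stand-in for the BERT entailment model described above:

import numpy as np

# Candidate language explanations for the marriage relation (illustrative examples).
EXPLANATIONS = [
    "The two people went on a honeymoon.",
    "The two people are in love.",
    "The two people have children together.",
]

def entailment_score(paragraph: str, hypothesis: str) -> float:
    """Stand-in for a BERT entailment model: returns a score indicating how
    strongly `paragraph` entails `hypothesis`. A real implementation would use
    a model fine-tuned on an entailment dataset such as MultiNLI."""
    overlap = set(paragraph.lower().split()) & set(hypothesis.lower().split())
    return len(overlap) / max(len(hypothesis.split()), 1)

def expbert_features(paragraph: str) -> np.ndarray:
    # One feature per explanation: how strongly does the paragraph entail it?
    return np.array([entailment_score(paragraph, e) for e in EXPLANATIONS])

# These feature vectors are concatenated with the usual input features and fed
# to a standard classifier (e.g., logistic regression).
print(expbert_features("Alice and Bob went on a honeymoon to Rome."))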

Does the soft reasoning power of BERT improve over semantic parsing? On the marriage identification task, we find that ExpBERT leads to substantial improvements over a classifier that is trained on the input features only (No Explanations). Importantly, using a semantic parser to try to parse explanations doesn’t help much, since there are general explanations (in love) that are difficult to convert to logical forms.

In the full paper, we compare to more baselines, explore larger relation extraction tasks (e.g. TACRED), conduct ablation studies to understand what kinds of explanations are important, and examine how much more efficient explanations are compared to additional data.

Shaping Visual Representations with Language

The work we’ve just described uses natural language explanations for a single task like marriage identification. However, work in cognitive science suggests that language also equips us with the right features and abstractions that help us solve future tasks. For example, explanations that indicate whether one person is married to another also highlight other concepts that are crucial to human relationships: children, daughters, honeymoons, and more. Knowing these additional concepts is not just useful for identifying married people; they are also important if we would later like to identify other relationships (e.g. siblings, mother, father).

In machine learning, we might ask: how can language point out the right features for challenging and underspecified domains, if we ultimately wish to solve new tasks where no language is available? In our second paper, we explore this setting, additionally increasing the challenge by seeing whether language can improve the learning of representations across modalities—here, vision.

We’re specifically interested in few-shot visual reasoning tasks like the following (here, from the ShapeWorld dataset):

Given a small training set of examples of a visual concept, the task is to determine whether a held-out test image expresses the same concept. Now, what if we assume access to language explanations of the relevant visual concepts at training time? Can we use these to learn a better model, even if no language is available at test time?

We frame this as a meta-learning task: instead of training and testing a model on a single task, we train a model on a set of tasks, each with a small training set and an accompanying language description (the meta-train set). We then test generalization to a meta-test set of unseen tasks, for which no language is available:

First, let’s look at how we might solve this task without language. One typical approach is Prototype Networks, where we learn some model (here, a deep convolutional neural network) that embeds the training images, averages them, and compares to an embedding of the test image:

To use language, we propose a simple approach called Language Shaped Learning (LSL): if we have access to explanations at training time, we encourage the model to learn representations that are not only helpful for classification, but are predictive of the language explanations. We do this by introducing an auxiliary training objective (i.e. one that is not related to the ultimate task of interest), where we simultaneously train a recurrent neural network (RNN) decoder to predict the explanation(s) from the representation of the input images. Crucially, training this decoder depends on the parameters of our image model, so this process should encourage the image model to better encode the features and abstractions exposed in language.

In effect, we are training the model to “think out loud” when representing concepts at training time. At test time, we simply discard the RNN decoder and do classification as normal with the “language-shaped” image embeddings.
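
As a rough illustration of this auxiliary objective, here is a short PyTorch sketch; the flat image encoder, GRU decoder, vocabulary size, and dimensions are simplified assumptions rather than the paper’s exact architecture:

import torch
import torch.nn as nn

class LSLModel(nn.Module):
    def __init__(self, feat_dim=64, vocab_size=1000, embed_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(  # stand-in for a convolutional image encoder
            nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.GRU(embed_dim, feat_dim, batch_first=True)
        self.vocab_out = nn.Linear(feat_dim, vocab_size)

    def forward(self, images, captions=None):
        feats = self.encoder(images)          # "language-shaped" image embeddings
        lm_loss = None
        if captions is not None:              # auxiliary objective, used at training time only
            h0 = feats.unsqueeze(0)           # condition the decoder on the image features
            out, _ = self.decoder(self.embed(captions[:, :-1]), h0)
            logits = self.vocab_out(out)
            lm_loss = nn.functional.cross_entropy(
                logits.reshape(-1, logits.size(-1)), captions[:, 1:].reshape(-1))
        return feats, lm_loss

# Few-shot classification then compares averaged support embeddings (prototypes)
# to the query embedding; at test time, captions are simply omitted.
model = LSLModel()
imgs = torch.randn(4, 3, 32, 32)
caps = torch.randint(0, 1000, (4, 8))
feats, aux_loss = model(imgs, caps)
print(feats.shape, aux_loss.item())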

We apply this model to both the ShapeWorld dataset described above and a more realistic Birds dataset, with real images and human language:

In both cases, this auxiliary training objective improves performance over a no-explanation baseline (Meta) and Learning with Latent Language (L3), a similar model proposed for this setting that uses language as a discrete bottleneck (see the paper for details):

In the full paper, we also explore which parts of language are most important (spoiler: a little bit of everything), and how much language is needed for LSL to improve over models that don’t use language (spoiler: surprisingly little!).

Moving Forward

As NLP systems grow in their ability to understand and produce language, so too grows the potential for machine learning systems to learn from language to solve other challenging tasks. In the papers above, we’ve shown that deep neural language models can be used to successfully learn from language explanations to improve generalization across a variety of tasks in vision and NLP.

We think this is an exciting new avenue for training machine learning models, and similar ideas are already being explored in areas such as reinforcement learning (4, 5). We envision a future where, in order to solve a machine learning task, we no longer have to collect a large labeled dataset, but instead interact naturally and expressively with a model in the same way that humans have interacted with each other for millennia—through language.

Acknowledgments

Thanks to our coauthors (Pang Wei Koh, Percy Liang, and Noah Goodman), and to Nelson Liu, Pang Wei Koh, and the rest of the SAIL blog team for reviewing and publishing this blog post. This research was supported in part by the Facebook Fellowship (to Pang Wei Koh), the NSF Graduate Research Fellowship (to Jesse Mu), the Toyota Research Institute, and the Office of Naval Research.


Getting started with Amazon Kendra ServiceNow Online connector

Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). To make it simple to search data across multiple content repositories, Amazon Kendra offers a number of native data source connectors to help get your documents easily ingested and indexed.

This post describes how you can use the Amazon Kendra ServiceNow connector. To allow the connector to access your ServiceNow site, you need to know your ServiceNow version, the Amazon Kendra index, the ServiceNow host URL, and the credentials of a user with the ServiceNow admin role attached to it. The ServiceNow credentials needed for the Amazon Kendra ServiceNow connector to work are securely stored in AWS Secrets Manager, and can be entered during the connector setup.

Currently, Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition for building proof of concepts (POCs), and the Amazon Kendra Enterprise Edition. Amazon Kendra connectors work with both these editions.

The Amazon Kendra ServiceNow Online connector indexes Service Catalog items and public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

Prerequisites

To get started, you need the following:

  • The ServiceNow host URL
  • Username and Password of a user with the admin role
  • Know your ServiceNow version

The user that you use for the connector needs to have the admin role in ServiceNow. This is defined on ServiceNow’s User Administration page (see the section Insufficient Permissions for more information).

When setting up the ServiceNow connector, we need to define whether our build is London or a different ServiceNow version. To obtain our build name, we can go to the System Diagnostics menu and choose Stats.

In the following screenshot, my build name is Orlando, so I indicate on the connector that my version is Others.

Creating a ServiceNow connector in the Amazon Kendra console

The following section describes the process of deploying an Amazon Kendra index and configuring a ServiceNow connector.

  1. Create your index. For instructions, see Getting started with the Amazon Kendra SharePoint Online connector.

If you already have an index, you can skip this step.

The next step is to set up the data sources. One of the advantages of implementing Amazon Kendra is that you can use a set of pre-built connectors for data sources, such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), Salesforce, ServiceNow, and SharePoint Online, among others.

For this post, we use the ServiceNow connector.

  2. On the Amazon Kendra console, choose Indexes.
  3. Choose MyServiceNowindex.
  4. Choose Add data sources.

  5. Choose ServiceNow Online.

  6. For Name, enter a connector name.
  7. For Description, enter an optional description.
  8. For Tags, you can optionally assign tags to your data source.
  9. Choose Next.

In this next step, we define targets.

  10. For ServiceNow host, enter the host name.
  11. For ServiceNow version, enter your version (for this post, we choose Others).
  12. For IAM role, we can create a new AWS Identity and Access Management (IAM) role or use an existing one.

For more information, see IAM role for ServiceNow data sources.

This role has four functions, which are detailed in the IAM role documentation linked above.

If you use an existing IAM role, you have to grant it permissions to access this secret in Secrets Manager. If you create a new IAM role and a new secret, no further action is required.

  13. Choose Next.

You then need to define ServiceNow authentication details, the content to index, and the synchronization schedule.

The ServiceNow user you provide for the connector needs to have the admin role.

  14. In the Authentication section, for Type of authentication, choose an existing secret or create a new one. For this post, we choose New.
  15. Enter your secret’s name, username, and password.

  16. In the ServiceNow configuration section, we define the content types we need to index: Knowledge articles, Service catalog items, or both.
  17. You also define whether to include the item attachments.

Amazon Kendra only indexes public articles that have a published state, so a knowledge base article must have the public role under Can Read, and Cannot Read must be null or not set.

  18. You can include or exclude some file extensions (for example, for Microsoft Word, we have six different types of extensions).

  19. For Frequency, choose Run on demand.

  20. Add field mappings.

Even though this is an optional step, it’s a good idea to add this extra layer of metadata to our documents from ServiceNow. This metadata enables you to improve accuracy through manual tuning, filtering, and faceting. There is no way to add metadata to already ingested documents, so if you want to add metadata later, you need to delete this data source, recreate it with the metadata mappings, and ingest your documents again.

If you map fields through the console when setting up the ServiceNow connector for the first time, these fields are created automatically. If you configure the connector via the API, you need to update your index first and define those new fields.

You can map ServiceNow properties to Amazon Kendra index fields. The following table lists the knowledge article fields that you can map.

ServiceNow Field Name | Suggested Amazon Kendra Field Name
content | _document_body
displayUrl | sn_display_url
first_name | sn_ka_first_name
kb_category | sn_ka_category
kb_category_name | _category
kb_knowledge_base | sn_ka_knowledge_base
last_name | sn_ka_last_name
number | sn_kb_number
published | sn_ka_publish_date
repItemType | sn_repItemType
short_description | _document_title
sys_created_by | sn_createdBy
sys_created_on | _created_at
sys_id | sn_sys_id
sys_updated_by | sn_updatedBy
sys_updated_on | _last_updated_at
url | sn_url
user_name | sn_ka_user_name
valid_to | sn_ka_valid_to
workflow_state | sn_ka_workflow_state

Even though there are suggested Amazon Kendra field names, you can map a ServiceNow field to a different name.

The following table summarizes the available service catalog fields.

ServiceNow Field Name | Suggested Amazon Kendra Field Name
category | sn_sc_category
category_full_name | sn_sc_category_full_name
category_name | _category
description | _document_body
displayUrl | sn_display_url
repItemType | sn_repItemType
sc_catalogs | sn_sc_catalogs
sc_catalogs_name | sn_sc_catalogs_name
short_description | _document_body
sys_created_by | sn_createdBy
sys_created_on | _created_at
sys_id | sn_sys_id
sys_updated_by | sn_updatedBy
sys_updated_on | _last_updated_at
title | _document_title
url | sn_url

For this post, our Amazon Kendra index has a custom index field called MyCustomUsername, which you can use to map the Username field from different data sources. This custom field was created under the index’s facet definition. The following screenshot shows a custom mapping.

  21. Review the settings and choose Create data source.

After your ServiceNow data source is created, you see a banner similar to the following screenshot.

  22. Choose Sync now to start the syncing and document ingestion process.

If everything goes as expected, you can see the status as Succeeded.

Testing

Now that you have synced your ServiceNow site, you can test it on the Amazon Kendra search console.

In my case, my ServiceNow site has the demo examples, so I asked what is the storage on the ipad 3, which returned information from a service catalog item:

Creating a ServiceNow connector with Python

We saw how to create an index on the Amazon Kendra console; now we create a new Amazon Kendra index and a ServiceNow connector and sync it by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

My personal preference to test my Python scripts is to spin up an Amazon SageMaker notebook instance, a fully managed ML Amazon Elastic Compute Cloud (Amazon EC2) instance that runs the Jupyter Notebook app. For instructions, see Create an Amazon SageMaker Notebook Instance.

To create an index using the AWS SDK, we need to have the policy AmazonKendraFullAccess attached to the role we use.

Also, Amazon Kendra requires different roles to operate:

  • IAM roles for indexes, which are needed by Amazon Kendra to write to Amazon CloudWatch Logs.
  • IAM roles for data sources, which are needed when we use the CreateDataSource API. These roles require a specific set of permissions depending on the connector we use. Because we use a ServiceNow data source, the role must provide permissions to:
    • Access Secrets Manager, where the ServiceNow Online credentials are stored.
    • Use the AWS Key Management Service (AWS KMS) customer master key (CMK) to decrypt the credentials stored in Secrets Manager.
    • Use the BatchPutDocument and BatchDeleteDocument operations to update the index.

For more information, see IAM access roles for Amazon Kendra.

Our current requirements are:

  • Amazon SageMaker Notebooks execution role with permission to create an Amazon Kendra index using an Amazon SageMaker notebook
  • Amazon Kendra IAM role for CloudWatch
  • Amazon Kendra IAM role for ServiceNow connector
  • ServiceNow credentials stored on Secrets Manager

To create an index, we use the following code:

import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print("Creating an index")

description = "<YOUR_INDEX_DESCRIPTION>"
index_name = "<YOUR_NEW_INDEX_NAME>"
role_arn = "<KENDRA_ROLE_WITH_CLOUDWATCH_PERMISSIONS_ROLE_ARN>"

try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = role_arn,
        Edition = "DEVELOPER_EDITION",
        Tags=[
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )

    pprint.pprint(index_response)

    index_id = index_response['Id']

    print("Wait for Kendra to create the index.")

    while True:
        # Get the index description
        index_description = kendra.describe_index(
            Id = index_id
        )
        # If the status is no longer CREATING, stop polling
        status = index_description["Status"]
        print("    Creating index. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

except ClientError as e:
    print("%s" % e)

print("Done creating index.")

While our index is being created, we receive regular status updates (every 60 seconds, from the time.sleep(60) call in the polling loop) until the process is finished. See the following code:

Creating an index
 {'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
  'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                       'content-type': 'application/x-amz-json-1.1',
                                       'date': 'Wed, 12 Aug 2020 12:58:19 GMT',
                                       'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                       'HTTPStatusCode': 200,
                       'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                       'RetryAttempts': 2}}
 Wait for Kendra to create the index.
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: CREATING
     Creating index. Status: ACTIVE
 Done creating index

The preceding output indicates that our index has been created and our new index ID is 3311b507-bfef-4e2b-bde9-7c297b1fd13b (your ID will be different from our example). This information is included as Id in the response.

Our Amazon Kendra index is up and running now.

If you have metadata attributes associated with your ServiceNow articles, you want to do three things:

  1. Determine the Amazon Kendra attribute name you want for each of your ServiceNow metadata attributes. By default, Amazon Kendra has six reserved fields (_category, _created_at, _file_type, _last_updated_at, _source_uri, and _view_count).
  2. Update the index with the UpdateIndex API call with the Amazon Kendra attribute names.
  3. Map each ServiceNow metadata attribute to each Amazon Kendra metadata attribute.

You can find a table with metadata attributes and the suggested Amazon Kendra fields under step 20 in the previous section.

For this post, I have the metadata attribute UserName associated with my ServiceNow article and I want to map it to the field MyCustomUsername on my index. The following code shows how to add the attribute MyCustomUsername to my Amazon Kendra index. After we create this custom field in our index, we map our field Username from ServiceNow to it. See the following code:

try:
    update_response = kendra.update_index(
        Id='3311b507-bfef-4e2b-bde9-7c297b1fd13b',
        RoleArn='arn:aws:iam::<MY-ACCOUNT-NUMBER>:role/service-role/AmazonKendra-us-east-1-KendraRole',
        DocumentMetadataConfigurationUpdates=[
            {
                'Name': '<MY_CUSTOM_FIELD_NAME>',
                'Type': 'STRING_VALUE',
                'Search': {
                    'Facetable': True,
                    'Searchable': True,
                    'Displayable': True
                }
            }
        ]
    )
except ClientError as e:
    print('%s' % e)
pprint.pprint(update_response)

If everything goes well, we receive a 200 response:

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 12 Aug 2020 12:17:07 GMT',
                                      'x-amzn-requestid': '3eba66c9-972b-4757-8d92-37be17c8f8a2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3eba66c9-972b-4757-8d92-37be17c8f8a2',
                      'RetryAttempts': 0}}

We also need the secretsmanager:GetSecretValue permission for our secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store your ServiceNow credentials, make sure the role you use has permissions to CreateSecret and tagging for Secrets Manager. The policy should look like the following code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerWritePolicy",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:UntagResource",
                "secretsmanager:CreateSecret",
                "secretsmanager:TagResource"
            ],
            "Resource": "*"
        }
    ]
}

The following code creates a secret in Secrets Manager:

secretsmanager = boto3.client('secretsmanager')

SecretName = "<YOURSECRETNAME>"
ServiceNowCredentials = '{"username": "<YOUR_SERVICENOW_SITE_USERNAME>", "password": "<YOUR_SERVICENOW_SITE_PASSWORD>"}'

try:
    create_secret_response = secretsmanager.create_secret(
        Name=SecretName,
        Description='Secret for a ServiceNow data source connector',
        SecretString=ServiceNowCredentials,
        Tags=[
            {
                'Key': 'Project',
                'Value': 'ServiceNow Test'
            }
        ]
    )
except ClientError as e:
    print('%s' % e)
pprint.pprint(create_secret_response)

If everything goes well, you get a response with your secret’s ARN:

{'ARN':<YOUR_SECRETS_ARN>,
 'Name': <YOUR_SECRET_NAME>,
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '161',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 14:44:13 GMT',
                                      'x-amzn-requestid': '68c9a153-c08e-42df-9e6d-8b82550bc412'},
                      'HTTPStatusCode': 200,
                      'RequestId': '68c9a153-c08e-42df-9e6d-8b82550bc412',
                      'RetryAttempts': 0},
 'VersionId': 'bee74dab-6beb-4723-a18b-4829d527aad8'}

Now that we have our Amazon Kendra index, our custom field, and our ServiceNow credentials, we can proceed with creating our data source.

To ingest documents from this data source, we need an IAM role with kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM roles for ServiceNow data sources. We use the ARN for this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. It should look like the following code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kendra.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
The following code is the policy structure we need:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:region:account ID:secret:secret ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:region:account ID:key/key ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kendra:BatchPutDocument",
                "kendra:BatchDeleteDocument"
            ],
            "Resource": [
                "arn:aws:kendra:<REGION>:<account_ID>:index/<index_ID>"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "kendra.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_NAME>/*"
            ]
        }
    ]
}

Finally, the following code is my role’s ARN:

arn:aws:iam::<MY_ACCOUNT_NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index, and read the secrets to connect to our ServiceNow site.

One detail we can specify when creating a data source is the sync schedule, which indicates how often our index syncs with the data source we create. This schedule is defined in the Schedule parameter of our request. You can use schedule expressions for rules to define how often you want to sync your data source. For this post, I use the ScheduleExpression 'cron(0 11 * * ? *)', which means that our data source is synced every day at 11:00 AM.
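
If you want a different cadence, the following are a few illustrative schedule expressions in the same six-field cron syntax (times are in UTC); these specific values are examples for demonstration, not ones used elsewhere in this post:

# Illustrative ScheduleExpression values (fields: minutes, hours, day-of-month, month, day-of-week, year).
DAILY_AT_11AM = 'cron(0 11 * * ? *)'          # every day at 11:00 AM
EVERY_SIX_HOURS = 'cron(0 0/6 * * ? *)'       # every 6 hours, on the hour
WEEKDAYS_AT_7PM = 'cron(0 19 ? * MON-FRI *)'  # Monday through Friday at 7:00 PM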

I use the following code. Make sure you use your own SiteUrl, SecretArn, and IndexId. Additionally, FieldMappings is where you map the ServiceNow attribute names to the Amazon Kendra index attribute names. I chose the same attribute name in both, but you can call the Amazon Kendra attribute whatever you’d like.

For more details, see create_data_source(**kwargs).

print('Create a data source')
 
SecretArn= "<YOUR_SERVICENOW_ONLINE_USER_AND_PASSWORD_SECRETS_ARN>"
SiteUrl = "<YOUR_SERVICENOW_SITE_URL>"
DSName= "<YOUR_NEW_DATASOURCE_NAME>"
IndexId= "<YOUR_INDEX_ID>"
DSRoleArn= "<YOUR_DATA_SOURCE_ROLE>"
ScheduleExpression='cron(0 11 * * ? *)'
try:
    datasource_response = kendra.create_data_source(
    Name=DSName,
    IndexId=IndexId,        
    Type='SERVICENOW',
    Configuration={
        'ServiceNowConfiguration': {
            'HostUrl': SiteUrl,
            'SecretArn': SecretArn,
            'ServiceNowBuildVersion': 'OTHERS',
            'KnowledgeArticleConfiguration': 
            {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'content',
                'DocumentTitleFieldName': 'short_description',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'kb_category_name',
                        'IndexFieldName': '_category'
                    },
                    {
                        'DataSourceFieldName': 'sys_created_by',
                        'IndexFieldName': 'MyCustomUsername'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
            'ServiceCatalogConfiguration': {
                'CrawlAttachments': True,
                'DocumentDataFieldName': 'description',
                'DocumentTitleFieldName': 'title',
                'FieldMappings': 
                [
                    {
                        'DataSourceFieldName': 'sys_created_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_created_at'
                    },
                    {
                        'DataSourceFieldName': 'sys_updated_on',
                        'DateFieldFormat': 'yyyy-MM-dd hh:mm:ss',
                        'IndexFieldName': '_last_updated_at'
                    },
                    {
                        'DataSourceFieldName': 'category_name',
                        'IndexFieldName': '_category'
                    }
                ],
                'IncludeAttachmentFilePatterns': 
                [
                    '.*\.(dotm|ppt|pot|pps|ppa)$',
                    '.*\.(doc|dot|docx|dotx|docm)$',
                    '.*\.(pptx|ppsx|pptm|ppsm|html)$',
                    '.*\.(txt)$',
                    '.*\.(hml|xhtml|xhtml2|xht|pdf)$'
                ]
            },
        },
    },
    Description='My ServiceNow Datasource',
    RoleArn=DSRoleArn,
    Schedule=ScheduleExpression,
    Tags=[
        {
            'Key': 'Project',
            'Value': 'ServiceNow Test'
        }
    ])
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get index description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=IndexId
        )
        # If status is not CREATING quit
        status = datasource_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)    

except  ClientError as e:
        pprint.pprint('%s' % e)
pprint.pprint(datasource_response)

If everything goes well, we should receive a 200 status response:

Create a DataSource
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}
Waiting for Kendra to create the DataSource.
    Creating data source. Status: CREATING
    Creating data source. Status: ACTIVE
{'Id': '507686a5-ff4f-4e82-a356-32f352fb477f',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:50:08 GMT',
                                      'x-amzn-requestid': '9deaea21-1d38-47b0-a505-9bb2efb0b74f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '9deaea21-1d38-47b0-a505-9bb2efb0b74f',
                      'RetryAttempts': 0}}

Even though we have defined a schedule for syncing our data source, we can also sync on demand by using the start_data_source_sync_job method:

DSId=<YOUR DATA SOURCE ID>
IndexId=<YOUR INDEX ID>

try:
    ds_sync_response = kendra.start_data_source_sync_job(
        Id=DSId,
        IndexId=IndexId
    )
except ClientError as e:
    print('%s' % e)

pprint.pprint(ds_sync_response)

The response should look like the following code:

{'ExecutionId': '3d11e6ef-3b9e-4283-bf55-f29d0b10e610',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Sat, 22 Aug 2020 15:52:36 GMT',
                                      'x-amzn-requestid': '55d94329-50af-4ad5-b41d-b173f20d8f27'},
                      'HTTPStatusCode': 200,
                      'RequestId': '55d94329-50af-4ad5-b41d-b173f20d8f27',
                      'RetryAttempts': 0}}
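
If we want to track the progress of that sync, we can poll its status with the list_data_source_sync_jobs method. The following is a minimal sketch that reuses the kendra client, DSId, IndexId, and ds_sync_response values from the previous steps; the 60-second polling interval is an arbitrary choice.

import time

target_execution_id = ds_sync_response['ExecutionId']
while True:
    sync_jobs = kendra.list_data_source_sync_jobs(
        Id=DSId,
        IndexId=IndexId
    )
    # Find the job we started earlier by its execution ID
    job = next((j for j in sync_jobs.get('History', [])
                if j['ExecutionId'] == target_execution_id), None)
    if job:
        print('Sync job status: ' + job['Status'])
        # Stop polling once the job reaches a terminal state
        if job['Status'] in ('SUCCEEDED', 'FAILED', 'INCOMPLETE', 'ABORTED'):
            break
    time.sleep(60)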

Testing

Finally, we can query our index. See the following code:

response = kendra.query(
    IndexId=<YOUR_INDEX_ID>,
    QueryText='Is there a service that has 11 9s of durability?')
if response['TotalNumberOfResults'] > 0:
    print(response['ResultItems'][0]['DocumentExcerpt']['Text'])
    print("More information: "+response['ResultItems'][0]['DocumentURI'])
else:
    print('No results found, please try a different search term.')

Common errors

In this section, we discuss errors that may occur, whether using the Amazon Kendra console or the Amazon Kendra API.

You should look at CloudWatch logs and error messages returned in the Amazon Kendra console or via the Amazon Kendra API. The CloudWatch logs help you determine the reason for a particular error, whether you experience it using the console or programmatically.
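
If you prefer to run that kind of log analysis programmatically, the following is a minimal sketch using the CloudWatch Logs Insights API through boto3; the log group name is a placeholder for the log group that Amazon Kendra created for your index, and the query string is only an example filter for error messages.

import time
from datetime import datetime, timedelta

import boto3

logs = boto3.client('logs')

# Placeholder: use the log group associated with your Amazon Kendra index
log_group = '<YOUR_KENDRA_LOG_GROUP>'

# Query the last 24 hours of logs for lines that mention errors
query = logs.start_query(
    logGroupName=log_group,
    startTime=int((datetime.utcnow() - timedelta(days=1)).timestamp()),
    endTime=int(datetime.utcnow().timestamp()),
    queryString='fields @timestamp, @message | filter @message like /error/ | sort @timestamp desc | limit 20'
)

# Poll until the query finishes, then print the matching log lines
while True:
    results = logs.get_query_results(queryId=query['queryId'])
    if results['status'] in ('Complete', 'Failed', 'Cancelled'):
        break
    time.sleep(2)

for row in results['results']:
    print({field['field']: field['value'] for field in row})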

Common errors when trying to access ServiceNow as a data source are:

  • Insufficient permissions
  • Invalid credentials
  • Secrets Manager error

Insufficient permissions

A common scenario you may come across is when you have the right credentials but your user doesn’t have enough permissions for the Amazon Kendra ServiceNow connector to crawl your knowledge base and service catalog items.

You receive the following error message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Sep 12, 2020, 1:08 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

If you can log in to your ServiceNow instance, make sure that the user you designated for the connector has the admin role.

  1. On your ServiceNow instance, under User Administration, choose Users.

  2. On the users list, choose the user ID of the user you want to use for the connector.

  3. On the Roles tab, verify that your user has the admin role.

  4. If you don’t have that role attached to your user, choose Edit to add it.

  5. On the Amazon Kendra console, on your connector configuration page, choose Sync now.

Invalid credentials

You may encounter an error with the following message:

We couldn't sync the following data source: 'MyServiceNowOnline', at start time Jul 28, 2020, 3:59 PM CDT. Amazon Kendra can't connect to the ServiceNow server with the specified credentials. Check your credentials and try your request again.

To investigate, complete the following steps:

  1. Choose the error message to review the CloudWatch logs.

You’re redirected to CloudWatch Logs Insights.

  2. Choose Run Query to start analyzing the logs.

We can verify our credentials by going to Secrets Manager and reviewing the credentials stored in the secret.

  3. Choose your secret name.

  4. Choose Retrieve secret value.

  5. If your password doesn’t match, choose Edit.

  6. Update the username or password and choose Save.

  7. Go back to your data source in Amazon Kendra, and choose Sync now.

Secrets Manager error

You may encounter an error stating that the customer’s secret can’t be fetched. This may happen if you use an existing secret and the IAM role used for syncing your ServiceNow data source doesn’t have permissions to access the secret.

To address this issue, first we need our secret’s ARN.

  1. On the Secrets Manager console, choose your secret’s name (for this post, AmazonKendra-ServiceNow-demosite).
  2. Copy the secret’s ARN.
  3. On the IAM console, search for the role we use to sync our ServiceNow data source (for this post, AmazonKendra-servicenow-role).
  4. For Permissions, choose Add inline policy.
  5. Following the least privilege principle, for Service, choose Secrets Manager.
  6. For Access Level, choose Read and GetSecretValue.
  7. For Resources, enter our secret’s ARN.

Your settings should look similar to the following screenshot.

  8. Enter a name for your policy.
  9. Choose Create Policy.

After your policy has been created and attached to your data source role, try to sync again.
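
If you prefer to script this step, the same inline policy can be attached with the AWS SDK for Python instead of the console. The following is a minimal sketch that assumes the role name used in this post; the policy name and secret ARN are placeholders for your own values.

import json

import boto3

iam = boto3.client('iam')

# Placeholder: the ARN you copied from the Secrets Manager console
secret_arn = 'arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:<YOUR_SECRET_NAME>'

# Least-privilege inline policy: read access to this one secret only
policy_document = {
    'Version': '2012-10-17',
    'Statement': [
        {
            'Effect': 'Allow',
            'Action': 'secretsmanager:GetSecretValue',
            'Resource': secret_arn
        }
    ]
}

iam.put_role_policy(
    RoleName='AmazonKendra-servicenow-role',
    PolicyName='AllowGetServiceNowSecret',  # hypothetical policy name
    PolicyDocument=json.dumps(policy_document)
)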

Conclusion

You have now learned how to ingest the documents from your ServiceNow site into your Amazon Kendra index. We hope this post helps you take advantage of the intelligent search capabilities in Amazon Kendra to find accurate answers from your enterprise content.

For more information about Amazon Kendra, see AWS re:Invent 2019 – Keynote with Andy Jassy on YouTube, Amazon Kendra FAQs, and What is Amazon Kendra?

 


About the Authors

David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.

 

 

 

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Read More

Amazon Augmented AI is now a HIPAA eligible service

Amazon Augmented AI is now a HIPAA eligible service

Amazon Augmented AI (Amazon A2I) is now a HIPAA eligible service. Amazon A2I makes it easy to build the workflows required for human review of machine learning (ML) predictions. HIPAA eligibility applies to AWS Regions where the service is available and means you can use Amazon A2I to add human review of protected health information (PHI) to power your healthcare workflows through your private workforce.

If you have a HIPAA Business Associate Addendum (BAA) in place with AWS, you can now start using Amazon A2I for your HIPAA eligible workloads. If you don’t have a BAA in place with AWS, or if you have any other questions about running HIPAA-regulated workloads on AWS, please contact us. For information and best practices about configuring AWS HIPAA eligible services to store, process, and transmit PHI, see the Architecting for HIPAA Security and Compliance on Amazon Web Services whitepaper.

Amazon A2I makes it easy to build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems and managing large numbers of human reviewers. Many healthcare customers like Change Healthcare, EPL Innovative Solutions, and partners like TensorIoT are already exploring new ways to use the power of ML to automate their current workloads and transform how they provide care to patients and use AWS services to help them meet their compliance needs under HIPAA. 

Change Healthcare

Change Healthcare is a leading independent healthcare technology company that provides data and analytics-driven solutions to improve clinical, financial, and patient engagement outcomes in the US healthcare system.

“At Change Healthcare, we help accelerate healthcare’s transformation by innovating to remove inefficiencies, reduce costs, and improve outcomes. We have a robust set of integrated artificial intelligence engines that bring new insights, impact, and innovation to the industry. Critical to our results is enabling human-in-the-loop to understand our data and automate workflows. Amazon A2I makes it easy to build the workflows required for human review of ML predictions. With Amazon A2I becoming HIPAA eligible, we are able to involve the human in the workflow and decision-making process, helping to increase efficiency with the millions of documents we process to create even more value for patients, payers, and providers.”

—Luyuan Fang, Chief AI Officer, Change Healthcare

TensorIoT

TensorIoT was founded on the instinct that the majority of compute is moving to the edge and all things are becoming smarter. TensorIoT is creating solutions to simplify the way enterprises incorporate Intelligent Edge Computing devices, AI, and their data into their day-to-day operations.

“TensorIoT has been working with customers to build document ingestion pipelines since Amazon Textract was in preview. Amazon A2I helps us easily add human-in-the-loop for document workflows, and one of the most frequently requested features from our healthcare customers is the ability to handle protected health information. Now with Amazon A2I being added to HIPAA eligible services, our healthcare customers will also be able to significantly increase the ingestion speed and accuracy of documents to provide insights and drive better outcomes for their doctors and patients.”

—Charles Burden, Head of Business Development, TensorIoT, AWS Advanced Consulting Partner

EPL Innovative Solutions

EPL Innovative Solutions, charged with the mission of “Changing Healthcare,” is an orchestrated effort, based on decades of experience in various healthcare theaters across the nation, to assess systems, identify shortcomings, plan and implement strategies, and lead the process of change for the betterment of the patient, provider, organization, or community experience.

“At EPL Innovative Solutions, we are excited to add Amazon A2I to our revolutionary cloud-based platform to serve healthcare clients that rely on us for HIPAA secure, accurate, and efficient medical coding and auditing services. We needed industry experts to optimize our platform, so we reached out to Belle Fleur Technologies as our AWS Partner for the execution of this solution to allow 100% human-in-the-loop review to meet our industry-leading standards of speed and accuracy.”

—Amanda Donoho, COO, EPL Innovative Solutions

Summary

Amazon A2I helps you automate your human review workloads and is now a HIPAA eligible service. For video presentations, sample Jupyter notebooks, or more information about use cases like document processing, object detection, sentiment analysis, text translation, and others, see Amazon Augmented AI Resources.

 


About the Author

Anuj Gupta is the Product Manager for Amazon Augmented AI. He is focusing on delivering products that make it easier for customers to adopt machine learning. In his spare time, he enjoys road trips and watching Formula 1.

Read More

Registration now open for the 2020 Testing and Verification Symposium

The fourth annual Facebook Testing and Verification (TAV) Symposium brings together academia and industry in an open environment to exchange ideas and showcase the top experts from testing and verification scientific research and practice. Taking place virtually this year, the symposium is open to all testing and verification practitioners and researchers and is free to attend. Those interested in attending may submit a registration request on the registration page.

Attendees are invited to join the event over the course of three days, from December 1 to 3. The symposium’s agenda will include several talks that will offer opportunities for Q&A via the event platform.

“The TAV Symposium is all about bringing communities together: testing and verification, and academia and industry,” says Peter O’Hearn, Facebook researcher and University College London professor. “Speakers include leading academic researchers in TAV as well as folks from industry who deploy TAV techniques to practicing engineers. Both of these groups of people are pushing boundaries on what is known and on how TAV techniques can be used to help people. We believe that cross-fertilization of ideas is incredibly valuable, helping us all go further together.”

At the 2019 TAV Symposium, Facebook Software Engineer Nadia Alshahwan gave a talk on Sapienz testing and shared some of her team’s challenges and solutions with the community. For the 2020 symposium, she hopes to continue to have valuable discussions with attendees. “I can’t wait to see what this year will bring,” says Alshahwan. “The TAV Symposium has allowed my team to discover new collaborations and gain valuable feedback from this diverse audience.”

Below is the list of confirmed speakers, which can also be found on the registration page. Additional speakers will be added to the registration site as they are confirmed, leading up to the event.

Confirmed speakers

Nadia Alshahwan (Facebook)

David Clark (University College London)

David Dill (Novi at Facebook)

Alistair Donaldson (Google; Imperial College London)

Philippa Gardner (Imperial College London)

Mark Harman (Facebook; University College London)

John Hughes (Quviq AB; Chalmers University, Gothenburg)

Bryan O’Sullivan (Facebook)

Sukyoung Ryu (KAIST)

For more information about speakers, including full bios and topics, visit the registration page.

To learn more about TAV research at Facebook, watch the 2018 TAV RFP video and read the summaries from the 2018 TAV Symposium and the 2019 TAV Symposium.

The post Registration now open for the 2020 Testing and Verification Symposium appeared first on Facebook Research.

Read More

How DeepMap optimizes their video inference workflow with Amazon SageMaker Processing

How DeepMap optimizes their video inference workflow with Amazon SageMaker Processing

Although we might think the world is already sufficiently mapped thanks to global satellite imagery and street views, the picture is far from complete because much of the world is still uncharted territory. Moreover, those maps are designed for humans and can’t be consumed by autonomous vehicles, which need a very different kind of map technology with much higher precision.

DeepMap, a Palo Alto startup, is the leading technology provider of HD mapping and localization services for autonomous vehicles. These two services are integrated to provide high-precision localization maps, down to centimeter-level precision. This demands processing a high volume of data to maintain precision and localization accuracy. In addition, road conditions can change minute to minute, so the maps guiding self-driving cars have to update in real time. DeepMap draws on years of experience in mapping server development and uses the latest big data, machine learning (ML), and AI technology to build out their video inferencing and mapping pipeline.

In this post, we describe how DeepMap is revamping their video inference workflow by using Amazon SageMaker Processing, a customizable data processing and model evaluation feature, to streamline their workload by reducing complexity, processing time, and cost. We start out by describing the current challenges DeepMap is facing. Then we go over the proof of concept (POC) implementation and production architecture for the new solution using Amazon SageMaker Processing. Finally, we conclude with the performance improvements they achieved with this solution.

Challenges

DeepMap’s current video inference pipeline needs to process large amounts of video data collected by their cars, which are equipped with cameras and LIDAR laser scanning devices and drive on streets and freeways to collect video and image data. It’s a complicated, multi-step, batch processing workflow. The following diagram shows the high-level architecture view of the video processing and inferencing workflow.


With their previous workflow architecture, DeepMap discovered a scalability issue that increased processing time and cost due to the following reasons:

  • Multiple steps and separate batch processing stages, which could be error-prone and interrupt the workflow
  • Additional storage required in Amazon Simple Storage Service (Amazon S3) for intermediate steps
  • Sequential processing steps that prolonged the total time to complete inference

DeepMap’s infrastructure team recognized the issue and approached the AWS account team for guidance on how to better optimize their workflow using AWS services. The problem was originally presented as a workflow orchestration issue, and a few AWS services for orchestrating multi-step workflows were proposed and discussed.

However, after a further deep dive into their objectives and requirements, they determined that these services addressed the multiple steps coordination issue but not the storage and performance optimization objectives. Also, DeepMap wanted to keep the solution in the realm of the Amazon SageMaker ML ecosystem, if possible. After a debrief and further engagement with the Amazon SageMaker product team, a recently released Amazon SageMaker feature—Amazon SageMaker Processing—was proposed as a viable and potentially best fit solution for the problem.

Amazon SageMaker Processing comes to the rescue

Amazon SageMaker Processing lets you easily run the preprocessing, postprocessing, and model evaluation workloads on a fully managed infrastructure. Besides the full set of additional data and processing capabilities, Amazon SageMaker Processing is particularly attractive and promising for DeepMap’s problem at hand because of its flexibility in the following areas:

  • Setting up your customized container environment, also known as bring your own container (BYOC)
  • Custom coding to integrate with other application APIs that reside in your VPCs

These were the key functionalities DeepMap was looking for to redesign and optimize their current inference workflow. They quickly agreed on a proposal to explore Amazon SageMaker Processing and move forward as a proof of concept (POC).

Solution POC

The following diagram shows the POC architecture, which illustrates how a container in Amazon SageMaker Processing can make real-time API calls to a private VPC endpoint. The full architecture of the new video inference workload is depicted in the next section.

 

The POC demonstration includes the following implementation details:

  • Sample source data – Two video files (from car view). The following images show examples of Paris and Beijing streets.


  • Data stores – Two S3 buckets:
    • Source video bucket – s3://sourcebucket/input
    • Target inference result bucket – s3://targetbucket/output
  • Custom container – An AWS pre-built deep learning container based on MXNET with other needed packages and the pretrained model.
  • Model – A pre-trained semantic segmentation GluonCV model from the GluonCV model zoo. GluonCV provides implementations of state-of-the-art deep learning algorithms in computer vision. It aims to help engineers, researchers, and students quickly prototype products, validate new ideas, and learn computer vision. The GluonCV model zoo contains six kinds of pretrained models: classification, object detection, segmentation, pose estimation, action recognition, and depth prediction. For this post, we use deeplab_resnet101_citys, which was trained on the Cityscapes dataset and focuses on semantic understanding of urban street scenes, so this model is suitable for car view images (see the sketch after this list). The following images are a sample of segmentation inference; we can see the model assigned red for people and blue for cars.


  • Amazon SageMaker Processing environment – Two instances (p3.2xlarge) configured for private access to the VPC API endpoint.
  • Mock API server – A web server in a private VPC mimicking DeepMap’s video indexing APIs. When invoked, it responds with a “Hello, Builders!” message.
  • Custom processing script – An API call to the mock API endpoint in the private VPC to extract frames from the videos, perform segmentation model inference on the frames, and store the results.
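
The model itself can also be exercised outside the processing container. The following is a minimal sketch, based on the standard GluonCV segmentation example, of running the pretrained deeplab_resnet101_citys model on a single frame; street_frame.jpg is a placeholder filename.

import mxnet as mx
from mxnet import image
import gluoncv
from gluoncv.data.transforms.presets.segmentation import test_transform
from gluoncv.utils.viz import get_color_pallete

ctx = mx.cpu()  # use mx.gpu(0) on a GPU instance such as ml.p3.2xlarge

# Load the pretrained Cityscapes semantic segmentation model
model = gluoncv.model_zoo.get_model('deeplab_resnet101_citys', pretrained=True, ctx=ctx)

# Placeholder frame extracted from a video clip
img = image.imread('street_frame.jpg')
img = test_transform(img, ctx)

# Run inference and take the per-pixel argmax over the class scores
output = model.predict(img)
predict = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()

# Save a color-coded segmentation mask using the Cityscapes palette
mask = get_color_pallete(predict, 'citys')
mask.save('street_frame_mask.png')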

Amazon SageMaker Processing launches the instances you specified, downloads the container image and datasets, runs your script, and uploads the results to the S3 bucket automatically. We use the Amazon SageMaker Python SDK to launch the processing job. See the following code:

from sagemaker.network import NetworkConfig
from sagemaker.processing import (ProcessingInput, ProcessingOutput,
                                  ScriptProcessor)

instance_count = 2
"""
This network_config is for Enable VPC mode, which means the processing instance could access resources within vpc
change to your security_group_id and subnets_id
security_group_ids = ['YOUR_SECURITY_GROUP_ID']
subnets = ["YOUR_SUBNETS_ID1","YOUR_SUBNETS_ID2"]
"""
security_group_ids = vpc_security_group_ids
subnets = vpc_subnets

network_config = NetworkConfig(enable_network_isolation=False, 
                               security_group_ids=security_group_ids, 
                               subnets=subnets)

video_formats = [".mp4", ".avi"]
image_width = 1280
image_height = 720
frame_time_interval = 1000

script_processor = ScriptProcessor(command=['python3'],
                image_uri=processing_repository_uri,
                role=role,
                instance_count=instance_count,
                instance_type='ml.p3.2xlarge',
                network_config=network_config)

# with S3 shard key
script_processor.run(code='processing.py',
                      inputs=[ProcessingInput(
                        source=input_data,
                        destination='/opt/ml/processing/input_data',
                        s3_data_distribution_type='ShardedByS3Key')],
                      outputs=[ProcessingOutput(destination=output_data,
                                                source='/opt/ml/processing/output_data',
                                                s3_upload_mode = 'Continuous')],
                      arguments=['--api_server_address', vpc_api_server_address,
                                '--video_formats', "".join(video_formats),
                                '--image_width', str(image_width),
                                '--image_height', str(image_height),
                                '--frame_time_interval', str(frame_time_interval)]
                     )
script_processor_job_description = script_processor.jobs[-1].describe()
print(script_processor_job_description)

We use the ShardedByS3Key mode for s3_data_distribution_type to leverage the Amazon SageMaker feature that shards the objects by Amazon S3 prefix, so each instance receives 1/N of the total objects for faster parallel processing. Because this video inference job is just one part of DeepMap’s entire map processing workflow, s3_upload_mode is set to Continuous to streamline the subsequent processing tasks. For the complete POC sample code, see the GitHub repo.

The POC was successfully completed, and the positive results demonstrated the capability and flexibility of Amazon SageMaker Processing. It met the following requirements for DeepMap:

  • Dynamically invoke an API in a private VPC endpoint for real-time custom processing needs
  • Reduce the unnecessary intermediate storage for video frames

Production solution architecture

With the positive results from the demonstration of the POC, DeepMap’s team decided to re-architect their current video process and inference workflow by using Amazon SageMaker Processing. The following diagram depicts the high-level architecture of the new workflow.


The DeepMap team initiated a project to implement this new architecture. The initial production development setting is as follows:

  • Data source – Camera streams (30fps) collected from the cars are chopped and stored as 1-second h264 encoded video clips. All video clips are stored in the source S3 buckets.
  • Video processing – Within a video clip (of 30 frames in total), only a fraction of key frames are useful for map making. The relevant key frame information is stored in DeepMap’s video metadata database. Video processing code runs in an Amazon SageMaker Processing container and calls a video indexing API via a VPC private endpoint to retrieve the relevant key frames for inferencing.
  • Deep learning inference – Deep learning inference code queries the key frame information from the database, decodes the key frames in memory, and applies the deep learning model using the semantic segmentation algorithm to produce the results and store the output in the S3 result bucket. The inference code also runs within the Amazon SageMaker Processing custom containers.
  • Testing example – We use a video clip of a collected road scene in .h264 format (000462.h264). Key frame metadata information about the video clip is stored in the database. The following is an excerpt of the key frame metadata information dumped from the database:
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4686
      }
      camera_video_data {
        stream_index: 13862
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }
    image_paths {
      image_id {
        track_id: 12728
        sample_id: 4687
      }
      camera_video_data {
        stream_index: 13864
        key_frame_index: 13860
        video_path: "s3://sensor-data/4e78__update1534_vehicle117_2020-06-11__upload11960/image_00/rectified-video/000462.h264"
      }
    }

A relevant key frame is returned from the video index API call for the subsequent inference task (such as the following image).

The deep learning inference result is performed using the semantic segmentation algorithm running in Amazon SageMaker Processing to determine the proper lane line from the key frame. Using the preceding image as input, we receive the following output.

Performance improvements

As of this writing, DeepMap has already seen the expected performance improvements using the newly optimized workflow, and been able to achieve the following:

  • Streamline and reduce the complexity of the current video-to-image preprocessing workflow. The real-time video indexing API call has reduced two steps to one.
  • Reduce the total time for video preprocessing and image DL inferencing. Through the streamlined process, they can now run decoding and deep learning inference on different processors (CPU and GPU) in different threads (see the sketch after this list), potentially saving 100% of the video preprocessing time (as long as the inference takes longer than the video decoding, which is true in most cases).
  • Reduce the intermediate storage space needed to store the images for the inference job. Each camera frame (1920×1200, encoded in JPEG format) takes 500 KB to store, but a 1-second video clip (x264 encoded) with 30 continuous frames takes less than 2 MB of storage (thanks to the video encoding). So, the storage reduction rate is about (1 – 2MB / (500KB * 30)) ~= 85%.
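
The following is a minimal sketch of that producer-consumer pattern; decode_key_frames, run_segmentation, and video_clips are hypothetical placeholders, and the sketch only illustrates how CPU-bound decoding and GPU-bound inference can overlap in separate threads.

import queue
import threading

frame_queue = queue.Queue(maxsize=32)  # bounded so decoding cannot run far ahead of inference
SENTINEL = object()                    # signals that decoding is finished

def decode_worker(video_clips):
    # CPU-bound: decode key frames and hand them to the inference thread
    for clip in video_clips:
        for frame in decode_key_frames(clip):       # hypothetical helper
            frame_queue.put(frame)
    frame_queue.put(SENTINEL)

def inference_worker(results):
    # GPU-bound: run semantic segmentation while decoding continues on the CPU
    while True:
        frame = frame_queue.get()
        if frame is SENTINEL:
            break
        results.append(run_segmentation(frame))     # hypothetical helper

results = []
producer = threading.Thread(target=decode_worker, args=(video_clips,))
consumer = threading.Thread(target=inference_worker, args=(results,))
producer.start()
consumer.start()
producer.join()
consumer.join()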

The following table summarizes the overall improvements of the new optimized workflow.

Measurement | Before | After | Performance improvement
Processing steps | Two steps | One step | 50% simpler workflow
Processing time | Video preprocessing to extract key frames | Parallel processing (with multiple threads in Amazon SageMaker Processing containers) | 100% reduction of video preprocessing time
Storage | Intermediate S3 buckets for preprocessed video frames | None (in-memory) | 85% reduction
Compute | Separate compute resources for video preprocessing using Amazon Elastic Compute Cloud (Amazon EC2) | None (running in the Amazon SageMaker Processing container) | 100% reduction of video preprocessing compute resources

Conclusion

In this post, we described how DeepMap used the new Amazon SageMaker Processing capability to redesign their video inference workflow to achieve a more streamlined workflow. Not only did they save on storage costs, they also improved their total processing time.

Their successful use case also demonstrates the flexibility and scalability of Amazon SageMaker Processing, which can help you build more scalable ML processing and inferencing workloads. For more information about integrating Amazon SageMaker Processing, see Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation. For more information about using services such as Step Functions to build more efficient ML workflows, see Building machine learning workflows with Amazon SageMaker Processing jobs and AWS Step Functions.

Try out Amazon SageMaker Processing today to further optimize your ML workloads.

 


About the Authors

Frank Wang is a Startup Senior Solutions Architect at AWS. He has worked with a wide range of customers with focus on Independent Software Vendors (ISVs) and now startups. He has several years of engineering leadership, software architecture, and IT enterprise architecture experiences, and now focuses on helping customers through their cloud journey on AWS.

 

 

Shishuai Wang is an ML Specialist Solutions Architect working with the AWS WWSO team. He works with AWS customers to help them adopt machine learning on a large scale. He enjoys watching movies and traveling around the world.

 

 

 

Yu Zhang is a staff software engineer and the technical lead of the Deep Learning platform at DeepMap. His research interests include Large-Scale Image Concept Learning, Image Retrieval, and Geo and Climate Informatics.

 

 

 

Tom Wang is a founding team member and Director of Engineering at DeepMap, managing their cloud infrastructure, backend services, and map pipelines. Tom has 20+ years of industry experience in database storage systems, distributed big data processing, and map data infrastructure. Prior to DeepMap, Tom was a tech lead at Apple Maps and key contributor to Apple’s map data infrastructure and map data validation platform. Tom holds an MS degree in computer science from the University of Wisconsin-Madison.

Read More

Supercomputing Chops: China’s Tsinghua Takes Top Flops in SC20 Student Cluster Battle

Supercomputing Chops: China’s Tsinghua Takes Top Flops in SC20 Student Cluster Battle

Props to team top flops.

Virtual this year, the SC20 Student Cluster Competition was still all about teams vying for top supercomputing performance in the annual battle for HPC bragging rights.

That honor went to Beijing’s Tsinghua University, whose six-member undergraduate student team clocked in at 300 teraflops of processing performance.

A one teraflop computer can process one trillion floating-point operations per second.

The Virtual Student Cluster Competition was this year’s battleground for 19 teams. Competitors consisted of either high school or undergraduate students. Teams were made up of six members, an adviser and vendor partners.

Real-World Scenarios

In the 72-hour competition, student teams designed and built virtual clusters running NVIDIA GPUs in the Microsoft Azure cloud. Students completed a set of benchmarks and real-world scientific workloads.

Teams ran the GROMACS molecular dynamics application, tackling COVID-19 research. They also ran the CESM application to work on optimizing climate modeling code. The “reproducibility challenge” called on the teams to replicate results from an SC19 research paper.

Among other hurdles, teams were tossed a surprise exascale computing project mini-application, miniVite, to test their chops at compiling, running and optimizing.

A leaderboard tracked performance results of their submissions and the amount of money spent on Microsoft Azure as well as the burn rate of their spending by the hour on cloud resources.

Roller-Coaster Computing Challenges

The Georgia Institute of Technology competed for its second time. This year’s squad, dubbed Team Phoenix, had the good fortune of landing advisor Vijay Thakkar, a Gordon Bell Prize nominee this year.

Half of the team members were teaching assistants for introductory systems courses at Georgia Tech, said team member Sudhanshu Agarwal.

Georgia Tech used NVIDIA GPUs “wherever it was possible, as GPUs reduced computation time,” said Agarwal.

“We had a lot of fun this year and look forward to participating in SC21 and beyond,” he said.

Pan Yueyang, a junior in computer science at Peking University, joined his university’s supercomputing team before taking the leap to participate in the SC20 battle. But it was full of surprises, he noted.

He said that during the competition his team ran into a series of unforeseen hiccups. “Luckily it finished as required and the budget was slightly below the limitation,” he said.

Jacob Xiaochen Li, a junior in computer science at the University of California, San Diego, said his team was relying on NVIDIA GPUs for the MemXCT portion of the competition to reproduce the scaling experiment along with memory bandwidth utilization. “Our results match the original chart closely,” he said, noting there were some hurdles along the way.

Po Hao Chen, a sophomore in computer science at Boston University, said he committed to the competition because he’s always enjoyed algorithmic optimization. Like many, he had to juggle the competition with the demands of courses and exams.

“I stayed up for three whole days working on the cluster,” he said. “And I really learned a lot from this competition.”

Teams and Flops

Tsinghua University, China
300 TFLOPS

ETH Zurich
129 TFLOPS

Southern University of Science and Technology
120 TFLOPS

Texas A&M University
113 TFLOPS

Georgia Institute of Technology
108 TFLOPS

Nanyang Technological University, Singapore
105 TFLOPS

University of Warsaw
75.0 TFLOPS

University of Illinois
71.6 TFLOPS

Massachusetts Institute of Technology
64.9 TFLOPS

Peking University
63.8 TFLOPS

University of California, San Diego
53.9 TFLOPS

North Carolina State University
44.3 TFLOPS

Clemson University
32.6 TFLOPS

Friedrich-Alexander University Erlangen-Nuremberg
29.0 TFLOPS

Northeastern University
21.1 TFLOPS

Shanghai Jiao Tong University
19.9 TFLOPS

ShanghaiTech University
14.4 TFLOPS

University of Texas
13.1 TFLOPS

Wake Forest University
9.172 TFLOPS

 

The post Supercomputing Chops: China’s Tsinghua Takes Top Flops in SC20 Student Cluster Battle appeared first on The Official NVIDIA Blog.

Read More

The Language Interpretability Tool (LIT): Interactive Exploration and Analysis of NLP Models

The Language Interpretability Tool (LIT): Interactive Exploration and Analysis of NLP Models

Posted by James Wexler, Software Developer and Ian Tenney, Software Engineer, Google Research

As natural language processing (NLP) models become more powerful and are deployed in more real-world contexts, understanding their behavior is becoming increasingly critical. While advances in modeling have brought unprecedented performance on many NLP tasks, many research questions remain about not only the behavior of these models under domain shift and adversarial settings, but also their tendencies to behave according to social biases or shallow heuristics.

For any new model, one might want to know in which cases a model performs poorly, why a model makes a particular prediction, or whether a model will behave consistently under varying inputs, such as changes to textual style or pronoun gender. But, despite the recent explosion of work on model understanding and evaluation, there is no “silver bullet” for analysis. Practitioners must often experiment with many techniques, looking at local explanations, aggregate metrics, and counterfactual variations of the input to build a better understanding of model behavior, with each of these techniques often requiring its own software package or bespoke tool. Our previously released What-If Tool was built to address this challenge by enabling black-box probing of classification and regression models, thus enabling researchers to more easily debug performance and analyze the fairness of machine learning models through interaction and visualization. But there was still a need for a toolkit that would address challenges specific to NLP models.

With these challenges in mind, we built and open-sourced the Language Interpretability Tool (LIT), an interactive platform for NLP model understanding. LIT builds upon the lessons learned from the What-If Tool with greatly expanded capabilities, which cover a wide range of NLP tasks including sequence generation, span labeling, classification and regression, along with customizable and extensible visualizations and model analysis.

LIT supports local explanations, including salience maps, attention, and rich visualizations of model predictions, as well as aggregate analysis including metrics, embedding spaces, and flexible slicing. It allows users to easily hop between visualizations to test local hypotheses and validate them over a dataset. LIT provides support for counterfactual generation, in which new data points can be added on the fly, and their effect on the model visualized immediately. Side-by-side comparison allows for two models, or two individual data points, to be visualized simultaneously. More details about LIT can be found in our system demonstration paper, which was presented at EMNLP 2020.

Exploring a sentiment classifier with LIT.

Customizability
In order to better address the broad range of users with different interests and priorities that we hope will use LIT, we’ve built the tool to be easily customizable and extensible from the start. Using LIT on a particular NLP model and dataset only requires writing a small bit of Python code. Custom components, such as task-specific metrics calculations or counterfactual generators, can be written in Python and added to a LIT instance through our provided APIs. Also, the front end itself can be customized, with new modules that integrate directly into the UI. For more on extending the tool, check out our documentation on GitHub.
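
As a rough illustration of that wiring, the following is a minimal sketch based on the patterns in the LIT documentation; the toy dataset and model below are placeholders for your own lit_nlp Dataset and Model subclasses, and the prediction logic is a stub.

from lit_nlp import dev_server
from lit_nlp import server_flags
from lit_nlp.api import dataset as lit_dataset
from lit_nlp.api import model as lit_model
from lit_nlp.api import types as lit_types

LABELS = ['0', '1']

class ToySentimentDataset(lit_dataset.Dataset):
    """A tiny in-memory dataset; real code would load your own data."""
    def __init__(self):
        self._examples = [
            {'sentence': 'A great movie.', 'label': '1'},
            {'sentence': 'Terribly boring.', 'label': '0'},
        ]

    def spec(self):
        return {'sentence': lit_types.TextSegment(),
                'label': lit_types.CategoryLabel(vocab=LABELS)}

class ToySentimentModel(lit_model.Model):
    """A stub model that returns fixed scores; swap in your real classifier."""
    def input_spec(self):
        return {'sentence': lit_types.TextSegment()}

    def output_spec(self):
        return {'probas': lit_types.MulticlassPreds(vocab=LABELS, parent='label')}

    def predict_minibatch(self, inputs):
        # A real model would run inference here
        return [{'probas': [0.5, 0.5]} for _ in inputs]

def main():
    models = {'toy_sentiment': ToySentimentModel()}
    datasets = {'toy_data': ToySentimentDataset()}
    lit_demo = dev_server.Server(models, datasets, **server_flags.get_flags())
    lit_demo.serve()

if __name__ == '__main__':
    main()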

Demos
To illustrate some of the capabilities of LIT, we have created a few demos using pre-trained models. The full list is available on the LIT website, and we describe two of them here:

  • Sentiment analysis: In this example, a user can explore a BERT-based binary classifier that predicts if a sentence has positive or negative sentiment. The demo uses the Stanford Sentiment Treebank of sentences from movie reviews to demonstrate model behavior. One can examine local explanations using saliency maps provided by a variety of techniques (such as LIME and integrated gradients), and can test model behavior with perturbed (counterfactual) examples using techniques such as back-translation, word replacement, or adversarial attacks. These techniques can help pinpoint under what scenarios a model fails, and whether those failures are generalizable, which can then be used to inform how best to improve a model.
    Analyzing token-based salience of an incorrect prediction. The word “laughable” seems to be incorrectly raising the positive sentiment score of this example.
  • Masked word prediction: Masked language modeling is a “fill-in-the-blank” task, where the model predicts different words that could complete a sentence. For example, given the prompt, “I took my ___ for a walk”, the model might predict a high score for “dog.” In LIT one can explore this interactively by typing in sentences or choosing from a pre-loaded corpus, and then clicking specific tokens to see what a model like BERT understands about language, or about the world.
    Interactively selecting a token to mask, and viewing a language model’s predictions.

LIT in Practice and Future Work
Although LIT is a new tool, we have already seen the value that it can provide for model understanding. Its visualizations can be used to find patterns in model behavior, such as outlying clusters in embedding space, or words with outsized importance to the predictions. Exploration in LIT can test for potential biases in models, as demonstrated in our case study of LIT exploring gender bias in a coreference model. This type of analysis can inform next steps in improving model performance, such as applying MinDiff to mitigate systemic bias. It can also be used as an easy and fast way to create an interactive demo for any NLP model.

Check out the tool either through our provided demos, or by bringing up a LIT server for your own models and datasets. The work on LIT has just started, and there are a number of new capabilities and refinements planned, including the addition of new interpretability techniques from cutting edge ML and NLP research. If there are other techniques that you’d like to see added to the tool, please let us know! Join our mailing list to stay up to date as LIT evolves. And as the code is open-source, we welcome feedback on and contributions to the tool.

Acknowledgments
LIT is a collaborative effort between the Google Research PAIR and Language teams. This post represents the work of the many contributors across Google, including Andy Coenen, Ann Yuan, Carey Radebaugh, Ellen Jiang, Emily Reif, Jasmijn Bastings, Kristen Olson, Leslie Lai, Mahima Pushkarna, Sebastian Gehrmann, and Tolga Bolukbasi. We would like to thank all those who contributed to the project, both inside and outside Google, and the teams that have piloted its use and provided valuable feedback.

Read More