Getting a batch job completion message from Amazon Translate


Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Neural machine translation is a form of language translation automation that uses deep learning models to deliver more accurate and natural-sounding translation than traditional statistical and rule-based translation algorithms. The translation service is trained on a wide variety of content across different use cases and domains to perform well on many kinds of content.

The Amazon Translate asynchronous batch processing capability enables organizations to translate a large collection of text or HTML documents from one language to another with a single API call. The ability to process data at scale is becoming important to organizations across all industries. In this blog post, we demonstrate how to build a notification mechanism that messages you when a batch translation job is complete. This can enable end-to-end automation by triggering other Lambda functions or integrating with Amazon SQS for any post-processing steps.

Solution overview

The following diagram illustrates the high-level architecture of the solution.

Architecture diagram depicting polling mechanism for batch translation job

The solution contains the following steps:

  1. A user starts a batch translation job.
  2. An Amazon CloudWatch Events rule picks up the event and triggers the AWS Step Functions state machine.
  3. The Job Poller AWS Lambda function polls the job status at a fixed interval (60 seconds in the state machine definition later in this post).
  4. When the Amazon Translate batch job is complete, an email notification is sent via an Amazon Simple Notification Service (Amazon SNS) topic.

To implement this solution, you must create the following:

  1. An SNS topic
  2. An AWS Identity and Access Management (IAM) role
  3. A Lambda function
  4. A Step Functions state machine
  5. A CloudWatch Events rule

Creating an SNS topic

To create an SNS topic, complete the following steps:

  1. On the Amazon SNS console, create a new topic.
  2. For Topic name, enter a name (for example, TranslateJobNotificationTopic).
  3. Choose Create topic.

You can now see the TranslateJobNotificationTopic page. The Details section displays the topic’s name, ARN, display name (optional), and the AWS account ID of the Topic owner.

  1. In the Details section, copy the topic ARN to the clipboard (arn:aws:sns:us-east-1:123456789012:TranslateJobNotificationTopic).
  2. On the left navigation pane, choose Subscriptions.
  3. Choose Create subscription.
  4. On the Create subscription page, enter the topic ARN of the topic you created earlier (arn:aws:sns:us-east-1:123456789012:TranslateJobNotificationTopic).
  5. For Protocol, select Email.
  6. For Endpoint, enter an email address that can receive notifications.
  7. Choose Create subscription.

For email subscriptions, you have to first confirm the subscription by choosing the confirm subscription link in the email you received.
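
The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 is installed and default credentials and a Region are configured. The function name `create_topic_with_email` and the example email address are hypothetical; only the `create_topic` and `subscribe` calls come from the SNS API.

```python
def create_topic_with_email(sns_client, topic_name, email_address):
    """Create an SNS topic and subscribe an email endpoint to it.

    sns_client is expected to be a boto3 SNS client, for example
    boto3.client("sns"). Returns the topic ARN.
    """
    # create_topic returns the ARN of the new (or already existing) topic
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # The subscription stays pending until the recipient confirms it by email
    sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email_address,
    )
    return topic_arn


def main():
    # Imported here so the helper above can be exercised without AWS access
    import boto3

    arn = create_topic_with_email(
        boto3.client("sns"),
        "TranslateJobNotificationTopic",
        "you@example.com",  # hypothetical address; replace with your own
    )
    print(arn)
```

Calling `main()` prints the topic ARN that you need later for the state machine definition. The email subscription still has to be confirmed from the recipient's inbox.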

Creating an IAM role for the Lambda function

To create an IAM role, complete the following steps. For more information, see Creating an IAM Role.

  1. On the IAM console, choose Policies.
  2. Choose Create Policy.
  3. On the JSON tab, enter the following IAM policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "translate:DescribeTextTranslationJob",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/lambda/TranslateJobStatusPoller:*"
        }
    ]
}

Update the resource property for CloudWatch Logs permission to reflect your configuration for Region, AWS account ID, and the Lambda function name.

  1. Choose Review policy.
  2. Enter a name (MyLambdaPolicy) for this policy and choose Create policy.
  3. Record the name of this policy for later steps.
  4. On the left navigation pane, choose Roles.
  5. Choose Create role.
  6. On the Select role type page, choose Lambda and the Lambda use case.
  7. Choose Next: Permissions.
  8. Filter policies by the policy name that you just created, and select the check-box.
  9. Choose Next: Tags.
  10. Add an appropriate tag.
  11. Choose Next: Review.
  12. Give this IAM role an appropriate name, and note it for future use.
  13. Choose Create role.
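
Because the CloudWatch Logs resource in the policy depends on your Region, account ID, and function name, it can help to generate the policy document rather than edit it by hand. The following is a small sketch; the helper name `build_poller_policy` is hypothetical, and the statements mirror the policy shown in step 3 (without the optional `Sid`).

```python
import json


def build_poller_policy(region, account_id, function_name):
    """Return the IAM policy document for the job poller Lambda function."""
    log_group_arn = (
        f"arn:aws:logs:{region}:{account_id}:"
        f"log-group:/aws/lambda/{function_name}:*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the function read the status of a batch translation job
                "Effect": "Allow",
                "Action": "translate:DescribeTextTranslationJob",
                "Resource": "*",
            },
            {
                # Lets the function write its own logs to CloudWatch Logs
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": log_group_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_poller_policy(
        "us-east-1", "123456789012", "TranslateJobStatusPoller"
    )
    print(json.dumps(policy, indent=4))
```

You can paste the printed JSON into the JSON tab of the policy editor.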

Creating a Lambda function

To create a Lambda function, complete the following steps. For more information, see Create a Lambda Function with the Console.

  1. On the Lambda console, choose Author from scratch.
  2. For Function Name, enter the name of your function (for example, TranslateJobStatusPoller).
  3. For Runtime, choose Python 3.8.
  4. For Execution role, select Use an existing role.
  5. Choose the IAM role you created in the previous step.
  6. Choose Create Function.
  7. Remove the default function and enter the following code into the Function Code window:
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
# either express or implied. See the License for the specific language governing permissions
# and limitations under the License.
# Description: This Lambda function is part of a Step Functions state machine that checks
# the status of an Amazon Translate batch job.
# Author: Sudhanshu Malhotra
import logging

import boto3
from botocore.exceptions import ClientError

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


def msgpublish(jobid):
    """Return the current JobStatus of the given Amazon Translate batch job."""
    client = boto3.client('translate')
    try:
        response = client.describe_text_translation_job(JobId=jobid)
        status = response['TextTranslationJobProperties']['JobStatus']
        logger.debug('Job Status is: %s', status)
        return status
    except ClientError as e:
        logger.error('An error occurred: %s', e)
        # Re-raise so the Retry block of the state machine can handle the failure
        raise


def lambda_handler(event, context):
    # The InputPath of the state machine passes only the job ID string as the event
    logger.setLevel(logging.DEBUG)
    logger.debug('Job ID is: %s', event)
    return msgpublish(event)
  1. Choose Save.

Creating a state machine

To create a state machine, complete the following steps. For more information, see Create a State Machine.

  1. On the Step Functions console, on the Define state machine page, choose Start with a template.
  2. Choose Hello world.
  3. Under Type, choose Standard.
  4. Under Definition, enter the following Amazon States Language. Make sure to replace the Lambda function ARN and the SNS topic ARN with your own.
{
  "Comment": "Polling step function for translate job complete",
  "StartAt": "LambdaPoll",
  "States": {
    "LambdaPoll": {
      "Type": "Task",
      "Resource": "<ARN of the Lambda Function created in step 3>",
      "InputPath": "$.detail.responseElements.jobId",
      "ResultPath": "$.detail.responseElements.jobStatus",
      "Next": "Job Complete?",
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "IntervalSeconds": 1,
          "MaxAttempts": 3,
          "BackoffRate": 2
        }
      ]
    },
    "Job Complete?": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.detail.responseElements.jobStatus",
          "StringEquals": "IN_PROGRESS",
          "Next": "Wait X Seconds"
        },
        {
          "Variable": "$.detail.responseElements.jobStatus",
          "StringEquals": "SUBMITTED",
          "Next": "Wait X Seconds"
        },
        {
          "Variable": "$.detail.responseElements.jobStatus",
          "StringEquals": "COMPLETED",
          "Next": "Notify"
        },
        {
          "Variable": "$.detail.responseElements.jobStatus",
          "StringEquals": "FAILED",
          "Next": "Notify"
        },
        {
          "Variable": "$.detail.responseElements.jobStatus",
          "StringEquals": "STOPPED",
          "Next": "Notify"
        }
      ],
      "Default": "Wait X Seconds"
    },
    "Wait X Seconds": {
      "Type": "Wait",
      "Seconds": 60,
      "Next": "LambdaPoll"
    },
    "Notify": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "Subject": "Translate Batch Job Notification",
        "Message": {
          "JobId.$": "$.detail.responseElements.jobId",
          "S3OutputLocation.$": "$.detail.requestParameters.outputDataConfig.s3Uri",
          "JobStatus.$": "$.detail.responseElements.jobStatus"
        },
        "MessageAttributes": {
          "JobId": {
            "DataType": "String",
            "StringValue.$": "$.detail.responseElements.jobId"
          },
          "S3OutputLocation": {
            "DataType": "String",
            "StringValue.$": "$.detail.requestParameters.outputDataConfig.s3Uri"
          }
        },
        "TopicArn": "<ARN of the SNS topic created in step 1>"
      },
      "End": true
    }
  }
}
  1. Use the graph in the Visual Workflow pane to check that your Amazon States Language code describes your state machine correctly. You should see something like the following screenshot.
    Amazon State machine depicting various states of batch translation job
  1. Choose Next.
  2. For Name, enter a name for the state machine.
  3. Under Permissions, select Create new role.

You now see an info block with the details of the role and the associated permissions.

IAM policy screenshot for State machine

  1. Choose Create state machine.
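
To make the control flow of the state machine easier to follow, here is a plain Python sketch of the same loop: poll, decide, wait, and poll again. It is an illustration only, not a replacement for Step Functions. The function name `poll_until_terminal` and the `max_polls` safety limit are hypothetical additions; the status names and the 60-second wait come from the definition above.

```python
import time

# Statuses that the "Job Complete?" Choice state routes to "Notify"
TERMINAL_STATES = {"COMPLETED", "FAILED", "STOPPED"}


def poll_until_terminal(get_status, job_id, wait_seconds=60,
                        sleep=time.sleep, max_polls=None):
    """Poll get_status(job_id) until it returns a terminal status.

    get_status plays the role of the LambdaPoll task. Any status that is
    not terminal (including SUBMITTED, IN_PROGRESS, or an unexpected value)
    leads to a wait, which mirrors the Default branch of the Choice state.
    """
    polls = 0
    while True:
        status = get_status(job_id)        # "LambdaPoll"
        polls += 1
        if status in TERMINAL_STATES:      # "Job Complete?" -> "Notify"
            return status
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"job {job_id} not finished after {polls} polls")
        sleep(wait_seconds)                # "Wait X Seconds"
```

One consequence of the Default branch is worth keeping in mind: a status that is not one of the five listed values keeps the loop waiting. If your jobs can report other final statuses (the Amazon Translate API reference lists `COMPLETED_WITH_ERROR`, for example), consider adding a Choice rule for them.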

Creating a CloudWatch Events rule

To create a CloudWatch Events rule, complete the following steps. This rule matches the StartTextTranslationJob API call that a user makes and triggers the Step Functions state machine (set as a target).

  1. On the CloudWatch console, choose Rules.
  2. Choose Create rule.
  3. On the Step 1: Create rule page, under Event Source, select Event Pattern.
  4. Choose Build custom event pattern from the drop-down menu.
  5. Enter the following code into the preview pane:
{
    "source": [ "aws.translate" ],
    "detail-type": [ "AWS API Call via CloudTrail" ],
    "detail": {
        "eventSource": [ "translate.amazonaws.com" ],
        "eventName": [ "StartTextTranslationJob" ]
    }
}
  1. For Targets, select Step Functions state machine.
  2. Select the state machine you created earlier.
  3. For permission to send events to Step Functions, select Create a new role for this specific resource.
  4. Choose Configure details.
  5. On the Step 2: Configure rule details page, enter a name and description for the rule.
  6. For State, select Enabled.
  7. Choose Create rule.
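
The following sketch shows, in simplified form, how the event pattern selects events and how the JSON paths used in the state machine resolve against a matching event. The sample event is a trimmed, hypothetical example: its `requestParameters` and `responseElements` fields are inferred from the paths in the state machine definition, and the bucket name and job ID are placeholders. The `matches` function covers only the exact-value matching used in this pattern, not the full event pattern syntax.

```python
PATTERN = {
    "source": ["aws.translate"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["translate.amazonaws.com"],
        "eventName": ["StartTextTranslationJob"],
    },
}

# Hypothetical, trimmed event for illustration
SAMPLE_EVENT = {
    "source": "aws.translate",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "translate.amazonaws.com",
        "eventName": "StartTextTranslationJob",
        "requestParameters": {
            "outputDataConfig": {"s3Uri": "s3://example-bucket/output/"}
        },
        "responseElements": {"jobId": "example-job-id", "jobStatus": "SUBMITTED"},
    },
}


def matches(pattern, event):
    """Return True if every field in pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf pattern: the event value must be one of the listed values
            return False
    return True


def get_path(event, dotted_path):
    """Resolve a path such as 'detail.responseElements.jobId'."""
    node = event
    for part in dotted_path.split("."):
        node = node[part]
    return node


if __name__ == "__main__":
    print(matches(PATTERN, SAMPLE_EVENT))
    print(get_path(SAMPLE_EVENT, "detail.responseElements.jobId"))
```

The path `detail.responseElements.jobId` corresponds to the `InputPath` of the LambdaPoll state, which is why the Lambda function receives only the job ID string as its event.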

Validating the solution

To test this solution, I first create an Amazon Translate batch job and provide the Amazon Simple Storage Service (Amazon S3) location of the input text, the output Amazon S3 location, the target language, and the data access service role ARN. For instructions on creating a batch translate job, see Asynchronous Batch Processing or Translating documents with Amazon Translate, AWS Lambda, and the new Batch Translate API.
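
If you prefer to start the test job from code instead of the console, the request could look like the following sketch. It assumes boto3 and a data access role that already exists. The helper names, the job name, the bucket URIs, and the role ARN in the usage note are hypothetical; the parameter names follow the `StartTextTranslationJob` API.

```python
def build_translation_job_request(job_name, input_s3_uri, output_s3_uri,
                                  role_arn, source_lang, target_lang,
                                  content_type="text/plain"):
    """Build the keyword arguments for start_text_translation_job."""
    return {
        "JobName": job_name,
        "InputDataConfig": {"S3Uri": input_s3_uri, "ContentType": content_type},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCodes": [target_lang],
    }


def start_job(translate_client, **job_settings):
    """Start the batch job and return its job ID.

    translate_client is expected to be boto3.client("translate").
    """
    request = build_translation_job_request(**job_settings)
    response = translate_client.start_text_translation_job(**request)
    return response["JobId"]
```

For example, `start_job(boto3.client("translate"), job_name="demo-job", input_s3_uri="s3://example-bucket/input/", output_s3_uri="s3://example-bucket/output/", role_arn="arn:aws:iam::123456789012:role/ExampleTranslateRole", source_lang="en", target_lang="es")` would submit a job that the CloudWatch Events rule then picks up.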

The following screenshot shows my batch job on the Translation jobs page.

Amazon translate job start screenshot of Translate console

The CloudWatch Events rule picks up the StartTextTranslationJob API and triggers the state machine. When the job is complete, I get an email notification via Amazon SNS.

Translate job complete notification email screenshot showing job status, job name and output location of the translated job

Conclusion

In this post, we demonstrated how you can use Step Functions to poll for an Amazon Translate batch job. For this use case, we configured an email notification to send when a job is complete; however, you can use this framework to trigger other Lambda functions or integrate with Amazon Simple Queue Service (Amazon SQS) for any automated postprocessing steps, enabling you to build an end-to-end automated workflow.

About the Authors


Sudhanshu Malhotra is a Boston-based Enterprise Solutions Architect for AWS. He is a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His core areas of focus are DevOps, Machine Learning, and Security. When he’s not working with customers on their journey to the cloud, he enjoys reading, hiking, and exploring new cuisines.

Siva Rajamani is a Boston-based Enterprise Solutions Architect for AWS. He enjoys working closely with customers, supporting their digital transformation and AWS adoption journey. His core areas of focus are Serverless, Application Integration, and Security. Outside of work, he enjoys outdoor activities and watching documentaries.

Announcing the winner of the AWS DeepComposer Chartbusters Spin the Model challenge


We’re excited to announce the top 10 compositions and the winner of the AWS DeepComposer Chartbusters Spin the Model challenge. AWS DeepComposer provides a creative and hands-on experience for learning generative AI and machine learning. Chartbusters is a global monthly challenge where you can use AWS DeepComposer to create original compositions and compete to top the charts and win prizes.

The Spin the Model challenge that ran from July 31, 2020, to August 23, 2020, required developers to create a custom genre model.

Top 10 compositions

The competition was intense! The high-quality submissions made it challenging for our judges to select the chart-toppers. Our panel of experts, Kesha Williams, Umut Isik, and Wayne Chi selected the top 10 ranked compositions by evaluating the quality of the music and creativity. They also checked the submitted notebooks to ensure that the compositions were generated by custom models.

The winner of the Spin the Model challenge is… (cue drumroll) Lena Taupier! You can listen to her winning composition and the top 10 compositions on SoundCloud or on the AWS DeepComposer console. The following screenshot shows the top 10 compositions for the Spin the Model challenge.

Lena will receive an AWS DeepComposer Chartbusters gold record and tell her story in an upcoming blog post, right here on the AWS Machine Learning Blog.

Congratulations, Lena Taupier!

It’s time to move on to the next Chartbusters challenge: The Sounds of Science. The challenge launches today and is open until September 23rd, 2020. Although you don’t need a physical keyboard to compete, you can take advantage of the September price promotion and buy the AWS DeepComposer keyboard for $89.00 to enhance your music generation experience. For more information about the competition and how to participate, see the Sounds of Science Chartbusters challenge blog post.


About the Author

Maryam Rezapoor is a Senior Product Manager with the AWS AI Ecosystem team. As a former biomedical researcher and entrepreneur, she finds her passion in working backward from customers’ needs to create new impactful solutions. Outside of work, she enjoys hiking, photography, and gardening.

How to run distributed training using Horovod and MXNet on AWS DL Containers and AWS Deep Learning AMIs


Distributed training has become indispensable for training large deep learning models in computer vision (CV) and natural language processing (NLP) applications. Open-source frameworks such as Horovod provide distributed training support to Apache MXNet, PyTorch, and TensorFlow. Horovod is an open-source distributed deep learning framework created by Uber. It leverages efficient inter-GPU and inter-node communication methods such as NVIDIA Collective Communications Library (NCCL) and Message Passing Interface (MPI) to distribute and aggregate model parameters between workers. The primary use case of Horovod is to make distributed deep learning fast and easy: to take a single-GPU training script and scale it successfully to train across many GPUs in parallel. Converting your non-distributed Apache MXNet training script to use distributed training with Horovod only requires 4-5 lines of additional code. For those unfamiliar with using Horovod and Apache MXNet for distributed training, we recommend first reading our previous blog post on the subject before diving into this example.

MXNet is integrated with Horovod through the common distributed training APIs defined in Horovod. You can convert a non-distributed training script to a Horovod-compatible one by following a high-level code skeleton. This is a streamlined user experience in which you only have to add a few lines of code. However, other pain points may still keep distributed training from flowing as smoothly as expected. For example, you may need to install additional software and libraries and resolve incompatibilities to make distributed training work. Horovod requires a certain version of Open MPI, and if you want to leverage high-performance training on NVIDIA GPUs, you need to install the NCCL library. Another pain point you may encounter is scaling up the number of training nodes in the cluster. You need to make sure all the software and libraries on the new nodes are properly installed and configured.
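
As a rough illustration of what such a skeleton looks like, the following sketch marks the Horovod-specific additions to a Gluon training loop. It is a minimal example under stated assumptions, not the script used later in this post: the toy `Dense` model, the synthetic data, and the helper `scaled_learning_rate` are placeholders, and the imports sit inside `train` so that the helper can be used without MXNet or Horovod installed.

```python
def scaled_learning_rate(base_lr, num_workers):
    """Scale the learning rate linearly with the number of workers."""
    return base_lr * num_workers


def train(epochs=1, batch_size=64, base_lr=0.01):
    import mxnet as mx
    from mxnet import autograd, gluon
    import horovod.mxnet as hvd

    # 1. Initialize Horovod
    hvd.init()

    # 2. Pin each process to one device, indexed by its local rank
    use_gpu = mx.context.num_gpus() > 0
    ctx = mx.gpu(hvd.local_rank()) if use_gpu else mx.cpu(hvd.local_rank())

    # Toy model and synthetic data (placeholders for your own network and data)
    net = gluon.nn.Dense(10, in_units=784)
    net.initialize(mx.init.Xavier(), ctx=ctx)
    data = mx.nd.random.uniform(shape=(batch_size, 784), ctx=ctx)
    label = mx.nd.zeros((batch_size,), ctx=ctx)
    loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

    # 3. Create the optimizer with a learning rate scaled by the worker count
    opt = mx.optimizer.create(
        "sgd", learning_rate=scaled_learning_rate(base_lr, hvd.size())
    )

    # 4. Broadcast the initial parameters from rank 0 to all workers
    params = net.collect_params()
    hvd.broadcast_parameters(params, root_rank=0)

    # 5. Use Horovod's distributed trainer instead of gluon.Trainer
    trainer = hvd.DistributedTrainer(params, opt)

    for epoch in range(epochs):
        with autograd.record():
            loss = loss_fn(net(data), label)
        loss.backward()
        trainer.step(batch_size)
        if hvd.rank() == 0:  # log from a single worker only
            print("epoch", epoch, "loss", loss.mean().asscalar())
```

A script that calls `train()` is typically launched with `horovodrun` or `mpirun`, so that one process runs per device.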

AWS Deep Learning Containers (AWS DL Containers) have greatly simplified the process of launching new training instances in a cluster, and the latest release includes all the required libraries to run distributed training using MXNet with Horovod. The AWS Deep Learning AMIs (DLAMI) come with popular open-source deep learning frameworks and pre-configured CUDA, cuDNN, Open MPI, and NCCL libraries.

In this post, we demonstrate how to run distributed training using Horovod and MXNet via AWS DL Containers and the DLAMIs.

Getting started with AWS DL Containers

AWS DL Containers are a set of Docker images pre-installed with deep learning frameworks to make it easy to deploy custom machine learning (ML) environments quickly. The AWS DL Containers provide optimized environments with different deep learning frameworks (MXNet, TensorFlow, PyTorch), Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries, and are available in Amazon Elastic Container Registry (Amazon ECR). You can launch AWS DL Containers on Amazon Elastic Kubernetes Service (Amazon EKS), self-managed Kubernetes on Amazon Elastic Compute Cloud (Amazon EC2), and Amazon Elastic Container Service (Amazon ECS). For more information about launching AWS DL Containers, follow this link.

Training an MXNet model with Deep Learning Containers on Amazon EC2

The MXNet Deep Learning Container comes with pre-installed libraries such as MXNet, Horovod, NCCL, MPI, CUDA, and cuDNN. The following diagram illustrates this architecture.

For instructions on setting up AWS DL Containers on an EC2 instance, see: Train a Deep Learning model with AWS Deep Learning Containers on Amazon EC2. For a hands-on tutorial running a Horovod training script, complete steps 1-5 of the preceding post. To use an MXNet framework, complete the following for step 6:

CPU:

  1. Download the Docker image from Amazon ECR repository.
docker run -it 763104351884.dkr.ecr.us-east-1.amazonaws.com/mxnet-training:1.6.0-cpu-py27-ubuntu16.04

  1. In the terminal of the container, run the following command to train the MNIST example.
git clone --recursive https://github.com/horovod/horovod.git
mpirun -np 1 -H localhost:1 --allow-run-as-root python horovod/examples/mxnet_mnist.py

GPU:

  1. Download the Docker image from Amazon ECR repository.
nvidia-docker run -it 763104351884.dkr.ecr.us-east-1.amazonaws.com/mxnet-training:1.6.0-gpu-py27-cu101-ubuntu16.04

  1. In the terminal of the container, run the following command to train the MNIST example.
git clone --recursive https://github.com/horovod/horovod.git
mpirun -np 4 -H localhost:4 --allow-run-as-root python horovod/examples/mxnet_mnist.py

If the final output looks like the following code, you successfully ran the training script:

[1,0]<stderr>:INFO:root:Epoch[4]    Train: accuracy=0.987580    Validation: accuracy=0.988582
[1,0]<stderr>:INFO:root:Training finished with Validation Accuracy of 0.988582

To end the EC2 instances, complete step 7 of the preceding post. You can follow the same steps to run your own training script.

Training an MXNet model with Deep Learning Containers on Amazon EKS

Amazon EKS is a managed service that makes it easy for you to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane or nodes. Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications. In this post, we show you how to set up a deep learning environment using Amazon EKS and AWS DL Containers. With Amazon EKS, you can scale a production-ready environment for multiple-node training and inference with Kubernetes containers.

The following diagram illustrates this architecture:

For instructions on setting up a deep learning environment with Amazon EKS and AWS DL Containers, see Amazon EKS Setup. To set up an Amazon EKS cluster, use the open-source tool called eksctl. It is recommended to use an EC2 instance with the latest DLAMI. You can spin up a GPU cluster or CPU cluster based on your use case. For this post, follow the Amazon EKS Setup instructions until the Manage Your Cluster section.

When your Amazon EKS cluster is up and running, you can run the Horovod MXNet training on the cluster. For instructions, see MXNet with Horovod distributed GPU training, which uses a Docker image that already contains a Horovod training script and a three-node cluster with node-type=p3.8xlarge. This tutorial runs the Horovod example script for MXNet on an MNIST model. The Horovod examples directory also contains an Imagenet script, which you can run on the same Amazon EKS cluster.

Getting started with the AWS DLAMI

The AWS DLAMI are machine learning images loaded with deep learning frameworks and their dependent libraries such as NVIDIA CUDA, NVIDIA cuDNN, NCCL, Intel MKL-DNN, and many others. DLAMI is a one-stop shop for deep learning in the cloud. You can launch EC2 instances with Ubuntu or Amazon Linux. DLAMI comes with pre-installed deep learning frameworks such as Apache MXNet, TensorFlow, Keras, and PyTorch. You can train custom models, experiment with new deep learning algorithms, and learn new deep learning skills and techniques. The AMIs also offer GPU and CPU acceleration through pre-configured drivers, include Anaconda virtual environments, and come with popular Python packages.

The DLAMI for Ubuntu and Amazon Linux now comes with pre-installed Horovod support with an MXNet backend. You can scale your ML model from a single GPU to multiple GPUs or a multi-node cluster using an EC2 GPU instance. You can also achieve greater scaling efficiency and higher multi-GPU training performance by using Horovod with MXNet as compared to native MXNet KVStore.

All versions of the DLAMI beginning with Ubuntu 18.04 v27.0, Amazon Linux v27.0, and Amazon Linux 2 v27.0 support Horovod with MXNet. You can use any AWS CPU or GPU machine to spin up the instance using deep learning images. It is recommended to use CPU instances of type C5, C5n, or C4 (optimized for high-performance, compute-intensive workloads) and GPU instances of type P2 and P3 (the latest generation of general-purpose GPU instances).

You can run Horovod training on a single-node or multi-node cluster. A single-node cluster consists of a single machine. A multi-node cluster consists of more than one homogeneous machine. In this post, we walk you through running Horovod multi-node cluster training using MXNet.

Creating a multi-node cluster using the DLAMI

You can spin up the EC2 instances with AWS CloudFormation templates, the AWS Command Line Interface (AWS CLI), or on the Amazon EC2 console. For this post, we use the Amazon EC2 console. We launch multiple identical EC2 instances with the same DLAMI. We spin up the instances in the same Region, placement group, and security zone because those factors play an important role in achieving high performance.

  1. On the Amazon EC2 console, search for Deep Learning AMI.
  2. Choose Select for any Deep Learning AMI (Ubuntu).

  1. You now have to choose the instance type.

AWS supports various categories of instances. Based on your use case, such as training time and cost, you can select General Purpose instances such as M5, Compute optimized instances such as C5, or GPU-based instances such as the P2 or P3 family. You can create a cluster of as many instances as you need. For this post, we select four p3.8xlarge instances with a total of 16 GPUs.

  1. Choose Next: Configure Instance Details.

Next, we need to configure the instances.

  1. For Number of instances, enter 4.
  2. Enter your specific network, subnet, and placement group.

If you don’t have a placement group, you can create one.

  1. Choose Next: Add Storage.

You can change the storage size based on your dataset size. For this demo, we use the default value.

  1. Choose Next: Add Tags.
  2. For Key, enter Name.
  3. For Value, enter Horovod_MXNet.

  1. Choose Next: Configure Security Group.
  2. Create your own security group or use the existing one.

  1. Choose Review and Launch.
  2. Review your instance launch details and choose Launch.

After you choose Launch, you’re asked to select an existing key pair or create a new one.

  1. For Key pair name, enter a key pair.

If you don’t have a key pair, create a new one and choose Download Key Pair.

  1. Choose Launch Instances.

If you see a green banner message, you launched the instance successfully.

  1. Choose View Instances.

  1. Search for Horovod_MXNet to see the four instances you created.

We need to do one more step in our cluster setup. All the instances should be able to communicate with each other, so we have to add our security group ID to all the instances’ inbound rules.

  1. Select one of the four instances you created.
  2. On the Description tab, choose Security groups (for this post, launch-wizard-144).

  1. On the Inbound tab, copy the security group ID (sg-00e9376c8f3cab57f).
  2. Choose Edit inbound rules.
  3. Choose Add rule.
  4. Select All traffic and SSH.
  5. Choose Save rules.

You can now see your inbound rules listed.

  1. Repeat the process to add a security group in the inbound rules for all the instances so they can communicate with each other.

You are now done with setting up your cluster.
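
The inbound rule can also be added programmatically. The following sketch assumes boto3 and that all four instances share the same security group, so that one self-referencing rule covers the whole cluster. The helper names are hypothetical; the request shape follows the EC2 `AuthorizeSecurityGroupIngress` API.

```python
def build_self_ingress_rule(security_group_id):
    """Build a rule that allows all traffic between members of the group."""
    return {
        "GroupId": security_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "-1",  # -1 means all protocols and all ports
                # The source is the group itself, so only instances that
                # belong to this security group are allowed in
                "UserIdGroupPairs": [{"GroupId": security_group_id}],
            }
        ],
    }


def allow_intra_cluster_traffic(ec2_client, security_group_id):
    """Apply the rule; ec2_client is expected to be boto3.client("ec2")."""
    request = build_self_ingress_rule(security_group_id)
    return ec2_client.authorize_security_group_ingress(**request)
```

For example, `allow_intra_cluster_traffic(boto3.client("ec2"), "sg-00e9376c8f3cab57f")` would add the rule to the security group used in this post.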

Horovod with MXNet training on a multi-node cluster

For Horovod with MXNet training on a multi-node cluster, complete the following steps:

  1. Copy your PEM key from your local machine to one of the EC2 instances (primary node):
# For Ubuntu users
scp -i <your_pem_key_path> <your_pem_key_path> ubuntu@<IPv4_Public_IP>:/home/ubuntu/

# For Amazon Linux users
scp -i <your_pem_key_path> <your_pem_key_path> ec2-user@<IPv4_Public_IP>:/home/ec2-user/

  1. SSH into your primary node:
# For Ubuntu users
$ ssh -i <your_pem_key> ubuntu@<IPv4_Public_IP>

# For Amazon Linux users
$ ssh -i <your_pem_key> ec2-user@<IPv4_Public_IP>

  1. Enable passwordless SSH between EC2 instances so that you don’t have to provide the PEM file each time. Enter the following commands on your primary node:
eval `ssh-agent`
ssh-add <your_pem_key>

  1. When you SSH or connect for the first time from one EC2 instance to another, you see the following message:
$ ssh <another_ec2_ipv4_address>
The authenticity of host 'xxx.xx.xx.xx' can't be established.
ECDSA key fingerprint is SHA256:xxxaaabbbbccc.
Are you sure you want to continue connecting (yes/no)?

# Make sure you can SSH from one EC2 instance to another without this authenticity prompt; otherwise, Horovod won't be able to communicate with the other machines

# SOLUTION:
# Open the file "/etc/ssh/ssh_config" and add these lines at the end
Host *
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null
   
  1. Activate the CONDA environment:
// If using Python 3.6
$ source activate mxnet_p36
  1. As an optional step, confirm Horovod is using MXNet on the backend by running the following command (as of this writing, the Horovod version is 0.19.5):
$ horovodrun -cb

// Output
Horovod v0.19.5:

Available Frameworks:
    [ ] TensorFlow
    [ ] PyTorch
    [X] MXNet

Available Controllers:
    [X] MPI
    [X] Gloo

Available Tensor Operations:
    [X] NCCL
    [ ] DDL
    [ ] CCL
    [X] MPI
    [X] Gloo
  1. We have provided a sample MNIST example for you to run the Horovod training.
$ horovodrun -np 4 python examples/horovod/mxnet/train_mxnet_hvd_mnist.py

// Output
[1,0]<stderr>:INFO:root:Namespace(batch_size=64, dtype='float32', epochs=5, lr=0.002, momentum=0.9, no_cuda=True)
[1,1]<stderr>:INFO:root:Namespace(batch_size=64, dtype='float32', epochs=5, lr=0.002, momentum=0.9, no_cuda=True)
[1,1]<stderr>:INFO:root:downloaded http://data.mxnet.io/mxnet/data/mnist.zip into data-1/mnist.zip successfully
[1,1]<stderr>:[04:29:14] src/io/iter_mnist.cc:113: MNISTIter: load 30000 images, shuffle=1, shape=[64,1,28,28]
// ....... <output truncated> ...........
[1,0]<stderr>:INFO:root:Epoch[4]    Train: accuracy=0.987647    Validation: accuracy=0.986178
[1,0]<stderr>:INFO:root:Training finished with Validation Accuracy of 0.986178
[1,1]<stderr>:INFO:root:Training finished with Validation Accuracy of 0.986178
  1. Don’t forget to stop or terminate the instance when you no longer need it.

For more information about the horovodrun command, see here.
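
When you run across several nodes, `horovodrun` accepts the total number of processes with `-np` and a host list with `-H` in the form `host:slots`. The following sketch assembles such a command for the four-node cluster in this post. The helper name and the IP addresses in the usage note are hypothetical; with four GPUs per p3.8xlarge instance, four hosts give 16 processes.

```python
def build_horovodrun_command(hosts, slots_per_host, script, script_args=()):
    """Return the horovodrun command as an argument list."""
    total_processes = len(hosts) * slots_per_host
    # Each entry tells Horovod how many processes to start on that host
    host_spec = ",".join(f"{host}:{slots_per_host}" for host in hosts)
    return [
        "horovodrun",
        "-np", str(total_processes),
        "-H", host_spec,
        "python", script,
        *script_args,
    ]


if __name__ == "__main__":
    command = build_horovodrun_command(
        ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],  # placeholder IPs
        4,
        "examples/horovod/mxnet/train_mxnet_hvd_mnist.py",
    )
    print(" ".join(command))
```

You can run the printed command on the primary node or pass the list to `subprocess.run`.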

The preceding code shows how to run the Horovod training script on a multi-node cluster of EC2 instances. You can find the Horovod MXNet example script on the Horovod GitHub repo. Additionally, you can bring your own training script that’s compatible with Horovod and MXNet and train the model on a single-node or multi-node cluster. To learn more about the performance comparison between Horovod and Parameter Server, see this blog post, which illustrates the difference as ResNet50 scales from 1 to 64 GPUs.

When using Horovod, keep the following in mind:

  • All your instances must be the same type.
  • All your instances must have the same environment.
  • The data should be stored in the same location across nodes.
  • The training script should be in the same location across nodes.

Conclusion

In this post, we demonstrated how to run the distributed training using Horovod and MXNet on Amazon EC2 and Amazon EKS using AWS DL Containers and AWS DLAMI. Using Horovod, your Apache MXNet models can be distributed across a cluster of instances, providing a significant increase in performance with only minimal changes to your training script.

For more information about deep learning and MXNet, see the MXNet crash course and Dive into Deep Learning book. You can also get started on the MXNet website and MXNet GitHub examples directory.

If you are new to distributed training, we highly recommend reading the paper Horovod: fast and easy distributed deep learning in TensorFlow. You can also install Horovod, build Horovod with MXNet, and follow the MNIST or ImageNet use case. You can find more Horovod MXNet examples in the GluonCV example and GluonNLP example on GitHub.


About the Authors

Chaitanya Bapat is a Software Engineer with the AWS Deep Learning team. He works on Apache MXNet and integrating the framework with SageMaker, DLC, and DLAMI. In his spare time, he loves watching sports and enjoys reading books and learning Spanish.

Karan Jariwala is a Software Development Engineer on the AWS Deep Learning team. His work focuses on training deep neural networks. Outside of work, he enjoys hiking, swimming, and playing tennis.


Using Amazon Textract with AWS PrivateLink


Amazon Textract now supports Amazon Virtual Private Cloud (Amazon VPC) endpoints via AWS PrivateLink so you can securely initiate API calls to Amazon Textract from within your VPC and avoid using the public internet.

In this post, we show you how to access Amazon Textract APIs from within your VPC without traversing the public internet, and how to use VPC endpoint policies to restrict access to Amazon Textract.

Amazon Textract is a fully managed machine learning (ML) service that automatically extracts text and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables.

You can use AWS PrivateLink to access Amazon Textract securely by keeping your network traffic within the AWS network, while simplifying your internal network architecture. It enables you to privately access Amazon Textract APIs from your VPC in a scalable manner by using interface VPC endpoints. A VPC endpoint is an elastic network interface in your subnet with a private IP address that serves as the entry point for all Amazon Textract API calls. A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by AWS PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Instances in your VPC don’t require public IP addresses to communicate with resources in the service. Traffic between your VPC and the other service doesn’t leave the AWS network.

The following diagram illustrates the solution architecture.

Prerequisites

To get started, you need to have a VPC set up in the AWS Region of your choice. For instructions, see Getting started with Amazon VPC. In this post, we use the us-east-2 Region. You should also have an AWS account with sufficient access to create resources in the following services:

  • Amazon Textract
  • AWS PrivateLink

Solution overview

The walkthrough includes the following high-level steps:

  1. Create VPC endpoints.
  2. Use Amazon Textract via AWS PrivateLink.

Creating VPC endpoints

To create a VPC endpoint, complete the following steps. We use the us-east-2 Region in this post, so the console and URLs may differ depending on the Region you choose.

  1. On the Amazon VPC console, choose Endpoints.
  2. Choose Create Endpoint.
  3. For Service category, select AWS services.
  4. For Service Name, choose com.amazonaws.us-east-2.textract or com.amazonaws.us-east-2.textract-fips.
  5. For VPC, enter the VPC you want to use.
  6. For Availability Zone, select your preferred Availability Zones.
  7. For Enable DNS name, select Enable for this endpoint.

This creates a private hosted zone that enables you to access the resources in your VPC using custom DNS domain names, such as example.com, instead of using private IPv4 addresses or private DNS hostnames provided by AWS. The Amazon Textract DNS hostname that the AWS Command Line Interface (AWS CLI) and Amazon Textract SDKs use by default (https://textract.Region.amazonaws.com) resolves to your VPC endpoint.

  8. For Security group, choose the security group to associate with the endpoint network interface.

If you don’t specify a security group, the default security group for your VPC is associated.

  9. Choose Create Endpoint.

When the Status changes to available, your VPC endpoint is ready for use.

  10. Choose the Policy tab to apply more restrictive access control to the VPC endpoint.

The following example policy limits VPC endpoint access to only the DetectDocumentText API. An IAM principal, even with access to all Textract APIs, can still only access the specific API in the following policy using this VPC endpoint. This is an additional layer of access control applied at the VPC endpoint. You should apply the principle of least privilege when defining your own policy. For more information, see Controlling access to services with VPC endpoints.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "textract:DetectDocumentText"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow",
            "Principal": "*"
        }
    ]
}
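
The console steps above can also be scripted. The following is a minimal sketch using the AWS SDK for Python (Boto3); it is not part of the original walkthrough. The helper names build_endpoint_request and create_textract_endpoint are hypothetical, and the VPC, subnet, and security group IDs you pass in are your own values. The sketch attaches the same DetectDocumentText-only policy shown above.

```python
import json


def build_endpoint_request(vpc_id, subnet_ids, security_group_ids,
                           region="us-east-2", fips=False):
    """Build keyword arguments for the EC2 create_vpc_endpoint call."""
    service = "textract-fips" if fips else "textract"
    # Same policy as above: only DetectDocumentText is allowed through the endpoint.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": ["textract:DetectDocumentText"],
            "Resource": ["*"],
            "Effect": "Allow",
            "Principal": "*",
        }],
    }
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        "PrivateDnsEnabled": True,
        "PolicyDocument": json.dumps(policy),
    }


def create_textract_endpoint(**kwargs):
    """Create the endpoint; needs AWS credentials with EC2 permissions."""
    import boto3  # imported here so the builder above works without the SDK

    region = kwargs.get("region", "us-east-2")
    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_vpc_endpoint(**build_endpoint_request(**kwargs))
```

Keeping the request builder separate from the API call lets you review or log the exact request before anything is created in your account.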

Now that you have set up your VPC endpoint, the following section shows you how to access Amazon Textract APIs from within that VPC using AWS PrivateLink.

Accessing Amazon Textract APIs via AWS PrivateLink

After you set up the relevant VPC endpoint policies, you have two options to configure endpoints in order to access Amazon Textract APIs:

  • You can use the default Regional endpoint. If you enabled private DNS for the endpoint, the default Amazon Textract hostname (https://textract.Region.amazonaws.com) resolves to your VPC endpoint.

The following code is an example AWS CLI command to run from within the VPC:

aws textract detect-document-text --document '{"S3Object":{"Bucket":"textract-test-bucket","Name":"example-doc.jpg"}}' --region us-east-2

  • You can also use the DNS name that was generated when creating the VPC endpoint. These DNS names are in the form of *.textract.us-east-2.vpce.amazonaws.com or *.textract-fips.us-east-2.vpce.amazonaws.com. For example: vpce-0f1aa01f0ce676709-il663k5n.textract.us-east-2.vpce.amazonaws.com.

The following code is an example AWS CLI command to run from within the VPC:

aws textract detect-document-text --document '{"S3Object":{"Bucket":"textract-test-bucket","Name":"example-doc.jpg"}}' --region us-east-2 --endpoint-url https://vpce-05e9d346575f9cb38-1wdh6mi2.textract.us-east-2.vpce.amazonaws.com
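
If you call Amazon Textract from Python rather than the AWS CLI, the same two options map onto how you construct the Boto3 client. The following is a rough sketch, not code from the original post; the helper names textract_client_kwargs and detect_text are hypothetical, and the bucket and document names come from the CLI examples above.

```python
def textract_client_kwargs(region="us-east-2", vpce_dns_name=None):
    """Return keyword arguments for boto3.client('textract', ...)."""
    kwargs = {"region_name": region}
    if vpce_dns_name:
        # Option 2: address the endpoint-specific DNS name explicitly.
        kwargs["endpoint_url"] = f"https://{vpce_dns_name}"
    # Option 1 (no endpoint_url): the default Regional hostname is used, which
    # resolves to the VPC endpoint when private DNS is enabled.
    return kwargs


def detect_text(bucket, name, **client_options):
    """Call DetectDocumentText on an S3 object; run this from inside the VPC."""
    import boto3  # imported lazily; requires AWS credentials at call time

    textract = boto3.client("textract", **textract_client_kwargs(**client_options))
    return textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": name}}
    )
```

For example, detect_text("textract-test-bucket", "example-doc.jpg") would use the Regional hostname, while passing vpce_dns_name sends the request to the endpoint-specific DNS name.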

Conclusion

You have now successfully configured a VPC endpoint for Amazon Textract in your AWS account. Traffic to Amazon Textract APIs through that VPC endpoint stays within the AWS network. The VPC endpoint policy you configured further restricts which Amazon Textract APIs are accessible from within that VPC.


About the Authors

Raj Copparapu is a Product Manager focused on putting machine learning in the hands of every developer.

Thomas joined Amazon Web Services in 2016, initially working on Application Auto Scaling before moving into his current role at Textract. Before joining AWS, he worked in engineering roles in the domains of computer graphics and networking. Thomas holds a master’s degree in engineering from the University of Leuven in Belgium.

Announcing the AWS DeepComposer Chartbusters challenge, The Sounds of Science

We’re excited to announce the next AWS DeepComposer Chartbusters challenge, in which developers interactively collaborate with AI using the new edit melody feature launching today! Chartbusters is a monthly challenge where you can use AWS DeepComposer to create original compositions on the console using machine learning techniques, compete to top the charts, and win prizes. Following the completion of the Bach to the Future and Spin the Model challenges, we’re thrilled to announce the launch of the third Chartbusters challenge: The Sounds of Science. This challenge launches today and participants can submit their compositions until September 23, 2020.

To improve your music creation experience, we’re offering the AWS DeepComposer keyboard at a special price of $89 for a limited time during September on amazon.com.

In this challenge, you need to create background music using AWS DeepComposer to accompany a short video clip. The Autoregressive CNN (AR-CNN) algorithm and the newly released edit melody feature on the AWS DeepComposer console enable you to iterate on the musical quality while retaining creativity and uniqueness, as you create the perfect composition to match the video’s theme.

The following screenshot shows the new Edit melody option.

The AR-CNN algorithm in the AWS DeepComposer Music Studio enhances your original input melody by adding or removing notes, and makes the newly generated input melody sound more Bach-like. Next, you can use the edit melody feature to assist the AI by adding or removing specific notes or even change their pitch and length by using an interactive view of the input piano roll. The edit melody feature facilitates better human-AI collaboration by allowing you to correct mistakes made by the model during inference. You can then resubmit your newly modified track, and choose Enhance input melody to create another composition.

How to compete

To take part in The Sounds of Science, just do the following:

  1. Watch the competition video. Your goal is to create background music that best matches this video.
  2. Go to AWS DeepComposer Music Studio and create a melody with the keyboard, import a melody, or choose a sample melody on the console.
  3. Choose the Autoregressive generative AI technique, and then choose the Autoregressive CNN Bach. You have four parameters that you can choose to adjust: Maximum notes to add, Maximum notes to remove, Sampling iterations, and Creative risk.
  4. Choose the appropriate values and then choose Enhance input melody.
  5. Use the Edit melody feature to add or remove notes. You can also change the note duration and pitch.
  6. When finished, choose Apply changes.
  7. Repeat these steps until you’re satisfied with the generated music.

When you’re happy with your composition, you can submit to SoundCloud.

  1. On the navigation panel, choose Chartbusters and choose Submit a composition.
  2. Choose your composition from the drop-down menu, provide a track name for your composition, and choose Submit.

AWS DeepComposer then submits your composition to the Sounds of Science playlist on SoundCloud. You don’t need to submit the video.

Conclusion

Congratulations! You’ve successfully submitted your composition to the AWS DeepComposer Chartbusters challenge The Sounds of Science. Invite your friends and family to listen to your creation on SoundCloud and vote on it!

To learn more about the different generative AI techniques supported by AWS DeepComposer, check out the learning capsules available on the AWS DeepComposer console.


About the Authors

Rahul Suresh is an Engineering Manager with the AWS AI org, where he has been working on AI-based products that make machine learning accessible to all developers. Prior to joining AWS, Rahul was a Senior Software Developer at Amazon Devices and helped launch highly successful smart home products. Rahul is passionate about building machine learning systems at scale and is always looking to get these advanced technologies into the hands of customers. In addition to his professional career, Rahul is an avid reader and a history buff.

Maryam Rezapoor is a Senior Product Manager with the AWS AI Ecosystem team. As a former biomedical researcher and entrepreneur, she finds her passion in working backward from customers’ needs to create new impactful solutions. Outside of work, she enjoys hiking, photography, and gardening.

This month in AWS Machine Learning: August 2020 edition

Every day there is something new going on in the world of AWS Machine Learning—from launches to new use cases to interactive trainings. We’re packaging some of the not-to-miss information from the ML Blog and beyond for easy perusing each month. Check back at the end of each month for the latest roundup.

Launches

This month we gave you a new way to add intelligence to your contact center, improved personalized recommendations, made our Machine Learning University content available, and more. Read on for our August launches:

Use cases

Get ideas and architectures from AWS customers, partners, ML Heroes, and AWS experts on how to apply ML to your use case:

Explore more ML stories

Want more news about developments in ML? Check out the following stories:

Mark your calendars

Join us for the following exciting ML events:

  • Register for the Public Sector AWS Artificial Intelligence and Machine Learning Week, September 14–18, 2020. Whether you’re in a government, nonprofit, university, or hospital setting, this webinar series is designed to help educate those new to AI, spark new ideas for business stakeholders, and deep dive into technical implementation for developers.
  • AWS Power Hour: Machine Learning streams every Thursday at 4:00 PM PST on Twitch. The series offers free, fun, and interactive training with AWS expert hosts as they demonstrate how to build apps with AWS AI services. Designed for developers—even those without prior ML experience—the show helps you learn to build apps that showcase natural language, speech recognition, and other personalized recommendations. Tune in live, or catch the recorded episodes whenever it’s convenient for you.
  • AWS and Pluralsight are hosting a three-part webinar series on the ins-and-outs of AWS DeepRacer. In the series, you will learn about the basics of DeepRacer, reinforcement learning and refinement, and the future of DeepRacer. View the first two webinars and register for the live webinar on September 22 here.

Also, if you missed it, the season finale of SageMaker Fridays aired on August 28. Stay tuned for more news on season 2!

See you next month for more on AWS ML!


About the Author

Laura Jones is a product marketing lead for AWS AI/ML where she focuses on sharing the stories of AWS’s customers and educating organizations on the impact of machine learning. As a Florida native living and surviving in rainy Seattle, she enjoys coffee, attempting to ski and enjoying the great outdoors.

Getting started with the Amazon Kendra SharePoint Online connector

Amazon Kendra is a highly accurate and easy-to-use enterprise search service powered by machine learning (ML). To get started with Amazon Kendra, we offer data source connectors to get your documents easily ingested and indexed.

This post describes how to use Amazon Kendra’s SharePoint Online connector. To allow the connector to access your SharePoint Online site, you only need to provide the index URL and the credentials of a user with owner rights. These access credentials will be securely stored in AWS Secrets Manager.

Currently, Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition for building proof of concepts (POCs) and the Amazon Kendra Enterprise Edition. Amazon Kendra connectors work with both editions.

Prerequisites

To get started, you need the following:

  • A SharePoint Online site
  • A SharePoint Online user with owner rights

Owner rights are the minimum admin rights needed for the connector to access and ingest documents from your SharePoint site. This follows the AWS principle of granting least privilege access.

The metadata in your SharePoint Online documents must be specifically mapped to Amazon Kendra attributes. This mapping is done in the Attributes and field mappings section in this post. The SharePoint document title is mapped to the Amazon Kendra system attribute _document_title. If you skip the field mapping step, you need to create a new data connector to the SharePoint Online site.

The AWS Identity and Access Management (IAM) role for the SharePoint Online data source is not the same as the Amazon Kendra index IAM role. Please read the section Defining targets: Site URL and data source IAM role carefully. It’s important to pay particular attention to the interplay between the SharePoint Online data source’s IAM role and the Secrets Manager secret that contains your SharePoint Online credentials.

For this post, we assume that you already have a SharePoint Online site deployed.

Setting up a SharePoint Online connector for Amazon Kendra from the console

The following section describes the process of deploying an Amazon Kendra index and configuring a SharePoint Online connector. If you already have an index, you can skip to the Configuring the SharePoint Online connector section.

For this use case, our SharePoint Online site contains a collection of AWS whitepapers with custom columns, such as Topics.

Creating an Amazon Kendra index

In an Amazon Kendra setup workflow, the first step is to create an index, where you define an IAM role and the method you want Amazon Kendra to use for data encryption. For this use case, we create a new role.

If you use an existing role, check that it has permission to write to an Amazon CloudWatch log. For more information, see IAM roles for indexes.

Next, you select which provisioning edition to use. For this post, I select the Developer edition. If you’re new to Amazon Kendra, we recommend creating an Amazon Kendra Developer Edition index because it’s a more cost-efficient way to explore Amazon Kendra. For production environments, we highly recommend using the Enterprise Edition because it allows for more storage capacity and queries per day, and is designed for high availability.

Configuring the SharePoint Online connector

After you create your index, you set up the data sources. One of the advantages of implementing Amazon Kendra is that you can use a set of prebuilt connectors for data sources such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), SharePoint Online, and Salesforce.

For this use case, we choose SharePoint Online.

Assigning a name to the data source

In the Define attributes section, you enter a name for the data source, an optional description, and assign optional tags.

Defining targets: Site URL and data source IAM role

In the Define targets section, you enter the SharePoint Online site URLs where the documents reside and the IAM role that the connector uses to operate. It’s important to remember that this IAM role is different from the one used to create the index. For more information, see IAM roles for data sources.

If you don’t have an IAM role for this task, you can easily create one by choosing Create New Role. For this use case, I use a previously created role.

Under the URL text box, you can select Use change log, which enables the connector to use the SharePoint change log to determine the documents that need to be updated in the index. If your SharePoint change log is too large, your sync process may take longer.

You can also select Crawl attachments, which allows the crawler to include the attachments associated with items stored in your site.

You can also include or exclude documents by using regular expressions. You can define patterns that Amazon Kendra either uses to exclude certain documents from indexing or include only documents with that pattern. For more information, see SharePointConfiguration.
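
To get a feel for how inclusion and exclusion patterns interact before you configure them, you can try them locally. The following sketch only models the filtering described above with Python’s re module; it is not how Amazon Kendra implements it. It assumes that patterns are matched against the document URL, that a match anywhere in the URL counts, and that an exclusion match takes precedence over an inclusion match. The function name is_indexed and the example URLs are hypothetical.

```python
import re


def is_indexed(doc_url, inclusion_patterns=(), exclusion_patterns=()):
    """Return True if a document would pass the include/exclude filters."""
    # Assumption: a document matching any exclusion pattern is always skipped.
    if any(re.search(p, doc_url) for p in exclusion_patterns):
        return False
    # With no inclusion patterns, everything not excluded is kept.
    if not inclusion_patterns:
        return True
    return any(re.search(p, doc_url) for p in inclusion_patterns)


docs = [
    "https://example.sharepoint.com/whitepapers/security.pdf",
    "https://example.sharepoint.com/whitepapers/draft-security.pdf",
    "https://example.sharepoint.com/images/logo.png",
]
# Keep PDFs, but skip anything whose file name starts with "draft-".
kept = [d for d in docs if is_indexed(d, [r"\.pdf$"], [r"/draft-"])]
print(kept)
```

With these patterns, only the first URL is kept: the draft is excluded even though it is a PDF, and the image matches no inclusion pattern.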

Providing SharePoint Online credentials

In the Configure settings section, you set up your SharePoint Online user (if you don’t have one created, you can create an additional user). The credentials you enter are stored in Secrets Manager.

Save the authentication information and set up the sync run schedule, which determines how often Kendra checks your SharePoint Online site URLs for changes. For this use case, I choose to Run on demand.

Attributes and field mappings

In this next step, you can create field mappings. Even though this is an optional step, it’s a good idea to add this extra layer of metadata to your documents from SharePoint Online. Metadata enables you to improve accuracy through manual tuning, filtering, and faceting. You can’t add metadata to already ingested documents, so if you want to add metadata later, you need to delete this data source and recreate this data source with metadata and re-ingest your documents.

The default SharePoint Online metadata fields are Title, Created, and Modified.

One powerful feature is the ability to create custom field mappings. For example, on my SharePoint Online site, I created a column named Category. By importing this extra piece of information, we can create filters based on category names.

To import that extra information, you create a custom field mapping by choosing Add a new field mapping.

If you’re combining multiple data sources, you can map this new field to an existing field. For this use case, I have other documents that have the attribute Category, so I choose Option A to map fields to an existing document attributes field in my Amazon Kendra index. For more information, see Creating custom document attributes.

Also, on my SharePoint Site, I have an additional field called Topic. Because I don’t have that field on my index yet, I select Option B and enter the data source field name and select the data type (for this use case, String).

Field names are case-sensitive, so we need to make sure we match them. Additionally, when a data field on SharePoint is renamed, only the display name changes. This means that if you want to import a data field, you need to refer to the original name. A way to find it is to sort by that column and check the name as listed on the address bar.

Let’s check what field is used for sorting:
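
If the sorted view puts the field name in the URL’s query string, you can also read it programmatically. The following is a small sketch using only the Python standard library. It assumes the address bar carries the internal name in a query parameter called sortField, which you should confirm against your own site; the function name sort_field_from_url and the example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def sort_field_from_url(list_url):
    """Return the value of the sortField query parameter, or None."""
    query = parse_qs(urlparse(list_url).query)
    values = query.get("sortField")
    return values[0] if values else None


# Hypothetical address-bar URL after sorting a library by a renamed column.
url = ("https://example.sharepoint.com/sites/docs/Shared%20Documents/Forms/"
       "AllItems.aspx?sortField=Topic&isAscending=true")
print(sort_field_from_url(url))
```

The value returned is the name to use as the data source field name in the mapping, with the same capitalization.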

Reviewing settings and creating a SharePoint Online data source

As a last step, you review the settings and create the data source. The Domain(s) and role section provides additional configuration information.

After you create your SharePoint Online data source, a banner similar to the following screenshot will appear at the top of your screen. To start the syncing and document ingestion process, choose Sync now.

You see a banner indicating the progress of the data source sync job. After the sync job is finished, you can test your index.

Testing

You can test your new index on the Amazon Kendra search console. See the following screenshot.

Also, if you configured extra fields as facetable, you can filter your documents by those facets. See the following screenshot.

Creating an Amazon Kendra index with a SharePoint Online connector with Python

In addition to the console, you can create a new Amazon Kendra index and SharePoint Online connector, and sync it, by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

My personal preference for testing my Python scripts is to spin up an Amazon SageMaker notebook instance, a fully managed ML Amazon Elastic Compute Cloud (Amazon EC2) instance that runs the Jupyter Notebook app. For instructions, see Create an Amazon SageMaker Notebook Instance.

IAM roles requirements and overview

To create an index using the AWS SDK, you need to have the policy AmazonKendraFullAccess attached to the role you are using.

At a high level, these are the different roles Amazon Kendra requires:

  • IAM roles for indexes – Needed to write to CloudWatch Logs.
  • IAM roles for data sources – Needed when you use the CreateDataSource method. These roles require a specific set of permissions depending on the connector you use. For our use case, it needs permissions to access the following:
    • Secrets Manager, where the SharePoint Online credentials are stored.
    • The AWS Key Management Service (AWS KMS) customer master key (CMK) that Secrets Manager uses to decrypt the credentials.
    • The BatchPutDocument and BatchDeleteDocument operations to update the index.
    • The Amazon S3 bucket that contains the SSL certificate used to communicate with the SharePoint Site (we use SSL for this use case).

For more information, see IAM access roles for Amazon Kendra.

For this method, you need:

  • An Amazon SageMaker notebooks role with permission to create an Amazon Kendra index where you’re using the notebook
  • An Amazon Kendra IAM role for CloudWatch
  • An Amazon Kendra IAM role for the SharePoint Online connector
  • A SharePoint Online credentials store on Secrets Manager

Creating an Amazon Kendra index

To create an index, you use the following code:

import pprint
import time

import boto3
from botocore.exceptions import ClientError

kendra = boto3.client("kendra")

print("Creating an index")

# Replace the placeholder values with your own.
description = "<YOUR INDEX DESCRIPTION>"
index_name = "<YOUR NEW INDEX NAME>"
role_arn = "<KENDRA ROLE WITH CLOUDWATCH PERMISSIONS ARN>"

try:
    index_response = kendra.create_index(
        Description=description,
        Name=index_name,
        RoleArn=role_arn,
        Edition="DEVELOPER_EDITION",
        Tags=[
            {
                "Key": "Project",
                "Value": "SharePoint Test"
            }
        ]
    )

    pprint.pprint(index_response)

    index_id = index_response["Id"]

    print("Wait for Kendra to create the index.")

    while True:
        # Get the index description
        index_description = kendra.describe_index(
            Id=index_id
        )
        # Stop polling as soon as the status is no longer CREATING
        status = index_description["Status"]
        print("    Creating index. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

except ClientError as e:
    print("%s" % e)

print("Done creating index.")

While your index is being created, the script prints a status update every 60 seconds (the interval set by the time.sleep(60) call) until the process is complete. See the following output:

Creating an index
{'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 19:58:19 GMT',
                                      'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                      'RetryAttempts': 2}}
Wait for Kendra to create the index.
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: ACTIVE
Done creating index

The response to the create request includes the index ID, in this case 3311b507-bfef-4e2b-bde9-7c297b1fd13b. Your index ID will be different from the ID in this post.
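
The polling loop shown earlier runs until the status changes, however long that takes. If you’d rather put an upper bound on the wait, one option is a small helper like the following sketch. The name wait_while_creating and its parameters are hypothetical; it takes the describe call as an argument, so the same helper can be exercised without an AWS account.

```python
import time


def wait_while_creating(describe, delay=60, max_attempts=60, sleep=time.sleep):
    """Poll describe() until the status leaves CREATING or attempts run out.

    describe is any zero-argument callable returning a dict with a "Status"
    key, for example: lambda: kendra.describe_index(Id=index_id)
    """
    for attempt in range(max_attempts):
        status = describe()["Status"]
        if status != "CREATING":
            return status
        # No need to sleep after the final check.
        if attempt < max_attempts - 1:
            sleep(delay)
    raise TimeoutError(f"still CREATING after {max_attempts} checks")
```

The helper returns whatever status ended the wait, so you should still confirm that it is ACTIVE (and not, for example, FAILED) before you continue.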

Adding attributes to the Amazon Kendra index

If you have metadata attributes associated with your SharePoint Online documents, you should do the following:

  1. Determine the Amazon Kendra attribute name you want for each of your SharePoint Online metadata attributes. By default, Amazon Kendra has six reserved fields (_category, _created_at, _file_type, _last_updated_at, _source_uri, and _view_count).
  2. Update the Amazon Kendra index with the Amazon Kendra attribute names.
  3. Map each SharePoint Online metadata attribute to each Amazon Kendra metadata attribute.

If you have the metadata attribute Topic associated with your SharePoint Online document, and you want to use the same attribute name in the Amazon Kendra index, the following code adds the attribute Topic to your Amazon Kendra index:

try:
    update_response = kendra.update_index(
        Id='3311b507-bfef-4e2b-bde9-7c297b1fd13b',
        RoleArn='arn:aws:iam::<YOUR ACCOUNT NUMBER>:role/service-role/AmazonKendra-us-east-1-KendraRole',
        DocumentMetadataConfigurationUpdates=[
            {
                'Name': 'Topic',
                'Type': 'STRING_VALUE',
                'Search': {
                    'Facetable': True,
                    'Searchable': True,
                    'Displayable': True
                }
            }
        ]
    )
    # Print the response only if the update succeeded
    pprint.pprint(update_response)
except ClientError as e:
    print('%s' % e)

If everything goes well, we receive a 200 response:

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 20:17:07 GMT',
                                      'x-amzn-requestid': '3eba66c9-972b-4757-8d92-37be17c8f8a2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3eba66c9-972b-4757-8d92-37be17c8f8a2',
                      'RetryAttempts': 0}}
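
If you need to add more than one attribute, building the entries with a small helper keeps the request readable. The following is a minimal sketch; the function name attribute_update is hypothetical, and Category is used only because it is the other custom column mentioned earlier in this post. The update_index call is left commented out because it needs your own index ID and role ARN.

```python
def attribute_update(name, value_type="STRING_VALUE",
                     facetable=True, searchable=True, displayable=True):
    """Build one entry for DocumentMetadataConfigurationUpdates."""
    return {
        "Name": name,
        "Type": value_type,
        "Search": {
            "Facetable": facetable,
            "Searchable": searchable,
            "Displayable": displayable,
        },
    }


# Topic comes from the example above; Category is the other custom column.
updates = [attribute_update("Topic"), attribute_update("Category")]
# kendra.update_index(Id=index_id, RoleArn=role_arn,
#                     DocumentMetadataConfigurationUpdates=updates)
```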

Providing the SharePoint Online credentials

The data source role also needs the secretsmanager:GetSecretValue permission for your secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store the SharePoint Online credentials, make sure the role you use has permissions to create and tag a secret. See the following policy code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerWritePolicy",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:UntagResource",
                "secretsmanager:CreateSecret",
                "secretsmanager:TagResource"
            ],
            "Resource": "*"
        }
    ]
}

To create a secret on Secrets Manager, enter the following code:

import json

secretsmanager = boto3.client('secretsmanager')

# Replace the placeholder values with your own.
SecretName = '<YOUR SECRETNAME>'
# Serialize the credentials as valid JSON (double-quoted keys and values)
SharePointCredentials = json.dumps({
    'username': '<YOUR SHAREPOINT SITE USERNAME>',
    'password': '<YOUR SHAREPOINT SITE PASSWORD>'
})

try:
    create_secret_response = secretsmanager.create_secret(
        Name=SecretName,
        Description='Secret for a Sharepoint data source connector',
        SecretString=SharePointCredentials,
        Tags=[
            {
                'Key': 'Project',
                'Value': 'SharePoint Test'
            }
        ]
    )
    # Print the response only if the secret was created
    pprint.pprint(create_secret_response)
except ClientError as e:
    print('%s' % e)

If everything went well, you get a response with your secret’s ARN:

{'ARN': <YOUR SECRETS ARN>,
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '159',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 22 Jul 2020 16:05:32 GMT',
                                      'x-amzn-requestid': '3d0ac6ff-bd32-4d2e-8107-13e49f070de5'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3d0ac6ff-bd32-4d2e-8107-13e49f070de5',
                      'RetryAttempts': 0},
 'VersionId': '7f7633ce-7f6c-4b10-b5b2-2943dd3fd6ee'}
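
Before you point the connector at the secret, it can be worth checking that the stored string parses as JSON and contains both credential keys. The following sketch does that check locally; the function name parse_sharepoint_secret is hypothetical, and the commented-out lines show how you might feed it the stored value once you have AWS credentials available.

```python
import json


def parse_sharepoint_secret(secret_string):
    """Parse a SecretString and check it has the keys used in this post."""
    credentials = json.loads(secret_string)
    missing = {"username", "password"} - set(credentials)
    if missing:
        raise ValueError(f"secret is missing keys: {sorted(missing)}")
    return credentials


# With AWS credentials available, the stored value could be checked like this:
# response = secretsmanager.get_secret_value(SecretId=SecretName)
# parse_sharepoint_secret(response["SecretString"])
```

A string that isn’t valid JSON, for instance one written with single quotes around the keys, fails in json.loads, which makes the problem visible before the first sync.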

Creating the SharePoint Online data source

Your Amazon Kendra index is up and running, and you have established the attributes that you want to map to your SharePoint Online document’s attributes.

You now need an IAM role with kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM roles for Microsoft SharePoint Online data sources. We use the ARN for this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. See the following code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kendra.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
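
If you don’t have such a role yet, you can create one with this trust relationship from Python. The following is a minimal sketch, not code from the original post. The function name create_data_source_role is hypothetical, the default role name matches the role ARN used later in this post, and the permissions policy still has to be attached separately.

```python
import json

# Trust policy that lets Amazon Kendra assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "kendra.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_data_source_role(role_name="Kendra-Datasource"):
    """Create an IAM role that Amazon Kendra can assume; returns its ARN."""
    import boto3  # imported lazily; the call needs IAM permissions

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
        Description="Role for the Amazon Kendra SharePoint Online data source",
    )
    return response["Role"]["Arn"]
```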

The following code is the policy structure used:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:region:account ID:secret:secret ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:region:account ID:key/key ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kendra:BatchPutDocument",
                "kendra:BatchDeleteDocument"
            ],
            "Resource": [
                "arn:aws:kendra:region:account ID:index/index ID"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "kendra.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::bucket name/*"
            ]
        }
    ]
}
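
The placeholders in the policy above (region, account ID, secret ID, key ID, index ID, bucket name) have to be replaced before the policy can be attached. As a minimal sketch, the following helper fills them in and returns the same four statements. The function name and the sample values are hypothetical and only for illustration.

```python
import json


def build_data_source_policy(region, account_id, secret_id, key_id, index_id, bucket_name):
    """Return the data source role policy shown above with its placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [f"arn:aws:secretsmanager:{region}:{account_id}:secret:{secret_id}"],
            },
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [f"arn:aws:kms:{region}:{account_id}:key/{key_id}"],
            },
            {
                "Effect": "Allow",
                "Action": ["kendra:BatchPutDocument", "kendra:BatchDeleteDocument"],
                "Resource": [f"arn:aws:kendra:{region}:{account_id}:index/{index_id}"],
                "Condition": {"StringLike": {"kms:ViaService": ["kendra.amazonaws.com"]}},
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


# Hypothetical sample values, for illustration only.
policy = build_data_source_policy(
    "us-east-1", "111122223333", "example-secret", "example-key", "example-index", "example-bucket"
)
print(json.dumps(policy, indent=4))
```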

The following code is my role’s ARN:

arn:aws:iam::<YOUR ACCOUNT NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index and read the secrets to connect to our SharePoint Online site.

When creating a data source, you can specify the sync schedule, which indicates how often your index syncs with the data source you create. This schedule is defined on the Schedule key of the request. You can use schedule expressions for rules to define how often you want to sync your data source. For this use case, the ScheduleExpression is 'cron(0 11 * * ? *)', which sets the data source to sync every day at 11:00 AM UTC.
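
To make the schedule expression easier to read, the following sketch splits a cron expression into its six fields, in the order AWS schedule expressions use (minutes, hours, day of month, month, day of week, year). The helper name is hypothetical, and it only checks the shape of the expression, not whether each field value is valid.

```python
import re

# Field order for AWS cron schedule expressions.
CRON_FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_cron_expression(expression):
    """Split 'cron(...)' into its six named fields; raise ValueError if malformed."""
    match = re.fullmatch(r"cron\((.+)\)", expression.strip())
    if match is None:
        raise ValueError(f"Not a cron expression: {expression!r}")
    values = match.group(1).split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError(f"Expected {len(CRON_FIELDS)} fields, got {len(values)}")
    return dict(zip(CRON_FIELDS, values))


fields = parse_cron_expression("cron(0 11 * * ? *)")
print(fields)
```

For the expression used in this post, minutes is 0 and hours is 11, and the remaining fields are wildcards, which is why the sync runs once a day.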

I use the following code. Make sure you match your SiteURL and SecretARN, as well as your IndexID. Additionally, FieldMappings is where you map between the SharePoint Online attribute name and the Amazon Kendra index attribute name. I use the same attribute name in both, but you can name the Amazon Kendra attribute whatever you’d like.

print('Create a data source')
 
# Replace each placeholder string with your own value
SecretArn = '<YOUR SHAREPOINT ONLINE USER AND PASSWORD SECRETS ARN>'
SiteUrl = '<YOUR SHAREPOINT SITE URL>'
DSName = '<YOUR NEW DATA SOURCE NAME>'
IndexId = '<YOUR INDEX ID>'
DSRoleArn = '<YOUR DATA SOURCE ROLE>'
ScheduleExpression = 'cron(0 11 * * ? *)'

try:
    datasource_response = kendra.create_data_source(
        Name=DSName,
        IndexId=IndexId,
        Type='SHAREPOINT',
        Configuration={
            'SharePointConfiguration': {
                'SharePointVersion': 'SHAREPOINT_ONLINE',
                'Urls': [
                    SiteUrl
                ],
                'SecretArn': SecretArn,
                'CrawlAttachments': True,
                'UseChangeLog': True,
                'FieldMappings': [
                    {
                        'DataSourceFieldName': 'Topic',
                        'IndexFieldName': 'Topic'
                    },
                ],
                'DocumentTitleFieldName': 'Title'
            },
        },
        Description='My SharePointOnline Datasource',
        RoleArn=DSRoleArn,
        Schedule=ScheduleExpression,
        Tags=[
            {
                'Key': 'Project',
                'Value': 'SharePoint Test'
            }
        ]
    )
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get the data source description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=IndexId
        )
        # If the status is no longer CREATING, stop polling
        status = datasource_description["Status"]
        print("    Creating data source. Status: " + status)
        if status != "CREATING":
            break
        time.sleep(60)

except ClientError as e:
    print('%s' % e)

At this point, you should receive a 200 response:

Create a data source
{'Id': '527ac6f7-5f3c-46ec-b2cd-43980c714bf7',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 15:26:13 GMT',
                                      'x-amzn-requestid': '30480044-0a86-446c-aadc-f64acb4b3a86'},
                      'HTTPStatusCode': 200,
                      'RequestId': '30480044-0a86-446c-aadc-f64acb4b3a86',
                      'RetryAttempts': 0}}
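
The polling loop in the example above runs until the status changes, with no upper bound. As a hedged alternative, the following sketch wraps the same idea in a helper that gives up after a fixed number of attempts. The helper name, its parameters, and the attempt limit are assumptions for illustration, not part of the Amazon Kendra API.

```python
import time


def wait_while_status(describe, pending_status="CREATING", delay_seconds=60,
                      max_attempts=30, sleep=time.sleep):
    """Call describe() until its 'Status' leaves pending_status and return that status.

    Raises TimeoutError if the status is still pending after max_attempts calls.
    """
    for attempt in range(max_attempts):
        status = describe()["Status"]
        if status != pending_status:
            return status
        # Skip the final sleep, because no further call follows it.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"Status still {pending_status} after {max_attempts} attempts")


# Usage with the client and IDs from the example above:
# status = wait_while_status(
#     lambda: kendra.describe_data_source(Id=datasource_id, IndexId=IndexId))
```

Passing the describe call in as a function keeps the helper independent of any particular client, so the same loop can be exercised with a stub.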

Syncing the data source

Even though you defined a schedule for syncing the data source, you can sync on demand by using start_data_source_sync_job:

DSId = '<YOUR DATA SOURCE ID>'
IndexId = '<YOUR INDEX ID>'

try:
    ds_sync_response = kendra.start_data_source_sync_job(
        Id=DSId,
        IndexId=IndexId
    )
    # Print inside the try block so a failed call doesn't raise a NameError
    pprint.pprint(ds_sync_response)
except ClientError as e:
    print('%s' % e)

The response should look like the following:

{'ExecutionId': '6574acd6-e66f-4797-85cf-278dce9256b4',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 15:54:24 GMT',
                                      'x-amzn-requestid': '415547b2-d095-4501-b6ad-eba4b731d109'},
                      'HTTPStatusCode': 200,
                      'RequestId': '415547b2-d095-4501-b6ad-eba4b731d109',
                      'RetryAttempts': 0}}
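
The ExecutionId in this response identifies the sync job, so you can look it up later in the data source's sync history. The following sketch assumes a response shaped like the one returned by list_data_source_sync_jobs, with a History list whose entries carry ExecutionId, Status, and optionally ErrorMessage. The two helper names are hypothetical.

```python
def find_sync_job(history_response, execution_id):
    """Return the sync job entry with the given ExecutionId, or None if absent."""
    for job in history_response.get("History", []):
        if job.get("ExecutionId") == execution_id:
            return job
    return None


def summarize_sync_job(job):
    """Build a one-line summary from a sync job entry."""
    summary = f"{job['ExecutionId']}: {job['Status']}"
    if job.get("ErrorMessage"):
        summary += f" ({job['ErrorMessage']})"
    return summary


# Usage with the client and response from the example above:
# history = kendra.list_data_source_sync_jobs(Id=DSId, IndexId=IndexId)
# job = find_sync_job(history, ds_sync_response['ExecutionId'])
# if job is not None:
#     print(summarize_sync_job(job))
```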

Testing

Finally, you can query your index. See the following code:

response = kendra.query(
    IndexId='3311b507-bfef-4e2b-bde9-7c297b1fd13b',
    QueryText='Is there a service that has 11 9s of durability?'
)
if response['TotalNumberOfResults'] > 0:
    print(response['ResultItems'][0]['DocumentExcerpt']['Text'])
    print("More information: "+response['ResultItems'][0]['DocumentURI'])
else:
    print('No results found, please try a different search term.')

You get a result like the following:

Amazon S3 has a data durability of 11 nines. 
For transactional data storage, customers have the option to take advantage of the fully 
managed Amazon Relational Database Service (Amazon RDS) that supports Amazon 
Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server with high 
More information: https://juansdomain.sharepoint.com/sites/AWSWhitePapers/Shared%20Documents/real-time_communication_aws.pdf
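
The query example prints only the first result. If you want to show several, a small helper like the following sketch can pull the excerpt and URI from each item. It relies only on the ResultItems, DocumentExcerpt, and DocumentURI keys used in the query example, and the helper name and default limit are assumptions.

```python
def top_results(response, limit=3):
    """Return up to `limit` (excerpt, uri) tuples from a Kendra query response."""
    items = response.get("ResultItems", [])[:limit]
    return [
        (item.get("DocumentExcerpt", {}).get("Text", ""), item.get("DocumentURI", ""))
        for item in items
    ]


# Usage with the response from the query example:
# for excerpt, uri in top_results(response):
#     print(excerpt)
#     print("More information: " + uri)
```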

Common errors

Each of the errors noted in this section can occur if you’re using the Amazon Kendra console or the Amazon Kendra API.

You should look at the CloudWatch logs and error messages returned on the Amazon Kendra console or via the Amazon Kendra API. The CloudWatch logs help you determine the reason for a particular error, whether you are experiencing it using the console or programmatically.

Common errors when trying to access SharePoint Online as a data source are:

  • Secrets Manager errors
  • SharePoint credential errors
  • IAM role errors
  • URL errors

In the following sections, we provide more details on how to address each error.

Secrets Manager errors

You might get an error message from Secrets Manager stating that your role doesn’t have permissions to retrieve the secret value. This can occur when you create a new secret in Secrets Manager and you don’t add read permissions to the data source role.

Here’s an example of the error message:

Create a DataSource
('An error occurred (ValidationException) when calling the CreateDataSource '
 'operation: Secrets Manager throws the exception: User: '
 'arn:aws:sts::<YOUR ACCOUNT NUMBER>:assumed-role/Kendra-Datasource/DataSourceConfigurationValidator '
 'is not authorized to perform: secretsmanager:GetSecretValue on resource: '
 <YOUR SECRET ARN> '(Service: AWSSecretsManager; Status Code: 400; Error Code: '
 'AccessDeniedException; Request ID: 886ff6ac-f8f3-46b0-94dc-8286fd1682c1; '
 'Proxy: null)')

To address this, make sure that your role has a policy attached that grants the GetSecretValue permission on the secret.

If you’re troubleshooting on the console, complete the following steps:

  1. On the Secrets Manager console, copy the secret ARN.

The secret ARN is listed in the Secret details section.

  2. On the IAM console, choose Roles.
  3. Search for the role associated with Amazon Kendra.
  4. Choose the role that you assigned to the data source.
  5. Choose Add inline policy.
  6. For Select Service, choose Secrets Manager.
  7. On the visual editor, for Access Level, choose Read.
  8. Choose GetSecretValue.
  9. Under Resources, select Specific.
  10. Choose Add ARN.
  11. For Specify ARN for secret, enter the secret ARN you copied.
  12. Review and choose Create Policy.

You can now go back to your Amazon Kendra data source setup and finish the process.
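
If you work through the API instead of the console, the same fix can be sketched with the IAM put_role_policy call. The helper names and the default policy name below are hypothetical. The IAM client is passed in as a parameter, for example boto3.client('iam').

```python
import json


def secret_read_policy(secret_arn):
    """Build an inline policy that only allows reading the given secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


def attach_secret_read_policy(iam_client, role_name, secret_arn,
                              policy_name="KendraReadSharePointSecret"):
    """Attach the inline policy to the data source role (policy name is an assumption)."""
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(secret_read_policy(secret_arn)),
    )


# Usage, assuming the role name from this post and your own secret ARN:
# import boto3
# attach_secret_read_policy(boto3.client('iam'), 'Kendra-Datasource', SecretArn)
```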

SharePoint credential errors

Another common issue is a failure to crawl the site. In the sync details, the error message may report invalid URLs. To dive deeper into the issue, select the error message.

This takes you to the CloudWatch console, where you can enter a query on the latest logs and choose Run Query.

The results appear on the Logs tab.

You can see three records matching the logStream generated by the data source sync job.

For the first document, the error message is “The URLs specified in the data source configuration aren’t valid. The URLs should be either a SharePoint site or list. Check the URLs and try the request again.”

However, note that this is the last message generated. Let’s see what Document #2 shows us:

You may receive an invalid URL error for the data source configuration when the real cause is an underlying authentication problem.

The easiest way to address this issue is to generate new credentials for the Amazon Kendra crawler.

  1. To set up a user for the crawler to run, log in to your SharePoint Online configuration and open the Microsoft 365 Admin page.
  2. In the User management section, choose Add user.
  3. Fill in the form with the details for the crawler.

For this use case, you don’t need to assign a license for this user.

  4. Set it up as a user without admin center access.
  5. After you create the user, record the generated password because you need to modify it later.
  6. Go back to your site and choose the members icon on the top right of the screen.
  7. To add a member, choose Add members.
  8. Add the new user you just created and choose Save.
  9. From the drop-down menu under the new user’s name, choose Owner.

IAM role errors

Another common issue is caused by lack of permissions for the IAM role used to crawl your data source.

You can identify this issue in the CloudWatch logs. See the following log entry:

{
    "CrawlStatus": "ERROR",
    "ErrorCode": "InvalidRequest",
    "ErrorMessage": "Amazon Kendra can't run the BatchDeleteDocument action with the 
                     specified role. Make sure that the role grants 
                     the kendra:BatchDeleteDocument permission."
}

The permissions needed for this task are kendra:BatchPutDocument and kendra:BatchDeleteDocument.

Make sure that the resource matches your index ID (you can find your index ID on the index details page on the console).
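
As a rough way to check a policy before attaching it, the following sketch reports which of the two required actions a policy document does not grant on your index ARN. It compares strings exactly and does not evaluate wildcards or conditions, so treat it as a quick sanity check rather than a full IAM evaluation. The function names are hypothetical.

```python
REQUIRED_ACTIONS = ("kendra:BatchPutDocument", "kendra:BatchDeleteDocument")


def _as_list(value):
    """IAM allows either a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def missing_kendra_actions(policy_document, index_arn):
    """Return the required actions that no Allow statement grants on index_arn."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        # Exact match only: wildcard resources are not expanded here.
        if index_arn not in _as_list(statement.get("Resource", [])):
            continue
        granted.update(_as_list(statement.get("Action", [])))
    return [action for action in REQUIRED_ACTIONS if action not in granted]
```

An empty list means both actions are granted on that index ARN. Any action names returned are the ones to add to the role's policy.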

URL errors

You may experience an error stating you need to provide a sharepoint.com URL. Make sure your site URL is under sharepoint.com.
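
You can catch this before calling the API with a simple check on the host name. The following sketch assumes the URL includes its scheme (for example https://) and accepts any host that ends in .sharepoint.com. The function name is hypothetical.

```python
from urllib.parse import urlparse


def is_sharepoint_online_url(url):
    """Return True if the URL's host is a subdomain of sharepoint.com."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(".sharepoint.com")


print(is_sharepoint_online_url("https://juansdomain.sharepoint.com/sites/AWSWhitePapers"))
print(is_sharepoint_online_url("https://example.com/sites/AWSWhitePapers"))
```

Checking the parsed host rather than searching the whole string avoids accepting URLs that only mention sharepoint.com in their path or as a prefix of another domain.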

Conclusion

You have now learned how to ingest the documents from your SharePoint Online site into your Amazon Kendra index, either through the console or programmatically. In this example, you loaded some AWS whitepapers into your index. You can now run queries such as “What AWS service has 11 nines of durability?”

Finally, don’t forget to check the other blog posts about Amazon Kendra!


About the Author

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.

 


Announcing the express testing capability in Amazon Lex

Amazon Lex now provides the express testing capability on the AWS Management Console to expedite building your chatbot. You can start testing your bot soon after you initiate the build process without having to wait for the entire build to complete. You can use the new testing option to check the basic interaction elements such as the conversation flow, prompts, responses, and fulfillment logic.

Previously, you had to wait for the entire bot to build, which included multiple machine learning models, before you could confirm or test your changes. The new express testing feature allows you to confirm your changes with an intermediate model. You can start testing with exact input utterances right away and continue with more rigorous testing after the entire build is complete. This express testing capability enables you to test only exact matches with training data. With the new process, you can quickly iterate through the build and test phase, reducing the overall time required to deploy a bot in production.

How it works

After you choose Build, Amazon Lex starts preparing the build for express testing. You can view the status via the test window, as shown in the following screenshot.

Figure 1: Preparing build for express testing

For us-east-1, us-west-2, ap-southeast-2, or eu-west-1, scroll down to Advanced options and select Yes to opt in to the advanced features and enable the express testing capability. The feature is enabled by default in other Regions.

When the build completes for express testing, you can test utterances that are an exact match to the sample utterances in the test window. At this point you can test the dialog management, confirmation prompts, and the validation and fulfillment code hooks and responses. Amazon Lex completes the build process in the background.

Figure 2: Ready for express testing

You can test variations of the sample utterances after the bot build is complete. The build is then available for publishing to an alias, as shown in the following screenshot.

Figure 3: Complete build ready for deployment

When the build is successfully complete, the bot is ready for deployment, and you can publish the bot to enable complete conversations via interactive voice response (IVR), mobile apps, channels, and SDKs.

Conclusion

The new express testing capability for Amazon Lex allows you to accelerate iteration, test ideas, and make faster design decisions. The feature is available today in the N. Virginia, Oregon, Dublin, London, Sydney, Frankfurt, Tokyo, and Singapore Regions.

For more information, see the Amazon Lex Developer Guide.


About the Authors

Esther Lee is a Product Manager for AWS Language AI Services. She is passionate about the intersection of technology and education. Out of the office, Esther enjoys long walks along the beach, dinners with friends and friendly rounds of Mahjong.
