Enhancing your chatbot experience with web browsing

Chatbots are popping up everywhere. They are qualifying leads, assisting with sales, and automating customer service. However, conversational chatbot experiences have been limited to the space available within the chatbot window.

What if these web-based chatbots could provide an interactive experience that expanded beyond the chat window to include relevant web content based on user inputs? In a previous post we showed you how to deploy a web UI for your chatbot. In this post we will show you how to enhance that experience.

Here is an example of how we add an interactive web UI to the Order Flowers chatbot with the lex-web-ui customization.

Installing the chatbot UI

To install your chatbot, complete the following steps:

  1. Deploy the chatbot UI in your AWS account by launching the following AWS CloudFormation stack:
  2. Set EnableCognitoLogin to true in the parameters.
  3. To check if it’s working, on the AWS CloudFormation console, choose Stacks.
  4. Choose the stack you created.
  5. In the Outputs section, choose ParentPageURL.

You have now deployed the bot in the CloudFront distribution you created.

Installing the chatbot UI enhancer

After you install the chatbot UI, launch the following AWS CloudFormation stack:

There are two parameters for this stack:

  • BotName – The chatbot UI bot you deployed. The parameter value of WebUiOrderFlowers is populated by default.
  • lexwebuiStackName – The name of the stack you deployed in the previous step. The parameter value of lex-web-ui is populated by default.

When the stack is complete, find the URL for the new demo site on the Outputs tab on the AWS CloudFormation console.

Enhancing the existing bot with AWS Lambda

To enhance the existing bot with AWS Lambda, complete the following steps:

  1. On the Amazon Lex console, choose Bots.
  2. Choose the bot you created.
  3. In the Lambda initialization and validation section, for Lambda function, choose the function you created as part of the CloudFormation stack (enhanced-orderflowers-<stackname>).

For production workloads, you should publish a new version of the bot. Amazon Lex takes a snapshot copy of the $LATEST version to publish a new version. For more information, see Exercise 3: Publish a Version and Create an Alias.

Enhancing authentication

You have now set up the enhanced chatbot UI. For a production environment, it's recommended that you enable authentication. This post uses Amazon Cognito to add a social identity provider (Google) to your user pool. For instructions, see Adding Social Identity Providers to a User Pool.

This step allows your bot to display your Google calendar while you order your flowers. If you skip this step, the bot still functions normally.

Dynamically viewing content on your webpage

Having content appear and disappear on your website based on your interactions with the bot is a powerful feature.

For example, if you ask the bot to order flowers, the bot messaging interface and the webpage change. This example actively builds HTML on the fly with values that the bot sends back to the end-user.

Enhancing pages with external content to help with flower selection

When you ask the bot to buy roses, the result depends on whether you're in unauthenticated or authenticated mode.

In unauthenticated mode, the iframe changes from the default homepage to a Wikipedia page about roses. The Area chart also changes to a Roses Sold graph that shows the number of roses sold per year.

In authenticated (with Google) mode, the iframe changes to your Google calendar to help you schedule a delivery day. The Area chart still changes to the Roses Sold graph.

This powerful tool allows content from various parts of the website or the internet to appear as you interact with the bot. It also allows the bot to recognize whether you're authenticated and tailor your browsing experience accordingly.

Parent page, iframes, session attributes, and dynamic HTML tags

Four main components make up the Order Flowers bot page and determine how the various pieces interact with each other:

  • Parent page – This page houses all the various components, including the chatbot iframe, dynamically created HTML, and the navigation portal (an iframe that displays various HTML pages external and internal to the website).
  • Chatbot iframe – This is the chatbot UI that the end-user interacts with. The chatbot is loaded using a JavaScript snippet that mounts an iframe to the bottom right of the parent page and preloads it with an API to interact with the parent page.
  • Session attributes – These are arbitrary values that get sent back and forth from the chatbot UI backend to the parent page. You can manipulate these values in Lambda. On the parent page, the session attributes event data is made available in a variable called sessionAttributes.
  • Dynamic HTML <Div> tags – These appear on the top right of the page and display various charts based on the question asked. You can populate them with any data, not just charts. You manipulate the data by returning values through the session attributes fields. In the parent page, sessionAttributes.appContext houses this data (see the sketch after this list).
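
To make the round trip concrete, the following is a minimal sketch of an Amazon Lex (V1) fulfillment Lambda function that returns session attributes the parent page can read. The appContext payload shape here is illustrative, not the exact data the deployed solution sends.

import json

def lambda_handler(event, context):
    # Start from any attributes the chatbot UI already sent
    session_attributes = event.get('sessionAttributes') or {}

    # Data for the dynamic <div> on the parent page; the parent page reads
    # whatever JSON you place in sessionAttributes.appContext
    session_attributes['appContext'] = json.dumps({
        'altMessage': 'Roses Sold',
        'chartData': [120, 180, 240]
    })

    # Standard Amazon Lex (V1) fulfillment response
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'Close',
            'fulfillmentState': 'Fulfilled',
            'message': {
                'contentType': 'PlainText',
                'content': 'Your flowers are on the way!'
            }
        }
    }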

The following diagram illustrates the solution architecture.

Chatbot UI user login with Amazon Cognito

When you’re authenticated through the integrated Amazon Cognito feature, the chatbot UI attaches a signed token as a session attribute. The enhanced Order Flowers webpage uses the token to make additional user attributes available, including fields such as given name, family name, and email address. These fields help return personalized information (for example, addressing you by your name).
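
As an illustration only (the deployed Lambda does more, including verifying the token signature), the following sketch shows how a function could read those claims from the token session attribute. The attribute name idtokenjwt is an assumption and may differ in your deployment.

import base64
import json

def id_token_claims(session_attributes, attr_name='idtokenjwt'):
    # The attribute name is an assumption; check your lex-web-ui configuration.
    token = session_attributes.get(attr_name)
    if not token:
        return {}
    # Decode the JWT payload only; a production Lambda must also verify the
    # signature against the Amazon Cognito JWKS before trusting any claim.
    payload = token.split('.')[1]
    payload += '=' * (-len(payload) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return {key: claims.get(key) for key in ('given_name', 'family_name', 'email')}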

Limitations

There are certain limitations to displaying outside webpages and content through the chatbot UI parent page.

If cross-origin resource sharing (CORS) is enabled on the external content that is being pulled into the parent page iframe navigation portal, the browser blocks the content. Browsers don’t block different webpages from the same domain or external webpages that don’t have CORS enabled (for example, Wikipedia). For more information, see Cross-Origin Resource Sharing (CORS) on the MDN web docs website.

In most use cases, you should use the navigation portal to pull in content from your own domain, due to the inherent limitations of iframes and CORS.

Additional Resources

The concepts discussed in this blog post can also be used with the QnABot. The following README provides detailed instructions for setting up the solution.

Conclusion

This post demonstrates how to enhance the Order Flowers bot with a Lambda function that parses your JWT token and extracts the relevant information. If you are authenticated through Google, the bot extracts information like your name and email address, and displays your Google calendar to help you schedule your delivery date. The function also verifies that the JWT token signature is valid.

The chatbot UI in this post is based on the aws-lex-web-ui open-source project. For more information, see the GitHub repo.


About the Authors

Mohamed Khalil is a Consultant for AWS Professional Services. Bob Strahan is a Principal Consultant for AWS Professional Services. Bob Potterveld is a Senior Consultant for AWS Professional Services. They help our customers and partners on a variety of projects.

Read More

Processing PDF documents with a human loop using Amazon Textract and Amazon Augmented AI

Businesses across many industries, including financial, medical, legal, and real estate, process a large number of documents for different business operations. Healthcare and life science organizations, for example, need to access data within medical records and forms to fulfill medical claims and streamline administrative processes. Amazon Textract is a machine learning (ML) service that makes it easy to process documents at a large scale by automatically extracting text and data from virtually any type of document. For example, it can extract patient information from an insurance claim or values from a table in a scanned medical chart.

Depending on the business use case, you may want to have a human review of ML predictions. For example, extracting information from a scanned mortgage application or medical claim form might require human review of certain fields due to regulatory requirements or potentially low-quality scans. Amazon Augmented AI (Amazon A2I) allows you to build and manage such human review workflows. This allows human review of ML predictions when needed based on a confidence score threshold, and you can audit the predictions on an ongoing basis. For more information, see Using Amazon Textract with Amazon Augmented AI for processing critical documents.

In this post, we show how you can use Amazon Textract and Amazon A2I to build a workflow that enables multi-page PDF document processing with a human review loop.

Solution overview

The following diagram shows a serverless architecture for processing multi-page PDF documents with a human review loop. Although Amazon Textract can process images (PNG and JPG) and PDF documents, Amazon A2I human reviewers need individual pages as images, so the solution extracts each page and processes it individually using the AnalyzeDocument API of Amazon Textract.

To implement this architecture, we use AWS Step Functions to build the overall workflow. As the workflow starts, it extracts individual pages from the multi-page PDF document. It then uses the Map state to process multiple pages concurrently using the AnalyzeDocument API. When we call Amazon Textract, we also specify the Amazon A2I human review workflow as part of the request. This workflow is configured to trigger when form fields are detected below a certain confidence threshold. If triggered, Amazon Textract returns the extracted text and data along with the human loop activation details. When the human review is complete, the callback task token is used to resume the state machine, combine the pages' results, and store them in an output Amazon Simple Storage Service (Amazon S3) bucket.
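
For reference, the per-page Amazon Textract request made inside the Map state looks roughly like the following sketch; the bucket, key, and flow definition ARN are placeholders.

import uuid
import boto3

textract = boto3.client('textract')

def analyze_page(bucket, page_image_key, flow_definition_arn):
    # A human loop starts only when the activation conditions in the flow
    # definition are met (for example, low-confidence form keys).
    return textract.analyze_document(
        Document={'S3Object': {'Bucket': bucket, 'Name': page_image_key}},
        FeatureTypes=['FORMS'],
        HumanLoopConfig={
            'HumanLoopName': 'pdf-page-' + str(uuid.uuid4()),
            'FlowDefinitionArn': flow_definition_arn
        })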

For more information about the demo solution, see the GitHub repo.

Prerequisites

Before you get started, you must install the following prerequisites:

  1. Node.js
  2. Python
  3. AWS Command Line Interface (AWS CLI) – for instructions, see Installing the AWS CLI

Deploying the solution

The following steps deploy the reference implementation in your AWS account. The solution deploys different components, including an S3 bucket, a Step Function, an Amazon Simple Queue Service (Amazon SQS) queue, and AWS Lambda functions using the AWS Cloud Development Kit (AWS CDK), which is an open-source software development framework to model and provision your cloud application resources using familiar programming languages.

  1. Install AWS CDK:
    npm install -g aws-cdk

  2. Download the GitHub repo to your local machine:
    git clone https://github.com/aws-samples/amazon-textract-a2i-pdf

  3. Go to the folder multipagepdfa2i and enter the following:
    pip install -r requirements.txt

  4. Bootstrap AWS CDK:
    cdk bootstrap

  5. Deploy:
    cdk deploy

Creating a private work team

A work team is a group of people that you select to review your documents. You can create a work team from a workforce, which is made up of Amazon Mechanical Turk workers, vendor-managed workers, or your own private workers that you invite to work on your tasks. Whichever workforce type you choose, Amazon A2I takes care of sending tasks to workers. For this post, you create a work team using a private workforce and add yourself to the team to preview the Amazon A2I workflow.

To create and manage your private workforce, you can use the Labeling workforces page on the Amazon SageMaker console. On the console, you can create a private workforce by entering worker emails or importing a pre-existing workforce from an Amazon Cognito user pool.

If you already have a work team for Amazon SageMaker Ground Truth, you can use the same work team with Amazon A2I and skip to the following section.

To create your private work team, complete the following steps:

  1. On the Amazon SageMaker console, choose Labeling workforces.
  2. On the Private tab, choose Create private team.
  3. Choose Invite new workers by email.
  4. In the Email addresses box, enter the email addresses for your work team (for this post, enter your email address).

You can enter a list of up to 50 email addresses, separated by commas.

  5. Enter an organization name and contact email.
  6. Choose Create private team.

After you create the private team, you get an email invitation. The following screenshot shows an example email.

After you click the link and change your password, you are registered as a verified worker for this team. The following screenshot shows the updated information on the Private tab.

Your one-person team is now ready, and you can create a human review workflow.

Creating a human review workflow

You use a human review workflow to do the following:

  • Define the business conditions under which the Amazon Textract predictions of the document content go to a human for review. For example, you can set confidence thresholds for important words in the form that the model must meet. If inference confidence for that word (or form key) falls below your confidence threshold, the form and prediction go for human review.
  • Create instructions to help workers complete your document review task.
  1. On the Amazon SageMaker console, navigate to the Human review workflows page.
  2. Choose Create human review workflow.
  3. In the Workflow settings section, for Name, enter a unique workflow name.
  4. For S3 bucket, enter the S3 bucket that was created in the CDK deployment step. It should have a name like multipagepdfa2i-multipagepdf-xxxxxxxxx. This S3 bucket is where Amazon A2I stores the human review results.
  5. For IAM role, choose Create a new role from the drop-down menu. Amazon A2I can create a role automatically for you.
  6. For S3 buckets you specify, select Specific S3 buckets.
  7. Enter the S3 bucket you specified earlier in Step 3; for example, multipagepdfa2i-multipagepdf-xxxxxxxxx.
  8. Choose Create.

You see a confirmation when role creation is complete, and your role is now pre-populated in the IAM role drop-down menu.

  9. For Task type, select Amazon Textract – Key-value pair extraction.

Defining the trigger conditions

For this post, you want to trigger a human review if the key Mail Address is identified with a confidence score of less than 99% or not identified by Amazon Textract in the document. For all other keys, a human review starts if a key is identified with a confidence score less than 90%.

  1. Select Trigger a human review for specific form keys based on the form key confidence score or when specific form keys are missing.
  2. For Key name, enter Mail Address.
  3. Set the identification confidence threshold between 0 and 99.
  4. Set the qualification confidence threshold between 0 and 99.
  5. Select Trigger a human review for all form keys identified by Amazon Textract with confidence scores in a specific range.
  6. Set Identification confidence threshold between 0 and 90.
  7. Set Qualification confidence threshold between 0 and 90.

For model-monitoring purposes, you can also randomly send a specific percent of pages for human review. This is the third option on the Conditions for invoking human review page: Randomly send a sample of forms to humans for review. This post doesn’t include this condition.

Creating a UI template

In the next steps, you create a UI template that the worker sees for document review. Amazon A2I provides pre-built templates that workers use to identify key-value pairs in documents.

  1. In the Worker task template creation section, select Create from a default template.
  2. For Template name, enter a name.

When you use the default template, you can provide task-specific instructions to help the worker complete your task. For this post, you can enter instructions similar to the default instructions you see in the console.

  3. Under Task Description, enter something similar to Please review the Key Value Pairs in this document.
  4. Under Instructions, review the default instructions provided and make modifications as needed.
  5. In the Workers section, select Private.
  6. For Private teams, choose the work team you created earlier.
  7. Choose Create.

You’re redirected to the Human review workflows page and see a confirmation message similar to the following screenshot.

Record your new human review workflow ARN, which you use to configure your human loop in the next section.

Updating the solution with the Human Review workflow

You’re now ready to add your human review workflow ARN.

  1. Within the code you downloaded from the GitHub repo, open the file multipagepdfa2i/multipagepdfa2i_stack.py.

On line 23, you should see the following code:

SAGEMAKER_WORKFLOW_AUGMENTED_AI_ARN_EV = ""
  2. Within the quotes, enter the human review workflow ARN you copied at the end of the last section.

Line 23 should now look like the following code:

SAGEMAKER_WORKFLOW_AUGMENTED_AI_ARN_EV = "arn:aws:sagemaker: ...."
  3. Save the changes you made.
  4. Deploy by entering the following code:
    cdk deploy

Testing the workflow

To test your workflow, complete the following steps:

  1. Create a folder named uploads in the S3 bucket that was created by the CDK deployment (for example, multipagepdfa2i-multipagepdf-xxxxxxxxx).
  2. Upload the sample PDF document to the uploads folder; for example, uploads/Sampledoc.pdf.
  3. On the Amazon SageMaker console, choose Labeling workforces.
  4. On the Private tab, choose the link under Labeling portal sign-in URL.
  5. Sign in with the account you configured with Amazon Cognito.

If the document requires a human review, a job appears in the Jobs section.

  6. Select the job you want to complete and choose Start working.

In the reviewer UI, you see instructions and the first document to work on. You can use the toolbox to zoom in and out, fit the image, and reposition the document. See the following screenshot.

This UI is specifically designed for document-processing tasks. On the right side of the preceding screenshot, the key-value pairs are automatically pre-filled with the Amazon Textract response. As a worker, you can quickly refer to this sidebar to make sure the key-values are identified correctly (which is the case for this post).

When you select any field on the right, a corresponding bounding box appears, which highlights its location on the document. See the following screenshot.

In the following screenshot, Amazon Textract didn’t identify Mail Address. The human review workflow identified this as an important field. Even though Amazon Textract didn’t identify it, the worker task UI asks you to enter the details on the right side.

There may be a series of pages you need to submit, based on the Amazon Textract confidence score ranges you configured. When you finish reviewing them, continue with the following steps.

  7. When you complete the human review, go to the S3 bucket you used earlier (for example, multipagepdfa2i-multipagepdf-xxxxxxxxx).
  8. In the complete folder, choose the folder that has the name of the input document (for example, uploads-Sampledoc.pdf-b5d54fdb75b143ee99f7524de56626a3).

That folder contains output.csv, which contains all your key-value pairs.

The following screenshot shows the content of an example output.csv file.
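
If you prefer to inspect the results programmatically instead of downloading the file, a minimal sketch like the following reads the same output.csv with boto3; the bucket and key values are placeholders, and the column layout is whatever the demo code writes.

import csv
import io
import boto3

s3 = boto3.client('s3')

def print_key_value_pairs(bucket, key):
    # key example: 'complete/uploads-Sampledoc.pdf-<id>/output.csv'
    body = s3.get_object(Bucket=bucket, Key=key)['Body'].read().decode('utf-8')
    for row in csv.reader(io.StringIO(body)):
        print(row)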

Conclusion

In this post, we showed you how to use Amazon Textract and Amazon A2I to automatically extract data from scanned multi-page PDF documents, and the human review of the pages for given business criteria. For more information about Amazon Textract and Amazon A2I, see Using Amazon Augmented AI with Amazon Textract.

For video presentations, sample Jupyter notebooks, or more information about use cases like document processing, content moderation, sentiment analysis, text translation, and more, see Amazon Augmented AI Resources.


About the Authors

Nicholas Nelson is an AWS Solutions Architect for Strategic Accounts based out of Seattle, Washington. His interests and experience include Computer Vision, Serverless Technology, and Construction Technology. Outside of work, you can find Nicholas out cycling, paddle boarding, or grilling!

Kashif Imran is a Principal Solutions Architect at Amazon Web Services. He works with some of the largest AWS customers who are taking advantage of AI/ML to solve complex business problems. He provides technical guidance and design advice to implement computer vision applications at scale. His expertise spans application architecture, serverless, containers, NoSQL and machine learning.

Anuj Gupta is Senior Product Manager for Amazon Augmented AI. He focuses on delivering products that make it easier for customers to adopt machine learning. In his spare time, he enjoys road trips and watching Formula 1.

Read More

Setting up human review of your NLP-based entity recognition models with Amazon SageMaker Ground Truth, Amazon Comprehend, and Amazon A2I

Organizations across industries have a lot of unstructured data that you can evaluate to get entity-based insights. You may also want to add your own entity types unique to your business, like proprietary part codes or industry-specific terms. To create a natural language processing (NLP)-based model, you need to label this data based on your specific entities.

Amazon SageMaker Ground Truth makes it easy to build highly accurate training datasets for machine learning (ML), and Amazon Comprehend lets you train a model without worrying about selecting the right algorithms and parameters for model training. Amazon Augmented AI (Amazon A2I) lets you audit, review, and augment these predicted results.

In this post, we cover how to build a labeled dataset of custom entities using the Ground Truth named entity recognition (NER) labeling feature, train a custom entity recognizer using Amazon Comprehend, and review the predictions below a certain confidence threshold from Amazon Comprehend using human reviewers with Amazon A2I.

We walk you through the following steps using this Amazon SageMaker Jupyter notebook:

  1. Preprocess your input documents.
  2. Create a Ground Truth NER labeling Job.
  3. Train an Amazon Comprehend custom entity recognizer model.
  4. Set up a human review loop for low-confidence detection using Amazon A2I.

Prerequisites

Before you get started, complete the following steps to set up the Jupyter notebook:

  1. Create a notebook instance in Amazon SageMaker.

Make sure your Amazon SageMaker notebook has the necessary AWS Identity and Access Management (IAM) roles and permissions mentioned in the prerequisite section of the notebook.

  2. When the notebook is active, choose Open Jupyter.
  3. On the Jupyter dashboard, choose New, and choose Terminal.
  4. In the terminal, enter the following code:
    cd SageMaker
    git clone https://github.com/aws-samples/augmentedai-comprehendner-groundtruth

  5. Open the notebook by choosing SageMakerGT-ComprehendNER-A2I-Notebook.ipynb in the root folder.

You’re now ready to run the following steps through the notebook cells.

Preprocessing your input documents

For this use case, you’re reviewing chat messages or service tickets, and you want to know if they’re related to an AWS offering. We use the NER labeling feature in Ground Truth to label SERVICE and VERSION entities in the input messages. We then train an Amazon Comprehend custom entity recognizer to recognize the entities from text like tweets or ticket comments.

The sample dataset is provided at data/rawinput/aws-service-offerings.txt in the GitHub repo. The following screenshot shows an example of the content.

You preprocess this file to generate the following:

  • inputs.csv – You use this file to generate the input manifest file for Ground Truth NER labeling.
  • train.csv and test.csv – You use these files as input for training custom entities. You can find these files in the Amazon Simple Storage Service (Amazon S3) bucket.

Refer to Steps 1a and 1b in the notebook for dataset generation.

Creating a Ground Truth NER labeling job

The purpose is to annotate and label sentences within the input document as belonging to a custom entity that we define. In this section, you complete the following steps:

  1. Create the manifest file that Ground Truth needs.
  2. Set up a labeling workforce.
  3. Create your labeling job.
  4. Start your labeling job and verify its output.

Creating a manifest file

We use the inputs.csv file generated during prepossessing to create a manifest file that the NER labeling feature needs. We generate a manifest file named prefix+-text-input.manifest, which you use for data labeling while creating a Ground Truth job. See the following code:

# Create and upload the input manifest by appending a source tag to each of the lines in the input text file. 
# Ground Truth uses the manifest file to determine labeling tasks

manifest_name = prefix + '-text-input.manifest'
# remove existing file with the same name to avoid duplicate entries
!rm *.manifest
s3bucket = s3res.Bucket(BUCKET)

with open(manifest_name, 'w') as f:
    for fn in s3bucket.objects.filter(Prefix=prefix +'/input/'):
        fn_obj = s3res.Object(BUCKET, fn.key)
        for line in fn_obj.get()['Body'].read().splitlines():                
            f.write('{"source":"' + line.decode('utf-8') +'"}\n')
f.close()
s3.upload_file(manifest_name, BUCKET, prefix + "/manifest/" + manifest_name)

The NER labeling job requires its input manifest in the {"source": "embedded text"} format. The following screenshot shows the input.manifest file generated from inputs.csv.

Creating a private labeling workforce

With Ground Truth, we use a private workforce to create a labeled dataset.

You create your private workforce on the Amazon SageMaker console. For instructions, see the section Creating a private work team in Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend.

Alternatively, follow the steps in the notebook.

For this walkthrough, we use the same private workforce to label and augment low-confidence data using Amazon A2I after custom entity training.

Creating a labeling job

The next step is to create the NER labeling job. This post highlights the key steps. For more information, see Adding a data labeling workflow for named entity recognition with Amazon SageMaker Ground Truth.

  1. On the Amazon SageMaker console, under Ground Truth, choose Labeling jobs.
  2. Choose Create labeling job.
  3. For Job name, enter a job name.
  4. For Input dataset location, enter the Amazon S3 location of the input manifest file you created (s3://bucket/path-to-your-manifest.json).
  5. For Output Dataset Location, enter an S3 bucket with an output prefix (for example, s3://bucket-name/output).
  6. For IAM role, choose Create a new Role.
  7. Select Any S3 Bucket.
  8. Choose Create.
  9. For Task category, choose Text.
  10. Select Named entity recognition.
  11. Choose Next.
  12. For Worker type, select Private.
  13. In Private Teams, select the team you created.
  14. In the Named Entity Recognition Labeling Tool section, for Enter a brief description of the task, enter Highlight the word or group of words and select the corresponding most appropriate label from the right.
  15. In the Instructions box, enter Your labeling will be used to train an ML model for predictions. Please think carefully on the most appropriate label for the word selection. Remember to label at least 200 annotations per label type.
  16. Choose Bold Italics.
  17. In the Labels section, enter the label names you want to display to your workforce.
  18. Choose Create.

Starting your labeling job

Your workforce (or you, if you chose yourself as your workforce) received an email with login instructions.

  1. Choose the URL provided and enter your user name and password.

You are directed to the labeling task UI.

  1. Complete the labeling task by choosing labels for groups of words.
  2. Choose Submit.
  3. After you label all the entries, the UI automatically exits.
  4. To check your job’s status, on the Amazon SageMaker console, under Ground Truth, choose Labeling jobs.
  5. Wait until the job status shows as Complete.

Verifying annotation outputs

To verify your annotation outputs, open your S3 bucket and locate <S3 Bucket Name>/output/<labeling-job-name>/manifests/output/output.manifest. You can review the manifest file that Ground Truth created. The following screenshot shows an example of the entries you see.

Training a custom entity model

We now use the annotated dataset (the output.manifest file that Ground Truth created) to train a custom entity recognizer. This section walks you through the steps in the notebook.

Processing the annotated dataset

You can provide labels for Amazon Comprehend custom entities through an entity list or annotations. In this post, we use annotations generated using Ground Truth labeling jobs. You need to convert the annotated output.manifest file to the following CSV format:

File, Line, Begin Offset, End Offset, Type
documents.txt, 0, 0, 11, VERSION

Run the following code in the notebook to generate the annotations.csv file:

# Read the output manifest json and convert into a csv format as expected by Amazon Comprehend Custom Entity Recognizer
import json
import csv

# this will be the file that will be written by the format conversion code block below
csvout = 'annotations.csv'

with open(csvout, 'w', encoding="utf-8") as nf:
    csv_writer = csv.writer(nf)
    csv_writer.writerow(["File", "Line", "Begin Offset", "End Offset", "Type"])
    with open("data/groundtruth/output.manifest", "r") as fr:
        for num, line in enumerate(fr.readlines()):
            lj = json.loads(line)
            #print(str(lj))
            if lj and labeling_job_name in lj:
                for ent in lj[labeling_job_name]['annotations']['entities']:
                    csv_writer.writerow([fntrain,num,ent['startOffset'],ent['endOffset'],ent['label'].upper()])
    fr.close()
nf.close()        

s3_annot_key = "output/" + labeling_job_name + "/comprehend/" + csvout

upload_to_s3(s3_annot_key, csvout)

The following screenshot shows the contents of the file.

Setting up a custom entity recognizer

This post uses the API, but you can optionally create the recognizer and batch analysis job on the Amazon Comprehend console. For instructions, see Build a custom entity recognizer using Amazon Comprehend.

  1. Enter the following code. For s3_train_channel, use the train.csv file you generated in the preprocessing step to train the recognizer. For s3_annot_channel, use annotations.csv as labels to train your custom entity recognizer.
    custom_entity_request = {
    
          "Documents": { 
             "S3Uri": s3_train_channel
          },
          "Annotations": { 
             "S3Uri": s3_annot_channel
          },
          "EntityTypes": [
                    {
                        "Type": "SERVICE"
                    },
                    {
                        "Type": "VERSION"
                    }
          ]
    }

  2. Create the entity recognizer using CreateEntityRecognizer. The entity recognizer is trained with the minimum required number of training samples to generate some low-confidence predictions required for our Amazon A2I workflow. See the following code:
    import datetime
    
    id = str(datetime.datetime.now().strftime("%s"))
    create_custom_entity_response = comprehend.create_entity_recognizer(
            RecognizerName = prefix + "-CER", 
            DataAccessRoleArn = role,
            InputDataConfig = custom_entity_request,
            LanguageCode = "en"
    )
    

    When the entity recognizer job is complete, it creates a recognizer with a performance score. As mentioned earlier, we trained the entity recognizer with a minimum number of training samples to generate the low-confidence predictions we need to trigger the Amazon A2I human loop. You can find these metrics on the Amazon Comprehend console. See the following screenshot.

  3. Create a batch entity detection analysis job to detect entities over a large number of documents.

Use the Amazon Comprehend StartEntitiesDetectionJob operation to detect custom entities in your documents. For instructions on creating an endpoint for real-time analysis using your custom entity recognizer, see Announcing the launch of Amazon Comprehend custom entity recognition real-time endpoints.

To use the EntityRecognizerArn for custom entity recognition, you must provide access to the recognizer to detect the custom entity. This ARN is supplied by the response to the CreateEntityRecognizer operation.

  4. Run the custom entity detection job to get predictions on the test dataset you created during the preprocessing step by running the following cell in the notebook:
    s3_test_channel = 's3://{}/{}'.format(BUCKET, s3_test_key)
    s3_output_test_data = 's3://{}/{}'.format(BUCKET, "output/testresults/")

    test_response = comprehend.start_entities_detection_job(
        InputDataConfig={
            'S3Uri': s3_test_channel,
            'InputFormat': 'ONE_DOC_PER_LINE'
        },
        OutputDataConfig={
            'S3Uri': s3_output_test_data
        },
        DataAccessRoleArn=role,
        JobName='a2i-comprehend-gt-blog',
        EntityRecognizerArn=jobArn,
        LanguageCode='en')
    

    The following screenshot shows the test results.

Setting up a human review loop

In this section, you set up a human review loop for low-confidence detections in Amazon A2I. It includes the following steps:

  1. Choose your workforce.
  2. Create a human task UI.
  3. Create a worker task template creator function.
  4. Create the flow definition.
  5. Check the human loop status and wait for reviewers to complete the task.

Choosing your workforce

For this post, we use the private workforce we created for the Ground Truth labeling jobs. Use the workforce ARN to set up the workforce for Amazon A2I.

Creating a human task UI

Create a human task UI resource with a UI template in liquid HTML. This template is used whenever a human loop is required.

The following example code is compatible with Amazon Comprehend entity detection:

template = """
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<style>
    .highlight {
        background-color: yellow;
    }
</style>

<crowd-entity-annotation
        name="crowd-entity-annotation"
        header="Highlight parts of the text below"
        labels="[{'label': 'service', 'fullDisplayName': 'Service'}, {'label': 'version', 'fullDisplayName': 'Version'}]"
        text="{{ task.input.originalText }}"
>
    <full-instructions header="Named entity recognition instructions">
        <ol>
            <li><strong>Read</strong> the text carefully.</li>
            <li><strong>Highlight</strong> words, phrases, or sections of the text.</li>
            <li><strong>Choose</strong> the label that best matches what you have highlighted.</li>
            <li>To <strong>change</strong> a label, choose highlighted text and select a new label.</li>
            <li>To <strong>remove</strong> a label from highlighted text, choose the X next to the abbreviated label name on the highlighted text.</li>
            <li>You can select all of a previously highlighted text, but not a portion of it.</li>
        </ol>
    </full-instructions>

    <short-instructions>
        Select the word or words in the displayed text corresponding to the entity, label it and click submit
    </short-instructions>

    <div id="recognizedEntities" style="margin-top: 20px">
                <h3>Label the Entity below in the text above</h3>
                <p>{{ task.input.entities }}</p>
    </div>
</crowd-entity-annotation>

<script>

    function highlight(text) {
        var inputText = document.getElementById("inputText");
        var innerHTML = inputText.innerHTML;
        var index = innerHTML.indexOf(text);
        if (index >= 0) {
            innerHTML = innerHTML.substring(0,index) + "<span class='highlight'>" + innerHTML.substring(index,index+text.length) + "</span>" + innerHTML.substring(index + text.length);
            inputText.innerHTML = innerHTML;
        }
    }

    document.addEventListener('all-crowd-elements-ready', () => {
        document
            .querySelector('crowd-entity-annotation')
            .shadowRoot
            .querySelector('crowd-form')
            .form
            .appendChild(recognizedEntities);
    });
</script>
"""

Creating a worker task template creator function

This function is a higher-level abstraction on the Amazon SageMaker package’s method to create the worker task template, which we use to create a human review workflow. See the following code:

def create_task_ui():
    '''
    Creates a Human Task UI resource.

    Returns:
    struct: HumanTaskUiArn
    '''
    response = sagemaker.create_human_task_ui(
        HumanTaskUiName=taskUIName,
        UiTemplate={'Content': template})
    return response
# Task UI name - this value is unique per account and region. You can also provide your own value here.
taskUIName = prefix + '-ui' 

# Create task UI
humanTaskUiResponse = create_task_ui()
humanTaskUiArn = humanTaskUiResponse['HumanTaskUiArn']
print(humanTaskUiArn)

Creating the flow definition

Flow definitions allow you to specify the following:

  • The workforce that your tasks are sent to
  • The instructions that your workforce receives

This post uses the API, but you can optionally create this workflow definition on the Amazon A2I console.

For more information, see Create a Flow Definition.
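
The notebook creates the flow definition through the SageMaker API. A minimal sketch of that call follows; the flow definition name and output path are placeholders, and WORKTEAM_ARN, humanTaskUiArn, role, and BUCKET are assumed to be defined earlier in the notebook.

import boto3

sagemaker_client = boto3.client('sagemaker')

create_workflow_definition_response = sagemaker_client.create_flow_definition(
    FlowDefinitionName='fd-comprehend-ner-demo',   # placeholder name
    RoleArn=role,
    HumanLoopConfig={
        'WorkteamArn': WORKTEAM_ARN,
        'HumanTaskUiArn': humanTaskUiArn,
        'TaskCount': 1,
        'TaskDescription': 'Review and correct the entities detected in the text',
        'TaskTitle': 'Custom entity review'
    },
    OutputConfig={'S3OutputPath': 's3://' + BUCKET + '/a2i-results'})

# Used below when starting human loops
flowDefinitionArn = create_workflow_definition_response['FlowDefinitionArn']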

To set up the condition to trigger the human loop review, enter the following code (you can change the value of the CONFIDENCE_SCORE_THRESHOLD based on what confidence level you want to trigger the human review):

human_loops_started = []

import json
import uuid

CONFIDENCE_SCORE_THRESHOLD = 90
for line in data:
    print("Line is: " + str(line))
    begin_offset=line['BEGIN_OFFSET']
    end_offset=line['END_OFFSET']
    if(line['CONFIDENCE_SCORE'] < CONFIDENCE_SCORE_THRESHOLD):
        humanLoopName = str(uuid.uuid4())
        human_loop_input = {}
        human_loop_input['labels'] = line['ENTITY']
        human_loop_input['entities']= line['ENTITY']
        human_loop_input['originalText'] = line['ORIGINAL_TEXT']
        start_loop_response = a2i_runtime_client.start_human_loop(
        HumanLoopName=humanLoopName,
        FlowDefinitionArn=flowDefinitionArn,
        HumanLoopInput={
                "InputContent": json.dumps(human_loop_input)
            }
        )
        print(human_loop_input)
        human_loops_started.append(humanLoopName)
        print(f'Score is less than the threshold of {CONFIDENCE_SCORE_THRESHOLD}')
        print(f'Starting human loop with name: {humanLoopName} \n')
    else:
        print('No human loop created. \n')

Checking the human loop status and waiting for reviewers to complete the task

To check the status of the human loops and wait for reviewers to complete the tasks, enter the following code:

completed_human_loops = []
for human_loop_name in human_loops_started:
    resp = a2i_runtime_client.describe_human_loop(HumanLoopName=human_loop_name)
    print(f'HumanLoop Name: {human_loop_name}')
    print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
    print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
    print('\n')
    
    if resp["HumanLoopStatus"] == "Completed":
        completed_human_loops.append(resp)

Navigate to the private workforce portal that’s provided as the output of cell 2 from the previous step in the notebook. See the following code:

workteamName = WORKTEAM_ARN[WORKTEAM_ARN.rfind('/') + 1:]
print("Navigate to the private worker portal and do the tasks. Make sure you've invited yourself to your workteam!")
print('https://' + sagemaker.describe_workteam(WorkteamName=workteamName)['Workteam']['SubDomain'])

The UI template is similar to the Ground Truth NER labeling feature. Amazon A2I displays the entity identified from the input text (this is a low-confidence prediction). The human worker can then update or validate the entity labeling as required and choose Submit.

This action generates an updated annotation with offsets and entities as highlighted by the human reviewer.
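
To pull those reviewer answers back into the notebook, a sketch like the following reads the output JSON that each completed human loop writes to Amazon S3 (completed_human_loops is the list built in the status check above).

import json
import boto3

s3_client = boto3.client('s3')

for loop in completed_human_loops:
    output_uri = loop['HumanLoopOutput']['OutputS3Uri']
    bucket, key = output_uri.replace('s3://', '').split('/', 1)
    output = json.loads(
        s3_client.get_object(Bucket=bucket, Key=key)['Body'].read())
    # humanAnswers holds the entities as highlighted by the reviewer
    print(json.dumps(output.get('humanAnswers', []), indent=2))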

Cleaning up

To avoid incurring future charges, stop and delete resources such as the Amazon SageMaker notebook instance, Amazon Comprehend custom entity recognizer, and the model artifacts in Amazon S3 when not in use.

Conclusion

This post demonstrated how to create annotations for an Amazon Comprehend custom entity recognizer using the Ground Truth NER labeling feature. We used Amazon A2I to augment the low-confidence predictions from Amazon Comprehend.

You can use the annotations that Amazon A2I generated to update the annotations file you created and incrementally train the custom recognizer to improve the model’s accuracy.

For video presentations, sample Jupyter notebooks, or more information about use cases like document processing, content moderation, sentiment analysis, text translation, and more, see Amazon Augmented AI Resources. We’re interested in how you want to extend this solution for your use case and welcome your feedback.


About the Authors

Mona Mona is an AI/ML Specialist Solutions Architect based out of Arlington, VA. She works with World Wide Public Sector team and helps customers adopt machine learning on a large scale. She is passionate about NLP and ML Explainability areas in AI/ML.

Prem Ranga is an Enterprise Solutions Architect based out of Houston, Texas. He is part of the Machine Learning Technical Field Community and loves working with customers on their ML and AI journey. Prem is passionate about robotics, is an Autonomous Vehicles researcher, and also built the Alexa-controlled Beer Pours in Houston and other locations.

Read More

Extracting custom entities from documents with Amazon Textract and Amazon Comprehend

Amazon Textract is a machine learning (ML) service that makes it easy to extract text and data from scanned documents. Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms and information stored in tables. This allows you to use Amazon Textract to instantly “read” virtually any type of document and accurately extract text and data without needing any manual effort or custom code.

Amazon Textract has multiple applications in a variety of fields. For example, talent management companies can use Amazon Textract to automate the process of extracting a candidate’s skill set. Healthcare organizations can extract patient information from documents to fulfill medical claims.

When your organization processes a variety of documents, you sometimes need to extract entities from unstructured text in the documents. A contract document, for example, can have paragraphs of text where names and other contract terms are listed in the paragraph of text instead of as a key/value or form structure. Amazon Comprehend is a natural language processing (NLP) service that can extract key phrases, places, names, organizations, events, sentiment from unstructured text, and more. With custom entity recognition, you can identify new entity types not supported as one of the preset generic entity types. This allows you to extract business-specific entities to address your needs.

In this post, we show how to extract custom entities from scanned documents using Amazon Textract and Amazon Comprehend.

Use case overview

For this post, we process resume documents from the Resume Entities for NER dataset to get insights such as candidates’ skills by automating this workflow. We use Amazon Textract to extract text from these resumes and Amazon Comprehend custom entity recognition to detect skills such as AWS, C, and C++ as custom entities. The following screenshot shows a sample input document.

The following screenshot shows the corresponding output generated using Amazon Textract and Amazon Comprehend.

Solution overview

The following diagram shows a serverless architecture that processes incoming documents for custom entity extraction using Amazon Textract and custom model trained using Amazon Comprehend. As documents are uploaded to an Amazon Simple Storage Service (Amazon S3) bucket, it triggers an AWS Lambda function. The function calls the Amazon Textract DetectDocumentText API to extract the text and calls Amazon Comprehend with the extracted text to detect custom entities.
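
As a rough sketch of that inference path (not the exact Lambda code from the repo), the function could look like the following. This version calls a real-time custom endpoint, which is one way to invoke the custom model; the walkthrough later in this post uses a batch analysis job instead, and the endpoint ARN is a placeholder.

import boto3

textract = boto3.client('textract')
comprehend = boto3.client('comprehend')

# Placeholder ARN of a real-time endpoint created from the trained custom model
CUSTOM_ENTITY_ENDPOINT_ARN = 'arn:aws:comprehend:region:account:entity-recognizer-endpoint/name'

def lambda_handler(event, context):
    # Triggered by an S3 upload event
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = record['object']['key']

    # Synchronous OCR (assumes a single-page image document)
    ocr = textract.detect_document_text(
        Document={'S3Object': {'Bucket': bucket, 'Name': key}})
    text = ' '.join(b['Text'] for b in ocr['Blocks'] if b['BlockType'] == 'LINE')

    # Custom entity detection against the real-time endpoint
    result = comprehend.detect_entities(
        Text=text[:5000],  # keep well under the real-time text size limit
        EndpointArn=CUSTOM_ENTITY_ENDPOINT_ARN)
    return result['Entities']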

The solution consists of two parts:

  1. Training:
    1. Extract text from PDF documents using Amazon Textract
    2. Label the resulting data using Amazon SageMaker Ground Truth
    3. Train custom entity recognition using Amazon Comprehend with the labeled data
  2. Inference:
    1. Send the document to Amazon Textract for data extraction
    2. Send the extracted data to the Amazon Comprehend custom model for entity extraction

Launching your AWS CloudFormation stack

For this post, we use an AWS CloudFormation stack to deploy the solution and create the resources it needs. These resources include an S3 bucket, an Amazon SageMaker instance, and the necessary AWS Identity and Access Management (IAM) roles. For more information about stacks, see Walkthrough: Updating a stack.

  1. Download the following CloudFormation template and save to your local disk.
  2. Sign in to the AWS Management Console with your IAM user name and password.
  3. On the AWS CloudFormation console, choose Create Stack.

Alternatively, you can choose Launch Stack directly.

  4. On the Create Stack page, choose Upload a template file and upload the CloudFormation template you downloaded.
  5. Choose Next.
  6. On the next page, enter a name for the stack.
  7. Leave everything else at its default setting.
  8. On the Review page, select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
  9. Choose Create stack.
  10. Wait for the stack to finish running.

You can examine various events from the stack creation process on the Events tab. After the stack creation is complete, look at the Resources tab to see all the resources the template created.

  11. On the Outputs tab of the CloudFormation stack, record the Amazon SageMaker instance URL.

Running the workflow on a Jupyter notebook

To run your workflow, complete the following steps:

  1. Open the Amazon SageMaker instance URL that you saved from the previous step.
  2. Under the New drop-down menu, choose Terminal.
  3. In the terminal, clone the GitHub repo by entering cd SageMaker followed by git clone and the repo URL.

You can check the folder structure (see the following screenshot).

  4. Open Textract_Comprehend_Custom_Entity_Recognition.ipynb.
  5. Run the cells.

Code walkthrough

Upload the documents to your S3 bucket.

The PDFs are now ready for Amazon Textract to perform OCR. Start the process with a StartDocumentTextDetection asynchronous API call.
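
A sketch of that asynchronous call looks like the following, with simple polling in place of the SNS notification a production pipeline would use; the bucket and key are placeholders, and pagination of the results via NextToken is omitted for brevity.

import time
import boto3

textract = boto3.client('textract')

def ocr_pdf(bucket, key):
    # Start the asynchronous text detection job for one PDF
    job_id = textract.start_document_text_detection(
        DocumentLocation={'S3Object': {'Bucket': bucket, 'Name': key}})['JobId']

    # Poll until the job finishes
    while True:
        result = textract.get_document_text_detection(JobId=job_id)
        if result['JobStatus'] in ('SUCCEEDED', 'FAILED'):
            break
        time.sleep(5)

    lines = [block['Text'] for block in result.get('Blocks', [])
             if block['BlockType'] == 'LINE']
    return '\n'.join(lines)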

For this post, we process two resumes in PDF format for demonstration, but you can process all 220 if needed. The results have all been processed and are ready for you to use.

Because we need to train a custom entity recognition model with Amazon Comprehend (as with any ML model), we need training data. In this post, we use Ground Truth to label our entities. By default, Amazon Comprehend can recognize entities like person, title, and organization. For more information, see Detect Entities. To demonstrate custom entity recognition capability, we focus on candidate skills as entities inside these resumes. We have the labeled data from Ground Truth; the data is available in the GitHub repo (see entity_list.csv). For instructions on labeling your data, see Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend.

Now we have our raw and labeled data and are ready to train our model. To start the process, use the create_entity_recognizer API call. When the training job is submitted, you can see the recognizer being trained on the Amazon Comprehend console.

During training, Amazon Comprehend sets aside some data for testing. When the recognizer is trained, you can see the performance of each entity and of the recognizer overall.

We have prepared a small sample of text to test out the newly trained custom entity recognizer. We run the same step to perform OCR, then upload the Amazon Textract output to Amazon S3 and start a custom recognizer job.

When the job is submitted, you can see the progress on the Amazon Comprehend console under Analysis Jobs.

When the analysis job is complete, you can download the output and see the results. For this post, we converted the JSON result into table format for readability.

Conclusion

ML and artificial intelligence allow organizations to be agile. They can automate manual tasks to improve efficiency. In this post, we demonstrated an end-to-end architecture for extracting entities, such as a candidate’s skills from their resume, by using Amazon Textract and Amazon Comprehend. This post showed you how to use Amazon Textract to do data extraction and use Amazon Comprehend to train a custom entity recognizer from your own dataset and recognize custom entities. You can apply this process to a variety of industries, such as healthcare and financial services.

To learn more about different text and data extraction features of Amazon Textract, see How Amazon Textract Works.


About the Authors

Yuan Jiang is a Solution Architect with a focus on machine learning. He is a member of the Amazon Computer Vision Hero program.

Sonali Sahu is a Solution Architect and a member of Amazon Machine Learning Technical Field Community. She is also a member of the Amazon Computer Vision Hero program.

Kashif Imran is a Principal Solution Architect and the leader of Amazon Computer Vision Hero program.

Read More

Increasing engagement with personalized online sports content

This is a guest post by Mark Wood at Pulselive. In their own words, “Pulselive, based out of the UK, is the proud digital partner to some of the biggest names in sports.”


At Pulselive, we create experiences sports fans can’t live without; whether that’s the official Cricket World Cup website or the English Premier League’s iOS and Android apps.

One of the key things our customers measure us on is fan engagement with digital content such as videos. But until recently, the videos each fan saw were based on a most recently published list, which wasn’t personalized.

Sports organizations are trying to understand who their fans are and what they want. The wealth of digital behavioral data that can be collected for each fan tells a story of how unique they are and how they engage with our content. Based on the increase of available data and the increasing presence of machine learning (ML), Pulselive was asked by customers to provide tailored content recommendations.

In this post, we share our experience of adding Amazon Personalize to our platform as our new recommendation engine and how we increased video consumption by 20%.

Implementing Amazon Personalize

Before we could start, Pulselive had two main challenges: we didn’t have any data scientists on staff and we needed to find a solution that our engineers with minimal ML experience would understand and would still produce measurable results. We considered using external companies to assist (expensive), using tools such as Amazon SageMaker (still quite the learning curve), or Amazon Personalize.

We ultimately chose to use Amazon Personalize for several reasons:

  1. The barrier to entry was low, both technically and financially.
  2. We could quickly conduct an A/B test to demonstrate the value of a recommendation engine.
  3. We could create a simple proof of concept (PoC) with minimal disruption to the existing site.
  4. We were more concerned about the impact and improving the results than having a clear understanding of what was going on under the hood of Amazon Personalize.

Like any other business, we couldn’t afford to have an adverse impact on our daily operations, but still needed the confidence that the solution would work for our environment. Therefore, we started out with A/B testing in a PoC that we could spin up and execute in a matter of days.

Working with the Amazon Prototyping team, we narrowed down a range of options for our first integration to one that would require minimal changes to the website and be easily A/B tested. After examining all locations where a user is presented with a list of videos, we decided that re-ranking the list of videos to watch next would be the quickest way to implement personalized content. For this prototype, we used an AWS Lambda function and Amazon API Gateway to provide a new API that would intercept the request for more videos and re-rank them using the Amazon Personalize GetPersonalizedRanking API.
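
For illustration (this is a sketch, not our production code), the Lambda function behind the new API can call GetPersonalizedRanking along these lines; the event shape and campaign ARN are placeholders.

import boto3

personalize_runtime = boto3.client('personalize-runtime')

# Placeholder ARN of a campaign built with the Personalized-Ranking recipe
CAMPAIGN_ARN = 'arn:aws:personalize:region:account:campaign/video-reranking'

def lambda_handler(event, context):
    user_id = event['userId']
    candidate_video_ids = event['videoIds']  # the default "watch next" list

    response = personalize_runtime.get_personalized_ranking(
        campaignArn=CAMPAIGN_ARN,
        userId=user_id,
        inputList=candidate_video_ids)

    # The same videos, reordered for this user
    return [item['itemId'] for item in response['personalizedRanking']]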

To be considered successful, the experiment needed to demonstrate that statistically significant improvements had been made to either total video views or completion percentage. To make this possible, we needed to test across a sufficiently long enough period of time to make sure that we covered days with multiple sporting events and quieter days with no matches. We hoped to eliminate any behavior that would be dependent on the time of day or whether a match had recently been played by testing across different usage patterns. We set a time frame of 2 weeks to gather initial data. All users were part of the experiment and randomly assigned to either the control group or the test group. To keep the experiment as simple as possible, all videos were part of the experiment. The following diagram illustrates the architecture of our solution.

To get started, we needed to build an Amazon Personalize solution that provided us with the starting point for the experiment. Amazon Personalize requires a user-item interactions dataset to be able to define a solution and create a campaign to recommend videos to a user. We satisfied these requirements by creating a CSV file that contains a timestamp, user ID, and video ID for each video view across several weeks of usage. Uploading the interaction history to Amazon Personalize was a simple process, and we could immediately test the recommendations on the AWS Management Console. To train the model, we used a dataset of 30,000 recent interactions.

To compare metrics for total videos viewed and video completion percentage, we built a second API to record all video interactions in Amazon DynamoDB. This second API solved the problem of telling Amazon Personalize about new interactions via the PutEvents API, which helped keep the ML model up to date.
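
As an illustration of that second API (again a sketch, not the exact production code), a handler along these lines could forward each view to Amazon Personalize through PutEvents; the tracking ID comes from an event tracker in the dataset group, and all names are placeholders.

import time
import boto3

personalize_events = boto3.client('personalize-events')

def record_video_view(tracking_id, user_id, session_id, video_id):
    # tracking_id comes from the event tracker attached to the dataset group
    personalize_events.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[{
            'eventType': 'video_view',
            'itemId': video_id,
            'sentAt': int(time.time())
        }])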

We tracked all video views and what prompted video views for all users in the experiment. Video prompts included direct linking (for example, from social media), linking from another part of the website, and linking from a list of videos. Each time a user viewed a video page, they were presented with the current list of videos or the new re-ranked list, depending on whether they were in the control or test group. We started our experiment with 5% of total users in the test group. When our approach showed no problems (no obvious drop in video consumption or increase in API errors), we increased this to 50%, with the remaining users acting as the control group, and started to collect data.

Learning from our experiment

After two weeks of A/B testing, we pulled the KPIs we collected from DynamoDB and compared the two variants we tested across several KPIs. We opted to use a few simple KPIs for this initial experiment, but other organizations’ KPIs may vary.

Our first KPI was the number of video views per user per session. Our initial hypothesis was that we wouldn’t see a meaningful change, given that we were re-ranking a list of videos; however, we measured a 20% increase in views per user. The following graph summarizes our video views for each group.

In addition to measuring total view count, we wanted to make sure that users were watching videos in full. We tracked this by sending an event for each 25% of the video a user viewed. For each video, we found that the average completion percentage didn’t change very much based on whether the video was recommended by Amazon Personalize or by the original list view. In combination with the number of videos viewed, we concluded that overall viewing time had increased for each user when presented with a personalized list of recommended videos.

We also tracked the position of each video in users’ “recommended video” bar and which item they selected. This allowed us to compare the ranking of a personalized list vs. a publication ordered list. We found that this didn’t make much difference between the two variants, which suggested that our users would most likely select a video that was visible on their screen rather than scrolling to see the entire list.

After we analyzed the results of the experiment, we presented them to the customer with the recommendation that we enable Amazon Personalize as the default method of ranking videos in the future.

Lessons learned

We learned the following lessons on our journey, which may help you when implementing your own solution:

  1. Gather your historical data of user-item interactions; we used about 30,000 interactions.
  2. Focus on recent historical data. Although your first instinct may be to gather as much historical data as possible, recent interactions are more valuable than older ones. If you have a very large dataset of historical interactions, you can filter out older interactions to reduce the dataset size and training time.
  3. Make sure you can give all users a consistent and unique ID, either by using your SSO solution or by generating session IDs.
  4. Find a spot in your site or app where you can run an A/B test either re-ranking an existing list or displaying a list of recommended items.
  5. Update your API to call Amazon Personalize and fetch the new list of items (see the sketch after this list).
  6. Deploy the A/B test and gradually increase the percentage of users in the experiment.
  7. Instrument and measure so that you can understand the outcome of your experiment.
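
As a minimal sketch of what that API change might look like (the campaign ARN, user ID, and item IDs are placeholders, not our production values), re-ranking an existing list uses the Amazon Personalize GetPersonalizedRanking API:

import boto3

personalize_runtime = boto3.client('personalize-runtime')

# Re-rank an existing, publication-ordered list of video IDs for one user
response = personalize_runtime.get_personalized_ranking(
    campaignArn='arn:aws:personalize:<region>:<account>:campaign/<campaign-name>',  # placeholder
    userId='user-123',                            # placeholder user ID
    inputList=['video-1', 'video-2', 'video-3'],  # the list to re-rank
)

ranked_video_ids = [item['itemId'] for item in response['personalizedRanking']]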

Conclusion and future steps

We were thrilled by our first foray into the world of ML with Amazon Personalize. We found the entire process of integrating a trained model into our workflow incredibly simple, and we spent far more time making sure that we had the right KPIs and data capture to prove the usefulness of the experiment than we did implementing Amazon Personalize.

In the future, we will be developing the following enhancements:

  1. Integrating Amazon Personalize throughout our workflow much more frequently by providing our development teams the opportunity to use Amazon Personalize everywhere a list of content is provided.
  2. Expanding the use cases beyond re-ranking to include recommended items. This should allow us to surface older items that are likely to be more popular with each user.
  3. Experimenting with how often the model should be retrained. Inserting new interactions into the model in real time is a great way to keep things fresh, but the model still needs daily retraining to be most effective.
  4. Exploring options for how we can use Amazon Personalize with all of our customers to help improve fan engagement by recommending the most relevant content in all forms.
  5. Using recommendation filters to expand the range of parameters available for each request. We will soon be targeting additional options such as filtering to include videos of your favorite players.

About the Author

Mark Wood is the Product Solutions Director at Pulselive. Mark has been at Pulselive for over 6 years and has held both Technical Director and Software Engineer roles during his tenure with the company. Prior to Pulselive, Mark was a Senior Engineer at Roke and a Developer at Querix. Mark is a graduate of the University of Southampton with a degree in Mathematics with Computer Science.


Deploying custom models built with Gluon and Apache MXNet on Amazon SageMaker

When you build models with the Apache MXNet deep learning framework, you can take advantage of the expansive model zoo provided by GluonCV to quickly train state-of-the-art computer vision algorithms for image and video processing. A typical development environment for training consists of a Jupyter notebook hosted on a compute instance configured by the operating data scientist. To make sure this environment is replicated during use in production, the environment is wrapped inside a Docker container, which is launched and scaled according to the expected load. Hosting the deep learning model is a challenge that generally involves knowledge of server hosting, cluster management, web API protocols, and network security.

In this post, we demonstrate how Amazon SageMaker supports these libraries and how their integration simplifies the deployment of complex algorithms without having to build expertise in web app infrastructure. Whether inference constraints require real-time predictions with low latency, or irregularly-timed batch jobs with a large number of samples, optimal hosting solutions are available and easy to build.

With Amazon SageMaker, most of the undifferentiated heavy lifting is already done. There is no need to build a container image from scratch or set up a REST API. Instead, you only need to specify various model functions to process inference data in a manner consistent with the training pipeline. You can follow this post with an end-to-end example, in which we train an object detection model using open-source Apache tools.

Creating a notebook instance

You can run the example code we provide in this post. We recommend running it on an Amazon SageMaker notebook instance of type ml.p3.2xlarge or larger to accelerate training time. To create a notebook instance, complete the following steps:

  1. On the Amazon SageMaker console, choose Notebook instances.
  2. Choose Create notebook instance.
  3. Enter the name of your notebook instance, such as mxnet-gluon-deployment.
  4. Set the instance type to ml.p3.2xlarge.
  5. Choose Additional configuration.
  6. Set the volume size to 20 GB.
  7. Choose Create notebook instance.
  8. When the instance is ready, choose Open in JupyterLab.
  9. From the launcher, you can open a terminal and run the provided code.

Generating the model

For this use case, you build an object detection model using a pretrained Faster R-CNN architecture from the GluonCV model zoo on the Pascal VOC dataset. The first step is to obtain the data, which you can do by running the data preparation script pascal_voc.py for use with GluonCV. The script downloads 8.4 GB of annotated images to ~/.mxnet/datasets/voc/. With the dataset in place, run the training script train_faster_rcnn.py from this GluonCV example.

Model parameters are saved after each epoch, with the best performing model indicated by the suffix _best.params.
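
As a quick, illustrative aside, the same Faster R-CNN architecture can be loaded directly from the GluonCV model zoo; this is also how the inference code later in this post constructs the network before loading the trained _best.params weights:

import gluoncv as gcv

# Same architecture the entry point script builds before loading our trained weights;
# passing pretrained=True instead would download weights already trained on Pascal VOC.
net = gcv.model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained_base=False)
print(net.classes)  # the 20 Pascal VOC object classes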

Preparing the inference container image

To make sure that the compute environment for the inference instance is set according to our needs, run the model within a Docker container that specifies the required configuration. Containers provide a portable, efficient, standalone package of software for flexible deployment. In most cases, using the default MXNet inference container image in Amazon SageMaker is sufficient for hosting Apache MXNet models. However, we built a computer vision model using GluonCV, which isn’t included in the default image. You can now modify the MXNet inference container image to include GluonCV, which you use for deployment.

The following steps require Docker, which is already included on Amazon SageMaker notebook instances. First, clone the Amazon SageMaker MXNet serving container GitHub repository:

git clone https://github.com/aws/sagemaker-mxnet-serving-container.git
cd sagemaker-mxnet-serving-container

Included in the repo is a Dockerfile that serves our configuration with MXNet 1.6.0, GluonCV 0.6.0, and Python 3.6.8. You can verify the software versions in ./docker/1.6.0/py3/Dockerfile.gpu:

...
ARG MX_URL=https://aws-mxnet-pypi.s3-us-west-2.amazonaws.com/1.6.0/aws_mxnet_cu101mkl-1.6.0-py2.py3-none-manylinux1_x86_64.whl
...
RUN ${PIP} install --no-cache-dir \
    ${MX_URL} \
    git+git://github.com/dmlc/gluon-nlp.git@v0.9.0 \
    gluoncv==0.6.0 \
    mxnet-model-server==$MMS_VERSION \
    keras-mxnet==2.2.4.1 \
    numpy==1.17.4 \
    onnx==1.4.1 \
    "sagemaker-mxnet-inference<2"
...

There is no need to edit this file for this post, but you can add additional packages to the preceding code as needed.

Now you build the container image. Before executing the docker build command, copy the necessary artifacts to the ./docker/1.6.0/py3 directory. In the following example code, we use gluoncv-mxnet-serving:1.6.0-gpu-py3 as the name and the tag. Note the . at the end of the last command:

cp -r docker/artifacts/* docker/1.6.0/py3
cd docker/1.6.0/py3
docker build -t gluoncv-mxnet-serving:1.6.0-gpu-py3 -f Dockerfile.gpu .

To test that the container was built successfully, you can run it locally. In the following code, replace <docker image id> and <container id> with the output from the commands docker images and docker ps:

# find docker image id
$ docker images
REPOSITORY                                            TAG                               IMAGE ID            CREATED             SIZE
gluoncv-mxnet-serving                                 1.6.0-gpu-py3                     0012f8ebdcab        24 hours ago        6.56GB
nvidia/cuda                                           10.1-cudnn7-runtime-ubuntu16.04   e11e11484e2e        3 months ago        1.71GB

# start the docker container
$ docker run <docker image id>

In a separate terminal, access the shell of the running container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
af357bce0c53        0012f8ebdcab        "python /usr/local/b…"   7 hours ago         Up 7 hours          8080-8081/tcp       musing_napier

# access shell of the running docker
$ docker exec -it <container id> /bin/bash

To escape the terminals and tear down the resources, enter exit in the shell accessing the container and press CTRL+C in the terminal running the container.

Now you’re ready to upload the new MXNet inference container image to Amazon Elastic Container Registry (Amazon ECR) so you can point to this container image when you deploy the model on Amazon SageMaker. For more information, see Pushing an image.

You first authenticate Docker to the Amazon ECR registry with get-login. Assuming the AWS Command Line Interface (AWS CLI) version is prior to 1.17.0, enter the following code to get the authenticated docker login command:

$ aws ecr get-login --region <AWS Region> --no-include-email

For instructions on using AWS CLI version 1.17.0 or higher, see Using an Authorization Token.

Copy the output of the command, then paste and run it to authenticate your Docker installation with Amazon ECR. Replace <AWS Region> with the appropriate Region; for example, to use the US East (N. Virginia) Region, use us-east-1.

Create a repository in Amazon ECR using the AWS CLI by running aws ecr create-repository. For this use case, use gluoncv for <repository name>:

$ aws ecr create-repository --repository-name <repository name> --region <AWS Region>

Before pushing the local image to Amazon ECR, tag it with the name of the target repository. Retrieve the image ID with the docker images command, then apply the tag with the docker tag command and the repository URI, which you can also find on the Amazon ECR console. See the following code:

$ docker images
REPOSITORY                                            TAG                               IMAGE ID            CREATED             SIZE
gluoncv-mxnet-serving                                 1.6.0-gpu-py3                     cb0a03065295        7 minutes ago       4.09GB
nvidia/cuda                                           10.1-cudnn7-runtime-ubuntu16.04   e11e11484e2e        3 months ago        1.71GB

$ docker tag <image id> <AWS account ID>.dkr.ecr.<AWS Region>.amazonaws.com/<repository name>

$ docker images
REPOSITORY                                             TAG                               IMAGE ID            CREATED             SIZE
<AWS account id>.dkr.ecr.<AWS Region>.amazonaws.com/gluoncv   latest                            cb0a03065295        9 minutes ago       4.09GB
gluoncv-mxnet-serving                                  1.6.0-gpu-py3                     cb0a03065295        9 minutes ago       4.09GB
nvidia/cuda                                            10.1-cudnn7-runtime-ubuntu16.04   e11e11484e2e        3 months ago        1.71GB

To push the image to the Amazon ECR repository so that it’s available for hosting on Amazon SageMaker endpoints, use the docker push command. You can confirm that the image is successfully pushed using the aws ecr list-images AWS CLI command:

$ docker push <AWS account ID>.dkr.ecr.<AWS Region>.amazonaws.com/<repository name>

$ aws ecr list-images --repository-name gluoncv
{
    "imageIds": [
        {
            "imageDigest": "sha256:66bc1759a4d2e94daff4dd02446024a11c5af29d9259175f11701a0b9ee2d2d1",
            "imageTag": "latest"
        }
    ]
}

Alternatively, you can verify the image exists in the repository by checking on the Amazon ECR console.

When deploying the model, use this image URI as the image argument. Alternatively, you can build and push the image programmatically from a Jupyter notebook with the following code:

import boto3

account_id = boto3.client('sts').get_caller_identity().get('Account')
region = boto3.session.Session().region_name
ecr_repository = 'mxnet-gluoncv'
tag = ':latest'
image_uri = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account_id, region, ecr_repository + tag)

# Create ECR repository and push docker image
!docker build -t $ecr_repository -f ./docker/Dockerfile.gpu ./docker -q
!$(aws ecr get-login --region $region --registry-ids $account_id --no-include-email)
!aws ecr create-repository --repository-name $ecr_repository
!docker tag {ecr_repository + tag} $image_uri
!docker push $image_uri

Deploying the model

You can optimize compute resources according to the inference requirements of your use case. If you collect batches of data intermittently and don’t need predictions immediately, you can spin up a compute instance only when necessary, run a batch job over the accumulated data, store the predictions, and tear down the instance.

Alternatively, you may require that calls for inference be answered immediately. In this case, spin up a compute instance for real-time inference at an endpoint that consumes data over an API call and returns the model output. You only pay for the time the compute instance is running. We provide details for both use cases in this section.

Prepare the model artifacts by compressing them into a tarball and uploading it to Amazon S3, from which the deployed model is read. Because you’re using an architecture that already exists in the GluonCV model zoo, you only need to upload the weights. The .params file from the previous step should ultimately live in s3://<bucket_name>/<prefix>/model.tar.gz. You execute deployment via the Amazon SageMaker SDK. See the following code:

import sagemaker
from sagemaker.mxnet import MXNetModel
model = MXNetModel(
    entry_point='./source_directory/entrypoint.py',
    model_data='s3://{}/{}/{}'.format(bucket_name, s3_prefix, tar_file_name),
    framework_version='1.6.0',
    py_version='py3',
    source_dir='./source_directory/',
    image='<AWS account id>.dkr.ecr.<AWS Region>.amazonaws.com/<repository name>:latest',
    role=sagemaker.get_execution_role()
)
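
In the preceding code, bucket_name, s3_prefix, and tar_file_name are assumed to be defined already; a minimal sketch of packaging and uploading the trained weights (the names here are illustrative) looks like this:

import tarfile

import sagemaker

# Illustrative names; use your own prefix and tarball name
tar_file_name = 'model.tar.gz'
s3_prefix = 'gluoncv-faster-rcnn'

# Package only the trained weights; the architecture is rebuilt in entrypoint.py
with tarfile.open(tar_file_name, 'w:gz') as tar:
    tar.add('faster_rcnn_resnet50_v1b_voc_best.params')

sess = sagemaker.Session()
bucket_name = sess.default_bucket()
sess.upload_data(tar_file_name, bucket=bucket_name, key_prefix=s3_prefix)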

The image argument is the URI of the container image you uploaded to the Amazon ECR repository in the preceding section. Make sure that the Amazon ECR repository and the Amazon SageMaker model are in the same Region. Most of the processing, inference, and configuration resides in the following entrypoint.py script, which defines the model and the steps necessary to decode the payload so that the MXNet backend properly interprets the data:

entrypoint.py

## import packages ##
import base64
import json
import mxnet as mx
from mxnet import gpu
import numpy as np
import sys
import gluoncv as gcv
from gluoncv import data as gdata


## SageMaker loading function ##
def model_fn(model_dir):
    """
    Load the pretrained model 
    
    Args:
        model_dir (str): directory where model artifacts are saved/loaded
    """
    model = gcv.model_zoo.get_model('faster_rcnn_resnet50_v1b_voc',  pretrained_base=False)
    ctx = mx.gpu(0)
    model.load_parameters(f'{model_dir}/faster_rcnn_resnet50_v1b_voc_best.params', ctx, ignore_extra=True)
    print('Loaded gluoncv model')
    return model, ctx


## SageMaker inference function ##
def transform_fn(net, data, input_content_type, output_content_type):

    ## retrieve the model and context from the first parameter, net
    model, ctx = net

    ## decode image ##
    # for endpoint API calls
    if type(data) == str:
        parsed = json.loads(data)
        img = mx.nd.array(parsed)
    # for batch transform jobs
    else:
        img = mx.img.imdecode(data)
        
        
    ## preprocess ##
    
    # normalization values taken from gluoncv
    # https://gluon-cv.mxnet.io/_modules/gluoncv/data/transforms/presets/rcnn.html
    mean = (0.485, 0.456, 0.406)
    std = (0.229, 0.224, 0.225)
    img = gdata.transforms.image.imresize(img, 800, 600)
    img = mx.nd.image.to_tensor(img)
    img = mx.nd.image.normalize(img, mean=mean, std=std)
    nda = img.expand_dims(0)  
    nda = nda.copyto(ctx)
    
    
    ## inference ##
    cid, score, bbox = model(nda)
    
    # predictions to lists
    cid = cid.asnumpy().tolist()
    score = score.asnumpy().tolist()
    bbox = bbox.asnumpy().tolist()
    
    # format predictions 
    response = []
    for x,y,z in zip(cid[0], score[0], bbox[0]):
        if x[0] == -1.0:
            continue
        response.append([x[0], y[0], z[0]/800, z[1]/600, z[2]/800, z[3]/600])
        
    predictions = {'prediction':response}
    predictionslist = [predictions]
    
    return predictionslist

After you import the supporting libraries for model inference and data processing, define the model in model_fn() by loading the Faster R-CNN architecture and the trained weights you uploaded to Amazon S3. The file name passed to load_parameters() must match the name of the parameters file that you trained and uploaded to Amazon S3 earlier in the tarball. For this use case, the parameters are stored in faster_rcnn_resnet50_v1b_voc_best.params. To use the GPU, you must explicitly set the context when loading the parameters.

Instructions to run predictions over the model are written in transform_fn(). You can call inference from a living endpoint API or launch it on schedule for batch jobs. The corresponding data type sent to the model varies between these two options. When sent for a real-time prediction over the endpoint API, the transform function receives a string that you can load and interpret according to its underlying data type. Batch transform jobs, on the other hand, send the data directly as a serialized image, which you need to decode with MXNet utilities. You can handle both cases by checking the type of the data object.

The loaded data is normalized according to the default preprocessing steps that GluonCV implements, as enforced in the normalize() function in the entry point script. Lastly, the data is passed through the neural network for inference with the output formatted such that the return payload includes the predicted class ID, confidence of the bounding box, and bounding box attributes.

With all the setup in place, you’re now ready to deploy. See the following code:

predictor = model.deploy(initial_instance_count=1, instance_type='ml.p3.2xlarge')

Testing

With the deployed endpoint up and running, you can make a real-time inference with the returned object from the preceding step. After loading an image into a NumPy array, fire it off for inference:

## inference via endpoint API
import os

import imageio
import numpy as np

home_path = os.path.expanduser('~')
test_image = home_path + '/.mxnet/datasets/voc/VOC2012/JPEGImages/2010_001453.jpg'

# load the image as a numpy array
test_image_data = np.asarray(imageio.imread(test_image))

# Serializes data and makes a prediction request to the SageMaker endpoint
endpoint_response = predictor.predict(test_image_data)

To visualize the output, draw from the metadata included in the response. See the following code:

## visualize on a test image
import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt

img = mpimg.imread(test_image)
fig, ax = plt.subplots(1, dpi=120)
ax.imshow(img)
for box in endpoint_response[0]['prediction']:
    class_id, confidence, xmin, ymin, xmax, ymax = box
    xmin = xmin*img.shape[1]
    xmax = xmax*img.shape[1]
    ymin = ymin*img.shape[0]
    ymax = ymax*img.shape[0]
    if confidence > 0.9:
        height = ymax-ymin
        width = xmax-xmin
        rect = patches.Rectangle(
            (xmin,ymin), width, height, linewidth=1, edgecolor='yellow', facecolor='none')
        ax.add_patch(rect)
ax.axis('off')
plt.show()

After 20 epochs of training, you can see bounding boxes that accurately identify various objects in the model response. See the following screenshot.

The purpose of maintaining an endpoint is to keep a model available for real-time predictions at any time. If your inference jobs are scheduled in advance, it’s unnecessary to pay for an endpoint instance that runs continuously. For this use case, you send a list of images for prediction to a batch transform job, which spins up a compute instance to run the model and tears it down upon completion. You only pay for the runtime of the instance, which saves costs on downtime. Set up and launch a batch transform job by uploading images to Amazon S3 and defining the data and model paths, along with a few other settings, in a dictionary. See the following code:

## inference via batch transform
import time

import boto3

s3_client = boto3.client('s3')

# upload a sample of images to Amazon S3
test_images = ['/.mxnet/datasets/voc/VOC2012/JPEGImages/2010_003939.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2008_004205.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2009_001139.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2010_001453.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2011_000148.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2011_005806.jpg',
               '/.mxnet/datasets/voc/VOC2012/JPEGImages/2012_004299.jpg']

s3_test_prefix = 'test_images'
for test_image in test_images:
    test_image = home_path + test_image
    s3_client.upload_file(test_image, bucket_name, s3_test_prefix+'/'+test_image.split('/')[-1])

model_name = predictor.endpoint
timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
batch_job_name = "test-batch-job" + timestamp
request = {
    "TransformJobName": batch_job_name,
    "ModelName": model_name,
    "MaxConcurrentTransforms": 1,
    "MaxPayloadInMB": 6,
    "BatchStrategy": "SingleRecord",
    "TransformOutput": {
        "S3OutputPath": 's3://{}/test/{}/'.format(bucket_name, batch_job_name)
    },
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri":'s3://{}/test_images/'.format(bucket_name)
            }
        },
        "ContentType": "application/x-image",
        "SplitType": "None",
        "CompressionType": "None"
    },
    "TransformResources": {
            "InstanceType": "ml.p3.2xlarge",
            "InstanceCount": 1
    }
}

## launch batch transform job
sm_client = boto3.client('sagemaker')

sm_client.create_transform_job(**request)

print("Created Transform job with name: ", batch_job_name)

while(True):
    batch_response = sm_client.describe_transform_job(TransformJobName=batch_job_name)
    status = batch_response['TransformJobStatus']
    if status == 'Completed':
        print("Transform job ended with status: " + status)
        break
    if status == 'Failed':
        message = batch_response['FailureReason']
        print('Transform failed with the following error: {}'.format(message))
        raise Exception('Transform job failed') 
    time.sleep(30)

You can verify the output of the batch transform job by comparing the output of the real-time inference, endpoint_response, to the output from the batch transform job, which was saved to s3://<bucket_name>/test/<batch_job_name>/2010_001453.jpg.out as specified in the S3OutputPath parameter.
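
A hedged sketch of that comparison (reusing the s3_client, bucket_name, batch_job_name, and endpoint_response variables from the preceding code, and assuming the response is serialized as JSON by default) could look like the following:

import json

# Download one batch output file and compare it to the real-time response for the
# same image; the key follows the S3OutputPath set in the transform request above.
s3_client.download_file(
    bucket_name,
    'test/{}/2010_001453.jpg.out'.format(batch_job_name),
    'batch_output.json')

with open('batch_output.json') as f:
    batch_predictions = json.load(f)

print(batch_predictions[0]['prediction'][:3])   # first few boxes from the batch job
print(endpoint_response[0]['prediction'][:3])   # first few boxes from the endpoint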

Cleaning up

To finish up this walkthrough, tear down the endpoint instance and remove the Amazon SageMaker model. For more information about additional helper methods, see Using Estimators. Delete the Amazon ECR repository and its images through the Amazon ECR client. See the following code:

# tear down the SageMaker endpoint and endpoint configuration
predictor.delete_endpoint()

# delete the SageMaker model
predictor.delete_model()
    
# delete the Amazon ECR repository and its images
ecr_client = boto3.client('ecr')
ecr_client.delete_repository(repositoryName='gluoncv', force=True)

Conclusion

Although training models is a data scientist’s primary objective, the deployment process is equally crucial. Amazon SageMaker offers efficient methods to put these algorithms into production. Built-in algorithms can accelerate the training process, but you may need custom modeling for your use case. When building a model with MXNet, you must specify the configuration and processing steps necessary to run it in production. In this post, we outlined the steps to load our model into Amazon SageMaker and run inference for real-time predictions and in batch jobs.


About the Authors

Hussain Karimi is a data scientist at the Machine Learning Solutions Lab, where he works with customers across various verticals to initiate and build automated, algorithmic models that generate business value.

Will Gleave is a Machine Learning Consultant with the NatSec team at AWS Professional Services. In his spare time, he enjoys reading, watching sports, and traveling.

Muhyun Kim is a data scientist at the Amazon Machine Learning Solutions Lab. He solves customers’ various business problems by applying machine learning and deep learning, and also helps them get skilled.
