Building a custom Angular application for labeling jobs with Amazon SageMaker Ground Truth
As a data scientist attempting to solve a problem using supervised learning, you usually need a high-quality labeled dataset before starting your model building. Amazon SageMaker Ground Truth makes building datasets for a diverse range of tasks, such as text classification and object detection, easier and more accessible to everyone.
Ground Truth also helps you build datasets for custom user-defined tasks that let you annotate anything. This capability is powered by the following:
- Custom AWS Lambda functions that can be triggered between labeling steps. This allows you to have custom logic pre-labeling like filtering examples or augmenting them with metadata using other services like Amazon Translate or Amazon Rekognition, and post-labeling logic for label consolidation or quality control.
- Custom web templates that let you build unique user interfaces using HTML and JavaScript that integrate perfectly with Ground Truth workflows. These templates are easy to build with Crowd HTML Elements, which are a set of common UI elements used for text, video, and audio labeling jobs that you can arrange like blocks in your custom template.
- Availability of a large set of skilled and specialized workforces in the AWS Marketplace and in Amazon Mechanical Turk if you need to augment your private teams of subject matter experts. Vetted partners in the AWS Marketplace cover numerous languages as well as specific skills in video and image annotations that fit different industry needs (like medical labeling).
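As a rough illustration of the first point, a pre-labeling Lambda function can enrich each data object before it reaches the worker UI. The sketch below is an assumption-laden example, not Ground Truth's exact contract for every job type: translate_text is a hypothetical stand-in for a call to Amazon Translate, and the taskInput keys (source, translatedDesc) mirror the custom template used later in this post.

```python
def translate_text(text, target_language="en"):
    # Placeholder for a call to Amazon Translate (e.g., boto3's
    # translate.translate_text); stubbed so this sketch is self-contained.
    return text


def pre_labeling_handler(event, context):
    """Runs before each task is sent to a worker.

    Ground Truth passes the raw data object in the event; we return the
    fields the custom template reads as task.input.*.
    """
    source = event["dataObject"]["source"]
    return {
        "taskInput": {
            "source": source,
            "translatedDesc": translate_text(source),
        }
    }
```

A post-labeling function would follow the same pattern in reverse, consolidating the workers' answers for each data object before Ground Truth writes the output manifest.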
For complex labeling tasks, such as complex taxonomy classification, extreme multi-class classifications, or autonomous driving labeling tasks, you may need to build a more complex front-end application for your labeling workforce. Front-end frameworks like Angular are helpful in these cases because they bring useful design patterns like model-view-controller (MVC), which makes your codebase more robust and maintainable for a larger team composed of UX/UI designers and software developers.
This post walks you through using Angular and Angular Elements to create fully customizable solutions that work nicely with Ground Truth. This walkthrough assumes that you’re familiar with running a custom labeling job with Ground Truth and Crowd HTML Elements. For more information, see Build a custom data labeling workflow with Amazon SageMaker Ground Truth.
The approach described in this post also works with Amazon Augmented AI (Amazon A2I), which makes it easy to build the workflows required for human review of machine learning predictions. This is possible because Amazon A2I uses Crowd HTML Elements to create custom worker templates. For more information, see Create Custom Worker Templates.
Building a custom UI for complex taxonomy classification
If you manage large supply chains and interact with different types of suppliers, like global food restaurants or automotive manufacturers, you likely receive invoices in different formats and languages. To keep track of your operations and drive financial efficiencies, you need teams behind the scenes to map invoices and receipts to large categories of products and organize them in hierarchical taxonomies.
The following diagram illustrates a hierarchical taxonomy of computer components.
The following diagram illustrates a hierarchical taxonomy of types of food.
Hierarchical taxonomies can have thousands of categories at their leaf level. Such examples can include web directories (the Yahoo! Directory or the Open Directory Project), library classification schemes (Dewey Decimal or Library of Congress), or the classification schemes used in natural science, legal, or medical applications.
What if a natural language processing (NLP) model could help you automatically tag every invoice to the proper category? What if text labeling tools could extract categories from invoices?
Accurate classification over large sets of closely related classes is inherently difficult, but it all starts with constructing a high-quality dataset in the most cost-efficient manner.
Taxonomy labeling with Angular Elements
For the following use case, suppose you work for one of the biggest fast food chains, operating and sourcing materials across the world. To build a dataset for your NLP model, you came up with a single-page web app, based on UX research, that helps your workforce read an invoice description and select the corresponding category in the taxonomy. See the following screenshot.
This implementation makes use of Angular Material tabs and a filter box that makes navigating the categories easy. It also displays an English translation of your invoice description so that the workers can label invoices from across the world. Moreover, because it’s built on a framework like Angular, you can improve it down the line with more elements, such as drop-downs for the higher levels of the taxonomy or dynamic content like images or videos based on third-party APIs.
For more information about this application, see the GitHub repo.
The application is built using Angular Elements, which creates Angular components packaged as custom elements (also called web components), a web standard for defining new HTML elements in a framework-agnostic way. This enables you to integrate smoothly with Crowd HTML Elements later on.
Angular Elements inputs and outputs
In this use case, your Angular component expects two inputs: an invoice description and an invoice translation. These are passed to it using tag attributes on the <ng-home> element (the directive that designates the root element of the application). The values are then captured by the @Input() annotations defined in the Angular controller in src/app/home.ts. See the following code:
<ng-home source='10牛ステーキ-20パッケージ-ブランドX' translation='10 beef steak - 20 packages - brand X' id="home">loading</ng-home>
export class Home implements OnInit {
@Input() invoice = '';
@Input() translation = '';
...
The values are rendered using two-way binding in the placeholders {{invoice}} and {{translation}} in the Angular view in src/app/home.html. See the following code:
<!-- Invoice Description -->
<div class="card" >
<div class="card-header">
<h3>Invoice Description</h3>
</div>
<div>
<p id="step1">
<span>Invoice Description: <br />
<b>{{ invoice }}</b></span>
</p>
<p style='font-weight: small; color: gray;' id="step2">
<span>English Translation: <br /> {{ translation }}</span>
</p>
</div>
</div>
The following screenshot shows the Meats tab on the Food Categories page.
When you choose a category and choose Submit, the Angular component should also broadcast a JavaScript event containing the category ID to its parent DOM element. This is achieved using the @Output() annotation in the Angular controller in src/app/home.ts. See the following code:
<button mat-button color="primary" (click)="onSubmit()" id="submitButton">Submit</button>
<table>
...
<tr mat-row *matRowDef="let row; columns: displayedColumns;"
(click)="selectRow(row)" [ngClass]="{ 'highlight': row === selectedRow }">
</tr>
</table>
@Output('rowselected') rowselected = new EventEmitter<any>();

// Called when the user clicks a row in the table (selecting a category)
selectRow(row) {
  this.selectedRow = row;
}

// Called when the user clicks the Submit button
onSubmit() {
  this.rowselected.emit(this.selectedRow);
}
Angular integration with Crowd HTML Elements
Communication between Angular Elements and Crowd HTML Elements happens through the mechanism described in the preceding section.
Following the steps described in Build a custom data labeling workflow with Amazon SageMaker Ground Truth, you can adapt how to pass the text to annotate and how to catch the broadcasted event from Angular Elements to create your custom template.
The following code shows the full Liquid HTML template to use in your job creation. This file should also be the index.html root file of the Angular app under the src/ folder. (Make sure to use the index.html file under the dist folder, which has the minified .js files injected into it, with the right Amazon Simple Storage Service (Amazon S3) paths to host your app.)
<!doctype html>
<html lang="en">
<head>
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
</head>
<body>
<crowd-form style="display: none;">
<input name="annotations" id="annotations" type="hidden">
<input name="timeElapsed" id="timeElapsed" type="hidden">
<!-- Prevent crowd-form from creating its own button -->
<crowd-button form-action="submit" style="display: none;"></crowd-button>
</crowd-form>
<div class="mat-app-background basic-container">
<!-- Dev Mode to test the Angular Element -->
<!-- <ng-home source='10牛ステーキ-20パッケージ-ブランドX' translation='10 beef steak - 20 packages - brand X' id="home">loading</ng-home> -->
<ng-home source='{{ task.input.source }}' translation='{{ task.input.translatedDesc }}'>loading</ng-home>
</div>
<script src="<your-s3-bucket-angular-app>/runtime-es2015.js" type="module"></script>
<script src="<your-s3-bucket-angular-app>/runtime-es5.js" nomodule defer></script>
<script src="<your-s3-bucket-angular-app>/polyfills-es5.js" nomodule defer></script>
<script src="<your-s3-bucket-angular-app>/polyfills-es2015.js" type="module"></script>
<script src="<your-s3-bucket-angular-app>/styles-es2015.js" type="module"></script>
<script src="<your-s3-bucket-angular-app>/styles-es5.js" nomodule defer></script>
<script src="<your-s3-bucket-angular-app>/vendor-es2015.js" type="module"></script>
<script src="<your-s3-bucket-angular-app>/vendor-es5.js" nomodule defer></script>
<script src="<your-s3-bucket-angular-app>/main-es2015.js" type="module"></script>
<script src="<your-s3-bucket-angular-app>/main-es5.js" nomodule defer></script>
</body>
</html>
<script>
document.addEventListener("DOMContentLoaded", function(event) {
// Counter
var enterDate = new Date();
function secondsSinceEnter()
{
return (new Date() - enterDate) / 1000;
}
// GT Form Submitting
const component = document.querySelector('ng-home').addEventListener('rowselected', (event) => {
// alert(event.detail.CODE);
document.getElementById('annotations').value = event.detail.CODE;
document.getElementById('timeElapsed').value = secondsSinceEnter();
document.querySelector('crowd-form').submit();
});
});
</script>
<style>
.body {
background-color: #fafafa;
}
.header {
background: #673ab7;
color: #fff;
padding: 0 16px;
margin: 20px 20px 0px 20px;
padding: 20px;
}
.cards {
display: grid;
grid-template-columns: 30% auto;
grid-auto-rows: auto;
grid-gap: 1rem;
margin: 20px 20px 0px 20px;
}
.card {
box-shadow: 0 2px 1px -1px rgba(0,0,0,.2), 0 1px 1px 0 rgba(0,0,0,.14), 0 1px 3px 0 rgba(0,0,0,.12);
transition: box-shadow 280ms cubic-bezier(.4,0,.2,1);
display: block;
position: relative;
padding: 16px;
border-radius: 4px;
/* margin: 20px 0px 0px 20px; */
border: 2px solid #e7e7e7;
border-radius: 4px;
}
.highlight-step {
background-color: #2515424a;
margin: 0px -15px 0px -15px;
padding: 15px;
}
</style>
Creating the template
To create the preceding template, complete the following steps:
- Add the crowd-html-elements.js script at the top of the template so you can use Crowd HTML Elements: <script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
- Inject the text to annotate and the associated metadata coming from the pre-processing Lambda function into the user interface using the Liquid templating language, directly in the root element <ng-home>: <ng-home source='{{ task.input.source }}' translation='{{ task.input.translatedDesc }}' id="home">loading</ng-home>
- Use the <crowd-form /> element, which submits the annotations to Ground Truth. The element is hidden because the submission happens in the background. See the following code: <crowd-form style="display: none;"> <input name="annotations" id="annotations" type="hidden"> <input name="timeElapsed" id="timeElapsed" type="hidden"> <!-- Prevent crowd-form from creating its own button --> <crowd-button form-action="submit" style="display: none;"></crowd-button> </crowd-form>
- Instead of using Crowd HTML Elements to submit the annotation, include a small script to integrate the Angular Element with <crowd-form />: document.addEventListener("DOMContentLoaded", function(event) { var enterDate = new Date(); function secondsSinceEnter() { return (new Date() - enterDate) / 1000; } const component = document.querySelector('ng-home').addEventListener('rowselected', (event) => { document.getElementById('annotations').value = event.detail.CODE; document.getElementById('timeElapsed').value = secondsSinceEnter(); document.querySelector('crowd-form').submit(); }); });
For this use case, I’m also keeping a counter to monitor the time it takes a worker to complete the annotation.
The following diagram illustrates the data flow between each element.
Conclusion
This post showed how to build a custom labeling UI with Angular and Ground Truth. The solution can handle communication between the different scopes in the custom template provided in the labeling job creation. The ability to use a custom front-end framework like Angular enables you to easily create modern web applications that serve your exact needs when tapping into public, private, or vendor labeling workforces.
For more information about hierarchical taxonomies in Ground Truth, see Creating hierarchical label taxonomies using Amazon SageMaker Ground Truth.
If you have any comments or questions about this post, please use the comments section. Happy labeling!
About the Authors
Yassine Landa is a Data Scientist at AWS. He holds an undergraduate degree in Math and Physics, and master’s degrees from French universities in Computer Science and Data Science, Web Intelligence, and Environment Engineering. He is passionate about building machine learning and artificial intelligence products for customers, and has won multiple awards for machine learning products he has built with tech startups and as a startup founder.
2019 Q4 recipients of AWS Machine Learning Research Awards
The AWS Machine Learning Research Awards (MLRA) aims to advance machine learning (ML) by funding innovative research and open-source projects, training students, and providing researchers with access to the latest technology. Since 2017, MLRA has supported over 180 research projects from 73 schools and research institutes in 13 countries, with topics such as ML algorithms, computer vision, natural language processing, medical research, neuroscience, social science, physics, and robotics.
On February 18, 2020, we announced the winners of MLRA’s 2019 Q2/Q3 call-for-proposal cycles. We’re now pleased to announce 28 new recipients of MLRA’s 2019 Q4 call-for-proposal cycle. The MLRA recipients represent 26 universities in six countries. The funded projects aim to develop open-source tools and research that benefit the ML community at large, or create impactful research using AWS ML solutions, such as Amazon SageMaker, AWS AI Services, and Apache MXNet on AWS. The following are the 2019 Q4 award recipients:
Recipient | University | Research Title |
Anasse Bari | New York University | Predicting the 2020 Elections Using Big Data, Analyzing What the World Wants Using Twitter and Teaching Next Generation of Thinkers How to Apply AI for Social Good |
Andrew Gordon Wilson | New York University | Scalable Numerical Methods and Probabilistic Deep Learning with Applications to AutoML |
Bo Li | University of Illinois at Urbana-Champaign | Trustworthy Machine Learning as Services via Robust AutoML and Knowledge Enhanced Logic Inference |
Dawn Song | University of California, Berkeley | Protecting the Public Against AI-Generated Fakes |
Dimosthenis Karatzas | Universitat Autònoma de Barcelona | Document Visual Question Answer (DocVQA) for Large-Scale Document Collections |
Dit-Yan Yeung | Hong Kong University of Science and Technology | Temporally Misaligned Spatiotemporal Sequence Modeling |
Lantao Liu | Indiana University Bloomington | Environment-Adaptive Sensing and Modeling using Autonomous Robots |
Leonidas Guibas | Stanford University | Learning Canonical Spaces for Object-Centric 3D Perception |
Maryam Rahnemoonfar | University of Maryland, Baltimore | Combining Model-Based and Data Driven Approaches to Study Climate Change via Amazon SageMaker |
Mi Zhang | Michigan State University | DA-NAS: An AutoML Framework for Joint Data Augmentation and Neural Architecture Search |
Michael P. Kelly | Washington University | Web-Based Machine Learning for Surgeon Benchmarking in Pediatric Spine Surgery |
Ming Zhao | Arizona State University | Enabling Deep Learning across Edge Devices and Cloud Resources |
Nianwen Xue | Brandeis University | AMR2KB: Construct a High-Quality Knowledge Base by Parsing Meaning Representations |
Nicholas Chia | Mayo Clinic | Massively-Scaled Inverse Reinforcement Learning Approach for Reconstructing the Mutational History of Colorectal Cancer |
Oswald Lanz | Fondazione Bruno Kessler | Structured Representation Learning for Video Recognition and Question Answering |
Pierre Gentine | Columbia University | Learning Fires |
Pratik Chaudhari | University of Pennsylvania | Offline and Off-Policy Reinforcement Learning |
Pulkit Agrawal | Massachusetts Institute of Technology | Curiosity Baselines for the Reinforcement Learning Community |
Quanquan Gu | University of California, Los Angeles | Towards Provably Efficient Deep Reinforcement Learning |
Shayok Chakraborty | Florida State University | Active Learning with Imperfect Oracles |
Soheil Feizi | University of Maryland, College Park | Explainable Deep Learning: Accuracy, Robustness and Fairness |
Spyros Makridakis | University of Nicosia | Clustered Ensemble of Specialist Amazon GluonTS Models for Time Series Forecasting |
Xin Jin | Johns Hopkins University | Making Sense of Network Performance for Distributed Machine Learning |
Xuan (Sharon) Di | Columbia University | Multi-Autonomous Vehicle Driving Policy Learning for Efficient and Safe Traffic |
Yi Yang | University of Technology Sydney | Efficient Video Analysis with Limited Supervision |
Yun Raymond Fu | Northeastern University | Generative Feature Transformation for Multi-Viewed Domain Adaptation |
Zhangyang (Atlas) Wang | Texas A&M University | Mobile-Captured Wound Image Analysis and Dynamic Modeling for Post-Discharge Monitoring of Surgical Site Infection |
Zhi-Li Zhang | University of Minnesota | Universal Graph Embedding Neural Networks for Learning Graph-Structured Data |
Congratulations to all MLRA recipients! We look forward to supporting your research.
For more information about MLRA, see AWS Machine Learning Research Awards or send an email to aws-ml-research-awards@amazon.com.
About the Author
Seo Yeon Shin is a program manager for the AWS AI Academic Programs.
Cisco uses Amazon SageMaker and Kubeflow to create a hybrid machine learning workflow
This is a guest post from members of Cisco’s AI/ML best practices team, including Technical Product Manager Elvira Dzhuraeva, Distinguished Engineer Debo Dutta, and Principal Engineer Amit Saha.
Cisco is a large enterprise company that applies machine learning (ML) and artificial intelligence (AI) across many of its business units. The Cisco AI team in the office of the CTO is responsible for the company’s open source (OSS) AI/ML best practices across the business units that use AI and ML, and is also a major contributor to the Kubeflow open-source project and MLPerf/MLCommons. Our charter is to create artifacts and best practices in ML that both Cisco business units and our customers can use, and we share these solutions as reference architectures.
Due to business needs, such as localized data requirements, Cisco operates a hybrid cloud environment. Model training is done on our own Cisco UCS hardware, but many of our teams leverage the cloud for inference to take advantage of the scalability, geo redundancy, and resiliency. However, such implementations may be challenging for customers, because hybrid integration often requires deep expertise and knowledge to build and support consistent AI/ML workflows.
To address this, we built an ML pipeline using the Cisco Kubeflow starter pack for a hybrid cloud implementation that uses Amazon SageMaker to serve models in the cloud. By providing this reference architecture, we aim to help customers build seamless and consistent ML workloads across their complex infrastructure to satisfy whatever limitations they may face.
Kubeflow is a popular open-source library for ML orchestration on Kubernetes. If you operate in a hybrid cloud environment, you can install the Cisco Kubeflow starter pack to develop, build, train, and deploy ML models on-premises. The starter pack includes the latest version of Kubeflow and an application examples bundle.
Amazon SageMaker is a managed ML service that helps you prepare data, process data, train models, track model experiments, host models, and monitor endpoints. With SageMaker Components for Kubeflow Pipelines, you can orchestrate jobs from Kubeflow Pipelines, which we did for our hybrid ML project. This approach lets us seamlessly use Amazon SageMaker managed services for training and inference from our on-premises Kubeflow cluster. Using Amazon SageMaker provides our hosted models with enterprise features such as automatic scaling, multi-model endpoints, model monitoring, high availability, and security compliance.
To illustrate how our use case works, we recreate the scenario using the publicly available BLE RSSI Dataset for Indoor Localization and Navigation dataset, which contains Bluetooth Low Energy (BLE) Received Signal Strength Indication (RSSI) measurements. The pipeline trains and deploys a model to predict the location of Bluetooth devices. The following steps outline how a Kubernetes cluster can interact with Amazon SageMaker for a hybrid solution. The ML model, written in Apache MXNet, is trained using Kubeflow running on Cisco UCS servers to satisfy our localized data requirements and then deployed to AWS using Amazon SageMaker.
The created and trained model is uploaded to Amazon Simple Storage Service (Amazon S3) and uses Amazon SageMaker endpoints for serving. The following diagram shows our end-to-end workflow.
Development environment
To get started, if you don’t currently have Cisco hardware, you can set up Amazon Elastic Kubernetes Service (Amazon EKS) running with Kubeflow. For instructions, see Creating an Amazon EKS cluster and Deploying Kubeflow Pipelines.
If you have an existing UCS machine, the Cisco Kubeflow starter pack offers a quick Kubeflow setup on your Kubernetes cluster (v1.15 or later). To install Kubeflow, set the INGRESS_IP variable to the machine’s IP address and run the kubeflowup.bash installation script. See the following code:
export INGRESS_IP=<UCS Machine's IP>
bash kubeflowup.bash
For more information about installation, see Installation Instructions on the GitHub repo.
Preparing the hybrid pipeline
For a seamless ML workflow between Cisco UCS and AWS, we created a hybrid pipeline using the Kubeflow Pipelines component and Amazon SageMaker Kubeflow components.
To start using the components, you need to import the Kubeflow Pipelines packages, including the AWS package:
import kfp
import kfp.dsl as dsl
from kfp import components
from kfp.aws import use_aws_secret
For the full code to configure and get the pipeline running, see the GitHub repo.
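For orientation, the sagemaker_model_op and sagemaker_deploy_op used below are loaded from component definition files with components.load_component_from_url. The following is a minimal sketch of how those URLs might be built; the base path is an assumption based on where the SageMaker components have lived in the kubeflow/pipelines repository, so verify it against the Kubeflow version you use.

```python
# Assumed base path for the AWS SageMaker components in the
# kubeflow/pipelines repo; confirm against your Kubeflow version.
SAGEMAKER_COMPONENTS_BASE = (
    "https://raw.githubusercontent.com/kubeflow/pipelines/"
    "master/components/aws/sagemaker"
)


def component_url(name):
    """Build the URL of a component.yaml under the SageMaker components tree."""
    return "{}/{}/component.yaml".format(SAGEMAKER_COMPONENTS_BASE, name)


# These strings are what get passed to components.load_component_from_url
model = component_url("model")      # -> sagemaker_model_op
deploy = component_url("deploy")    # -> sagemaker_deploy_op
```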
The pipeline describes the workflow and how the components relate to each other in the form of a graph. The pipeline configuration includes the definition of the inputs (parameters) required to run the pipeline and the inputs and outputs of each component. The following screenshot shows the visual representation of the finished pipeline on the Kubeflow UI.
The pipeline runs the following three steps:
- Train the model
- Create the model resource
- Deploy the model
Training the model
You train the model with the BLE data locally, create an image, upload it to the S3 bucket, and register the model to Amazon SageMaker by applying the MXNet model configuration .yaml file.
When the trained model artifacts are uploaded to Amazon S3, Amazon SageMaker uses the model stored in Amazon S3 to deploy the model to a hosting endpoint. Amazon SageMaker endpoints make it easier for downstream applications to consume models and help the team monitor them with Amazon CloudWatch. See the following code:
def blerssi_mxnet_train_upload_op(step_name='mxnet-train'):
return dsl.ContainerOp(
name='mxnet-train-upload-s3',
image='ciscoai/mxnet-blerssi-train-upload:v0.2',
command=['python', '/opt/mx-dnn.py', 'train'],
arguments=['--bucket-name', bucket_name]
).apply(use_aws_secret(secret_name=aws_secret_name, aws_access_key_id_name='AWS_ACCESS_KEY_ID', aws_secret_access_key_name='AWS_SECRET_ACCESS_KEY'))
Creating the model resource
When the MXNet model and artifacts are uploaded to Amazon S3, use the KF Pipeline CreateModel component to create an Amazon SageMaker model resource.
The Amazon SageMaker endpoint API is flexible and offers several options to deploy a trained model to an endpoint. For example, you can let the default Amazon SageMaker runtime manage the model deployment, health check, and model invocation. Amazon SageMaker also allows for customization to the runtime with custom containers and algorithms. For instructions, see Overview of containers for Amazon SageMaker.
For our use case, we wanted some degree of control over the model health check API and the model invocation API. We chose the custom override for the Amazon SageMaker runtime to deploy the trained model. The custom predictor allows for flexibility in how the incoming request is processed and passed along to the model for prediction. See the following code:
# 'model' holds the URL of the CreateModel component definition file
sagemaker_model_op = components.load_component_from_url(model)
Deploying the model
You deploy the model to an Amazon SageMaker endpoint with the KF Pipeline CreateEndpoint component.
The custom container used for inference gives the team maximum flexibility to define custom health checks and invocations to the model. However, the custom container must follow the golden path for APIs prescribed by the Amazon SageMaker runtime. See the following code:
# 'deploy' holds the URL of the deploy (CreateEndpoint) component definition file
sagemaker_deploy_op = components.load_component_from_url(deploy)
Running the pipeline
To run your pipeline, complete the following steps:
- Configure the Python code that defines the hybrid pipeline with Amazon SageMaker components:
@dsl.pipeline(
    name='MXNet Sagemaker Hybrid Pipeline',
    description='Pipeline to train BLERSSI model using mxnet and save in aws s3 bucket'
)
def mxnet_pipeline(
    region="",
    image="",
    model_name="",
    endpoint_config_name="",
    endpoint_name="",
    model_artifact_url="",
    instance_type_1="",
    role=""
):
    train_upload_model = blerssi_mxnet_train_upload_op()

    create_model = sagemaker_model_op(
        region=region,
        model_name=model_name,
        image=image,
        model_artifact_url=model_artifact_url,
        role=role
    ).apply(use_aws_secret(secret_name=aws_secret_name,
                           aws_access_key_id_name='AWS_ACCESS_KEY_ID',
                           aws_secret_access_key_name='AWS_SECRET_ACCESS_KEY'))
    create_model.after(train_upload_model)

    sagemaker_deploy = sagemaker_deploy_op(
        region=region,
        endpoint_config_name=endpoint_config_name,
        endpoint_name=endpoint_name,
        model_name_1=create_model.output,
        instance_type_1=instance_type_1
    ).apply(use_aws_secret(secret_name=aws_secret_name,
                           aws_access_key_id_name='AWS_ACCESS_KEY_ID',
                           aws_secret_access_key_name='AWS_SECRET_ACCESS_KEY'))
    sagemaker_deploy.after(create_model)
For more information about configuration, see Pipelines Quickstart. For the full pipeline code, see the GitHub repo.
- Run the pipeline by feeding the following parameters to execute the pipeline:
run = client.run_pipeline(
    blerssi_hybrid_experiment.id,
    'blerssi-sagemaker-pipeline-' + timestamp,
    pipeline_package_path='mxnet_pipeline.tar.gz',
    params={
        'region': aws_region,
        'image': inference_image,
        'model_name': model_name,
        'endpoint_config_name': endpoint_config_name,
        'endpoint_name': endpoint_name,
        'model_artifact_url': model_path,
        'instance_type_1': instance_type,
        'role': role_arn
    })
At this point, the BLERSSI Amazon SageMaker pipeline starts executing. After all the components execute successfully, check the logs of the sagemaker-deploy component to verify that the endpoint is created. The following screenshot shows the logs of the last step with the URL to the deployed model.
Validating the model
After the model is deployed in AWS, we validate it by submitting sample data to the model via an HTTP request using the endpoint name of the model deployed on AWS. The following screenshot shows a snippet from a sample Jupyter notebook that has a Python client and the corresponding output with location predictions.
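The validation call can be sketched as follows. This is illustrative rather than the notebook's actual code (which is in the repo): build_payload and predict_location are hypothetical helpers, and the CSV content type is an assumption about how the custom serving container accepts a sample of RSSI readings.

```python
import json


def build_payload(rssi_readings):
    """Serialize one sample of BLE RSSI readings as a CSV row, a common
    input format for simple serving containers (assumed here)."""
    return ",".join(str(v) for v in rssi_readings)


def predict_location(endpoint_name, rssi_readings, runtime=None):
    """Invoke the deployed SageMaker endpoint with one sample.

    'runtime' is a boto3 'sagemaker-runtime' client, injected as a
    parameter so the payload logic can be exercised without AWS credentials.
    """
    if runtime is None:
        import boto3
        runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=build_payload(rssi_readings),
    )
    return json.loads(response["Body"].read())
```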
Conclusion
Amazon SageMaker and Kubeflow Pipelines integrate easily into a single hybrid pipeline. The complete set of blogs and tutorials for Amazon SageMaker makes it easy to create a hybrid pipeline via the Amazon SageMaker Components for Kubeflow Pipelines. The API was exhaustive, covered all the key components we needed to use, and allowed for the development of custom algorithms and integration with the Cisco Kubeflow starter pack. By uploading a trained ML model to Amazon S3 to serve on AWS with Amazon SageMaker, we reduced the complexity and TCO of managing complex ML lifecycles by about 50%. We comply with the highest standards of enterprise policies in data privacy and serve models in a scalable fashion with redundancy on AWS across the United States and the world.
About the Authors
Elvira Dzhuraeva is a Technical Product Manager at Cisco where she is responsible for cloud and on-premise machine learning and artificial intelligence strategy. She is also a Community Product Manager at Kubeflow and a member of MLPerf community.
Debo Dutta is a Distinguished Engineer at Cisco where he leads a technology group at the intersection of algorithms, systems and machine learning. While at Cisco, Debo is currently a visiting scholar at Stanford. He got his PhD in Computer Science from University of Southern California, and an undergraduate in Computer Science from IIT Kharagpur, India.
Amit Saha is a Principal Engineer at Cisco where he leads efforts in systems and machine learning. He is a visiting faculty at IIT Kharagpur. He has a PhD in Computer Science from Rice University, Houston and an undergraduate from IIT Kharagpur. He has served on several program committees for top Computer Science conferences.
Deriving conversational insights from invoices with Amazon Textract, Amazon Comprehend, and Amazon Lex
Organizations across industries have a large number of physical documents, such as invoices, that they need to process. It is difficult to extract information from a scanned document when it contains tables, forms, paragraphs, and check boxes. Organizations have been addressing these problems with manual effort, custom code, or Optical Character Recognition (OCR) technology. However, that requires templates for form extraction and custom workflows.
Moreover, after extracting the text or content from a document, organizations want to derive insights from these receipts or invoices for their end users. That would require building a complex NLP model, and training such a model would require a large amount of training data and compute resources. Building and training a machine learning model can be expensive and time-consuming.
Further, providing a human-like interface for interacting with these documents is cumbersome to build. End users often call the help desk instead, which adds cost to the organization over time.
This post shows you how to use AWS AI services to automate text data processing and insight discovery. With AWS AI services such as Amazon Textract, Amazon Comprehend, and Amazon Lex, you can set up an automated serverless solution to address this requirement. We walk you through the following steps:
- Extract text from receipts or invoices in PDF or image format with Amazon Textract.
- Derive insights with Amazon Comprehend.
- Interact with these insights in natural language using Amazon Lex.
Next, we go through the services and the architecture used to build the solution.
Services used
This solution uses the following AI services, serverless technologies, and managed services to implement a scalable and cost-effective architecture:
- Amazon Cognito – Lets you add user sign-up, sign-in, and access control to your web and mobile apps quickly and easily.
- AWS Lambda – Executes code in response to triggers such as changes in data, shifts in system state, or user actions. Because Amazon S3 can directly trigger a Lambda function, you can build a variety of real-time serverless data-processing systems.
- Amazon Lex – Provides an interface to create conversational chatbots.
- Amazon Comprehend – A natural language processing (NLP) service that uses machine learning to find insights and relationships in text.
- Amazon Textract – Uses ML to extract text and data from scanned documents in PDF, JPEG, or PNG formats.
- Amazon Simple Storage Service (Amazon S3) – Serves as an object store for your documents and allows for central management with fine-tuned access controls.
Architecture
The following diagram illustrates the architecture of the solution.
The architecture contains the following steps:
- The backend user or administrator uses the AWS Management Console or AWS Command Line Interface (AWS CLI) to upload the PDF documents or images to an S3 bucket.
- The Amazon S3 upload triggers an AWS Lambda function.
- The Lambda function invokes the Amazon Textract StartDocumentTextDetection API, which starts an asynchronous job to detect text in the uploaded document.
- Amazon Textract notifies Amazon Simple Notification Service (Amazon SNS) when text processing is complete.
- A second Lambda function receives the notification from the SNS topic when the text detection job is complete.
- Once notified of job completion by Amazon SNS, the Lambda function calls the Amazon Textract GetDocumentTextDetection API to retrieve the results of the asynchronous operation and loads them into an S3 bucket.
- A Lambda function is used for fulfillment of the Amazon Lex intents. For a more detailed sequence of interactions, refer to the Building your chatbot step in the “Deploying the architecture with AWS CloudFormation” section.
- Amazon Comprehend uses ML to find insights and relationships in text. The Lambda function uses the Boto3 APIs that Amazon Comprehend provides for entity and key phrase detection.
- In response to the bot’s welcome message, the user types “Show me the invoice summary.” This invokes the GetInvoiceSummary Lex intent, and the Lambda function invokes the Amazon Comprehend DetectEntities API to detect entities for fulfillment.
- When the user types “Get me the invoice details,” this invokes the GetInvoiceDetails intent. Amazon Lex prompts the user to enter the invoice number, and the Lambda function invokes the Amazon Comprehend DetectEntities API to return the invoice details message.
- When the user types “Can you show me the invoice notes for <invoice number>,” this invokes the GetInvoiceNotes intent, and the Lambda function invokes the Amazon Comprehend DetectKeyPhrases API to return comments associated with the invoice.
- You deploy the Lex bot web UI in your AWS CloudFormation template by using an existing CloudFormation stack as a nested stack. To download the stack, see Deploy a Web UI for Your Chatbot. This nested stack deploys a Lex web UI whose webpage is served as a static website from an S3 bucket. The web UI uses Amazon Cognito to generate an access token for authentication and uses AWS CodeStar to set up a delivery pipeline. End users interact with this chatbot web UI.
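The two Textract-related Lambda functions in the steps above can be sketched in Python with Boto3 as follows. This is a minimal sketch rather than the stack's actual code: the handler names and environment variable names (SNS_TOPIC_ARN, TEXTRACT_ROLE_ARN, OUTPUT_BUCKET) are hypothetical, and error handling is omitted.

```python
import json
import os


def extract_lines(pages):
    """Pull the LINE blocks out of one or more GetDocumentTextDetection pages."""
    return [
        block["Text"]
        for page in pages
        for block in page.get("Blocks", [])
        if block["BlockType"] == "LINE"
    ]


def start_detection_handler(event, context):
    """Triggered by the S3 upload; starts the asynchronous Textract job."""
    import boto3  # imported lazily so extract_lines() stays testable without the SDK

    textract = boto3.client("textract")
    record = event["Records"][0]["s3"]
    response = textract.start_document_text_detection(
        DocumentLocation={
            "S3Object": {
                "Bucket": record["bucket"]["name"],
                "Name": record["object"]["key"],
            }
        },
        # Textract publishes a completion message to this SNS topic.
        NotificationChannel={
            "SNSTopicArn": os.environ["SNS_TOPIC_ARN"],
            "RoleArn": os.environ["TEXTRACT_ROLE_ARN"],
        },
    )
    return response["JobId"]


def fetch_results_handler(event, context):
    """Triggered by the SNS notification; pages through the results into S3."""
    import boto3

    textract = boto3.client("textract")
    s3 = boto3.client("s3")
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    pages, token = [], None
    while True:  # GetDocumentTextDetection results are paginated via NextToken
        kwargs = {"JobId": message["JobId"]}
        if token:
            kwargs["NextToken"] = token
        page = textract.get_document_text_detection(**kwargs)
        pages.append(page)
        token = page.get("NextToken")
        if not token:
            break
    s3.put_object(
        Bucket=os.environ["OUTPUT_BUCKET"],
        Key=message["JobId"] + ".txt",
        Body="\n".join(extract_lines(pages)).encode("utf-8"),
    )
```

The lazy boto3 imports keep the pure `extract_lines` helper importable and unit-testable on a machine without AWS credentials.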
Deploying the architecture with AWS CloudFormation
You deploy a CloudFormation template to provision the necessary AWS Identity and Access Management (IAM) roles, services, and components of the solution, including Amazon S3, Lambda, Amazon Textract, Amazon Comprehend, and the Amazon Lex chatbot.
- Launch the following CloudFormation template in the US East (N. Virginia) Region:
- Don’t make any changes to the stack name or the botname parameter (InvoiceBot).
- In the Capabilities and transforms section, select all three check boxes to acknowledge that AWS CloudFormation can create IAM resources and expand the template.
For more information about these resources, see AWS IAM resources.
This template uses the AWS Serverless Application Model (AWS SAM), which simplifies how to define functions and APIs for serverless applications and also has features for these services, like environment variables.
- Choose Create stack.
The following screenshot of the Stack Detail page shows the status of the stack as CREATE_IN_PROGRESS. It can take up to 20 minutes for the status to change to CREATE_COMPLETE.
- On the Outputs tab, copy the values of LexLambaFunctionArn, AssetsUploadBucket, ExtractedTextfilesBucket, and LexUIWebAppUrl.
Uploading documents to the S3 bucket
To upload your documents to your new S3 bucket, choose the S3 bucket URL corresponding to AssetsUploadBucket
that you copied earlier. Upload a PDF or image to start the text extraction flow.
You can download the invoice used in this blog from the GitHub repo and upload it to the AssetsUploadBucket
S3 URL. We recommend customizing this solution for your own invoice templates. For more information about uploading files, see How do I upload files and folders to an S3 bucket?
After the upload completes, you can see the file on the Amazon S3 console on the Overview tab.
After you upload the file, the text is extracted from the document. To see an extracted file with the text, open the bucket by choosing the URL you copied earlier.
On the Overview tab, you can download the file and inspect the content to see if it’s the same as the text in the uploaded file.
Building your chatbot
We use the following conversation to model the bot:
Bot: Welcome to InvoiceBot. You can ask me to provide your invoice summary, or details of your invoices, or your invoice notes
User: Show me the invoice summary
Bot: I reviewed your input documents and found 1 invoice with invoice numbers 35678-9 totaling $2100.0. I can get you invoice details or invoice notes. Simply type your request
User: Get me the invoice details
Bot: Please enter the invoice number
User: 35678-9
Bot: Invoice Details for 35678-9: On 5/10/2019 for the item Merchant One there is a charge of 1500.00. On 5/11/2019 for the item Merchant Two there is a charge of 100.00. On 5/12/2019 for the item Merchant Three there is a charge of 300.00. On 5/13/2019 for the item Merchant Three there is a charge of 200.00. You can request me for invoice notes or simply close this chat.
User: Can you show me the invoice notes for 35678-9
Bot: Invoice Notes for 35678-9: 5/13/2019 Merchant Three 200.00 Merchant Three 300.00 Laptop Office Supplies Merchant Two 100.00 Team Dinner Food 5/12/2019 5/11/2019 Desks and Office Supplies 5/10/2019 Merchant One 1500.00 Chairs . Feel free to try the options again or you can simply close this chat
We build an Amazon Lex bot (InvoiceBot) with the following intents:
- GetInvoiceSummary – Intent that’s invoked when the user requests to view the invoice summary. This is fulfilled by a Lambda function and returns the count of invoices available and the total amount of the invoices.
- GetInvoiceDetails – Intent that’s invoked when the user requests to view the invoice details. This is fulfilled by a Lambda function and provides an item-level breakdown of the invoices, including Date, Quantity, and Item Details.
- GetInvoiceNotes – Intent that’s invoked when the user requests to view the invoice notes. This is fulfilled by a Lambda function and provides notes from the uploaded invoices with Date and Item Description.
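A fulfillment Lambda function for intents like these could be sketched as follows. This is an illustrative sketch, not the stack's actual code: the environment variable names (TEXT_BUCKET, TEXT_KEY) and the message wording are hypothetical, and the event shape assumes the Lex V1 Lambda input format used at the time of this post.

```python
def close(message):
    """Build a Lex V1 'Close' fulfillment response."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message},
        }
    }


def entities_by_type(entities):
    """Group Comprehend entities by type, e.g. {'DATE': ['5/10/2019', ...]}."""
    grouped = {}
    for entity in entities:
        grouped.setdefault(entity["Type"], []).append(entity["Text"])
    return grouped


def lambda_handler(event, context):
    import os
    import boto3  # imported lazily so the helpers above stay testable without the SDK

    comprehend = boto3.client("comprehend")
    s3 = boto3.client("s3")
    # Read the text that the Textract flow previously extracted.
    obj = s3.get_object(Bucket=os.environ["TEXT_BUCKET"], Key=os.environ["TEXT_KEY"])
    text = obj["Body"].read().decode("utf-8")

    intent = event["currentIntent"]["name"]
    if intent == "GetInvoiceNotes":
        result = comprehend.detect_key_phrases(Text=text, LanguageCode="en")
        notes = ", ".join(p["Text"] for p in result["KeyPhrases"])
        return close("Invoice Notes: " + notes)

    # GetInvoiceSummary and GetInvoiceDetails both rely on entity detection.
    result = comprehend.detect_entities(Text=text, LanguageCode="en")
    grouped = entities_by_type(result["Entities"])
    dates = ", ".join(grouped.get("DATE", []))
    return close("I found invoice activity on these dates: " + dates)
```

The `close` helper mirrors the dialogAction structure Lex V1 expects from a fulfillment Lambda function, and `entities_by_type` shows one way to turn the raw DetectEntities output into something a chat response can use.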
Publishing your chatbot
As described in the solution overview earlier, you use an Amazon Lex chatbot (InvoiceBot) to interact with the insights Amazon Comprehend derives from the text Amazon Textract extracts.
To publish your chatbot, complete the following steps:
- On the Amazon Lex console, choose Bots.
- Choose the chatbot you created.
- Under Intents, choose GetInvoiceSummary.
- Under Fulfillment, select your Lambda function.
- Search for the function by entering LexLambdaFunction and selecting the result.
A pop-up box appears.
- Choose OK.
- Choose Save intent.
- Repeat these steps for the remaining two intents, GetInvoiceDetails and GetInvoiceNotes.
- Choose Build.
- When the build is complete, choose Publish.
- For Create an alias, enter Latest. You can choose a different name; names like test, dev, beta, or prod primarily refer to the environment of the bot.
- Choose Publish.
The following page opens after the bot is published.
- Choose Close.
Using the chatbot
Your chatbot is now ready to use. Navigate to the LexUIWebAppUrl URL you copied from the AWS CloudFormation Outputs tab. The following screenshots show the user conversation with the bot (read from left to right):
Conclusion
This post demonstrated how to create a conversational chatbot in Amazon Lex that lets you interact with insights that Amazon Comprehend derives from text Amazon Textract extracts from images or PDF documents. The code from this post is available on the GitHub repo for you to use and extend. We are interested to hear how you apply this solution to your use case. Please share your thoughts and questions in the comments section.
About the Authors
Mona Mona is an AI/ML Specialist Solutions Architect based out of Arlington, VA. She works with the Worldwide Public Sector team and helps customers adopt machine learning at a large scale. She is passionate about NLP and ML explainability in AI/ML.
Prem Ranga is an Enterprise Solutions Architect based out of Houston, Texas. He is part of the Machine Learning Technical Field Community and loves working with customers on their ML and AI journey. Prem is passionate about robotics, is an Autonomous Vehicles researcher, and also built the Alexa-controlled Beer Pours in Houston and other locations.
Saida Chanda is a Senior Partner Solutions Architect based out of Seattle, WA. He is a technology enthusiast who drives innovation through AWS partners to meet customers’ complex business requirements via simple solutions. His areas of interest are ML and DevOps. In his spare time, he likes to spend time with family and explore his inner self through meditation.