Automated monitoring of your machine learning models with Amazon SageMaker Model Monitor and sending predictions to human review workflows using Amazon A2I

When machine learning (ML) is deployed in production, monitoring the model is important for maintaining the quality of predictions. Although the statistical properties of the training data are known in advance, real-life data can gradually deviate over time and impact the prediction results of your model, a phenomenon known as data drift. Detecting these conditions in production can be challenging and time-consuming, and requires a system that captures incoming real-time data, performs statistical analyses, defines rules to detect drift, and sends alerts for rule violations. Furthermore, the process must be repeated for every new iteration of the model.

Amazon SageMaker Model Monitor enables you to continuously monitor ML models in production. You can set alerts to detect deviations in the model quality and take corrective actions, such as retraining models, auditing upstream systems, or fixing data quality issues. You can use insights from Model Monitor to proactively determine model prediction variance due to data drift and then use Amazon Augmented AI (Amazon A2I), a fully managed feature in Amazon SageMaker, to send ML inferences to human workflows for review. You can use Amazon A2I for multiple purposes, such as:

  • Reviewing results below a threshold
  • Human oversight and audit use cases
  • Augmenting AI and ML results as required

In this post, we show how to set up an ML workflow on Amazon SageMaker to train an XGBoost algorithm for breast cancer predictions. We deploy the model on a real-time inference endpoint, launch a model monitoring schedule, evaluate monitoring results, and trigger a human review loop for below-threshold predictions. We then show how the human loop workers review and update the predictions.

We walk you through the following steps using this accompanying Jupyter notebook:

  1. Preprocess your input dataset.
  2. Train an XGBoost model and deploy to a real-time endpoint.
  3. Generate baselines and start Model Monitor.
  4. Review the model monitor reports and derive insights.
  5. Set up a human review loop for low-confidence detection using Amazon A2I.

Prerequisites

Before getting started, you need to create your human workforce and set up your Amazon SageMaker Studio notebook.

Creating your human workforce

For this post, you create a private work team and add only one user (you) to it. For instructions, see Create an Amazon Cognito Workforce Using the Labeling Workforces Page.

Enter your email in the email addresses box for workers. To invite your colleagues to participate in reviewing tasks, include their email addresses in this box.

After you create your private team, you receive an email from no-reply@verificationemail.com that contains your workforce username, password, and a link that you can use to log in to the worker portal. Enter the username and password you received in the email to log in. You must then create a new, non-default password. This is your private worker’s interface.

When you create an Amazon A2I human review task using your private team (explained in the Starting a human loop section), your task should appear in the Jobs section. See the following screenshot.

After you create your private workforce, you can view it on the Labeling workforces page, on the Private tab.

Setting up your Amazon SageMaker Studio notebook

To set up your notebook, complete the following steps:

  1. Onboard to Amazon SageMaker Studio with the quick start procedure.
  2. When you create an AWS Identity and Access Management (IAM) role for the notebook instance, be sure to specify access to Amazon Simple Storage Service (Amazon S3). You can choose Any S3 Bucket or specify the S3 bucket you want to enable access to. You can use the AWS-managed policies AmazonSageMakerFullAccess and AmazonAugmentedAIFullAccess to grant general access to these two services.
  3. When the user is created and active, choose Open Studio.
  4. On the Studio landing page, from the File drop-down menu, choose New.
  5. Choose Terminal.
  6. In the terminal, enter the following code:
git clone https://github.com/aws-samples/amazon-a2i-sample-jupyter-notebooks

  7. Open the notebook by choosing Amazon-A2I-with-Amazon-SageMaker-Model-Monitor.ipynb in the amazon-a2i-sample-jupyter-notebooks folder.

Preprocessing your input dataset

You can follow the steps in this post using the accompanying Jupyter notebook. Make sure you provide an S3 bucket and a prefix of your choice. We then import the Python data science libraries and the Amazon SageMaker Python SDK that we need to run through our use case.

Loading the dataset

For this post, we use a dataset for breast cancer predictions from the UCI Machine Learning Repository. Please refer to the accompanying Jupyter notebook for the code to load and split this dataset. Based on the input features, we first train a model to detect a benign (label=0) or malignant (label=1) condition.
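
The following is a minimal sketch of that preprocessing, using scikit-learn's bundled copy of the UCI Breast Cancer Wisconsin (Diagnostic) dataset as a stand-in; the notebook's exact loading code and split may differ. The built-in XGBoost algorithm expects the label in the first column and no header row.

import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
# scikit-learn encodes 0=malignant, 1=benign; flip so that label=1 means malignant,
# matching the convention used in this post
df.insert(0, 'label', 1 - data.target)

# Shuffle, then split into roughly 70% train, 20% validation, 10% test
train_df, validation_df, test_df = np.split(
    df.sample(frac=1, random_state=1729),
    [int(0.7 * len(df)), int(0.9 * len(df))]
)
train_df.to_csv('train.csv', header=False, index=False)
validation_df.to_csv('validation.csv', header=False, index=False)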

The following screenshot shows some of the rows in the training dataset.

Training and deploying an Amazon SageMaker XGBoost model

XGBoost (eXtreme Gradient Boosting) is a popular and efficient open-source implementation of the gradient boosted trees algorithm. For our use case, we use the binary:logistic objective, which applies logistic regression for binary classification (in this example, whether a condition is benign or malignant). The output is a probability between 0 and 1 that represents the likelihood of the positive (malignant) class.

With Amazon SageMaker, you can use XGBoost as a built-in algorithm or framework. For this use case, we use the built-in algorithm. To specify the Amazon Elastic Container Registry (Amazon ECR) container location for Amazon SageMaker implementation of XGBoost, enter the following code:

from sagemaker.amazon.amazon_estimator import get_image_uri
container = get_image_uri(boto3.Session().region_name, 'xgboost', '1.0-1')

Creating the XGBoost estimator

We use the XGBoost container to construct an estimator using the Amazon SageMaker Estimator API and initiate a training job (the full walkthrough is available in the accompanying Jupyter notebook):

sess = sagemaker.Session()

xgb = sagemaker.estimator.Estimator(container,
                                    role, 
                                    train_instance_count=1, 
                                    train_instance_type='ml.m5.2xlarge',
                                    output_path='s3://{}/{}/output'.format(bucket, prefix),
                                    sagemaker_session=sess)

Specifying hyperparameters and starting training

We can now specify the hyperparameters for our training. You set hyperparameters to facilitate the estimation of model parameters from data. See the following code:

xgb.set_hyperparameters(max_depth=5,
                        eta=0.2,
                        gamma=4,
                        min_child_weight=6,
                        subsample=0.8,
                        silent=0,
                        objective='binary:logistic',
                        num_round=100)

xgb.fit({'train': s3_input_train, 'validation': s3_input_validation})

For more information, see XGBoost Parameters.

Deploying the XGBoost model

We deploy a model that’s hosted behind a real-time inference endpoint. As a prerequisite, we set up a data_capture_config and pass it when deploying the endpoint, which enables Amazon SageMaker to collect the inference requests and responses for use in Model Monitor. For more information, see the accompanying notebook.
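
A minimal sketch of such a configuration follows; the sampling percentage and the S3 prefix for captured data are assumptions, and the notebook defines the exact values.

from sagemaker.model_monitor import DataCaptureConfig

s3_capture_upload_path = 's3://{}/{}/datacapture'.format(bucket, prefix)  # assumed prefix

data_capture_config = DataCaptureConfig(
    enable_capture=True,          # capture both requests and responses
    sampling_percentage=100,      # capture every invocation (assumption)
    destination_s3_uri=s3_capture_upload_path
)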

The deploy function returns a Predictor object that you can use for inference:

xgb_predictor = xgb.deploy(initial_instance_count=1,
                           instance_type='ml.m5.2xlarge',
                           endpoint_name=endpoint_name,
                           data_capture_config=data_capture_config)

Invoking the deployed model using the endpoint

You can now send data to this endpoint to get inferences in real time. The request and response payload, along with some additional metadata, is saved in the Amazon S3 location that you specified in DataCaptureConfig. You can follow the steps in the walkthrough notebook.
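
A sketch of that invocation follows; it sends each test row (excluding the label column) as CSV and collects the returned probabilities. The variable names match those used later in this post, but the notebook's exact loop may differ.

xgb_predictor.content_type = 'text/csv'

predictions = []
for _, row in test_data[test_data.columns[1:]].iterrows():   # skip the label column
    payload = ','.join(str(v) for v in row.values)
    response = xgb_predictor.predict(payload).decode('utf-8')
    predictions.append(float(response))                       # probability of malignancy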

Each captured record is stored as a JSON line in the specified Amazon S3 location. The following is a representative example of a captured inference request and response (the values shown are placeholders):
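
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "13.2,15.8,84.1,...",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "text/csv; charset=utf-8",
      "mode": "OUTPUT",
      "data": "0.023",
      "encoding": "CSV"
    }
  },
  "eventMetadata": {
    "eventId": "c1f21a0f-example-uuid",
    "inferenceTime": "2020-08-01T00:00:00Z"
  },
  "eventVersion": "0"
}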

Starting Amazon SageMaker Model Monitor

Amazon SageMaker Model Monitor continuously monitors the quality of ML models in production. To start using Model Monitor, we create a baseline, inspect the baseline job results, and create a monitoring schedule.

Creating a baseline

The baseline calculations of statistics and constraints are needed as a standard against which data drift and other data quality issues can be detected. The training dataset is usually a good baseline dataset. The training dataset schema and the inference dataset schema should match (the number and order of the features). From the training dataset, you can ask Amazon SageMaker to suggest a set of baseline constraints and generate descriptive statistics to explore the data. To create the baseline, you can follow the detailed steps in the walkthrough notebook. See the following code:

# Start the baseline job
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.4xlarge',
    volume_size_in_gb=100,
    max_runtime_in_seconds=3600,
)

my_default_monitor.suggest_baseline(
    baseline_dataset=baseline_data_uri+'/train.csv',
    dataset_format=DatasetFormat.csv(header=False), # changed this to header=False since train.csv does not have header. 
    output_s3_uri=baseline_results_uri,
    wait=True
)

Inspecting baseline job results

When the baseline job is complete, we can inspect the results. Two files are generated:

  • statistics.json – This file is expected to have columnar statistics for each feature in the dataset that is analyzed. For the schema of this file, see Schema for Statistics.
  • constraints.json – This file is expected to have the constraints on the features observed. For the schema of this file, see Schema for Constraints.

Model Monitor computes per column/feature statistics. In the following screenshot, c0 and c1 in the name column refer to columns in the training dataset without the header row.

The constraints file is used to express the constraints that a dataset must satisfy. See the following screenshot.

Next we review the monitoring configuration in the constraints.json file:

  • datatype_check_threshold – During the baseline step, the generated constraints suggest the inferred data type for each column. You can tune the monitoring_config.datatype_check_threshold parameter to adjust the threshold for when it’s flagged as a violation.
  • domain_content_threshold – If there are more unknown values for a String field in the current dataset than in the baseline dataset, you can use this threshold to dictate if it needs to be flagged as a violation.
  • comparison_threshold – This value is used to calculate model drift.

For more information about constraints, see Schema for Constraints.
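
In the generated constraints.json file, these settings appear under a monitoring_config section. A representative snippet looks like the following; the threshold values shown here are illustrative, and the values suggested by your baseline job may differ.

"monitoring_config": {
    "evaluate_constraints": "Enabled",
    "emit_metrics": "Enabled",
    "datatype_check_threshold": 1.0,
    "domain_content_threshold": 1.0,
    "distribution_constraints": {
        "perform_comparison": "Enabled",
        "comparison_threshold": 0.1,
        "comparison_method": "Robust"
    }
}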

Creating a monitoring schedule

With a monitoring schedule, Amazon SageMaker can start processing jobs at a specified frequency to analyze the data collected during a given period. Amazon SageMaker compares the dataset for the current analysis with the baseline statistics and constraints provided and generates a violations report. To create an hourly monitoring schedule, enter the following code:

from sagemaker.model_monitor import CronExpressionGenerator
from time import gmtime, strftime

mon_schedule_name = 'xgb-breast-cancer-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=xgb_predictor.endpoint,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

We then invoke the endpoint continuously to generate traffic for the model monitor to pick up. Because we set up an hourly schedule, we need to wait at least an hour for traffic to be detected.
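
A simple way to do this is to keep replaying the test dataset against the endpoint; the interval and duration below are arbitrary choices for this walkthrough.

import time

for _ in range(60):                                   # roughly an hour of traffic
    for _, row in test_data[test_data.columns[1:]].iterrows():
        xgb_predictor.predict(','.join(str(v) for v in row.values))
    time.sleep(60)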

Reviewing model monitoring

The violations file is generated as the output of a MonitoringExecution, which lists the results of evaluating the constraints (specified in the constraints.json file) against the current dataset that was analyzed. For more information about violation checks, see Schema for Violations. For our use case, the model monitor detects a data type mismatch violation in one of the requests sent to the endpoint. See the following screenshot.
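
In general, a violations report has the following shape; the feature name and figures below are illustrative rather than the actual values from this walkthrough.

{
  "violations": [
    {
      "feature_name": "c1",
      "constraint_check_type": "data_type_check",
      "description": "Data type match requirement is not met. Expected data type: Integral, Expected match: 100.0%. Observed: Only 99.1% of data is Integral."
    }
  ]
}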

For more details, see the walkthrough notebook.

Evaluating the results

To determine the next steps for our experiment, we should consider the following two perspectives:

  • Model Monitor violations: We only saw the datatype_check violation from the Model Monitor; we didn’t see a model drift violation. In our use case, Model Monitor uses the robust comparison method based on the two-sample K-S test to quantify the distance between the empirical distribution of our test dataset and the cumulative distribution of the baseline dataset. This distance didn’t exceed the value set for the comparison_threshold. The prediction results are aligned with the results in the training dataset.
  • Probability distribution of prediction results: We used a test dataset of 114 requests. Out of this, we see that the model predicts 60% of the requests to be malignant (over 90% probability output in the prediction results), 30% benign (less than 10% probability output in the prediction results), and the remaining 10% of the requests are indeterminate. The following chart summarizes these findings.

As a next step, you need to send the prediction results with output probabilities between 10% and 90% (because the model can’t predict with sufficient confidence in this range) to a domain expert who can look at the model results and identify if the tumor is benign or malignant. You use Amazon A2I to set up a human review workflow and define conditions for activating the review loop.

Starting the human review workflow

To configure your human review workflow, you complete the following high-level steps:

  1. Create the human task UI.
  2. Create the workflow definition.
  3. Set the trigger conditions to activate the human loop.
  4. Start your human loop.
  5. Check that the human loop tasks are complete.

Creating the human task UI

The following example code shows how to create a human task UI resource using a UI template in Liquid HTML. This template is rendered to the human workers whenever a human loop is required. You can follow the complete steps using the accompanying Jupyter notebook. After the template is defined, set up the UI task function and run it.

def create_task_ui():
    '''
    Creates a Human Task UI resource.

    Returns:
    struct: HumanTaskUiArn
    '''
    response = sagemaker_client.create_human_task_ui(
        HumanTaskUiName=taskUIName,
        UiTemplate={'Content': template})
    return response
# Create task UI
humanTaskUiResponse = create_task_ui()
humanTaskUiArn = humanTaskUiResponse['HumanTaskUiArn']
print(humanTaskUiArn)

Creating the workflow definition

We create the flow definition to specify the following:

  • The workforce that your tasks are sent to.
  • The instructions that your workforce receives. This is specified using a worker task template.
  • Where your output data is stored.

See the following code:

create_workflow_definition_response = sagemaker_client.create_flow_definition(
        FlowDefinitionName= flowDefinitionName,
        RoleArn= role,
        HumanLoopConfig= {
            "WorkteamArn": WORKTEAM_ARN,
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskCount": 1,
            "TaskDescription": "Review the model predictions and determine if you agree or disagree. Assign a label of 1 to indicate malignant result or 0 to indicate a benign result based on your review of the inference request",
            "TaskTitle": "Using Model Monitor and A2I Demo"
        },
        OutputConfig={
            "S3OutputPath" : OUTPUT_PATH
        }
    )
flowDefinitionArn = create_workflow_definition_response['FlowDefinitionArn'] # let's save this ARN for future use

Setting trigger conditions for human loop activation

We need to send the prediction results with output probabilities between 10% and 90% to human review, because the model can’t predict with sufficient confidence in this range. We use this as our activation condition, as shown in the following code:

# assign our original test dataset 
model_data_categorical = test_data[list(test_data.columns)[1:]]  

LOWER_THRESHOLD = 0.1
UPPER_THRESHOLD = 0.9
small_payload_df = model_data_categorical.head(len(predictions))
small_payload_df['prediction_prob'] = predictions
small_payload_df_res = small_payload_df.loc[
    (small_payload_df['prediction_prob'] > LOWER_THRESHOLD) &
    (small_payload_df['prediction_prob'] < UPPER_THRESHOLD)
]
print(small_payload_df_res.shape)
small_payload_df_res.head(10)
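
The selected rows are then packaged as the input content for the human loop. The exact structure must match the fields referenced by the worker task template defined in the notebook; the keys in the following sketch are hypothetical.

# Hypothetical input content for one low-confidence prediction; the real keys must
# match the worker task template
selected_row = small_payload_df_res.iloc[0]
ip_content = {
    "prediction_prob": float(selected_row['prediction_prob']),
    "features": {str(k): float(v) for k, v in selected_row.drop('prediction_prob').items()}
}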

Starting a human loop

A human loop starts your human review workflow and sends data review tasks to human workers. See the following code:

# Activate human loops
import json
humanLoopName = str(uuid.uuid4())

start_loop_response = a2i.start_human_loop(
            HumanLoopName=humanLoopName,
            FlowDefinitionArn=flowDefinitionArn,
            HumanLoopInput={
                "InputContent": json.dumps(ip_content)
            }
        )

The workers in this use case are domain experts that can validate the request features and determine if the result is malignant or benign. The task requires reviewing the model predictions, agreeing or disagreeing, and updating the prediction as 1 for malignant and 0 for benign. The following screenshot shows a sample of tasks received.

The following screenshot shows updated predictions.

For more information about task UI design for tabular datasets, see Using Amazon SageMaker with Amazon Augmented AI for human review of Tabular data and ML predictions.

Checking the status of task completion and human loop

To check the status of the task and the human loop, enter the following code:

completed_human_loops = []
resp = a2i.describe_human_loop(HumanLoopName=humanLoopName)
print(f'HumanLoop Name: {humanLoopName}')
print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
print('\n')
    
if resp["HumanLoopStatus"] == "Completed":
    completed_human_loops.append(resp)

When the human loop tasks are complete, we inspect the results of the review and the corrections made to prediction results.

You can use the human-labeled output to augment the training dataset for retraining. This keeps the distribution variance within the threshold and prevents data drift, thereby improving model accuracy. For more information about using Amazon A2I outputs for model retraining, see Object detection and model retraining with Amazon SageMaker and Amazon Augmented AI.

Cleaning up

To avoid incurring unnecessary charges, delete the resources used in this walkthrough when not in use, including the following:

  • The Amazon SageMaker model endpoint and the model monitoring schedule
  • The Amazon A2I human review workflow (flow definition) and human task UI
  • The S3 bucket contents created for this walkthrough
  • The Amazon SageMaker Studio resources, if you no longer need them

Conclusion

This post demonstrated how you can use Amazon SageMaker Model Monitor and Amazon A2I to set up a monitoring schedule for your Amazon SageMaker model endpoints; specify baselines that include constraint thresholds; observe inference traffic; derive insights such as model drift, completeness, and data type violations; and send the low-confidence predictions to a human workflow with labelers to review and update the results. For video presentations, sample Jupyter notebooks, and more information about use cases like document processing, content moderation, sentiment analysis, object detection, text translation, and more, see Amazon A2I Resources.

 

References

[1] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.


About the Authors

 

Prem Ranga is an Enterprise Solutions Architect at AWS based out of Houston, Texas. He is part of the Machine Learning Technical Field Community and loves working with customers on their ML and AI journey. Prem is passionate about robotics, is an Autonomous Vehicles researcher, and also built the Alexa-controlled Beer Pours in Houston and other locations.

 

 

Jasper Huang is a Technical Writer Intern at AWS and a student at the University of Pennsylvania pursuing a BS and MS in computer science. His interests include cloud computing, machine learning, and how these technologies can be leveraged to solve interesting and complex problems. Outside of work, you can find Jasper playing tennis, hiking, or reading about emerging trends.

 

 

 

Talia Chopra is a Technical Writer in AWS specializing in machine learning and artificial intelligence. She works with multiple teams in AWS to create technical documentation and tutorials for customers using Amazon SageMaker, MxNet, and AutoGluon. In her free time, she enjoys meditating, studying machine learning, and taking walks in nature.

 

RENGA Inc. automates code reviews with Amazon CodeGuru

This guest post was authored by Kazuma Ohara, Director of RENGA Inc., and edited by Yumiko Kanasugi, Solutions Architect at AWS Japan.

RENGA Inc. operates Mansion Note, one of Japan’s most popular condominium review and rating websites, which gets over a million unique visitors per month. Mansion Note provides a service where people can check reviews and rankings of condominiums and apartments all over Japan. People of all positions, such as current residents, former residents, neighbors, experts, real estate agents, and property owners, clarify their positions first and then post and share their candid opinions and reviews about the condominiums. This “wisdom of crowds” is expected to help potential buyers and tenants better imagine their new home before moving in, and can help eliminate regrets such as, “This is different from what I expected,” or, “I should have chosen a different condo.”

The company has a total of six engineers, including myself. Code reviews have been an essential process in our development, because RENGA takes code quality very seriously. The company, however, used to face a challenge in which code review tasks increased in proportion to the quantity of development, which led to an increased burden on reviewers. Also, no matter how many times we reviewed code, some bugs remained unnoticed, so we needed a mechanism to conduct code reviews more exhaustively.

We saw the announcement of Amazon CodeGuru at re:Invent 2019. The moment we learned that CodeGuru is a machine learning (ML)-based code review service, we knew it was exactly the tool we were looking for. At the time of the announcement, we were making significant improvements to our source code, so we thought that CodeGuru Reviewer might help us with that. We decided to adopt the tool, and it pointed out issues that neither our members nor other static analytics tools had ever detected. In this post, I talk about why RENGA decided to adopt CodeGuru, as well as its adoption process.

Maintaining code quality

RENGA was founded in 2012; 2020 marks its 8th anniversary. Although our product is getting mature, we still invest a lot of resources in development so we can extend features quickly. We not only accelerate the development cycle, but also give the same level of priority to maintaining code quality. When extending features, poor quality code adds complexity to the system and can become a technical debt. On the other hand, as long as consistent code quality is maintained, scaling the system doesn’t prevent developers from extending features, because the code itself is simple. Keeping the balance between agility and quality is important, especially for startups that continuously release new features.

RENGA has a two-step code review process. When a developer commits a fix to our remote repository on GitHub and makes a pull request, two senior members review it first. Then I do the final check and merge it into the primary branch (unless I find an issue). We also check the minimum coding rules using Checkstyle during the build phase.

In the past, we had a challenge with the cost and quality of code reviews. As the amount of code increased at RENGA, the burden on reviewers also increased. We spend about 5% of our development time on code reviews, and reviewers spend an additional 1 hour a day on average for code reviews. Code reviews can, however, be a bottleneck when we want to quickly release new features and promptly deliver those values to our users. Also, the more code we need to review, the less accurate it is to identify issues, and some issues may even be overlooked. One solution is to increase reviewers, but that isn’t easy because code reviews require not only extensive business and technical knowledge, but also an understanding of the top modules. So we needed an automated tool that could offload the reviewers’ workload.

Adopting CodeGuru Reviewer

CodeGuru Reviewer is an automated tool that you can seamlessly integrate into your development pipeline, so we found it very easy to adopt the tool. We had high expectations that the tool may not only solve cost issues, but also help us gain insights from a different perspective than humans because the tool is based on ML. The tool was still in the preview stage, but upon being offered a free tier, we decided to try it out.

The following architecture illustrates RENGA’s development pipeline.

Setting up CodeGuru Reviewer is very easy. All you need to do is select a repository on GitHub and associate it with the tool. AWS recommends that you create a new GitHub user for CodeGuru Reviewer before associating a GitHub repository with the tool. CodeGuru Reviewer is enabled after the association is complete.

The following screenshot shows the Associate repository page on the CodeGuru console.

CodeGuru Reviewer is triggered by a pull request. The tool provides recommendations by adding comments to the pull request, usually within 15 minutes after the request is made.

The following text is a recommendation that CodeGuru Reviewer generated:

It is more efficient to directly use Stream::min or Stream::max than Stream::sorted and Stream::findFirst. The former is O(n) in terms of time while the latter is not. Also the former is O(1) and the latter is O(n) in terms of memory.

The part of the code the recommendation pointed out was actually not a bottleneck, but it did help us improve performance. Learning this method has helped our developers code better with more confidence.

The following text is another example of a recommendation from CodeGuru Reviewer:

Consider closing the resource returned by the following method call: newInputStream. Currently, there are execution paths that do not contain closure statements, e.g., when exception is thrown by SampleData.read. Either a) close the object returned by newInputStream() in a try-finally block or b) close the resource by declaring the object returned by newInputStream() in a try-with-resources block.

Although there was no actual resource leakage, we modified the part that was pointed out to clarify that no leakage will occur, which improved the code readability.

After adopting CodeGuru Reviewer, we feel that the product generates fewer recommendations compared to other existing static analytics tools, but it provides highly accurate recommendations with less false-positive advice. Too many recommendations require extra time for triage, so accurate recommendations are a big bonus for busy developers.

Summary

Although the code review process is important, it shouldn’t increase the workload for reviewers and become a bottleneck in development. By adopting CodeGuru Reviewer, we successfully automated code reviews and reduced reviewers’ workloads. Furthermore, learning the best practices of coding—which we weren’t aware of—has helped us develop with more confidence. Going forward, we plan on measuring metrics such as cyclomatic complexity so we can provide higher-quality services to our customers promptly. We also expect that CodeGuru Reviewer will further expand its recommendation items.


About the Authors

Kazuma Ohara is the Director of RENGA Inc., an internet services company headquartered in Japan.

 

Yumiko Kanasugi is a Solutions Architect with Amazon Web Services Japan, supporting digital native business customers to utilize AWS.

How DevFactory builds better applications with Amazon CodeGuru

This post is written in collaboration with DevFactory, an AWS Select Technology Partner.

DevFactory is an enterprise SaaS-focused company that is responsible for innovation, development, and operation of over 120 enterprise products. DevFactory also offers DevGraph, an integrated suite of software development tools built on AWS.

Amazon CodeGuru is an automated code review service that helps developers improve their code quality by recommending actions in code reviews. CodeGuru consists of two services:

  • Amazon CodeGuru Reviewer – Uses ML to identify critical issues and hard-to-find bugs in your code and recommends how to remediate them
  • Amazon CodeGuru Profiler – Helps you find your application’s most expensive lines of code by analyzing its runtime performance

In this post, we talk about how DevFactory uses CodeGuru Reviewer to improve their software as a service (SaaS) applications.

What is CodeGuru Reviewer?

CodeGuru Reviewer is a code review service that uses a combination of machine learning (ML) and human curation techniques to analyze millions of lines of code from over 10,000 open-source projects and the Amazon internal code base to learn coding practices. It uses these models to find code issues such as concurrency race conditions, resource leaks, and wasted CPU cycles.

DevFactory’s challenge

DevFactory has more than 120 products and manages over 650 million lines of code. Most of these products were developed over the last two decades and therefore have custom code to implement widely available, off-the-shelf services. To adopt, upgrade, and maintain the code base with a global, fully remote workforce, DevFactory is constantly evolving and adding automation where necessary.

One key part of the strategy is to identify and enhance the gems in each newly acquired product. These are the services, features, and applications that are both unique and valuable to the customers. ML-driven forecasts? Business intelligence from social graphs? Containerization and productivity enhancement at scale? DevFactory wants their engineering teams to deliver these to customers, and leave the undifferentiated heavy lifting to AWS services and infrastructure.

DevFactory by the numbers:

  • Verticals: 20
  • Repositories: 6,000
  • Lines of code: ~650 million
  • Number of languages: 45

In addition to jettisoning undifferentiated code, monitoring and maintaining the existing code base requires effort. DevFactory’s ideal code analysis solution is:

  • Accurate and focused – The most valuable code analysis tools are both highly accurate and highly targeted. Static analysis, for example, routinely turns up thousands of issues in perfectly acceptable code bases because it has both false and unimportant positives.
  • Specialized – To truly improve the code base, specialized tools were needed to find issues during the following stages:
    • Development – Coding style, correctness, and more
    • Deployment – Efficient use of the right services
    • Implementation – Performance and security
  • Up-to-date – Updating to the latest API or SDK can result in unintended consequences. Any code analysis tool needs to keep up with this creative destruction and enforce correct usage of ever-new services.
  • Actionable – Code reviews and style guides are helpful, but to operate existing and new products at DevFactory’s scale, they need automated analysis and automated actions. DevFactory values issue-finders that lend themselves to their (rather sophisticated) issue-fixing techniques.

How CodeGuru helps DevFactory

When CodeGuru was first unveiled at re:Invent 2019, DevFactory wanted to try it as soon as possible and enrolled in the early beta program. After running CodeGuru against code base repositories, DevFactory made the following findings:

  • CodeGuru is predictably the leader in detecting AWS service misuse and recommending actions, which is worth a lot to anyone relying on AWS services. For DevFactory, CodeGuru flagged syntactically valid code that still produced inaccurate results due to paginated Amazon DynamoDB query results.
  • CodeGuru resource leaks and security issue coverage is precise, actionable, and expanding. DevFactory concluded that the 21 issues CodeGuru flagged were much more valuable than the over 500 generic non-issues (and quite a few false positives) other generic tools turned up.
  • As a managed service, CodeGuru reduces the burden of finding issues with the issue-finder. For the small team that does diligence on hundreds of repositories each week, reliability is just as important as accuracy.
  • CodeGuru Reviewer helped DevFactory rewrite its DevGraph product, FogBugz, in cloud-native format.
  • CodeGuru Profiler helped DevFactory optimize its DevGraph product, EngineYard, for its new container-based offering.

Conclusion

CodeGuru, CodeGuru Profiler, and CodeGuru Reviewer are now generally available. For more information about getting started with these services, see the following:


About the Author

Muhammad Mansoor is a Solutions Architect and part of the AWS team based in New York City. Muhammad has a background in DevOps, Containers, Enterprise Transformation and Cloud Migration. In his spare time he loves to spend time with his family and enjoys running.

Visualizing TensorFlow training jobs with TensorBoard

TensorBoard is an open-source toolkit for TensorFlow users that allows you to visualize a wide range of useful information about your model, from model graphs; to loss, accuracy, or custom metrics; to embedding projections, images, and histograms of weights and biases.

This post demonstrates how to use TensorBoard with Amazon SageMaker training jobs, write logs from TensorFlow training scripts to Amazon Simple Storage Service (Amazon S3), and ways to run TensorBoard: locally, using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, or inside of an Amazon SageMaker notebook instance.

Generating training logs using tf.summary

TensorFlow comes with a tf.summary module to write summary data, which it uses for monitoring and visualization. The module’s API provides methods to write scalars, audio, histograms, text, and image summaries, and can trace information that’s useful for profiling training jobs. An example command to write the accuracy of the first step of training looks like the following:

tf.summary.scalar('accuracy', 0.45, step=1)

To use the summary data after the training job is complete, it’s important to write the files to a persistent storage. This way, you can visualize your past jobs or compare different runs during the hyperparameter tuning phase. The tf.summary module allows you to use Amazon S3 as the destination for log files, passing the S3 bucket URI directly into the create_file_writer method. See the following code:

tf.summary.create_file_writer('s3://<bucket_name>/<prefix>')
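
Summaries written inside the writer's as_default() scope are then stored at that S3 location. The following short sketch logs a scalar per step; the bucket path is a placeholder and the loss value stands in for your real metric.

import tensorflow as tf

writer = tf.summary.create_file_writer('s3://<bucket_name>/<prefix>')
with writer.as_default():
    for step in range(100):
        dummy_loss = 1.0 / (step + 1)                 # stand-in for a real training loss
        tf.summary.scalar('loss', dummy_loss, step=step)
writer.flush()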

Keras users can use keras.callbacks.TensorBoard as one of the callbacks provided to the Model.fit() method. This callback provides an abstraction of a low-level tf.summary API and collects a lot of the data automatically. With TensorBoard callbacks, you can collect data to visualize training graphs, metrics plots, activation histograms, and run profiling. See the following code:

tb_callback = tf.keras.callbacks.TensorBoard(log_dir='s3://<bucket_name>/<prefix>')
model.fit(x, y, epochs=5, callbacks=[tb_callback])

For a detailed example of how to collect summary data in the training scripts, see the TensorBoard Keras example notebook on the Amazon SageMaker examples GitHub repo or inside a running Amazon SageMaker notebook instance on the Amazon SageMaker Examples tab. This notebook uses TensorFlow 2.2 and Keras to train a Convolutional Neural Network (CNN) to recognize images from the CIFAR-10 dataset. Code in the notebook runs the training job locally inside the notebook instance one time, and then another 10 times during the hyperparameter tuning job. All training jobs write log files under one Amazon S3 prefix, so the log destination path for every run follows the format s3://<bucket_name>/<project_name>/logs/<training_job_name>, where the project name is tensorboard_keras_cifar10.

The notebook also demonstrates how to run TensorBoard inside of the Amazon SageMaker notebook instance. This method has some limitations; for example, the TensorBoard command blocks the run of the notebook and lives as long as the notebook instance is alive, but allows you to quickly access the dashboard and make sure the training is running correctly.

In the following sections, we look at other ways to run TensorBoard.

Running TensorBoard on your local machine

If you want to run TensorBoard locally, the first thing you need to do is to install TensorFlow:

pip3 install tensorflow

An independent distribution of TensorBoard is also available, but it has limited functionality if run without TensorFlow. For this post, we use TensorBoard as part of the TensorFlow distribution.

Assuming your AWS Command Line Interface (AWS CLI) is installed and configured properly, we simply run TensorBoard pointing to the Amazon S3 directory containing the generated summary data:

AWS_REGION=eu-west-1 tensorboard --logdir s3://<bucket_name>/tensorboard_keras_cifar10/logs/

You must specify the region where your S3 bucket is located. You can find the right region in the list of buckets on the Amazon S3 console.

The user you use must have read access to the specified S3 bucket. For more information about securely granting access to S3 buckets to a specific user, see Writing IAM Policies: How to Grant Access to an Amazon S3 Bucket.

You should see something similar to the following screenshot.

Running TensorBoard on Amazon ECS on AWS Fargate

If you prefer to have an instance of TensorBoard permanently running and accessible to your whole team, you can deploy it as an independent application in the cloud. One of the easiest ways to do this without managing servers is AWS Fargate, a serverless compute engine for containers. The following diagram illustrates this architecture.

You can deploy an example TensorBoard container image with all required roles and an Application Load Balancer by using the provided AWS CloudFormation template:

 

 

This template has five input parameters:

  • TensorBoard container image – Use tensorflow/tensorflow for a standard distribution or a custom container image if you want to enable the Profiler plugin
  • S3Bucket – Enter the name of the bucket where TensorFlow logs are stored
  • S3Prefix – Enter the path to the TensorFlow logs inside of the bucket; for example, tensorboard_keras_cifar10/logs/
  • VpcId – Select the VPC where you want TensorBoard to be deployed to
  • SubnetId – Select two or more subnets in the selected VPC

This example solution doesn’t include authorization and authentication mechanisms. Remember that if you deploy TensorBoard to a publicly accessible subnet, your TensorBoard instance and training logs are accessible to everyone on the internet. You can secure TensorBoard with the following methods:

After you create the CloudFormation stack, you can find the link to the deployed TensorBoard on the Outputs tab on the AWS CloudFormation console.

Using a custom TensorBoard container image

Because TensorBoard is part of the TensorFlow distribution, we can use the official tensorflow Docker container image hosted on Docker Hub.

Optionally, we can build a custom image with the optional Profiler TensorBoard plugin to visualize profiling data:

#Dockerfile
FROM tensorflow/tensorflow

RUN python3 -m pip install --upgrade --no-cache-dir tensorboard_plugin_profile

EXPOSE 6006

ENTRYPOINT ["tensorboard"]

You can build and test the container locally:

docker build -t tensorboard .

docker run -p 6006:6006 \
    --env AWS_ACCESS_KEY_ID=XXXXX \
    --env AWS_SECRET_ACCESS_KEY=XXXXX \
    --env AWS_REGION=eu-west-1 \
    tensorboard \
    --logdir s3://bucket_name/tensorboard_keras_cifar10/logs/

After testing the container, you need to push it to a container image repository of your choice. Detailed instructions on deploying an application aren’t in the scope of this post. To set up Amazon ECS and Elastic Load Balancer, see Building, deploying, and operating containerized applications with AWS Fargate.

Conclusion

In this post, I showed you how to use TensorBoard to visualize TensorFlow training jobs using Amazon S3 as storage for the logs. You can use this solution and the example notebooks to build and train a model with Amazon SageMaker and run a hyperparameter tuning job. You can use TensorBoard to compare hyperparameters from different training runs, generate and display confusion matrices for the classifier, and profile and visualize the training job’s performance.


About the Author

Yegor Tokmakov is a solutions architect at AWS, working with startups. Before joining AWS, Yegor was Chief Technology Officer at a healthcare startup based in Berlin and was responsible for architecture and operations, as well as product development and growth of the tech team. Yegor is passionate about novel AI applications and data analytics. You can find him at @yegortokmakov on Twitter.

Creating a sophisticated conversational experience using Amazon Lex in Australian English

Amazon Lex is a service for building conversational interfaces into any application using voice and text. To build truly engaging conversational experiences, you need high quality speech recognition and natural language understanding that understands the intent of the customer accurately.

We are excited to announce that Amazon Lex now supports Australian English. With Australian English, you can deliver a robust and localized conversational experience that accurately understands the Australian dialect. You can also respond to users with natural sounding Amazon Polly Australian voices to provide a fully localized conversational experience.

This post shows how you can build a bot with Australian English support and use the pre-defined built-ins to deliver a superior experience to your users.

Building an Amazon Lex bot

This post uses the following conversation to model a bot:

User: I’d like to schedule an appointment for a dishwasher repair.
Agent: Sure, what city are you in?
User: Ballarat.
Agent: Got it. And what’s the serial number?
User: 1234.
Agent: Ok. For that model, I have technicians available next week. When would you prefer to have them visit?
User: Can I get someone on 4/9?
Agent: Sure. You are all set for September 4th.
User: Thanks!

The first step is to build an Amazon Lex bot with intents to manage appointments. The ScheduleAppointment, ModifyAppointment, and CancelAppointment intents provide this capability. You can download a sample bot definition to follow along with this post.

Creating a bot in Australian English

For this post, you will create an Amazon Lex bot called AppointmentBot. Alternatively, you can also import the sample bot directly and skip the bot creation process.

  1. Open the AWS console, and make sure you have selected the Asia Pacific (Sydney)/ap-southeast-2 region.
  2. Go to the Amazon Lex console. On the Bots tab, choose Create.
  3. Select Custom Bot.
  4. Enter the Bot name and select English (AU) as Language.
  5. Select the Output voice. The list of voices is specific to the language selected for the bot.
  6. Specify Sentiment analysis, Session timeout, and COPPA settings.
  7. Choose Create.
  8. Once the bot is created, create ScheduleAppointment, ModifyAppointment, and CancelAppointment intents for your bot by adding sample utterances and slot values.
  9. Select Build.

At this point, you should have a working Lex bot.

Setting up an Amazon Connect flow

In this section, we deploy the bot in an Amazon Connect Interactive Voice Response (IVR).

Creating your Amazon Connect instance

In this first step, you create your Amazon Connect instance:

  1. On the AWS Management Console, choose Amazon Connect.
  2. If this is your first Amazon Connect instance, choose Get started; otherwise, choose Add an instance.
  3. For Identity management, choose Store users within Amazon Connect.
  4. Enter a URL prefix such as test-instance-############, where “############” is your current AWS account number.
  5. Choose Next step.
  6. For Create an administrator, enter a name, password, and email address.
  7. Choose Next step.
  8. For Telephony Options, leave both call options selected by default.
  9. Choose Next step.
  10. For Data storage, choose Next step.
  11. Review the settings and choose Create instance.
  12. After your instance is created, choose Get started.

Associating your bot with your Amazon Connect instance

Now that you have an Amazon Connect instance, you can claim a phone number, create a contact flow, and integrate with the AppointmentBot Lex bot you created in the prior step. First, associate your bot with your Amazon Connect instance:

  1. On the Amazon Connect console, open your instance by choosing the Instance Alias.
  2. Choose Contact flows.
  3. From the drop-down list, choose AppointmentBot. If you don’t see the bot in the list, make sure you have selected the same Region you used when you created your Lex bot.
  4. Choose + Add Lex Bot.

Configuring Amazon Connect to work with your bot

Now you can use your bot with Amazon Connect.

  1. On the Amazon Connect dashboard, for Step 1, choose Begin.
  2. For your phone number, choose Australia for the country, Direct Dial or Toll Free, and choose a phone number.
  3. Choose Next.
  4. If you want to test your new phone number, try it on the next screen or choose Skip for now.

For this post, you can skip hours of operation, creating queues, and creating prompts. For more information on these features, see the Amazon Connect Administrator Guide.

  1. For Step 5, Create contact flows, choose View contact flows.
  2. Choose Create contact flow.
  3. Change the name of the contact flow to Manage repairs.
  4. From the Set drop-down menu, drag a Set voice card to the contact flow canvas. Open the card and change the Language to “English (Australian)”, choose an available Voice, and choose Save.
  5. Drag a connector from the Entry point card to your new Set voice card.
  6. From the Interact drop-down menu, drag a Play prompt card to the contact flow canvas.
  7. In the Play prompt details, choose Text-to-speech or chat text.
  8. In the text box, enter Hi. How can I help? You can schedule or change a repair appointment.
  9. Choose Save.
  10. Drag a connector from the Set voice card to your new Play prompt card.
  11. Drag a Get customer input card to the contact flow canvas.
  12. In the Get customer input details, choose Text-to-speech or chat text.
  13. In the text box, enter What would you like to do?
  14. Choose Amazon Lex, and choose the AppointmentBot bot in the drop-down list.
  15. Add each AppointmentBot intent to the contact flow: ScheduleAppointment, ModifyAppointment, CancelAppointment, and Disconnect.
  16. Choose Save.
  17. Drag a connector from the Play prompt card to the Get customer input card.
  18. Drag a Play prompt card to the contact flow.
  19. Choose Text-to-speech or chat text.
  20. In the text box, enter Is there anything else I can help you with?
  21. Choose Save.
  22. Drag connectors for each entry (except for Disconnect) in the Get customer input card to the new Play prompt card.
  23. Drag a connector from the new Play prompt card back to the Get customer input card.
  24. From the Terminate/Transfer drop-down menu, drag a Disconnect/Hang up card to the contact flow.
  25. Drag a connector from the Disconnect entry in the Get customer input card to the Disconnect/Hang up card.

Your contact flow should look something like the following image.

  1. Choose Save to save your contact flow.
  2. Choose Publish to make your new contact flow available to your callers.
  3. Choose the Dashboard icon from the side menu.
  4. Choose Dashboard.
  5. Choose View phone numbers.
  6. Choose your phone number to edit it, and change the contact flow or IVR to the Manage repairs contact flow you just created. Choose Save.

Your Amazon Connect instance is now configured to work with your Amazon Lex bot. Try calling the phone number to see how it works.

Conclusion

Conversational experiences are vastly improved when you understand your user’s accent and respond in a style familiar to them. With Australian English support on Amazon Lex, you can now build bots that are better at understanding the Australian English accent. You can also use pre-defined slots to capture local information such as names and cities. Australian English allows you to deliver a robust and localized conversational experience to your users in Australia. Australian English is available at the same price, and in the same Regions, as US English. You can try Australian English via the console, the AWS Command Line Interface (AWS CLI), and the AWS SDKs. Start building your Aussie bot today!


About the Authors

Brian Yost is a Senior Consultant with the AWS Professional Services Conversational AI team. In his spare time, he enjoys mountain biking, home brewing, and tinkering with technology.

 

 

 

Anubhav Mishra is a Product Manager with AWS. He spends his time understanding customers and designing product experiences to address their business challenges.

Translating PDF documents using Amazon Translate and Amazon Textract

In 1993, the Portable Document Format or the PDF was born and released to the world. Since then, companies across various industries have been creating, scanning, and storing large volumes of documents in this digital format. These documents and the content within them are vital to supporting your business. Yet in many cases, the content is text-heavy and often written in a different language. This limits the flow of information and can directly influence your organization’s business productivity and global expansion strategy. To address this, you need an automated solution to extract the contents within these PDFs and translate them quickly and cost-efficiently.

In this post, we show you how to create an automated and serverless content-processing pipeline for analyzing text in PDF documents using Amazon Textract and translating them with Amazon Translate.

Amazon Textract automatically extracts text and data from scanned documents. Amazon Textract goes beyond simple OCR to also identify the contents of fields in forms and information stored in tables. This allows Amazon Textract to read virtually any type of document and accurately extract text and data without needing any manual effort or custom code.

Once the text and data are extracted, you can use Amazon Translate, a neural machine translation service that delivers fast, high-quality, and affordable language translation. Neural machine translation is a form of language translation automation that uses deep learning models to deliver more accurate and natural-sounding translation than traditional statistical and rule-based translation algorithms. The translation service is trained on a wide variety of content across different use cases and domains to perform well on many kinds of content. Its asynchronous batch processing capability enables you to translate a large collection of text or HTML documents with a single API call.

Solution overview

To be scalable and cost-effective, this solution uses serverless technologies and managed services. In addition to Amazon Textract and Amazon Translate, the solution uses the following services:

  • Amazon Simple Storage Service (Amazon S3) – Stores your documents and allows for central management with fine-tuned access controls.
  • Amazon Simple Notification Service (Amazon SNS) – Enables you to decouple microservices, distributed systems, and serverless applications with a highly available, durable, secure, fully managed pub/sub messaging service.
  • AWS Lambda – Runs code in response to triggers such as changes in data, changes in application state, or user actions. Because services like Amazon S3 and Amazon SNS can directly trigger a Lambda function, you can build a variety of real-time serverless data-processing systems.
  • AWS Step Functions – Coordinates multiple AWS services into serverless workflows.

Solution architecture

The architecture workflow contains the following steps:

  1. Users upload a PDF for analysis to Amazon S3.
  2. The Amazon S3 upload triggers a Lambda function.
  3. The function invokes Amazon Textract to extract text from the PDF in batch mode.
  4. Amazon Textract sends an SNS notification when the job is complete.
  5. A Lambda function reads the Amazon Textract response and stores the extracted text in Amazon S3.
  6. The Lambda function from the previous step invokes Amazon Translate in batch mode to translate the extracted texts into the target language.
  7. The Step Functions-based job poller polls for the translation job to complete.
  8. Step Functions sends an SNS notification when the translation is complete.
  9. A Lambda function reads the translated texts in Amazon S3 and generates a translated document in Amazon S3.

The following diagram illustrates this architecture.

Architecture diagram showing how uploading a PDF document to the S3 bucket triggers text extraction with Amazon Textract followed by translation with Amazon Translate.

For processing documents at scale, you can expand this solution to include Amazon Simple Queue Service (Amazon SQS) to queue the jobs and handle any potential failure related to throttling and default service concurrency limits. For more information about the limits in Amazon Translate and Amazon Textract, see Guidelines and Limits and Limits in Amazon Textract, respectively.

Deploying the solution with AWS CloudFormation

The first step is to use an AWS CloudFormation template to provision the necessary resources needed for the solution, including the AWS Identity and Access Management (IAM) roles, IAM policies, and SNS topics.

  1. Launch the AWS CloudFormation template by choosing the following (this creates the stack in the us-east-1 Region):
  2. For Stack name, enter a unique stack name for this account; for example, document-translate.
  3. For TargetLanguageCode, enter the language code that you want your translated documents in; for example, es for Spanish.

For more information about supported languages, see Supported Languages and Language Codes.

  4. In the Capabilities and transforms section, select the check boxes to acknowledge that AWS CloudFormation will create IAM resources and transform the AWS Serverless Application Model (AWS SAM) template.

AWS SAM templates simplify the definition of resources needed for serverless applications. When deploying AWS SAM templates in AWS CloudFormation, AWS CloudFormation performs a transform to convert the AWS SAM template into a CloudFormation template. For more information, see Transform.

  5. Choose Create stack.

Screenshot showing the AWS CloudFormation launch page with the stack name and example input parameters.

The stack creation may take up to 20 minutes, after which the status changes to CREATE_COMPLETE. You can see the name of the newly created S3 bucket on the Outputs tab.

Translating the document

To translate your document, upload a document in English to the input folder of the S3 bucket you created in the previous step.

Screenshot of the S3 bucket with the document uploaded for translation.

For this post, we scanned the “Universal Declaration of Human Rights,” created by the United Nations.

Screenshot of a sample scanned document (UN Declaration of Human Rights) in English.

This upload event triggers the Lambda function <Stack name>-S3EventProcessor-<Random string>, which invokes the Amazon Textract startDocumentTextDetection API to extract the text from the scanned document.

When Amazon Textract completes the batch job, it sends an SNS notification. The notification triggers the Lambda function <Stack name>-TextractSNSEventProcessor-<Random string>, which processes the Amazon Textract response page by page to extract the LINE block elements to store them in the S3 bucket.

Amazon Textract extracts LINE block elements with a BoundingBox. A sentence in the scanned document results in multiple LINE block elements. To make sure that Amazon Translate has the entire sentence in scope for translation, the solution combines multiple LINE block elements to recreate the sentence boundary in the source document. This is done by using the BreakIterator class available for Java. For more information, see Class BreakIterator.

The sentences are then stored in the S3 bucket as individual objects. Finally, the Amazon Translate job startTextTranslationJob is invoked with the input S3 bucket location where the text to be translated is available.
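
The following is an illustrative equivalent of that call using the AWS SDK for Python (Boto3); the solution's Lambda functions are written in Java, and the bucket prefixes, role ARN, and job name below are placeholders.

import boto3

translate = boto3.client('translate')

response = translate.start_text_translation_job(
    JobName='pdf-document-translation',                      # placeholder name
    InputDataConfig={
        'S3Uri': 's3://<bucket_name>/<input_prefix>/',       # extracted sentences
        'ContentType': 'text/plain'
    },
    OutputDataConfig={
        'S3Uri': 's3://<bucket_name>/<output_prefix>/'
    },
    DataAccessRoleArn='arn:aws:iam::<account_id>:role/<translate_data_access_role>',
    SourceLanguageCode='en',
    TargetLanguageCodes=['es']
)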

The Amazon Translate job completion SNS notification from the job poller triggers the Lambda function <Stack name>-TranslateJobSNSEventProcessor-<Random string>. The function creates the editable document by combining the translated texts created by the Amazon Translate batch job in the output folder of the S3 bucket with the following naming convention: inputFileName-TargetLanguageCode.docx.

Screenshot of the S3 bucket output folder where the translated document is located.

The following screenshot shows the document translated in Spanish.

Screenshot of the input sample scanned document (UN Declaration of Human Rights) translated from English to Spanish.

The solution also supports translating documents for right-to-left (RTL) script languages such as Arabic and Hebrew. The following screenshot shows the translated document in Arabic (language code: ar).

Screenshot of the input sample scanned document (UN Declaration of Human Rights) translated from English to Arabic.

For any pipeline failure, check the Amazon CloudWatch logs for the corresponding Lambda function and look for potential errors that caused the failure.

To do a translation in a different language, you can update the LANG_CODE environment variable for the <Stack name>-TextractSNSEventProcessor-<Random string> function and trigger the solution pipeline by uploading a new document into the input folder of the S3 bucket.

Conclusion

In this post, we demonstrated how to extract text from PDF documents and translate them into an editable document in a different language using Amazon Translate asynchronous batch processing. For a low-latency, low-throughput solution translating smaller PDF documents, you can perform the translation through the real-time Amazon Translate API.

The ability to process data at scale is becoming important to organizations across all industries. Managed machine learning services like Amazon Textract and Amazon Translate can simplify your document processing and translation needs, helping you focus on addressing core business needs while keeping overall IT costs manageable.

For further reading, we recommend the following:


About the Authors

Siva Rajamani is a Boston-based Enterprise Solutions Architect for AWS. He enjoys working closely with customers and supporting their digital transformation and AWS adoption journey. His core areas of focus are Serverless, Application Integration, and Security. Outside of work, he enjoys outdoors activities and watching documentaries.

 

 

Sudhanshu Malhotra is a Boston-based Enterprise Solutions Architect for AWS. He’s a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His core areas of focus are DevOps, Machine Learning, and Security. When he’s not working with customers on their journey to the cloud, he enjoys reading, hiking, and exploring new cuisines.

 

 
