Deploying reinforcement learning in production using Ray and Amazon SageMaker


Reinforcement learning (RL) is used to automate decision-making in a variety of domains, including games, autoscaling, finance, robotics, recommendations, and supply chain. Launched at AWS re:Invent 2018, Amazon SageMaker RL helps you quickly build, train, and deploy policies learned by RL. Ray is an open-source distributed execution framework that makes it easy to scale your Python applications. Amazon SageMaker RL uses the RLlib library that builds on the Ray framework to train RL policies.

This post walks you through the tools available in Ray and Amazon SageMaker RL that help you address challenges such as scale, security, iterative development, and operational cost when you use RL in production. For a primer on RL, see Amazon SageMaker RL – Managed Reinforcement Learning with Amazon SageMaker.

Use case

In this post, we take a simple supply chain use case, in which you’re deciding how many basketballs to buy for a store to meet customer demand. The agent decides how many basketballs to buy every day, and it takes 5 days to ship the product to the store. The use case is a multi-period variant of the classic newsvendor problem. Newspapers lose value in a single day, and therefore, each agent decision is independent. Because the basketball remains valuable as long as it’s in the store, the agent has to optimize its decisions over a sequence of steps. Customer demand has inherent uncertainty, so the agent needs to balance the trade-off between ordering too many basketballs that may not sell and incur storage cost versus buying too few, which can lead to unsatisfied customers. The objective of the agent is to maximize the sales while minimizing storage costs and customer dissatisfaction. We refer to the agent as the basketball agent in the rest of the post.

You need to address the following challenges to train and deploy the basketball agent in production:

  1. Formulate the problem. Determine the state, action, rewards, and state transition of the problem. Create a simulation environment that captures the problem formulation.
  2. Train the agent. Training with RL requires precise algorithm implementations because minor errors can lead to poor performance. Training can require millions of interactions between the agent and the environment before the policy converges. Therefore, distributed training becomes necessary to reduce training times. You need to make various choices while training the agent: picking the state representation, the reward function, the algorithm to use, the neural network architecture, and the hyperparameters of the algorithm. It quickly becomes overwhelming to navigate these options, experiment at scale, and finalize the policy to use in production.
  3. Deploy and monitor policy. After you train the policy and evaluate its performance offline, you can deploy it in production. When deploying the policy, you need to ensure the policy behaves as expected and scales to the workload in production. You can perform A/B testing, continually deploy improved versions of the model, and look out for anomalous behavior. Developing and maintaining the deployment infrastructure in a secure, scalable, and cost-effective manner can be an onerous task.

Amazon SageMaker RL and Ray provide the undifferentiated heavy lifting infrastructure required for deploying RL at scale, which helps you focus on the problem at hand. You can provision resources with a single click, use algorithm implementations that make efficient use of the provisioned resources, and track, visualize, debug, and replicate your experiments. You can deploy the model with a managed microservice that autoscales and logs model actions. The rest of the post walks you through these steps with the basketball agent as our use case.

Formulating the problem

For the basketball problem, our state includes the expected customer demand, the current inventory in the store, the inventory on the way, and the cost of purchasing and storing the basketballs. The action is the number of basketballs to order. The reward is the net profit with a penalty for missed demand. This post includes code for a simulator that encodes the problem formulation using the de-facto standard Gym API. We assume a Poisson demand profile. You can use historical customer demand data in the simulator to capture real-world characteristics.
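To make the formulation concrete, the following is a minimal sketch of such a simulator, not the code that accompanies this post. It mirrors the Gym reset/step interface without importing Gym so that it runs on the standard library alone. The class name, the prices and costs, the 30-day horizon, and the exact reward weights are all illustrative assumptions; only the 5-day lead time, the Poisson demand, and the state, action, and reward components come from the description above.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson sample with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


class BasketballInventoryEnv:
    """Gym-style environment (reset/step) for the multi-period ordering problem.

    All default values are hypothetical and chosen only for illustration.
    """

    def __init__(self, lead_time=5, mean_demand=10.0, price=20.0, unit_cost=10.0,
                 storage_cost=1.0, missed_sale_penalty=5.0, horizon=30, seed=0):
        self.lead_time = lead_time
        self.mean_demand = mean_demand
        self.price = price
        self.unit_cost = unit_cost
        self.storage_cost = storage_cost
        self.missed_sale_penalty = missed_sale_penalty
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.reset()

    def _observation(self):
        # State: expected demand, on-hand inventory, in-transit orders, and the two costs.
        return [self.mean_demand, self.on_hand, *self.pipeline,
                self.unit_cost, self.storage_cost]

    def reset(self):
        self.on_hand = 0
        self.pipeline = [0] * self.lead_time  # pipeline[0] is the next order to arrive
        self.day = 0
        return self._observation()

    def step(self, order_quantity):
        # The oldest order arrives; today's order joins the back of the pipeline.
        self.on_hand += self.pipeline.pop(0)
        self.pipeline.append(int(order_quantity))

        demand = sample_poisson(self.rng, self.mean_demand)
        sold = min(demand, self.on_hand)
        missed = demand - sold
        self.on_hand -= sold

        # Reward: net profit minus storage cost and a penalty for missed demand.
        reward = (self.price * sold
                  - self.unit_cost * order_quantity
                  - self.storage_cost * self.on_hand
                  - self.missed_sale_penalty * missed)
        self.day += 1
        done = self.day >= self.horizon
        return self._observation(), reward, done, {"demand": demand, "missed": missed}


env = BasketballInventoryEnv()
observation = env.reset()
observation, reward, done, info = env.step(12)  # order 12 basketballs on day one
print(observation, reward, done, info)
```

Because an order only reaches the store five steps after it is placed, the first few rewards are negative no matter what the agent does, which is why the agent has to reason over a sequence of decisions rather than one day at a time.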

You can create a simulator in different domains, such as financial portfolio management, autoscaling, and multi-player gaming. Amazon SageMaker RL notebooks provide examples of simulators with custom packages and datasets.

Training your agent

You can start training your agent with RL using state-of-the-art algorithms available in Ray. The algorithms have been tested on well-known published benchmarks, and you can customize them to specify your loss function, neural network architecture, logging metrics, and more. You can choose between the TensorFlow and PyTorch deep learning frameworks.

Amazon SageMaker RL makes it easy to get started with Ray using a managed training experience. You can launch experiments on secure, up-to-date instances with pre-built Ray containers using familiar Jupyter notebooks. You pay for storage and instances based on your usage, with no minimum fees or upfront commitments, which helps keep costs low.

The following code shows the configuration for training the basketball agent using the proximal policy optimization (PPO) algorithm with a single instance:

def get_experiment_config(self):
    return {
        "training": {
            "env": "Basketball-v1",
            "run": "PPO",
            "config": {
                "lr": 0.0001,
                # one CPU is left for the driver; the rest run rollout workers
                "num_workers": (self.num_cpus - 1),
                "num_gpus": self.num_gpus,
            },
        }
    }

To train a policy with Amazon SageMaker RL, you start a training job. You can save up to 90% on your training cost by using managed spot training, which uses Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances instead of On-Demand Instances. Just enable train_use_spot_instances and set the train_max_wait. Amazon SageMaker restarts your training jobs if a Spot Instance is interrupted, and you can configure managed spot training jobs to use periodically saved checkpoints. For more information about using Spot Instances, see Managed Spot Training: Save Up to 90% On Your Amazon SageMaker Training Jobs. The following code shows how you can launch the training job using Amazon SageMaker RL APIs:

estimator = RLEstimator(base_job_name='basketball',
                        entry_point="train_basketball.py",
                        image_name=ray_tf_image,
                        train_instance_type='ml.m5.large',
                        train_instance_count=1,
                        train_use_spot_instances=True, # use spot instance
                        train_max_wait=7200, #seconds,
                        checkpoint_s3_uri=checkpoint_s3_uri, # s3 for checkpoint syncing
                        hyperparameters = {
                                # necessary for syncing between spot instances
                                "rl.training.upload_dir": checkpoint_s3_uri, 
                        }
                        ...)
estimator.fit()                       

Amazon SageMaker RL saves the metadata associated with each training job, such as the instance used, the source code, and the metrics logged. The print logs are saved in Amazon CloudWatch, the training outputs are saved in Amazon Simple Storage Service (Amazon S3), and you can replicate each training job with a single click. The Amazon SageMaker RL training job console visualizes the instance resource use and training metrics, such as episode reward and policy loss.

The following example visualization shows the mean episode rewards, policy entropy, and policy loss over the training time. As the agent learns to take better actions, its rewards improve. The entropy indicates the randomness of the actions taken by the agent. Initially, the agent takes random actions to explore the state space, and the entropy is high. As the agent improves, its randomness and entropy reduce. The policy loss indicates the value of the loss function used by the RL algorithm to update the policy neural network. We use the PPO algorithm, for which the policy loss should remain close to zero during training.
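As a rough illustration of why entropy falls as the policy becomes more decisive, the following sketch computes the Shannon entropy of two discrete action distributions. The four-action setup and the probabilities are made up for illustration; the basketball agent's actual action distribution may be continuous, in which case the entropy is computed differently.

```python
import math


def policy_entropy(action_probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in action_probs if p > 0)


# Hypothetical distributions over four candidate actions.
early_policy = [0.25, 0.25, 0.25, 0.25]  # uniform: the agent is still exploring
late_policy = [0.94, 0.02, 0.02, 0.02]   # peaked: the agent mostly commits to one action

print("early:", policy_entropy(early_policy))
print("late: ", policy_entropy(late_policy))
```

The uniform distribution gives the maximum possible entropy for four actions, and the peaked one gives a much smaller value, which matches the downward trend you would expect to see in the entropy plot.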

Ray also creates a TensorBoard with training metrics and saves them to Amazon S3. The following visualizations show the same metrics in TensorBoard.

Ray is designed for distributed runs, so it can efficiently use all the resources available in an instance: CPU, GPU, and memory. You can scale the training further with multi-instance clusters by incrementing the train_instance_count in the preceding API call. Amazon SageMaker RL creates the cluster for you, and Ray uses cluster resources to train the agent rapidly.

You can choose to create heterogeneous clusters with multiple instance types (for more information, see the following GitHub repo). For more information about distributed RL training, see Scaling your AI-powered Battlesnake with distributed reinforcement learning in Amazon SageMaker.

You can scale your experiments by creating training jobs with different configurations of state representation, reward function, RL algorithms, hyperparameters, and more. Amazon SageMaker RL helps you organize, track, compare, and evaluate your training jobs with Amazon SageMaker Experiments. You can search, sort by performance metrics, and track the lineage of a policy when you deploy in production. The following code shows an example of experiments with multiple learning rates for training a policy, and sorting by the mean episode rewards:

# define a SageMaker Experiment
rl_exp = Experiment.create(experiment_name="learning_rate_exp",...)

# define first trial
first_trial = Trial.create(trial_name="lr-1e-3", 
                           experiment_name=rl_exp.experiment_name,...)    
estimator_1 = RLEstimator(...)
estimator_1.fit(experiment_config={"TrialName": first_trial.trial_name,...})    

# define second trial                          
second_trial = Trial.create(trial_name="lr-1e-4", 
                            experiment_name=rl_exp.experiment_name,...)
estimator_2 = RLEstimator(...)
estimator_2.fit(experiment_config={"TrialName": second_trial.trial_name,...})

# define third trial                          
third_trial = Trial.create(trial_name="lr-1e-5", 
                            experiment_name=rl_exp.experiment_name,...)
estimator_3 = RLEstimator(...)
estimator_3.fit(experiment_config={"TrialName": third_trial.trial_name,...})

# get trials sorted by mean episode rewards
trial_component_analytics = ExperimentAnalytics(
    experiment_name=rl_exp.experiment_name,
    sort_by="metrics.episode_reward_mean.Avg",
    sort_order="Descending",...).dataframe()

The following screenshot shows the output.

The following screenshot shows that we saved 60% of the training cost by using a Spot Instance.
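The savings figure is derived from two durations that Amazon SageMaker reports for a training job: the total training time and the billable time. The following sketch shows the arithmetic, assuming you have already read TrainingTimeInSeconds and BillableTimeInSeconds from the training job description. The function name and the example durations are hypothetical.

```python
def managed_spot_savings_percent(training_seconds, billable_seconds):
    """Percentage saved with managed spot training for one training job."""
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    return (1 - billable_seconds / training_seconds) * 100


# Hypothetical job: 1,000 seconds of training, of which 400 seconds were billed.
savings = managed_spot_savings_percent(training_seconds=1000, billable_seconds=400)
print("Managed spot training savings: {:.1f}%".format(savings))
```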

Deploying and monitoring the policy

After you train the RL policy, you can export the learned policy for evaluation and deployment. You can evaluate the learned policy against realistic scenarios expected in production, and ensure its behavior matches the expectation from domain expertise. You can deploy the policy in an Amazon SageMaker endpoint, to edge devices using AWS IoT Greengrass (see Training the Amazon SageMaker object detection model and running it on AWS IoT Greengrass – Part 3 of 3: Deploying to the edge), or natively in your production system.

Amazon SageMaker endpoints are fully managed. You deploy the policy with a single API call, and the required instances and load balancer are created behind a secure HTTP endpoint. The Amazon SageMaker endpoint autoscales the resources so that latency and throughput requirements are met with changing traffic patterns while incurring minimal cost.

With an Amazon SageMaker endpoint, you can check the policy performance in the production environment by A/B testing the policy against the existing model in production. You can log the decisions taken by the policy in Amazon S3 and check for anomalous behavior using Amazon SageMaker Model Monitor. You can use the resulting dataset to train an improved policy. If you have multiple policies, each for a different brand of basketball sold in the store, you can save on costs by deploying all the models behind a multi-model endpoint.

The following code shows how to extract the policy from the RLEstimator and deploy it to an Amazon SageMaker endpoint. The endpoint is configured to save all the model inferences using the Model Monitor feature.

from time import gmtime, strftime

endpoint_name = 'basketball-demo-model-monitor-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
prefix = 'sagemaker/basketball-demo-model-monitor'
data_capture_prefix = '{}/datacapture'.format(prefix)
s3_capture_upload_path = 's3://{}/{}'.format(s3_bucket, data_capture_prefix)
              
model = Model(model_data='s3://{}/{}/output/model.tar.gz'.format(s3_bucket, job_name),
              framework_version='2.1.0',
              role=role)

data_capture_config = DataCaptureConfig(
                            enable_capture=True, 
                            sampling_percentage=100, 
                            destination_s3_uri=s3_capture_upload_path)

predictor = model.deploy(initial_instance_count=1, 
                         instance_type="ml.c5.xlarge",
                         endpoint_name=endpoint_name,
                         data_capture_config=data_capture_config)

result = predictor.predict({"inputs": ...})                      

You can verify the configurations on the console. The following screenshot shows the data capture settings.

The following screenshot shows the model’s production variants.

When the endpoint is up, you can quickly trace back to the model trained under the hood. The following code demonstrates how to retrieve the job-specific information (such as TrainingJobName, TrainingJobStatus, and TrainingTimeInSeconds) with a few API calls:

import boto3

#SageMaker client used for the lookups below
sm = boto3.client('sagemaker')

#first get the endpoint config for the relevant endpoint
endpoint_config = sm.describe_endpoint_config(EndpointConfigName=endpoint_name)

#now get the model name for the model deployed at the endpoint. 
model_name = endpoint_config['ProductionVariants'][0]['ModelName']

#now look up the S3 URI of the model artifacts
model = sm.describe_model(ModelName=model_name)
modelURI = model['PrimaryContainer']['ModelDataUrl']

#search for the training job that created the model artifacts at above S3 URI location
search_params={
   "MaxResults": 1,
   "Resource": "TrainingJob",
   "SearchExpression": { 
      "Filters": [ 
         { 
            "Name": "ModelArtifacts.S3ModelArtifacts",
            "Operator": "Equals",
            "Value": modelURI
         }]}
}
results = sm.search(**search_params)

# trace lineage of the underlying training job
results['Results'][0]['TrainingJob'].keys()

The following screenshot shows the output.

When you invoke the endpoint, the request payload, response payload, and additional metadata are saved in the Amazon S3 location that you specified in DataCaptureConfig. You should expect to see different files from different time periods, organized based on the hour when the invocation occurred. The format of the Amazon S3 path is s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl.
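If you want to list the captured files for a particular hour, you can build the prefix from the path format above. The following is a small sketch; the bucket, endpoint name, and the AllTraffic variant name are placeholder values, and the sketch assumes the hour in the path is expressed in UTC.

```python
from datetime import datetime, timezone


def capture_prefix(destination_s3_uri, endpoint_name, variant_name, when):
    """Return the S3 prefix for capture files written during the hour containing `when`.

    `when` must be a timezone-aware datetime; it is converted to UTC (an assumption
    about how the capture folders are named).
    """
    when = when.astimezone(timezone.utc)
    return "{}/{}/{}/{:%Y/%m/%d/%H}/".format(
        destination_s3_uri.rstrip("/"), endpoint_name, variant_name, when)


# Placeholder values for illustration only.
prefix_for_hour = capture_prefix(
    "s3://my-bucket/sagemaker/basketball-demo-model-monitor/datacapture",
    "basketball-demo-model-monitor-2020-09-30-00-40-00",
    "AllTraffic",
    datetime(2020, 9, 30, 0, 47, 14, tzinfo=timezone.utc),
)
print(prefix_for_hour)
```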

The HTTP request and response payloads are saved in Amazon S3 as JSON Lines files organized by date. The following screenshot shows the view on the Amazon S3 console.

The following code is a line from the JSON file. With all the captured data, you can closely monitor the endpoint status and perform evaluation when necessary.

'captureData': {    
    'endpointInput': {   
        'data': 
            {"inputs": {
                "observations": 
                    [[1307.4744873046875, 737.0364990234375, 
                    2065.304931640625, 988.8933715820312,
                    357.6395568847656, 41.90699768066406,
                    60.84299850463867, 4.65033483505249,
                    5.944803237915039, 64.77123260498047]],
                "prev_action": [0], 
                "is_training": false,
                "prev_reward": -1, 
                "seq_lens": -1}},
        'encoding': 'JSON',
        'mode': 'INPUT',
        'observedContentType': 'application/json'},
    'endpointOutput': {
        'data': {
            "outputs": {
                "action_logp": [0.862621188],
                "action_prob": [2.36936307],
                "actions_0": [[-0.267252982]],
                "vf_preds": [0.00718466379],
                "action_dist_inputs": [[-0.364359707, -2.08935]]
            }
        },
        'encoding': 'JSON',
        'mode': 'OUTPUT',
        'observedContentType': 'application/json'}},
'eventMetadata': {
    'eventId': '0ad69e2f-c1b1-47e4-8334-47750c3cd504',
    'inferenceTime': '2020-09-30T00:47:14Z'
},
'eventVersion': '0'
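To evaluate the captured data, you typically reduce each record to the fields you care about. The following sketch extracts the observation, the chosen action, and the value estimate from one captured line. It assumes that each payload is stored as a JSON-encoded string under the data key, which may differ from the pretty-printed record above; the function name and the shortened sample record are illustrative.

```python
import json


def summarize_capture_line(line):
    """Pull the observation, action, and value estimate out of one captured record."""
    record = json.loads(line)
    capture = record["captureData"]
    # Assumption: request and response payloads are JSON-encoded strings.
    request = json.loads(capture["endpointInput"]["data"])
    response = json.loads(capture["endpointOutput"]["data"])
    return {
        "inference_time": record["eventMetadata"]["inferenceTime"],
        "observation": request["inputs"]["observations"][0],
        "action": response["outputs"]["actions_0"][0][0],
        "value_estimate": response["outputs"]["vf_preds"][0],
    }


# Shortened, hand-built sample record with the same field names as the record above.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {
            "data": json.dumps({"inputs": {"observations": [[1307.47, 737.04]],
                                           "prev_action": [0]}}),
            "encoding": "JSON",
            "mode": "INPUT",
        },
        "endpointOutput": {
            "data": json.dumps({"outputs": {"actions_0": [[-0.267252982]],
                                            "vf_preds": [0.00718466379]}}),
            "encoding": "JSON",
            "mode": "OUTPUT",
        },
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2020-09-30T00:47:14Z"},
    "eventVersion": "0",
})

print(summarize_capture_line(sample_line))
```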

Conclusion

With Ray and Amazon SageMaker RL, you can get started on reinforcement learning quickly and scale to production workloads. The total cost of ownership of Amazon SageMaker over a 3-year horizon is reduced by over 54% compared to other cloud options, and developers can be up to 10 times more productive.

The post just scratches the surface of what you can do with Amazon SageMaker RL. Give it a try and please send us feedback, either in the Amazon SageMaker Discussion Forum or through your usual AWS contacts.


About the Authors

Bharathan Balaji is a Research Scientist in AWS and his research interests lie in reinforcement learning systems and applications. He contributed to the launch of Amazon SageMaker RL and AWS DeepRacer. He likes to play badminton, cricket and board games during his spare time.

Anna Luo is an Applied Scientist in AWS. She obtained her Ph.D. in Statistics from UC Santa Barbara. Her interests lie in large-scale reinforcement learning algorithms and distributed computing. Her current personal goal is to master snowboarding.


Iris landmark tracking in the browser with MediaPipe and TensorFlow.js


Posted by Ann Yuan and Andrey Vakunov, Software Engineers at Google

Iris tracking enables a wide range of applications, such as hands-free interfaces for assistive technologies and understanding user behavior beyond clicks and gestures. Iris tracking is also a challenging computer vision problem. Eyes appear under variable light conditions, are often occluded by hair, and can be perceived as differently shaped depending on the head’s angle of rotation and the person’s expression. Existing solutions rely heavily on specialized hardware, often requiring a costly headset or a remote eye tracker system. These approaches are ill-suited for mobile devices with limited computing resources.

GIF of eye re-coloring tool in use
An example of eye re-coloring enabled.

In March we announced the release of a new package detecting facial landmarks in the browser. Today, we’re excited to add iris tracking to this package through the TensorFlow.js face landmarks detection model. This work is made possible by the MediaPipe Iris model. We have deprecated the original facemesh model, and future updates will be made to the face landmarks detection model.

Note that iris tracking does not infer the location at which people are looking, nor does it provide any form of identity recognition. In our model’s documentation and the accompanying Model Card, we detail the model’s intended uses, limitations and fairness attributes (aligned with Google’s AI Principles).

The MediaPipe iris model is able to track landmarks for the iris and pupil using a single RGB camera, in real-time, without the need for specialized hardware. The model also returns landmarks for the eyelids and eyebrow regions, enabling detection of slight eye movements such as blinking. Try the model out yourself right now in your browser.

Introducing @tensorflow/face-landmarks-detection

GIF of Facemesh predictions
Above left are predictions from @tensorflow-models/facemesh@0.0.4, above right are predictions from @tensorflow-models/face-landmarks-detection@0.0.1. Iris landmarks are in red.

Users familiar with our existing facemesh model will be able to upgrade to the new faceLandmarksDetection model with only a few code changes, detailed below. faceLandmarksDetection offers three major improvements over facemesh:

  1. Iris keypoints detection
  2. Improved eyelid contour detection
  3. Improved detection for rotated faces

These improvements are highlighted in the GIF above, which demonstrates how the landmarks returned by faceLandmarksDetection and facemesh differ for the same image sequence.

Installation

There are two ways to install the faceLandmarksDetection package:

  1. Through script tags:

    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.6.0/dist/tf.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/face-landmarks-detection"></script>

  2. Through NPM (via the yarn package manager):

    $ yarn add @tensorflow-models/face-landmarks-detection@0.0.1
    $ yarn add @tensorflow/tfjs@2.6.0

Usage

Once the package is installed, you only need to load the model weights and then pass in an image to start detecting facial landmarks:

// If you are using NPM, first require the model. If you are using script tags, you can skip this step because `faceLandmarksDetection` will already be available in the global scope.
const faceLandmarksDetection = require('@tensorflow-models/face-landmarks-detection');

// Load the faceLandmarksDetection model assets.
const model = await faceLandmarksDetection.load(
  faceLandmarksDetection.SupportedPackages.mediapipeFacemesh);

// Pass in a video stream to the model to obtain an array of detected faces from the MediaPipe graph.
// For Node users, the `estimateFaces` API also accepts a `tf.Tensor3D`, or an ImageData object.
const video = document.querySelector("video");
const faces = await model.estimateFaces({ input: video });

The input to estimateFaces can be a video, a static image, a `tf.Tensor3D` or even an ImageData object for use in node.js pipelines. FaceLandmarksDetection then returns an array of prediction objects for the faces in the input, which include information about each face (e.g. a confidence score, and the locations of 478 landmarks within the face).

Here is a sample prediction object:

{
  faceInViewConfidence: 1,
  boundingBox: {
    topLeft: [232.28, 145.26], // [x, y]
    bottomRight: [449.75, 308.36],
  },
  mesh: [
    [92.07, 119.49, -17.54], // [x, y, z]
    [91.97, 102.52, -30.54],
    ...
  ],
  // x,y,z positions of each facial landmark within the input space.
  scaledMesh: [
    [322.32, 297.58, -17.54],
    [322.18, 263.95, -30.54]
  ],
  // Semantic groupings of x,y,z positions.
  annotations: {
    silhouette: [
      [326.19, 124.72, -3.82],
      [351.06, 126.30, -3.00],
      ...
    ],
    ...
  }
}

Refer to our README for more details about the API.

Performance

FaceLandmarksDetection is a lightweight package containing only ~3MB of weights, making it ideally suited for real-time inference on a variety of mobile devices. When testing, note that TensorFlow.js also provides several different backends to choose from, including WebGL and WebAssembly (WASM) with XNNPACK for devices with lower-end GPUs. The table below shows how the package performs across a few different devices and TensorFlow.js backends:

Desktop:

Chart of desktop performance

Mobile:

All benchmarks were collected in the Chrome browser. See our earlier blogpost for details on how to activate SIMD for the TF.js WebAssembly backend.

Looking ahead

Both the TensorFlow.js and MediaPipe teams plan to add depth estimation capabilities to our face landmark detection solutions using the improved iris coordinates. We strongly believe in sharing code that enables reproducible research and rapid experimentation, and are looking forward to seeing how the wider community makes use of the MediaPipe iris model.

Try the demo!

Use this link to try our new package in your web browser. We look forward to seeing how you use it in your apps.


Adding custom data sources to Amazon Kendra


Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). Amazon Kendra provides native connectors for popular data sources like Amazon Simple Storage Service (Amazon S3), SharePoint, ServiceNow, OneDrive, Salesforce, and Confluence so you can easily add data from different content repositories and file systems into a centralized location. This enables you to use Kendra’s natural language search capabilities to quickly find the most relevant answers to your questions.

However, many organizations store relevant information in the form of unstructured data on company intranets or within file systems on corporate networks that are inaccessible to Amazon Kendra.

You can now use the custom data source feature in Amazon Kendra to upload content to your Amazon Kendra index from a wider range of data sources. When you select a connector type, the custom data source feature gives complete control over how documents are selected and indexed, and provides visibility and metrics on which content associated with a data source has been added, modified, or deleted.

In this post, we describe how to use a simple web connector to scrape content from unauthenticated webpages, capture attributes, and ingest this content into an Amazon Kendra index using the custom data source feature. This enables you to ingest your content directly to the index using the BatchPutDocument API, and allows you to keep track of the ingestion through Amazon CloudWatch log streams and through the metrics from the data sync operation.

Setting up a web connector

To use the custom data source connector in Amazon Kendra, you need to create an application that scrapes the documents in your repository and builds a list of documents. You ingest those documents into your Amazon Kendra index by using the BatchPutDocument operation. To delete documents, you provide a list of the document IDs and use the BatchDeleteDocument operation. To modify a document (for example, because it was updated), ingest it again with the same document ID; the document with the matching ID is replaced in your index.

For this post, we scrape HTML content from AWS FAQs for 11 AI/ML services:

We use the BeautifulSoup and requests libraries to scrape the content from the AWS FAQ website. The script first gets the content of an AWS FAQ page through the get_soup_from_url function. Based on the presence of certain CSS classes, it locates question-and-answer pairs, and for each URL it creates a text file to be ingested later into Amazon Kendra.

The solution in this post is for demonstration purposes only. We recommend running similar scripts only on your own websites after consulting with the team who manages them, or be sure to follow the terms of service for the website that you’re trying to scrape.

The following screenshot shows a sample of the script.

The following screenshot shows the results of a sample run.

The ScrapedFAQS.zip file contains the scraped documents.

Creating a custom data source

To ingest documents through the custom data source, you first need to create a data source. We assume you already have an Amazon Kendra index in your account. If you don’t, you can create a new index.

Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition, recommended for building proof of concepts (POCs), and the Amazon Kendra Enterprise Edition, which provides multi-AZ deployment, making it ideal for production. Amazon Kendra connectors work with both editions.

To create your custom data source, complete the following steps:

  1. On your index, choose Add data sources.
  2. For Custom data source connector, choose Add connector.
  3. For Data source name, enter a name (for example, MyCustomConnector).
  4. Review the information in the Next steps section.
  5. Choose Add data source.

Syncing documents using the custom data source

Now that your connector is set up, you can ingest documents in Amazon Kendra using the BatchPutDocument API, and get some metrics to track the status of ingestion. For that you need an ExecutionID, so before running your BatchPutDocument operation, you need to start a data source sync job. When the data sync is complete, you stop the data source sync job.

For this post, you use the latest version of the AWS SDK for Python (Boto3) and ingest 10 documents with the IDs 0–9.

Extract the .zip file containing the scraped content by using any standard file decompression utility. You should have 11 files on your local file system. In a real use case, these files are likely on a shared file server in your data center. When you create a custom data source, you have complete control over how the documents for the index are selected. Amazon Kendra only provides metric information that you can use to monitor the performance of your data source.

To sync your documents, enter the following code:

import boto3

#Index ID
index_id = "<YOUR-INDEX-ID>"
#Datasource ID
data_source_id = "<YOUR-DATASOURCE-ID>"

kendra = boto3.client('kendra')

#Start a data source sync job
result = kendra.start_data_source_sync_job(
    Id = data_source_id,
    IndexId = index_id
    )

print("Start data source sync operation: ")
print(result)

#Obtain the job execution ID from the result
job_execution_id = result['ExecutionId']
print("Job execution ID: "+job_execution_id)

#Start ingesting documents
try:
    #Part of the workflow will require you to have a list with your documents ready
    #for ingestion
    docs = get_docs(data_source_id, job_execution_id)
    #batchput docs
    result = kendra.batch_put_document(
        IndexId = index_id,
        Documents = docs
        )
    print("Response from batch_put_document:")
    print(result)

#Stop data source sync job
finally:
    #Stop data source sync
    result = kendra.stop_data_source_sync_job(
        Id = data_source_id,
        IndexId = index_id
        )
    print("Stop data source sync operation:")
    print(result)

If everything goes well, you see output similar to the following:

Start data source sync operation:
{
    'ExecutionId': 'a5ac1ba0-b480-46e3-a718-5fffa5006f1a',
    'ResponseMetadata': {
        'RequestId': 'a24a2600-0570-4520-8956-d58c8b1ef01c',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'a24a2600-0570-4520-8956-d58c8b1ef01c',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '54',
            'date': 'Mon, 12 Oct 2020 19:55:11 GMT'
        },
        'RetryAttempts': 0
    }
}

Job execution ID: a5ac1ba0-b480-46e3-a718-5fffa5006f1a

Response from batch_put_document:
{
    'FailedDocuments': [],
    'ResponseMetadata': {
        'RequestId': 'fcda5fed-c55c-490b-9867-b45a3eb6a780',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'fcda5fed-c55c-490b-9867-b45a3eb6a780',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '22',
            'date': 'Mon, 12 Oct 2020 19:55:12 GMT'
        },
        'RetryAttempts': 0
    }
}

Stop data source sync operation:
{
    'ResponseMetadata': {
        'RequestId': '249a382a-7170-49d1-855d-879b5a6f2954',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '249a382a-7170-49d1-855d-879b5a6f2954',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '0',
            'date': 'Mon, 12 Oct 2020 19:55:12 GMT'
        },
        'RetryAttempts': 0
    }
}

Allow for some time for the sync job to finish, because document ingestion could continue as an asynchronous process after the data source sync process has stopped. The status on the Amazon Kendra console should change from Syncing-indexing to Succeeded when all the documents have been ingested successfully. You can now confirm the count of the documents that were ingested successfully and the metrics of the operation on the Amazon Kendra console.
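If you prefer to check the job status from code rather than on the console, the following is a minimal sketch that polls the sync job history. It assumes the boto3 Kendra client's list_data_source_sync_jobs call and the status strings shown in the code; the wait_for_sync_job helper and its parameters are hypothetical names introduced for illustration, so verify the status values against the Amazon Kendra API reference before relying on them.

```python
import time


def wait_for_sync_job(kendra, index_id, data_source_id, execution_id,
                      poll_seconds=30, max_polls=40):
    """Poll the sync job history until the given execution is no longer in progress.

    `kendra` is expected to behave like boto3.client('kendra').
    The in-progress status names below are an assumption based on the Kendra API.
    """
    in_progress = {"SYNCING", "SYNCING_INDEXING", "STOPPING"}
    for _ in range(max_polls):
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        # Find the status of the execution we started, if it is listed yet
        status = next(
            (job["Status"] for job in history
             if job["ExecutionId"] == execution_id),
            None,
        )
        if status is not None and status not in in_progress:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Sync job did not finish within the polling window")
```

With the variables from the ingestion code, you could call wait_for_sync_job(kendra, index_id, data_source_id, job_execution_id) after the sync job has been stopped.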

Deleting documents from a custom data source

In this section, you explore how to remove documents from your index. You can use the same DataSourceSync job that you used for ingesting the documents. This process could be useful if you have a changelog of the documents you’re syncing with your Amazon Kendra index, and during your sync job you want to delete documents from your index and also ingest new documents. You can do this by starting the sync job, performing the BatchDeleteDocument operation, performing the BatchPutDocument operation, and stopping the sync job.
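The combined flow described above can be sketched as a single function. This is an illustration under assumptions, not the post's code: sync_changes is a hypothetical helper, and get_docs is assumed to be a callable with the same signature as the get_docs function used in the ingestion example.

```python
def sync_changes(kendra, index_id, data_source_id, delete_ids, get_docs):
    """Delete and ingest documents within one data source sync job.

    `kendra` is expected to behave like boto3.client('kendra').
    `get_docs(data_source_id, job_execution_id)` is assumed to return the
    list of documents to ingest, as in the ingestion example.
    """
    result = kendra.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )
    job_execution_id = result["ExecutionId"]
    try:
        if delete_ids:
            # Attribute the deletions to this sync job so they show up in its metrics
            kendra.batch_delete_document(
                IndexId=index_id,
                DocumentIdList=delete_ids,
                DataSourceSyncJobMetricTarget={
                    "DataSourceSyncJobId": job_execution_id,
                    "DataSourceId": data_source_id,
                },
            )
        docs = get_docs(data_source_id, job_execution_id)
        if docs:
            kendra.batch_put_document(IndexId=index_id, Documents=docs)
    finally:
        # Always stop the sync job, even if one of the batch calls fails
        kendra.stop_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    return job_execution_id
```

Keeping the stop call in a finally block mirrors the examples in this post: the sync job is stopped even when a batch operation raises an exception.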

For this post, we use a separate data source sync job to remove the documents with IDs 6, 7, and 8. See the following code:

import boto3

#Index ID
index_id = '<YOUR-INDEX-ID>'
#Datasource ID
data_source_id = '<YOUR-DATASOURCE-ID>'

kendra = boto3.client('kendra')

#Start data source sync job
result = kendra.start_data_source_sync_job(
    Id = data_source_id,
    IndexId = index_id
    )
print("Start data source sync operation: ")
print(result)

job_execution_id = result['ExecutionId']
print("Job execution ID: "+job_execution_id)
try:
    #Add the document IDs you would like to delete
    delete_docs = ["6", "7", "8"]
    #Start the batch delete operation
    result = kendra.batch_delete_document(
        IndexId = index_id,
        DocumentIdList = delete_docs,
        DataSourceSyncJobMetricTarget = {
            "DataSourceSyncJobId": job_execution_id,
            "DataSourceId": data_source_id
            }
        )
    print("Response from batch_delete_document:")
    print(result)

finally:
    #Stop the data source sync job
    result = kendra.stop_data_source_sync_job(
        Id = data_source_id,
        IndexId = index_id
    )
    print("Stop data source sync operation:")
    print(result)

When the process is complete, you see a message similar to the following:

Start data source sync operation:

{
    'ExecutionId': '6979977e-0d91-45e9-b69e-19b179cc3bdf',
    'ResponseMetadata': {
        'RequestId': '677c5ab8-b5e0-4b55-8520-6aa838b8696e',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '677c5ab8-b5e0-4b55-8520-6aa838b8696e',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '54',
            'date': 'Mon, 12 Oct 2020 20:25:42 GMT'
        },
        'RetryAttempts': 0
    }
}

Job execution ID: 6979977e-0d91-45e9-b69e-19b179cc3bdf

Response from batch_delete_document:

{
    'FailedDocuments': [],
    'ResponseMetadata': {
        'RequestId': 'e647bac8-becd-4e2f-a089-84255a5d715d',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'e647bac8-becd-4e2f-a089-84255a5d715d',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '22',
            'date': 'Mon, 12 Oct 2020 20:25:43 GMT'
        },
        'RetryAttempts': 0
    }
}

Stop data source sync operation:
{
    'ResponseMetadata': {
        'RequestId': '58626ede-d535-43dc-abf8-797a5637fc86',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '58626ede-d535-43dc-abf8-797a5637fc86',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '0',
            'date': 'Mon, 12 Oct 2020 20:25:43 GMT'
        },
        'RetryAttempts': 0
    }
}

On the Amazon Kendra console, you can see the operation details.

Running queries

In this section, we show results from queries using the documents you ingested into your index.

The following screenshot shows results for the query “what is deep learning?”

The following screenshot shows results for the query “how do I try amazon rekognition?”

The following screenshot shows results for the query “what is vga resolution?”

Conclusion

In this post, we demonstrated how you can use the custom data source feature in Amazon Kendra to ingest documents from a custom data source into an Amazon Kendra index. We used a sample web connector to scrape content from AWS FAQs and stored it in a local file system. Then we outlined the steps you can follow to ingest those scraped documents into your Kendra index. We also detailed how to use CloudWatch metrics to check the status of an ingestion job, and ran a few natural language search queries to get relevant results from the ingested content.

We hope this post helps you take advantage of the intelligent search capabilities of Amazon Kendra to find accurate answers from your enterprise content. For more information about Amazon Kendra, watch AWS re:Invent 2019 – Keynote with Andy Jassy on YouTube.

 


About the Authors

Tapodipta Ghosh is a Senior Architect. He leads the Content And Knowledge Engineering Machine Learning team that focuses on building models related to AWS Technical Content. He also helps our customers with AI/ML strategy and implementation using our AI Language services like Kendra.

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Explaining Amazon SageMaker Autopilot models with SHAP

Machine learning (ML) models have long been considered black boxes because predictions from these models are hard to interpret. However, recently, several frameworks aiming at explaining ML models were proposed. Model interpretation can be divided into local and global explanations. A local explanation considers a single sample and answers questions like “Why does the model predict that Customer A will stop using the product?” or “Why did the ML system refuse John Doe a loan?” Another interesting question is “What should John Doe change in order to get the loan approved?” In contrast, global explanations aim at explaining the model itself and answer questions like “Which features are important for prediction?” You can use local explanations to derive global explanations by averaging many samples. For further reading on interpretable ML, see the excellent book Interpretable Machine Learning by Christoph Molnar.

In this post, we demonstrate using the popular model interpretation framework SHAP for both local and global interpretation.

SHAP

SHAP is a game theoretic framework inspired by Shapley values that provides local explanations for any model. SHAP has gained popularity in recent years, probably due to its strong theoretical basis. The SHAP package contains several algorithms that, when given a sample and model, derive the SHAP value for each of the model’s input features. The SHAP value of a feature represents its contribution to the model’s prediction.

To explain models built by Amazon SageMaker Autopilot, we use SHAP’s KernelExplainer, which is a black box explainer. KernelExplainer is robust and can explain any model, so it can handle the complex feature processing of Amazon SageMaker Autopilot. KernelExplainer only requires that the model support an inference functionality that, when given a sample, returns the model’s prediction for that sample. The prediction is the predicted value for regression and the class probability for classification.

SHAP includes several other explainers, such as TreeExplainer and DeepExplainer, which are specific to decision forests and neural networks, respectively. These are not black box explainers and require knowledge of the model structure and trained parameters. TreeExplainer and DeepExplainer are limited and, as of this writing, can’t support any feature processing.

Creating a notebook instance

You can run the example code provided in this post. It’s recommended to run the code inside an Amazon SageMaker instance type of ml.m5.xlarge or larger to accelerate running time. To launch the notebook with the example code using Amazon SageMaker Studio, complete the following steps:

  1. Launch an Amazon SageMaker Studio instance.
  2. Open a terminal and clone the GitHub repo: git clone https://github.com/awslabs/amazon-sagemaker-examples.git
  3. Open the notebook autopilot/model-explainability/explaining_customer_churn_model.ipynb.
  4. Use kernel Python 3 (Data Science).

Setting up the required packages

In this post, we start with a model built by Amazon SageMaker Autopilot, which was already trained on a binary classification task. See the following code:

import boto3
import pandas as pd
import sagemaker
from sagemaker import AutoML
from datetime import datetime
import numpy as np
region = boto3.Session().region_name
session = sagemaker.Session()

For instructions on creating and training an Amazon SageMaker Autopilot model, see Customer Churn Prediction with Amazon SageMaker Autopilot.

Install SHAP with the following code:

!conda install -c conda-forge -y shap
import shap
from shap import KernelExplainer
from shap import sample
from scipy.special import expit

Initialize the plugin to make the plots interactive.
shap.initjs()

Creating an inference endpoint

Create an inference endpoint for the trained model built by Amazon SageMaker Autopilot. See the following code:

autopilot_job_name = '<your_automl_job_name_here>'
autopilot_job = AutoML.attach(autopilot_job_name, sagemaker_session=session)
ep_name = 'sagemaker-automl-' + datetime.now().strftime('%Y-%m-%d-%H-%M-%S')

For SHAP to work with a classification response, we need the probability scores. You can get them by providing a list of keys for the response content. The order of the keys dictates the order of the content in the response. This parameter is not needed for regression.

inference_response_keys = ['predicted_label', 'probability']

Create the inference endpoint:

autopilot_job.deploy(initial_instance_count=1, instance_type='ml.m5.2xlarge', inference_response_keys=inference_response_keys, endpoint_name=ep_name)

You can skip this step if an endpoint with the argument inference_response_keys set as ['predicted_label', 'probability'] was already created.

Wrapping the Amazon SageMaker Autopilot endpoint with an estimator class

For ease of use, we wrap the inference endpoint with a custom estimator class. Two inference functions are provided: predict, which returns the numeric prediction value to be used for regression, and predict_proba, which returns the class probabilities to be used for classification. See the following code:

from sagemaker.predictor import RealTimePredictor
from sagemaker.content_types import CONTENT_TYPE_CSV

class AutomlEstimator:
    def __init__(self, endpoint, sagemaker_session):
        self.predictor = RealTimePredictor(
            endpoint=endpoint,
            sagemaker_session=sagemaker_session,
            content_type=CONTENT_TYPE_CSV,
            accept=CONTENT_TYPE_CSV
        )
    
    def get_automl_response(self, x):
        # Serialize the input (NumPy array or pandas DataFrame) as headerless CSV
        if isinstance(x, np.ndarray):
            payload = ""
            for row in x:
                payload = payload + ','.join(map(str, row)) + '\n'
        else:
            payload = x.to_csv(sep=',', header=False, index=False)
        return self.predictor.predict(payload).decode('utf-8')

    # Prediction function for regression
    def predict(self, x):
        response = self.get_automl_response(x)
        # Return the first column from the response array containing the numeric prediction value (or label in case of classification)
        response = np.array([row.split(',')[0] for row in response.split('\n')[:-1]])
        return response

    # Prediction function for classification
    def predict_proba(self, x):
        # Return the probability score (second column) from the Autopilot endpoint response
        response = self.get_automl_response(x)
        response = np.array([row.split(',')[1] for row in response.split('\n')[:-1]])
        return response.astype(float)

Create an instance of AutomlEstimator:

automl_estimator = AutomlEstimator(endpoint=ep_name, sagemaker_session=session)
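To make the wire format concrete, the following sketch reproduces the CSV serialization and response parsing that the wrapper performs, without calling an endpoint. The helper names and the sample response text are hypothetical; the response layout (label in the first column, probability in the second) follows from the inference_response_keys order used earlier.

```python
def rows_to_csv(rows):
    """Serialize rows as headerless CSV, one line per sample."""
    return "".join(",".join(map(str, row)) + "\n" for row in rows)


def parse_probabilities(response_text):
    """Extract the second column (probability) from a CSV response."""
    return [
        float(line.split(",")[1])
        for line in response_text.splitlines()
        if line
    ]


# Hypothetical payload and response, for illustration only
payload = rows_to_csv([[1, "a", 2.5], [3, "b", 4.0]])
probabilities = parse_probabilities("0,0.12\n1,0.87\n")
```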

Data

In this notebook, we use the same dataset as used in the Customer Churn Prediction with Amazon SageMaker Autopilot GitHub repo. Follow the notebook in the GitHub repo to download the dataset if it was not previously downloaded.

Background data

KernelExplainer requires a sample of the data to be used as background data. KernelExplainer uses this data to simulate a feature being missing by replacing the feature value with a random value from the background. We use shap.sample to sample 50 rows from the dataset to be used as background data. Using more samples as background data produces more accurate results, but runtime increases. The clustering algorithms provided in SHAP only support numeric data. You can use a vector of zeros as background data to produce reasonable results.

Choosing background data is challenging. For more information, see AI Explanations Whitepaper and Runtime considerations.

churn_data = pd.read_csv('../Data sets/churn.txt')
data_without_target = churn_data.drop(columns=['Churn?'])
background_data = sample(data_without_target, 50)
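The idea of simulating a missing feature with background values can be illustrated with a few lines of plain Python. This is a conceptual sketch only, not SHAP's internal implementation; the simulate_missing helper and the sample values are hypothetical.

```python
def simulate_missing(sample, background, keep_mask):
    """Build one synthetic row per background row.

    Features where keep_mask is True keep the sample's value; the other
    features are treated as missing and take the background row's value.
    """
    return [
        [s if keep else b for s, b, keep in zip(sample, bg_row, keep_mask)]
        for bg_row in background
    ]


# Hypothetical three-feature sample and two background rows
sample_row = [25, "yes", 110.0]
background_rows = [[0, "no", 90.0], [12, "no", 150.0]]
keep_mask = [True, False, False]  # only the first feature is "present"

synthetic_rows = simulate_missing(sample_row, background_rows, keep_mask)
```

Averaging the model's predictions over such synthetic rows approximates the prediction with the masked features missing, which is why a larger background set gives more accurate results at a higher inference cost.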

Setting up KernelExplainer

Next, we create the KernelExplainer. Because it’s a black box explainer, KernelExplainer only requires a handle to the predict (or predict_proba) function and doesn’t require any other information about the model. For classification, it’s recommended to derive feature importance scores in the log-odds space because additivity is a more natural assumption there, so we use Logit. For regression, you should use Identity. See the following code:

problem_type = autopilot_job.describe_auto_ml_job(job_name=autopilot_job_name)['ResolvedAttributes']['ProblemType']
link = "identity" if problem_type == 'Regression' else "logit"

The handle to predict_proba is passed to KernelExplainer since KernelSHAP requires the class probability:

explainer = KernelExplainer(automl_estimator.predict_proba, background_data, link=link)

By analyzing the background data, KernelExplainer provides us with explainer.expected_value, which is the model prediction with all features missing. Considering a customer for which we have no data at all (all features are missing), this should theoretically be the model prediction. See the following code:

Because expected_value is given in the log-odds space, we convert it back to a probability using expit, which is the inverse of logit:

print('expected value =', expit(explainer.expected_value))
expected value = 0.21051377184689046
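The relationship between the two spaces can be checked with a small standalone example. The sketch below defines logit and expit with the standard library and shows how SHAP values add up in the log-odds space before being mapped back to a probability; the three SHAP values are hypothetical numbers chosen for illustration.

```python
import math


def logit(p):
    """Map a probability to the log-odds space."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Expected value reported above, converted to log-odds
base_probability = 0.21051377184689046
base_value = logit(base_probability)

# Hypothetical SHAP values for one sample (log-odds space)
hypothetical_shap_values = [-0.4, 0.1, 0.25]

# Additivity holds in log-odds space; convert to a probability at the end
prediction = expit(base_value + sum(hypothetical_shap_values))
```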

Local explanation with KernelExplainer

We use KernelExplainer to explain the prediction of a single sample, the first sample in the dataset. See the following code:

# Get the first sample
x = data_without_target.iloc[0:1]

ManagedEndpoint automatically deletes the endpoint after calculating the SHAP values. To disable automatic deletion, use ManagedEndpoint(ep_name, auto_delete=False).

from managed_endpoint import ManagedEndpoint
with ManagedEndpoint(ep_name) as mep:
    shap_values = explainer.shap_values(x, nsamples='auto', l1_reg='aic')

The SHAP package includes many visualization tools. The following force_plot code provides a visualization for the SHAP values of a single sample. Because shap_values are provided in the log-odds space, we convert them back to the probability space by passing the logit link to force_plot:

shap.force_plot(explainer.expected_value, shap_values, x, link=link)

The following visualization is the result.

From this plot, we learn that the most influential feature is VMail Message, which pushes the probability down by about 7%. VMail Message = 25 makes the probability 7% lower in comparison to the notion of that feature being missing. SHAP values don’t provide the information of how increasing or decreasing VMail Message affects prediction.

In many use cases, we’re interested only in the most influential features. By setting l1_reg='num_features(5)', SHAP provides non-zero scores for only the most influential five features:

with ManagedEndpoint(ep_name) as mep:
    shap_values = explainer.shap_values(x, nsamples='auto', l1_reg='num_features(5)')
shap.force_plot(explainer.expected_value, shap_values, x, link=link)

The following visualization is the result.

KernelExplainer computation cost

KernelExplainer computation cost is dominated by the inference calls. To estimate SHAP values for a single sample, KernelExplainer calls the inference function twice: first with the sample unaugmented, and then with many randomly augmented instances of the sample. The number of augmented instances in our use case is 50 (number of samples in the background data) * 2088 (nsamples = 'auto') = 104,400. So, for this use case, the cost of running KernelExplainer for a single sample is roughly the cost of 104,400 inference calls.
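The arithmetic above can be reproduced as follows. The formula for nsamples='auto' (2 * number of features + 2048) and the feature count of 20 are assumptions that are consistent with the value 2088 quoted in the text; check them against your SHAP version and dataset.

```python
def kernel_shap_auto_nsamples(num_features):
    """nsamples chosen by nsamples='auto' (assumed: 2 * M + 2048)."""
    return 2 * num_features + 2048


num_features = 20        # assumption: feature columns after dropping the target
background_rows = 50     # rows sampled as background data

nsamples = kernel_shap_auto_nsamples(num_features)
augmented_instances = background_rows * nsamples

print(f"nsamples = {nsamples}, augmented instances = {augmented_instances:,}")
```

Because the cost scales linearly with both factors, halving the background set to 25 rows would halve the number of augmented instances.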

Global explanation with KernelExplainer

Next, we use KernelExplainer to provide insight about the model as a whole. We do this by running KernelExplainer locally on 50 samples and aggregating the results:

X = sample(data_without_target, 50)
with ManagedEndpoint(ep_name) as mep:
    shap_values = explainer.shap_values(X, nsamples='auto', l1_reg='aic')

You can use force_plot to visualize SHAP values for many samples simultaneously; force_plot then rotates the plot of each sample by 90 degrees and stacks the plots horizontally. See the following code:

shap.force_plot(explainer.expected_value, shap_values, X, link=link)

The resulting plot is interactive (in the notebook) and can be manually analyzed.

summary_plot is another visualization tool displaying the mean absolute value of the SHAP values for each feature using a bar plot. Currently, summary_plot doesn’t support link functions, so the SHAP values are presented in the log-odds space (and not the probability space). See the following code:

shap.summary_plot(shap_values, X, plot_type="bar")

The following graph shows the results.
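The aggregation behind the bar plot is simple enough to sketch by hand: for each feature, take the mean of the absolute SHAP values across samples and rank the features by that score. The helper name and the small SHAP matrix below are hypothetical.

```python
def mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across samples."""
    n_samples = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Most influential feature first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: two samples, two features (log-odds space)
example_shap = [[-0.25, 0.5], [0.25, -1.5]]
ranking = mean_abs_shap(example_shap, ["Day Mins", "VMail Message"])
```

Taking absolute values before averaging matters: a feature that pushes some predictions up and others down would otherwise cancel out and look unimportant.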

Conclusion

In this post, we demonstrated how to use KernelSHAP to explain models created by Amazon SageMaker Autopilot, both locally and globally. KernelExplainer is a robust black box explainer that requires only that the model support an inference functionality that, when given a sample, returns the model’s prediction for that sample. This inference functionality was provided by wrapping the Amazon SageMaker Autopilot inference endpoint with a custom estimator class.

For more information about Amazon SageMaker Autopilot, see Amazon SageMaker Autopilot.

To explore related features of Amazon SageMaker, see the following:

 


About the Authors

Yotam Elor is a Senior Applied Scientist at Amazon SageMaker. He works on SageMaker Autopilot – AWS’s auto ML solution.

Somnath Sarkar is a Software Engineer in the AWS SageMaker Autopilot team. He enjoys machine learning in general, with a focus on scalable and distributed systems.

Creating an intelligent ticket routing solution using Slack, Amazon AppFlow, and Amazon Comprehend

Support tickets, customer feedback forms, user surveys, product feedback, and forum posts are some of the documents that businesses collect from their customers and employees. The applications used to collect these case documents typically include incident management systems, social media channels, customer forums, and email. Routing these cases quickly and accurately to support groups best suited to handle them speeds up resolution times and increases customer satisfaction.

In traditional incident management systems internal to a business, assigning the case to a support group is either done by the employee during case creation or a centralized support group routing these tickets to specialized groups after case creation. Both of these scenarios have drawbacks. In the first scenario, the employee opening the case should be aware of the various support groups and their function. The decision to pick the right support group increases the cognitive overload on the employee opening the case. There is a chance of human error in both scenarios, which results in re-routing cases and thereby increasing the resolution times. These repetitive tasks result in a decrease of employee productivity.

Enterprises use business communication platforms like Slack to facilitate conversations between employees. This post provides a solution that simplifies reporting incidents through Slack and routes them to the right support groups. You can use this solution to set up a Slack channel in which employees can report many types of support issues. Individual support groups have their own private Slack channels.

Amazon AppFlow provides a no-code solution to transfer data from Slack channels into AWS securely. You can use Amazon Comprehend custom classification to classify the case documents into support groups automatically. Upon classification, you post the message in the respective private support channel by using Slack Application Programming Interface (API) integration. Depending on the incident management system, you can automate the ticket creation process using APIs.

When you combine Amazon AppFlow with Amazon Comprehend, you can implement an accurate, intelligent routing solution that eliminates the need to create and assign tickets to support groups manually. You can increase productivity by focusing on higher-priority tasks.

Solution overview

For our use case, we use the fictitious company AnyCorp Software Inc, whose programmers use a primary Slack channel to ask technical questions about four different topics. The programmer gets a reply with a ticket number that they can refer to for future communication. The question is intelligently routed to one of five private channels: one for each topic-specific support group and one for all other issues. The following diagram illustrates this architecture.

The solution to building this intelligent ticket routing solution comprises four main components:

  • Communication platform – A Slack application with a primary support channel for employees to report issues, four private channels (one for each support group), and one private channel for all other issues.
  • Data transfer – A flow in Amazon AppFlow securely transfers data from the primary support channel in Slack to an Amazon Simple Storage Service (Amazon S3) bucket, scheduled to run every 1 minute.
  • Document classification – A multi-class custom document classifier in Amazon Comprehend uses ground truth data comprising issue descriptions and their corresponding support group labels. You also create an endpoint for this custom classification model.
  • Routing controller – An AWS Lambda function is triggered when Amazon AppFlow puts new incidents into the S3 bucket. For every incident received, the function calls the Amazon Comprehend custom classification model endpoint, which returns a label for the support group best suited to address the incident. After receiving the label from Amazon Comprehend, the function uses the Slack API to reply to the original thread in the primary support channel. The reply contains a ticket number and the name of the support group that will address the issue. Simultaneously, the function posts the issue to the private channel associated with that support group.
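The routing decision made by the controller can be sketched as a pure function. This is a hedged illustration, not the code of the deployed Lambda function: pick_channel is a hypothetical helper, the input mirrors the Classes list returned by Amazon Comprehend for a multi-class classifier, and falling back to the OTHERS category when the top score is below the threshold is an assumption about how the score threshold is applied.

```python
def pick_channel(classes, category_channel_map, threshold=0.75,
                 fallback_label="OTHERS"):
    """Choose the support channel for one classified issue.

    `classes` is a list of {"Name": ..., "Score": ...} dictionaries.
    Returns a (label, channel) tuple.
    """
    best = max(classes, key=lambda c: c["Score"], default=None)
    if best is None or best["Score"] < threshold:
        # Not confident enough: route to the catch-all channel (assumption)
        label = fallback_label
    else:
        label = best["Name"]
    channel = category_channel_map.get(
        label, category_channel_map[fallback_label]
    )
    return label, channel


# Example mapping between categories and the private Slack channels
channel_map = {
    "GROOVY": "groovy-issues",
    "HADOOP": "hadoop-issues",
    "MAVEN": "maven-issues",
    "LOG4J2": "log4j2-issues",
    "OTHERS": "other-issues",
}
```

For example, pick_channel([{"Name": "MAVEN", "Score": 0.93}], channel_map) selects the maven-issues channel, whereas a top score of 0.40 would route the issue to other-issues.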

Dataset

For Amazon Comprehend to classify a document into one of the named categories, it needs to train on a dataset with known inputs and outputs. For our use case, we use the Jira Social Repository dataset hosted on GitHub. The dataset comprises issues extracted from the Jira Issue Tracking System of four popular open-source ecosystems: the Apache Software Foundation, Spring, JBoss, and CodeHaus communities. We used the Apache Software Foundation issues, filtered four categories (GROOVY, HADOOP, MAVEN, and LOG4J2) and created a CSV file for training purposes.
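As a rough sketch of how such a training file can be produced, the following code writes issues as headerless CSV rows with the label in the first column and the issue text in the second, which is the layout we assume for multi-class training data. The to_training_csv helper and the two sample issues are hypothetical and are not taken from the dataset.

```python
import csv
import io


def to_training_csv(issues):
    """Serialize (label, text) pairs as headerless CSV, one document per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, text in issues:
        # Collapse newlines and repeated whitespace so each document stays on one line
        writer.writerow([label, " ".join(text.split())])
    return buffer.getvalue()


# Hypothetical issues, for illustration only
training_csv = to_training_csv([
    ("MAVEN", "Build fails  with missing plugin"),
    ("HADOOP", "NameNode fails,\n after restart"),
])
```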

  1. Download the data.zip
  2. On the Amazon S3 console, choose Create bucket.
  3. For Bucket name, enter [YOUR_COMPANY]-comprehend-issue-classifier.
  4. Choose Create.
  5. Unzip the train-data.zip file and upload all files in the folder into the [YOUR_COMPANY]-comprehend-issue-classifier bucket.
  6. Create another bucket named [YOUR_COMPANY]-comprehend-issue-classifier-output.

We store the output of the custom classification model training in this bucket.

Your [YOUR_COMPANY]-comprehend-issue-classifier bucket should look like the following screenshot.

Training Data for Comprehend

Deploying Amazon Comprehend

To deploy Amazon Comprehend, complete the following steps:

  1. On the Amazon Comprehend console, under Customization, choose Custom classification.
  2. Choose Train classifier.
  3. For Name, enter comprehend-issue-classifier.
  4. For Classifier mode, select Using Multi-class mode.

Because our dataset has multiple classes and only one class per line, we use the multi-class mode.

  5. For S3 location, enter s3://[YOUR_COMPANY]-comprehend-issue-classifier.
  6. For Output data, choose Browse S3.
  7. Find the bucket you created in the previous step and choose the s3://[YOUR_COMPANY]-comprehend-issue-classifier-output folder.
  8. For IAM role, select Create an IAM role.
  9. For Permissions to access, choose Input and output (if specified) S3 bucket.
  10. For Name suffix, enter comprehend-issue-classifier.

Custom Classification Model Training Job Configuration

  11. Choose Train classifier.

The process can take up to 30 minutes to complete.

  12. When the training is complete and the status shows as Trained, choose comprehend-issue-classifier.
  13. In the Endpoints section, choose Create endpoint.
  14. For Endpoint name, enter comprehend-issue-classifier-endpoint.
  15. For Inference units, enter 1.
  16. Choose Create endpoint.
  17. When the endpoint is created, copy its ARN from the Endpoint details section to use later in the Lambda function.

Creating a Slack app

In this section, we create a Slack app to connect with Amazon AppFlow for our intelligent ticket routing solution. For more information, see Creating, managing, and building apps.

  1. Sign in to your Slack workspace where you’d like to create the ticket routing solution, or create a new workspace.
  2. Create a Slack app named TicketResolver.
  3. After you create the app, in the navigation pane, under Features, choose OAuth & Permissions.
  4. For Redirect URLs, enter https://console.aws.amazon.com/appflow/oauth.
  5. For User Token Scopes, add the following:
    • channels:history
    • channels:read
    • chat:write
    • groups:history
    • groups:read
    • im:history
    • im:read
    • mpim:history
    • mpim:read
    • users:read
  6. In the navigation pane, under Settings, choose Basic Information.
  7. Expand Install your app to your workspace.
  8. Choose Install App to Workspace.
  9. Follow the instructions to install the app to your workspace.
  10. Create a Slack channel named testing-slack-integration. This channel is your primary channel to report issues.
  11. Create an additional five channels: groovy-issues, hadoop-issues, maven-issues, log4j2-issues, and other-issues. Mark them all as private. These will be used by the support groups designated to handle the specific issues.
  12. In your channel, choose Connect an app.
  13. Connect the TicketResolver app you created.

Deploying the AWS CloudFormation template

You can deploy this architecture using the provided AWS CloudFormation template in us-east-1.

  1. Choose Launch Stack:

  2. Provide a stack name.
  3. Provide the following parameters:
    • CategoryChannelMap, which is a mapping between Amazon Comprehend categories and your Slack channels in string format; for example, '{ "GROOVY":"groovy-issues", "HADOOP":"hadoop-issues", "MAVEN":"maven-issues", "LOG4J2":"log4j2-issues", "OTHERS":"other-issues" }'
    • ComprehendClassificationScoreThreshold, which can be left at the default value of 0.75
    • ComprehendEndpointArn, which is the ARN of the endpoint you created earlier; it looks like arn:aws:comprehend:{YOUR_REGION}:{YOUR_ACCOUNT_ID}:document-classifier-endpoint/comprehend-issue-classifier-endpoint
    • Region, which is where your AWS resources are provisioned. The default is us-east-1
    • SlackOAuthAccessToken, which is the OAuth access token on your Slack API page in the OAuth Tokens & Redirect URLs section
    • SlackClientID, which you can find as Client ID in the App Credentials section of your Slack app home page
    • SlackClientSecret, which you can find as Client Secret in the App Credentials section of your Slack app home page
    • SlackWorkspaceInstanceURL, which you can find by choosing the down arrow next to the workspace name
    • SlackChannelID, which is the channel ID for the testing-slack-integration channel
    • LambdaCodeBucket, which is the name of the bucket where your Lambda code is stored. The default is intelligent-routing-lambda-code, which is the public bucket containing the Lambda deployment package. If your AWS account is in us-east-1, no change is needed. For other Regions, download the Lambda deployment package from here, create an S3 bucket in your AWS account, upload the package, and change the parameter value to your bucket name.
    • LambdaCodeKey, which is the name of the zip file containing your Lambda code. The default is lambda.zip, which is the name of the deployment package in the public bucket. Change this to your file name if you had to download the Lambda deployment package and upload it to your own bucket for the LambdaCodeBucket parameter.
  4. Choose Next.
  5. In the Capabilities and transforms section, select all three check boxes to acknowledge that AWS CloudFormation may create AWS Identity and Access Management (IAM) resources and expand the template.
  6. Choose Create stack.

This process might take 15 minutes or more to complete, and creates the following resources:

  • IAM roles for the Lambda function to use
  • A Lambda function to integrate Slack with Amazon Comprehend to categorize issues typed by Slack users
  • An Amazon AppFlow Slack connection for the flow to use
  • An Amazon AppFlow flow to securely connect the Slack app with AWS services

Activating the Amazon AppFlow flow

The CloudFormation stack created the flow; you activate it on the Amazon AppFlow console.

  1. On the Amazon AppFlow console, choose View flows.
  2. Choose Activate flow.

Your SlackAppFlow is now active and runs every 1 minute to gather incremental data from the Slack channel testing-slack-integration.

Testing your integration

To test the end-to-end integration, type some issues related to your support topics in the testing-slack-integration channel and wait about 1 minute for the Amazon AppFlow flow to transfer the data to the S3 bucket. This triggers the Lambda function, which calls Amazon Comprehend to get a category and then responds both in the testing-slack-integration channel and in the channel for that category, with a randomly generated ticket number.

For example, in the following screenshot, we enter a Maven-related issue in the testing-slack-integration channel.

You see a reply from the TicketResolver app added to your original response in the testing-slack-integration channel.

You also see a Slack message posted in the corresponding support channel.

Cleaning up

To avoid incurring any charges in the future, delete all the resources you created as part of this post:

  • Amazon Comprehend endpoint comprehend-issue-classifier-endpoint
  • Amazon Comprehend classifier comprehend-issue-classifier
  • Slack app TicketResolver
  • Slack channels testing-slack-integration, groovy-issues, hadoop-issues, maven-issues, log4j2-issues, and other-issues
  • S3 bucket [YOUR_COMPANY]-comprehend-issue-classifier-output
  • S3 bucket [YOUR_COMPANY]-comprehend-issue-classifier
  • CloudFormation stack (this removes all the resources the CloudFormation template created)

Conclusion

In this post, you learned how to use Amazon Comprehend, Amazon AppFlow, and Slack to create an intelligent issue-routing solution. For more information about securely transferring data from software-as-a-service (SaaS) applications like Salesforce, Marketo, Slack, and ServiceNow to AWS, see Get Started with Amazon AppFlow. For more information about Amazon Comprehend custom classification models, see Custom Classification. You can also discover other Amazon Comprehend features and get inspiration from other AWS blog posts about using Amazon Comprehend beyond classification.

About the Author

Shanthan Kesharaju is a Senior Architect who helps our customers with AI/ML strategy and architecture. He is an award-winning product manager and has built top trending Alexa skills. Shanthan has an MBA in Marketing from Duke University and an MS in Management Information Systems from Oklahoma State University.

So Young Yoon is a Conversation A.I. Architect at AWS Professional Services, where she works with customers across multiple industries to develop specialized conversational assistants that help these customers provide their users with faster and more accurate information through natural language. Soyoung has an M.S. and a B.S. in Electrical and Computer Engineering from Carnegie Mellon University.

NVIDIA A100 Launches on AWS, Marking Dawn of Next Decade in Accelerated Cloud Computing

Amazon Web Services’ first GPU instance debuted 10 years ago, with the NVIDIA M2050. At that time, CUDA-based applications were focused primarily on accelerating scientific simulations, with the rise of AI and deep learning still a ways off.

Since then, AWS has added to its stable of cloud GPU instances, which has included the K80 (p2), K520 (g2), M60 (g3), V100 (p3/p3dn) and T4 (g4).

With its new P4d instance generally available today, AWS is paving the way for another bold decade of accelerated computing powered with the latest NVIDIA A100 Tensor Core GPU.

The P4d instance delivers AWS’s highest performance, most cost-effective GPU-based platform for machine learning training and high performance computing applications. The instances reduce the time to train machine learning models by up to 3x with FP16 and up to 6x with TF32 compared to the default FP32 precision.

They also provide exceptional inference performance. NVIDIA A100 GPUs just last month swept the MLPerf Inference benchmarks — providing up to 237x faster performance than CPUs.

Each P4d instance features eight NVIDIA A100 GPUs and, with AWS UltraClusters, customers can get on-demand and scalable access to over 4,000 GPUs at a time using AWS’s Elastic Fabric Adapter (EFA) and scalable, high-performance storage with Amazon FSx. P4d offers 400Gbps networking and uses NVIDIA technologies such as NVLink, NVSwitch, NCCL and GPUDirect RDMA to further accelerate deep learning training workloads. NVIDIA GPUDirect RDMA on EFA ensures low-latency networking by passing data from GPU to GPU between servers without having to pass through the CPU and system memory.

In addition, the P4d instance is supported in many AWS services, including Amazon Elastic Container Service, Amazon Elastic Kubernetes Service, AWS ParallelCluster and Amazon SageMaker. P4d can also leverage all the optimized, containerized software available from NGC, including HPC applications, AI frameworks, pre-trained models, Helm charts and inference software like TensorRT and Triton Inference Server.

P4d instances are now available in US East and West, and coming to additional regions soon. The instances can be purchased as On-Demand, with Savings Plans, with Reserved Instances, or as Spot Instances.

The first decade of GPU cloud computing has brought over 100 exaflops of AI compute to the market. With the arrival of the Amazon EC2 P4d instance powered by NVIDIA A100 GPUs, the next decade of GPU cloud computing is off to a great start.

NVIDIA and AWS are making it possible for applications to continue pushing the boundaries of AI across a wide array of applications. We can’t wait to see what customers will do with it.

Visit AWS and get started with P4d instances today.

The post NVIDIA A100 Launches on AWS, Marking Dawn of Next Decade in Accelerated Cloud Computing appeared first on The Official NVIDIA Blog.

Real-time data labeling pipeline for ML workflows using Amazon SageMaker Ground Truth

High-quality machine learning (ML) models depend on accurately labeled, high-quality training, validation, and test data. As ML and deep learning models are increasingly integrated into production environments, it’s becoming more important than ever to have customizable, real-time data labeling pipelines that can continuously receive and process unlabeled data.

For example, you may want to create a consumer-facing application that regularly collects and sends new data objects to a data labeling pipeline, which produces labels and builds a dataset for model training or retraining. This pipeline creates a positive feedback loop that leads to more accurate, sophisticated models.

Amazon SageMaker Ground Truth streaming labeling jobs provide infrastructure and resources to create a continuously running labeling job that receives new data objects on demand and sends them to human workers to be labeled. You can chain multiple streaming labeling jobs together to create more intricate and refined data labeling pipelines.

Use this blog post to learn how to set up and customize Ground Truth streaming labeling jobs.

Walkthrough overview

In addition to discussing the benefits of using a streaming labeling job, such as eliminating delays, enforcing idempotency, and customizing input data sources, this post features two Jupyter notebooks that you can use to set up streaming labeling jobs. You can use these notebooks or follow the console instructions in this post to create a streaming labeling job using a supported, language-specific AWS Software Development Kit (SDK) of your choice.

The first notebook shows you how to create Ground Truth streaming labeling jobs. This notebook supports built-in and custom task types, which allow you to quickly create data labeling pipelines for various data types such as image, text, video, video frame, 3D point cloud, and more. This walkthrough demonstrates how to use Amazon Simple Notification Service (Amazon SNS) to send secure, real-time messages to a streaming labeling job to feed new data objects to human workers for labeling. You learn how you can set up notifications to receive the output data from that labeling task in real time, as soon as workers finish labeling a data object.

When you create a streaming labeling job, you can route the output data of that job to another streaming labeling job to create more complex data labeling pipelines and for data label verification and adjustment. This is referred to as chaining labeling jobs. You can use the second notebook with this post to learn how to chain two streaming labeling jobs together.

You can run both notebooks in default mode, which requires little to no input. Default mode creates image object detection (bounding box) labeling jobs and demonstrates how to send data objects to these labeling jobs. If you have your own data objects that you want to use, you can turn off default mode.

To get started, complete the Prerequisites and Launching a notebook instance and setting up the demo notebook sections in this post to gather the resources you need to complete this tutorial and, optionally, set up the Jupyter notebooks ground_truth_create_streaming_labeling_job.ipynb and ground_truth_create_chained_streaming_labeling_job.ipynb in an Amazon SageMaker notebook instance.

The following diagram illustrates the solution architecture.

Advantages of streaming labeling jobs

Streaming labeling jobs offer the following advantages.

Real-time input channel

You can feed objects in real time and continuously to a labeling job. Amazon SNS allows you to configure topics to feed objects in real time to a running labeling job.

Long-running workflows

You can launch labeling jobs that can run for a long time if they’re actively being fed objects. Streaming jobs are designed to be long-running workflows that keep running until you choose to stop them.

Ground Truth will stop the job if it is idle for a long time. A job is defined as idle if Ground Truth doesn’t detect any objects waiting to be labeled in the system over a certain number of days. For example, if Ground Truth doesn’t receive new data objects from the SNS input topic and all the objects fed to the system are already labeled, a timer for idle time starts. If the idle timer hits a certain number, Ground Truth stops the labeling job.

In short, if objects are actively flowing through the system at a regular cadence, you can achieve a long-running workflow. For more information about configuring idle timers, see Stop a Streaming Labeling Job.

Eliminate delays

With streaming labeling jobs, objects can flow through your data labeling pipeline faster. Streaming jobs work in a sliding window manner, where Ground Truth keeps sending objects for labeling as long as slots are available. The number of slots is set by the parameter MaxConcurrentTaskCount, which defines the maximum number of objects that can be out for labeling at one time. When MaxConcurrentTaskCount is reached, additional objects wait in an Amazon Simple Queue Service (Amazon SQS) queue, where you can view the number of data objects queued.

For example, if MaxConcurrentTaskCount is 10, and 25 objects are sent via the input SNS topic, Ground Truth sends a maximum of 10 objects to the workers at a time and the remaining 15 objects wait in the SQS queue. If workers label and submit 2 of the 10 objects that were sent, only 8 slots are filled, and 2 more objects are sent to workers from the remaining 15. This way, workers have a constant flow of objects coming in from your inputs, up to a maximum of 10 objects. There aren’t any delays resulting from batching objects. As workers work on objects, new objects are pumped in constantly and you can achieve data labeling with greater speed.
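The slot arithmetic in the preceding example can be reproduced with a small simulation. This is only a toy model of the behavior described above, not Ground Truth code; the function name and the fixed number of completions per step are assumptions made for illustration.

```python
from collections import deque


def simulate_sliding_window(num_objects, max_concurrent, completions_per_step):
    """Toy model of the slot behavior described above (not Ground Truth code)."""
    queue = deque(range(num_objects))  # objects waiting in the queue
    in_flight = 0                      # objects currently out with workers
    history = []

    while queue or in_flight:
        # Fill every free slot from the front of the queue.
        while queue and in_flight < max_concurrent:
            queue.popleft()
            in_flight += 1
        history.append((in_flight, len(queue)))
        # Workers submit some objects, which frees the same number of slots.
        in_flight -= min(completions_per_step, in_flight)

    return history


history = simulate_sliding_window(num_objects=25, max_concurrent=10, completions_per_step=2)
print(history[0])  # (10, 15): 10 objects with workers, 15 queued
print(history[1])  # (10, 13): 2 submitted, 2 more pulled from the queue
```

Each tuple is (objects with workers, objects still queued). The number of objects with workers never exceeds the slot count, and the queue drains only as fast as workers submit.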

Rate limiting

You can limit and control how and when you feed data to workers. When you feed objects to your input SNS topic, they’re collected in an SQS queue in your account, named GroundTruth-<labeling-job-name>. If more objects are sent to the labeling job than the MaxConcurrentTaskCount, they remain in the SQS queue. Otherwise, they are sent to workers to be labeled. Any object in the SQS queue is available for a maximum of 14 days.

For example, if MaxConcurrentTaskCount is 1,000, and 2,500 objects are sent to a streaming labeling job via an input SNS topic, Ground Truth sends a maximum of 1,000 objects to the workers at a time, and initially, 1,500 remain in the SQS queue. The speed of the workers determines how quickly the 1,500 objects in the queue are sent for labeling. If these objects remain in the queue for longer than expected, this serves as an indicator that you have sent more objects than can be worked on by workers in a given timeframe. If the data objects take longer than expected to label, you can adjust input to feed objects to Amazon SNS at a slower pace. You can also change the value of MaxConcurrentTaskCount to suit the pace of the workers.

To monitor the speed and quantity of data objects being fed into the SQS queue associated with a streaming labeling job, you can set up alarms for the queue with Amazon CloudWatch. For more information, see Available CloudWatch metrics for Amazon SQS. For example, you can set up an alarm on the ApproximateAgeOfOldestMessage metric to see how close your oldest data object is to the 14-day limit. When this alarm is triggered, you can take appropriate actions, like resending the object to the input SNS topic or notifying workers that tasks will expire if not completed within a given timeframe.
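As a minimal sketch of such an alarm, the following code builds the arguments for the CloudWatch PutMetricAlarm operation. The alarm name, the 13-day threshold, the 5-minute period, and the notification topic ARN are assumptions chosen for illustration; the queue name follows the GroundTruth-<labeling-job-name> pattern described earlier, lowercased as in the notebook code.

```python
def build_queue_age_alarm(labeling_job_name, alarm_topic_arn, threshold_days=13):
    """Build put_metric_alarm arguments for the labeling job's SQS queue."""
    return {
        "AlarmName": labeling_job_name + "-oldest-message-age",  # hypothetical name
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        "Dimensions": [
            {"Name": "QueueName", "Value": "GroundTruth-" + labeling_job_name.lower()}
        ],
        "Statistic": "Maximum",
        "Period": 300,                                # evaluate every 5 minutes
        "EvaluationPeriods": 1,
        "Threshold": threshold_days * 24 * 60 * 60,   # the metric is in seconds
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [alarm_topic_arn],            # assumed notification topic
    }


def create_queue_age_alarm(labeling_job_name, alarm_topic_arn):
    """Create the alarm; requires AWS credentials and the boto3 package."""
    import boto3  # imported here so the builder above works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_queue_age_alarm(labeling_job_name, alarm_topic_arn))
```

Setting the threshold one day short of the 14-day limit leaves time to resend objects or notify workers before messages expire.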

Output notifications

A new SNS channel is added as an output channel for your labeling job. When a worker completes a labeling job task from a Ground Truth streaming labeling job, Ground Truth uses your output topic to publish output data to one or more endpoints that you specify. To receive notifications when a worker finishes a labeling task, you must subscribe an endpoint to your SNS output topic. For example, you can subscribe an email address, an AWS Lambda function, or an SQS queue to the SNS output topic used for your labeling job, and any object labeled through Ground Truth appears in real time after labeling.

In addition to the SNS output topic, you can also use the frequent Amazon Simple Storage Service (Amazon S3) output file updates in the Amazon S3 output path. All labels are added to an output manifest file in Amazon S3. You can reference this file if, for example, the real-time output notifications were missed. If the S3 bucket is versioned, you can view and access different versions of the output manifest file.
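If you subscribe a Lambda function to the output topic, a handler along the following lines could collect the notifications. This is a sketch under assumptions: it relies on the standard event shape that Amazon SNS passes to Lambda (Records, then Sns, then Message), and it assumes the message body is JSON, falling back to the raw string if it is not. What you do with each record is up to you.

```python
import json


def lambda_handler(event, context):
    """Collect output data delivered by the SNS output topic."""
    received = []
    for record in event.get("Records", []):
        raw = record["Sns"]["Message"]  # SNS delivers the payload as a string
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            payload = raw               # keep non-JSON payloads unchanged
        received.append(payload)
        print(payload)                  # Lambda sends stdout to CloudWatch Logs
    return {"received": len(received)}
```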

Idempotency

You can use a unique identifier to distinguish the objects you feed to a labeling job and track them in the output. You can bring your own unique identifier, or take advantage of an auto-generated identifier Ground Truth creates if you don’t supply one.

When you send a data object to your streaming labeling job using an Amazon SNS message, you can specify your deduplication key and deduplication ID. The unique identifier helps make sure that each object sent for labeling is unique. If you send two objects with the same unique identifier, the latter object is considered a duplicate. This prevents accidental injection of objects that weren’t intended and also provides an ID to track output data when labels are generated. For more information, see Duplicate Message Handling.

Drop objects into Amazon S3

You can set up your S3 buckets to automatically publish data labeling requests to your SNS input topic any time a data object is added to the bucket. With this setup, you can drop objects into the S3 bucket and they are automatically sent to your streaming labeling job.

For more information about setting up your S3 bucket and notifications, see Send Data Objects to Your Labeling Job Using An S3 Bucket.

Solution overview

To complete this use case, use the notebook ground_truth_create_streaming_labeling_job.ipynb in the Amazon SageMaker Examples GitHub repo.

After completing the prerequisites, you can use this walkthrough to do the following:

  1. Launch a notebook instance and set up the demo notebook
  2. Launch a streaming job
  3. Monitor the job
  4. Send objects to an ongoing job
  5. Stop the labeling job

Streaming labeling jobs are launched using the Ground Truth API operation CreateLabelingJob in a supported language-specific AWS SDK.

Prerequisites

If you’re a new user of Ground Truth streaming labeling jobs, it’s recommended you review Ground Truth Streaming Labeling Jobs before completing this walkthrough.

To complete this walkthrough, you need the following:

  • An AWS account.
  • An S3 bucket in the same AWS Region you use to launch your streaming labeling job. If you’re using a demo notebook, this bucket must also be in the same Region as your Amazon SageMaker notebook instance. You can either specify this bucket in the notebook variable BUCKET, or use the default bucket in the Region that you create your notebook instance in. For more information, see How do I create an S3 Bucket?
  • An AWS Identity and Access Management (IAM) execution role with required permissions. The notebook automatically uses the role you used to create your notebook instance (see the next item in this list). Add the following permissions to this IAM role:
    • Attach the managed policy AmazonSageMakerGroundTruthExecution. The following GIF demonstrates how to attach this policy to the role on the IAM console.

    • When you create your role, you specify Amazon S3 permissions. You can either allow that role to access all your resources in Amazon S3, or you can specify particular buckets. Make sure that your IAM role has access to the S3 bucket that you plan to use. This bucket must be in the same Region as your notebook instance.
  • A work team. A work team is a group of workers that you select from a workforce to label your data. A workforce is made up of workers engaged through Amazon Mechanical Turk, vendor-managed workers, or your own private workers that you invite to work on your tasks. Whichever workforce type you choose, Ground Truth takes care of sending tasks to workers. To preview the worker UI, use a private workforce and add yourself to the work team you use in the notebook.
    • To use a private or vendor workforce, record the Amazon Resource Name (ARN) of the work team you use—you need it in the accompanying Jupyter notebooks. The following GIF demonstrates how to quickly create a private work team on the Amazon SageMaker console.

    • If you don’t specify a private or vendor workforce, the notebook automatically uses the Mechanical Turk workforce. When you create the labeling job, you can specify the total amount you pay an individual worker for labeling a data object. To learn more, see Amazon SageMaker Ground Truth pricing.
  • If you’re not using default mode in the notebooks, you must supply an HTML worker task template. This template is used to render the human task UI that your workers use to complete tasks. You can copy your template directly to the notebooks, which provide logic to write the template to Amazon S3, or you can add the template to your S3 bucket and record the template Amazon S3 URI. For more information about sample templates, see Built-in Task Types. For more information about custom labeling workflows, see Step 2: Creating your custom labeling task template.
  • A list of label categories. The notebooks use this list to create a label category configuration file and upload it to Amazon S3. When you use default mode in the notebooks, this list is provided.
  • If you’re not using the notebooks, you need two Lambda functions to pre-process your input data (PreHumanTaskLambdaArn) and output data (AnnotationConsolidationLambdaArn). If you use one of the built-in task types, Ground Truth provides these functions.

Launching a notebook instance and setting up the demo notebook

To use the notebooks, you can launch an Amazon SageMaker notebook instance. For more information, see Create a Notebook Instance. When your notebook instance is active, complete the following steps to use the notebooks:

  1. On the Amazon SageMaker console in Notebook instances, locate your notebook instance.
  2. Choose Open Jupyter or Open Jupyter Lab.
  3. In Jupyter, choose the SageMaker Examples tab. In Jupyter Lab, choose the Amazon SageMaker icon to see a list of example notebooks.
  4. In the Ground Truth Labeling Jobs section, select one of the following notebooks to use alongside this post. In Jupyter, choose Use next to a notebook to start using it. In Jupyter Lab, select the notebook, then choose Create Copy.
    1. ground_truth_create_streaming_labeling_job.ipynb
    2. ground_truth_create_chained_streaming_labeling_job.ipynb

Launching a streaming job

Streaming labeling jobs are created using the same API operation, CreateLabelingJob, as non-streaming labeling jobs. To create a streaming labeling job, you specify an input topic as your input data source, and an output topic as the destination for your output data. New data objects are continuously sent to your labeling job through the input topic, and output data is sent to the output topic as soon as workers complete labeling tasks. You can configure your output topic to send a notification or trigger an event any time output data is received.

When you create a streaming labeling job, the input manifest file is optional.

You can use the Amazon SNS API operation CreateTopic to create your input and output topics, or you can use the Amazon SNS console. The response to a successful request to CreateTopic includes the topic ARN. You pass the ARNs of your input and output topics as parameters in your CreateLabelingJob request.

If the name of the topic contains GroundTruth (not case-sensitive) or SageMaker (not case-sensitive), the policy AmazonSageMakerGroundTruthExecution grants sufficient permissions to publish messages to your labeling job. If not, make sure to grant your IAM role permission to perform the actions sns:Publish and sns:Subscribe for your SNS topics.

Creating an SNS topic using the AWS Python (Boto3) SDK

The notebook ground_truth_create_streaming_labeling_job.ipynb creates SNS topics using the AWS Python (Boto3) SDK. In the following code, replace LABELING_JOB_NAME with the name of the labeling job:

import boto3

sns = boto3.client('sns')

# Create the input topic; new data objects are sent to the job through it
input_response = sns.create_topic(Name=LABELING_JOB_NAME + '-Input')
INPUT_SNS_TOPIC_ARN = input_response['TopicArn']

# Create the output topic; Ground Truth publishes output data to it
output_response = sns.create_topic(Name=LABELING_JOB_NAME + '-Output')
OUTPUT_SNS_TOPIC_ARN = output_response['TopicArn']

Creating an SNS topic on the Amazon SNS console

To create an SNS topic on the Amazon SNS console, complete the following steps:

  1. On the Amazon SNS console, choose Topics.
  2. Choose Create topic.
  3. For Name, enter a name.
  4. For Display name, enter an optional display name.
  5. If required, add additional configurations for your topic, such as Encryption, Access policy, Delivery retry policy, and Delivery status logging.

After the topics are created, feed the input topic ARN to LabelingJobSnsDataSource.SnsTopicArn and the output topic ARN to OutputConfig.SnsTopicArn.

Creating a streaming labeling job using CreateLabelingJob

You must create Ground Truth streaming labeling jobs with the Amazon SageMaker API operation CreateLabelingJob.

The ground_truth_create_streaming_labeling_job.ipynb notebook walks you through creating the resources required and configuring the request.

If you’re not using this notebook, use an AWS SDK supported by CreateLabelingJob. For more information about using an API request to create a streaming labeling job, see Example: Use SageMaker API To Create Streaming Labeling Job. If you’re a new user of Ground Truth, it’s recommended that you use one of the image or text based built-in task types to familiarize yourself with Ground Truth streaming labeling jobs.

After you fill in the parameters of your request, submit the request to create a labeling job. Refer to the Use the CreateLabelingJob API to create a streaming labeling job section in the ground_truth_create_streaming_labeling_job.ipynb notebook. You can also use the AWS Command Line Interface (AWS CLI) or AWS SDK. For more information, see Example: Use SageMaker API To Create Streaming Labeling Job.
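To make the shape of the request concrete, the following sketch assembles a CreateLabelingJob request for a streaming job. All values passed in the settings dictionary are placeholders that you supply; the label attribute name, task title, description, time limit, and worker count shown here are illustrative assumptions, not values from the notebook.

```python
def build_streaming_job_request(job_name, input_topic_arn, output_topic_arn, settings):
    """Assemble create_labeling_job arguments for a streaming labeling job."""
    return {
        "LabelingJobName": job_name,
        "LabelAttributeName": settings.get("label_attribute_name", "label"),
        # A streaming job reads new data objects from the input SNS topic.
        "InputConfig": {
            "DataSource": {"SnsDataSource": {"SnsTopicArn": input_topic_arn}}
        },
        # Output goes to Amazon S3 and, in real time, to the output SNS topic.
        "OutputConfig": {
            "S3OutputPath": settings["s3_output_path"],
            "SnsTopicArn": output_topic_arn,
        },
        "RoleArn": settings["role_arn"],
        "LabelCategoryConfigS3Uri": settings["label_categories_s3_uri"],
        "HumanTaskConfig": {
            "WorkteamArn": settings["workteam_arn"],
            "UiConfig": {"UiTemplateS3Uri": settings["ui_template_s3_uri"]},
            "PreHumanTaskLambdaArn": settings["pre_human_task_lambda_arn"],
            "AnnotationConsolidationConfig": {
                "AnnotationConsolidationLambdaArn": settings["consolidation_lambda_arn"]
            },
            "TaskTitle": settings.get("task_title", "Label the data object"),
            "TaskDescription": settings.get("task_description", "Follow the instructions"),
            "NumberOfHumanWorkersPerDataObject": settings.get("workers_per_object", 1),
            "TaskTimeLimitInSeconds": settings.get("task_time_limit_seconds", 600),
            "MaxConcurrentTaskCount": settings.get("max_concurrent_task_count", 10),
        },
    }
```

With AWS credentials configured, you could submit the result with boto3.client('sagemaker').create_labeling_job(**request). Keeping the request construction separate from the call makes it easy to inspect the parameters before a job is created.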

Monitoring the job

You can call DescribeLabelingJob after the job is created. Refer to the Use the DescribeLabelingJob API to describe a streaming labeling job section in the ground_truth_create_streaming_labeling_job.ipynb notebook.

Make sure the LabelingJobStatus is InProgress before feeding objects via the SNS channel. The following code is an example of how you can use DescribeLabelingJob (using the AWS Python (Boto3) SDK) to retrieve the labeling job status:

import boto3
sagemaker = boto3.client('sagemaker')
# Returns the job status as a string, for example 'InProgress'
sagemaker.describe_labeling_job(LabelingJobName=LABELING_JOB_NAME)['LabelingJobStatus']
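Because objects should be fed only after the status is InProgress, it can help to poll until the job gets there. The following is a minimal sketch; the helper name, polling interval, and retry limit are assumptions, and the client is passed in so you can use boto3.client('sagemaker') or a stub.

```python
import time


def wait_until_in_progress(sagemaker_client, labeling_job_name, poll_seconds=30, max_polls=20):
    """Poll DescribeLabelingJob until the job status is InProgress."""
    for _ in range(max_polls):
        response = sagemaker_client.describe_labeling_job(LabelingJobName=labeling_job_name)
        status = response["LabelingJobStatus"]
        if status == "InProgress":
            return status
        if status in ("Completed", "Failed", "Stopping", "Stopped"):
            # The job can no longer accept new objects.
            raise RuntimeError("Labeling job ended with status " + status)
        time.sleep(poll_seconds)  # status is still Initializing
    raise TimeoutError("Labeling job did not reach InProgress in time")
```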

If you specified the optional field S3DataSource.ManifestS3Uri in the CreateLabelingJob request, the objects in the Amazon S3 file are automatically sent to workers as soon as the labeling job starts. The LabelCounters element of the response to your DescribeLabelingJob request shows these objects as Unlabeled initially, and then HumanLabeled after they have been annotated and workers have submitted their work.

Amazon SQS offers a secure, durable, and available hosted queue. Streaming labeling jobs create an SQS queue in your account. You can find the queue by the name GroundTruth-LABELING_JOB_NAME (the following code uses the job name in lowercase). The following code is an example of how you can use GetQueueUrl (using the AWS Python (Boto3) SDK) to retrieve the queue URL:

import boto3
sqs = boto3.client('sqs')
response = sqs.get_queue_url(QueueName='GroundTruth-' + LABELING_JOB_NAME.lower())
queue_url = response['QueueUrl']
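Once you have the queue URL, you can check how many objects are waiting for a free slot. The following sketch uses the Amazon SQS GetQueueAttributes operation; the helper name is an assumption, and the count that Amazon SQS reports is approximate.

```python
def queued_object_count(sqs_client, labeling_job_name):
    """Return the approximate number of objects waiting in the job's queue."""
    queue_name = "GroundTruth-" + labeling_job_name.lower()
    queue_url = sqs_client.get_queue_url(QueueName=queue_name)["QueueUrl"]
    attributes = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )["Attributes"]
    # Amazon SQS returns attribute values as strings.
    return int(attributes["ApproximateNumberOfMessages"])
```

You can pass boto3.client('sqs') as the client. A count that keeps growing suggests objects are arriving faster than workers can label them.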

Sending objects to an ongoing job

After your labeling job has started, data objects can be fed to it through the console or the Amazon SNS API. For more information, see Send Data Objects Using Amazon SNS. The format of the SNS message that you use to send a data object to your labeling job is the same as the augmented manifest format.

For example, to send a new image object to an image classification labeling job, your message may look similar to the following:

{"source-ref": "s3://awsexamplebucket/example-image.jpg"}

If you create a text-based labeling job, your request may look similar to the following:

{"source": "Lorem ipsum dolor sit amet"}

Publishing a request on the Amazon SNS console

To publish a request to your labeling job on the Amazon SNS console, complete the following steps:

  1. On the Amazon SNS console, choose Topics.
  2. Choose your input topic.
  3. Choose Publish message.

Publishing a request using the Publish API operation

You can use the Amazon SNS API operation Publish to send a request to label a data object to your streaming labeling job via a supported AWS SDK.

The notebook demonstrates how to publish a message using this operation.

The following code is an example of how you can use the AWS Python (Boto3) SDK to send a request to Publish. Replace INPUT_TOPIC_ARN with the ARN of your input topic, and replace REQUEST with a request similar to the preceding examples.

import boto3
sns = boto3.client('sns')
# REQUEST must be a JSON string, such as one of the preceding examples
published_message = sns.publish(TopicArn=INPUT_TOPIC_ARN, Message=REQUEST)
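To send several images, you can build each request with json.dumps rather than writing the JSON by hand. The following is a sketch for image objects only; the helper names are assumptions, and the client is passed in so that boto3.client('sns') or a stub can be used.

```python
import json


def build_image_request(s3_uri):
    """Build the SNS message body for one image object."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("Expected an Amazon S3 URI, got " + s3_uri)
    return json.dumps({"source-ref": s3_uri})


def send_objects(sns_client, input_topic_arn, s3_uris):
    """Publish one request per object and return the SNS message IDs."""
    message_ids = []
    for s3_uri in s3_uris:
        response = sns_client.publish(
            TopicArn=input_topic_arn,
            Message=build_image_request(s3_uri),
        )
        message_ids.append(response["MessageId"])
    return message_ids
```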

After you publish a request, a call to DescribeLabelingJob shows Unlabeled incremented by 1:

'LabelCounters': {
    'TotalLabeled': 0,
    'HumanLabeled': 0,
    'MachineLabeled': 0,
    'FailedNonRetryableError': 0,
    'Unlabeled': 1
}

Previewing the worker task

If you used a private workforce and made yourself a worker on the work team used to create the labeling job, you can navigate to your worker portal to preview the worker task. You can find the worker portal link in the Labeling workforces page on the Ground Truth console (on the Amazon SageMaker console) in the Region you used to launch the labeling job. This link is also included in the welcome email sent to you when you were added to the work team.

When a worker submits a data object after labeling it, it is sent to your output topic. Additionally, the results are periodically added to the S3 output bucket you specified when you created your labeling job in S3OutputPath.

Stopping the labeling job

You can use streaming labeling jobs in long-running workflows, and they run until you stop them. This allows you to continuously feed objects to the labeling job.

However, if the system detects that no objects are waiting to be labeled and the job is idle continuously for more than a certain number of days, Ground Truth attempts to stop the job. For more information, see Stopping Streaming Jobs.

You can stop your labeling job on the Ground Truth console or using the Ground Truth API operation StopLabelingJob. To use the console, complete the following steps:

  1. On the Amazon SageMaker console, choose Ground Truth. Be sure to use the Region you used to launch your labeling job.
  2. Select the labeling job you want to stop.
  3. From the Actions drop-down menu, choose Stop job.

The final cells in the notebook demonstrate how you can stop a labeling job using the AWS Python (Boto3) SDK:

import boto3
sagemaker = boto3.client('sagemaker')
sagemaker.stop_labeling_job(LabelingJobName=LABELING_JOB_NAME)

When a labeling job has been successfully stopped, its status shows as Stopped.

Other features of streaming labeling jobs

The following sections cover additional features of streaming labeling jobs: sending objects to your labeling job by dropping them in an S3 bucket, and chaining multiple labeling jobs together.

Sending data objects to your labeling job using an S3 bucket

You can set up your S3 buckets to automatically publish data labeling requests to your SNS input topic any time a data object is added to the bucket. With this setup, you can drop objects into the S3 bucket and they are automatically sent to your streaming labeling job.

To configure an S3 bucket to automatically send data objects to your SNS input topic, you need to add an access policy to the input topic that allows Amazon S3 to publish events to it. The following code illustrates the type of policy to attach to your topic (replace SNS-topic-ARN, <bucket-name>, and <bucket-owner-account-id> with your own values):

{
 "Version": "2012-10-17",
 "Id": "example-ID",
 "Statement": [
  {
   "Sid": "example-statement-ID",
   "Effect": "Allow",
   "Principal": {
    "AWS":"*"  
   },
   "Action": [
    "SNS:Publish"
   ],
   "Resource": "SNS-topic-ARN",
   "Condition": {
      "ArnLike": { "aws:SourceArn": "arn:aws:s3:*:*:<bucket-name>" },
      "StringEquals": { "aws:SourceAccount": "<bucket-owner-account-id>" }
   }
  }
 ]
}

To set up your S3 bucket to send data objects to your streaming labeling job on the Amazon S3 console, complete the following steps:

  1. On the Amazon S3 console, choose the bucket that you want to use to send data objects to your labeling job.
  2. On the Properties tab, under Advanced settings, choose Events.
  3. Choose Add notification.
  4. Give your notification a name.
  5. Select All object create events.
  6. Optionally, enter a prefix if you want to drop data objects into a prefix within the S3 bucket.
  7. If you only want to send specific types of data objects to your SNS input topic, specify a suffix. For example, to ensure only image files are sent to your SNS input topic, you can enter .jpg,.png,.jpeg.
  8. From the Send to drop-down menu, choose SNS Topic.
  9. Choose the SNS input topic you used or will use to create your labeling job.
  10. Choose Save.

The following GIF demonstrates how to set up this configuration on the Amazon S3 console.
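The same setup can be scripted. The following sketch mirrors the preceding access policy and console steps using the Amazon SNS SetTopicAttributes and Amazon S3 PutBucketNotificationConfiguration operations. The helper names and statement ID are assumptions, and this sketch creates one notification configuration per suffix on the assumption that each filter rule carries a single suffix value. Both calls replace any existing topic policy or bucket notification configuration, so merge with your current settings before using this on shared resources.

```python
import json


def build_topic_policy(topic_arn, bucket_name, account_id):
    """Access policy that lets the bucket publish events to the topic."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowS3Publish",  # hypothetical statement ID
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": ["SNS:Publish"],
            "Resource": topic_arn,
            "Condition": {
                "ArnLike": {"aws:SourceArn": "arn:aws:s3:*:*:" + bucket_name},
                "StringEquals": {"aws:SourceAccount": account_id},
            },
        }],
    }


def build_notification_config(topic_arn, suffixes, prefix=None):
    """One topic configuration per suffix, for all object create events."""
    configurations = []
    for suffix in suffixes:
        rules = [{"Name": "suffix", "Value": suffix}]
        if prefix:
            rules.append({"Name": "prefix", "Value": prefix})
        configurations.append({
            "TopicArn": topic_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": rules}},
        })
    return {"TopicConfigurations": configurations}


def connect_bucket_to_topic(bucket_name, topic_arn, account_id, suffixes, prefix=None):
    """Apply the policy and the notification; requires AWS credentials and boto3."""
    import boto3  # imported here so the builders above work without AWS access

    boto3.client("sns").set_topic_attributes(
        TopicArn=topic_arn,
        AttributeName="Policy",
        AttributeValue=json.dumps(build_topic_policy(topic_arn, bucket_name, account_id)),
    )
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_notification_config(topic_arn, suffixes, prefix),
    )
```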

Chaining

To create sophisticated, persistent, real-time data labeling pipelines that allow you to add multiple types of annotations to data objects, audit and verify labels, and more, you can chain multiple streaming labeling jobs.

Chaining allows you to send the output of one streaming labeling job to another streaming labeling job. For example, the output data of Job 1 can be sent to Job 2 as input, the output data of Job 2 can flow to Job n-1, and the output data of Job n-1 can flow to Job n in real time.

As an example use case, you could use Job 1 to add a semantic segmentation mask to a sequence of video frames. You then use Job 2 to add bounding boxes to identify and localize data objects in each frame. Finally, you use Job 3 to verify and adjust labels as needed.

To set this up, you use the output SNS topic of Job 1 as the input SNS topic of Job 2. Similarly, you use the output SNS topic of Job 2 as the input SNS topic of Job 3, and so on. The following diagram illustrates this architecture.

After you set up your jobs this way, a data object flowing through Job 1 makes its way to Job 2 automatically after passing through Job 1. The following are some possibilities for chaining with two jobs:

  • Specify different label attribute names for jobs with a similar task type. For example, Job 1 (label data) chains to Job 2 (adjust, review, and verify annotations from Job 1).
  • Use different label attribute names for jobs with different task types. For example, Job 1 (labeling for image classification) chains to Job 2 (labeling for object detection).
  • Use the same label attribute names for both jobs. For example, Job 1 (labeling) chains to Job 2 (partially labeled data from Job 1 flows to Job 2).
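
The topic wiring and label attribute choices described above can be sketched as follows. This is not the code from the notebook; it only builds the parts of a CreateLabelingJob request that matter for chaining (LabelAttributeName, the SNS input data source, and the output SNS topic). The helper name build_chain_fragments, the job names, the topic ARNs, and the S3 output path are hypothetical, and each fragment still needs the remaining required request fields (such as the role and human task configuration) before you can submit it.

```python
from typing import Dict, List


def build_chain_fragments(
    job_names: List[str],
    label_attribute_names: List[str],
    topic_arns: List[str],
    s3_output_path: str,
) -> List[Dict]:
    """Build partial CreateLabelingJob requests for a chain of streaming jobs.

    topic_arns must hold one more entry than job_names: topic i feeds job i,
    and topic i + 1 receives that job's output (and feeds the next job).
    """
    if len(label_attribute_names) != len(job_names):
        raise ValueError("Provide one label attribute name per job.")
    if len(topic_arns) != len(job_names) + 1:
        raise ValueError("Provide one more SNS topic than there are jobs.")

    fragments = []
    for i, job_name in enumerate(job_names):
        fragments.append(
            {
                "LabelingJobName": job_name,
                "LabelAttributeName": label_attribute_names[i],
                # Input: the SNS topic that streams data objects into this job.
                "InputConfig": {
                    "DataSource": {
                        "SnsDataSource": {"SnsTopicArn": topic_arns[i]}
                    }
                },
                # Output: the SNS topic that the next job in the chain reads from.
                "OutputConfig": {
                    "S3OutputPath": s3_output_path,
                    "SnsTopicArn": topic_arns[i + 1],
                },
            }
        )
    return fragments


if __name__ == "__main__":
    # Hypothetical names and ARNs for a two-job chain (label, then audit).
    prefix = "arn:aws:sns:us-east-1:111122223333:"
    chain = build_chain_fragments(
        ["bbox-job", "bbox-audit-job"],
        ["bbox-label", "bbox-audit-label"],
        [prefix + "job1-input", prefix + "job1-output", prefix + "job2-output"],
        "s3://example-bucket/output/",
    )
    for fragment in chain:
        print(fragment["LabelingJobName"], fragment["InputConfig"], fragment["OutputConfig"])
```

In this example, the two jobs use different label attribute names, which matches the first possibility in the preceding list. To use the same label attribute name for both jobs, pass the same string twice.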

You can use the notebook ground_truth_create_chained_streaming_labeling_job.ipynb to learn how to chain two streaming labeling jobs. This example demonstrates the first use case in the preceding list (different label attribute names for jobs with similar task types). When used in default mode, this notebook chains a bounding box (object detection) job (Job 1) to a bounding box audit job (Job 2). Any bounding boxes annotated in Job 1 can be adjusted in Job 2 in real time. You can generalize this use case to set up quality-check workflows, in which one work team reviews the annotations of another work team.

You can also use the notebook to set up any kind of chained streaming jobs to achieve multiple job-chaining configurations.

Conclusion

This post covers the benefits of using the Ground Truth streaming feature and how to create and chain streaming labeling jobs. This post merely scratches the surface of what Ground Truth streaming can do.

To get started, use one of the notebooks included in this post to launch and experiment with streaming labeling jobs, or see Create a Streaming Labeling Job.

Let us know what you think in the comments.

About the Authors

Priyanka Gopalakrishna is a software engineer at Amazon AI. Her focus is on solving labeling problems using machine learning and building scalable AI solutions using distributed systems.

Talia Chopra is a Technical Writer at AWS specializing in machine learning and artificial intelligence. She works with multiple teams in AWS to create technical documentation and tutorials for customers using Amazon SageMaker, MXNet, and AutoGluon.


‘Marbles at Night’ Illuminates Future of Graphics in NVIDIA Omniverse

Reflections have never looked so good.

Artists are using NVIDIA RTX GPUs to take real-time graphics to the next level, creating visuals with rendered surfaces and light reflections to produce incredible photorealistic details.

The Marbles RTX technology demo, first previewed at GTC in March, ran on a single NVIDIA RTX 8000 GPU. It showcased how complex physics can be simulated in a real-time, ray-traced world.

During the GeForce RTX 30 Series launch event in September, NVIDIA CEO Jensen Huang unveiled a more challenging take on the NVIDIA Marbles RTX project: staging the scene to take place at night and illustrate the effect of hundreds of dynamic, animated lights.

Marbles at Night is a physics-based demo created with dynamic, ray-traced lights and over 100 million polygons. Built in NVIDIA Omniverse and running on a single GeForce RTX 3090 GPU, the final result showed hundreds of different light sources at night, with each marble reflecting lights differently and all happening in real time.

Beyond demonstrating the latest technologies for content creation, Marbles at Night showed how creative professionals can now seamlessly collaborate and design simulations with incredible lighting, accurate reflections and real-time ray tracing with path tracing.

Pushing the Limits of Creativity

A team of artists from NVIDIA collaborated and built the project in NVIDIA Omniverse, the real-time graphics and simulation platform based on NVIDIA RTX GPUs and Pixar’s Universal Scene Description.

Working in Omniverse, the artists were able to upload, store and access all the assets in the cloud, allowing them to easily share files across teams. They could send a link, open the file and work on the assets at the same time.

Every single asset in Marbles at Night was hand-made, modeled and textured from scratch. Marbles RTX Creative Director Gavriil Klimov bought over 200 art supplies and took reference photos of each to capture realistic details, from paint splatter to wear and tear. Texturing — a process that allows artists to transfer details from one model to another — was done entirely in Substance Painter, with multiple variations for each asset.

In Omniverse, the artists manually crafted everything in the Marbles project using RTX Renderer and a variety of creative applications like 3ds Max, Maya, Cinema 4D, ZBrush and Blender. The simulation platform enabled the creative team to view all content at the highest possible quality in real time, resulting in shorter cycles and more iterations.

Nearly a dozen people were working on the project remotely from locations as far afield as California, New York, Australia and Russia. Although the team members were located around the world, Omniverse allowed them to work on scenes simultaneously thanks to Omniverse Nucleus. Running on premises or in the cloud, the module enabled the teams to collaborate in real time across vast distances.

The collaboration-based workflow, combined with the fact that the project’s assets were stored in the cloud, made it easier for everyone to access the files and edit in real time.

The final technology demo completed in Omniverse resulted in over 500GB worth of texture data, over 100 unique objects, more than 5,000 meshes and about 100 million polygons.

The Research Behind the Project

NVIDIA Research recently released a paper on the reservoir-based spatiotemporal importance resampling (ReSTIR) technique, which details how to render dynamic direct lighting and shadows from millions of area lights in real time. Inspired by this technique, the NVIDIA rendering team, led by distinguished engineer Ignacio Llamas, implemented an algorithm that allowed Klimov and team to place as many lights as they wanted for the Marbles demo, without being constrained by lighting limits.

“Before, we were limited to using less than 10 lights. But today with Omniverse capabilities using RTX, we were able to place as many lights as we wanted,” said Klimov. “That’s the beauty of it — you can creatively decide what the limit is that works for you.”

Traditionally, artists and developers achieved complex lighting using baked solutions. NVIDIA Research, in collaboration with the Visual Computing Lab at Dartmouth College, produced the research paper that dives into how artists can enable direct lighting from millions of moving lights.

The approach requires no complex light structure, no baking and no global scene parameterization. All the lights can cast shadows, everything can move arbitrarily and new emitters can be added dynamically. This technique is implemented using DirectX Ray Tracing accelerated by NVIDIA RTX and NVIDIA RT Cores.
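
To give a feel for the “reservoir” part of the technique's name, here is a minimal sketch of single-sample weighted reservoir sampling, the streaming building block that reservoir-based resampling relies on. It is not NVIDIA's implementation and leaves out the spatial and temporal reuse that the paper describes. The function name pick_light, the light names and the weights are all hypothetical; in a renderer, the weight would be some estimate of how much a light contributes to the point being shaded.

```python
import random
from typing import Iterable, Optional, Tuple


def pick_light(
    candidates: Iterable[Tuple[str, float]], rng: random.Random
) -> Optional[str]:
    """Pick one light from a stream with probability proportional to its weight.

    Only the current pick and the running weight sum are stored, so the
    stream of candidate lights can be arbitrarily long.
    """
    chosen = None
    weight_sum = 0.0
    for light, weight in candidates:
        if weight <= 0.0:
            continue  # a light with no contribution can never be picked
        weight_sum += weight
        # Replace the current pick with probability weight / weight_sum.
        if rng.random() < weight / weight_sum:
            chosen = light
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical lights with made-up contribution weights.
    lights = [("lamp", 1.0), ("neon_sign", 3.0), ("candle", 0.5)]
    print(pick_light(lights, rng))
```

Because each pixel only keeps a small, fixed amount of state no matter how many lights it considers, the cost per pixel does not depend on a prebuilt light structure, which fits the description above of lights being added and moved freely.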

Get more insights into the NVIDIA Research that’s helping professionals simplify complex design workflows, and learn about the latest announcement of Omniverse, now in open beta.

The post ‘Marbles at Night’ Illuminates Future of Graphics in NVIDIA Omniverse appeared first on The Official NVIDIA Blog.
