Ström discusses his career journey in conversational AI, his published research, and where he sees the field of conversational AI headed next.
New workshop, dataset will help promote safety in the skies
ICCV workshop hosted by Amazon Prime Air and AWS will announce results of challenge to detect airborne obstacles.
Build a cognitive search and a health knowledge graph using AWS AI services
Medical data is highly contextual and heavily multi-modal, and each data silo is typically treated separately. To bridge these silos, a knowledge graph-based approach integrates data across domains and represents complex scientific knowledge more naturally. For example, three major components of electronic health records (EHRs) are diagnosis codes, primary notes, and specific medications. Because these are represented in different data silos, secondary use of these documents for accurately identifying patients with a specific observable trait is a crucial challenge. By connecting those different sources, subject matter experts have a richer pool of data for understanding how different concepts such as diseases and symptoms interact with one another, which helps them conduct their research. This ultimately helps healthcare and life sciences researchers and practitioners derive better insights from the data for a variety of use cases, such as drug discovery and personalized treatments.
In this post, we use Amazon HealthLake to export EHR data in the Fast Healthcare Interoperability Resources (FHIR) data format. We then build a knowledge graph based on key entities extracted and harmonized from the medical data. Amazon HealthLake also extracts and transforms unstructured medical data, such as medical notes, so it can be searched and analyzed. Together with Amazon Kendra and Amazon Neptune, we allow domain experts to ask a natural language question, surface the results and relevant documents, and show connected key entities such as treatments, inferred ICD-10 codes, medications, and more across records and documents. This allows for easy analysis of key entity co-occurrence, comorbidities, and patient cohorts in an integrated solution. Combining effective search capabilities and data mining through graph networks reduces the time and cost for users to find relevant information around patients and improves knowledge serviceability surrounding EHRs. The code base for this post is available on the GitHub repo.
Solution overview
In this post, we use the output from Amazon HealthLake for two purposes.
First, we index EHRs into Amazon Kendra for semantic and accurate document ranking out of patient notes, which helps improve physician efficiency in identifying patient notes and comparing them with those of other patients sharing similar characteristics. This shifts from a lexical search to a semantic search that introduces context around the query, which results in better search output (see the following screenshot).
Second, we use Neptune to build knowledge graph applications for users to view metadata associated with patient notes in a simpler, normalized view, which allows us to highlight the important characteristics stemming from a document (see the following screenshot).
The following diagram illustrates our architecture.
The steps to implement the solution are as follows:
- Create and export Amazon HealthLake data.
- Extract patient visit notes and metadata.
- Load patient notes data into Amazon Kendra.
- Load the data into Neptune.
- Set up the backend and front end to run the web app.
Create and export Amazon HealthLake data
As a first step, create a data store using Amazon HealthLake either via the Amazon HealthLake console or the AWS Command Line Interface (AWS CLI). For this post, we focus on the AWS CLI approach.
- We use AWS Cloud9 to create a data store with the following code, replacing <<your_data_store_name>> with a unique name:
aws healthlake create-fhir-datastore --region us-east-1 --datastore-type-version R4 --preload-data-config PreloadDataType="SYNTHEA" --datastore-name "<<your_data_store_name>>"
The preceding code uses a preloaded dataset from Synthea, which is supported in FHIR version R4, to explore how to use Amazon HealthLake output. Running the code produces a response similar to the following, and the data store takes some time to become active (approximately 30 minutes at the time of writing):
{
"DatastoreEndpoint": "https://healthlake.us-east-1.amazonaws.com/datastore/<<your_data_store_id>>/r4/",
"DatastoreArn": "arn:aws:healthlake:us-east-1:<<your_AWS_account_number>>:datastore/fhir/<<your_data_store_id>>",
"DatastoreStatus": "CREATING",
"DatastoreId": "<<your_data_store_id>>"
}
You can check the status of completion either on the Amazon HealthLake console or in the AWS Cloud9 environment.
- To check the status in AWS Cloud9, use the following code and wait until DatastoreStatus changes from CREATING to ACTIVE:
aws healthlake describe-fhir-datastore --datastore-id "<<your_data_store_id>>" --region us-east-1
- When the status changes to ACTIVE, get the role ARN from the HEALTHLAKE-KNOWLEDGE-ANALYZER-IAMROLE stack in AWS CloudFormation, associated with the physical ID AmazonHealthLake-Export-us-east-1-HealthDataAccessRole, and copy the ARN on the linked page.
- In AWS Cloud9, use the following code to export the data from Amazon HealthLake to the Amazon Simple Storage Service (Amazon S3) bucket generated by the AWS Cloud Development Kit (AWS CDK), and note the job-id output:
aws healthlake start-fhir-export-job --output-data-config S3Uri="s3://hl-synthea-export-<<your_AWS_account_number>>/export-$(date +"%d-%m-%y")" --datastore-id <<your_data_store_id>> --data-access-role-arn arn:aws:iam::<<your_AWS_account_number>>:role/AmazonHealthLake-Export-us-east-1-HealthKnoMaDataAccessRole
- Verify that the export job is complete using the following code with the job-id obtained from the last command you ran (when the export is complete, JobStatus in the output states COMPLETED):
aws healthlake describe-fhir-export-job --datastore-id <<your_data_store_id>> --job-id <<your_job_id>>
Extract patient visit notes and metadata
The next step involves decoding patient visits to obtain the raw text. We import the file DocumentReference-0.ndjson (shown in the following S3 screenshot) from the Amazon HealthLake export we completed previously into the CDK-deployed Amazon SageMaker notebook instance. First, save the notebook provided in the GitHub repo to the SageMaker instance. Then run the notebook to automatically locate and import the DocumentReference-0.ndjson files from Amazon S3.
For this step, use the provisioned SageMaker notebook instance to run the notebook. The first part of the notebook creates a text file that contains the notes from each patient visit and saves it to an Amazon S3 location. Because multiple visits can exist for a single patient, each file is uniquely identified by combining the patient ID and the visit ID. These patient notes are later used to perform semantic search with Amazon Kendra.
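As an illustration of what this extraction step looks like, the following minimal sketch decodes the note text from each DocumentReference resource in the export. The FHIR field paths (subject, context.encounter, content[0].attachment.data) are assumptions based on the standard R4 resource layout and the Synthea sample data, not code taken from the notebook.

import base64
import json

def extract_notes(ndjson_path):
    """Return {patientid_visitid: note text} from a DocumentReference ndjson export."""
    notes = {}
    with open(ndjson_path) as f:
        for line in f:
            resource = json.loads(line)
            patient_id = resource["subject"]["reference"].split("/")[-1]
            visit_id = resource["context"]["encounter"][0]["reference"].split("/")[-1]
            encoded = resource["content"][0]["attachment"]["data"]
            # The note body is stored as a base64-encoded attachment
            notes[f"{patient_id}_{visit_id}"] = base64.b64decode(encoded).decode("utf-8")
    return notes

# notes = extract_notes("DocumentReference-0.ndjson")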
The next step in the notebook involves creating triples based on the automatically extracted metadata. When the metadata is created and saved to an Amazon S3 location, an AWS Lambda function is triggered to generate the triples surrounding the patient visit notes.
Load patient notes data into Amazon Kendra
The text files that are uploaded in the source path of the S3 bucket need to be crawled and indexed. For this post, a developer edition index is created during the AWS CDK deployment, and it is used to index the raw patient notes.
- On the AWS CloudFormation console under the HEALTHLAKE-KNOWLEDGE-ANALYZER-CORE stack, search for kendra on the Resources tab and take note of the index ID and data source ID (copy the first part of the physical ID before the pipe ( | )).
- Back in AWS Cloud9, run the following command to synchronize the patient notes in Amazon S3 to Amazon Kendra:
aws kendra start-data-source-sync-job --id <<data_source_id_2nd_circle>> --index-id <<index_id_1st_circle>>
- You can verify when the sync status is complete by running the following command:
aws kendra describe-data-source --id <<data_source_id_2nd_circle>> --index-id <<index_id_1st_circle>>
Because the ingested data is very small, it should immediately show that Status is ACTIVE upon running the preceding command.
Load the data into Neptune
In this next step, we access the Amazon Elastic Compute Cloud (Amazon EC2) instance that was spun up and load the triples from Amazon S3 into Neptune using the following code:
curl -X POST \
-H 'Content-Type: application/json' \
https://healthlake-knowledge-analyzer-vpc-and-neptune-neptunedbcluster.cluster-<<your_unique_id>>.us-east-1.neptune.amazonaws.com:8182/loader -d '
{
"source": "s3://<<your_Amazon_S3_bucket>>/stdized-data/neptune_triples/nquads/",
"format": "nquads",
"iamRoleArn": "arn:aws:iam::<<your_AWS_account_number>>:role/KNOWLEDGE-ANALYZER-IAMROLE-ServiceRole",
"region": "us-east-1",
"failOnError": "TRUE"
}'
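The loader call returns a loadId. If you want to confirm the bulk load finished before moving on, you can poll the Neptune loader status endpoint from the same EC2 instance. This is a minimal sketch under the assumption that the instance can reach the cluster endpoint used above; it is not part of the original walkthrough.

import time
import requests

NEPTUNE = ("https://healthlake-knowledge-analyzer-vpc-and-neptune-neptunedbcluster"
           ".cluster-<<your_unique_id>>.us-east-1.neptune.amazonaws.com:8182")

def wait_for_load(load_id, poll_seconds=5):
    """Poll the Neptune bulk loader until the job leaves the in-progress state."""
    while True:
        payload = requests.get(f"{NEPTUNE}/loader/{load_id}").json()["payload"]
        status = payload["overallStatus"]["status"]
        if status != "LOAD_IN_PROGRESS":
            return status  # for example LOAD_COMPLETED or LOAD_FAILED
        time.sleep(poll_seconds)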
Set up the backend and front end to run the web app
The preceding step should take a few seconds to complete. In the meantime, configure the EC2 instance to access the web app. Make sure to have both Python and Node installed in the instance.
- Run the following code in the terminal of the instance:
sudo iptables -t nat -I PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3000
This routes the public address to the deployed app.
- Copy the two folders titled ka-webapp and ka-server-webapp and upload them to a folder named dev in the EC2 instance.
- For the front end, create a screen by running the following command:
screen -S front
- In this screen, change the folder to ka-webapp and run npm install.
- After installation, go into the file .env.development, add the Amazon EC2 public IPv4 address, and save the file.
- Run npm start and then detach the screen.
- For the backend, create another screen by entering:
screen -S back
- Change the folder to ka-server-webapp and run pip install -r requirements.txt.
- When the libraries are installed, enter the following code:
./run.sh
- Detach from the current screen, and using any browser, go to the Amazon EC2 public IPv4 address to access the web app.
Try searching for a patient diagnosis and choose a document link to visualize the knowledge graph of that document.
Next steps
In this post, we integrate data output from Amazon HealthLake into both a search and graph engine to semantically search relevant information and highlight important entities linked to documents. You can further expand this knowledge graph and link it to other ontologies such as MeSH and MedDRA.
Furthermore, this provides a foundation to further integrate other clinical datasets and expand this knowledge graph to build a data fabric. You can make queries on historical population data, chaining structured and language-based searches for cohort selection to correlate disease with patient outcome.
Clean up
To clean up your resources, complete the following steps:
- To delete the stacks created, enter the following commands in the order given to properly remove all resources:
$ cdk destroy HEALTHLAKE-KNOWLEDGE-ANALYZER-UPDATE-CORE
$ cdk destroy HEALTHLAKE-KNOWLEDGE-ANALYZER-WEBAPP
$ cdk destroy HEALTHLAKE-KNOWLEDGE-ANALYZER-CORE
- While the preceding commands are in progress, delete the Amazon Kendra data source that was created, then remove the remaining stacks and the Amazon HealthLake data store:
$ cdk destroy HEALTHLAKE-KNOWLEDGE-ANALYZER-VPC-AND-NEPTUNE
$ cdk destroy HEALTHLAKE-KNOWLEDGE-ANALYZER-IAMROLE
$ aws healthlake delete-fhir-datastore --datastore-id <<your_data_store_id>>
- To verify it’s been deleted, check the status by running the following command:
$ aws healthlake describe-fhir-datastore --datastore-id "<<your_data_store_id>>" --region us-east-1
- Check the AWS CloudFormation console to ensure that all associated stacks starting with HEALTHLAKE-KNOWLEDGE-ANALYZER have been deleted successfully.
Conclusion
Amazon HealthLake provides a managed service based on the FHIR standard to allow you to build health and clinical solutions. Connecting the output of Amazon HealthLake to Amazon Kendra and Neptune gives you the ability to build a cognitive search and a health knowledge graph to power your intelligent application.
Building on top of this approach can enable researchers and front-line physicians to easily search across clinical notes and research articles by simply typing their question into a web browser. Every piece of clinical evidence is tagged, indexed, and structured using machine learning to provide evidence-based topics on things like transmission, risk factors, therapeutics, and incubation. This functionality is tremendously valuable for clinicians and scientists because it allows them to quickly ask a question to validate and advance their clinical decision support or research.
Try this out on your own! Deploy this solution using Amazon HealthLake in your AWS account by deploying the example on GitHub.
About the Authors
Prithiviraj Jothikumar, PhD, is a Data Scientist with AWS Professional Services, where he helps customers build solutions using machine learning. He enjoys watching movies and sports and spending time to meditate.
Phi Nguyen is a solutions architect at AWS, helping customers with their cloud journey with a special focus on data lakes, analytics, semantic technologies, and machine learning. In his spare time, you can find him biking to work, coaching his son’s soccer team, or enjoying nature walks with his family.
Parminder Bhatia is a science leader in the AWS Health AI, currently building deep learning algorithms for clinical domain at scale. His expertise is in machine learning and large scale text analysis techniques in low resource settings, especially in biomedical, life sciences and healthcare technologies. He enjoys playing soccer, water sports and traveling with his family.
Garin Kessler is a Senior Data Science Manager at Amazon Web Services, where he leads teams of data scientists and application architects to deliver bespoke machine learning applications for customers. Outside of AWS, he lectures on machine learning and neural language models at Georgetown. When not working, he enjoys listening to (and making) music of questionable quality with friends and family.
Dr. Taha Kass-Hout is Director of Machine Learning and Chief Medical Officer at Amazon Web Services, and leads our Health AI strategy and efforts, including Amazon Comprehend Medical and Amazon HealthLake. Taha is also working with teams at Amazon responsible for developing the science, technology, and scale for COVID-19 lab testing. A physician and bioinformatician, Taha served two terms under President Obama, including as the first Chief Health Informatics Officer at the FDA. During this time as a public servant, he pioneered the use of emerging technologies and cloud computing (the CDC’s electronic disease surveillance), and established widely accessible global data-sharing platforms: openFDA, which enabled researchers and the public to search and analyze adverse event data, and precisionFDA (part of the Presidential Precision Medicine initiative).
Improve the streaming transcription experience with Amazon Transcribe partial results stabilization
Whether you’re watching a live broadcast of your favorite soccer team, having a video chat with a vendor, or calling your bank about a loan payment, streaming speech content is everywhere. You can apply a streaming transcription service to generate subtitles for content understanding and accessibility, to create metadata to enable search, or to extract insights for call analytics. These transcription services process streaming audio content and generate partial transcription results until they provide a final transcription for a segment of continuous speech. However, some words or phrases in these partial results might change as the service further understands the context of the audio.
We’re happy to announce that Amazon Transcribe now allows you to enable and configure partial results stabilization for streaming audio transcriptions. Amazon Transcribe is an automatic speech recognition (ASR) service that enables developers to add real-time speech-to-text capabilities into their applications for on-demand and streaming content. Instead of waiting for an entire sentence to be transcribed, you can now control the stabilization level of partial results. Amazon Transcribe offers three settings: high, medium, and low. Setting the stabilization level to high allows a greater portion of the partial results to be fixed, with only the last few words changing during the transcription process. This feature gives you more flexibility in your streaming transcription workflows based on the user experience you want to create.
In this post, we walk through the benefits of this feature and how to enable it via the Amazon Transcribe console or the API.
How partial results stabilization works
Let’s dive deeper into this with an example.
During your daily conversations, you may think you hear a certain word or phrase, but later realize that it was incorrect based on additional context. Let’s say you were talking to someone about food, and you heard them say “Tonight, I will eat a pear…” However, when the speaker finishes, you realize they actually said “Tonight I will eat a pair of pancakes.” Just as humans may change our understanding based on the information at hand, Amazon Transcribe uses machine learning (ML) to self-correct the transcription of streaming audio based on the context it receives. To enable this, Amazon Transcribe uses partial results.
During the streaming transcription process, Amazon Transcribe outputs chunks of the results with an isPartial flag. Results with this flag set to true are the ones that Amazon Transcribe may change in the future, depending on the additional context received. After Amazon Transcribe determines that it has sufficient context to be over a certain confidence threshold, the results are stabilized and the isPartial flag for that specific partial result is set to false. The window size of these partial results could range from a few words to multiple sentences depending on the stream context.
The following image displays how the partial results are generated (and edited) in Amazon Transcribe for streaming transcription.
Results stabilization enables more control over the latency and accuracy of transcription results. Depending on the use case, you may prioritize one over the other. For example, when providing live subtitles, high stabilization of results may be preferred because speed is more important than accuracy. On the other hand, for use cases like content moderation, lower stabilization is preferred because accuracy may be more important than latency.
A high stability level enables quicker stabilization of transcription results by limiting the window of context for stabilizing results, but can lead to lower overall accuracy. On the other hand, a low stability level leads to more accurate transcription results, but the partial transcription results are more likely to change.
With the streaming transcription API, you can now control the stability of the partial results in your transcription stream.
Now let’s look at how to use the feature.
Access partial results stabilization via the Amazon Transcribe console
To start using partial results stabilization on the Amazon Transcribe console, complete the following steps:
- On the Amazon Transcribe console, make sure you’re in a Region that supports Amazon Transcribe Streaming.
For this post, we use us-east-1.
- In the navigation pane, choose Real-time transcription.
- Under Additional settings, enable Partial results stabilization.
- Select your stability level.
You can choose between three levels:
- High – Provides the most stable partial transcription results with lower accuracy compared to Medium and Low settings. Results are less likely to change as additional context is gathered.
- Medium – Provides partial transcription results that have a balance between stability and accuracy
- Low – Provides relatively less stable partial transcription results with higher accuracy compared to High and Medium settings. Results get updated as additional context is gathered and utilized.
- Choose Start streaming to play a stream and check the results.
Access partial results stabilization via the API
In this section, we demonstrate streaming with HTTP/2. You can enable your preferred level of partial results stabilization in an API request.
You enable this feature via the enable-partial-results-stabilization flag and the partial-results-stability level input parameters:
POST /stream-transcription HTTP/2
x-amzn-transcribe-language-code: LanguageCode
x-amzn-transcribe-sample-rate: MediaSampleRateHertz
x-amzn-transcribe-media-encoding: MediaEncoding
x-amzn-transcribe-session-id: SessionId
x-amzn-transcribe-enable-partial-results-stabilization: true
x-amzn-transcribe-partial-results-stability: low | medium | high
Enabling partial results stabilization introduces an additional flag, Stable, at the item level in the transcription results of the API response. If a partial results item in the streaming transcription result has the Stable flag marked as true, the corresponding item transcription in the partial results doesn’t change, irrespective of any subsequent context identified by Amazon Transcribe. If the Stable flag is marked as false, there is still a chance that the corresponding item may change in the future, until the IsPartial flag is marked as false.
The following code shows our API response:
{
"Alternatives": [
{
"Items": [
{
"Confidence": 0,
"Content": "Amazon",
"EndTime": 1.22,
"Stable": true,
"StartTime": 0.78,
"Type": "pronunciation",
"VocabularyFilterMatch": false
},
{
"Confidence": 0,
"Content": "is",
"EndTime": 1.63,
"Stable": true,
"StartTime": 1.46,
"Type": "pronunciation",
"VocabularyFilterMatch": false
},
{
"Confidence": 0,
"Content": "the",
"EndTime": 1.76,
"Stable": true,
"StartTime": 1.64,
"Type": "pronunciation",
"VocabularyFilterMatch": false
},
{
"Confidence": 0,
"Content": "largest",
"EndTime": 2.31,
"Stable": true,
"StartTime": 1.77,
"Type": "pronunciation",
"VocabularyFilterMatch": false
},
{
"Confidence": 1,
"Content": "rainforest",
"EndTime": 3.34,
"Stable": true,
"StartTime": 2.4,
"Type": "pronunciation",
"VocabularyFilterMatch": false
}
],
"Transcript": "Amazon is the largest rainforest "
}
],
"EndTime": 4.33,
"IsPartial": false,
"ResultId": "f4b5d4dd-b685-4736-b883-795dc3f7f636",
"StartTime": 0.78
}
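To make the response format concrete, here is a minimal sketch of how an application might separate the stabilized words from the still-changing tail of a streaming result. The dictionary shape mirrors the sample response above; it is an illustration, not part of the Amazon Transcribe SDKs.

def split_stable_text(result):
    """Split one streaming result into (stable_prefix, unstable_tail) strings."""
    stable, unstable = [], []
    for item in result["Alternatives"][0]["Items"]:
        # Items flagged Stable will not change in later partial results
        (stable if item.get("Stable") else unstable).append(item["Content"])
    return " ".join(stable), " ".join(unstable)

# With the sample response above, split_stable_text(response)
# returns ("Amazon is the largest rainforest", "").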
Conclusion
This post introduces the recently launched partial results stabilization feature in Amazon Transcribe. For more information, see the Amazon Transcribe Partial results stabilization documentation.
To learn more about the Amazon Transcribe Streaming Transcription API, check out Using Amazon Transcribe streaming With HTTP/2 and Using Amazon Transcribe streaming with WebSockets.
About the Author
Alex Chirayath is an SDE in the Amazon Machine Learning Solutions Lab. He helps customers adopt AWS AI services by building solutions to address common business problems.
The Washington Post Launches Audio Articles Voiced by Amazon Polly
AWS is excited to announce that The Washington Post is integrating Amazon Polly to provide their readers with audio access to stories across The Post’s entire spectrum of web and mobile platforms, starting with technology stories. Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Post subscribers live busy lives with limited time to read the news. The goal is to unlock the Post’s world-class written journalism in audio form and give readers a convenient way to stay up to date on the news, like listening while doing other things.
In The Post’s announcement, Kat Down Mulder, managing editor says, “Whether you’re listening to a story while multitasking or absorbing a compelling narrative while on a walk, audio unlocks new opportunities to engage with our journalism in more convenient ways. We saw that trend throughout last year as readers who listened to audio articles on our apps engaged more than three times longer with our content. We’re doubling-down on our commitment to audio and will be experimenting rapidly and boldly in this space. The full integration of Amazon Polly within our publishing ecosystem is a big step that offers readers this powerful convenience feature at scale, while ensuring a high-quality and consistent audio experience across all our platforms for our subscribers and readers.”
Integrating Amazon Polly into The Post’s publishing workflow has been easy and straightforward. When an article is ready for publication, the written content management system (CMS) publishes the text article and simultaneously sends the text to the audio CMS, where the article text is processed by Amazon Polly to produce an audio recording of the article. The audio is delivered as an mp3 and published in conjunction with the written portion of the article.
Figure 1: High-level architecture of The Washington Post article creation workflow
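The Post’s internal integration isn’t public, but the core Amazon Polly call in a workflow like the one in Figure 1 is small. The following is a minimal, illustrative sketch using boto3; the voice, engine, and file handling are assumptions for demonstration, and long articles would need to be split or sent through the asynchronous start_speech_synthesis_task API, which writes the audio directly to Amazon S3.

import boto3

polly = boto3.client("polly")

def synthesize_article(article_text, output_path="article.mp3"):
    """Convert article text to an MP3 file with Amazon Polly (illustrative defaults)."""
    response = polly.synthesize_speech(
        Text=article_text,
        OutputFormat="mp3",
        VoiceId="Joanna",   # assumed voice; any Amazon Polly voice works here
        Engine="neural",    # neural voices generally sound more natural
    )
    with open(output_path, "wb") as f:
        f.write(response["AudioStream"].read())
    return output_path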
Last year, The Post began testing article narration using the text-to-speech, accessibility capabilities in iOS and Android operating systems. While there were promising signs around engagement, some noted that the voices sounded robotic. The Post started testing other options and ended up choosing Amazon Polly because of its high-quality automated voices. “We’ve tested users’ perceptions to both human and automated voices and found high levels of satisfaction with Amazon Polly’s offering. Integrating Amazon Polly into our publishing workflow also gives us the ability to offer a consistent listening experience across platforms and experiment with new functions that we believe our subscribers will enjoy.” says Ryan Luu, senior product manager at The Post.
Over the coming months, The Post will be adding voice support for new sections, new languages and better usability. “We plan to introduce new features like more playback controls, text highlighting as you listen, and audio versions of Spanish articles,” said Luu. “We also hope to give readers the ability to create audio playlists to make it easy for subscribers to queue up stories they’re interested in and enjoy that content on the go.”
Amazon Polly is a text-to-speech service that powers audio access to news articles for media publishers like Gannett (the publisher of USA Today), The Globe and Mail (the biggest newspaper in Canada), and leading publishing companies such as BlueToad and Trinity Audio. In addition, Amazon Polly provides natural sounding voices in a variety of languages and personas to give content a voice in other sectors such as education, healthcare, and gaming.
For more information, see What Is Amazon Polly? and log in to the Amazon Polly console to try it out for free. To experience The Post’s new audio articles, listen to the story “Did you get enough steps in today? Maybe one day you’ll ask your ‘smart’ shirt.”
About the Author
Esther Lee is a Product Manager for AWS Language AI Services. She is passionate about the intersection of technology and education. Out of the office, Esther enjoys long walks along the beach, dinners with friends and friendly rounds of Mahjong.
3 questions with Philip Resnik: Analyzing social media to understand the risks of suicide
Resnik is a featured speaker at the first virtual Amazon Web Services Machine Learning Summit on June 2.
Build an anomaly detection model from scratch with Amazon Lookout for Vision
A common problem in manufacturing is verifying that products meet quality standards. You can use manual inspection on a subset of the products, but it’s usually not scalable enough to meet demand as production grows. In this post, I go through the steps of creating an end-to-end machine vision solution that identifies visual anomalies in products using Amazon Lookout for Vision. I show you how to train an anomaly detection model, use it in real time, update it when new data is available, and monitor it.
Solution overview
Imagine a factory producing Lego bricks. The bricks are transported on a conveyor belt in front of a camera that determines if they meet the factory’s quality standards. When a brick on the belt breaks a light beam, the device takes a photo and sends it to Amazon Lookout for Vision for anomaly detection. If a defective brick is identified, it’s pushed off the belt by a pusher.
The following diagram illustrates the architecture of our anomaly detection solution, which uses Amazon Lookout for Vision, Amazon Simple Storage Service (Amazon S3), and a Raspberry Pi.
Amazon Lookout for Vision is a machine learning (ML) service that uses machine vision to help you identify visual defects in products without needing any ML experience. It uses deep learning to remove the need for carefully calibrated environments in terms of lighting and camera angle, which many existing machine vision techniques require.
To get started with Amazon Lookout for Vision, you need to provide data for the service to use when training the underlying deep learning models. The dataset used in this post consists of 289 normal and 116 anomalous images of a Lego brick, which are hosted in an S3 bucket that I have made public so you can download the dataset.
To make the scenario more realistic, I’ve varied the lighting and camera position between images. Additionally, I use 20 test images and 9 new images to update the model later on with both normal and anomalous images. The anomalous images were created by drawing on and scratching the brick, changing the brick color, adding other bricks, and breaking off small pieces to simulate production defects. The following image shows the physical setup used when collecting training images.
Prerequisites
To follow along with this post, you’ll need the following:
- An AWS account to train and use Amazon Lookout for Vision
- A camera (for this post, I use a Pi camera)
- A device that can run code (I use a Raspberry Pi 4)
Train the model
To use the dataset when training a model, you first upload the training data to Amazon S3 and create an Amazon Lookout for Vision project. A project is an abstraction around the training dataset and multiple model versions. You can think of a project as a collection of the resources that relate to a specific machine vision use case. For instance, in this post, I use one dataset but create multiple model versions as I gradually optimize the model for the use case with new data, all within the boundaries of one project.
You can use the SDK, AWS Command Line Interface (AWS CLI), and AWS Management Console to perform all the steps required to create and train a model. For this post, I use a combination of the AWS CLI and the console to train and start the model, and use the SDK to send images for anomaly detection from the Raspberry Pi.
To train the model, we complete the following high-level steps:
- Upload the training data to Amazon S3.
- Create an Amazon Lookout for Vision project.
- Create an Amazon Lookout for Vision dataset.
- Train the model.
Upload the training data to Amazon S3
To get started, complete the following steps:
- Download the dataset to your computer.
- Create an S3 bucket and upload the training data.
I named my bucket l4vdemo, but bucket names need to be globally unique, so make sure to change it if you copy the following code. Make sure to keep the folder structure in the dataset, because Amazon Lookout for Vision uses it to label normal and anomalous images automatically based on folder name. You could use the integrated labeling tool on the Amazon Lookout for Vision console or Amazon SageMaker Ground Truth to label the data, but the automatic labeler allows you to keep the folder structure and save some time.
aws s3 sync s3://aws-ml-blog/artifacts/Build-an-anomaly-detection-model-from-scratch-with-L4V/ data/
aws s3 mb s3://l4vdemo
aws s3 sync data s3://l4vdemo
Create an Amazon Lookout for Vision project
You’re now ready to create your project.
- On the Amazon Lookout for Vision console, choose Projects in the navigation pane.
- Choose Create project.
- For Project name, enter a name.
- Choose Create project.
Create the dataset
For this post, I create a single dataset and import the training data from the S3 bucket I uploaded the data to in Step 1.
- Choose Create dataset.
- Select import images from S3 bucket.
- For S3 URI, enter the URI for your bucket (for this post, s3://l4vdemo/, but make sure to use the unique bucket name you created).
- For Automatic labeling, select Automatically attach labels to images based on the folder name.
This allows you to use the existing folder structure to infer whether your images are normal or anomalous.
- Choose Create dataset.
Train the model
After we create the dataset, the number of labeled and unlabeled images should be visible in the Filters pane, as well as the number of normal and anomalous images.
- To start training a deep learning model, choose Train model.
Model training can take a few hours depending on the number of images in the training dataset.
- When training is complete, in the navigation pane, choose Models under your project.
You should see the newly created model listed with a status of Training complete.
- Choose the model to see performance metrics like precision, recall and F1 score, training duration, and more model metadata.
Use the model
Now that a model is trained, let’s test it on data it hasn’t seen before. To use the model, you must first start hosting it to provision all backend resources required to perform real-time inference.
aws lookoutvision start-model \
--project-name lego-demo \
--model-version 1 \
--min-inference-units 1
When starting the model hosting, you pass both project name and model version as arguments to identify the model. You also need to specify the number of inference units to use; each unit enables approximately five requests per second.
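If you’d rather start the model from code (for example, from the device that will send images), a minimal boto3 sketch might look like the following. The waiting loop and the 30-second polling interval are my own assumptions, not part of the original walkthrough.

import time
import boto3

l4v = boto3.client("lookoutvision")

def start_and_wait(project="lego-demo", version="1"):
    """Start hosting the model and wait until it is ready for inference."""
    l4v.start_model(ProjectName=project, ModelVersion=version, MinInferenceUnits=1)
    while True:
        status = l4v.describe_model(ProjectName=project,
                                    ModelVersion=version)["ModelDescription"]["Status"]
        if status == "HOSTED":
            return
        time.sleep(30)  # starting the model typically takes a few minutes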
To use the hosted model, use the detect-anomalies command and pass in the project and model version along with the image to perform inference on:
aws lookoutvision detect-anomalies \
--project-name lego-demo \
--model-version 1 \
--content-type image/jpeg \
--body test/test-1611853160.2488434.jpg
The test folder in the dataset contains 20 images, and I encourage you to test the model with different images.
When performing inference on an anomalous brick, the response could look like the following:
{
"DetectAnomalyResult": {
"Source": {
"Type": "direct"
},
"IsAnomalous": true,
"Confidence": 0.9754859209060669
}
}
The IsAnomalous flag is true, and Amazon Lookout for Vision also provides a confidence score that tells you how sure the model is of its classification. The service always provides a binary classification, but you can use the confidence score to make more well-informed decisions, such as whether to scrap the brick directly or send it for manual inspection. You could also persist images with lower confidence scores and use them to update the model, which I show you how to do in the next section.
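As an example, the routing logic described above could look like the following minimal sketch; the 0.7 review threshold and the return labels are assumptions for illustration.

def route_brick(result, review_threshold=0.7):
    """Decide what to do with a brick based on the detect-anomalies response."""
    detection = result["DetectAnomalyResult"]
    if not detection["IsAnomalous"]:
        return "keep"
    # Low-confidence anomalies go to manual inspection instead of being scrapped
    return "scrap" if detection["Confidence"] >= review_threshold else "manual_review"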
Keep in mind that you’re charged for the model as long as it’s running, so stop it when you no longer need it:
aws lookoutvision stop-model \
--project-name lego-demo \
--model-version 1
Update the model
As new data becomes available, you may want to maintain or update the model to accommodate new types of defects and increase the model’s overall performance. The dataset contains nine images in the new-data folder, which I use to update the model. To update an Amazon Lookout for Vision model, you run a trial detection, verify the machine predictions to correct them where needed, and add the verified images to your training dataset.
Run a trial detection
To run a trial detection, complete the following steps:
- On the Amazon Lookout for Vision console, under your model in the navigation pane, choose Trial detections.
- Choose Run trial detection.
- For Trial name, enter a name.
- For Import images, select Import images from S3 bucket.
- For S3 URI, enter the URI of the new-data folder that you uploaded in Step 1 of training the model
- Choose Run trial
Verify machine predictions
When the trial detection is complete, you can verify the machine predictions.
- Choose Verify machine predictions.
- Select either Correct or Incorrect to label the images
- When all the images have been labeled, choose Add verified images to dataset.
This updates your training dataset with the new data.
Retrain the model
After you update your training dataset with the new data, you can see that the number of labeled images in your dataset has increased, along with the number of verified images.
- Choose Train model to train a new version of the model.
- When the new model is training, on the Models page, you can verify that a new model version is being trained. When the training is complete, you can view model performance metrics on the Models page and start using the new version.
Anomaly detection application
Now that I’ve trained my model, let’s use it with the Raspberry Pi to sort Lego bricks. In this use case, I’ve set up a Raspberry Pi with a camera that gets triggered whenever a break beam sensor senses a Lego brick. We use the following code:
import boto3
from picamera import PiCamera

import my_break_beam_sensor  # custom module wrapping the break beam sensor
import my_pusher             # custom module wrapping the pusher

l4v_client = boto3.client('lookoutvision')
image_path = '/home/pi/Desktop/my_image.jpg'

with PiCamera() as camera:
    while True:
        if my_break_beam_sensor.isBroken():  # Replace with your own sensor.
            # Capture an image of the brick and send it for anomaly detection
            camera.capture(image_path)
            with open(image_path, 'rb') as image:
                response = l4v_client.detect_anomalies(ProjectName='lego-demo',
                                                       ContentType='image/jpeg',
                                                       Body=image.read(),
                                                       ModelVersion='2')
            is_anomalous = response['DetectAnomalyResult']['IsAnomalous']
            if is_anomalous:
                my_pusher.push()  # Replace with your own pusher.
Monitoring the model
When the system is up and running, you can use the Amazon Lookout for Vision dashboard to visualize metadata from the projects you have running, such as the number of detected anomalies during a specific time period. The dashboard provides an overview of all current projects, as well as aggregated information like total anomaly ratio.
Pricing
The cost of the solution is based on the time to train the model and the time the model is running. You can divide the cost across all analyzed products to get a per-product cost. Assuming one brick is analyzed per second nonstop for a month using 1 inference unit, the cost of the solution, excluding hardware and training, is around $0.001 per brick. However, if you increase production speed and analyze five bricks per second, the cost drops to around $0.0002 per brick.
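The arithmetic behind those numbers is straightforward. The sketch below assumes a flat hourly rate of roughly $4 per inference unit and a 30-day month; the rate is an assumption for illustration, so check current Amazon Lookout for Vision pricing.

HOURLY_RATE = 4.0          # assumed USD per inference unit per hour
HOURS_PER_MONTH = 24 * 30  # 30-day month

def cost_per_item(items_per_second, inference_units=1):
    """Rough per-item inference cost, excluding hardware and training."""
    monthly_cost = HOURLY_RATE * HOURS_PER_MONTH * inference_units
    monthly_items = items_per_second * 3600 * HOURS_PER_MONTH
    return monthly_cost / monthly_items

# cost_per_item(1) is about $0.0011 per brick; cost_per_item(5) is about $0.0002.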
Conclusion
Now you know how to use Amazon Lookout for Vision to train, run, update, and monitor an anomaly detection application. The use case in this post is of course simplified; you will have other requirements specific to your needs. Many factors affect the total end-to-end latency when performing inference on an image. The Amazon Lookout for Vision model runs in the cloud, which means that you need to evaluate and test network availability and bandwidth to ensure that the requirements can be met. To avoid creating bottlenecks, you can use a circuit breaker in your application to manage timeouts and prevent congestion in case of network issues.
Now that you know how to train, test, use, and update an ML model for anomaly detection, try it out with your own data! To get further details about Amazon Lookout for Vision, please visit the webpage.
About the Authors
Niklas Palm is a Solutions Architect at AWS in Stockholm, Sweden, where he helps customers across the Nordics succeed in the cloud. He’s particularly passionate about serverless technologies along with IoT and machine learning. Outside of work, Niklas is an avid cross-country skier and snowboarder as well as a master egg boiler.
From pure mathematician to Amazon applied scientist
Early on, Giovanni Paolini knew little about machine learning — now he’s leading new science on artificial intelligence that could inform AWS products.
Build an intelligent search solution with automated content enrichment
Unstructured data belonging to the enterprise continues to grow, making it a challenge for customers and employees to get the information they need. Amazon Kendra is a highly accurate intelligent search service powered by machine learning (ML). It helps you easily find the content you’re looking for, even when it’s scattered across multiple locations and content repositories.
Amazon Kendra leverages deep learning and reading comprehension to deliver precise answers. It offers natural language search for a user experience that’s like interacting with a human expert. When documents don’t have a clear answer or if the question is ambiguous, Amazon Kendra returns a list of the most relevant documents for the user to choose from.
To help narrow down a list of relevant documents, you can assign metadata at the time of document ingestion to provide filtering and faceting capabilities, for an experience similar to the Amazon.com retail site where you’re presented with filtering options on the left side of the webpage. But what if the original documents have no metadata, or users have a preference for how this information is categorized? You can automatically generate metadata using ML in order to enrich the content and make it easier to search and discover.
This post outlines how you can automate and simplify metadata generation using Amazon Comprehend Medical, a natural language processing (NLP) service that uses ML to find insights related to healthcare and life sciences (HCLS) such as medical entities and relationships in unstructured medical text. The metadata generated is then ingested as custom attributes alongside documents into an Amazon Kendra index. For repositories with documents containing generic information or information related to domains other than HCLS, you can use a similar approach with Amazon Comprehend to automate metadata generation.
To demonstrate an intelligent search solution with enriched data, we use Wikipedia pages of the medicines listed in the World Health Organization (WHO) Model List of Essential Medicines. We combine this content with metadata automatically generated using Amazon Comprehend Medical, into a unified Amazon Kendra index to make it searchable. You can visit the search application and try asking it some questions of your own, such as “What is the recommended paracetamol dose for an adult?” The following screenshot shows the results.
Solution overview
We take a two-step approach to custom content enrichment during the content ingestion process:
- Identify the metadata for each document using Amazon Comprehend Medical.
- Ingest the document along with the metadata in the search solution based on an Amazon Kendra index.
Amazon Comprehend Medical uses NLP to extract medical insights about the content of documents by extracting medical entities such as medication, medical condition, and anatomical location, the relationships between entities such as route and medication, and traits such as negation. In this example, for the Wikipedia page of each medicine from the WHO Model List of Essential Medicines, we use the DetectEntitiesV2 operation of Amazon Comprehend Medical to detect the entities in the categories ANATOMY, MEDICAL_CONDITION, MEDICATION, PROTECTED_HEALTH_INFORMATION, TEST_TREATMENT_PROCEDURE, and TIME_EXPRESSION. We use these entities to generate the document metadata.
We prepare the Amazon Kendra index by defining custom attributes of type STRING_LIST corresponding to the entity categories ANATOMY, MEDICAL_CONDITION, MEDICATION, PROTECTED_HEALTH_INFORMATION, TEST_TREATMENT_PROCEDURE, and TIME_EXPRESSION. For each document, the DetectEntitiesV2 operation of Amazon Comprehend Medical returns a categorized list of entities. Each entity from this list with a sufficiently high confidence score (for this use case, greater than 0.97) is added to the custom attribute corresponding to its category. After all the detected entities are processed in this way, the populated attributes are used to generate the metadata JSON file corresponding to that document. Amazon Kendra has an upper limit of 10 strings for an attribute of STRING_LIST type. In this example, we take the top 10 entities with the highest frequency of occurrence in the processed document.
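A minimal sketch of this metadata-generation logic is shown below. It assumes boto3 access to Amazon Comprehend Medical, omits the chunking needed for texts above the DetectEntitiesV2 size limit, and writes the attributes in the document metadata JSON layout used by the Amazon Kendra S3 connector; the helper name and the file naming in the comment are illustrative, not taken from the solution's code.

import json
from collections import Counter

import boto3

CATEGORIES = ["ANATOMY", "MEDICAL_CONDITION", "MEDICATION",
              "PROTECTED_HEALTH_INFORMATION", "TEST_TREATMENT_PROCEDURE",
              "TIME_EXPRESSION"]
THRESHOLD = 0.97  # confidence threshold used in this post
TOP_N = 10        # Amazon Kendra STRING_LIST attributes hold at most 10 strings

cm_client = boto3.client("comprehendmedical")

def build_metadata(document_text):
    """Return a Kendra metadata dictionary for one document (illustrative helper)."""
    counts = {category: Counter() for category in CATEGORIES}
    response = cm_client.detect_entities_v2(Text=document_text)
    for entity in response["Entities"]:
        if entity["Score"] > THRESHOLD and entity["Category"] in counts:
            counts[entity["Category"]][entity["Text"]] += 1
    # Keep the 10 most frequent entities per category
    attributes = {category: [text for text, _ in counter.most_common(TOP_N)]
                  for category, counter in counts.items() if counter}
    return {"Attributes": attributes}

# Example: write doc.txt.metadata.json alongside the extracted document text
# with open("doc.txt") as doc, open("doc.txt.metadata.json", "w") as meta:
#     json.dump(build_metadata(doc.read()), meta, indent=2)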
After the metadata JSON files for all the documents are created, they’re copied to the Amazon Simple Storage Service (Amazon S3) bucket configured as a data source to the Amazon Kendra index, and a data source sync is performed to ingest the documents in the index along with the metadata.
Prerequisites
To deploy and work with the solution in this post, make sure you have the following:
- An AWS account with privileges to create AWS Identity and Access Management (IAM) roles and policies. For more information, see Overview of access management: Permissions and policies.
- Basic knowledge of AWS and the AWS Command Line Interface (AWS CLI). For more information about the AWS CLI, see AWS CLI Command Reference.
- An S3 bucket to store the documents and metadata. For more information, see Creating a bucket and What is Amazon S3?
- Access to AWS CloudShell, Amazon Kendra, and Amazon Comprehend Medical.
Architecture
We use the AWS CloudFormation template medkendratemplate.yaml
to deploy an Amazon Kendra index with the custom attributes of type STRING_LIST corresponding to the entity categories ANATOMY
, MEDICAL_CONDITION
, MEDICATION
, PROTECTED_HEALTH_INFORMATION
, TEST_TREATMENT_PROCEDURE
, and TIME_EXPRESSION
.
The following diagram illustrates our solution architecture.
Based on this architecture, the steps to build and use the solution are as follows:
- On CloudShell, a Bash script called getpages.sh downloads the Wikipedia pages of the medicines and stores them as text files.
- A Python script called
meds.py
, which contains the core logic of the automation of the metadata generation, makes the detect_entities_v2 API call to Amazon Comprehend Medical to detect entities for each of the Wikipedia pages and generate metadata based on the entities returned. The steps used in this script are as follows:- Split the Wikipedia page text into chunks smaller than the maximum text size allowed by the
detect_entities_v2
API call. - Make the
detect_entities_v2
call. - Filter the entities detected by the
detect_entities_v2
call using a threshold confidence score (0.97 for this example). - Keep track of each unique entity corresponding to its category and the frequency of occurrence of that entity.
- For each entity category, sort the entities in that category from highest to lowest frequency of occurrence and select the top 10 entities.
- Create a metadata object based on the selected entities and output it in JSON format.
- Split the Wikipedia page text into chunks smaller than the maximum text size allowed by the
- We use the AWS CLI to copy the text data and the metadata to the S3 bucket that is configured as a data source to the Amazon Kendra index using the S3 connector.
- We perform a data source sync using the Amazon Kendra console to ingest the contents of the documents along with the metadata in the Amazon Kendra index.
- Finally, we use the Amazon Kendra search console to make queries to the index.
Create an Amazon S3 bucket to be used as a data source
Create an Amazon S3 bucket that you will use as a data source for the Amazon Kendra index.
Deploy the infrastructure as a CloudFormation stack
To deploy the infrastructure and resources for this solution, complete the following steps:
In a separate browser tab, open the AWS Management Console, and make sure that you’re logged in to your AWS account. Click the following button to launch the CloudFormation stack to deploy the infrastructure.
After that you should see a page similar to the following image:
For S3DataSourceBucket, enter your data source bucket name without the s3:// prefix, select I acknowledge that AWS CloudFormation might create IAM resources with custom names, and then choose Create stack.
Stack creation can take 30–45 minutes to complete. You can monitor the stack creation status on the Stack info tab. You can also look at the different tabs, such as Events, Resources, and Template. While the stack is being created, you can work on getting the data and generating the metadata in the next few steps.
Get the data and generate the metadata
To fetch your data and start generating metadata, complete the following steps:
- On the AWS Management Console, choose the AWS CloudShell icon (circled in red in the following screenshot) to start AWS CloudShell.
- Copy the file code-data.tgz and extract the contents by using the following commands on AWS CloudShell:
aws s3 cp s3://aws-ml-blog/artifacts/build-an-intelligent-search-solution-with-automated-content-enrichment/code-data.tgz .
tar zxvf code-data.tgz
- Change the working directory to code-data:
cd code-data
At this point, you can choose to run the end-to-end workflow of getting the data, creating the metadata using Amazon Comprehend Medical (which takes about 35–40 minutes), and then ingesting the data along with the metadata in the Amazon Kendra index, or just complete the last step to ingest the data with the metadata that has been generated using Amazon Comprehend Medical and supplied in the package for convenience.
- To use the metadata supplied in the package, enter the following code and then jump to Step 6:
tar zxvf med-data-meta.tgz
- Perform this step to get a hands-on experience of building the end-to-end solution. The following command runs a bash script called main.sh, which calls the following scripts:
prereq.sh
to install prerequisites and create subdirectories to store data and metadatagetpages.sh
to get the Wikipedia pages of medicines in the listgetmetapar.sh
to call themeds.py
Python script for each document
./main.sh
The Python script meds.py
contains the logic to make the get_entities_v2
call to Amazon Comprehend Medical and then process the output to produce the JSON metadata file. It takes about 30–40 minutes for this to complete.
While performing Step 5, if CloudShell times out, security tokens get refreshed, or the script stops before all the data is processed, start the CloudShell session again and run getmetapar.sh
, which starts the data processing from the point it was stopped:
./getmetapar.sh
- Upload the data and metadata to the S3 bucket being used as the data source for the Amazon Kendra index using the following AWS CLI commands:
aws s3 cp Data/ s3://<REPLACE-WITH-NAME-OF-YOUR-S3-BUCKET>/Data/ --recursive
aws s3 cp Meta/ s3://<REPLACE-WITH-NAME-OF-YOUR-S3-BUCKET>/Meta/ --recursive
Review Amazon Kendra configuration and start the data source sync
Before starting this step, make sure that the CloudFormation stack creation is complete. In the following steps, we start the data source sync to begin crawling and indexing documents.
- On the Amazon Kendra console, choose the index AuthKendraIndex, which was created as part of the CloudFormation stack.
- In the navigation pane, choose Data sources.
- On the Settings tab, you can see the data source bucket being configured.
- Choose the data source and choose Sync now.
The data source sync can take 10–15 minutes to complete.
Observe Amazon Kendra index facet definition
In the navigation pane, choose Facet definition. The following screenshot shows the entries for ANATOMY, MEDICAL_CONDITION, MEDICATION, PROTECTED_HEALTH_INFORMATION, TEST_TREATMENT_PROCEDURE, and TIME_EXPRESSION. These are the categories of the entities detected by Amazon Comprehend Medical, and they are defined as custom attributes in the CloudFormation template that we used to create the Amazon Kendra index. The facetable check boxes for PROTECTED_HEALTH_INFORMATION and TIME_EXPRESSION aren’t selected, therefore these categories aren’t shown in the facets of the search user interface.
Query the repository of WHO Model List of Essential Medicines
We’re now ready to make queries to our search solution.
- On the Amazon Kendra console, navigate to your index and choose Search console.
- In the search field, enter the query What is the treatment for diabetes?
The following screenshot shows the results.
- Choose Filter search results to see the facets.
The headings of MEDICATION, ANATOMY, MEDICAL_CONDITION, and TEST_TREATMENT_PROCEDURE are the categories defined as Amazon Kendra facets, and the list of items underneath them are the entities of these categories as detected by Amazon Comprehend Medical in the documents being searched. PROTECTED_HEALTH_INFORMATION and TIME_EXPRESSION are not shown.
- Under MEDICAL_CONDITION, select pregnancy to refine the search results.
You can go back to the Facet definition page, make PROTECTED_HEALTH_INFORMATION and TIME_EXPRESSION facetable, and save the configuration. Go back to the search console, make a new query, and observe the facets again. Experiment with these facets to see what suits your needs best.
Make additional queries and use the facets to refine the search results. You can use the following queries to get started, but you can also experiment with your own:
- What is a common painkiller?
- Is paracetamol safe for children?
- How to manage high blood pressure?
- When should BCG vaccine be administered?
You can observe how domain-specific facets improve the search experience.
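If you prefer to drive the same experience from code, the equivalent programmatic query might look like the following sketch. The index ID and the facet value are placeholders, and the helper name is illustrative.

import boto3

kendra = boto3.client("kendra")

def search_with_facet(index_id, question, condition):
    """Query the index and filter on the MEDICAL_CONDITION custom attribute."""
    return kendra.query(
        IndexId=index_id,
        QueryText=question,
        AttributeFilter={
            "ContainsAny": {
                "Key": "MEDICAL_CONDITION",
                "Value": {"StringListValue": [condition]},
            }
        },
    )

# response = search_with_facet("<<your_index_id>>", "What is the treatment for diabetes?", "pregnancy")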
Infrastructure cleanup
To delete the infrastructure that was deployed as part of the CloudFormation stack, delete the stack from the AWS CloudFormation console. Stack deletion can take 20–30 minutes.
When the stack status shows as Delete Complete, go to the Events tab and confirm that each of the resources has been removed. You can also cross-verify by checking on the Amazon Kendra console that the index is deleted.
You must delete your data source bucket separately because it wasn’t created as part of the CloudFormation stack.
Conclusion
In this post, we demonstrated how to automate the process to enrich the content by generating domain-specific metadata for an Amazon Kendra index using Amazon Comprehend or Amazon Comprehend Medical, thereby improving the user experience for the search solution.
This example used the entities detected by Amazon Comprehend Medical to generate the Amazon Kendra metadata. Depending on the domain of the content repository, you can use a similar approach with the pretrained model or custom trained models of Amazon Comprehend. Try out our solution and let us know what you think! You can further enhance the metadata by using other elements such as protected health information (PHI) for Amazon Comprehend Medical and events, key phrases, personally identifiable information (PII), dominant language, sentiment, and syntax for Amazon Comprehend.
About the Authors
Abhinav Jawadekar is a Senior Partner Solutions Architect at Amazon Web Services. Abhinav works with AWS partners to help them in their cloud journey.
Udi Hershkovich has been a Principal WW AI/ML Service Specialist at AWS since 2018. Prior to AWS, Udi held multiple leadership positions with AI startups and Enterprise initiatives including co-founder and CEO at LeanFM Technologies, offering ML-powered predictive maintenance in facilities management, CEO of Safaba Translation Solutions, a machine translation startup acquired by Amazon in 2015, and Head of Professional Services for Contact Center Intelligence at Amdocs. Udi holds Law and Business degrees from the Interdisciplinary Center in Herzliya, Israel, and lives in Pittsburgh, Pennsylvania, USA.
Using hyperboloids to embed knowledge graphs
Novel embedding scheme enables a 7% to 33% improvement over its best-performing predecessors in handling graph queries.