Introducing hybrid machine learning
Gartner predicts that by the end of 2024, 75% of enterprises will shift from piloting to operationalizing artificial intelligence (AI), and the vast majority of workloads will end up in the cloud in the long run. For some enterprises that plan to migrate to the cloud, the complexity, magnitude, and length of migrations may be daunting. The speed of different teams and their appetites for new tooling can vary dramatically. An enterprise’s data science team may be hungry for adopting the latest cloud technology, while the application development team is focused on running their web applications on premises. Even with a multi-year cloud migration plan, some of the product releases must be built on the cloud in order to meet the enterprise’s business outcomes.
For these customers, we propose hybrid machine learning (ML) patterns as an intermediate step in your journey to the cloud. Hybrid ML patterns are those that involve a minimum of two compute environments, typically local compute resources such as personal laptops or corporate data centers, and the cloud. With the hybrid ML architecture patterns described in this post, enterprises can achieve their desired business goals without having to wait for the cloud migration to complete. At the end of the day, we want to support customer success in all shapes and forms.
We have published a new whitepaper, Hybrid Machine Learning, to help you integrate the cloud with existing on-premises ML infrastructure. For more whitepapers from AWS, see AWS Whitepapers & Guides.
Hybrid ML architecture patterns
The whitepaper gives you an overview of the various hybrid ML patterns across the entire ML lifecycle, including ML model development, data preparation, training, deployment, and ongoing management. The following table summarizes the eight different hybrid ML architectural patterns we discuss in the whitepaper. For each pattern, we provide a preliminary reference architecture in addition to the advantages and disadvantages. We also identify a “when to move” criterion to help you make decisions—for example, when the level of effort to maintain and scale a given pattern has exceeded the value it provides.
| Development | Training | Deployment |
| --- | --- | --- |
| Develop on personal computers, train and host in the cloud | Train locally, deploy in the cloud | Serve ML models in the cloud to applications hosted on premises |
| Develop on local servers, train and host in the cloud | Store data locally, train and deploy in the cloud | Host ML models with Lambda@Edge to applications on premises |
| Develop in the cloud while connecting to data hosted on premises | Train with a third-party SaaS provider to host in the cloud | |
| | Train in the cloud, deploy ML models on premises | Orchestrate hybrid ML workloads with Kubeflow and Amazon EKS Anywhere |
In this post, we dive deep into the hybrid architecture pattern for deployment with a focus on serving models hosted in the cloud to applications hosted on premises.
Architecture overview
The most common use case for this hybrid pattern is enterprise migrations. Your data science team may be ready to deploy to the cloud, but your application team is still refactoring their code to host on cloud-native services. This approach enables the data scientists to bring their newest models to market, while the application team separately considers when, where, and how to move the rest of the application to the cloud.
The following diagram shows the architecture for hosting an ML model via Amazon SageMaker in an AWS Region, serving responses to requests from applications hosted on premises.
Technical deep dive
In this section, we dive deep into the technical architecture, focusing explicitly on the components that comprise the hybrid workload and referring to other resources as necessary.
Let’s take a real-world use case of a retail company whose application development team has hosted their ecommerce web application on premises. The company wants to improve brand loyalty, grow sales and revenue, and increase efficiencies by using data to create more sophisticated and unique customer experiences. They intend to increase customer engagement by 50% by adding a “recommended for you” widget on their home screen. However, they’re struggling to deliver personalized experiences because of the limitations of static, rule-based systems, the complexity and cost involved, and the friction of integrating with their current legacy, on-premises architecture.
The application team has a 5-year enterprise migration strategy to refactor their web application using cloud-native architecture to move to the cloud, whereas the data science teams are ready to begin implementation in the cloud. With the hybrid architecture pattern described in this post, the company can achieve their desired business outcome quickly without having to wait for the 5-year enterprise migration to complete.
The data scientists develop the ML model, perform training, and deploy the trained model in the cloud. The ecommerce web application that’s hosted on premises consumes the ML model via the exposed endpoints. Let’s walk this through in detail.
In the model development phase, data scientists can use local development environments, such as PyCharm or Jupyter installations on their personal computer, and then connect to the cloud via AWS Identity and Access Management (IAM) permissions and interface with AWS service APIs through the AWS Command Line Interface (AWS CLI) or an AWS SDK (such as Boto3). They also have the flexibility to use Amazon SageMaker Studio, a single web-based visual interface that comes with common data science packages and kernels preinstalled for model development.
Data scientists can take advantage of SageMaker training capabilities, including access to on-demand CPU and GPU instances, automatic model tuning, managed Spot Instances, checkpointing for saving the state of models, managed distributed training, and many more, using the SageMaker training SDK and APIs. For an overview on training models with SageMaker, see Train a Model with Amazon SageMaker.
After the model is trained, data scientists can deploy the models using SageMaker hosting capabilities and expose a REST HTTP(s) endpoint serving predictions to end applications hosted on premises. The application development teams can integrate their on-premises applications to interact with the ML model via SageMaker hosted endpoints to get the inference results. Application developers can access the deployed models through application programming interface (API) requests with response times as low as a few milliseconds. This supports use cases requiring real-time responses, such as personalized product recommendations.
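To make these training and hosting steps concrete, the following is a minimal sketch using the SageMaker Python SDK. The entry point script, framework version, instance types, and S3 paths are illustrative placeholders rather than recommendations from the whitepaper.

```python
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch

role = get_execution_role()  # or an explicit IAM role ARN when running outside SageMaker

# Train on managed capacity, using Spot Instances and checkpointing.
estimator = PyTorch(
    entry_point="train.py",                      # placeholder training script
    role=role,
    framework_version="1.8.1",
    py_version="py36",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    use_spot_instances=True,
    max_run=3600,
    max_wait=7200,
    checkpoint_s3_uri="s3://<your-bucket>/checkpoints/",
)
estimator.fit({"training": "s3://<your-bucket>/training-data/"})

# Host the trained model behind a managed HTTPS endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
print(predictor.endpoint_name)
```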
The on-premises client application connects with the ML model hosted on a SageMaker endpoint in AWS over a private network, using a VPN or AWS Direct Connect connection, to provide inference results to its end users. The client application can use any client library to invoke the endpoint with an HTTP POST request, the expected payload, and the necessary authentication credentials configured programmatically. SageMaker also provides commands and libraries that abstract some of these low-level details, such as authentication using the AWS credentials saved in the client application environment: the sagemaker-runtime invoke-endpoint command from the AWS CLI, the SageMaker runtime client from Boto3 (AWS SDK for Python), and the Predictor class from the SageMaker Python SDK.
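For example, an on-premises client could invoke the endpoint with the Boto3 SageMaker runtime client along the following lines. The endpoint name, Region, and payload schema are placeholders; the actual request format depends on how your model serializes input.

```python
import json

import boto3

# Credentials and Region come from the on-premises client environment
# (for example, environment variables or a shared credentials file).
runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

# Placeholder request schema; your model defines the real one.
payload = {"customer_id": "12345", "recently_viewed": ["sku-1", "sku-2"]}

response = runtime.invoke_endpoint(
    EndpointName="recommendations-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)

print(json.loads(response["Body"].read()))
```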
To make the endpoint accessible over the internet, we can use Amazon API Gateway. Although you can directly access SageMaker hosted endpoints from API Gateway, a common pattern you can use is adding an AWS Lambda function in between. You can use the Lambda function for any preprocessing, which may be needed in order to send the request in the format expected by the endpoint, or postprocessing for transforming the response into the format required by the client application. For more information, see Call an Amazon SageMaker model endpoint using Amazon API Gateway and AWS Lambda.
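The following is a hypothetical sketch of such a Lambda function: it reshapes the incoming API Gateway request into the format an imaginary model expects, invokes the endpoint, and wraps the prediction in a JSON response. The field names and content types are illustrative only.

```python
import json
import os

import boto3

runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = os.environ["ENDPOINT_NAME"]  # configured on the Lambda function


def handler(event, context):
    # Preprocess: convert the API Gateway JSON body into the CSV row
    # this hypothetical model expects.
    body = json.loads(event["body"])
    csv_row = ",".join(str(body[field]) for field in ("age", "income", "visits"))

    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",
        Body=csv_row,
    )

    # Postprocess: wrap the raw prediction in the JSON shape the client expects.
    prediction = response["Body"].read().decode("utf-8").strip()
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": prediction}),
    }
```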
The following diagram illustrates how the data science team develops the ML model, performs training, and deploys the trained model in the cloud, while the application development team develops and deploys the ecommerce web application on premises.
After the model is deployed into the production environment, your data scientists can use Amazon SageMaker Model Monitor to continuously monitor the quality of the ML models in real time. They can also set up automated alerts that trigger when deviations in model quality occur, such as data drift and anomalies. Amazon CloudWatch Logs collects the monitoring log files and notifies you when the quality of the model crosses certain thresholds. This enables your data scientists to take corrective actions, such as retraining models, auditing upstream systems, or fixing quality issues, without having to monitor models manually. With AWS managed services, your data science team can avoid the downside of implementing monitoring solutions from scratch.
Your data scientists can reduce the overall time required to deploy their ML models in production by automating load testing and model tuning across SageMaker ML instances by using Amazon SageMaker Inference Recommender. It helps your data scientists select the best instance type and configuration (such as instance count, container parameters, and model optimizations) for their ML models.
Lastly, it’s always a best practice to decouple hosting your ML model from hosting your application. In this approach, the data scientists use dedicated resources to host the ML model, separate from the application, which greatly simplifies the process of pushing better models. This is a key step in the innovation flywheel. It also prevents tight coupling between the hosted ML model and the application, allowing the model and the application to scale and evolve independently.
In addition to improving model performance with updated research trends, this approach provides the ability to redeploy a model with updated data. The global COVID-19 pandemic has demonstrated that markets change all the time, and ML models need to stay up to date with the latest trends. The only way you can deliver on that requirement is by being able to retrain and redeploy your model with updated data.
Conclusion
Check out the whitepaper Hybrid Machine Learning, in which we look at additional patterns for hosting ML models via Lambda@Edge, AWS Outposts, AWS Local Zones, and AWS Wavelength. We explore hybrid ML patterns across the entire ML lifecycle. We look at developing locally, while training and deploying in the cloud. We discuss patterns for training locally to deploy on the cloud, and even to host ML models in the cloud to serve applications on premises.
How are you integrating the cloud with your existing on-premises ML infrastructure? Please share your feedback about hybrid ML in the comments so we can continue to improve our products, features, and documentation. If you want to engage the authors of this document for advice on your cloud migration, contact us at hybrid-ml-support@amazon.com.
About the Authors
Alak Eswaradass is a Solutions Architect at AWS, based in Chicago, Illinois. She is passionate about helping customers design cloud architectures utilizing AWS services to solve business challenges. She hangs out with her daughters and explores the outdoors in her free time.
Emily Webber joined AWS just after SageMaker launched, and has been trying to tell the world about it ever since! Outside of building new ML experiences for customers, Emily enjoys meditating and studying Tibetan Buddhism.
Roop Bains is a Solutions Architect at AWS focusing on AI/ML. He is passionate about machine learning and helping customers achieve their business objectives. In his spare time, he enjoys reading and hiking.
Use deep learning frameworks natively in Amazon SageMaker Processing
Until recently, customers who wanted to use a deep learning (DL) framework with Amazon SageMaker Processing faced increased complexity compared to those using scikit-learn or Apache Spark. This post shows you how SageMaker Processing has simplified running machine learning (ML) preprocessing and postprocessing tasks with popular frameworks such as PyTorch, TensorFlow, Hugging Face, MXNet, and XGBoost.
Benefits of SageMaker Processing
Training an ML model takes many steps. One of them, data preparation, is paramount to creating an accurate ML model. A typical preprocessing step includes operations such as the following:
- Converting the dataset to the input format expected by the ML algorithm that you’re using
- Transforming existing features to a more expressive representation, such as one-hot encoding categorical features
- Rescaling or normalizing numerical features
- Engineering high-level features; for example, replacing mailing addresses with GPS coordinates
- Cleaning and tokenizing text for natural language processing (NLP) applications
- Resizing, centering, or augmenting images for computer vision applications
Likewise, you often need to run postprocessing jobs (for example, filtering or collating) and model evaluation jobs (scoring models against different test sets) as part of your ML model development lifecycle.
All these tasks involve running custom scripts on your dataset and saving the processed version for later use by your training jobs. In 2019, we launched SageMaker Processing, a capability of Amazon SageMaker that lets you run your preprocessing, postprocessing, and model evaluation workloads on a fully managed infrastructure. It does the heavy lifting for you, managing the infrastructure that runs your bespoke scripts. It spins up the necessary resources to do the job and tears them down when it’s done.
The SageMaker Python SDK provides a SageMaker Processing library that lets you do the following:
- Use scikit-learn data processing features through a built-in container image provided by SageMaker with a scikit-learn framework. You can instantiate the `SKLearnProcessor` class provided in the SageMaker Python SDK and feed it your scikit-learn script (see the sketch after this list).
- Use Apache Spark for distributed data processing through a built-in Apache Spark container image provided by SageMaker. Similar to the previous process, you can instantiate the `PySparkProcessor` class provided in the SageMaker Python SDK and feed it your PySpark script.
- Lastly, you can bring your own container to do the job. If you want preprocessing or postprocessing tasks to use libraries or frameworks other than scikit-learn and PySpark, you can package your custom code in a container. You then instantiate the `ScriptProcessor` class with your container image and feed it your data processing script.
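As a quick illustration of the first option, a minimal sketch that instantiates `SKLearnProcessor` and runs a scikit-learn script might look like the following; the framework version, instance settings, and S3 paths are placeholders.

```python
from sagemaker import get_execution_role
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor

sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=get_execution_role(),
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

sklearn_processor.run(
    code="preprocessing.py",  # placeholder scikit-learn script
    inputs=[
        ProcessingInput(
            source="s3://<your-bucket>/raw-data/",
            destination="/opt/ml/processing/input",
        )
    ],
    outputs=[
        ProcessingOutput(output_name="processed", source="/opt/ml/processing/output"),
    ],
)
```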
Before release 2.52 of the SageMaker Python SDK, using SageMaker Processing in combination with popular ML frameworks such as PyTorch, TensorFlow, Hugging Face, MXNet, and XGBoost required you to bring your own container. You had to first build a container and then make sure that it included the relevant framework and all its dependencies. We wanted to simplify data scientists’ lives by removing the need to create a custom container image for these popular frameworks. And we wanted to deliver the same consistent experience people already had with Processing when using scikit-learn or Spark.
In the following sections, we show you how to natively use popular ML frameworks such as PyTorch, TensorFlow, Hugging Face, or MXNet with SageMaker Processing, without having to build a single container.
Using machine learning / deep learning frameworks in SageMaker Processing
The introduction of `FrameworkProcessor`—in release 2.52 of the SageMaker Python SDK in August 2021—changed everything. You can now use SageMaker Processing with your preferred ML framework among PyTorch, TensorFlow, Hugging Face, MXNet, and XGBoost. ML practitioners can now focus on perfecting their data processing code instead of spending additional energy on maintaining the lifecycle of custom containers. Now you can use one of the built-in containers and classes provided by SageMaker to use the data processing features of any of the previously mentioned frameworks. For this post, we only test one framework: PyTorch. However, you can reproduce the same procedures for any of the four other supported frameworks. The differences from one framework to the next are in the `FrameworkProcessor` subclass being used, the framework release, and the specifics of each framework for the data processing script.
The dataset
To illustrate our solution, let’s imagine that we plan to train a model to classify animal pictures. We rely on a publicly available dataset, the COCO dataset, which contains images from Flickr representing a real-world dataset not pre-formatted or resized specifically for deep learning. This makes it a good fit for our example scenario. Before we even get to the training stage, our initial problem is that the images we want to use to train our model come in all forms and shapes. Therefore, to make sure that this doesn’t affect our training or impact the quality of our model, we preprocess the images. In particular, we make sure that they’re the same shape and size before moving any further.
The COCO dataset provides an annotation file that contains information on each image in the dataset, such as the class, superclass, file name, and URL to download the file. We limit the scope of the dataset for the sake of this example by only using animal images. For the train and validation sets, the data we need for the image labels and the file paths are under different headings in the annotations. We only use a small fraction of the dataset, sufficient for this example.
Processing logic
Before we train our model, all image data must have the same dimensions for length, width, and channel. Typically, algorithms use a square format, with identical length and width. However, most real-world datasets such as ours contain images in many different dimensions and ratios. To prepare our dataset for training, we need to resize and crop the images if they aren’t already square.
We also randomly augment the images to help our training algorithm generalize better. We only augment the training data, not the validation or test data, because we want to generate a prediction on the image as it normally would be presented for inference.
Our processing stage consists of two steps.
First, we instantiate the PyTorchProcessor class needed to run our bespoke data processing script:
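A minimal sketch of that instantiation is shown below. The framework version, instance settings, and job name are illustrative values, not requirements of the approach.

```python
from sagemaker import get_execution_role
from sagemaker.pytorch.processing import PyTorchProcessor

pytorch_processor = PyTorchProcessor(
    framework_version="1.8",             # version of the built-in PyTorch container (illustrative)
    role=get_execution_role(),
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="coco-image-preprocessing",
)
```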
Second, we need to pass it the instructions to conduct the actual data processing tasks that are contained in our script:
- The dataset (`coco-annotations.zip`) is automatically copied inside the container under the destination directory (`/opt/ml/processing/input`). We could add additional inputs if needed.
- This is where the Python script (`preprocessing.py`) reads it. By specifying `source_dir`, we instruct Processing where to find the script and any of its dependencies. For instance, in `source_dir` you can find an extra file (`script_utils.py`) used by our main script, and a file to make sure that all dependencies are satisfied (`requirements.txt`). We then also pass any command line arguments useful to our script.
- Our preprocessing script then processes the data, splits it three ways, and saves the files inside the container under `/opt/ml/processing/output/train`, `/opt/ml/processing/output/validation`, and `/opt/ml/processing/output/test`. We added an extra output to illustrate the flexibility of saving any useful data that results from that processing step for further use.
- When the job is complete, all outputs are automatically copied to your default SageMaker bucket in Amazon Simple Storage Service (Amazon S3).
We run this step with the following code:
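A representative invocation is sketched below. The S3 location of the annotations archive, the source directory name, and the command line arguments are placeholders you would adapt to your own setup.

```python
from sagemaker.processing import ProcessingInput, ProcessingOutput

pytorch_processor.run(
    code="preprocessing.py",   # main processing script
    source_dir="scripts",      # also holds script_utils.py and requirements.txt
    inputs=[
        ProcessingInput(
            source="s3://<your-bucket>/coco-annotations.zip",
            destination="/opt/ml/processing/input",
        )
    ],
    outputs=[
        ProcessingOutput(output_name="train", source="/opt/ml/processing/output/train"),
        ProcessingOutput(output_name="validation", source="/opt/ml/processing/output/validation"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/output/test"),
        ProcessingOutput(output_name="data_structured", source="/opt/ml/processing/output/data_structured"),
    ],
    arguments=["--samples", "100"],  # placeholder command line arguments for the script
)
```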
At the end of this processing step, after sampling our initial dataset, we restructure it to fit the structure expected by the major ML frameworks, and we center, crop, and augment the images. We also add an extra output (the `data_structured` folder) to save the restructured source data. This allows us to reuse the same dataset for further processing or training without restarting the whole preparation from scratch (that is, from the annotations file). More details on this can be found in the script. We’re now ready to proceed to the next stage and train our model.
Conclusion
In this post, we showed you how SageMaker Processing has simplified the use of the most popular ML frameworks, such as PyTorch, TensorFlow, MXNet, Hugging Face, and XGBoost. This is possible thanks to the introduction of FrameworkProcessor in the recent releases (2.52+) of the SageMaker Python SDK. You can now use the existing SageMaker containers provided natively for these frameworks with SageMaker Processing, and focus solely on your data processing code. Behind the scenes, SageMaker Processing manages the necessary infrastructure for you.
We hope this gave you a glimpse into the possibilities offered by SageMaker Processing. As a next step, you can look beyond preprocessing and postprocessing steps and consider the full lifecycle of an ML model. SageMaker Processing can play an active role before the training takes place but also post-training for any postprocessing tasks. You may want to also look at SageMaker Pipelines to automate the entire model lifecycle by crafting all these different steps together into a model pipeline.
This post was inspired by the post Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation when SageMaker Processing first launched. Check out the SageMaker Python SDK for more details on the other supported frameworks: Hugging Face, TensorFlow, MXNet, XGBoost.
Sample notebooks and scripts for all four supported frameworks are available on GitHub: PyTorch example, Hugging Face example, TensorFlow example, MXNet example.
If you have feedback about this post, let us know in the comments section. If you have questions about this post, start a new thread on one of the AWS Developer forums or contact AWS Support.
About the Authors
Patrick Sard works as a Solutions Architect at AWS in Brussels, Belgium. Apart from being a cloud enthusiast, Patrick loves practicing tai chi (preferably Chen style), enjoys an occasional wine-tasting (he trained as a sommelier), and is an avid tennis player.
Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, and started coding at the age of 7. He discovered AI/ML while at university, and has fallen in love with it since then.
Live call analytics for your contact center with Amazon language AI services
Your contact center connects your business to your community, enabling customers to order products, callers to request support, clients to make appointments, and much more. When calls go well, callers retain a positive image of your brand, and are likely to return and recommend you to others. And the converse, of course, is also true.
Naturally, you want to do what you can to ensure that your callers have a good experience. There are two aspects to this:
- Help supervisors assess the quality of your caller’s experiences in real time – For example, your supervisors need to know if initially unhappy callers become happier as the call progresses. And if not, why? What actions can be taken, before the call ends, to assist the agent to improve the customer experience for calls that aren’t going well?
- Help agents optimize the quality of your caller’s experiences – For example, can you deploy live call transcription? This removes the need for your agents to take notes during calls, freeing them to focus more attention on providing positive customer interactions.
Contact Lens for Amazon Connect provides real-time supervisor and agent assist features that could be just what you need, but you may not yet be using Amazon Connect. You need a solution that works with your existing contact center.
Amazon Machine Learning (ML) services like Amazon Transcribe and Amazon Comprehend provide feature-rich APIs that you can use to transcribe and extract insights from your contact center audio at scale. Although you could build your own custom call analytics solution using these services, that requires time and resources. In this post, we introduce our new sample solution for live call analytics.
Solution overview
Our new sample solution, Live Call Analytics (LCA), does most of the heavy lifting associated with providing an end-to-end solution that can plug into your contact center and provide the intelligent insights that you need.
It has a call summary user interface, as shown in the following screenshot.
It also has a call detail user interface.
LCA currently supports the following features:
- Accurate streaming transcription with support for personally identifiable information (PII) redaction and custom vocabulary
- Sentiment detection
- Automatic scaling to handle call volume changes
- Call recording and archiving
- A dynamically updated web user interface for supervisors and agents:
  - A call summary page that displays a list of in-progress and completed calls, with call timestamps, metadata, and summary statistics like duration and sentiment trend
  - Call detail pages showing live turn-by-turn transcription of the caller/agent dialog, turn-by-turn sentiment, and sentiment trend
- Standards-based telephony integration with your contact center using SIP-based media recording (SIPREC)
- A built-in standalone demo mode that allows you to quickly install and try out LCA for yourself, without needing to integrate with your contact center telephony
- Easy-to-install resources with a single AWS CloudFormation template
This is just the beginning! We expect to add many more exciting features over time, based on your feedback.
Deploy the CloudFormation stack
Start your LCA experience by using AWS CloudFormation to deploy the sample solution with the built-in demo mode enabled.
The demo mode downloads, builds, and installs a small virtual PBX server on an Amazon Elastic Compute Cloud (Amazon EC2) instance in your AWS account (using the free open-source Asterisk project) so you can make test phone calls right away and see the solution in action. You can integrate it with your contact center later after evaluating the solution’s functionality for your unique use case.
- Use the appropriate Launch Stack button for the AWS Region in which you’ll use the solution. We expect to add support for additional Regions over time.
- For Stack name, use the default value, `LiveCallAnalytics`.
- For Install Demo Asterisk Server, use the default value, `true`.
- For Allowed CIDR Block for Demo Softphone, use the IP address of your local computer with a network mask of `/32`.
To find your computer’s IP address, you can use the website checkip.amazonaws.com.
Later, you can optionally install a softphone application on your computer, which you can register with LCA’s demo Asterisk server. This allows you to experiment with LCA using real two-way phone calls.
If that seems like too much hassle, don’t worry! Simply leave the default value for this parameter and elect not to register a softphone later. You will still be able to test the solution. When the demo Asterisk server doesn’t detect a registered softphone, it automatically simulates the agent side of the conversation using a built-in audio recording.
- For Allowed CIDR List for SIPREC Integration, leave the default value.
This parameter isn’t used for demo mode installation. Later, when you want to integrate LCA with your contact center audio stream, you use this parameter to specify the IP address of your SIPREC source hosts, such as your Session Border Controller (SBC) servers.
- For Authorized Account Email Domain, use the domain name part of your corporate email address (this allows others with email addresses in the same domain to sign up for access to the UI).
- For Call Audio Recordings Bucket Name, leave the value blank to have an Amazon Simple Storage Service (Amazon S3) bucket for your call recordings automatically created for you. Otherwise, use the name of an existing S3 bucket where you want your recordings to be stored.
- For all other parameters, use the default values.
If you want to customize the settings later, for example to apply PII redaction or custom vocabulary to improve accuracy, you can update the stack for these parameters.
The main CloudFormation stack uses nested stacks to create the following resources in your AWS account:
- S3 buckets to hold build artifacts and call recordings
- An EC2 instance (t4g.large) with the demo Asterisk server installed, with VPC, security group, Elastic IP address, and internet gateway
- An Amazon Chime Voice Connector, configured to stream audio to Amazon Kinesis Video Streams
- An Amazon Elastic Container Service (Amazon ECS) instance that runs containers in AWS Fargate to relay streaming audio from Kinesis Video Streams to Amazon Transcribe and record transcription segments in Amazon DynamoDB, with VPC, NAT gateways, Elastic IP addresses, and internet gateway
- An AWS Lambda function to create and store final stereo call recordings
- A DynamoDB table to store call and transcription data, with Lambda stream processing that adds analytics to the live call data
- The AWS AppSync API, which provides a GraphQL endpoint to support queries and real-time updates
- Website components including S3 bucket, Amazon CloudFront distribution, and Amazon Cognito user pool
- Other miscellaneous supporting resources, including AWS Identity and Access Management (IAM) roles and policies (using least privilege best practices), Amazon Virtual Private Cloud (Amazon VPC) resources, Amazon EventBridge event rules, and Amazon CloudWatch log groups.
The stacks take about 20 minutes to deploy. The main stack status shows CREATE_COMPLETE when everything is deployed.
Create a user account
We now open the web user interface and create a user account.
- On the AWS CloudFormation console, choose the main stack, `LiveCallAnalytics`, and choose the Outputs tab.
- Open your web browser to the URL shown as `CloudfrontEndpoint` in the outputs.
You’re directed to the login page.
- Choose Create account.
- For Username, use your email address that belongs to the email address domain you provided earlier.
- For Password, use a sequence that has a length of at least 8 characters, and contains uppercase and lowercase characters, plus numbers and special characters.
- Choose CREATE ACCOUNT.
The Confirm Sign up page appears.
Your confirmation code has been emailed to the email address you used as your username. Check your inbox for an email from no-reply@verificationemail.com with the subject “Account Verification.”
- For Confirmation Code, copy and paste the code from the email.
- Choose CONFIRM.
Make a test phone call
Call the number shown as `DemoPBXPhoneNumber` in the AWS CloudFormation outputs for the main `LiveCallAnalytics` stack.
You haven’t yet registered a softphone app, so the demo Asterisk server picks up the call and plays a recording. Listen to the recording, and answer the questions when prompted. Your call is streamed to the LCA application, and is recorded, transcribed, and analyzed. When you log in to the UI later, you can see a record of this call.
Optional: Install and register a softphone
If you want to use LCA with live two-person phone calls instead of the demo recording, you can register a softphone application with your new demo Asterisk server.
The following README has step-by-step instructions for downloading, installing, and registering a free (for non-commercial use) softphone on your local computer. The registration is successful only if Allowed CIDR Block for Demo Softphone correctly reflects your local machine’s IP address. If you got it wrong, or if your IP address has changed, you can choose the `LiveCallAnalytics` stack in AWS CloudFormation, and choose Update to provide a new value for Allowed CIDR Block for Demo Softphone.
If you still can’t successfully register your softphone, and you are connected to a VPN, disconnect and update Allowed CIDR Block for Demo Softphone—corporate VPNs can restrict IP voice traffic.
When your softphone is registered, call the phone number again. Now, instead of playing the default recording, the demo Asterisk server causes your softphone to ring. Answer the call on the softphone, and have a two-way conversation with yourself! Better yet, ask a friend to call your Asterisk phone number, so you can simulate a contact center call by role playing as caller and agent.
Explore live call analysis features
Now, with LCA successfully installed in demo mode, you’re ready to explore the call analysis features.
- Open the LCA web UI using the URL shown as `CloudfrontEndpoint` in the main stack outputs.
We suggest bookmarking this URL—you’ll use it often!
- Make a test phone call to the demo Asterisk server (as you did earlier).
- If you registered a softphone, it rings on your local computer. Answer the call, or better, have someone else answer it, and use the softphone to play the agent role in the conversation.
- If you didn’t register a softphone, the Asterisk server demo audio plays the role of agent.
Your phone call almost immediately shows up at the top of the call list on the UI, with the status `In progress`.
The call has the following details:
- Call ID – A unique identifier for this telephone call
- Initiation Timestamp – Shows the time the telephone call started
- Caller Phone Number – Shows the number of the phone from which you made the call
- Status – Indicates that the call is in progress
- Caller Sentiment – The average caller sentiment
- Caller Sentiment Trend – The caller sentiment trend
- Duration – The elapsed time since the start of the call
- Choose the call ID of your `In progress` call to open the live call detail page.
As you talk on the phone from which you made the call, your voice and the voice of the agent are transcribed in real time and displayed in the auto scrolling Call Transcript pane.
Each turn of the conversation (customer and agent) is annotated with a sentiment indicator. As the call continues, the sentiment for both caller and agent is aggregated over a rolling time window, so it’s easy to see if sentiment is trending in a positive or negative direction.
- End the call.
- Navigate back to the call list page by choosing Calls at the top of the page.
Your call is now displayed in the list with the status Done.
- To display call details for any call, choose the call ID to open the details page, or select the call to display the Calls list and Call Details pane on the same page.
You can change the orientation to a side-by-side layout using the Call Details settings tool (gear icon).
You can make a few more phone calls to become familiar with how the application works. With the softphone installed, ask someone else to call your Asterisk demo server phone number: pick up their call on your softphone and talk with them while watching the turn-by-turn transcription update in real time. Observe the low latency. Assess the accuracy of transcriptions and sentiment annotation—you’ll likely find that it’s not perfect, but it’s close! Transcriptions are less accurate when you use technical or domain-specific jargon, but you can use custom vocabulary to teach Amazon Transcribe new words and terms.
Processing flow overview
How did LCA transcribe and analyze your test phone calls? Let’s take a quick look at how it works.
The following diagram shows the main architectural components and how they fit together at a high level.
The demo Asterisk server is configured to use Voice Connector, which provides the phone number and SIP trunking needed to route inbound and outbound calls. When you configure LCA to integrate with your contact center instead of the demo Asterisk server, Voice Connector is configured to integrate instead with your existing contact center using SIP-based media recording (SIPREC) or network-based recording (NBR). In both cases, Voice Connector streams audio to Kinesis Video Streams using two streams per call, one for the caller and one for the agent.
When a new video stream is initiated, an event is fired using EventBridge. This event triggers a Lambda function, which uses an Amazon Simple Queue Service (Amazon SQS) queue to initiate a new call processing job in Fargate, a serverless compute service for containers. A single container instance processes multiple calls simultaneously. AWS auto scaling provisions and de-provisions additional containers dynamically as needed to handle changing call volumes.
The Fargate container immediately creates a streaming connection with Amazon Transcribe and starts consuming and relaying audio fragments from Kinesis Video Streams to Amazon Transcribe.
The container writes the streaming transcription results in real time to a DynamoDB table.
A Lambda function, the Call Event Stream Processor, fed by DynamoDB streams, processes and enriches call metadata and transcription segments. The event processor function interfaces with AWS AppSync to persist changes (mutations) in DynamoDB and to send real-time updates to logged in web clients.
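As a generic illustration of this pattern (not the solution’s actual code), a DynamoDB Streams consumer in Lambda might look like the following sketch; the enrichment and publishing steps are stubbed out because the real logic and GraphQL schema are defined by the solution.

```python
from boto3.dynamodb.types import TypeDeserializer

deserializer = TypeDeserializer()


def handler(event, context):
    """Generic sketch of a DynamoDB Streams consumer, not the LCA implementation."""
    for record in event.get("Records", []):
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        # Convert the DynamoDB attribute-value format into plain Python types.
        item = {
            key: deserializer.deserialize(value)
            for key, value in record["dynamodb"]["NewImage"].items()
        }
        publish_update(enrich(item))


def enrich(item):
    # Placeholder: compute rolling sentiment or other per-call analytics here.
    return item


def publish_update(item):
    # Placeholder: send a GraphQL mutation to AWS AppSync so that subscribed
    # web clients receive the update in real time.
    pass
```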
The LCA web UI assets are hosted on Amazon S3 and served via CloudFront. Authentication is provided by Amazon Cognito. In demo mode, user identities are configured in an Amazon Cognito user pool. In a production setting, you would likely configure Amazon Cognito to integrate with your existing identity provider (IdP) so authorized users can log in with their corporate credentials.
When the user is authenticated, the web application establishes a secure GraphQL connection to the AWS AppSync API, and subscribes to receive real-time events such as new calls and call status changes for the calls list page, and new or updated transcription segments and computed analytics for the call details page.
The entire processing flow, from ingested speech to live webpage updates, is event driven, and so the end-to-end latency is small—typically just a few seconds.
Monitoring and troubleshooting
AWS CloudFormation reports deployment failures and causes on the relevant stack Events tab. See Troubleshooting CloudFormation for help with common deployment problems. Look out for deployment failures caused by limit exceeded errors; the LCA stacks create resources, such as NAT gateways and Elastic IP addresses, that are subject to default account and Region service quotas.
Amazon Transcribe has a default limit of 25 concurrent transcription streams, which limits LCA to 12 concurrent calls (two streams per call). Request an increase for the number of concurrent HTTP/2 streams for streaming transcription if you need to handle a larger number of concurrent calls.
LCA provides runtime monitoring and logs for each component using CloudWatch:
- Call trigger Lambda function – On the Lambda console, open the `LiveCallAnalytics-AISTACK-transcribingFargateXXX` function. Choose the Monitor tab to see function metrics. Choose View logs in CloudWatch to inspect function logs.
- Call processing Fargate task – On the Amazon ECS console, choose the `LiveCallAnalytics` cluster. Open the `LiveCallAnalytics` service to see container health metrics. Choose the Logs tab to inspect container logs.
- Call Event Stream Processor Lambda function – On the Lambda console, open the `LiveCallAnalytics-AISTACK-CallEventStreamXXX` function. Choose the Monitor tab to see function metrics. Choose View logs in CloudWatch to inspect function logs.
- AWS AppSync API – On the AWS AppSync console, open the `CallAnalytics-LiveCallAnalytics-XXX` API. Choose Monitoring in the navigation pane to see API metrics. Choose View logs in CloudWatch to inspect AWS AppSync API logs.
Cost assessment
This solution has hourly cost components and usage cost components.
The hourly costs add up to about $0.15 per hour, or $0.22 per hour with the demo Asterisk server enabled. For more information about the services that incur an hourly cost, see AWS Fargate Pricing, Amazon VPC pricing (for the NAT gateway), and Amazon EC2 pricing (for the demo Asterisk server).
The hourly cost components comprise the following:
- Fargate container – 2vCPU at $0.08/hour and 4 GB memory at $0.02/hour = $0.10/hour
- NAT gateways – Two at $0.09/hour
- EC2 instance – t4g.large at $0.07/hour (for demo Asterisk server)
The usage costs add up to about $0.30 for a 5-minute call, although this can vary based on total usage, because usage affects Free Tier eligibility and volume tiered pricing for many services. For more information about the services that incur usage costs, see the following:
- AWS AppSync pricing
- Amazon Cognito Pricing
- Amazon Comprehend Pricing
- Amazon DynamoDB pricing
- Amazon EventBridge pricing
- Amazon Kinesis Video Streams pricing
- AWS Lambda Pricing
- Amazon SQS pricing
- Amazon S3 pricing
- Amazon Transcribe Pricing
- Amazon Chime Voice Connector pricing (streaming)
To explore LCA costs for yourself, use AWS Cost Explorer or choose Bill Details on the AWS Billing Dashboard to see your month-to-date spend by service.
Integrate with your contact center
To deploy LCA to analyze real calls to your contact center using AWS CloudFormation, update the existing `LiveCallAnalytics` demo stack, changing the parameters to disable demo mode.

Alternatively, delete the existing `LiveCallAnalytics` demo stack and deploy a new `LiveCallAnalytics` stack (use the stack options from the previous section). You could also deploy a new `LiveCallAnalytics` stack in a different AWS account or Region.
Use these parameters to configure LCA for contact center integration:
- For Install Demo Asterisk Server, enter `false`.
- For Allowed CIDR Block for Demo Softphone, leave the default value.
- For Allowed CIDR List for SIPREC Integration, use the CIDR blocks of your SIPREC source hosts, such as your SBC servers. Use commas to separate CIDR blocks if you enter more than one.
When you deploy LCA, a Voice Connector is created for you. Use the Voice Connector documentation as guidance to configure this Voice Connector and your PBX/SBC for SIP-based media recording (SIPREC) or network-based recording (NBR). The Voice Connector Resources page provides some vendor-specific example configuration guides, including:
- SIPREC Configuration Guide: Cisco Unified Communications Manager (CUCM) and Cisco Unified Border Element (CUBE)
- SIPREC Configuration Guide: Avaya Aura Communication Manager and Session Manager with Sonus SBC 521
The LCA GitHub repository has additional vendor specific notes that you may find helpful; see SIPREC.md.
Customize your deployment
Use the following CloudFormation template parameters when creating or updating your stack to customize your LCA deployment:
- To use your own S3 bucket for call recordings, use Call Audio Recordings Bucket Name and Audio File Prefix.
- To redact PII from the transcriptions, set IsContentRedactionEnabled to `true`. For more information, see Redacting or identifying PII in a real-time stream.
- To improve transcription accuracy for technical and domain-specific acronyms and jargon, set UseCustomVocabulary to the name of a custom vocabulary that you already created in Amazon Transcribe (a sketch of creating one follows this list). For more information, see Custom vocabularies.
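If you haven’t created a custom vocabulary yet, the following is a minimal sketch using Boto3; the vocabulary name and phrases are placeholders you would replace with your own domain terms.

```python
import boto3

transcribe = boto3.client("transcribe")

# The vocabulary name and phrases are placeholders; use your own domain terms.
transcribe.create_vocabulary(
    VocabularyName="my-contact-center-vocabulary",
    LanguageCode="en-US",
    Phrases=["SIPREC", "Asterisk", "softphone"],
)

# Wait until the vocabulary is READY, then pass its name to the
# UseCustomVocabulary parameter when you update the stack.
state = transcribe.get_vocabulary(VocabularyName="my-contact-center-vocabulary")["VocabularyState"]
print(state)
```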
LCA is an open-source project. You can fork the LCA GitHub repository, enhance the code, and send us pull requests so we can incorporate and share your improvements!
Clean up
When you’re finished experimenting with this solution, clean up your resources by opening the AWS CloudFormation console and deleting the `LiveCallAnalytics` stacks that you deployed. This deletes resources that were created by deploying the solution. The recording S3 buckets, DynamoDB table, and CloudWatch log groups are retained after the stack is deleted to avoid deleting your data.
Post Call Analytics: Companion solution
Our companion solution, Post Call Analytics (PCA), offers additional insights and analytics capabilities by using the Amazon Transcribe Call Analytics batch API to detect common issues, interruptions, silences, speaker loudness, call categories, and more. Unlike LCA, which transcribes and analyzes streaming audio in real time, PCA transcribes and analyzes your call recordings after the call has ended. Configure LCA to store call recordings to the PCA’s ingestion S3 bucket, and use the two solutions together to get the best of both worlds. For more information, see Post call analytics for your contact center with Amazon language AI services.
Conclusion
The Live Call Analytics (LCA) sample solution offers a scalable, cost-effective approach to live call analysis, with features that help supervisors and agents stay focused on your callers’ experience. It uses Amazon ML services like Amazon Transcribe and Amazon Comprehend to transcribe and extract real-time insights from your contact center audio.
The sample LCA application is provided as open source—use it as a starting point for your own solution, and help us make it better by contributing back fixes and features via GitHub pull requests. For expert assistance, AWS Professional Services and other AWS Partners are here to help.
We’d love to hear from you. Let us know what you think in the comments section, or use the issues forum in the LCA GitHub repository.
About the Authors
Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.
Oliver Atoa is a Principal Solutions Architect in the AWS Language AI Services team.
Sagar Khasnis is a Senior Solutions Architect focused on building applications for Productivity Applications. He is passionate about building innovative solutions using AWS services to help customers achieve their business objectives. In his free time, you can find him reading biographies, hiking, working out at a fitness studio, and geeking out on his personal rig at home.
Court Schuett is a Chime Specialist SA with a background in telephony and now likes to build things that build things.
Post call analytics for your contact center with Amazon language AI services
Your contact center connects your business to your community, enabling customers to order products, callers to request support, clients to make appointments, and much more. Each conversation with a caller is an opportunity to learn more about that caller’s needs, and how well those needs were addressed during the call. You can uncover insights from these conversations that help you manage script compliance and find new opportunities to satisfy your customers, perhaps by expanding your services to address reported gaps, improving the quality of reported problem areas, or by elevating the customer experience delivered by your contact center agents.
Contact Lens for Amazon Connect provides call transcriptions with rich analytics capabilities that can provide these kinds of insights, but you may not currently be using Amazon Connect. You need a solution that works with your existing contact center call recordings.
Amazon Machine Learning (ML) services like Amazon Transcribe Call Analytics and Amazon Comprehend provide feature-rich APIs that you can use to transcribe and extract insights from your contact center audio recordings at scale. Although you could build your own custom call analytics solution using these services, that requires time and resources. In this post, we introduce our new sample solution for post call analytics.
Solution overview
Our new sample solution, Post Call Analytics (PCA), does most of the heavy lifting associated with providing an end-to-end solution that can process call recordings from your existing contact center. PCA provides actionable insights to spot emerging trends, identify agent coaching opportunities, and assess the general sentiment of calls.
You provide your call recordings, and PCA automatically processes them using Transcribe Call Analytics and other AWS services to extract valuable intelligence such as customer and agent sentiment, call drivers, entities discussed, and conversation characteristics such as non-talk time, interruptions, loudness, and talk speed. Transcribe Call Analytics detects issues using built-in ML models that have been trained using thousands of hours of conversations. With the automated call categorization capability, you can also tag conversations based on keywords or phrases, sentiment, and non-talk time. And you can optionally redact sensitive customer data such as names, addresses, credit card numbers, and social security numbers from both transcript and audio files.
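Under the hood, this relies on the Amazon Transcribe Call Analytics batch API. As a simplified sketch of that API (not the PCA code itself), a job could be started with Boto3 as follows; the job name, S3 URIs, and IAM role ARN are placeholders.

```python
import boto3

transcribe = boto3.client("transcribe")

# Job name, S3 URIs, and role ARN are placeholders.
transcribe.start_call_analytics_job(
    CallAnalyticsJobName="example-call-analytics-job",
    Media={"MediaFileUri": "s3://<your-bucket>/recordings/call-001.wav"},
    OutputLocation="s3://<your-bucket>/call-analytics-output/",
    DataAccessRoleArn="arn:aws:iam::<account-id>:role/<transcribe-data-access-role>",
    ChannelDefinitions=[
        {"ChannelId": 0, "ParticipantRole": "AGENT"},
        {"ChannelId": 1, "ParticipantRole": "CUSTOMER"},
    ],
)
```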
PCA’s web user interface has a home page showing all your calls, as shown in the following screenshot.
You can choose a record to see the details of the call, such as speech characteristics.
You can also scroll down to see annotated turn-by-turn call details.
You can search for calls based on dates, entities, or sentiment characteristics.
You can also search your call transcriptions.
Lastly, you can query detailed call analytics data from your preferred business intelligence (BI) tool.
PCA currently supports the following features:
- Transcription
  - Batch turn-by-turn transcription with support for Amazon Transcribe custom vocabulary for accuracy of domain-specific terminology
  - Personally identifiable information (PII) redaction from transcripts and audio files, and vocabulary filtering for masking custom words and phrases
  - Multiple languages and automatic language detection
  - Standard audio file formats
  - Caller and agent speaker labels using channel identification or speaker diarization
- Analytics
  - Caller and agent sentiment details and trends
  - Talk and non-talk time for both caller and agent
  - Configurable Transcribe Call Analytics categories based on the presence or absence of keywords or phrases, sentiment, and non-talk time
  - Detection of callers’ main issues using built-in ML models in Transcribe Call Analytics
  - Detection of entities referenced in the call, using Amazon Comprehend standard or custom entity detection models, or simple configurable string matching
  - Detection of caller and agent interruptions
  - Speaker loudness
- Search
  - Search on call attributes such as time range, sentiment, or entities
  - Search transcriptions
- Other
  - Metadata detection from audio file names, such as call GUID, agent’s name, and call date and time
  - Automatic scaling to handle variable call volumes
  - Bulk loading of large archives of older recordings while maintaining capacity to process new recordings as they arrive
  - Sample recordings so you can quickly try out PCA for yourself
  - Easy installation with a single AWS CloudFormation template
This is just the beginning! We expect to add many more exciting features over time, based on your feedback.
Deploy the CloudFormation stack
Start your PCA experience by using AWS CloudFormation to deploy the solution with sample recordings loaded.
- Use the following Launch Stack button to deploy the PCA solution in the `us-east-1` (N. Virginia) AWS Region.

The source code is available in our GitHub repository. Follow the directions in the README to deploy PCA to additional Regions supported by Amazon Transcribe.

- For Stack name, use the default value, `PostCallAnalytics`.
- For AdminUsername, use the default value, admin.
- For AdminEmail, use a valid email address—your temporary password is emailed to this address during the deployment.
- For loadSampleAudioFiles, change the value to `true`.
- For EnableTranscriptKendraSearch, change the value to `Yes, create new Kendra Index (Developer Edition)`.
If you have previously used your Amazon Kendra Free Tier allowance, you incur an hourly cost for this index (more information on cost later in this post). Amazon Kendra transcript search is an optional feature, so if you don’t need it and are concerned about cost, use the default value of No.
- For all other parameters, use the default values.
If you want to customize the settings later, for example to apply custom vocabulary to improve accuracy, or to customize entity detection, you can update the stack to set these parameters.
The main CloudFormation stack uses nested stacks to create the following resources in your AWS account:
- Amazon Simple Storage Service (Amazon S3) buckets to hold build artifacts and call recordings
- AWS Systems Manager Parameter Store settings to store configuration settings
- AWS Step Functions workflows to orchestrate recording file processing
- AWS Lambda functions to process audio files and turn-by-turn transcriptions and analytics
- Amazon DynamoDB tables to store call metadata
- Website components including S3 bucket, Amazon CloudFront distribution, and Amazon Cognito user pool
- Other miscellaneous supporting resources, including AWS Identity and Access Management (IAM) roles and policies (using least privilege best practices), Amazon Simple Queue Service (Amazon SQS) message queues, and Amazon CloudWatch log groups.
- Optionally, an Amazon Kendra index and AWS Amplify search application to provide intelligent call transcript search.
The stacks take about 20 minutes to deploy. The main stack status shows as CREATE_COMPLETE when everything is deployed.
Set your password
After you deploy the stack, you need to open the PCA web user interface and set your password.
- On the AWS CloudFormation console, choose the main stack, `PostCallAnalytics`, and choose the Outputs tab.
- Open your web browser to the URL shown as `WebAppURL` in the outputs.
You’re redirected to a login page.
- Open the email you received, at the email address you provided, with the subject “Welcome to the Amazon Transcribe Post Call Analytics (PCA) Solution!”
This email contains a generated temporary password that you can use to log in (as user admin) and create your own password.
- Set a new password.
Your new password must have a length of at least eight characters, and contain uppercase and lowercase characters, plus numbers and special characters.
You’re now logged in to PCA. Because you set loadSampleAudioFiles to `true`, your PCA deployment now has three sample calls pre-loaded for you to explore.
Optional: Open the transcription search web UI and set your permanent password
Follow these additional steps to log in to the companion transcript search web app, which is deployed only when you set EnableTranscriptKendraSearch when you launch the stack.

- On the AWS CloudFormation console, choose the main stack, `PostCallAnalytics`, and choose the Outputs tab.
- Open your web browser to the URL shown as `TranscriptionMediaSearchFinderURL` in the outputs.
You’re redirected to the login page.
- Open the email you received, at the email address you provided, with the subject “Welcome to Finder Web App.”
This email contains a generated temporary password that you can use to log in (as user admin).
- Create your own password, just like you already did for the PCA web application.
As before, your new password must have a length of at least eight characters, and contain uppercase and lowercase characters, plus numbers and special characters.
You’re now logged in to the transcript search Finder application. The sample audio files are indexed already, and ready for search.
Explore post call analytics features
Now, with PCA successfully installed, you’re ready to explore the call analysis features.
Home page
To explore the home page, open the PCA web UI using the URL shown as `WebAppURL` in the main stack outputs (bookmark this URL, you’ll use it often!).
You already have three calls listed on the home page, sorted in descending time order (most recent first). These are the sample audio files.
The calls have the following key details:
- Job Name – Is assigned from the recording audio file name, and serves as a unique job name for this call
- Timestamp – Is parsed from the audio file name if possible, otherwise it’s assigned the time when the recording is processed by PCA
- Customer Sentiment and Customer Sentiment Trend – Show the overall caller sentiment and, importantly, whether the caller was more positive at the end of the call than at the beginning
- Language Code – Shows the specified language or the automatically detected dominant language of the call
Call details
Choose the most recently received call to open and explore the call detail page. You can review the call information and analytics such as sentiment, talk time, interruptions, and loudness.
Scroll down to see the following details:
- Entities grouped by entity type. Entities are detected by Amazon Comprehend and the sample entity recognizer string map.
- Categories detected by Transcribe Call Analytics. By default, there are no categories; see Call categorization for more information.
- Issues detected by the Transcribe Call Analytics built-in ML model. Issues succinctly capture the main reasons for the call. For more information, see Issue detection.
Scroll further to see the turn-by-turn transcription for the call, with annotations for speaker, time marker, sentiment, interruptions, issues, and entities.
Use the embedded media player to play the call audio from any point in the conversation. Set the position by choosing the time marker annotation on the transcript or by using the player time control. The audio player remains visible as you scroll down the page.
PII is redacted from both transcript and audio—redaction is enabled using the CloudFormation stack parameters.
Search based on call attributes
To try PCA’s built-in search, choose Search at the top of the screen. Under Sentiment, choose Average, Customer, and Negative to select the calls that had average negative customer sentiment.
Choose Clear to try a different filter. For Entities, enter Hyundai and then choose Search. Select the call from the search results and verify from the transcript that the customer was indeed calling about their Hyundai.
Search call transcripts
Transcript search is an optional, experimental add-on feature powered by Amazon Kendra.
Open the transcript search web UI using the URL shown as TranscriptionMediaSearchFinderURL in the main stack outputs. To find a recent call, enter the search query customer hit the wall.
The results show transcription extracts from relevant calls. Use the embedded audio player to play the associated section of the call recording.
You can expand Filter search results to refine the search results with additional filters. Choose Open Call Analytics to open the PCA call detail page for this call.
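The search UI queries Amazon Kendra on your behalf, but you can also query the index directly with the SDK. The sketch below assumes you have located the ID of the Kendra index created by the stack; the value shown is a placeholder.

```python
import boto3

kendra = boto3.client("kendra")

# Placeholder: substitute the ID of the Amazon Kendra index created by the PCA stack.
INDEX_ID = "00000000-0000-0000-0000-000000000000"

response = kendra.query(IndexId=INDEX_ID, QueryText="customer hit the wall")

# Print the top matching transcript excerpts.
for item in response.get("ResultItems", []):
    print(item.get("DocumentTitle", {}).get("Text"))
    print(item.get("DocumentExcerpt", {}).get("Text"))
    print("---")
```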
Query call analytics using SQL
You can integrate PCA call analytics data into a reporting or BI tool such as Amazon QuickSight by using Amazon Athena SQL queries. To try it, open the Athena query editor. For Database, choose pca. Observe the table parsedresults. This table contains all the turn-by-turn transcriptions and analysis for each call, using nested structures.
You can also review flattened result sets, which are simpler to integrate into your reporting or analytics application. Use the query editor to preview the data.
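You can also run these queries programmatically. The following sketch submits a simple Athena query against the pca database and waits for the results; the query results location is a placeholder bucket that you must replace with one you own, and the SELECT is deliberately generic because the column layout isn’t shown here.

```python
import time
import boto3

athena = boto3.client("athena")

# Placeholder: an S3 location you own for Athena query results.
OUTPUT_LOCATION = "s3://my-athena-results-bucket/pca/"

query = athena.start_query_execution(
    QueryString="SELECT * FROM parsedresults LIMIT 10",
    QueryExecutionContext={"Database": "pca"},
    ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
)
query_id = query["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows[:5]:
        print([col.get("VarCharValue") for col in row["Data"]])
```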
Processing flow overview
How did PCA transcribe and analyze your phone call recordings? Let’s take a quick look at how it works.
The following diagram shows the main data processing components and how they fit together at a high level.
Call recording audio files are uploaded to the S3 bucket and folder identified in the main stack outputs as InputBucket and InputBucketPrefix, respectively. The sample call recordings are uploaded automatically because you set the parameter loadSampleAudioFiles to true when you deployed PCA.
As each recording file is added to the input bucket, an S3 Event Notification triggers a Lambda function that initiates a workflow in Step Functions to process the file. The workflow orchestrates the steps to start an Amazon Transcribe batch job and process the results by performing entity detection and additional preparation of the call analytics results. Processed results are stored as JSON files in another S3 bucket and folder, identified in the main stack outputs as OutputBucket and OutputBucketPrefix.
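To make the trigger pattern concrete, here is a minimal sketch of the kind of S3-event Lambda handler that starts a Step Functions execution. It is illustrative only and is not the actual PCA code; the state machine ARN comes from a hypothetical environment variable you would configure on the function.

```python
import json
import os
import urllib.parse

import boto3

sfn = boto3.client("stepfunctions")


def handler(event, context):
    """Illustrative only (not the actual PCA code): start one Step Functions
    execution for each call recording added to the input bucket."""
    state_machine_arn = os.environ["STATE_MACHINE_ARN"]  # assumed environment variable
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        sfn.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps({"bucket": bucket, "key": key}),
        )
```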
As the Step Functions workflow creates each JSON results file in the output bucket, an S3 Event Notification triggers a Lambda function, which loads selected call metadata into a DynamoDB table.
The PCA UI web app queries the DynamoDB table to retrieve the list of processed calls to display on the home page. The call detail page reads additional detailed transcription and analytics from the JSON results file for the selected call.
Amazon S3 Lifecycle policies delete recordings and JSON files from both the input and output buckets after a configurable retention period, defined by the deployment parameter RetentionDays. S3 Event Notifications and Lambda functions keep the DynamoDB table synchronized as files are created and deleted.
When the EnableTranscriptKendraSearch parameter is true, the Step Functions workflow also adds time markers and metadata attributes to the transcription, which are loaded into an Amazon Kendra index. The transcription search web application is used to search call transcriptions. For more information on how this works, see Make your audio and video files searchable using Amazon Transcribe and Amazon Kendra.
Monitoring and troubleshooting
AWS CloudFormation reports deployment failures and causes on the stack Events tab. See Troubleshooting CloudFormation for help with common deployment problems.
PCA provides runtime monitoring and logs for each component using CloudWatch:
- Step Functions workflow – On the Step Functions console, open the PostCallAnalyticsWorkflow workflow. The Executions tab shows the status of each workflow run. Choose any run to see details, and choose CloudWatch Logs from the Execution event history to examine logs for any Lambda function that the workflow invoked.
- PCA server and UI Lambda functions – On the Lambda console, filter by PostCallAnalytics to see all the PCA-related Lambda functions. Choose your function, and choose the Monitor tab to see function metrics. Choose View logs in CloudWatch to inspect function logs, or query them programmatically as shown in the sketch after this list.
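For example, you can scan a function’s log group for errors with the SDK. The log group name below is illustrative; substitute the log group of the PCA Lambda function you are troubleshooting.

```python
import boto3

logs = boto3.client("logs")

# Illustrative log group name: replace with the actual PCA Lambda function's log group.
LOG_GROUP = "/aws/lambda/PostCallAnalytics-ExampleFunction"

response = logs.filter_log_events(
    logGroupName=LOG_GROUP,
    filterPattern="ERROR",  # only return events containing the word ERROR
    limit=20,
)
for event in response["events"]:
    print(event["timestamp"], event["message"].strip())
```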
Cost assessment
For pricing information for the main services used by PCA, see the following:
- Amazon CloudFront Pricing
- Amazon CloudWatch pricing
- Amazon Cognito Pricing
- Amazon Comprehend Pricing
- Amazon DynamoDB pricing
- Amazon API Gateway pricing
- Amazon Kendra pricing (for the optional transcription search feature)
- AWS Lambda Pricing
- Amazon Transcribe Pricing
- Amazon S3 pricing
- AWS Step Functions Pricing
When transcription search is enabled, you incur an hourly cost for the Amazon Kendra index: $1.125/hour for the Developer Edition (first 750 hours are free), or $1.40/hour for the Enterprise Edition (recommended for production workloads).
All other PCA costs are incurred based on usage, and are Free Tier eligible. After the Free Tier allowance is consumed, usage costs add up to about $0.15 for a 5-minute call recording.
To explore PCA costs for yourself, use AWS Cost Explorer or choose Bill Details on the AWS Billing Dashboard to see your month-to-date spend by service.
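If you prefer a programmatic view, the Cost Explorer API can break down month-to-date spend by service. This is a sketch only: Cost Explorer must be enabled in your account, API calls may incur a small per-request charge, and the date range is an example.

```python
from datetime import date

import boto3

ce = boto3.client("ce")

# Example period: month to date (assumes today is not the first day of the month).
start = date.today().replace(day=1).isoformat()
end = date.today().isoformat()

response = ce.get_cost_and_usage(
    TimePeriod={"Start": start, "End": end},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

# Print month-to-date cost per service.
for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{service}: ${amount:.2f}")
```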
Integrate with your contact center
You can configure your contact center to enable call recording. If possible, configure recordings for two channels (stereo), with customer audio on one channel (for example, channel 0) and the agent audio on the other channel (channel 1).
Via the AWS Command Line Interface (AWS CLI) or SDK, copy your contact center recording files to the PCA input bucket folder, identified in the main stack outputs as InputBucket and InputBucketPrefix. Alternatively, if you already save your call recordings to Amazon S3, use the deployment parameters InputBucketName and InputBucketRawAudio to configure PCA to use your existing S3 bucket and prefix, so you don’t have to copy the files again.
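For example, a scheduled export job could upload new recordings with the SDK. This sketch assumes you have already retrieved the InputBucket and InputBucketPrefix values from the main stack outputs; the bucket, prefix, and local file path shown are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Placeholders: use the InputBucket and InputBucketPrefix values from the main stack outputs.
INPUT_BUCKET = "your-pca-input-bucket"
INPUT_PREFIX = "your-input-prefix"

# Placeholder local recording exported from your contact center.
local_file = "recordings/CALL_GUID_2022-01-01T10-00-00.wav"
key = f"{INPUT_PREFIX}/{local_file.rsplit('/', 1)[-1]}"

s3.upload_file(local_file, INPUT_BUCKET, key)
print(f"Uploaded to s3://{INPUT_BUCKET}/{key}")
```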
Customize your deployment
Use the following CloudFormation template parameters when creating or updating your stack to customize your PCA deployment (an example update call follows the list):
- To enable or disable the optional (experimental) transcription search feature, use EnableTranscriptKendraSearch.
- To use your existing S3 bucket for incoming call recordings, use InputBucketName and InputBucketRawAudio.
- To configure automatic deletion of recordings and call analysis data when using auto-provisioned S3 input and output buckets, use RetentionDays.
- To detect the call timestamp, agent name, or call identifier (GUID) from the recording file name, use FilenameDatetimeRegex, FilenameDatetimeFieldMap, FilenameGUIDRegex, and FilenameAgentRegex.
- To use the standard Amazon Transcribe API instead of the default call analytics API, use TranscribeApiMode. PCA automatically reverts to the standard API for audio recordings that aren’t compatible with the call analytics API (for example, mono channel recordings). When using the standard API, some call analytics metrics, such as issue detection and speaker loudness, aren’t available.
- To set the list of supported audio languages, use TranscribeLanguages.
- To mask unwanted words, use VocabFilterMode and set VocabFilterName to the name of a vocabulary filter that you already created in Amazon Transcribe. See Vocabulary filtering for more information.
- To improve transcription accuracy for technical and domain-specific acronyms and jargon, set VocabularyName to the name of a custom vocabulary that you already created in Amazon Transcribe. See Custom vocabularies for more information.
- To configure PCA to use single-channel audio by default, and to identify speakers using speaker diarization rather than channel identification, use SpeakerSeparationType and MaxSpeakers. The default is to use channel identification with stereo files and the Transcribe Call Analytics APIs, which generates the richest analytics and the most accurate speaker labeling.
- To redact PII from the transcriptions or from the audio, set CallRedactionTranscript or CallRedactionAudio to true. See Redaction for more information.
- To customize entity detection using Amazon Comprehend, or to provide your own CSV file to define entities, use the Entity detection parameters.
See the README on GitHub for more details on configuration options and operations for PCA.
PCA is an open-source project. You can fork the PCA GitHub repository, enhance the code, and send us pull requests so we can incorporate and share your improvements!
Clean up
When you’re finished experimenting with this solution, clean up your resources by opening the AWS CloudFormation console and deleting the PostCallAnalytics stacks that you deployed. This deletes the resources that you created by deploying the solution. S3 buckets containing your audio recordings and analytics, and CloudWatch log groups, are retained after the stack is deleted to avoid deleting your data.
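If you prefer to clean up from code, the same deletion can be done with the SDK. This sketch assumes the default stack name; remember that the retained S3 buckets and log groups must be deleted separately if you no longer need them.

```python
import boto3

cloudformation = boto3.client("cloudformation")

# Delete the main PCA stack (nested stacks are removed with it).
cloudformation.delete_stack(StackName="PostCallAnalytics")

# Optionally block until deletion finishes.
cloudformation.get_waiter("stack_delete_complete").wait(StackName="PostCallAnalytics")
```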
Live Call Analytics: Companion solution
Our companion solution, Live Call Analytics (LCA), offers real-time transcription and analytics capabilities by using the Amazon Transcribe and Amazon Comprehend real-time APIs. Unlike PCA, which transcribes and analyzes recorded audio after the call has ended, LCA transcribes and analyzes your calls as they are happening and provides real-time updates to supervisors and agents. You can configure LCA to store call recordings in PCA’s ingestion S3 bucket, and use the two solutions together to get the best of both worlds. See Live call analytics for your contact center with Amazon language AI services for more information.
Conclusion
The Post Call Analytics solution offers a scalable, cost-effective approach to provide call analytics with features to help improve your callers’ experience. It uses Amazon ML services like Transcribe Call Analytics and Amazon Comprehend to transcribe and extract rich insights from your customer conversations.
The sample PCA application is provided as open source—use it as a starting point for your own solution, and help us make it better by contributing back fixes and features via GitHub pull requests. For expert assistance, AWS Professional Services and other AWS Partners are here to help.
We’d love to hear from you. Let us know what you think in the comments section, or use the issues forum in the PCA GitHub repository.
About the Authors
Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.
Dr. Andrew Kane is an AWS Principal WW Tech Lead (AI Language Services) based out of London. He focuses on the AWS Language and Vision AI services, helping our customers architect multiple AI services into a single use-case driven solution. Before joining AWS at the beginning of 2015, Andrew spent two decades working in the fields of signal processing, financial payments systems, weapons tracking, and editorial and publishing systems. He is a keen karate enthusiast (just one belt away from Black Belt) and is also an avid home-brewer, using automated brewing hardware and other IoT sensors.
Steve Engledow is a Solutions Engineer working with internal and external AWS customers to build reusable solutions to common problems.
Connor Kirkpatrick is an AWS Solutions Engineer based in the UK. Connor works with the AWS Solution Architects to create standardised tools, code samples, demonstrations, and quickstarts. He is an enthusiastic rower, wobbly cyclist, and occasional baker.
Franco Rezabek is an AWS Solutions Engineer based in London, UK. Franco works with AWS Solution Architects to create standardized tools, code samples, demonstrations, and quick starts.