Responsible AI at Google Research: AI for Social Good

Google’s AI for Social Good team consists of researchers, engineers, volunteers, and others with a shared focus on positive social impact. Our mission is to demonstrate AI’s societal benefit by enabling real-world value, with projects spanning work in public health, accessibility, crisis response, climate and energy, and nature and society. We believe that the best way to drive positive change in underserved communities is by partnering with change-makers and the organizations they serve.

In this blog post we discuss work done by Project Euphonia, a team within AI for Social Good, that aims to improve automatic speech recognition (ASR) for people with disordered speech. For people with typical speech, an ASR model’s word error rate (WER) can be less than 10%. But for people with disordered speech patterns, such as stuttering, dysarthria and apraxia, the WER could reach 50% or even 90% depending on the etiology and severity. To help address this problem, we worked with more than 1,000 participants to collect over 1,000 hours of disordered speech samples and used the data to show that ASR personalization is a viable avenue for bridging the performance gap for users with disordered speech. We’ve shown that personalization can be successful with as little as 3-4 minutes of training speech using layer freezing techniques.
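For reference, WER is the fraction of words the recognizer gets wrong, computed from the edit operations needed to turn the hypothesis into the reference transcript:

    WER = (S + D + I) / N

where S, D, and I are the substitution, deletion, and insertion counts and N is the number of words in the reference.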

This work led to the development of Project Relate for anyone with atypical speech who could benefit from a personalized speech model. Built in partnership with Google’s Speech team, Project Relate enables people who find it hard to be understood by other people and technology to train their own models. People can use these personalized models to communicate more effectively and gain more independence. To make ASR more accessible and usable, we describe how we fine-tuned Google’s Universal Speech Model (USM) to better understand disordered speech out of the box, without personalization, for use with digital assistant technologies, dictation apps, and in conversations.

Addressing the challenges

Working closely with Project Relate users, we learned that personalized models can be very useful, but that for many users, recording dozens or hundreds of examples can be challenging. In addition, personalized models did not always perform well in freeform conversation.

To address these challenges, Euphonia’s research efforts have focused on speaker-independent ASR (SI-ASR), so that models work better out of the box for people with disordered speech and no additional training is necessary.

Prompted Speech dataset for SI-ASR

The first step in building a robust SI-ASR model was to create representative dataset splits. We created the Prompted Speech dataset by splitting the Euphonia corpus into train, validation and test portions, while ensuring that each split spanned a range of speech impairment severity and underlying etiology and that no speakers or phrases appeared in multiple splits. The training portion consists of over 950k speech utterances from over 1,000 speakers with disordered speech. The test set contains around 5,700 utterances from over 350 speakers. Speech-language pathologists manually reviewed all of the utterances in the test set for transcription accuracy and audio quality.
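As an illustration of the speaker-disjoint splitting described above, the sketch below uses scikit-learn’s GroupShuffleSplit with speaker IDs as groups. It is a hypothetical example: the Euphonia pipeline’s actual tooling is not public, and the real splits also balance impairment severity and etiology, which is not shown here.

# Minimal sketch of a speaker-disjoint split; the real Euphonia splits also
# balance speech impairment severity and etiology, which is not shown here.
from sklearn.model_selection import GroupShuffleSplit

def speaker_disjoint_split(utterances, speaker_ids, test_fraction=0.1, seed=0):
    """Split utterances so that no speaker appears in both train and test."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_fraction, random_state=seed)
    train_idx, test_idx = next(splitter.split(utterances, groups=speaker_ids))
    return [utterances[i] for i in train_idx], [utterances[i] for i in test_idx]

# Toy usage: each speaker ends up entirely in one split or the other.
utts = ["utt_a1", "utt_a2", "utt_b1", "utt_c1", "utt_c2"]
speakers = ["a", "a", "b", "c", "c"]
train_utts, test_utts = speaker_disjoint_split(utts, speakers, test_fraction=0.4)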

Real Conversation test set

Unprompted or conversational speech differs from prompted speech in several ways. In conversation, people speak faster and enunciate less. They repeat words, repair misspoken words, and use a more expansive vocabulary that is specific and personal to themselves and their community. To improve a model for this use case, we created the Real Conversation test set to benchmark performance.

The Real Conversation test set was created with the help of trusted testers who recorded themselves speaking during conversations. The audio was reviewed, any personally identifiable information (PII) was removed, and then that data was transcribed by speech-language pathologists. The Real Conversation test set contains over 1,500 utterances from 29 speakers.

Adapting USM to disordered speech

We then tuned USM on the training split of the Euphonia Prompted Speech set to improve its performance on disordered speech. Instead of fine-tuning the full model, our tuning was based on residual adapters, a parameter-efficient tuning approach that adds tunable bottleneck layers as residuals between the transformer layers. Only these layers are tuned, while the rest of the model weights are untouched. We have previously shown that this approach works very well to adapt ASR models to disordered speech. Residual adapters were only added to the encoder layers, and the bottleneck dimension was set to 64.
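As a rough illustration of the residual adapter idea, here is a generic bottleneck adapter block written in PyTorch. This is a sketch, not the USM training code; the model dimension is a placeholder, and only the bottleneck dimension of 64 comes from the text above.

# Generic residual adapter block (PyTorch); dimensions are illustrative.
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Bottleneck adapter added as a residual branch to a frozen encoder layer."""
    def __init__(self, d_model: int, bottleneck_dim: int = 64):
        super().__init__()
        self.layer_norm = nn.LayerNorm(d_model)
        self.down_proj = nn.Linear(d_model, bottleneck_dim)  # project down to the bottleneck
        self.up_proj = nn.Linear(bottleneck_dim, d_model)    # project back to model width
        self.activation = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The adapter output is added back to its input, so the frozen model's
        # behavior is only perturbed by the small tuned bottleneck.
        x = self.layer_norm(hidden_states)
        x = self.up_proj(self.activation(self.down_proj(x)))
        return hidden_states + x

# During adaptation, only adapter parameters are trained; the encoder stays frozen:
# for p in encoder.parameters(): p.requires_grad = False
# for p in adapter.parameters(): p.requires_grad = True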

Results

To evaluate the adapted USM, we compared it to older ASR models using the two test sets described above. For each test, we compare the adapted USM to the pre-USM model best suited to that task: (1) for short prompted speech, we compare to Google’s production ASR model optimized for short-form ASR; (2) for longer Real Conversation speech, we compare to a model trained for long-form ASR. USM’s improvements over the pre-USM models can be explained by its increase in size, from 120M to 2B parameters, and other improvements discussed in the USM blog post.

Model word error rates (WER) for each test set (lower is better).

We see that the USM adapted with disordered speech significantly outperforms the other models. The adapted USM’s WER on Real Conversation is 37% better than the pre-USM model, and on the Prompted Speech test set, the adapted USM performs 53% better.
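For clarity, the percentages above are relative WER reductions. The sketch below shows the arithmetic with placeholder WER values, not the measured numbers from the evaluation.

# Relative improvement is the relative reduction in WER. The values below are
# illustrative placeholders, not the actual measured WERs.
def relative_wer_improvement(baseline_wer: float, adapted_wer: float) -> float:
    return (baseline_wer - adapted_wer) / baseline_wer

print(f"{relative_wer_improvement(0.40, 0.252):.0%}")  # prints 37% (illustrative)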

These findings suggest that the adapted USM is significantly more usable for an end user with disordered speech. We can demonstrate this improvement by looking at transcripts of Real Conversation test set recordings from a trusted tester of Euphonia and Project Relate (see below).

Audio1 | Ground Truth | Pre-USM ASR | Adapted USM
(audio clip 1) | I now have an Xbox adaptive controller on my lap. | i now have a lot and that consultant on my mouth | i now had an xbox adapter controller on my lamp.
(audio clip 2) | I’ve been talking for quite a while now. Let’s see. | quite a while now | i’ve been talking for quite a while now.
Example audio and transcriptions of a trusted tester’s speech from the Real Conversation test set.

A comparison of the Pre-USM and adapted USM transcripts revealed some key advantages:

  • The first example shows that the adapted USM is better at recognizing disordered speech patterns. The baseline misses key words like “Xbox” and “controller” that are important for a listener to understand what the speaker is trying to say.
  • The second example shows how deletions are a primary issue with ASR models that are not trained on disordered speech. Though the baseline model did transcribe a portion correctly, a large part of the utterance was not transcribed, losing the speaker’s intended message.

Conclusion

We believe that this work is an important step towards making speech recognition more accessible to people with disordered speech. We are continuing to work on improving the performance of our models. With the rapid advancements in ASR, we aim to ensure people with disordered speech benefit as well.

Acknowledgements

Key contributors to this project include Fadi Biadsy, Michael Brenner, Julie Cattiau, Richard Cave, Amy Chung-Yu Chou, Dotan Emanuel, Jordan Green, Rus Heywood, Pan-Pan Jiang, Anton Kast, Marilyn Ladewig, Bob MacDonald, Philip Nelson, Katie Seaver, Joel Shor, Jimmy Tobin, Katrin Tomanek, and Subhashini Venugopalan. We gratefully acknowledge the support Project Euphonia received from members of the USM research team including Yu Zhang, Wei Han, Nanxin Chen, and many others. Most importantly, we wanted to say a huge thank you to the 2,200+ participants who recorded speech samples and the many advocacy groups who helped us connect with these participants.


1Audio volume has been adjusted for ease of listening, but the original files would be more consistent with those used in training and would have pauses, silences, variable volume, etc. 

Read More

The world’s first braiding of non-Abelian anyons

Imagine you’re shown two identical objects and then asked to close your eyes. When you open your eyes, you see the same two objects in the same position. How can you determine if they have been swapped back and forth? Intuition and the laws of quantum mechanics agree: If the objects are truly identical, there is no way to tell.

While this sounds like common sense, it only applies to our familiar three-dimensional world. Researchers have predicted that for a special type of particle, called an anyon, that is restricted to move only in a two-dimensional (2D) plane, quantum mechanics allows for something quite different. Anyons are indistinguishable from one another, but some of them, called non-Abelian anyons, change their shared quantum state in an observable way when they are exchanged, making it possible to tell that an exchange took place even though the particles themselves remain fully indistinguishable. While researchers have managed to detect their relatives, Abelian anyons, whose change under exchange is more subtle and impossible to directly detect, realizing “non-Abelian exchange behavior” has proven more difficult due to challenges with both control and detection.

In “Non-Abelian braiding of graph vertices in a superconducting processor”, published in Nature, we report the observation of this non-Abelian exchange behavior for the first time. Non-Abelian anyons could open a new avenue for quantum computation, in which quantum operations are achieved by swapping particles around one another like strings are swapped around one another to create braids. Realizing this new exchange behavior on our superconducting quantum processor could be an alternate route to so-called topological quantum computation, which benefits from being robust against environmental noise.

Exchange statistics and non-Abelian anyons

In order to understand how this strange non-Abelian behavior can occur, it’s helpful to consider an analogy with the braiding of two strings. Take two identical strings and lay them parallel next to one another. Swap their ends to form a double-helix shape. The strings are identical, but because they wrap around one another when the ends are exchanged, it is very clear when the two ends are swapped.

The exchange of non-Abelian anyons can be visualized in a similar way, where the strings are made from extending the particles’ positions into the time dimension to form “world-lines.” Imagine plotting two particles’ locations vs. time. If the particles stay put, the plot would simply be two parallel lines, representing their constant locations. But if we exchange the locations of the particles, the world lines wrap around one another. Exchange them a second time, and you’ve made a knot.

While a bit difficult to visualize, knots in four dimensions (three spatial plus one time dimension) can always easily be undone. They are trivial — like a shoelace, simply pull one end and it unravels. But when the particles are restricted to two spatial dimensions, the knots are in three total dimensions and — as we know from our everyday 3D lives — cannot always be easily untied. The braiding of the non-Abelian anyons’ world lines can be used as quantum computing operations to transform the state of the particles.

A key aspect of non-Abelian anyons is “degeneracy”: the full state of several separated anyons is not completely specified by local information, allowing the same anyon configuration to represent superpositions of several quantum states. Winding non-Abelian anyons about each other can change the encoded state.

How to make a non-Abelian anyon

So how do we realize non-Abelian braiding with one of Google’s quantum processors? We start with the familiar surface code, which we recently used to achieve a milestone in quantum error correction, where qubits are arranged on the vertices of a checkerboard pattern. Each colored square of the checkerboard represents one of two possible joint measurements that can be made of the qubits on the four corners of the square. These so-called “stabilizer measurements” can return a value of either +1 or –1. The latter is referred to as a plaquette violation, and can be created and moved diagonally — just like bishops in chess — by applying single-qubit X- and Z-gates. Recently, we showed that these bishop-like plaquette violations are Abelian anyons. In contrast to non-Abelian anyons, the state of Abelian anyons changes only subtly when they are swapped — so subtly that it is impossible to directly detect. While Abelian anyons are interesting, they do not hold the same promise for topological quantum computing that non-Abelian anyons do.

To produce non-Abelian anyons, we need to control the degeneracy (i.e., the number of wavefunctions that cause all stabilizer measurements to be +1). Since a stabilizer measurement returns two possible values, each stabilizer cuts the degeneracy of the system in half, and with sufficiently many stabilizers, only one wavefunction satisfies the criterion. Hence, a simple way to increase the degeneracy is to merge two stabilizers together. In the process of doing so, we remove one edge in the stabilizer grid, giving rise to two points where only three edges intersect. These points, referred to as “degree-3 vertices” (D3Vs), are predicted to be non-Abelian anyons.
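To make the counting explicit (standard stabilizer-code bookkeeping, not a result specific to this experiment): with n physical qubits and s independent stabilizer constraints, the number of states satisfying every constraint is

    \dim \mathcal{H}_{\text{code}} = \frac{2^{n}}{2^{s}} = 2^{\,n-s},

so merging two stabilizers removes one constraint and doubles the degeneracy.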

In order to braid the D3Vs, we have to move them, meaning that we have to stretch and squash the stabilizers into new shapes. We accomplish this by implementing two-qubit gates between the anyons and their neighbors (middle and right panels shown below).

Non-Abelian anyons in stabilizer codes. a: Example of a knot made by braiding two anyons’ world lines. b: Single-qubit gates can be used to create and move stabilizers with a value of –1 (red squares). Like bishops in chess, these can only move diagonally and are therefore constrained to one sublattice in the regular surface code. This constraint is broken when D3Vs (yellow triangles) are introduced. c: Process to form and move D3Vs (predicted to be non-Abelian anyons). We start with the surface code, where each square corresponds to a joint measurement of the four qubits on its corners (left panel). We remove an edge separating two neighboring squares, such that there is now a single joint measurement of all six qubits (middle panel). This creates two D3Vs, which are non-Abelian anyons. We move the D3Vs by applying two-qubit gates between neighboring sites (right panel).

Now that we have a way to create and move the non-Abelian anyons, we need to verify their anyonic behavior. For this we examine three characteristics that would be expected of non-Abelian anyons:

  1. The “fusion rules” — What happens when non-Abelian anyons collide with each other?
  2. Exchange statistics — What happens when they are braided around one another?
  3. Topological quantum computing primitives — Can we encode qubits in the non-Abelian anyons and use braiding to perform two-qubit entangling operations?

The fusion rules of non-Abelian anyons

We investigate fusion rules by studying how a pair of D3Vs interact with the bishop-like plaquette violations introduced above. In particular, we create a pair of these and bring one of them around a D3V by applying single-qubit gates.

While the rules of bishops in chess dictate that the plaquette violations can never meet, the dislocation in the checkerboard lattice allows one of them to break this rule, meet its partner, and annihilate with it. The plaquette violations have now disappeared! But bring the non-Abelian anyons back in contact with one another, and the anyons suddenly morph into the missing plaquette violations. As weird as this behavior seems, it is a manifestation of exactly the fusion rules that we expect these entities to obey. This establishes confidence that the D3Vs are, indeed, non-Abelian anyons.

Demonstration of anyonic fusion rules (starting with panel I, in the lower left). We form and separate two D3Vs (yellow triangles), then form two adjacent plaquette violations (red squares) and pass one between the D3Vs. The D3Vs’ deformation of the “chessboard” changes the bishop rules of the plaquette violations. While they used to lie on adjacent squares, they are now able to move along the same diagonals and collide (as shown by the red lines). When they do collide, they annihilate one another. The D3Vs are brought back together and surprisingly morph into the missing adjacent red plaquette violations.

Observation of non-Abelian exchange statistics

After establishing the fusion rules, we want to see the real smoking gun of non-Abelian anyons: non-Abelian exchange statistics. We create two pairs of non-Abelian anyons, then braid them by wrapping one from each pair around each other (shown below). When we fuse the two pairs back together, two pairs of plaquette violations appear. The simple act of braiding the anyons around one another changed the observables of our system. In other words, if you closed your eyes while the non-Abelian anyons were being exchanged, you would still be able to tell that they had been exchanged once you opened your eyes. This is the hallmark of non-Abelian statistics.

Braiding non-Abelian anyons. We make two pairs of D3Vs (panel II), then bring one from each pair around each other (III-XI). When fusing the two pairs together again in panel XII, two pairs of plaquette violations appear! Braiding the non-Abelian anyons changed the observables of the system from panel I to panel XII; a direct manifestation of non-Abelian exchange statistics.

Topological quantum computing

Finally, after establishing their fusion rules and exchange statistics, we demonstrate how we can use these particles in quantum computations. The non-Abelian anyons can be used to encode information, represented by logical qubits, which should be distinguished from the actual physical qubits used in the experiment. The number of logical qubits encoded in N D3Vs can be shown to be N/2–1, so we use N=8 D3Vs to encode three logical qubits, and perform braiding to entangle them. By studying the resulting state, we find that the braiding has indeed led to the formation of the desired, well-known quantum entangled state called the Greenberger-Horne-Zeilinger (GHZ) state.
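As a quick check of the counting rule above:

    \text{logical qubits} = \frac{N}{2} - 1 = \frac{8}{2} - 1 = 3.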

Using non-Abelian anyons as logical qubits. a, We braid the non-Abelian anyons to entangle three qubits encoded in eight D3Vs. b, Quantum state tomography allows for reconstructing the density matrix, which can be represented in a 3D bar plot and is found to be consistent with the desired highly entangled GHZ-state.

Conclusion

Our experiments show the first observation of non-Abelian exchange statistics, and that braiding of the D3Vs can be used to perform quantum computations. With future additions, including error correction during the braiding procedure, this could be a major step towards topological quantum computation, a long-sought method to endow qubits with intrinsic resilience against fluctuations and noise that would otherwise cause errors in computations.

Acknowledgements

We would like to thank Katie McCormick, our Quantum Science Communicator, for helping to write this blog post.

Read More

Use the AWS CDK to deploy Amazon SageMaker Studio lifecycle configurations

Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). Studio provides a single web-based visual interface where you can perform all ML development steps required to prepare data, as well as build, train, and deploy models. Lifecycle configurations are shell scripts triggered by Studio lifecycle events, such as starting a new Studio notebook. You can use lifecycle configurations to automate customization for your Studio environment. This customization includes installing custom packages, configuring notebook extensions, preloading datasets, and setting up source code repositories. For example, as an administrator for a Studio domain, you may want to save costs by having notebook apps shut down automatically after long periods of inactivity.

The AWS Cloud Development Kit (AWS CDK) is a framework for defining cloud infrastructure through code and provisioning it through AWS CloudFormation stacks. A stack is a collection of AWS resources that can be programmatically updated, moved, or deleted. AWS CDK constructs are the building blocks of AWS CDK applications, representing the blueprint to define cloud architectures.

In this post, we show how to use the AWS CDK to set up Studio, use Studio lifecycle configurations, and enable its access for data scientists and developers in your organization.

Solution overview

The modularity of lifecycle configurations allows you to apply them to all users in a domain or to specific users. This way, you can set up lifecycle configurations and reference them in the Studio kernel gateway or Jupyter server quickly and consistently. The kernel gateway is the entry point to interact with a notebook instance, whereas the Jupyter server represents the Studio instance. This enables you to apply DevOps best practices and meet safety, compliance, and configuration standards across all AWS accounts and Regions. For this post, we use Python as the main language, but the code can be easily changed to other AWS CDK supported languages. For more information, refer to Working with the AWS CDK.

Prerequisites

To get started, make sure you have the following prerequisites:

Clone the GitHub repository

First, clone the GitHub repository.

As you clone the repository, you can observe that we have a classic AWS CDK project with the directory studio-lifecycle-config-construct, which contains the construct and resources required to create lifecycle configurations.

AWS CDK constructs

The file we want to inspect is aws_sagemaker_lifecycle.py. This file contains the SageMakerStudioLifeCycleConfig construct we use to set up and create lifecycle configurations.

The SageMakerStudioLifeCycleConfig construct provides the framework for building lifecycle configurations using a custom AWS Lambda function and shell code read in from a file. The construct contains the following parameters:

  • ID – The name of the current project.
  • studio_lifecycle_content – The base64 encoded content.
  • studio_lifecycle_tags – Labels you assign to organize Amazon resources. They are inputted as key-value pairs and are optional for this configuration.
  • studio_lifecycle_config_app_type – JupyterServer applies to the Jupyter server itself, whereas the KernelGateway app type corresponds to a running SageMaker image container.

For more information on the Studio notebook architecture, refer to Dive deep into Amazon SageMaker Studio Notebooks architecture.

The following is a code snippet of the Studio lifecycle config construct (aws_sagemaker_lifecycle.py):

from aws_cdk import CustomResource, Duration, aws_iam as iam, aws_lambda as lambda_, custom_resources
from constructs import Construct


class SageMakerStudioLifeCycleConfig(Construct):
    def __init__(
        self,
        scope: Construct,
        id: str,
        studio_lifecycle_config_content: str,
        studio_lifecycle_config_app_type: str,
        studio_lifecycle_config_name: str,
        **kwargs,
    ):
        super().__init__(scope, id)
        self.studio_lifecycle_content = studio_lifecycle_config_content
        self.studio_lifecycle_config_name = studio_lifecycle_config_name
        self.studio_lifecycle_config_app_type = studio_lifecycle_config_app_type

        # Role assumed by the Lambda function that manages the lifecycle configuration
        lifecycle_config_role = iam.Role(
            self,
            "SmStudioLifeCycleConfigRole",
            assumed_by=iam.ServicePrincipal("lambda.amazonaws.com"),
        )

        lifecycle_config_role.add_to_policy(
            iam.PolicyStatement(
                resources=[f"arn:aws:sagemaker:{scope.region}:{scope.account}:*"],
                actions=[
                    "sagemaker:CreateStudioLifecycleConfig",
                    "sagemaker:ListUserProfiles",
                    "sagemaker:UpdateUserProfile",
                    "sagemaker:DeleteStudioLifecycleConfig",
                    "sagemaker:AddTags",
                ],
            )
        )

        # Lambda function that creates the lifecycle configuration via a custom resource
        create_lifecycle_script_lambda = lambda_.Function(
            self,
            "CreateLifeCycleConfigLambda",
            runtime=lambda_.Runtime.PYTHON_3_8,
            timeout=Duration.minutes(3),
            code=lambda_.Code.from_asset(
                "../mlsl-cdk-constructs-lib/src/studiolifecycleconfigconstruct"
            ),
            handler="onEvent.handler",
            role=lifecycle_config_role,
            environment={
                "studio_lifecycle_content": self.studio_lifecycle_content,
                "studio_lifecycle_config_name": self.studio_lifecycle_config_name,
                "studio_lifecycle_config_app_type": self.studio_lifecycle_config_app_type,
            },
        )

        config_custom_resource_provider = custom_resources.Provider(
            self,
            "ConfigCustomResourceProvider",
            on_event_handler=create_lifecycle_script_lambda,
        )

        studio_lifecycle_config_custom_resource = CustomResource(
            self,
            "LifeCycleCustomResource",
            service_token=config_custom_resource_provider.service_token,
        )
        # Expose the created lifecycle configuration ARN as an attribute of the construct
        self.studio_lifecycle_config_arn = studio_lifecycle_config_custom_resource.get_att("StudioLifecycleConfigArn")

After you import and install the construct, you can use it. The following code snippet shows how to create a lifecycle config using the construct in a stack either in app.py or another construct:

my_studio_lifecycle_config = SageMakerStudioLifeCycleConfig(
    self,
    "MLSLBlogPost",
    studio_lifecycle_config_content="base64content",
    studio_lifecycle_config_name="BlogPostTest",
    studio_lifecycle_config_app_type="JupyterServer",
)

Deploy AWS CDK constructs

To deploy your AWS CDK stack, run the following commands in the location where you cloned the repository.

The command may be python instead of python3 depending on your path configurations.

  1. Create a virtual environment:
    1. For macOS/Linux, use python3 -m venv .cdk-venv.
    2. For Windows, use python -m venv .cdk-venv.
  2. Activate the virtual environment:
    1. For macOS/Linux, use source .cdk-venv/bin/activate.
    2. For Windows, use .cdk-venv\Scripts\activate.bat.
    3. For PowerShell, use .cdk-venv\Scripts\activate.ps1.
  3. Install the required dependencies:
    1. pip install -r requirements.txt
    2. pip install -r requirements-dev.txt
  4. At this point, you can optionally synthesize the CloudFormation template for this code:
    cdk synth

  5. Deploy the solution with the following commands:
    1. aws configure
    2. cdk bootstrap
    3. cdk deploy

When the stack is successfully deployed, you should be able to view the stack on the CloudFormation console.

You will also be able to view the lifecycle configuration on the SageMaker console.

Choose the lifecycle configuration to view the shell code that runs as well as any tags you assigned.

Attach the Studio lifecycle configuration

There are multiple ways to attach a lifecycle configuration. In this section, we present two methods: using the AWS Management Console, and programmatically using the infrastructure provided.

Attach the lifecycle configuration using the console

To use the console, complete the following steps:

  1. On the SageMaker console, choose Domains in the navigation pane.
  2. Choose the domain name you’re using and the current user profile, then choose Edit.
  3. Select the lifecycle configuration you want to use and choose Attach.

From here, you can also set it as default.

Attach the lifecycle configuration programmatically

You can also retrieve the ARN of the Studio lifecycle configuration created by the construct and attach it to the Studio construct programmatically. The following code shows the lifecycle configuration ARN being passed to a Studio construct:

default_user_settings = sagemaker.CfnDomain.UserSettingsProperty(
    execution_role=self.sagemaker_role.role_arn,
    jupyter_server_app_settings=sagemaker.CfnDomain.JupyterServerAppSettingsProperty(
        default_resource_spec=sagemaker.CfnDomain.ResourceSpecProperty(
            instance_type="system",
            lifecycle_config_arn=my_studio_lifecycle_config.studio_lifecycle_config_arn,
        )
    ),
)

Clean up

Complete the steps in this section to clean up your resources.

Delete the Studio lifecycle configuration

To delete your lifecycle configuration, complete the following steps:

  1. On the SageMaker console, choose Studio lifecycle configurations in the navigation pane.
  2. Select the lifecycle configuration, then choose Delete.

Delete the AWS CDK stack

When you’re done with the resources you created, you can destroy your AWS CDK stack by running the following command in the location where you cloned the repository:

cdk destroy

When asked to confirm the deletion of the stack, enter yes.

You can also delete the stack on the AWS CloudFormation console with the following steps:

  1. On the AWS CloudFormation console, choose Stacks in the navigation pane.
  2. Choose the stack that you want to delete.
  3. In the stack details pane, choose Delete.
  4. Choose Delete stack when prompted.

If you run into any errors, you may have to manually delete some resources depending on your account configuration.

Conclusion

In this post, we discussed how Studio serves as an IDE for ML workloads. Studio offers lifecycle configuration support, which allows you to set up custom shell scripts to perform automated tasks, or set up development environments at launch. We used AWS CDK constructs to build the infrastructure for the custom resource and lifecycle configuration. Constructs are synthesized into CloudFormation stacks that are then deployed to create the custom resource and lifecycle script that is used in Studio and the notebook kernel.

For more information, visit Amazon SageMaker Studio.


About the Authors

Cory Hairston is a Software Engineer with the Amazon ML Solutions Lab. He currently works on providing reusable software solutions.

Alex Chirayath is a Senior Machine Learning Engineer at the Amazon ML Solutions Lab. He leads teams of data scientists and engineers to build AI applications to address business needs.

Gouri Pandeshwar is an Engineer Manager at the Amazon ML Solutions Lab. He and his team of engineers are working to build reusable solutions and frameworks that help accelerate adoption of AWS AI/ML services for customers’ business use cases.

Read More

Boost agent productivity with Salesforce integration for Live Call Analytics

As a contact center agent, would you rather focus on having productive customer conversations or get distracted by having to look up customer information and knowledge articles that could exist in various systems? We’ve all been there. Having a productive conversation while multitasking is challenging. A single negative experience may put a dent on a customer’s perception of your brand.

The Live Call Analytics with Agent Assist (LCA) open-source solution addresses these challenges by providing features such as AI-powered agent assistance, call transcription, call summarization, and much more. As part of our effort to meet the needs of your agents, we strive to add features based on your feedback and our own experience helping contact center operators.

One of the features we added is the ability to write your own AWS Lambda hooks for the start of call and post-call stages to apply custom processing to calls as they occur. This makes it easier to integrate custom logic with the LCA architecture without complex modification of the original source code. It also lets you update LCA stack deployments more easily and quickly than if you were modifying the code directly.

Today, we are excited to announce a feature that lets you integrate LCA with your Customer Relationship Management (CRM) system, built on top of the pre- and post-call Lambda hooks.

In this post, we walk you through setting up the LCA/CRM integration with Salesforce.

Solution overview

LCA now has two additional Lambda hooks:

  • Start of call Lambda hook – The LCA Call Event/Transcript Processor invokes this hook at the beginning of each call. This function can implement custom logic that applies to the beginning of call processing, such as retrieving call summary details logged into a case in a CRM.
  • Post-call summary Lambda hook – The LCA Call Event/Transcript Processor invokes this hook after the call summary is processed. This function can implement custom logic that’s relevant to post-processing, for example, updating the call summary in a CRM system (a minimal sketch of such a hook follows this list).
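The sketch below illustrates what a post-call summary hook might look like, using the simple-salesforce client. It is a hypothetical example: the event field names and environment variables are placeholders, and the actual hook code shipped with the integration stack may differ.

# Hypothetical post-call summary hook; event field names and environment
# variables are illustrative and may not match the actual LCA event schema.
import os
from simple_salesforce import Salesforce

def handler(event, context):
    phone_number = event.get("CustomerPhoneNumber")   # placeholder field name
    call_summary = event.get("CallSummaryText", "")   # placeholder field name

    sf = Salesforce(
        username=os.environ["SF_USERNAME"],
        password=os.environ["SF_PASSWORD"],
        security_token=os.environ["SF_ACCESS_TOKEN"],
    )

    # Match the caller's phone number to a Contact record.
    contacts = sf.query(
        f"SELECT Id FROM Contact WHERE Phone = '{phone_number}' LIMIT 1"
    )
    if contacts["totalSize"] == 0:
        return {"status": "no matching contact"}
    contact_id = contacts["records"][0]["Id"]

    # Update the most recent open Case for that Contact, or create one.
    cases = sf.query(
        f"SELECT Id FROM Case WHERE ContactId = '{contact_id}' "
        "AND IsClosed = false ORDER BY CreatedDate DESC LIMIT 1"
    )
    if cases["totalSize"] > 0:
        sf.Case.update(cases["records"][0]["Id"], {"Description": call_summary})
    else:
        sf.Case.create({"ContactId": contact_id, "Description": call_summary})
    return {"status": "ok"}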

The following diagram illustrates the start of call and post-call (summary) Lambda hooks that integrate with Salesforce to look up and update case records, respectively.

Start of call and Post call (summary) Lambda Hooks that integrate with Salesforce to look-up and update Case records respectively

Here are the steps we walk you through:

  1. Set up Salesforce to allow the custom Lambda hooks to look up or update the case records.
  2. Deploy the LCA and Salesforce integration stacks.
  3. Update the LCA stack with the Salesforce integration Lambda hooks and perform validations.

Prerequisites

You need the following prerequisites:

Create a Salesforce connected app

To set up your Salesforce app, complete the following steps:

  1. Log in to your Salesforce org and go to Setup.
  2. Search for App Manager and choose App Manager.
    Search for App Manager
  3. Choose New Connected App.
  4. For Connected App Name, enter a name.
  5. For Contact Email, enter a valid email.
  6. Select Enable OAuth Settings and enter a value for Callback URL.
  7. Under Available OAuth Scopes, choose Manage user data via APIs (api).
  8. Select Require Secret for Webserver Flow and Require Secret for Refresh Token Flow.
  9. Choose Save.
    New Connected App
  10. Under API (Enable OAuth Settings), choose Manage Consumer Details.
  11. Verify your identity if prompted.
  12. Copy the consumer key and consumer secret.

You need these when deploying the AWS Serverless Application Model (AWS SAM) application.

Get your Salesforce access token

If you don’t already have an access token, you need to obtain one. Before doing this, make sure that you’re prepared to update any applications that are using an access token because this step creates a new one and may invalidate the prior tokens.

  1. Find your personal information by choosing Settings from View profile on the top right.
  2. Choose Reset My Security Token followed by Reset Security Token.
    Reset Security Token
  3. Make note of the new access token that you receive via email.

Create a Salesforce customer contact record for each caller

The Lambda function that performs case look-up and update matches the caller’s phone number with a contact record in Salesforce. To create a new contact, complete the following steps:

  1. Log in to your Salesforce org.
  2. Under App Launcher, search for and choose Service Console.
    Service Console
  3. On the Service Console page, choose Contacts from the drop-down list, then choose New.
    Add new contact
  4. Enter a valid phone number under the Phone field of the New Contact page.
  5. Enter other contact details and choose Save.
  6. Repeat Steps 1–5 for any caller that makes a phone call and test the integration.

Deploy the LCA stack

Complete the following steps to deploy the LCA stack:

  1. Follow the instructions under the Deploy the CloudFormation stack section of Live call analytics and agent assist for your contact center with Amazon language AI services.
  2. Make sure that you choose ANTHROPIC, SAGEMAKER, or LAMBDA for the End of Call Transcript Summary parameter. See Transcript Summarization for more details.

The stacks take about 45 minutes to deploy.

  3. After the main stack shows CREATE_COMPLETE, on the Outputs tab, make a note of the Kinesis data stream ARN (CallDataStreamArn).

Deploy the Salesforce integration stack

To deploy the Salesforce integration stack, complete the following steps:

  1. Open a command-line terminal and run the following commands:
git clone https://github.com/aws-samples/amazon-transcribe-live-call-analytics.git
cd amazon-transcribe-live-call-analytics/plugins/salesforce-integration
sam build
sam deploy --guided

Use the following table as a reference for parameter choices.

Parameter Name – Description
AWS Region – The Region where you have deployed the LCA solution
SalesforceUsername – The user name of your Salesforce organization that has permissions to read and create cases
SalesforcePassword – The password associated with your Salesforce user name
SalesforceAccessToken – The access token you obtained earlier
SalesforceConsumerKey – The consumer key you copied earlier
SalesforceConsumerSecret – The consumer secret you obtained earlier
SalesforceHostUrl – The login URL of your Salesforce organization
SalesforceAPIVersion – The Salesforce API version (choose default or v56.0)
LCACallDataStreamArn – The Kinesis data stream ARN (CallDataStreamArn) obtained earlier
  2. After the stack successfully deploys, make a note of StartOfCallLambdaHookFunctionArn and PostCallSummaryLambdaHookFunctionArn from the outputs displayed on your terminal.

Update the LCA stack

Complete the following steps to update the LCA stack:

  1. On the AWS CloudFormation console, update the main LCA stack.
  2. Choose Use current template.
  3. For Lambda Hook Function ARN for Custom Start of Call Processing (existing), provide the StartOfCallLambdaHookFunctionArn that you obtained earlier.
  4. For Lambda Hook Function ARN for Custom Post Processing, after the Call Transcript Summary is processed (existing), provide the PostCallSummaryLambdaHookFunctionArn that you obtained earlier.
  5. Make sure that End of Call Transcript Summary is not DISABLED.

Validate the integration

Make a test call and make sure you can see the beginning of call AGENT ASSIST and post-call AGENT ASSIST transcripts. Refer to the Explore live call analysis and agent assist features section of the Live call analytics and agent assist for your contact center with Amazon language AI services post for guidance.

Clean up

To avoid incurring charges, clean up your resources by following these instructions when you are finished experimenting with this solution:

  1. On the AWS CloudFormation console, delete the LCA stacks that you deployed. This deletes resources that were created by deploying the solution. The recording S3 buckets, DynamoDB table, and CloudWatch log groups are retained after the stack is deleted to avoid deleting your data.
  2. On your terminal, run sam delete to delete the Salesforce integration Lambda functions.
  3. Follow the instructions in Deactivate a Developer Edition Org to deactivate your Salesforce Developer org.

Conclusion

In this post, we demonstrated how the Live-Call Analytics sample project can accelerate your adoption of real-time contact center analytics and integration. Rather than building from scratch, we show how to use the existing code base with the pre-built integration points with the start of call and post-call Lambda hooks. This enhances agent productivity by integrating with Salesforce to look up and update case records. Explore our open-source project and enhance the CRM pre- and post-call Lambda hooks to accommodate your use case.


About the Authors

Kishore Dhamodaran is a Senior Solutions Architect at AWS.

Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.

Christopher Lott is a Senior Solutions Architect in the AWS AI Language Services team. He has 20 years of enterprise software development experience. Chris lives in Sacramento, California and enjoys gardening, aerospace, and traveling the world.

Babu Srinivasan is a Sr. Specialist SA – Language AI services in the World Wide Specialist organization at AWS, with over 24 years of experience in IT and the last 6 years focused on the AWS Cloud. He is passionate about AI/ML. Outside of work, he enjoys woodworking and entertains friends and family (sometimes strangers) with sleight of hand card magic.

Read More

Shell-e-brate Good Times in 3D With ‘Kingsletter’ This Week ‘In the NVIDIA Studio’

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows. We’re also deep diving on new GeForce RTX 40 Series GPU features, technologies and resources, and how they dramatically accelerate content creation.

Amir Anbarestani, an accomplished 3D artist who goes by the moniker Kingsletter, had a “shell of a good time” creating his Space Turtle scene this week In the NVIDIA Studio.

Kingsletter has always harbored a fascination with 3D art, he said. As a child, he often enjoyed exploring and crafting within immersive environments. Whether it was playing with plasticine — putty-like modeling material — or creating pencil drawings, his innate inclination for self-expression always found resonance within the expansive domain of 3D.

Below, he shares his inspiration and creative process using ZBrush, Adobe Substance 3D Painter and Blender.

An NVIDIA DLSS 3 plug-in is now available in Unreal Engine 5, offering select benefits including AI upscaling for high frame rates, super resolution and more for GeForce RTX 40 Series owners.

And 3D creative app Marvelous Designer launches its NVIDIA Omniverse Connector this month, featured in the latest Into the Omniverse. Learn how talented artists are using the Connector, along with the Universal Scene Description (“OpenUSD”) framework, to elevate their creative workflows.

NVIDIA DLSS 3 Plug-In Is Unreal — Engine 5

NVIDIA Studio released a DLSS 3 plug-in compatible with Unreal Engine 5. The Play in Editor tool is useful for game developers to quickly review gameplay in a level while editing — and DLSS 3 AI upscaling will unlock significantly higher frame rates on GeForce RTX 40 Series GPUs for even smoother previewing.

NVIDIA DLSS 3 plug-in unlocks incredible visual details with DLSS 3 in Unreal Engine 5.

Plus, select Unreal Engine viewports offer DLSS 2 Super Resolution and upscaling benefits in typical content-creation workflows like modeling, lighting, animation and more.

Download DLSS 3 for Unreal Engine 5.2, available now. Learn more about NVIDIA technologies supported by Unreal Engine 5.

Turtle Recall 

The process began with sketching and initial sculpting in the ZBrush tool, where the concept of a floating turtle in space took shape and evolved into a dynamic shot of the creature soaring toward the camera.

“It’s remarkable how something as simple as shaping an idea’s basic form can be so immensely gratifying,” said Kingsletter on the blockout phase. “There’s a unique joy in starting with a blank canvas and gradually bringing the essence of a concept to life.”

Sketching and initial sculpting in ZBrush.

After finalizing the model in ZBrush, Kingsletter used ZRemesher to retopologize it, or generate a low-poly version suitable for the intended scene. This is useful for removing artifacts and other mesh issues before animation and rigging.

“NVIDIA graphics cards are industry leading in the creative community. I don’t think I know anyone that uses other GPUs.” — Kingsletter

The RIZOMUV UV mapping 3D software was then deployed for unwrapping the model, the process of opening a mesh to make a 2D texture that covers a 3D object. This is effective for adding textures to objects with precision, a common need for professional artists.

Next, Kingsletter applied surface details, from subtle dusting to extreme wear and tear, with materials mimicking real-world behaviors such as sheen, subsurface scattering and more in Adobe Substance 3D Painter. RTX-accelerated light and ambient occlusion enabled fully baked models in mere seconds.

Textures added and baked rapidly in Adobe Substance 3D Painter.

Kingsletter then moved to Blender to animate the scene, setting up simple rigs and curves to bring the turtle’s flapping limbs and flight to life. Harnessing the potential of his MSI Creator Z17 HX Studio A13V NVIDIA Studio laptop with GeForce RTX 4070 graphics turtle-ly exceeded the artist’s lofty expectations.

The MSI Creator Z17 HX Studio laptop with GeForce RTX 4070 graphics.

“As a digital creative professional, I always strive to work with the best creative tools available,” Kingsletter said. “Choosing the MSI Creator laptop allowed me to exceed my creative professional needs and indulge in my passionate gaming hobby.”

He enriched the cosmic environment using Blender’s particle system, which scattered random debris, asteroids and a small, rotating planet throughout the outer-space scene. AI-powered, RTX-accelerated OptiX ray tracing unlocked buttery-smooth interactive animations in the viewport.

Create magnificent worlds in Blender accelerated by GeForce RTX graphics.

“Simulating smoke proved to be the most challenging aspect,” said Kingsletter about his first foray into this form of animation. “Through numerous trials and errors, I persevered until I achieved a truly satisfactory result.”

Realistic smoke elevated the 3D animation.

His RTX 4070 GPU facilitated smoother, more efficient rendering of the final visuals with RTX-accelerated OptiX ray tracing in Blender Cycles, ensuring the fastest final frame render.

When asked what he’d advise his younger artist self, Kingsletter said, “I’d enhance my observation skills. By immersing myself in the intricacies of form and paying careful attention to the world around me, I would have laid a stronger foundation for my creative journey.”

Wise words for all creators.

Digital 3D artist Kingsletter.

Check out Kingsletter’s beautiful 3D creations on Instagram.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.

Read More

Into the Omniverse: Universal Scene Description Support for Marvelous Designer Lets Users Tailor Digital Assets, Clothes for 3D Characters

Editor’s note: This post is part of Into the Omniverse, a monthly series focused on how artists, developers and enterprises can transform their workflows using the latest advances in Universal Scene Description and NVIDIA Omniverse.

Whether animating fish fins or fashioning chic outfits for digital characters, creators can tap Marvelous Designer software to compose and tailor assets, clothes and other materials for their 3D workflows.

Marvelous Designer recently launched an Omniverse Connector, a tool that enhances collaborative workflows that take place between its software and NVIDIA Omniverse, a development platform for connecting and building 3D tools and applications.

The Connector enables users to significantly speed and ease their design processes, thanks to its support for the Universal Scene Description framework, known as OpenUSD, which serves as a common language between 3D tools.

In a typical computer graphics pipeline, an artist needs to go back and forth between software in finalizing their work. The new Omniverse Connector enables creators to save time with Marvelous Designer’s improved import and export capabilities through OpenUSD.

In a recent livestream, 3D designer Brandon Yu shared how he’s using the new Connector and OpenUSD to improve his collaborative workflow, enhance productivity, expand creative possibilities and streamline his design process.

Mike Shawbrook, who has more than 150,000 subscribers on his MH Tutorials YouTube channel, walks through using the new Connector in the tutorial below. Shawbrook demonstrates how he set up a live session between Marvelous Designer and Omniverse to create a simple cloth blanket.

For more, check out this tutorial on using the new Connector and see how OpenUSD can improve 3D workflows:

Improved USD Compatibility

With the Marvelous Designer Omniverse Connector, users can harness the real-time rendering capabilities of Omniverse to visualize their garments in an interactive environment. This integration empowers creators to make informed design decisions, preview garments’ reactions to different lighting conditions and simulate realistic fabric behavior in real time.

The Connector’s expanded support for OpenUSD enables seamless interchange of 3D data between creative applications.

In the graphic above, an artist uses the new connector to adjust 3D-animated fish fins, a key digital material in an underwater scene.

Get Plugged Into the Omniverse 

To learn more about how OpenUSD can improve 3D workflows, check out a new video series on the file framework. The first installment covers four OpenUSD “superpowers.”

Anyone can build their own Omniverse extension or Connector to enhance their 3D workflows and tools.

Share your Marvelous Designer and Omniverse creations to the Omniverse gallery for a chance to be featured on NVIDIA social media channels.

Get started with NVIDIA Omniverse by downloading the standard license free, or learn how Omniverse Enterprise can connect your team. Developers can get started with Omniverse resources and learn about OpenUSD. Explore the growing ecosystem of 3D tools connected to Omniverse.

Stay up to date on the platform by subscribing to the newsletter, and follow NVIDIA Omniverse on Instagram, Medium and Twitter. For more, join the Omniverse community and check out the Omniverse forums, Discord server, Twitch and YouTube channels. 

Featured image courtesy of Marvelous Designer.

Read More