November 2023 – Page 18

Gen AI for the Genome: LLM Predicts Characteristics of COVID Variants

A widely acclaimed large language model for genomic data has demonstrated its ability to generate gene sequences that closely resemble real-world variants of SARS-CoV-2, the virus behind COVID-19.

Called GenSLMs, the model, which last year won the Gordon Bell special prize for high performance computing-based COVID-19 research, was trained on a dataset of nucleotide sequences — the building blocks of DNA and RNA. It was developed by researchers from Argonne National Laboratory, NVIDIA, the University of Chicago and a score of other academic and commercial collaborators.

When the researchers looked back at the nucleotide sequences generated by GenSLMs, they discovered that specific characteristics of the AI-generated sequences closely matched the real-world Eris and Pirola subvariants that have been prevalent this year — even though the AI was only trained on COVID-19 virus genomes from the first year of the pandemic.

“Our model’s generative process is extremely naive, lacking any specific information or constraints around what a new COVID variant should look like,” said Arvind Ramanathan, lead researcher on the project and a computational biologist at Argonne. “The AI’s ability to predict the kinds of gene mutations present in recent COVID strains — despite having only seen the Alpha and Beta variants during training — is a strong validation of its capabilities.”

In addition to generating its own sequences, GenSLMs can also classify and cluster different COVID genome sequences by distinguishing between variants. In a demo coming soon to NGC, NVIDIA’s hub for accelerated software, users can explore visualizations of GenSLMs’ analysis of the evolutionary patterns of various proteins within the COVID viral genome.

Reading Between the Lines, Uncovering Evolutionary Patterns

A key feature of GenSLMs is its ability to interpret long strings of nucleotides — represented with sequences of the letters A, T, G and C in DNA, or A, U, G and C in RNA — in the same way an LLM trained on English text would interpret a sentence. This capability enables the model to understand the relationship between different areas of the genome, which in coronaviruses consists of around 30,000 nucleotides.
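To make the sentence analogy concrete, here is a minimal sketch of turning a nucleotide string into word-like tokens. The fixed-length, non-overlapping 3-mer split and the function name `tokenize_kmers` are illustrative assumptions; the article does not describe the tokenizer GenSLMs actually uses.

```python
def tokenize_kmers(sequence: str, k: int = 3) -> list[str]:
    """Split a nucleotide string into non-overlapping k-mers (assumed scheme)."""
    # Map the RNA letter U to the DNA letter T so both alphabets share one vocabulary.
    seq = sequence.upper().replace("U", "T")
    if not set(seq) <= set("ATGC"):
        raise ValueError("sequence contains characters other than A, T/U, G, C")
    # Drop a trailing fragment shorter than k so every token has equal length.
    usable = len(seq) - len(seq) % k
    return [seq[i:i + k] for i in range(0, usable, k)]


tokens = tokenize_kmers("AUGGCUUAA")
print(tokens)
```

Each token then plays the role a word plays in an English sentence, so a genome of around 30,000 nucleotides becomes a sequence of roughly 10,000 such tokens under this assumed scheme.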

In the demo, users will be able to choose from among eight different COVID variants to understand how the AI model tracks mutations across various proteins of the viral genome. The visualization depicts evolutionary couplings across the viral proteins — highlighting which snippets of the genome are likely to be seen in a given variant.

“Understanding how different parts of the genome are co-evolving gives us clues about how the virus may develop new vulnerabilities or new forms of resistance,” Ramanathan said. “Looking at the model’s understanding of which mutations are particularly strong in a variant may help scientists with downstream tasks like determining how a specific strain can evade the human immune system.”

GenSLMs was trained on more than 110 million prokaryotic genome sequences and fine-tuned with a global dataset of around 1.5 million COVID viral sequences using open-source data from the Bacterial and Viral Bioinformatics Resource Center. In the future, the model could be fine-tuned on the genomes of other viruses or bacteria, enabling new research applications.

To train the model, the researchers used NVIDIA A100 Tensor Core GPU-powered supercomputers, including Argonne’s Polaris system, the U.S. Department of Energy’s Perlmutter and NVIDIA’s Selene.

The GenSLMs research team’s Gordon Bell special prize was awarded at last year’s SC22 supercomputing conference. At this week’s SC23, in Denver, NVIDIA is sharing a new range of groundbreaking work in the field of accelerated computing.

NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics. Learn more about NVIDIA Research and subscribe to NVIDIA healthcare news.

Main image courtesy of Argonne National Laboratory’s Bharat Kale.

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. DOE Office of Science and the National Nuclear Security Administration. Research was supported by the DOE through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19, with funding from the Coronavirus CARES Act.

Researchers Poised for Advances With NVIDIA CUDA Quantum

Michael Kuehn and Davide Vodola are taking to new heights work that’s pioneering quantum computing for the world’s largest chemical company.

The BASF researchers are demonstrating how a quantum algorithm can see what no traditional simulation can — key attributes of NTA, a compound with applications that include removing toxic metals like iron from a city’s wastewater.

The quantum computing team at BASF simulated on GPUs how the equivalent of 24 qubits — the processing engines of a quantum computer — can tackle the challenge.

Many corporate R&D centers would consider that a major achievement, but the team pressed on and recently ran its first 60-qubit simulations on NVIDIA’s Eos H100 Supercomputer.

“It’s the largest simulation of a molecule using a quantum algorithm we’ve ever run,” said Kuehn.

Flexible, Friendly Software

BASF is running the simulation on NVIDIA CUDA Quantum, a platform for programming CPUs, GPUs and quantum computers, also known as QPUs.

Vodola described it as “very flexible and user friendly, letting us build up a complex quantum circuit simulation from relatively simple building blocks. Without CUDA Quantum, it would be impossible to run this simulation,” he said.

The work requires a lot of heavy lifting, too, so BASF turned to an NVIDIA DGX Cloud service that uses NVIDIA H100 Tensor Core GPUs.

“We need a lot of computing power, and the NVIDIA platform is significantly faster than CPU-based hardware for this kind of simulation,” said Kuehn.

BASF’s quantum computing initiative, which Kuehn helped launch, started in 2017. In addition to its work in chemistry, the team is developing use cases for quantum computing in machine learning as well as optimizations for logistics and scheduling.

An Expanding CUDA Quantum Community

Other research groups are also advancing science with CUDA Quantum.

At SUNY Stony Brook, researchers are pushing the boundaries of high-energy physics to simulate complex interactions of subatomic particles. Their work promises new discoveries in fundamental physics.

“CUDA Quantum enables us to do quantum simulations that would otherwise be impossible,” said Dmitri Kharzeev, a SUNY professor and scientist at Brookhaven National Lab.

In addition, a research team at Hewlett Packard Labs is using the Perlmutter supercomputer to explore magnetic phase transition in quantum chemistry in one of the largest simulations of its kind. The effort could reveal important and unknown details of physical processes too difficult to model with conventional techniques.

“As quantum computers progress toward useful applications, high-performance classical simulations will be key for prototyping novel quantum algorithms,” said Kirk Bresniker, a chief architect at Hewlett Packard Labs. “Simulating and learning from quantum data are promising avenues toward tapping quantum computing’s potential.”

A Quantum Center for Healthcare

These efforts come as support for CUDA Quantum expands worldwide.

Classiq — an Israeli startup that already has more than 400 universities using its novel approach to writing quantum programs — announced today a new research center at the Tel Aviv Sourasky Medical Center, Israel’s largest teaching hospital.

Created in collaboration with NVIDIA, it will train experts in life science to write quantum applications that could someday help doctors diagnose diseases or accelerate the discovery of new drugs.

Classiq created quantum design software that automates low-level tasks, so developers don’t need to know all the complex details of how a quantum computer works. It’s now being integrated with CUDA Quantum.

Terra Quantum, a quantum services company with headquarters in Germany and Switzerland, is developing hybrid quantum applications for life sciences, energy, chemistry and finance that will run on CUDA Quantum. And IQM in Finland is enabling its superconducting QPU to use CUDA Quantum.

Quantum Loves Grace Hopper

Several companies, including Oxford Quantum Circuits, will use NVIDIA Grace Hopper Superchips to power their hybrid quantum efforts. Based in Reading, England, Oxford Quantum is using Grace Hopper in a hybrid QPU/GPU system programmed by CUDA Quantum.

Quantum Machines announced that the Israeli National Quantum Center will be the first deployment of NVIDIA DGX Quantum, a system using Grace Hopper Superchips. Based in Tel Aviv, the center will tap DGX Quantum to power quantum computers from Quantware, ORCA Computing and more.

In addition, Grace Hopper is being put to work by qBraid, in Chicago, to build a quantum cloud service, and Fermioniq, in Amsterdam, to develop tensor-network algorithms.

The large quantity of shared memory and the memory bandwidth of Grace Hopper make these superchips an excellent fit for memory-hungry quantum simulations.
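A rough calculation shows why memory matters for these workloads. The sketch below assumes a full state-vector simulation that stores one double-precision complex amplitude (16 bytes) per basis state; the article does not say which simulation method each group uses, and other approaches such as tensor networks scale differently.

```python
def state_vector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory for a full state vector: 2**n complex amplitudes."""
    return (2 ** num_qubits) * bytes_per_amplitude


for n in (24, 30, 34):
    gib = state_vector_bytes(n) / 2**30
    print(f"{n} qubits: {gib:,.2f} GiB")
```

Under this assumption, every additional qubit doubles the memory requirement, which is why large, high-bandwidth memory is valuable for simulation.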

Get started programming hybrid quantum systems today with the latest release of CUDA Quantum from NGC, NVIDIA’s catalog of accelerated software, or GitHub.

NVIDIA Grace Hopper Superchip Powers 40+ AI Supercomputers Across Global Research Centers, System Makers, Cloud Providers

Dozens of new supercomputers for scientific computing will soon hop online, powered by NVIDIA’s breakthrough GH200 Grace Hopper Superchip for giant-scale AI and high performance computing.

The NVIDIA GH200 enables scientists and researchers to tackle the world’s most challenging problems by accelerating complex AI and HPC applications running terabytes of data.

At the SC23 supercomputing show, NVIDIA today announced that the superchip is coming to more systems worldwide, including from Dell Technologies, Eviden, Hewlett Packard Enterprise (HPE), Lenovo, QCT and Supermicro.

Bringing together the Arm-based NVIDIA Grace CPU and Hopper GPU architectures using NVIDIA NVLink-C2C interconnect technology, GH200 also serves as the engine behind scientific supercomputing centers across the globe.

Combined, these GH200-powered centers represent some 200 exaflops of AI performance to drive scientific innovation.

HPE Cray Supercomputers Integrate NVIDIA Grace Hopper

At the show in Denver, HPE announced it will offer HPE Cray EX2500 supercomputers with the NVIDIA Grace Hopper Superchip. The integrated solution will feature quad GH200 processors, scaling up to tens of thousands of Grace Hopper Superchip nodes to provide organizations with unmatched supercomputing agility and quicker AI training. This configuration will also be part of a supercomputing solution for generative AI that HPE introduced today.

“Organizations are rapidly adopting generative AI to accelerate business transformations and technological breakthroughs,” said Justin Hotard, executive vice president and general manager of HPC, AI and Labs at HPE. “Working with NVIDIA, we’re excited to deliver a full supercomputing solution for generative AI, powered by technologies like Grace Hopper, which will make it easy for customers to accelerate large-scale AI model training and tuning at new levels of efficiency.”

Next-Generation AI Supercomputing Centers

A vast array of the world’s supercomputing centers are powered by NVIDIA Grace Hopper systems. Several top centers announced at SC23 that they’re now integrating GH200 systems for their supercomputers.

Germany’s Jülich Supercomputing Centre will use GH200 superchips in JUPITER, set to become the first exascale supercomputer in Europe. The supercomputer will help tackle urgent scientific challenges, such as mitigating climate change, combating pandemics and bolstering sustainable energy production.

Japan’s Joint Center for Advanced High Performance Computing — established between the Center for Computational Sciences at the University of Tsukuba and the Information Technology Center at the University of Tokyo — promotes advanced computational sciences integrated with data analytics, AI and machine learning across academia and industry. Its next-generation supercomputer will be powered by NVIDIA Grace Hopper.

The Texas Advanced Computing Center, based in Austin, Texas, designs and operates some of the world’s most powerful computing resources. The center will power its Vista supercomputer with NVIDIA GH200 for low power and high-bandwidth memory to deliver more computation while enabling bigger models to run with greater efficiency.

The National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign will tap NVIDIA Grace Hopper superchips to power DeltaAI, an advanced computing and data resource set to triple NCSA’s AI-focused computing capacity.

And, the University of Bristol recently received funding from the UK government to build Isambard-AI, set to be the country’s most powerful supercomputer, which will enable AI-driven breakthroughs in robotics, big data, climate research and drug discovery. The new system, being built by HPE, will be equipped with over 5,000 NVIDIA GH200 Grace Hopper Superchips, providing 21 exaflops of AI supercomputing power capable of making 21 quintillion AI calculations per second.

These systems join previously announced next-generation Grace Hopper systems from the Swiss National Supercomputing Centre, Los Alamos National Laboratory and SoftBank Corp.

GH200 Shipping Globally and Available in Early Access from CSPs

GH200 is available in early access from select cloud service providers such as Lambda and Vultr. Oracle Cloud Infrastructure today announced plans to offer GH200 instances, while CoreWeave detailed plans for early availability of its GH200 instances starting in Q1 2024.

Other system manufacturers such as ASRock Rack, ASUS, GIGABYTE and Ingrasys will begin shipping servers with the superchips by the end of the year.

NVIDIA Grace Hopper has been adopted in early access for supercomputing initiatives by more than 100 enterprises, organizations and government agencies across the globe, including the NASA Ames Research Center for aeronautics research and global energy company TotalEnergies.

In addition, the GH200 will soon become available through NVIDIA LaunchPad, which provides free access to enterprise NVIDIA hardware and software through an internet browser.

Learn more about Grace Hopper and other supercomputing breakthroughs by joining NVIDIA at SC23.

Taking legal action to protect users of AI and small businesses

Today we’re taking action to protect users of Google’s Bard AI as well as against fraudsters who sought to weaponize copyright law for profit.

MARRS: Multimodal Reference Resolution System

*= All authors listed contributed equally to this work
Successfully handling context is essential for any dialog understanding task. This context may be conversational (relying on previous user queries or system responses), visual (relying on what the user sees, for example, on their screen), or background (based on signals such as a ringing alarm or playing music). In this work, we present an overview of MARRS, or Multimodal Reference Resolution System, an on-device framework within a Natural Language Understanding system, responsible for handling conversational, visual and background…
Apple Machine Learning Research

Build trust and safety for generative AI applications with Amazon Comprehend and LangChain

We are witnessing a rapid increase in the adoption of large language models (LLMs) that power generative AI applications across industries. LLMs are capable of a variety of tasks, such as generating creative content, answering inquiries via chatbots, generating code, and more.

Organizations looking to use LLMs to power their applications are increasingly concerned about data privacy and want to ensure that trust and safety are maintained within their generative AI applications. This includes handling customers’ personally identifiable information (PII) properly. It also includes preventing abusive and unsafe content from being propagated to LLMs and checking that data generated by LLMs follows the same principles.

In this post, we discuss new features powered by Amazon Comprehend that enable seamless integration to ensure data privacy, content safety, and prompt safety in new and existing generative AI applications.

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning (ML) to uncover information in unstructured data and text within documents. In this post, we discuss why trust and safety with LLMs matter for your workloads. We also delve deeper into how these new moderation capabilities are utilized with the popular generative AI development framework LangChain to introduce a customizable trust and safety mechanism for your use case.

Why trust and safety with LLMs matter

Trust and safety are paramount when working with LLMs due to their profound impact on a wide range of applications, from customer support chatbots to content generation. As these models process vast amounts of data and generate humanlike responses, the potential for misuse or unintended outcomes increases. Ensuring that these AI systems operate within ethical and reliable boundaries is crucial, not just for the reputation of businesses that utilize them, but also for preserving the trust of end-users and customers.

Moreover, as LLMs become more integrated into our daily digital experiences, their influence on our perceptions, beliefs, and decisions grows. Ensuring trust and safety with LLMs goes beyond just technical measures; it speaks to the broader responsibility of AI practitioners and organizations to uphold ethical standards. By prioritizing trust and safety, organizations not only protect their users, but also ensure sustainable and responsible growth of AI in society. Prioritizing trust and safety can also help reduce the risk of generating harmful content and help organizations adhere to regulatory requirements.

In the realm of trust and safety, content moderation is a mechanism that addresses various aspects, including but not limited to:

Privacy – Users can inadvertently provide text that contains sensitive information, jeopardizing their privacy. Detecting and redacting any PII is essential.
Toxicity – Recognizing and filtering out harmful content, such as hate speech, threats, or abuse, is of utmost importance.
User intention – Identifying whether the user input (prompt) is safe or unsafe is critical. Unsafe prompts can explicitly or implicitly express malicious intent, such as requesting personal or private information and generating offensive, discriminatory, or illegal content. Prompts may also implicitly express or request advice on medical, legal, political, controversial, personal, or financial subjects.

Content moderation with Amazon Comprehend

In this section, we discuss the benefits of content moderation with Amazon Comprehend.

Addressing privacy

Amazon Comprehend already addresses privacy through its existing PII detection and redaction abilities via the DetectPIIEntities and ContainsPIIEntities APIs. These two APIs are backed by NLP models that can detect a large number of PII entities such as Social Security numbers (SSNs), credit card numbers, names, addresses, phone numbers, and so on. For a full list of entities, refer to PII universal entity types. DetectPIIEntities also provides the character-level position of each PII entity within a text; for example, for the NAME entity (John Doe) in the sentence “My name is John Doe”, the begin offset is 11 (counting from zero) and the end offset is 19. These offsets can be used to perform masking or redaction of the values, thereby reducing the risk of private data propagating into LLMs.
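The offset-based masking described above can be sketched without calling AWS. The entity dictionaries below are hand-built stand-ins that mirror the Type, Score, BeginOffset, and EndOffset fields of a PII detection result; the `redact` helper, its `threshold` default, and the assumption that the end offset is exclusive (as in Python slicing) are illustrative choices, not part of the service.

```python
def redact(text: str, entities: list[dict], mask_character: str = "X",
           threshold: float = 0.5) -> str:
    """Mask every entity span whose Score meets the threshold."""
    chars = list(text)
    for entity in entities:
        if entity["Score"] >= threshold:
            begin, end = entity["BeginOffset"], entity["EndOffset"]
            # Replace the span character by character so the text length is unchanged.
            chars[begin:end] = mask_character * (end - begin)
    return "".join(chars)


text = "My SSN is 123-45-6789."
begin = text.index("123-45-6789")
# Hand-built stand-in for the entity list a detection call would return.
entities = [{"Type": "SSN", "Score": 0.99,
             "BeginOffset": begin, "EndOffset": begin + len("123-45-6789")}]
print(redact(text, entities))
```

Keeping the masked text the same length as the original means offsets for any remaining entities stay valid after each replacement.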

Addressing toxicity and prompt safety

Today, we are announcing two new Amazon Comprehend features in the form of APIs: Toxicity detection via the DetectToxicContent API, and prompt safety classification via the ClassifyDocument API. Note that DetectToxicContent is a new API, whereas ClassifyDocument is an existing API that now supports prompt safety classification.

Toxicity detection

With Amazon Comprehend toxicity detection, you can identify and flag content that may be harmful, offensive, or inappropriate. This capability is particularly valuable for platforms where users generate content, such as social media sites, forums, chatbots, comment sections, and applications that use LLMs to generate content. The primary goal is to maintain a positive and safe environment by preventing the dissemination of toxic content.

At its core, the toxicity detection model analyzes text to determine the likelihood of it containing hateful content, threats, obscenities, or other forms of harmful text. The model is trained on vast datasets containing examples of both toxic and nontoxic content. The toxicity API evaluates a given piece of text and returns a toxicity classification and a confidence score. Generative AI applications can then use this information to take appropriate actions, such as stopping the text from propagating to LLMs. As of this writing, the labels detected by the toxicity detection API are HATE_SPEECH, GRAPHIC, HARASSMENT_OR_ABUSE, SEXUAL, VIOLENCE_OR_THREAT, INSULT, and PROFANITY. The following code demonstrates the API call with Python Boto3 for Amazon Comprehend toxicity detection:

import boto3

client = boto3.client('comprehend')
# Each text segment is scored separately.
response = client.detect_toxic_content(
    TextSegments=[{"Text": "What is the capital of France?"},
                  {"Text": "Where do I find good baguette in France?"}],
    LanguageCode='en')
print(response)
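Once a response comes back, an application has to decide which segments to block. The following is a minimal sketch of that step; it assumes the response contains a ResultList with one entry per text segment, each holding an overall Toxicity score and a list of Labels with Name and Score. The sample response is made up for illustration, and `flag_toxic_segments` and its threshold are hypothetical names, not part of Boto3.

```python
def flag_toxic_segments(response: dict, threshold: float = 0.6) -> list[tuple[int, list[str]]]:
    """Return (segment index, label names) for segments with any label above the threshold."""
    flagged = []
    for index, result in enumerate(response.get("ResultList", [])):
        labels = [label["Name"] for label in result.get("Labels", [])
                  if label["Score"] > threshold]
        if labels:
            flagged.append((index, labels))
    return flagged


# Made-up response with the assumed shape; the scores are invented.
sample_response = {
    "ResultList": [
        {"Toxicity": 0.01, "Labels": [{"Name": "INSULT", "Score": 0.02},
                                       {"Name": "PROFANITY", "Score": 0.01}]},
        {"Toxicity": 0.85, "Labels": [{"Name": "INSULT", "Score": 0.91},
                                       {"Name": "PROFANITY", "Score": 0.12}]},
    ]
}
print(flag_toxic_segments(sample_response))
```

An application could then stop any flagged segment from being sent on to the LLM.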

Prompt safety classification

Prompt safety classification with Amazon Comprehend helps classify an input text prompt as safe or unsafe. This capability is crucial for applications like chatbots, virtual assistants, or content moderation tools where understanding the safety of a prompt can determine responses, actions, or content propagation to LLMs.

In essence, prompt safety classification analyzes human input for any explicit or implicit malicious intent, such as requesting personal or private information and generation of offensive, discriminatory, or illegal content. It also flags prompts looking for advice on medical, legal, political, controversial, personal, or financial subjects. Prompt classification returns two classes, UNSAFE_PROMPT and SAFE_PROMPT, for an associated text, with a confidence score for each. Each confidence score ranges from 0 to 1, and the two scores sum to 1. For instance, in a customer support chatbot, the text “How do I reset my password?” signals an intent to seek guidance on password reset procedures and is labeled as SAFE_PROMPT. By contrast, a statement like “I wish something bad happens to you” can be flagged for having a potentially harmful intent and labeled as UNSAFE_PROMPT. It’s important to note that prompt safety classification is primarily focused on detecting intent from human inputs (prompts), rather than machine-generated text (LLM outputs). The following code demonstrates how to access the prompt safety classification feature with the ClassifyDocument API:

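import boto3

client = boto3.client('comprehend')
# Example values; replace <region> with your AWS Region.
prompt_value = "How do I reset my password?"
endpoint_arn = "arn:aws:comprehend:<region>:aws:document-classifier-endpoint/prompt-safety"
response = client.classify_document(
    Text=prompt_value,
    EndpointArn=endpoint_arn)
print(response)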

Note that endpoint_arn in the preceding code is an AWS-provided Amazon Resource Name (ARN) of the pattern arn:aws:comprehend:<region>:aws:document-classifier-endpoint/prompt-safety, where <region> is the AWS Region of your choice where Amazon Comprehend is available.
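As a small illustration, the ARN pattern can be filled in per Region and the two class scores compared against a cutoff. This sketch assumes the classification response carries a Classes list of Name and Score pairs; the helper names, the 0.8 cutoff, and the sample response are hypothetical and no AWS call is made.

```python
def prompt_safety_endpoint_arn(region: str) -> str:
    """Fill the Region into the ARN pattern given in the text."""
    return f"arn:aws:comprehend:{region}:aws:document-classifier-endpoint/prompt-safety"


def is_unsafe(response: dict, threshold: float = 0.8) -> bool:
    """True when the UNSAFE_PROMPT score meets the chosen threshold."""
    scores = {c["Name"]: c["Score"] for c in response.get("Classes", [])}
    return scores.get("UNSAFE_PROMPT", 0.0) >= threshold


# Made-up response with the assumed shape; the scores are invented.
sample_response = {"Classes": [{"Name": "SAFE_PROMPT", "Score": 0.95},
                                {"Name": "UNSAFE_PROMPT", "Score": 0.05}]}
print(prompt_safety_endpoint_arn("us-east-1"))
print(is_unsafe(sample_response))
```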

To demonstrate these capabilities, we built a sample chat application where we ask an LLM to extract PII entities such as address, phone number, and SSN from a given piece of text. The LLM finds and returns the appropriate PII entities, as shown in the image on the left.

With Amazon Comprehend moderation, we can redact the input to the LLM and output from the LLM. In the image on the right, the SSN value is allowed to be passed to the LLM without redaction. However, any SSN value in the LLM’s response is redacted.

The following is an example of how a prompt containing PII can be prevented from reaching the LLM altogether. This example demonstrates a user asking a question that contains PII. We use Amazon Comprehend moderation to detect PII entities in the prompt and show an error by interrupting the flow.

The preceding chat examples showcase how Amazon Comprehend moderation applies restrictions on data being sent to an LLM. In the following sections, we explain how this moderation mechanism is implemented using LangChain.

Integration with LangChain

With the endless possibilities of the application of LLMs into various use cases, it has become equally important to simplify the development of generative AI applications. LangChain is a popular open source framework that makes it effortless to develop generative AI applications. Amazon Comprehend moderation extends the LangChain framework to offer PII identification and redaction, toxicity detection, and prompt safety classification capabilities via AmazonComprehendModerationChain.

AmazonComprehendModerationChain is a custom implementation of the LangChain base chain interface. This means that applications can use this chain with their own LLM chains to apply the desired moderation to the input prompt as well as to the output text from the LLM. Chains can be built by merging numerous chains or by mixing chains with other components. You can use AmazonComprehendModerationChain with other LLM chains to develop complex AI applications in a modular and flexible manner.

To explain it further, we provide a few samples in the following sections. The source code for the AmazonComprehendModerationChain implementation can be found within the LangChain open source repository. For full documentation of the API interface, refer to the LangChain API documentation for the Amazon Comprehend moderation chain. Using this moderation chain is as simple as initializing an instance of the class with default configurations:

from langchain_experimental.comprehend_moderation import AmazonComprehendModerationChain

comprehend_moderation = AmazonComprehendModerationChain()

Behind the scenes, the moderation chain performs three consecutive moderation checks, namely PII, toxicity, and prompt safety, as explained in the following diagram. This is the default flow for the moderation.

The following code snippet shows a simple example of using the moderation chain with the Amazon FalconLite LLM (which is a quantized version of the Falcon 40B SFT OASST-TOP1 model) hosted in Hugging Face Hub:

from langchain import HuggingFaceHub
from langchain import PromptTemplate, LLMChain
from langchain_experimental.comprehend_moderation import AmazonComprehendModerationChain

template = """Question: {question}
Answer:"""
repo_id = "amazon/FalconLite"
prompt = PromptTemplate(template=template, input_variables=["question"])
llm = HuggingFaceHub(
    repo_id=repo_id,
    model_kwargs={"temperature": 0.5, "max_length": 256}
)
comprehend_moderation = AmazonComprehendModerationChain(verbose=True)
chain = (
    prompt 
    | comprehend_moderation 
    | { "input" : (lambda x: x['output']) | llm }  
    | comprehend_moderation
)

try:
    response = chain.invoke({"question": "An SSN is of the format 123-45-6789. Can you give me John Doe's SSN?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

In the preceding example, we augment our chain with comprehend_moderation for both text going into the LLM and text generated by the LLM. This will perform default moderation that will check PII, toxicity, and prompt safety classification in that sequence.

Customize your moderation with filter configurations

You can use the AmazonComprehendModerationChain with specific configurations, which gives you the ability to control what moderations you wish to perform in your generative AI–based application. At the core of the configuration, you have three filter configurations available.

ModerationPiiConfig – Used to configure PII filter.
ModerationToxicityConfig – Used to configure toxic content filter.
ModerationPromptSafetyConfig – Used to configure prompt safety filter.

You can use each of these filter configurations to customize how your moderations behave. Each filter configuration can be initialized with a few common parameters and some parameters unique to that filter. After you define the configurations, you use the BaseModerationConfig class to define the sequence in which the filters must apply to the text. For example, in the following code, we first define the three filter configurations, and subsequently specify the order in which they must apply:

from langchain_experimental.comprehend_moderation import (
    AmazonComprehendModerationChain,
    BaseModerationConfig,
    ModerationPromptSafetyConfig,
    ModerationPiiConfig,
    ModerationToxicityConfig,
)

pii_config = ModerationPiiConfig(labels=["SSN"],
                                 redact=True,
                                 mask_character="X")
toxicity_config = ModerationToxicityConfig(threshold=0.6)
prompt_safety_config = ModerationPromptSafetyConfig(threshold=0.8)
# Filters run in list order: toxicity, then PII, then prompt safety.
moderation_config = BaseModerationConfig(filters=[toxicity_config,
                                                  pii_config,
                                                  prompt_safety_config])
comprehend_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config)

Let’s dive a little deeper to understand what this configuration achieves:

First, for the toxicity filter, we specified a threshold of 0.6. This means that if the text contains any of the available toxic labels or entities with a score greater than the threshold, the whole chain will be interrupted.
If there is no toxic content found in the text, a PII check is performed. In this case, we’re interested in checking if the text contains SSN values. Because the redact parameter is set to True, the chain will mask the detected SSN values (if any) where the SSN entity’s confidence score is greater than or equal to 0.5, with the mask character specified (X). If redact is set to False, the chain will be interrupted for any SSN detected.
Finally, the chain performs prompt safety classification, and will stop the content from propagating further down the chain if the content is classified with UNSAFE_PROMPT with a confidence score of greater than or equal to 0.8.
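The three steps above can be sketched as a toy pipeline. This is not the library's implementation: the scores are passed in as arguments rather than computed by Amazon Comprehend, a regular expression stands in for SSN detection, and the exception classes and the `moderate` function are hypothetical names used only to show the ordering and the interrupt behavior.

```python
import re


class ToxicityInterrupt(Exception):
    """Hypothetical stand-in for the chain's toxicity exception."""


class PromptSafetyInterrupt(Exception):
    """Hypothetical stand-in for the chain's prompt safety exception."""


# Simplified SSN-like pattern; real PII detection is model-based.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def moderate(text, toxicity_score, unsafe_score,
             toxicity_threshold=0.6, prompt_safety_threshold=0.8,
             mask_character="X"):
    # 1. Toxicity: interrupt when the score exceeds the threshold.
    if toxicity_score > toxicity_threshold:
        raise ToxicityInterrupt("toxic content detected")
    # 2. PII: redact SSN-like values instead of interrupting (redact=True).
    text = SSN_PATTERN.sub(lambda m: mask_character * len(m.group()), text)
    # 3. Prompt safety: interrupt when the unsafe score meets the threshold.
    if unsafe_score >= prompt_safety_threshold:
        raise PromptSafetyInterrupt("unsafe prompt detected")
    return text


try:
    print(moderate("My SSN is 123-45-6789", toxicity_score=0.1, unsafe_score=0.2))
    moderate("some abusive text", toxicity_score=0.9, unsafe_score=0.2)
except (ToxicityInterrupt, PromptSafetyInterrupt) as error:
    print(f"Chain interrupted: {error}")
```

Because toxicity is checked first, a toxic input never reaches the redaction step, which matches the order in which the filters were listed in the configuration.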

The following diagram illustrates this workflow.

If the moderation chain is interrupted (in this example, by the toxicity or prompt safety classification filters), the chain raises a Python exception, essentially stopping the chain in progress and allowing you to catch the exception (in a try-except block) and perform any relevant action. The three possible exception types are:

ModerationPIIError
ModerationToxicityError
ModerationPromptSafetyError

You can configure one filter or more than one filter using BaseModerationConfig. You can also have the same type of filter with different configurations within the same chain. For example, if your use case is only concerned with PII, you can specify a configuration that must interrupt the chain if an SSN is detected; otherwise, it must perform redaction on age and name PII entities. A configuration for this can be defined as follows:

pii_config1 = ModerationPiiConfig(labels=["SSN"],
    redact=False)
pii_config2 = ModerationPiiConfig(labels=["AGE", "NAME"],
    redact=True, 
    mask_character="X")
moderation_config = BaseModerationConfig(filters=[ pii_config1, 
      pii_config2])
comprehend_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config)

Using callbacks and unique identifiers

If you’re familiar with the concept of workflows, you may also be familiar with callbacks. Callbacks within workflows are independent pieces of code that run when certain conditions are met within the workflow. A callback can either be blocking or nonblocking to the workflow. LangChain chains are, in essence, workflows for LLMs. AmazonComprehendModerationChain allows you to define your own callback functions. Initially, the implementation is limited to asynchronous (nonblocking) callback functions only.

This effectively means that if you use callbacks with the moderation chain, they will run independently of the chain’s run without blocking it. For the moderation chain, you get options to run pieces of code, with any business logic, after each moderation is run, independent of the chain.
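The nonblocking behavior can be illustrated with a small asyncio sketch. It only mimics the idea of scheduling a callback without waiting for it; it is not how the library implements callbacks, and `run_chain`, the simplified beacon dictionary, and the shared `log` list are assumptions made for the example.

```python
import asyncio
from typing import List


async def on_after_pii(beacon: dict, unique_id: str, log: List) -> None:
    # Simulated business logic that runs independently of the chain.
    await asyncio.sleep(0)
    log.append((unique_id, beacon["moderation_type"]))


async def run_chain(text: str, unique_id: str, log: List) -> str:
    # Schedule the callback; the chain does not wait for it here.
    task = asyncio.create_task(
        on_after_pii({"moderation_type": "PII"}, unique_id, log)
    )
    result = text.upper()  # placeholder for the rest of the chain
    log.append("chain finished")
    await task  # awaited only so this demo exits cleanly
    return result
```

In this sketch the chain records that it has finished before the callback has written its entry, which is the ordering a nonblocking callback allows.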

You can also optionally provide an arbitrary unique identifier string when creating an AmazonComprehendModerationChain to enable logging and analytics later. For example, if you’re operating a chatbot powered by an LLM, you may want to track users who are consistently abusive or are deliberately or unknowingly exposing personal information. In such cases, it becomes necessary to track the origin of such prompts and perhaps store them in a database or log them appropriately for further action. You can pass a unique ID that distinctly identifies a user, such as their user name or email, or an application name that is generating the prompt.

The combination of callbacks and unique identifiers provides you with a powerful way to implement a moderation chain that fits your use case in a much more cohesive manner with less code that is easier to maintain. The callback handler is available via the BaseModerationCallbackHandler, with three available callbacks: on_after_pii(), on_after_toxicity(), and on_after_prompt_safety(). Each of these callback functions is called asynchronously after the respective moderation check is performed within the chain. These functions also receive two default parameters:

moderation_beacon – A dictionary containing details such as the text on which the moderation was performed, the full JSON output of the Amazon Comprehend API, the type of moderation, and if the supplied labels (in the configuration) were found within the text or not
unique_id – The unique ID that you assigned while initializing an instance of the AmazonComprehendModerationChain.

The following is an example of how an implementation with callback works. In this case, we defined a single callback that we want the chain to run after the PII check is performed:

import json

from langchain_experimental.comprehend_moderation import BaseModerationCallbackHandler

class MyModCallback(BaseModerationCallbackHandler):
    async def on_after_pii(self, output_beacon, unique_id):
        # Write the beacon and the caller's unique ID to a JSON file
        moderation_type = output_beacon['moderation_type']
        chain_id = output_beacon['moderation_chain_id']
        with open(f'output-{moderation_type}-{chain_id}.json', 'w') as file:
            data = { 'beacon_data': output_beacon, 'unique_id': unique_id }
            json.dump(data, file)

    # Optionally implement the callbacks for the other filters:
    # async def on_after_toxicity(self, output_beacon, unique_id):
    #     pass
    #
    # async def on_after_prompt_safety(self, output_beacon, unique_id):
    #     pass

my_callback = MyModCallback()

We then use the my_callback object while initializing the moderation chain and also pass a unique_id. You may use callbacks and unique identifiers with or without a configuration. When you subclass BaseModerationCallbackHandler, you must implement one or all of the callback methods depending on the filters you intend to use. For brevity, the following example shows a way to use callbacks and unique_id without any configuration:

comprehend_moderation = AmazonComprehendModerationChain(
    moderation_callback=my_callback,
    unique_id='john.doe@email.com')

The following diagram explains how this moderation chain with callbacks and unique identifiers works. Specifically, we implemented the PII callback that should write a JSON file with the data available in the moderation_beacon and the unique_id passed (the user’s email in this case).

In this Python notebook, we have compiled a few different ways you can configure and use the moderation chain with various LLMs, such as LLMs hosted with Amazon SageMaker JumpStart or in Hugging Face Hub. The sample chat application that we discussed earlier is also available as a Python notebook.

Conclusion

The transformative potential of large language models and generative AI is undeniable. However, their responsible and ethical use hinges on addressing concerns of trust and safety. By recognizing the challenges and actively implementing measures to mitigate risks, developers, organizations, and society at large can harness the benefits of these technologies while preserving the trust and safety that underpin their successful integration. Use AmazonComprehendModerationChain to add trust and safety features to any LLM workflow, including Retrieval Augmented Generation (RAG) workflows implemented in LangChain.

For information on building RAG-based solutions using LangChain and Amazon Kendra’s highly accurate, machine learning (ML)-powered intelligent search, see Quickly build high-accuracy Generative AI applications on enterprise data using Amazon Kendra, LangChain, and large language models. As a next step, refer to the code samples we created for using Amazon Comprehend moderation with LangChain. For full documentation of the Amazon Comprehend moderation chain API, refer to the LangChain API documentation.

About the authors

Wrick Talukdar is a Senior Architect with the Amazon Comprehend Service team. He works with AWS customers to help them adopt machine learning on a large scale. Outside of work, he enjoys reading and photography.

Anjan Biswas is a Senior AI Services Solutions Architect with a focus on AI/ML and Data Analytics. Anjan is part of the world-wide AI services team and works with customers to help them understand and develop solutions to business problems with AI and ML. Anjan has over 14 years of experience working with global supply chain, manufacturing, and retail organizations, and is actively helping customers get started and scale on AWS AI services.

Nikhil Jha is a Senior Technical Account Manager at Amazon Web Services. His focus areas include AI/ML and analytics. In his spare time, he enjoys playing badminton with his daughter and exploring the outdoors.

Chin Rane is an AI/ML Specialist Solutions Architect at Amazon Web Services. She is passionate about applied mathematics and machine learning. She focuses on designing intelligent document processing solutions for AWS customers. Outside of work, she enjoys salsa and bachata dancing.

Optimizing neural networks for special-purpose hardware

Curating the neural-architecture search space and taking advantage of human intuition reduces latency on real-world applications by up to 55%.

Enabling large-scale health studies for the research community

Posted by Chintan Ghate, Software Engineer, and Diana Mincu, Research Engineer, Google Research

As consumer technologies like fitness trackers and mobile phones become more widely used for health-related data collection, so does the opportunity to leverage these data pathways to study and advance our understanding of medical conditions. We have previously touched upon how our work explores the use of this technology within the context of chronic diseases, in particular multiple sclerosis (MS). This effort leverages the FDA MyStudies platform, an open-source platform used to create clinical study apps, that makes it easier for anyone to run their own studies and collect good quality healthcare data, in a trusted and safe way.

Today, we describe the setup that we developed by expanding the FDA MyStudies platform and demonstrate how it can be used to set up a digital health study. We also present our exploratory research study created through this platform, called MS Signals, which consists of a symptom tracking app for MS patients. The goal for this app is twofold: 1) to ensure that the enhancements to the FDA MyStudies platform made for a more streamlined study creation experience; and 2) to understand how new data collection mechanisms can be used to revolutionize patients’ chronic disease management and tracking. We have open sourced our extension to the FDA MyStudies platform under the Apache 2.0 license to provide a resource for the community to build their own studies.

Extending the FDA MyStudies platform

The original FDA MyStudies platform allowed people to configure their own study apps, manage participants, and create separate iOS and Android apps. To simplify the study creation process and ensure increased study engagement, we made a number of accessibility changes. Some of the main improvements include: cross-platform (iOS and Android) app generation through the use of Flutter, an open source framework by Google for building multi-platform applications from a single codebase; a simplified setup, so that users can prototype their study quickly (under a day in most cases); and, most importantly, an emphasis on accessibility so that diverse patients’ voices are heard. The accessibility enhancements include changes to the underlying features of the platform and to the particular study design of the MS Signals study app.

Multi-platform support with rapid prototyping

We decided on the use of Flutter as it would be a single point that would generate both iOS and Android apps in one go, reducing the work required to support multiple platforms. Flutter also provides hot-reloading, which allows developers to build & preview features quickly. The design-system in the app takes advantage of this feature to provide a central point from which the branding & theme of the app can be changed to match the tone of a new study and previewed instantly. The demo environment in the app also utilizes this feature to allow developers to mock and preview questionnaires locally on their machines. In our experience this has been a huge time-saver in A/B testing the UX and the format and wording of questions live with clinicians.

System accessibility enhancements

To improve the accessibility of the platform for more users, we made several usability enhancements:

Light & dark theme support
Bold text & variable font-sizes
High-contrast mode
Improving user awareness of accessibility settings

Extended exposure to bright light themes can strain the eyes, so supporting dark theme features was necessary to make it easier to use the study app frequently. Some small or light text-elements are illegible to users with vision impairments, so we added 1) bold-text and support for larger font-sizes and 2) high-contrast color-schemes. To ensure that accessibility settings are easy to find, we added a one-time introductory screen, presented during the app’s first launch, that takes users directly to their system accessibility settings.

Study accessibility enhancements

To make the study itself easier to interact with and reduce cognitive overload, we made the following changes:

Clarified the onboarding process
Improved design for questionnaires

First, we clarified the on-boarding process by presenting users with a list of required steps when they first open the app in order to reduce confusion and participant drop-off.

The original questionnaire design in the app presented each question in a card format, which utilizes part of the screen for shadows and depth effects of the card. In many situations, this is a pleasant aesthetic, but in apps where accessibility is a priority, these visual elements restrict the space available on the screen. Thus, when more accessible, larger font-sizes are used, there are more frequent word breaks, which reduces readability. We fixed this simply by removing the card design elements and instead using the entire screen, allowing for better visuals with larger font-sizes.

The MS Signals prototype study

To test the usability of these changes, we used our redesigned platform to create a prototype study app called MS Signals, which uses surveys to gather information about a participant’s MS-related symptoms.

MS Signals app screenshots.

MS Studies app design

As a first step, before entering any study information, participants are asked to complete an eligibility and study comprehension questionnaire to ensure that they have read through the potentially lengthy terms of study participation. This might include, for example, questions like “In what country is the study available?” or “Can you withdraw from the study?” A section like this is common in most health studies, and it tends to be the first drop-off point for participants.

To minimize study drop-off at this early stage, we kept the eligibility test brief and reflected correct answers for the comprehension test back to the participants. This helps minimize the number of times a user may need to go through the initial eligibility questionnaire and ensures that the important aspects of the study protocol are made clear to them.

After successful enrollment, participants are taken to the main app view, which consists of three pages:

Activities: This page lists the questionnaires available to the participant and is where the majority of their time is spent. The questionnaires vary in frequency: some are one-time surveys created to gather medical history, while others are repeated daily, weekly or monthly, depending on the symptom or area they are exploring. For the one-time survey we provide a counter above each question to signal to users how far they have come and how many questions are left, similar to the questionnaire during the eligibility and comprehension step.
Dashboard: To ensure that participants get something back in return for the information they enter during a study, the Dashboard area presents a summary of their responses in graph or pie chart form. Participants could potentially show this data to their care provider as a summary of their condition over the last 6 months, an improvement over the traditional pen and paper methods that many employ today.
Resources: A set of useful links, help articles and common questions related to MS.

Questionnaire design

Since needing to frequently input data can lead to cognitive overload, participant drop off, and bad data quality, we reduced the burden in two ways:

We break down large questionnaires into smaller ones, resulting in 6 daily surveys, containing 3–5 questions each, where each question is multiple choice and related to a single symptom. This way we cover a total of 20 major symptoms, and present them in a similar way to how a clinician would ask these questions in an in-clinic setting.
We ensure previously entered information is readily available in the app, along with the time of the entry.

In designing the survey content, we collaborated closely with experienced clinicians and researchers to finalize the wording and layout. While studies in this field typically use the Likert scale to gather symptom information, we defined a more intuitive verbose scale to provide better experience for participants tracking their disease and the clinicians or researchers viewing the disease history. For example, in the case of vision issues, rather than asking participants to rate their symptoms on a scale from 1 to 10, we instead present a multiple choice question where we detail common vision problems that they may be experiencing.

This verbose scale helps patients track their symptoms more accurately by including context that helps them more clearly define their symptoms. This approach also allows researchers to answer questions that go beyond symptom correlation. For example, for vision issues, data collected using the verbose scale would reveal to researchers whether nystagmus is more prominent in patients with MS compared to double vision.

Side-by-side comparison with a Likert scale on the left, and a Verbose scale on the right.

Focusing on accessibility

Mobile-based studies can often present additional challenges for participants with chronic conditions: the text can be hard to read, the color contrast could make it difficult to see certain bits of information, or it may be challenging to scroll through pages. This may result in participant drop off, which, in turn, could yield a biased dataset if the people who are experiencing more advanced forms of a disease are unable to provide data.

In order to prevent such issues, we include the following accessibility features:

Throughout, we employ color blind accessible color schemes. This includes improving the contrast between crucial text and important additional information, which might otherwise be presented in a smaller font and a faded text color.
We reduced the amount of movement required to access crucial controls by placing all buttons close to the bottom of the page and ensuring that pop-ups are controllable from the bottom part of the screen.

To test the accessibility of MS Signals, we collaborated with the National MS Society to recruit participants for a user experience study. For this, a call for participation was sent out by the Society to their members, and 9 respondents were asked to test out the various app flows. The majority indicated that they would like a better way than their current method to track their symptom data, that they considered MS Signals to be a unique and valuable tool that would enhance the accuracy of their symptom tracking, and that they would want to share the dashboard view with their healthcare providers.

Next steps

We want to encourage everyone to make use of the open source platform to start setting up and running their own studies. We are working on creating a set of standard study templates, which would incorporate the lessons described above, and we hope to release those soon. For any issues, comments or questions please check out our resource page.

How we taught Google Translate to recognize homonyms

How Google Translate’s neural model taught it to distinguish bass from bass.

Use machine learning without writing a single line of code with Amazon SageMaker Canvas

In the recent past, using machine learning (ML) to make predictions, especially for data in the form of text and images, required extensive ML knowledge for creating and tuning deep learning models. Today, ML has become more accessible to any user who wants to use ML models to generate business value. With Amazon SageMaker Canvas, you can create predictions for a number of different data types beyond just tabular or time series data without writing a single line of code. These capabilities include pre-trained models for image, text, and document data types.

In this post, we discuss how you can use pre-trained models to retrieve predictions for supported data types beyond tabular data.

Text data

SageMaker Canvas provides a visual, no-code environment for building, training, and deploying ML models. For natural language processing (NLP) tasks, SageMaker Canvas integrates seamlessly with Amazon Comprehend to allow you to perform key NLP capabilities like language detection, entity recognition, sentiment analysis, topic modeling, and more. The integration eliminates the need for any coding or data engineering to use the robust NLP models of Amazon Comprehend. You simply provide your text data and select from four commonly used capabilities: sentiment analysis, language detection, entities extraction, and personal information detection. For each scenario, you can use the UI to test and use batch prediction to select data stored in Amazon Simple Storage Service (Amazon S3).

Sentiment analysis

With sentiment analysis, SageMaker Canvas allows you to analyze the sentiment of your input text. It can determine if the overall sentiment is positive, negative, mixed, or neutral, as shown in the following screenshot. This is useful in situations like analyzing product reviews. For example, the text “I love this product, it’s amazing!” would be classified by SageMaker Canvas as having a positive sentiment, whereas “This product is horrible, I regret buying it” would be labeled as negative sentiment.

Entities extraction

SageMaker Canvas can analyze text and automatically detect entities mentioned within it. When a document is sent to SageMaker Canvas for analysis, it will identify people, organizations, locations, dates, quantities, and other entities in the text. This entity extraction capability enables you to quickly gain insights into the key people, places, and details discussed in documents. For a list of supported entities, refer to Entities.

Language detection

SageMaker Canvas can also determine the dominant language of text using Amazon Comprehend. It analyzes text to identify the main language and provides confidence scores for the detected dominant language, but doesn’t indicate percentage breakdowns for multilingual documents. For best results with long documents in multiple languages, split the text into smaller pieces and aggregate the results to estimate language percentages. It works best with at least 20 characters of text.
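For readers who do want to see the split-and-aggregate idea in code, here is a minimal sketch. It assumes a `detect` callable that returns the dominant language code and a confidence score for a chunk of text; the function names, the chunk size, and the choice to weight each chunk by its character count are illustrative assumptions, not part of SageMaker Canvas.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical detector signature: text -> (language_code, confidence)
Detector = Callable[[str], Tuple[str, float]]


def split_text(text: str, chunk_size: int = 500) -> List[str]:
    """Split text on whitespace into chunks of roughly chunk_size characters."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def estimate_language_shares(text: str, detect: Detector,
                             chunk_size: int = 500) -> Dict[str, float]:
    """Estimate each language's share, weighting chunks by their length."""
    totals: Dict[str, int] = defaultdict(int)
    for chunk in split_text(text, chunk_size):
        language, _score = detect(chunk)
        totals[language] += len(chunk)
    total_chars = sum(totals.values())
    if total_chars == 0:
        return {}
    return {lang: count / total_chars for lang, count in totals.items()}
```

In practice, `detect` could wrap a call to the Amazon Comprehend dominant language detection API and return the top-scoring entry for each chunk; that wiring is left out here so the sketch runs without AWS credentials.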

Personal information detection

You can also protect sensitive data using personal information detection with SageMaker Canvas. It can analyze text documents to automatically detect personally identifiable information (PII) entities, allowing you to locate sensitive data like names, addresses, dates of birth, phone numbers, email addresses, and more. It analyzes documents up to 100 KB and provides a confidence score for each detected entity so you can review and selectively redact the most sensitive information. For a list of entities detected, refer to Detecting PII entities.

Image data

SageMaker Canvas provides a visual, no-code interface that makes it straightforward for you to use computer vision capabilities by integrating with Amazon Rekognition for image analysis. For example, you can upload a dataset of images, use Amazon Rekognition to detect objects and scenes, and perform text detection to address a wide range of use cases. The visual interface and Amazon Rekognition integration make it possible for non-developers to harness advanced computer vision techniques.

Object detection in images

SageMaker Canvas uses Amazon Rekognition to detect labels (objects) in an image. You can upload the image from the SageMaker Canvas UI or use the Batch Prediction tab to select images stored in an S3 bucket. As shown in the following example, it can extract objects in the image such as clock tower, bus, buildings, and more. You can use the interface to search through the prediction results and sort them.

Text detection in images

Extracting text from images is a very common use case. Now, you can perform this task with ease on SageMaker Canvas with no code. The text is extracted as line items, as shown in the following screenshot. Short phrases within the image are classified together and identified as a phrase.

You can perform batch predictions by uploading a set of images, extracting text from all of them in a single batch job, and downloading the results as a CSV file. This solution is useful when you want to extract and detect text across many images.

Document data

SageMaker Canvas offers a variety of ready-to-use solutions that solve your day-to-day document understanding needs. These solutions are powered by Amazon Textract. To view all the available options for documents, choose Ready-to-use models in the navigation pane and filter by Documents, as shown in the following screenshot.

Document analysis

Document analysis analyzes documents and forms for relationships among detected text. The operations return four categories of document extraction: raw text, forms, tables, and signatures. The solution’s capability of understanding the document structure gives you extra flexibility in the type of data you want to extract from the documents. The following screenshot is an example of what table detection looks like.

This solution is able to understand layouts of complex documents, which is helpful when you need to extract specific information in your documents.

Identity document analysis

This solution is designed to analyze documents like personal identification cards, driver’s licenses, or other similar forms of identification. Information such as middle name, county, and place of birth, together with its individual confidence score on the accuracy, will be returned for each identity document, as shown in the following screenshot.

There is an option to do batch prediction, whereby you can bulk upload sets of identification documents and process them as a batch job. This provides a quick and seamless way to transform identification document details into key-value pairs that can be used for downstream processes such as data analysis.

Expense analysis

Expense analysis is designed to analyze expense documents like invoices and receipts. The following screenshot is an example of what the extracted information looks like.

The results are returned as summary fields and line item fields. Summary fields are key-value pairs extracted from the document, and contain keys such as Grand Total, Due Date, and Tax. Line item fields refer to data that is structured as a table in the document. This is useful for extracting information from the document while retaining its layout.

Document queries

Document queries are designed for you to ask questions about your documents. This is a great solution to use when you have multi-page documents and you want to extract very specific answers from your documents. The following is an example of the types of questions you can ask and what the extracted answers look like.

The solution provides a straightforward interface for you to interact with your documents. This is helpful when you want to get specific details within large documents.

Conclusion

SageMaker Canvas provides a no-code environment to use ML with ease across various data types like text, images, and documents. The visual interface and integration with AWS services like Amazon Comprehend, Amazon Rekognition, and Amazon Textract eliminate the need for coding and data engineering. You can analyze text for sentiment, entities, languages, and PII. For images, object and text detection enables computer vision use cases. Finally, document analysis can extract text while preserving its layout for downstream processes. The ready-to-use solutions in SageMaker Canvas make it possible for you to harness advanced ML techniques to generate insights from both structured and unstructured data. If you’re interested in using no-code tools with ready-to-use ML models, try out SageMaker Canvas today. For more information, refer to Getting started with using Amazon SageMaker Canvas.

About the authors

Julia Ang is a Solutions Architect based in Singapore. She has worked with customers in a range of fields, from health and public sector to digital native businesses, to adopt solutions according to their business needs. She has also been supporting customers in Southeast Asia and beyond to use AI & ML in their businesses. Outside of work, she enjoys learning about the world through traveling and engaging in creative pursuits.

Loke Jun Kai is a Specialist Solutions Architect for AI/ML based in Singapore. He works with customers across ASEAN to architect machine learning solutions at scale in AWS. Jun Kai is an advocate for Low-Code No-Code machine learning tools. In his spare time, he enjoys being in nature.


Copyright © 2019-2025 Vedere AI. All Rights Reserved.