Use IP-restricted presigned URLs to enhance security in Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth significantly reduces the cost and time required for labeling data by integrating human annotators with machine learning to automate the labeling process. You can use SageMaker Ground Truth to create labeling jobs, which are workflows where data objects (such as images, videos, or documents) need to be annotated by human workers. These labeling jobs are distributed among a workteam—a group of workers assigned to perform the annotations. To access the data objects they need to label, workers are provided with Amazon S3 presigned URLs.

A presigned URL is a temporary URL that grants time-limited access to an Amazon Simple Storage Service (Amazon S3) object. In the context of SageMaker Ground Truth, these presigned URLs are generated using the grant_read_access Liquid filter and embedded into the task templates. Workers can then use these URLs to directly access the necessary files, such as images or documents, in their web browsers for annotation purposes.

While presigned URLs offer a convenient way to grant temporary access to S3 objects, sharing these URLs with people outside of the workteam can lead to unintended access of those objects. To mitigate this risk and enhance the security of SageMaker Ground Truth labeling tasks, we have introduced a new feature that adds an additional layer of security by restricting access to the presigned URLs to the worker’s IP address or virtual private cloud (VPC) endpoint from which they access the labeling task. In this blog post, we show you how to enable this feature, allowing you to enhance your data security as needed, and outline the success criteria for this feature, including the scenarios where it will be most beneficial.

Prerequisites

Before you get started configuring IP-restricted presigned URLs, the following resources can help you understand the background concepts:

  • Amazon S3 presigned URL: This documentation covers the use of Amazon S3 presigned URLs, which provide temporary access to objects. Understanding how presigned URLs work will be beneficial.
  • Use Amazon SageMaker Ground Truth to label data: This guide explains how to use SageMaker Ground Truth for data labeling tasks, including setting up workteams and workforces. Familiarity with these concepts will be helpful when configuring IP restrictions for your workteams.

Introducing IP-restricted presigned URLs

Working closely with our customers, we recognized the need for enhanced security posture and stricter access controls to presigned URLs. So, we introduced a new feature that uses AWS global condition context keys aws:SourceIp and aws:VpcSourceIp to allow customers to restrict presigned URL access to specific IP addresses or VPC endpoints. By incorporating AWS Identity and Access Management (IAM) policy constraints, you can now restrict presigned URLs to only be accessible from an IP address or VPC endpoint of your choice. This IP-based access control effectively locks down the presigned URL to the worker’s location, mitigating the risk of unauthorized access or unintended sharing.

Benefits of the new feature

This update brings several significant security benefits to SageMaker Ground Truth:

  • Enhanced data privacy: IP restrictions limit presigned URLs so they're accessible only from customer-approved locations, such as corporate VPNs, workers’ home networks, or designated VPC endpoints. Although the presigned URLs are pre-authenticated, this feature adds an additional layer of security by verifying the access location and locking the URL to that location until the task is completed.
  • Reduced risk of unauthorized access: Enforcing IP-based access controls minimizes the risk of data being accessed from unauthorized locations and mitigates the risk of data sharing outside the worker’s approved access network. This is particularly important when dealing with sensitive or confidential data.
  • Flexible security options: You can apply these restrictions in either VPC or non-VPC settings, allowing you to tailor security measures to your organization’s specific needs.
  • Auditing and compliance: By locking down presigned URLs to specific IP addresses or VPC endpoints, you can more easily track and audit access to your organization’s data, helping achieve compliance with internal policies and external regulations.
  • Seamless integration: This new feature seamlessly integrates with existing SageMaker Ground Truth workflows, providing enhanced security without disrupting established labeling processes or requiring significant changes to existing infrastructure.

By introducing IP-restricted presigned URLs, SageMaker Ground Truth empowers you with greater control over data access, so sensitive information remains accessible only to authorized workers within approved locations.

Configuring IP-restricted presigned URLs for SageMaker Ground Truth

The new IP restriction feature for presigned URLs in SageMaker Ground Truth can be enabled through the SageMaker API or the AWS Command Line Interface (AWS CLI). Before we go into the configuration of this new feature, let’s look at how you can create and update workteams today using the AWS CLI. You can also perform these operations through the SageMaker API using the AWS SDK.

Here’s an example of creating a new workteam using the create-workteam command:

aws sagemaker create-workteam \
    --description "A team for image labeling tasks" \
    --workforce-name "default" \
    --workteam-name "MyWorkteam" \
    --member-definitions '[{
        "CognitoMemberDefinition": {
            "ClientId": "exampleclientid",
            "UserGroup": "sagemaker-groundtruth-user-group",
            "UserPool": "us-west-2_examplepool"
        }
    }]'

To update an existing workteam, you use the update-workteam command:

aws sagemaker update-workteam \
    --workteam-name "MyWorkteam" \
    --description "Updated description for image labeling tasks"

Note that these examples only show a subset of the available parameters for the create-workteam and update-workteam APIs. You can find detailed documentation and examples in the SageMaker Ground Truth Developer Guide.

Enabling IP restrictions for presigned URLs

With the new IP restriction feature, you can now configure IP-based access constraints specific to each workteam when creating a new workteam or modifying an existing one. Here’s how you can enable these restrictions:

  1. When creating or updating a workteam, you can specify a WorkerAccessConfiguration object, which defines access constraints for the workers in that workteam.
  2. Within the WorkerAccessConfiguration, you can include an S3Presign object, which allows you to set access configurations for the presigned URLs used by the workers. Currently, only IamPolicyConstraints can be added to the S3Presign object. SageMaker Ground Truth provides two Liquid filters that you can use in your custom worker task templates to generate presigned URLs:
    • grant_read_access: This filter generates a presigned URL for the specified S3 object, granting temporary read access. The template markup looks like the following:
      <!-- Using grant_read_access filter -->
      <img src="{{ s3://bucket-name/path/to/image.jpg | grant_read_access }}"/>

    • s3_presign: This new filter serves the same purpose as grant_read_access but makes it clear that the generated URL is subject to the S3Presign configuration defined for the workteam. The template markup looks like the following:
      <!-- Using s3_presign filter (equivalent) -->
      <img src="{{ s3://bucket-name/path/to/image.jpg | s3_presign }}"/>

  3. The S3Presign object supports IamPolicyConstraints, where you can enable or disable the SourceIp and VpcSourceIp constraints:
    • SourceIp: When enabled, workers can access presigned URLs only from the specified IP addresses or ranges.
    • VpcSourceIp: When enabled, workers can access presigned URLs only from the specified VPC endpoints within your AWS account.

You can call the SageMaker ListWorkteams or DescribeWorkteam APIs to view workteams’ metadata, including the WorkerAccessConfiguration.
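For example, the following minimal sketch uses the boto3 SageMaker client to inspect a workteam's access configuration after it has been set; the workteam name is a placeholder:

import boto3

sagemaker = boto3.client("sagemaker")

# Describe the workteam and inspect its presigned URL access configuration
response = sagemaker.describe_workteam(WorkteamName="exampleworkteam")
workteam = response["Workteam"]

# WorkerAccessConfiguration is returned only if it has been configured for the workteam
access_config = workteam.get("WorkerAccessConfiguration", {})
print(access_config.get("S3Presign", {}).get("IamPolicyConstraints", {}))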

Let’s say you want to create or update a workteam so that presigned URLs will be restricted to the public IP address of the worker who originally accessed it.

Create workteam:

aws sagemaker create-workteam \
    --description "An example workteam with S3 presigned URLs restricted" \
    --workforce-name "default" \
    --workteam-name "exampleworkteam" \
    --member-definitions '[{
        "CognitoMemberDefinition": {
            "ClientId": "exampleclientid",
            "UserGroup": "sagemaker-groundtruth-user-group",
            "UserPool": "us-west-2_examplepool"
        }
    }]' \
    --worker-access-configuration '{
        "S3Presign": {
            "IamPolicyConstraints": {
                "SourceIp": "Enabled",
                "VpcSourceIp": "Disabled"
            }
        }
    }'

Update workteam:

aws sagemaker update-workteam \
    --workteam-name "existingworkteam" \
    --worker-access-configuration '{
        "S3Presign": {
            "IamPolicyConstraints": {
                "SourceIp": "Enabled",
                "VpcSourceIp": "Disabled"
            }
        }
    }'
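
If you prefer the AWS SDK over the CLI, the same update can be made with the boto3 SageMaker client. The following is a minimal sketch, assuming the workteam already exists; the workteam name is a placeholder:

import boto3

sagemaker = boto3.client("sagemaker")

# Restrict presigned URLs to the public IP address of the worker who first accesses them
sagemaker.update_workteam(
    WorkteamName="existingworkteam",
    WorkerAccessConfiguration={
        "S3Presign": {
            "IamPolicyConstraints": {
                "SourceIp": "Enabled",
                "VpcSourceIp": "Disabled",
            }
        }
    },
)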

Success criteria

While the IP-restricted presigned URLs feature provides enhanced security, there are scenarios where it might not be suitable. Understanding these limitations can help you make an informed decision about using the feature and verify that it aligns with your organization’s security needs and network configurations.

IP-restricted presigned URLs are effective in scenarios where there’s a consistent IP address used by the worker accessing SageMaker Ground Truth and the S3 object. For example, if a worker accesses labeling tasks from a stable public IP address, such as an office network with a fixed IP address, the IP restriction will provide access with enhanced security. Similarly, when a worker accesses both SageMaker Ground Truth and S3 objects through the same VPC endpoint, the IP restriction will verify that the presigned URL is only accessible from within this VPC. In both scenarios, the consistent IP address enables the IP-based access controls to function correctly, providing an additional layer of security.

Scenarios where IP-restricted presigned URLs aren’t effective

Asymmetric VPC endpoints
  • Description: SageMaker Ground Truth is accessed through a public internet connection while Amazon S3 is accessed through a VPC endpoint, or vice versa.
  • Example: A worker accesses SageMaker Ground Truth through the public internet but accesses S3 through a VPC endpoint.
  • Exit criteria: Verify that both SageMaker Ground Truth and S3 are accessed either entirely through the public internet or entirely through the same VPC endpoint.

Network Address Translation (NAT) layers
  • Description: NAT layers can alter the source IP address of requests, causing IP mismatches. Issues can arise from dynamically assigned IP addresses or asymmetric configurations.
  • Examples:
    • N-to-M IP translation, where multiple internal IP addresses are translated to multiple public IP addresses.
    • A NAT gateway with multiple public IP addresses assigned to it, which can cause requests to appear to come from different IP addresses.
    • Shared IP addresses, where multiple users’ traffic is routed through a single public IP address, making it difficult to enforce IP-based restrictions effectively.
  • Exit criteria: Verify that the NAT gateway is configured to preserve the source IP address, and validate that the NAT configuration is consistent when accessing both SageMaker Ground Truth and S3 resources.

Use of VPNs
  • Description: VPNs change the outgoing IP address, which can lead to access issues with IP-restricted presigned URLs.
  • Example: A worker uses a split-tunnel VPN that presents different IP addresses for requests to SageMaker Ground Truth and S3, so access might be denied.
  • Exit criteria: Disable the VPN, or use a full-tunnel VPN that presents a consistent IP address for all requests.

Interface endpoints aren’t supported by the grant_read_access feature because of their inability to resolve public DNS names. This limitation is orthogonal to the IP restrictions and should be considered when configuring your network setup for accessing S3 objects with presigned URLs. In such cases, use the S3 Gateway endpoint when accessing S3 to verify compatibility with the public DNS names generated by grant_read_access.

Using S3 access logs for debugging

To debug issues related to IP-restricted presigned URLs, S3 access logs can provide valuable insights. By enabling access logging for your S3 bucket, you can track every request made to your S3 objects, including the IP addresses from which the requests originate. This can help you identify:

  • Mismatches between expected and actual IP addresses
  • Dynamic IP addresses or VPNs causing access issues
  • Unauthorized access from unexpected locations

To debug using S3 access logs, follow these steps:

  1. Enable S3 access logging: Configure your bucket to deliver server access logs to a separate S3 logging bucket.
  2. Review log files: Analyze the log files to identify patterns or anomalies in IP addresses, request timestamps, and error codes.
  3. Look for IP address changes: If you observe frequent changes in IP addresses within the logs, it might indicate that the worker’s IP address is dynamic or altered by a VPN or proxy.
  4. Check for NAT layer modifications: See if NAT layers are modifying the source IP address by checking the x-forwarded-for header in the log files.
  5. Verify authorized access: Confirm that requests are coming from approved and consistent IP addresses by checking the Remote IP field in the log files.

By following these steps and analyzing the S3 access logs, you can validate that the presigned URLs are accessed only from approved and consistent IP addresses.
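
As an illustration, the following sketch scans downloaded S3 server access log files and summarizes the remote IP addresses seen per requested object key. The local directory name is a placeholder, and the field positions assume the standard space-delimited S3 server access log format; treat this as a starting point rather than a complete parser:

import re
from collections import defaultdict
from pathlib import Path

# Matches the start of a standard S3 server access log record:
# bucket_owner bucket [timestamp] remote_ip requester request_id operation key ...
LOG_PATTERN = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<remote_ip>\S+) (?P<requester>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+)'
)

def summarize_remote_ips(log_dir: str) -> dict:
    """Return a mapping of object key -> set of remote IPs that requested it."""
    ips_by_key = defaultdict(set)
    for log_file in Path(log_dir).glob("*"):
        if not log_file.is_file():
            continue
        for line in log_file.read_text().splitlines():
            match = LOG_PATTERN.match(line)
            if match:
                ips_by_key[match.group("key")].add(match.group("remote_ip"))
    return ips_by_key

if __name__ == "__main__":
    # "downloaded-access-logs" is a placeholder directory of downloaded log files
    for key, ips in summarize_remote_ips("downloaded-access-logs").items():
        if len(ips) > 1:
            print(f"{key} was accessed from multiple IPs: {sorted(ips)}")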

Conclusion

The introduction of IP-restricted presigned URLs in Amazon SageMaker Ground Truth significantly enhances the security of data accessed through the service. By allowing you to restrict access to specific IP addresses or VPC endpoints, this feature helps facilitate more fine-tuned control of presigned URLs. It provides organizations with added protection for their sensitive data, offering a valuable option for those with stringent security requirements. We encourage you to explore this new security feature to protect your organization’s data and enhance the overall security of your labeling workflows. To get started with SageMaker Ground Truth, visit Getting Started. To implement IP restrictions on presigned URLs as part of your workteam setup, refer to the CreateWorkteam and UpdateWorkteam API documentation. Follow the guidance provided in this blog to configure these security measures effectively. For more information or assistance, contact your AWS account team or visit the SageMaker community forums.


About the Authors

Sundar Raghavan is an AI/ML Specialist Solutions Architect at AWS, helping customers build scalable and cost-efficient AI/ML pipelines with Human in the Loop services. In his free time, Sundar loves traveling, sports and enjoying outdoor activities with his family.

Michael Borde is a lead software engineer at Amazon AI, where he has been for seven years. He previously studied mathematics and computer science at the University of Chicago. Michael is passionate about cloud computing, distributed systems design, and digital privacy & security. After work, you can often find Michael putzing around the local powerlifting gym in Capitol Hill.

Jacky Shum is a Software Engineer at AWS in the SageMaker Ground Truth team. He works to help AWS customers leverage machine learning applications, including prior work on ML-based fraud detection with Amazon Fraud Detector.

Rohith Kodukula is a Software Development Engineer on the SageMaker Ground Truth team. In his free time he enjoys staying active and reading up on anything that he finds mildly interesting (most things really).

Abhinay Sandeboina is an Engineering Manager at AWS Human In The Loop (HIL). He has been at AWS for over 2 years and his teams are responsible for managing ML platform services. He has a decade of experience in software/ML engineering building infrastructure platforms at scale. Prior to AWS, he worked in various engineering management roles at Zillow and Capital One.

Read More

Unlock the power of structured data for enterprises using natural language with Amazon Q Business

Unlock the power of structured data for enterprises using natural language with Amazon Q Business

One of the most common applications of generative artificial intelligence (AI) and large language models (LLMs) in an enterprise environment is answering questions based on the enterprise’s knowledge corpus. Pre-trained foundation models (FMs) excel at natural language understanding (NLU) tasks, including summarization, text generation, and question answering across a wide range of topics. However, they often struggle to provide accurate answers without hallucinations and fall short when addressing questions about content that wasn’t included in their training data. Furthermore, FMs are trained with a point-in-time snapshot of data and have no inherent ability to access fresh data at inference time; therefore, they might provide responses that are incorrect or inadequate.

We face a fundamental challenge with enterprise data—overcoming the disconnect between natural language and structured data. Natural language is ambiguous and imprecise, whereas data adheres to rigid schemas. For example, SQL queries can be complex and unintuitive for non-technical users. Handling complex queries involving multiple tables, joins, and aggregations makes it difficult to interpret user intent and translate it into correct SQL operations. Domain-specific terminology further complicates the mapping process. Another challenge is accommodating the linguistic variations users employ to express the same requirement. Effectively managing synonyms, paraphrases, and alternative phrasings is important. The inherent ambiguity of natural language can also result in multiple interpretations of a single query, making it difficult to accurately understand the user’s precise intent.

To bridge this gap, you need advanced natural language processing (NLP) to map user queries to database schema, tables, and operations. In this architecture, Amazon Q Business acts as an intermediary, translating natural language into precise SQL queries. You can simply ask questions like “What were the sales for outdoor gear in Q3 2023?” Amazon Q Business analyzes intent, accesses data sources, and generates the SQL query. This simplifies data access for your non-technical users and streamlines workflows for professionals, allowing them to focus on higher-level tasks.

In this post, we discuss an architecture to query structured data using Amazon Q Business, and build out an application to query cost and usage data in Amazon Athena with Amazon Q Business. Amazon Q Business can create SQL queries to your data sources when provided with the database schema, additional metadata describing the columns and tables, and prompting instructions. You can extend this architecture to use additional data sources, query validation, and prompting techniques to cover a wider range of use cases.

Solution overview

The following figure represents the high-level architecture of the proposed solution. Steps 3 and 4 augment the AWS IAM Identity Center integration with Amazon Q Business for an authorization flow. In this architecture, we use Amazon Cognito for user authentication as well as a trusted token issuer to IAM Identity Center. You can also use your own identity provider as a trusted token issuer as long as it supports OpenID Connect (OIDC).

architecture diagram

The workflow includes the following steps:

  1. The user initiates the interaction with the Streamlit application, which is accessible through an Application Load Balancer, acting as the entry point.
  2. The application prompts the user to authenticate using their Amazon Cognito credentials, maintaining secure access.
  3. The application exchanges the token obtained from Amazon Cognito for an IAM Identity Center token, granting the necessary scope to interact with Amazon Q Business.
  4. Using the IAM Identity Center token, the application assumes an AWS Identity and Access Management (IAM) role and retrieves an AWS session from AWS Security Token Service (AWS STS), enabling authorized communication with Amazon Q Business.
  5. Based on the user’s natural language query, the application formulates relevant prompts and metadata, which are then submitted to the chat_sync API of Amazon Q Business. In response, Amazon Q Business provides an appropriate Athena query to run.
  6. The application runs the Athena query received from Amazon Q Business, and the resulting data is displayed on the web application’s UI.

Querying Amazon Q Business LLMs directly

As explained in the response settings for Amazon Q Business, there are different options to generate responses that allow you to either use your enterprise data, use LLMs directly, or fall back on the LLMs if the answer is not found in your enterprise data. Along with the global controls for response settings, you need to specify which chatMode you want to use based on your specific use case. If you want to bypass Retrieval Augmented Generation (RAG) and use plain text in the context window, you should use CREATOR_MODE. Alternatively, RAG is also bypassed when you upload files directly in the context window.

If you just use text in the context window and call Amazon Q Business APIs without switching to CREATOR_MODE, that may break your use case in the future if you add content to the index (RAG). In this use case, because we’re not indexing any data and using schemas as attachments in the API call to Amazon Q Business, RAG is automatically bypassed and the response is generated directly from the LLMs. Another reason to use attachments for this use case is that for the chatSync API, userMessage has a maximum length of 7,000, which can be surpassed depending on how large your text is in the context window.
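
For reference, the following is a minimal sketch of that pattern using the boto3 Amazon Q Business client: the schema file is passed as an attachment so the response is generated directly from the LLM rather than from an index. The application ID, file name, and prompt are placeholders, and the exact response fields may vary, so check the current API reference:

import boto3

qbusiness = boto3.client("qbusiness")

# Read the table schema that will be sent as context alongside the question
with open("cur_schema.txt", "rb") as schema_file:
    schema_bytes = schema_file.read()

response = qbusiness.chat_sync(
    applicationId="your-amazon-q-business-application-id",  # placeholder
    userMessage="Write a SQL query for: What was the total spend for ElasticSearch last year?",
    attachments=[
        {"name": "cur_schema.txt", "data": schema_bytes},
    ],
)

# The generated SQL is expected in the system message of the response
print(response.get("systemMessage"))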

Data query workflow

Let’s look at the prompts, query generation, and Athena query in detail. We use Athena as the data store in this post. Users enter natural language questions into a web application built with Streamlit. Amazon Q Business converts the natural language questions to valid SQL for Athena using the prompting instructions, the database schema, and data dictionary that are provided as context to the LLM. The generated SQL is sent to Athena to run as a query, and the returned data is displayed to the user in the Streamlit application. The following diagram illustrates this workflow.

query workflow

These are the various components to this data flow, as numbered in the diagram:

  1. User intent
  2. Prompt builder
  3. SQL query generator
  4. Running the query
  5. Query results

In the following sections, we look at each component in more detail.

User intent

The user intent or your inquiry is the starting point of the process. It can be in natural language, such as “What was the total spend for ElasticSearch last year?” The user’s input serves as the basis for the subsequent steps in the workflow.

Prompt builder

The prompt builder component plays a crucial role in bridging the gap between your natural language input and the structured data format required for SQL querying. It augments your question with relevant information from the table schema and data dictionary to provide context for the query generation process. This step involves the following sub-tasks:

  • Natural language processing – NLP techniques are employed to analyze and understand your questions. This includes steps like tokenization and dependency parsing to extract the intent and relevant entities from the natural language input.
  • Entity recognition – Named entity recognition (NER) is used to identify and classify relevant entities mentioned in your question, such as product names, dates, or region. This step helps map your input to the corresponding data elements in the database schema.
  • Intent mapping – The prompt builder maps your intent, extracted from the NLP analysis, to the appropriate data structures and operations required to fulfill the query. This mapping process uses the table schema and data dictionary to establish connections between your natural language questions and the database elements. The output of the prompt builder is a structured representation of your question, augmented with the necessary context from the database schema and data dictionary. This structured representation serves as input for the next step, SQL query generation.

The following is an example prompt for “What was the total spend for ElasticSearch last year?”

You will not respond to gibberish, random character sequences, or prompts that do not make logical sense. 
If the input does not make sense or is outside the scope of the provided context, do not respond with SQL 
but respond with - I do not know about this. Please fix your input.
You are an expert SQL developer. Only return the sql query. Do not include any verbiage. 
You are required to return SQL queries based on the provided schema and the service mappings for common services and 
their synonyms. The table with the provided schema is the only source of data. Do not use joins. Assume product, 
service are synonyms for product_servicecode and price,cost,spend are synonyms for line_item_unblended_cost. Use the 
column names from the provided schema while creating queries. Do not use preceding zeroes for the column month when 
creating the query. Only use predicates when asked. For your reference, current date is June 01, 2024. write a sql 
query for this task - What was the total spend for ElasticSearch last year?
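
A prompt like the one above can be assembled programmatically by combining fixed instructions with the schema, the data dictionary, and the user's question. The following is a simplified sketch of that idea; the template wording is illustrative, and the file paths match the repository layout described later in this post:

from datetime import date

PROMPT_TEMPLATE = """You are an expert SQL developer. Only return the sql query. Do not include any verbiage.
Use only the provided schema and service mappings as your source of data. Do not use joins.
Schema:
{schema}
Service mappings:
{mappings}
For your reference, current date is {today}.
Write a sql query for this task - {question}"""

def build_prompt(question: str) -> str:
    """Combine instructions, schema, and data dictionary with the user's question."""
    with open("app/schemas/cur_schema.txt") as f:
        schema = f.read()
    with open("app/schemas/service_mappings.csv") as f:
        mappings = f.read()
    return PROMPT_TEMPLATE.format(
        schema=schema,
        mappings=mappings,
        today=date.today().strftime("%B %d, %Y"),
        question=question,
    )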

SQL query generation

Based on the prompt generated from the prompt builder and your original question, Amazon Q Business generates the corresponding SQL query. The SQL query is tailored to retrieve the relevant data and perform the desired analysis or calculations to accurately answer the user’s question. This step may involve techniques such as:

  • Mapping your intent and entities to SQL clauses (SELECT, FROM, WHERE, JOIN, and so on)
  • Handling complex queries involving aggregations, subqueries, or predicates
  • Incorporating domain-specific knowledge or business rules into the query generation process

Running the query

In this step, the generated SQL query is run against the chosen data store, which could be a relational database, data warehouse, NoSQL database, or an object store like Amazon Simple Storage Service (Amazon S3). The data store serves as the repository for the data required to answer the user’s question. Depending on the architecture and requirements, the data store query may involve additional components or processes, such as:

  • Query optimization and indexing strategies
  • Materialized views for complex queries
  • Real-time data ingestion and updates
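
For example, with Athena as the data store, running the generated SQL from the application can be as simple as the following boto3 sketch; the database name, output location, and query are placeholders:

import time
import boto3

athena = boto3.client("athena")

def run_athena_query(sql: str, database: str, output_location: str):
    """Submit a query to Athena, wait for completion, and return the result rows."""
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query finishes
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]

# Example usage with placeholder values
rows = run_athena_query(
    sql="SELECT SUM(line_item_unblended_cost) FROM cur_daily WHERE year = '2023'",
    database="cur_database",
    output_location="s3://your-athena-results-bucket/",
)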

Query results

The query engine runs the generated SQL query against the data store and returns the query results. These results contain the insights or answers to the original user question. The presentation of the query results can take various forms, depending on the requirements of the application or UI:

  • Tabular data – The results can be displayed as a table or spreadsheet, suitable for structured data analysis
  • Visualizations – The query results can be rendered as charts, graphs, or other visual representations, providing a more intuitive way to understand and explore the data
  • Natural language responses – In some cases, the query results can be translated back into natural language statements or summaries, making the insights more accessible to non-technical users

In the following sections, we walk through the steps to deploy the web application and test the solution.

Prerequisites

Complete the following prerequisite steps:

  1. Set up IAM Identity Center and add users that you intend to give access to in your Amazon Q Business application.
  2. Have an existing, working Amazon Q Business application and give access to the users created in the previous step to the application.
  3. Make sure AWS Cost and Usage Reports (AWS CUR) data is available in Athena. If you already have CUR data, you can skip the following steps for CUR data setup. If not, you have a few options to set up CUR data:
    1. To set up sample CUR data, refer to the following lab and follow the instructions.
    2. You also need to set up an AWS Glue crawler to make the data available in Athena.
  4. If you already have an SSL certificate, you can skip this step; otherwise, generate a private certificate.
  5. Import the certificate into AWS Certificate Manager (ACM). For more details, refer to Importing a certificate.

Set up the application

Complete the following steps to set up the application:

  1. From your terminal, clone the GitHub repository:
git clone https://github.com/aws-samples/data-insights-with-amazon-q-business.git
  2. Go to the project directory:
cd data-insights-with-amazon-q-business
  3. Based on your CUR table, update the CUR schema under app/schemas/cur_schema.txt. Review the prompts under app/qb_config.py. The schema looks similar to the following code:

  4. Review the data dictionary under app/schemas/service_mappings.csv. You can modify the mappings according to your dataset. A sample data dictionary for CUR might look like the following screenshot.

  5. Zip up the code repository and upload it to an S3 bucket.
  6. Follow the steps in the GitHub repo to deploy the Streamlit application.

Access the web application

As part of the deployment steps, you launched an AWS CloudFormation stack. On the AWS CloudFormation console, navigate to the Outputs tab for the stack and find the URL to access the Streamlit application. When you open the URL in a browser, you’ll see a login screen like the following screenshot. Sign up to create a user in the Amazon Cognito user pool. After you’re validated, you can use the same credentials to log in to the web application.

Query your cost and usage data

Start with a simple query like “What was the total spend for ElasticSearch this year?” A relevant prompt will be created and sent to Amazon Q Business. It will respond back with the corresponding SQL query. Notice the predicate where product_servicecode = ‘AmazonES’. Amazon Q Business is able to formulate the query because it has the schema and the data dictionary in context. It understands that ElasticSearch is an AWS service represented by a column named product_servicecode in the CUR data schema and its corresponding value of ‘AmazonES’. Next, the query is run against Athena and you get the results back.

The sample dataset used in this post is from 2023. If you’re using the sample dataset, natural language queries referring to the current year will not return results. Modify your queries to use 2023 or mention the year in the user intent.

The following figure highlights the steps as explained in the data flow.

sample query run

You can also try complex queries like “Give me a list of the top 3 products by total spend last year. For each of these products, what percentage of the overall spend is from this product?” Because the prompt builder has schema and product (AWS services) information in its context, Amazon Q Business creates the corresponding query. In this case, you’ll see a query similar to the following:

SELECT 
product_servicecode,
SUM(line_item_unblended_cost) AS total_spend,
ROUND(SUM(line_item_unblended_cost) * 100.0 / (SELECT SUM(line_item_unblended_cost)
FROM cur_daily WHERE year = '2023'), 2) AS percentage_of_total
FROM cur_daily
WHERE year = '2023'
GROUP BY product_servicecode
ORDER BY total_spend DESC
LIMIT 3;

When the query is run against Athena, you’ll see similar results corresponding to your data.

Along with the data, you can also see a summary and trend analysis of your data on the Description tab of your Streamlit app.

The prompts used in the application are open domain and you’re free to update them in the code. For example, the following is a prompt used for a summary task:

You are an AI assistant. You are required to return a summary based on the provided data in attachment. Use at least 
100 words. The spend is in dollars. The unit of measurement is dollars. Give trend analysis too. Start your response 
with - Here is your summary..

The following screenshot shows the results.

Feedback loop

You also have the option of capturing feedback for the generated queries with the thumbs up/down icon on the web application. Currently, the feedback is captured in a local file under /app/feedback. You can change this implementation to write to a database of your choice and have it serve as a query validation mechanism after your testing, to allow only validated queries to run.

Clean up

To clean up your resources, delete the CloudFormation stack, Amazon Q Business application, and Athena tables.

Conclusion

In this post, we demonstrated how Amazon Q Business can effectively bridge the gap between users and data, enabling you to extract valuable insights from various data stores using natural language queries, without the need for extensive technical knowledge or SQL expertise. The natural language understanding capabilities of Amazon Q Business can accurately interpret user intent, extract relevant entities, and generate SQL to translate the user’s query into executable data operations. You can now empower a wider range of enterprise users to unlock the full value of your organization’s data assets. By democratizing data access and analysis using natural language queries, you can foster data-driven decision-making, drive innovation, and unlock new opportunities for growth and success.

In Part 2 of this series, we demonstrate how to integrate this architecture with LangChain using Amazon Q Business as a custom model. We also cover query validation and accuracy measurement.


About the Authors

Vishal Karlupia is a Senior Technical Account Manager/Lead at Amazon Web Services, Toronto. He specializes in generative AI applications and helps customers build and scale their AI/ML workloads on AWS. Outside of work, he enjoys being outdoors and keeping bonfires alive.

Srinivas Ganapathi is a Principal Technical Account Manager at Amazon Web Services. He is based in Toronto, Canada, and works with games customers to run efficient workloads on AWS.

Read More

Cohere Rerank 3 Nimble now generally available on Amazon SageMaker JumpStart

Cohere Rerank 3 Nimble now generally available on Amazon SageMaker JumpStart

The Cohere Rerank 3 Nimble foundation model (FM) is now generally available in Amazon SageMaker JumpStart. This model is the newest FM in Cohere’s Rerank model series, built to enhance enterprise search and Retrieval Augmented Generation (RAG) systems.

In this post, we discuss the benefits and capabilities of this new model with some examples.

Overview of Cohere Rerank models

Cohere’s Rerank family of models is designed to enhance existing enterprise search systems and RAG systems. Rerank models improve search accuracy over both keyword-based and embedding-based search systems. Cohere Rerank 3 is designed to reorder documents retrieved by initial search algorithms based on their relevance to a given query. A reranking model, also known as a cross-encoder, is a type of model that, given a query and document pair, will output a similarity score. For FMs, words, sentences, or entire documents are often encoded as dense vectors in a semantic space. By calculating the cosine of the angle between these vectors, you can quantify their semantic similarity and express it as a single similarity score. You can use this score to reorder the documents by relevance to your query.
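
To make the scoring idea concrete, the following sketch computes cosine similarity between toy query and document vectors, the kind of score an embedding-based first stage produces before a reranker such as Cohere Rerank 3 Nimble reorders the candidates. The vectors and document names are made up for illustration:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two dense vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings: one query vector and three candidate document vectors
query = np.array([0.9, 0.1, 0.3])
documents = {
    "doc_about_returns": np.array([0.8, 0.2, 0.4]),
    "doc_about_passwords": np.array([0.1, 0.9, 0.2]),
    "doc_about_shipping": np.array([0.5, 0.4, 0.5]),
}

# Rank candidates by similarity to the query, highest first
ranked = sorted(
    ((name, cosine_similarity(query, vec)) for name, vec in documents.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")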

Cohere Rerank 3 Nimble is the newest model from Cohere’s Rerank family of models, designed to improve speed and efficiency from its predecessor Cohere Rerank 3. According to Cohere’s benchmark tests including BEIR (Benchmarking IR) for accuracy and internal benchmarking datasets, Cohere Rerank 3 Nimble maintains high accuracy while being approximately 3–5 times faster than Cohere Rerank 3. The speed improvement is designed for enterprises looking to enhance their search capabilities without sacrificing performance.

The following diagram represents the two-stage retrieval of a RAG pipeline and illustrates where Cohere Rerank 3 Nimble is incorporated into the search pipeline.

Flow of Solution

In the first stage of retrieval in the RAG architecture, a set of candidate documents are returned based on the knowledge base that’s relevant to the query. In the second stage, Cohere Rerank 3 Nimble analyzes the semantic relevance between the query and each retrieved document, reordering them from most to least relevant. The top-ranked documents augment the original query with additional context. This process improves search result quality by identifying the most pertinent documents. Integrating Cohere Rerank 3 Nimble into a RAG system enables users to send fewer but higher-quality documents to the language model for grounded generation. This results in improved accuracy and relevance of search results without adding latency.

Overview of SageMaker JumpStart

SageMaker JumpStart offers access to a broad selection of publicly available FMs. These pre-trained models serve as powerful starting points that can be deeply customized to address specific use cases. You can now use state-of-the-art model architectures, such as language models, computer vision models, and more, without having to build them from scratch.

Amazon SageMaker is a comprehensive, fully managed machine learning (ML) platform that revolutionizes the entire ML workflow. It offers an unparalleled suite of tools that cater to every stage of the ML lifecycle, from data preparation to model deployment and monitoring. Data scientists and developers can use the SageMaker integrated development environment (IDE) to access a vast array of pre-built algorithms, customize their own models, and seamlessly scale their solutions. The platform’s strength lies in its ability to abstract away the complexities of infrastructure management, allowing you to focus on innovation rather than operational overhead. The automated ML capabilities of SageMaker, including automated machine learning (AutoML) features, democratize ML by enabling even non-experts to build sophisticated models. Furthermore, its robust governance features help organizations maintain control and transparency over their ML projects, addressing critical concerns around regulatory compliance.

Prerequisites

Make sure your SageMaker AWS Identity and Access Management (IAM) service role has the AmazonSageMakerFullAccess permission policy attached.

To deploy Cohere Rerank 3 Nimble successfully, confirm one of the following:

  • Make sure your IAM role has the following permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
    • aws-marketplace:ViewSubscriptions
    • aws-marketplace:Unsubscribe
    • aws-marketplace:Subscribe
  • Alternatively, confirm your AWS account has a subscription to the model. If so, you can skip the following deployment instructions and start with subscribing to the model package.

Deploy Cohere Rerank 3 Nimble on SageMaker JumpStart

You can access the Cohere Rerank 3 family of models using SageMaker JumpStart in Amazon SageMaker Studio, as shown in the following screenshot.

Cohere SageMaker JumpStart view

Deployment starts when you choose Deploy, and you may be prompted to subscribe to this model through AWS Marketplace. If you are already subscribed, you can choose Deploy again to deploy the model. After deployment finishes, you will see that an endpoint is created. You can test the endpoint by passing a sample inference request payload or by selecting the testing option using the SDK.

Cohere rerank model card

Subscribe to the model package

To subscribe to the model package, complete the following steps:

  1. Depending on the model you want to deploy, open the model package listing page for cohere-rerank-nimble-english or cohere-rerank-nimble-multilingual.
  2. On the AWS Marketplace listing, choose Continue to subscribe.
  3. On the Subscribe to this software page, review and choose Accept Offer if you and your organization agree with the EULA, pricing, and support terms.
  4. Choose Continue to configuration and then choose an AWS Region.

A product ARN will be displayed. This is the model package ARN that you need to specify while creating a deployable model using Boto3.

Deploy Cohere Rerank 3 Nimble using the SDK

To deploy the model using the SDK, copy the product ARN from the previous step and specify it in the model_package_arn in the following code:

from cohere_aws import Client
import boto3
region = boto3.Session().region_name

model_package_arn = "Specify the model package ARN here"

After you specify the model package ARN, you can create the endpoint, as shown in the following code. Specify the name of the endpoint, the instance type, and the number of instances being used. Make sure your account-level service quota allows at least one ml.g5.xlarge instance for endpoint usage. To request a service quota increase, refer to AWS service quotas.

co = Client(region_name=region)
co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-rerank-3/cohere-rerank-nimble-multilingual", instance_type="ml.g5.xlarge", n_instances=1)

If the endpoint is already created, you just need to connect to it with the following code:

co.connect_to_endpoint(endpoint_name="cohere-rerank-3/cohere-rerank-nimble-multilingual-v3")

Follow a similar process as detailed earlier to deploy Cohere Rerank 3 on SageMaker JumpStart.

Inference example with Cohere Rerank 3 Nimble

Cohere Rerank 3 Nimble offers robust multilingual support. The model is available in both English and multilingual versions supporting over 100 languages.

The following code example illustrates how to perform real-time inference using Cohere Rerank 3 Nimble-English:

documents = [
    {"Title":"Incorrect Password","Content":"Hello, I have been trying to access my account for the past hour and it keeps saying my password is incorrect. Can you please help me?"},
    {"Title":"Confirmation Email Missed","Content":"Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?"},
    {"Title":"Questions about Return Policy","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Customer Support is Busy","Content":"Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Received Wrong Item","Content":"Hi, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Customer Service is Unavailable","Content":"Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Return Policy for Defective Product","Content":"Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Wrong Item Received","Content":"Good morning, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Return Defective Product","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."}
]

In the following code, the top_n inference parameter for Cohere Rerank 3 and Rerank 3 Nimble specifies the number of top-ranked results to return after reranking the input documents. It allows you to control how many of the most relevant documents are included in the final output. To determine an optimal value for top_n, consider factors such as the diversity of your document set, the complexity of your queries, and the desired balance between precision and latency for enterprise search or RAG.

response = co.rerank(documents=documents, query='What emails have been about returning items?', rank_fields=["Title","Content"], top_n=2)

The following is the output from Cohere Rerank 3 Nimble-English:

Documents: [RerankResult<document: {'Title': 'Received Wrong Item', 'Content': 'Hi, I have a question about my recent order. I received the wrong item and I need to return it.'}, index: 4, relevance_score: 0.0068771075>, RerankResult<document: {'Title': 'Wrong Item Received', 'Content': 'Good morning, I have a question about my recent order. I received the wrong item and I need to return it.'}, index: 7, relevance_score: 0.0064131636>]

Cohere Rerank 3 Nimble multilingual support

The multilingual capabilities of Cohere Rerank 3 Nimble-Multilingual enable global organizations to provide consistent, improved search experiences to users across different Regions and language preferences.

In the following example, we create an input payload for a list of emails in multiple languages. We can take the same set of emails from earlier and translate them to different languages. These examples are available under the SageMaker JumpStart model card and are randomly generated for this example.

documents = [
    {"Title":"Contraseña incorrecta","Content":"Hola, llevo una hora intentando acceder a mi cuenta y sigue diciendo que mi contraseña es incorrecta. ¿Puede ayudarme, por favor?"},
    {"Title":"Confirmation Email Missed","Content":"Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?"},
    {"Title":"أسئلة حول سياسة الإرجاع","Content":"مرحبًا، لدي سؤال حول سياسة إرجاع هذا المنتج. لقد اشتريته قبل بضعة أسابيع وهو معيب"},
    {"Title":"Customer Support is Busy","Content":"Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Falschen Artikel erhalten","Content":"Hallo, ich habe eine Frage zu meiner letzten Bestellung. Ich habe den falschen Artikel erhalten und muss ihn zurückschicken."},
    {"Title":"Customer Service is Unavailable","Content":"Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Return Policy for Defective Product","Content":"Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"收到错误物品","Content":"早上好,关于我最近的订单,我有一个问题。我收到了错误的商品,需要退货。"},
    {"Title":"Return Defective Product","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."}
]

Use the following code to perform real-time inference using Cohere Rerank 3 Nimble-Multilingual:

response = co.rerank(documents=documents, query='What emails have been about returning items?', rank_fields=['Title','Content'], top_n=2)
print(f'Documents: {response}')

The following is the output from Cohere Rerank 3 Nimble-Multilingual:

Documents: [RerankResult<document: {'Title': '收到错误物品', 'Content': '早上好,关于我最近的订单,我有一个问题。我收到了错误的商品,需要退货。'}, index: 7, relevance_score: 0.034553625>, RerankResult<document: {'Title': 'أسئلة حول سياسة الإرجاع', 'Content': 'مرحبًا، لدي سؤال حول سياسة إرجاع هذا المنتج. لقد اشتريته قبل بضعة أسابيع وهو معيب'}, index: 2, relevance_score: 0.00037263767>]

The output translated to English is as follows:

Documents: [RerankResult<document: {'Title': 'Received Wrong Item', 'Content': 'Good morning, I have a question about my recent order. I received the wrong item and need to return it.'}, index: 7, relevance_score: 0.034553625>, RerankResult<document: {'Title': 'Questions about Return Policy', 'Content': 'Hello, I have a question about the return policy for this product. I bought it a few weeks ago and it's defective'}, index: 2, relevance_score: 0.00037263767>]

In both examples, the relevance scores are normalized to be in the range [0, 1]. Scores close to 1 indicate a high relevance to the query, and scores closer to 0 indicate low relevance.

Use cases suitable for Cohere Rerank 3 Nimble

The Cohere Rerank 3 Nimble model provides an option that prioritizes efficiency. The model is ideal for enterprises looking to enable their customers to accurately search complex documentation, build applications that understand over 100 languages, and retrieve the most relevant information from various data stores. In industries such as retail, where website drop-off increases with every 100 milliseconds added to search response time, having a faster AI model like Cohere Rerank 3 Nimble powering the enterprise search system translates to higher conversion rates.

Conclusion

Cohere Rerank 3 and Rerank 3 Nimble are now available on SageMaker JumpStart. To get started, refer to Train, deploy, and evaluate pretrained models with SageMaker JumpStart.

Interested in diving deeper? Check out the Cohere on AWS GitHub repo.


About the Authors

Breanne Warner is an Enterprise Solutions Architect at Amazon Web Services supporting healthcare and life science (HCLS) customers. She is passionate about supporting customers to use generative AI on AWS and evangelizing model adoption. Breanne is also on the Women@Amazon board as co-director of Allyship with the goal of fostering inclusive and diverse culture at Amazon. Breanne holds a Bachelor of Science in Computer Engineering from the University of Illinois at Urbana-Champaign (UIUC).

Nithin Vijeaswaran is a Solutions Architect at AWS. His area of focus is generative AI and AWS AI Accelerators. He holds a Bachelor’s degree in Computer Science and Bioinformatics. Niithiyn works closely with the Generative AI GTM team to enable AWS customers on multiple fronts and accelerate their adoption of generative AI. He’s an avid fan of the Dallas Mavericks and enjoys collecting sneakers.

Karan Singh is a Generative AI Specialist for third-party models at AWS, where he works with top-tier third-party foundational model providers to define and run joint GTM motions that help customers train, deploy, and scale foundational models. Karan holds a Bachelor of Science in Electrical and Instrumentation Engineering from Manipal University and a Master of Science in Electrical Engineering from Northwestern University, and is currently an MBA candidate at the Haas School of Business at University of California, Berkeley.

Read More

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

Amazon SageMaker Canvas now empowers enterprises to harness the full potential of their data by enabling support of petabyte-scale datasets. Starting today, you can interactively prepare large datasets, create end-to-end data flows, and invoke automated machine learning (AutoML) experiments on petabytes of data—a substantial leap from the previous 5 GB limit. With over 50 connectors, an intuitive Chat for data prep interface, and petabyte support, SageMaker Canvas provides a scalable, low-code/no-code (LCNC) ML solution for handling real-world, enterprise use cases.

Organizations often struggle to extract meaningful insights and value from their ever-growing volume of data. You need data engineering expertise and time to develop the proper scripts and pipelines to wrangle, clean, and transform data. Then you must experiment with numerous models and hyperparameters requiring domain expertise. Afterward, you need to manage complex clusters to process and train your ML models over these large-scale datasets.

Starting today, you can prepare your petabyte-scale data and explore many ML models with AutoML by chat and with a few clicks. In this post, we show you how you can complete all these steps with the new integration in SageMaker Canvas with Amazon EMR Serverless without writing code.

Solution overview

For this post, we use a sample dataset of a 33 GB CSV file containing flight purchase transactions from Expedia between April 16, 2022, and October 5, 2022. We use features such as the flight date, distance, and seat type to predict the base fare of a ticket.

In the following sections, we demonstrate how to import and prepare the data, optionally export the data, create a model, and run inference, all in SageMaker Canvas.

Prerequisites

You can follow along by completing the following prerequisites:

  1. Set up SageMaker Canvas.
  2. Download the dataset from Kaggle and upload it to an Amazon Simple Storage Service (Amazon S3) bucket.
  3. Add emr-serverless as a trusted entity to the SageMaker Canvas execution role to allow Amazon EMR processing jobs.

Import data in SageMaker Canvas

We start by importing the data from Amazon S3 using Amazon SageMaker Data Wrangler in SageMaker Canvas. Complete the following steps:

  1. In SageMaker Canvas, choose Data Wrangler in the navigation pane.
  2. On the Data flows tab, choose Tabular on the Import and prepare dropdown menu.
  3. Enter the S3 URI for the file and choose Go, then choose Next.
  4. Give your dataset a name, choose Random for Sampling method, then choose Import.

Importing data from the SageMaker Data Wrangler flow allows you to interact with a sample of the data before scaling the data preparation flow to the full dataset. This improves time and performance because you don’t need to work with the entirety of the data during preparation. You can later use EMR Serverless to handle the heavy lifting. When SageMaker Data Wrangler finishes importing, you can start transforming the dataset.

After you import the dataset, you can first look at the Data Quality Insights Report to see recommendations from SageMaker Canvas on how to improve the data quality and therefore improve the model’s performance.

  1. In the flow, choose the options menu (three dots) for the node, then choose Get data insights.
  2. Give your analysis a name, select Regression for Problem type, choose baseFare for Target column, select Sampled dataset for Data Size, then choose Create.

Assessing the data quality and analyzing the report’s findings is often the first step because it can guide the proceeding data preparation steps. Within the report, you will find dataset statistics, high priority warnings around target leakage, skewness, anomalies, and a feature summary.

Prepare the data with SageMaker Canvas

Now that you understand your dataset characteristics and potential issues, you can use the Chat for data prep feature in SageMaker Canvas to simplify data preparation with natural language prompts. This generative artificial intelligence (AI)-powered capability reduces the time, effort, and expertise required for the often complex tasks of data preparation.

  1. Choose the .flow file on the top banner to go back to your flow canvas.
  2. Choose the options menu for the node, then choose Chat for data prep.

For our first example, converting searchDate and flightDate to datetime format might help us perform date manipulations and extract useful features such as year, month, day, and the difference in days between searchDate and flightDate. These features can find temporal patterns in the data that can influence the baseFare.

  1. Provide a prompt like “Convert searchDate and flightDate to datetime format” to view the code and choose Add to steps.

In addition to data preparation using the chat UI, you can use LCNC transforms with the SageMaker Data Wrangler UI to transform your data. For example, we use one-hot encoding as a technique to convert categorical data into numerical format using the LCNC interface.

  1. Add the transform Encode categorical.
  2. Choose One-hot encode for Transform and add the following columns: startingAirport, destinationAirport, fareBasisCode, segmentsArrivalAirportCode, segmentsDepartureAirportCode, segmentsAirlineName, segmentsAirlineCode, segmentsEquipmentDescription, and segmentsCabinCode.

You can use the advanced search and filter option in SageMaker Canvas to select columns that are of String data type to simplify the process.

Refer to the SageMaker Canvas blog for other examples using SageMaker Data Wrangler. For this post, we simplify our efforts with these two steps, but we encourage you to use both chat and transforms to add data preparation steps on your own. In our testing, we successfully ran all our data preparation steps through the chat using the following prompts as an example:

  • “Add another step that extracts relevant features such as year, month, day, and day of the week which can enhance temporality to our dataset”
  • “Have Canvas convert the travelDuration, segmentsDurationInSeconds, and segmentsDistance column from string to numeric”
  • “Handle missing values by imputing the mean for the totalTravelDistance column, and replacing missing values as ‘Unknown’ for the segmentsEquipmentDescription column”
  • “Convert boolean columns isBasicEconomy, isRefundable, and isNonStop to integer format (0 and 1)”
  • “Scale numerical features like totalFare, seatsRemaining, totalTravelDistance using Standard Scaler from scikit-learn”
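To illustrate what the chat produces for the imputation, Boolean conversion, and scaling prompts above, the generated code is typically along these lines (a minimal sketch with made-up sample values; the exact code Canvas generates may differ):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed sample rows with missing values; in the flow, these steps run on the sampled dataset
df = pd.DataFrame({
    "totalTravelDistance": [947.0, None, 2475.0],
    "segmentsEquipmentDescription": ["Airbus A321", None, "Boeing 737-800"],
    "isNonStop": [True, False, True],
    "totalFare": [248.6, 312.1, 199.0],
})

# Impute the mean for the numeric column and a sentinel value for the text column
df["totalTravelDistance"] = df["totalTravelDistance"].fillna(df["totalTravelDistance"].mean())
df["segmentsEquipmentDescription"] = df["segmentsEquipmentDescription"].fillna("Unknown")

# Convert the Boolean flag to integer format (0 and 1)
df["isNonStop"] = df["isNonStop"].astype(int)

# Scale numerical features with scikit-learn's StandardScaler
scaler = StandardScaler()
df[["totalFare", "totalTravelDistance"]] = scaler.fit_transform(df[["totalFare", "totalTravelDistance"]])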

When these steps are complete, you can move to the next step of processing the full dataset and creating a model.

(Optional) Export your data in Amazon S3 using an EMR Serverless job

You can process the entire 33 GB dataset by running the data flow using EMR Serverless for the data preparation job without worrying about the infrastructure.

  1. From the last node in the flow diagram, choose Export and Export data to Amazon S3.
  2. Provide a dataset name and output location.
  3. It is recommended to keep Auto job configuration selected unless you want to change any of the Amazon EMR or SageMaker Processing configurations. (If your data is greater than 5 GB, data processing will run in EMR Serverless; otherwise, it will run within the SageMaker Canvas workspace.)
  4. Under EMR Serverless, provide a job name and choose Export.

You can view the job status in SageMaker Canvas on the Data Wrangler page on the Jobs tab.

You can also view the job status on the Amazon EMR Studio console by choosing Applications under Serverless in the navigation pane.

Create a model

You can also create a model at the end of your flow.

  1. Choose Create model from the node options, and SageMaker Canvas will create a dataset and then navigate you to create a model.
  2. Provide a dataset and model name, select Predictive analysis for Problem type, choose baseFare as the target column, then choose Export and create model.

The model creation process will take a couple of minutes to complete.

  1. Choose My Models in the navigation pane.
  2. Choose the model you just exported and navigate to version 1.
  3. Under Model type, choose Configure model.
  4. Select Numeric model type, then choose Save.
  5. On the dropdown menu, choose Quick Build to start the build process.

When the build is complete, on the Analyze page, you can review the following tabs:

  • Overview – This gives you a general overview of the model’s performance, depending on the model type.
  • Scoring – This shows visualizations that you can use to get more insights into your model’s performance beyond the overall accuracy metrics.
  • Advanced metrics – This contains your model’s scores for advanced metrics and additional information that can give you a deeper understanding of your model’s performance. You can also view information such as the column impacts.

Run inference

In this section, we walk through the steps to run batch predictions against the generated dataset.

  1. On the Analyze page, choose Predict.
  2. To generate predictions on your test dataset, choose Manual.
  3. Select the test dataset you created and choose Generate predictions.
  4. When the predictions are ready, either choose View in the pop-up message at the bottom of the page or navigate to the Status column to choose Preview on the options menu (three dots).

You’re now able to review the predictions.

You have now used the generative AI data preparation capabilities in SageMaker Canvas to prepare a large dataset, trained a model using AutoML techniques, and run batch predictions at scale. All of this was done with a few clicks and using a natural language interface.

Clean up

To avoid incurring future session charges, log out of SageMaker Canvas. To log out, choose Log out in the navigation pane of the SageMaker Canvas application.

When you log out of SageMaker Canvas, your models and datasets aren’t affected, but SageMaker Canvas cancels any Quick build tasks. If you log out of SageMaker Canvas while running a Quick build, your build might be interrupted until you relaunch the application. When you relaunch, SageMaker Canvas automatically restarts the build. Standard builds continue even if you log out.

Conclusion

The introduction of petabyte-scale AutoML support within SageMaker Canvas marks a significant milestone in the democratization of ML. By combining the power of generative AI, AutoML, and the scalability of EMR Serverless, we’re empowering organizations of all sizes to unlock insights and drive business value from even the largest and most complex datasets.

The benefits of ML are no longer confined to the domain of highly specialized experts. SageMaker Canvas is revolutionizing the way businesses approach data and AI, putting the power of predictive analytics and data-driven decision-making into the hands of everyone. Explore the future of no-code ML with SageMaker Canvas today.


About the authors

Bret Pontillo is a Sr. Solutions Architect at AWS. He works closely with enterprise customers building data lakes and analytical applications on the AWS platform. In his free time, Bret enjoys traveling, watching sports, and trying new restaurants.

Polaris Jhandi is a Cloud Application Architect with AWS Professional Services. He has a background in AI/ML & big data. He is currently working with customers to migrate their legacy Mainframe applications to the Cloud.

Peter Chung is a Solutions Architect serving enterprise customers at AWS. He loves to help customers use technology to solve business problems on various topics like cutting costs and leveraging artificial intelligence. He wrote a book on AWS FinOps, and enjoys reading and building solutions.

Read More

Delight your customers with great conversational experiences via QnABot, a generative AI chatbot


QnABot on AWS (an AWS Solution) now provides access to Amazon Bedrock foundational models (FMs) and Knowledge Bases for Amazon Bedrock, a fully managed end-to-end Retrieval Augmented Generation (RAG) workflow. You can now provide contextual information from your private data sources that can be used to create rich, contextual, conversational experiences.

The advent of generative artificial intelligence (AI) provides organizations unique opportunities to digitally transform customer experiences. Enterprises with contact center operations are looking to improve customer satisfaction by providing self-service, conversational, interactive chat bots that have natural language understanding (NLU). Enterprises want to automate frequently asked transactional questions, provide a friendly conversational interface, and improve operational efficiency. In turn, customers can ask a variety of questions and receive accurate answers powered by generative AI.

In this post, we discuss how to use QnABot on AWS to deploy a fully functional chatbot integrated with other AWS services, and delight your customers with human agent-like conversational experiences.

Solution overview

QnABot on AWS is an AWS Solution that enterprises can use to enable a multi-channel, multi-language chatbot with NLU to improve end customer experiences. QnABot provides a flexible, tiered conversational interface empowering enterprises to meet customers where they are and provide accurate responses. Some responses need to be exact (for example, regulated industries like healthcare or capital markets), some responses need to be searched from large, indexed data sources and cited, and some answers need to be generated on the fly, conversationally, based on semantic context. With QnABot on AWS, you can achieve all of the above by deploying the solution using an AWS CloudFormation template, with no coding required. The solution is extensible, uses AWS AI and machine learning (ML) services, and integrates with multiple channels such as voice, web, and text (SMS).

QnABot on AWS provides access to multiple FMs through Amazon Bedrock, so you can create conversational interfaces based on your customers’ language needs (such as Spanish, English, or French), sophistication of questions, and accuracy of responses based on user intent. You now have the capability to access various large language models (LLMs) from leading AI enterprises (such as Amazon Titan, Anthropic Claude 3, Cohere Command, Meta Llama 3, Mistral AI Large, and others on Amazon Bedrock) to find the model best suited for your use case. Additionally, native integration with Knowledge Bases for Amazon Bedrock allows you to retrieve specific, relevant data from your data sources via pre-built data source connectors (Amazon Simple Storage Service (Amazon S3), Confluence, Microsoft SharePoint, Salesforce, or web crawlers) and have it automatically converted to text embeddings stored in a vector database of your choice. You can then retrieve your company-specific information with source attribution (such as citations) to improve transparency and minimize hallucinations. Lastly, if you don’t want to set up custom integrations with large data sources, you can simply upload your documents and support multi-turn conversations. With prompt engineering, managed RAG workflows, and access to multiple FMs, you can provide your customers rich, human agent-like experiences with precise answers.

Deploying the QnABot solution builds the following environment in the AWS Cloud.

Figure 1: QnABot Architecture Diagram

The high-level process flow for the solution components deployed with the CloudFormation template is as follows:

  1. The admin deploys the solution into their AWS account, opens the Content Designer UI or Amazon Lex web client, and uses Amazon Cognito to authenticate.
  2. After authentication, Amazon API Gateway and Amazon S3 deliver the contents of the Content Designer UI.
  3. The admin configures questions and answers in the Content Designer and the UI sends requests to API Gateway to save the questions and answers.
  4. The Content Designer AWS Lambda function saves the input in Amazon OpenSearch Service in a questions bank index. If using text embeddings, these requests first pass through an LLM hosted on Amazon Bedrock or Amazon SageMaker to generate embeddings before being saved into the question bank on OpenSearch Service.
  5. Users of the chatbot interact with Amazon Lex through the web client UI, Amazon Alexa, or Amazon Connect.
  6. Amazon Lex forwards requests to the Bot Fulfillment Lambda function. Users can also send requests to this Lambda function through Amazon Alexa devices.
  7. The user and chat information is stored in Amazon DynamoDB to disambiguate follow-up questions from previous question and answer context.
  8. The Bot Fulfillment Lambda function takes the user’s input and uses Amazon Comprehend and Amazon Translate (if necessary) to translate non-native language requests to the native language selected by the user during the deployment, and then looks up the answer in OpenSearch Service. If using LLM features such as text generation and text embeddings, these requests first pass through various LLM models hosted on Amazon Bedrock or SageMaker to generate the search query and embeddings to compare with those saved in the question bank on OpenSearch Service.
  9. If no match is returned from the OpenSearch Service question bank, then the Bot Fulfillment Lambda function forwards the request as follows:
    1. If an Amazon Kendra index is configured for fallback, then the Bot Fulfillment Lambda function forwards the request to Amazon Kendra if no match is returned from the OpenSearch Service question bank. The text generation LLM can optionally be used to create the search query and synthesize a response from the returned document excerpts.
    2. If a knowledge base ID is configured, the Bot Fulfillment Lambda function forwards the request to the knowledge base. The Bot Fulfillment Lambda function uses the RetrieveAndGenerate API to fetch the relevant results for a user query, augment the FM’s prompt, and return the response.
  10. User interactions with the Bot Fulfillment function generate logs and metrics data, which is sent to Amazon Kinesis Data Firehose and then to Amazon S3 for later data analysis.
  11. OpenSearch Dashboards can be used to view usage history, logged utterances, no hits utterances, positive user feedback, and negative user feedback, and also provides the ability to create custom reports.

Prerequisites

To get started, you need the following:

  • An AWS account
  • An active deployment of QnABot on AWS (version 6.0.0 or later)
  • Amazon Bedrock model access (required) for all embeddings and LLM models that will be used in QnABot

Figure 2: Request Access to Bedrock Foundational Models (FMs)

In the following sections, we explore some of QnABot’s generative AI features.

Semantic question matching using an embeddings LLM

QnABot on AWS can use text embeddings to provide semantic search capabilities by using LLMs. The goal of this feature is to improve question matching accuracy while reducing the amount of tuning required when compared to the default OpenSearch Service keyword-based matching.

Some of the benefits include:

  • Improved FAQ accuracy from semantic matching vs. keyword matching (comparing the meaning vs. comparing individual words)
  • Fewer training utterances required to match a diverse set of queries
  • Better multi-language support, because translated utterances only need to match the meaning of the stored text, not the wording

Configure Amazon Bedrock to enable semantic question matching

To enable these expanded semantic search capabilities, QnABot uses an Amazon Bedrock FM to generate text embeddings, specified using the EmbeddingsBedrockModelId CloudFormation stack parameter. These models provide the best performance and operate on a pay-per-request model. At the time of writing, QnABot on AWS supports several Amazon Bedrock embeddings models; refer to the solution documentation for the current list.

For the CloudFormation stack, set the following parameters:

  • Set EmbeddingsAPI to BEDROCK
  • Set EmbeddingsBedrockModelId to one of the available options

For example, with semantic matching enabled, the question “What’s the address of the White House?” matches to “Where does the President live?” This example doesn’t match using keywords because they don’t share any of the same words.


Figure 3: Semantic matching in QnABot

In the UI designer, you can set ENABLE_DEBUG_RESPONSE to true to see the user input, source, or any errors of the answer, as illustrated in the preceding screenshot.

You can also evaluate the matching score on the TEST tab in the content designer UI. In this example, we add a match on “qna item question” with the question “Where does the President live?”


Figure 4: Test and evaluate answers in QnABot

Similarly, you can try a match on “item text passage” with the question “Where did Humpty Dumpty sit?”


Figure 5: Match items or text passages in QnABot

Recommendations for tuning with an embeddings LLM

When using embeddings in QnABot, we recommend generalizing questions because more user utterances will match a general statement. For example, the embeddings LLM model will cluster “checking” and “savings” with “account,” so if you want to match both account types, use “account” in your questions.

Similarly, for the question and utterance of “transfer to an agent,” consider using “transfer to someone” because it will better match with “agent,” “representative,” “human,” “person,” and so on.

In addition, we recommend tuning EMBEDDINGS_SCORE_THRESHOLD, EMBEDDINGS_SCORE_ANSWER_THRESHOLD, and EMBEDDINGS_TEXT_PASSAGE_SCORE_THRESHOLD based on the scores. The default values are generalized across multiple models, but you might need to modify them based on your embeddings model and your experiments.

Text generation and query disambiguation using a text LLM

QnABot on AWS can use LLMs to provide a richer, more conversational chat experience. The goal of these features is to minimize the amount of individually curated answers administrators are required to maintain, improve question matching accuracy by providing query disambiguation, and enable the solution to provide more concise answers to users, especially when using a knowledge base in Amazon Bedrock or the Amazon Kendra fallback feature.

Configure an Amazon Bedrock FM with AWS CloudFormation

To enable these capabilities, QnABot uses one of the Amazon Bedrock FMs for text generation and query disambiguation, specified using the LLMBedrockModelId CloudFormation stack parameter. These models provide the best performance and operate on a pay-per-request model.

For the CloudFormation stack, set the following parameters:

  • Set LLMApi to BEDROCK
  • Set LLMBedrockModelId to one of the available LLM options

Figure 6: Setup QnABot to use Bedrock FMs

Query disambiguation (LLM-generated query)

By using an LLM, QnABot can take the user’s chat history and generate a standalone question for the current utterance. This enables users to ask follow-up questions that on their own may not be answerable without the context of the conversation. The new disambiguated, or standalone, question can then be used as a search query to retrieve the best FAQ, passage, or Amazon Kendra match.

In QnABot’s Content Designer, you can further customize the prompt and model listed in the Query Matching section:

  • LLM_GENERATE_QUERY_PROMPT_TEMPLATE – The prompt template used to construct a prompt for the LLM to disambiguate a follow-up question. The template may use the following placeholders:
    • history – A placeholder for the last LLM_CHAT_HISTORY_MAX_MESSAGES messages in the conversational history, to provide conversational context.
    • input – A placeholder for the current user utterance or question.
  • LLM_GENERATE_QUERY_MODEL_PARAMS – The parameters sent to the LLM model when disambiguating follow-up questions. Refer to the relevant model documentation for additional values that the model provider accepts.
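As an illustration only (not the solution’s shipped default), a disambiguation prompt template built from the history and input placeholders described above might look like the following:

Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat history: {history}

Follow up question: {input}

Standalone question: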

The following screenshot shows an example with the new LLM disambiguation feature enabled, given the chat history context after answering “Who was Little Bo Peep” and the follow-up question “Did she find them again?”


Figure 7: LLM query disambiguation feature enabled

QnABot rewrites that question to provide all the context required to search for the relevant FAQ or passage: “Did Little Bo Peep find her lost sheep again?”


Figure 8: With query disambiguation with LLMs, context is maintained

Answer text generation using QnABot

You can now generate answers to questions from context provided by knowledge base search results, or from text passages created or imported directly into QnABot. This allows you to generate answers that reduce the number of FAQs you have to maintain, because you can now synthesize concise answers from your existing documents in a knowledge base, Amazon Kendra index, or document passages stored in QnABot as text items. Additionally, your generated answers can be concise and therefore suitable for voice or contact center chatbots, website bots, and SMS bots. Lastly, these generated answers are compatible with the solution’s multi-language support—customers can interact in their chosen languages and receive generated answers in the same language.

With QnABot, you can use two different data sources to generate responses: text passages or a knowledge base in Amazon Bedrock.

Generate answers to questions from text passages

In the content designer web interface, administrators can store full text passages for QnABot on AWS to use. When a question gets asked that matches against this passage, the solution can use LLMs to answer the user’s question based on information found within the passage. We highly recommend you use this option with semantic question matching using Amazon Bedrock text embedding. In QnABot content designer, you can further customize the prompt and model listed under Text Generation using the General Settings section.

Let’s look at a text passage example:

  1. In the Content Designer, choose Add.
  2. Select the text, enter an item ID and a passage, and choose Create.

You can also import your passages from a JSON file using the Content Designer Import feature. On the tools menu, choose Import, open Examples/Extensions, and choose LOAD next to TextPassage-NurseryRhymeExamples to import two nursery rhyme text items.

The following example shows QnABot generating an answer using a text passage item that contains the nursery rhyme, in response to the question “Where did Humpty Dumpty sit?”


Figure 9: Generate answers from text passages

You can also use query disambiguation and text generation together, by asking “Who tried to fix Humpty Dumpty?” and the follow-up question “Did they succeed?”


Figure 10: Text generation with query disambiguation to maintain context

You can also modify LLM_QA_PROMPT_TEMPLATE in the Content Designer to answer in different languages. In the template, you can specify the instructions and answers in different languages (for example, prompts in French or Spanish).


Figure 11: Answer in different languages

You can also specify answers in two languages with bulleted points.


Figure 12: Answer in multiple languages

RAG using an Amazon Bedrock knowledge base

By integrating with a knowledge base, QnABot on AWS can generate concise answers to users’ questions from configured data sources. This prevents the need for users to sift through larger text passages to find the answer. You can also create your own knowledge base from files stored in an S3 bucket. Amazon Bedrock knowledge bases with QnABot don’t require EmbeddingsApi and LLMApi because the embeddings and generative response are already provided by the knowledge base. To enable this option, create an Amazon Bedrock knowledge base and use your knowledge base ID for the CloudFormation stack parameter BedrockKnowledgeBaseId.
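For context, the knowledge base lookup that QnABot’s Bot Fulfillment function performs corresponds to the RetrieveAndGenerate API mentioned earlier in the architecture walkthrough. The following standalone sketch shows an equivalent call outside of QnABot; the knowledge base ID and model ARN are placeholders:

import boto3

# Query a Bedrock knowledge base directly; QnABot makes an equivalent call from its
# Bot Fulfillment Lambda function. Replace the placeholder knowledge base ID and model ARN.
client = boto3.client("bedrock-agent-runtime")
response = client.retrieve_and_generate(
    input={"text": "What services are available in AWS for container orchestration?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])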

To configure QnABot to use the knowledge base, refer to Create a knowledge base. The following is a quick setup guide to get started:

  1. Provide your knowledge base details.

Figure 13: Setup Amazon Bedrock Knowledge Base for RAG use cases

  2. Configure your data source based on the available options. For this example, we use Amazon S3 as the data source; note that the bucket name has to start with qna or QNA.

Figure 14: Setup your RAG data sources for Amazon Knowledge Base

  3. Upload your documents to Amazon S3. For this example, we uploaded the aws-overview.pdf whitepaper to test the integration.
  4. Create or choose your vector database to allow Amazon Bedrock to store, update, and manage embeddings.
  5. Sync the data source and use your knowledge base ID for the CloudFormation stack parameter BedrockKnowledgeBaseId.

Figure 15: Complete setting up Amazon Bedrock Knowledge Base for your RAG use cases

In the QnABot Content Designer, you can customize additional settings listed under Text Generation using RAG with the Amazon Bedrock knowledge base.

QnABot on AWS can now answer questions from the AWS whitepapers, such as “What services are available in AWS for container orchestration?” and “Are there any upfront fees with ECS?”


Figure 16: Generate answers from your Amazon Bedrock Knowledge Base (RAG)

Conclusion

Customers expect quick and efficient service from enterprises in today’s fast-paced world. But providing an excellent customer experience can be significantly challenging when the volume of inquiries outpaces the human resources employed to address them. Companies of all sizes can use QnABot on AWS with built-in Amazon Bedrock integrations to provide access to many market-leading FMs, address specialized lookup needs using RAG to reduce hallucinations, and provide a friendly AI conversational experience. With QnABot on AWS, you can provide high-quality natural text conversations, content management, and multi-turn dialogues. The solution comes with one-click deployment for custom implementation, a content designer for Q&A management, and rich reporting. You can also integrate with contact center systems like Amazon Connect and Genesys Cloud CX. Get started with QnABot on AWS.


About the Author

Ajay Swamy is the Product Leader for Data, ML and Generative AI AWS Solutions. He specializes in building AWS Solutions (production-ready software packages) that deliver compelling value to customers by solving for their unique business needs. Other than QnABot on AWS, he manages Generative AI Application Builder, Enhanced Document Understanding, Discovering Hot Topics using Machine Learning, and other AWS Solutions. He lives with his wife and dog (Figaro), in New York, NY.

Abhishek Patil is a Software Development Engineer at Amazon Web Services (AWS) based in Atlanta, GA, USA. With over 7 years of experience in the tech industry, he specializes in building distributed software systems, with a primary focus on Generative AI and Machine Learning. Abhishek is a primary builder on AI solution QnABot on AWS and has contributed to other AWS Solutions including Discovering Hot Topics using Machine Learning and OSDU® Data Platform. Outside of work, Abhishek enjoys spending time outdoors, reading, resistance training, and practicing yoga.

Read More

Introducing document-level sync reports: Enhanced data sync visibility in Amazon Q Business


Amazon Q Business is a fully managed, generative artificial intelligence (AI)-powered assistant that helps enterprises unlock the value of their data and knowledge. With Amazon Q, you can quickly find answers to questions, generate summaries and content, and complete tasks by using the information and expertise stored across your company’s various data sources and enterprise systems. At the core of this capability are native data source connectors that seamlessly integrate and index content from multiple repositories into a unified index. This enables the Amazon Q large language model (LLM) to provide accurate, well-written answers by drawing from the consolidated data and information. The data source connectors act as a bridge, synchronizing content from disparate systems like Salesforce, Jira, and SharePoint into a centralized index that powers the natural language understanding and generative abilities of Amazon Q.

Customers appreciate that Amazon Q Business securely connects to over 40 data sources. While using their data source, they want better visibility into the document processing lifecycle during data source sync jobs. They want to know the status of each document they attempted to crawl and index, as well as the ability to troubleshoot why certain documents were not returned with the expected answers. Additionally, they want access to metadata, timestamps, and access control lists (ACLs) for the indexed documents.

We are pleased to announce a new feature now available in Amazon Q Business that significantly improves visibility into data source sync operations. The latest release introduces a comprehensive document-level report incorporated into the sync history, providing administrators with granular indexing status, metadata, and ACL details for every document processed during a data source sync job. This enhancement to sync job observability enables administrators to quickly investigate and resolve ingestion or access issues encountered while setting up an Amazon Q Business application. The detailed document reports are persisted in the new SYNC_RUN_HISTORY_REPORT log stream under the Amazon Q Business application log group, so critical sync job details are available on-demand when troubleshooting.

Lifecycle of a document in a data source sync run job

In this section, we examine the lifecycle of a document within a data source sync in Amazon Q Business. This provides valuable insight into the sync process. The data source sync comprises three key stages: crawling, syncing, and indexing. Crawling involves the connector connecting to the data source and extracting documents meeting the defined sync scope according to the data source configuration. These documents are then synced to Amazon Q Business during the syncing phase. Finally, indexing makes the synced documents searchable within the Amazon Q Business environment.

The following diagram shows a flowchart of a sync run job.

Crawling stage

The first stage is the crawling stage, where the connector crawls all documents and their metadata from the data source. During this stage, the connector also compares the checksum of the document against the Amazon Q index to figure out if a particular document needs to be added, modified, or deleted from the index. This operation corresponds to the CrawlAction field in the sync run history report.

If the document is unmodified, it is marked as UNMODIFIED and skipped in the rest of the stages. If any document fails in the crawling stage, for example due to throttling errors, broken content, or if the document size is too big, that document is marked as failed in the sync run history report with the CrawlStatus as FAILED. If the document was skipped due to any validation errors, its CrawlStatus is marked as SKIPPED. These documents are not sent forward to the next stage. All successful documents are marked as SUCCESS and are sent forward.

We also capture the ACLs and metadata on each document in this stage to be able to add it to the sync run history report.

Syncing stage

During the syncing stage, the document is sent to Amazon Q Business ingestion service APIs like BatchPutDocument and BatchDeleteDocument. After a document is submitted to these APIs, Amazon Q Business runs validation checks on the submitted documents. If any document fails these checks, its SyncStatus is marked as FAILED. If there is an irrecoverable error for a particular document, it is marked as SKIPPED and other documents are sent forward.

Indexing stage

In this step, Amazon Q Business parses the document, processes it according to its content type, and persists it in the index. If the document fails to be persisted, its IndexStatus is marked as FAILED; otherwise, it is marked as SUCCESS.

After the statuses of all the stages have been captured, we emit these statuses as an Amazon CloudWatch event to the customer’s AWS account.

Key features and benefits of document-level reports

The following are the key features and benefits of the new document level report in Amazon Q Business applications:

  • Enhanced sync run history page – A new Actions column has been added to the sync run history page, providing access to the document-level report for each sync run.
  • Dedicated log stream – A new log stream named SYNC_RUN_HISTORY_REPORT has been created in the Amazon Q Business CloudWatch log group, containing the document-level report.
  • Comprehensive document information – The document-level report includes the following information for each document:
    • Document ID – This is the document ID that is inherited directly from the data source or mapped by the customer in the data source field mappings.
    • Document title – The title of the document is taken from the data source or mapped by the customer in the data source field mappings.
    • Consolidated document status (SUCCESS, FAILED, or SKIPPED) – This is the final consolidated status of the document. If the document was successfully processed in all stages, the value is SUCCESS. If the document failed or was skipped in any of the stages, the value is FAILED or SKIPPED.
    • Error message (if the document failed) – This field contains the error message with which a document failed. If a document was skipped due to throttling errors or any internal errors, this is shown in the error message field.
    • Crawl status – This field denotes whether the document was crawled successfully from the data source. This status correlates to the syncing-crawling state in the data source sync.
    • Sync status – This field denotes whether the document was sent for syncing successfully. This correlates to the syncing-indexing state in the data source sync.
    • Index status – This field denotes whether the document was successfully persisted in the index.
    • ACLs – This field contains a list of document-level permissions that were crawled from the data source. Each element in the list has the following details:
      • Global name: The email or username of the user. This field is mapped across multiple data sources. For example, if a user has three data sources (Confluence, SharePoint, and Gmail) with the local user IDs confluence_user, sharepoint_user, and gmail_user respectively, and the email address user@email.com is the globalName in the ACL for all of them, then Amazon Q Business understands that all of these local user IDs map to the same global name.
      • Name: The local unique ID of the user, which is assigned by the data source.
      • Type: The principal type, either USER or GROUP.
      • Is Federated: A Boolean flag that indicates whether the group is INDEX level (true) or DATASOURCE level (false).
      • Access: Indicates whether the user’s access is explicitly allowed or denied. Values can be either ALLOWED or DENIED.
      • Data source ID: The data source ID. For federated (INDEX-level) groups, this field is null.
    • Metadata – This field contains the metadata fields (other than the ACL) that were pulled from the data source. The list also includes the metadata fields mapped by the customer in the data source field mappings, as well as extra metadata fields added by the connector.
    • Hashed document ID (for troubleshooting assistance) – To safeguard your data privacy, we present a secure, one-way hash of the document identifier. This encrypted value enables the Amazon Q Business team to efficiently locate and analyze the specific document within our logs, should you encounter any issue that requires further investigation and resolution.
    • Timestamp – The timestamp indicates when the document status was logged in CloudWatch.
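To make these fields concrete, a single document’s entry in the SYNC_RUN_HISTORY_REPORT log stream has roughly the following shape. This is an abbreviated, illustrative example written as a Python dictionary: the top-level field names match the CloudWatch queries later in this post, while the ACL element key names are assumptions based on the field descriptions above and may differ slightly in your logs.

example_log_entry = {
    "DocumentId": "doc-12345",
    "DocumentTitle": "multi-asset-fund-overview.pdf",
    "ConnectorDocumentStatus": {"Status": "SUCCESS"},  # consolidated status: SUCCESS, FAILED, or SKIPPED
    "ErrorMsg": "",
    "Acl": [
        {
            "globalName": "user@email.com",    # mapped across data sources
            "name": "sharepoint_user",         # local user ID assigned by the data source
            "type": "USER",                    # USER or GROUP
            "isFederated": False,              # True for INDEX-level groups
            "access": "ALLOWED",               # ALLOWED or DENIED
            "dataSourceId": "your-data-source-id",
        }
    ],
    "Metadata": [
        {"key": "_last_updated_at", "value": {"dateValue": "2024-07-01T12:00:00Z"}},
    ],
}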

In the following sections, we explore different use cases for the logging feature.

Troubleshoot “Sorry, I could not find relevant information” with the new logging feature

The new document-level logging feature in Amazon Q Business can help troubleshoot common issues related to the “Sorry, I could not find relevant information to complete your request” response.

Let’s explore an example scenario. A mutual funds manager uses Amazon Q Business chat for knowledge retrieval and insights extraction across their enterprise data stores. When the fund manager asks, “What is the CAGR of the multi-asset fund?” in the Amazon Q chat, they receive the “Sorry, I could not find relevant information to complete your request” response.

As the administrator managing their Amazon Q Business application, you can troubleshoot the issue using the following approach with the new logging feature. First, you want to determine whether the multi-asset fund document was successfully indexed in the Amazon Q Business application. Next, you need to verify if the fund manager’s user account has the required permission to read the information from the multi-asset fund document. Amazon Q Business enforces the document permissions configured in its data source, and you can use this new feature to verify that the document ACL settings are synced in the Amazon Q Business application index.

You can use the following CloudWatch query string to check the document ACL settings:

filter @logStream like 'SYNC_RUN_HISTORY_REPORT/' 
and DocumentTitle = "your-document-title"
| fields DocumentTitle, ConnectorDocumentStatus.Status, Acl
| sort @timestamp desc
| limit 1

This query filter uses the per-document-level logging stream SYNC_RUN_HISTORY_REPORT, and displays the document title and its associated ACL settings. By verifying the document indexing and permissions, you can identify and resolve potential issues that may be causing the “Sorry, I could not find relevant information” response.

The following screenshot shows an example result.
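If you want to run the same check outside the CloudWatch console, for example from a notebook or a small script, you can submit the query programmatically with CloudWatch Logs Insights. The log group name and document title below are placeholders for your environment:

import time
import boto3

logs = boto3.client("logs")
query = """
filter @logStream like 'SYNC_RUN_HISTORY_REPORT/'
and DocumentTitle = "your-document-title"
| fields DocumentTitle, ConnectorDocumentStatus.Status, Acl
| sort @timestamp desc
| limit 1
"""
start = logs.start_query(
    logGroupName="/aws/qbusiness/your-application-id",  # placeholder; use your Amazon Q Business application log group
    startTime=int(time.time()) - 7 * 24 * 3600,         # look back over the last 7 days
    endTime=int(time.time()),
    queryString=query,
)
while True:
    result = logs.get_query_results(queryId=start["queryId"])
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(2)
print(result["results"])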

Determine the optimal boosting duration for recent documents using document-level reporting

When it comes to generating accurate answers, you may want to fine-tune the way Amazon Q prioritizes its content. For instance, you may prefer to boost recent documents over older ones to make sure the most up-to-date passages are used to generate an answer. To achieve this, you can use the relevance tuning feature in Amazon Q Business to boost documents based on the last update date attribute, with a specified boosting duration. However, determining the optimal boosting period can be challenging when dealing with a large number of frequently changing documents.

You can now use the per-document-level report to obtain the _last_updated_at metadata field information for your documents, which can help you determine the appropriate boosting period. For this, you use the following CloudWatch Logs Insights query to retrieve the _last_updated_at metadata attribute for machine learning documents from the SYNC_RUN_HISTORY_REPORT log stream:

filter @logStream like 'SYNC_RUN_HISTORY_REPORT/' 
and Metadata like 'Machine Learning'
| parse Metadata '{"key":"_last_updated_at","value":{"dateValue":"*"}}' as @last_updated_at
| sort @last_updated_at desc, @timestamp desc
| dedup DocumentTitle

With the preceding query, you can gain insights into the last updated timestamps of your documents, enabling you to make informed decisions about the optimal boosting period. This approach makes sure your chat responses are generated using the most recent and relevant information, enhancing the overall accuracy and effectiveness of your Amazon Q Business implementation.

The following screenshot shows an example result.

Common document indexing observability and troubleshooting methods

In this section, we explore some common admin tasks for observing and troubleshooting document indexing using the new document-level reporting feature.

List all successfully indexed documents from a data source

To retrieve a list of all documents that have been successfully indexed from a specific data source, you can use the following CloudWatch query:

fields DocumentTitle, DocumentId, @timestamp
| filter @logStream like 'SYNC_RUN_HISTORY_REPORT/your-data-source-id/'
and ConnectorDocumentStatus.Status = "SUCCESS"
| sort @timestamp desc | dedup DocumentTitle, DocumentId

The following screenshot shows an example result. 

List all successfully indexed documents from a data source sync job

To retrieve a list of all documents that have been successfully indexed during a specific sync job, you can use the following CloudWatch query:

fields DocumentTitle, DocumentId, ConnectorDocumentStatus.Status AS IndexStatus, @timestamp
| filter @logStream like 'SYNC_RUN_HISTORY_REPORT/your-data-source-id/run-id'
and ConnectorDocumentStatus.Status = "SUCCESS"
| sort DocumentTitle

The following screenshot shows an example result.

List all failed indexed documents from a data source sync job

To retrieve a list of all documents that failed to index during a specific sync job, along with the error messages, you can use the following CloudWatch query:

fields DocumentTitle, DocumentId, ConnectorDocumentStatus.Status AS IndexStatus, ErrorMsg, @timestamp
| filter @logStream like 'SYNC_RUN_HISTORY_REPORT/your-data-source-id/run-id'
and ConnectorDocumentStatus.Status = "FAILED"
| sort @timestamp desc

The following screenshot shows an example result.

List all documents that contain a particular user’s ACL permission from an Amazon Q Business application

To retrieve a list of documents that have a specific user’s ACL permission, you can use the following CloudWatch Logs Insights query:

filter @logStream like 'SYNC_RUN_HISTORY_REPORT/' 
and Acl like 'aneesh@mydemoaws.onmicrosoft.com'
| display DocumentTitle, SourceUri

The following screenshot shows an example result.

List the ACL of an indexed document from a data source sync job

To retrieve the ACL information for a specific indexed document from a sync job, you can use the following CloudWatch Logs Insights query:

filter @logStream like 'SYNC_RUN_HISTORY_REPORT/data-source-id/run-id' 
and DocumentTitle = "your-document-title"
| display DocumentTitle, Acl

The following screenshot shows an example result.

List metadata of an indexed document from a data source sync job

To retrieve the metadata information for a specific indexed document from a sync job, you can use the following CloudWatch Logs Insights query:

filter @logStream like 'SYNC_RUN_HISTORY_REPORT/data-source-id/run-id' 
and DocumentTitle = "your-document-title"
| display DocumentTitle, Metadata

The following screenshot shows an example result.

Conclusion

The newly introduced document-level report in Amazon Q Business provides enhanced visibility and observability into the document processing lifecycle during data source sync jobs. This feature addresses a critical need expressed by customers for better troubleshooting capabilities and access to detailed information about the indexing status, metadata, and ACLs of individual documents.

The document-level report is stored in a dedicated log stream named SYNC_RUN_HISTORY_REPORT within the Amazon Q Business application CloudWatch log group. This report contains comprehensive information for each document, including the document ID, title, overall document sync status, error messages (if any), ACLs, and metadata information retrieved from the data sources. The data source sync run history page now includes an Actions column, providing access to the document-level report for each sync run. This feature significantly improves the ability to troubleshoot issues related to document ingestion, access control, and metadata relevance, and provides better visibility into the documents synced with an Amazon Q index.

To get started with Amazon Q Business, explore the Getting started guide. To learn more about data source connectors and best practices, see Configuring Amazon Q Business data source connectors.


About the authors

Aneesh Mohan is a Senior Solutions Architect at Amazon Web Services (AWS), bringing two decades of experience in creating impactful solutions for business-critical workloads. He is passionate about technology and loves working with customers to build well-architected solutions, focusing on the financial services industry, AI/ML, security, and data technologies.

Ashwin Shukla is a Software Development Engineer II on the Amazon Q for Business and Amazon Kendra engineering team, with 6 years of experience in developing enterprise software. In this role, he works on designing and developing foundational features for Amazon Q for Business.

Read More

Derive generative AI-powered insights from ServiceNow with Amazon Q Business


Effective customer support, project management, and knowledge management are critical aspects of providing efficient customer relationship management. ServiceNow is a platform for incident tracking, knowledge management, and project management functions for software projects and has become an indispensable part of many organizations’ workflows to ensure success of the customer and the product. However, extracting valuable insights from the vast amount of data stored in ServiceNow often requires manual effort and building specialized tooling. Users such as support engineers, project managers, and product managers need to be able to ask questions about an incident or a customer, or get answers from knowledge articles in order to provide excellent customer support. Organizations use ServiceNow to manage workflows, such as IT services, ticketing systems, configuration management, and infrastructure changes across IT systems. Generative artificial intelligence (AI) provides the ability to take relevant information from a data source such as ServiceNow and provide well-constructed answers back to the user.

Building a generative AI-based conversational application integrated with relevant data sources requires an enterprise to invest time, money, and people. First, you need to build connectors to the data sources. Next, you need to index this data to make it available for a Retrieval Augmented Generation (RAG) approach, where relevant passages are delivered with high accuracy to a large language model (LLM). To do this, you need to select an index that provides the capabilities to index the content for semantic and vector search, build the infrastructure to retrieve and rank the answers, and build a feature-rich web application. Additionally, you need to hire and staff a large team to build, maintain, and manage such a system.

Amazon Q Business is a fully managed generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Amazon Q Business can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take action using the data and expertise found in your company’s information repositories, code, and enterprise systems (such as ServiceNow, among others). Amazon Q provides out-of-the-box native data source connectors that can index content into a built-in retriever and uses an LLM to provide accurate, well-written answers. A data source connector is a component of Amazon Q that helps integrate and synchronize data from multiple repositories into one index.

Amazon Q Business offers multiple prebuilt connectors to a large number of data sources, including ServiceNow, Atlassian Confluence, Amazon Simple Storage Service (Amazon S3), Microsoft SharePoint, Salesforce, and many more, and helps you create your generative AI solution with minimal configuration. For a full list of Amazon Q Business supported data source connectors, see Amazon Q Business connectors.

You can use the Amazon Q Business ServiceNow Online data source connector to connect to the ServiceNow Online platform and index ServiceNow entities such as knowledge articles, Service Catalogs, and incident entries, along with the metadata and document access control lists (ACLs).

This post shows how to configure the Amazon Q ServiceNow connector to index your ServiceNow platform and take advantage of generative AI searches in Amazon Q. We use an example of an illustrative ServiceNow platform to discuss technical topics related to AWS services.

Find accurate answers from content in ServiceNow using Amazon Q Business

After you integrate Amazon Q Business with ServiceNow, you can ask questions from the description of the document, such as:

  • How do I troubleshoot an invalid IP configuration on a network router? – This could be derived from an internal knowledge article on that topic
  • Which form do I use to request a new email account? – This could be derived from an internal Service Catalog entry
  • Is there a previous incident on the topic of resetting cloud root user password? – This could be derived from an internal incident entry

Overview of the ServiceNow connector

A data source connector is a mechanism for integrating and synchronizing data from multiple repositories into one container index. Amazon Q Business offers multiple data source connectors that can connect to your data sources and help you create your generative AI solution with minimal configuration.

To crawl and index contents in ServiceNow, we configure Amazon Q Business ServiceNow connector as a data source in your Amazon Q business application.

When you connect Amazon Q Business to a data source and initiate the data synchronization process, Amazon Q Business crawls and adds documents from the data source to its index.

Types of documents

Let’s look at what is considered a document in the context of the Amazon Q Business ServiceNow connector.

The Amazon Q Business ServiceNow connector supports crawling of the following entities in ServiceNow:

  • Knowledge articles – Each article is considered a single document
  • Knowledge article attachments – Each attachment is considered a single document
  • Service Catalog – Each catalog item is considered a single document
  • Service Catalog attachments – Each catalog attachment is considered a single document
  • Incidents – Each incident is considered a single document
  • Incident attachments – Each incident attachment is considered a single document

Although not all metadata is available at the time of writing, you can also configure field mappings. Field mappings allow you to map ServiceNow field names to Amazon Q index field names. This includes both default field mappings created automatically by Amazon Q, as well as custom field mappings that you can create and edit. Refer to ServiceNow data source connector field mappings documentation for more information.

Authentication

The Amazon Q Business ServiceNow connector supports two types of authentication methods:

  • Basic authentication – ServiceNow host URL, user name, and password
  • OAuth 2.0 authentication with Resource Owner Password Flow – ServiceNow host URL, user name, password, client ID, and client secret

Supported ServiceNow versions

ServiceNow usually names platform versions after cities for the added convenience of easily differentiating between versions and associated features. At the time of writing, the following versions are natively supported in the Amazon Q Business ServiceNow connector:

  • San Diego
  • Tokyo
  • Rome
  • Vancouver
  • Others

ACL crawling

To maintain a secure environment, Amazon Q Business now requires ACL and identity crawling for all connected data sources. When preparing to connect Amazon Q Business applications to AWS IAM Identity Center, you need to enable ACL indexing and identity crawling and re-synchronize your connector.

Amazon Q Business enforces data security by supporting the crawling of ACLs and identity information from connected data sources. Indexing documents with ACLs is crucial for maintaining data security, because documents without ACLs are considered public.

If you need to index documents without ACLs, make sure they’re explicitly marked as public in your data source. When connecting a ServiceNow data source, Amazon Q Business crawls ACL information, including user and group information, from your ServiceNow instance. With ACL crawling, you can filter chat responses based on the end-user’s document access level, making sure users only see information they’re authorized to access.

In ServiceNow, user IDs are mapped from user emails and exist on files with set access permissions. This mapping allows Amazon Q Business to effectively enforce access controls based on the user’s identity and permissions within the ServiceNow environment.

Refer to How Amazon Q Business connector crawls ServiceNow ACLs for more information.

Overview of solution

Amazon Q is a generative AI-powered assistant that helps customers answer questions, provide summaries, generate content, and complete tasks based on data in their company repository. It also serves as a learning tool for AWS users who want to ask questions about services and best practices in the cloud. You can use the Amazon Q connector for ServiceNow Online to crawl your ServiceNow domain and index service tickets, guides, and community posts to discover answers for your questions faster.

Amazon Q understands and respects your existing identities, roles, and permissions and uses this information to personalize its interactions. If a user doesn’t have permission to access data without Amazon Q, they can’t access it using Amazon Q either. The following table outlines which documents each user is authorized to access for our use case. For a complete list of ServiceNow roles, refer to documentation. The documents being used in this example are a subset of AWS public documents from re:Post pre-loaded into ServiceNow with access restriction.

# | First Name | Last Name | Document type authorized for access | ServiceNow Roles
1 | John | Stiles | Knowledge Articles, Service Catalog, and Incidents | knowledge, catalog, incident_manager
2 | Mary | Major | Knowledge Articles and Service Catalog | knowledge, catalog
3 | Mateo | Jackson | Incidents | incident_manager

In this post, we show how to use the Amazon Q Business ServiceNow connector to index data from your ServiceNow platform for intelligent search.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Configure your ServiceNow connection

In your ServiceNow platform, complete the following steps to create an OAuth2 secret that could be consumed from your Amazon Q application:

  1. In ServiceNow, on the All menu, expand System OAuth and choose Application Registry.

ServiceNow console

  2. Choose New.

ServiceNow System OAuth App Registry

  3. Choose Create an OAuth API endpoint for external clients.

ServiceNow System OAuth App Registry Create Endpoint

  4. For Name, enter a unique name.
  5. Fill out the remaining parameters according to your requirements and choose Submit.

Note down the client ID and client secret to use in later steps.

ServiceNow Create OAuth Token

Create an Amazon Q Business application

Complete the following steps to create an Amazon Q Business application:

  1. On the Amazon Q console, choose Getting started in the navigation pane.
  2. Under Amazon Q Business Pro, choose Q Business to subscribe.

QBusiness Create App

  3. On the Amazon Q Business console, choose Get started.

QBusiness CreateApp2

  4. On the Applications page, choose Create application.

QBusiness CreateApp3

  5. On the Create application page, provide your application details.
  6. Choose Create.

Make sure the Amazon Q Business application is connected to IAM Identity Center. For more information, see Setting up Amazon Q Business with IAM Identity Center as identity provider.

QBusiness CreateApp4

  7. On the Select retriever page, select Use native retriever for your retriever and select Starter for the index provisioning type.
  8. Choose Next.

QBusiness CreateApp5

  9. On the Connect data sources page, choose Next without connecting to any data source (we do that in the next section).

QBusiness CreateApp6

QBusiness CreateApp7

  10. On the Add groups and users page, choose Add groups and users.

QBusiness CreateApp7

  11. Add any groups and users to access the application.

For more details, refer to Adding users and subscriptions to an Amazon Q Business application.

  12. Choose Create application.

QBusiness CreateApp8

Configure the data source using the Amazon Q ServiceNow Online connector

Now let’s configure the ServiceNow Online data source connector with the Amazon Q application that we created in the previous section.

  1. On the Amazon Q console, navigate to the Applications page and choose the application you just created.

Q Business - Connector Config1

  2. In the Data sources section, choose Add data source.

Q Business - Connector Config2

  3. Search for and choose the ServiceNow Online connector.

Q Business - Connector Config3

  4. Provide the name, ServiceNow host, and version information.

If your ServiceNow version isn’t on the dropdown menu, choose Others.

Q Business - Connector Config4

  5. Choose Create and add new secret to create a new secret to connect with the ServiceNow platform account.

Q Business - Connector Config5

  6. Provide the connection information based on the OAuth2 endpoint created in ServiceNow previously, then choose Save.

Q Business - Connector Config6

  7. Leave the defaults for the VPC and Identity crawler settings.
  8. For IAM role, choose Create a new service role (Recommended) and keep the default role name.

Q Business - Connector Config7

  9. Choose entities that you want to bring over from ServiceNow.

This example shows knowledge articles, Service Catalog items, and incidents. The Filter query option helps curate the list of items that you want to bring into Amazon Q. When you use a query, you can specify multiple knowledge bases, including private knowledge bases. For more details on how to build ServiceNow filters, refer to Filters. For additional query building resources, see Specifying documents to index with a query.
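As an illustrative example (not taken from this walkthrough), a ServiceNow encoded filter query that limits knowledge article crawling to published articles in one knowledge base could look like the following; the field names assume the standard kb_knowledge table and the sys_id is a placeholder:

workflow_state=published^kb_knowledge_base=your-knowledge-base-sys-id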

Q Business - Connector Config8

Q Business - Connector Config9

Q Business - Connector Config10

  1. For Sync mode, select Full sync.
  2. For Sync run schedule, choose Run on demand.

Q Business - Connector Config11

  1. Leave the remaining options as default and choose Add data source.

Q Business - Connector Config12

  1. When the data source status shows as Active, initiate data synchronization by choosing Sync now.

Q Business - Connector Config12

Wait until the synchronization status changes to Completed before continuing to the next steps.

Q Business Connector Config13

For information about common issues encountered and related troubleshooting steps, refer to Troubleshooting data source connectors.

Run queries with the Amazon Q web experience

Now that the data synchronization is complete, you can start exploring insights from Amazon Q. You have three users for testing: John with admin access, Mary with access to knowledge articles and the service catalog, and Mateo with access only to incidents. In the following steps, you will sign in as each user and ask various questions to see what responses Amazon Q provides based on the permitted document types for their respective groups. You will also test edge cases where users try to access information from restricted sources to validate the access control functionality.

  1. On the details page of the new Amazon Q application, navigate to the Web experience settings tab and choose the link under Deployed URL. This will open a new tab with a preview of the UI and options to customize according to your needs.

Q Business - Web Experience1

  1. Log in to the application as John Stiles first, using the credentials for the user that you added to the Amazon Q application.

Q Business - Web Experience2

  1. After the login is successful, choose the application that you just created.

Q Business - Web Application3

  1. From there, you’ll be redirected to the Amazon Q assistant UI, where you can start asking questions using natural language and get insights from your ServiceNow platform.

Q Business - Web Experience4

  1. Let’s run some queries to see how Amazon Q can answer questions related to synchronized data. John has access to all ServiceNow document types. When asked “How do I upgrade my EKS cluster to the latest version”, Amazon Q will provide a summary pulling information from the related knowledge article, highlighting the sources at the end of each excerpt.

QBusiness-ServiceNow-Connector

  1. Still logged in as John, when asked “What is Amazon QLDB?”, Amazon Q will provide a summary pulling information from the related ServiceNow incident.

QBusiness-ServiceNow-Connector

  1. Sign out as user John. Start a new incognito browser session or use a different browser. Copy the web experience URL and sign in as user Mary. Repeat these steps each time you need to sign in as a different user. Mary only has access to knowledge articles and service catalog with no incident access. When asked “How do I perform vector search with Amazon Redshift”, Amazon Q will provide a summary pulling information from the related knowledge article, highlighting the source.

QBusiness-ServiceNow-Connector

  1. However, when asked “What is Amazon QLDB?”, Amazon Q responds that it could not find relevant information. This is because Mary does not have access to ServiceNow incidents, which are the only place where the answer to that question can be found.

QBusiness-ServiceNow-Connector

  1. Sign out as user Mary. Start a new incognito browser session or use a different browser. Copy the web experience URL and sign in as user Mateo. Mateo only has access to incidents with no knowledge article or service catalog access. When asked “What is Amazon QLDB?”, Amazon Q will provide a summary pulling information from the related incident, highlighting the source.

QBusiness-ServiceNow-Connector

  1. However, when asked “How do I perform vector search with Amazon Redshift?”, Amazon Q responds that it could not find relevant information. This is because Mateo does not have access to ServiceNow knowledge articles, which are the only place where the answer to this question can be found.

QBusiness-ServiceNow-Connector

Try out the assistant with additional queries, such as:

  • How do you set up a new BlackBerry device?
  • How do I set up S3 object replication?
  • How do I resolve empty log issues in CloudWatch?
  • How do I troubleshoot 403 Access Denied errors from Amazon S3?

Frequently asked questions

In this section, we provide guidance on frequently asked questions.

Amazon Q Business is unable to answer your questions

If you get the response “Sorry, I could not find relevant information to complete your request,” this may be due to a few reasons:

  • No permissions – ACLs applied to your account don’t allow you to query certain data sources. If this is the case, reach out to your application administrator to make sure your ACLs are configured to access the data sources.
  • Email ID doesn’t match user ID – In rare scenarios, a user may have a different email ID associated with Amazon Q in IAM Identity Center than the one associated with their ServiceNow user profile. In such cases, make sure the Amazon Q user profile is updated to recognize the ServiceNow email ID through the update-user command in the AWS Command Line Interface (AWS CLI) or the related API call (see the sketch after this list).
  • Data connector sync failed – Your data connector may have failed to sync information from the source to the Amazon Q Business application. Verify the data connector’s sync run schedule and sync history to confirm the sync is successful.
  • Empty or private ServiceNow projects – Private or empty projects aren’t crawled during the sync run.
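The following is a minimal sketch of the related API call using boto3; the application ID and email values are placeholders, and the parameter names should be verified against the current Amazon Q Business UpdateUser API reference.

import boto3

qbusiness = boto3.client("qbusiness")

# Placeholders: replace with your Amazon Q Business application ID and the user's
# identifiers; parameter names are assumptions to verify against the API reference.
qbusiness.update_user(
    applicationId="<application-id>",
    userId="john_doe@example.com",            # user ID as known to IAM Identity Center
    userAliasesToUpdate=[
        {"userId": "john.doe@example.com"},   # email ID as it appears in ServiceNow
    ],
)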

If none of these reasons apply to your use case, open a support case and work with your technical account manager to get this resolved.

How to generate responses from authoritative data sources

If you want Amazon Q Business to only generate responses from authoritative data sources, you can configure this using the Amazon Q Business application global controls under Admin controls and guardrails.

  1. Log in to the Amazon Q Business console as an Amazon Q Business application administrator.
  2. Navigate to the application and choose Admin controls and guardrails in the navigation pane.
  3. Choose Edit in the Global controls section to set these options.

For more information, refer to Admin controls and guardrails in Amazon Q Business.

Q Business - Troubleshooting

Amazon Q Business responds using old (stale) data even though your data source is updated

Each Amazon Q Business data connector can be configured with a unique sync run schedule frequency. Verifying the sync status and sync schedule frequency for your data connector reveals when the last sync ran successfully. It could be that your data connector’s sync run schedule is set to sync at a scheduled time of day, week, or month. If it’s set to run on demand, the sync has to be manually invoked. When the sync run is complete, verify the sync history to make sure the run has successfully synced all new content. Refer to Sync run schedule for more information about each option.

Clean up

To avoid incurring future charges, clean up any resources created as part of this solution. Delete the Amazon Q ServiceNow Online connector data source, OAuth API endpoint created in ServiceNow, and the Q Business application. Also, delete the user management setup in IAM Identity Center.

Conclusion

In this post, we discussed how to configure the Amazon Q ServiceNow Online connector to crawl and index service tickets, community posts, and knowledge guides. We showed how generative AI-based search in Amazon Q enables your business leaders and agents to discover insights from your ServiceNow content more quickly. This is all available through a user-friendly interface, with Amazon Q Business doing the undifferentiated heavy lifting.

To learn more about the Amazon Q Business connector for ServiceNow Online, refer to Connecting ServiceNow Online to Amazon Q Business.


About the Authors

Prabhakar Chandrasekaran is a Senior Technical Account Manager with AWS Enterprise Support. Prabhakar enjoys helping customers build cutting-edge AI/ML solutions on the cloud. He also works with enterprise customers providing proactive guidance and operational assistance, helping them improve the value of their solutions when using AWS. Prabhakar holds six AWS and seven other professional certifications. With over 20 years of professional experience, Prabhakar was a data engineer and a program leader in the financial services space prior to joining AWS.

Lakshmi Dogiparti is a Software Development Engineer at Amazon Web Services. She works on the Amazon Q and Amazon Kendra connector design, development, integration, and test operations.

Vijai Gandikota is a Principal Product Manager in the Amazon Q and Amazon Kendra organization of Amazon Web Services. He is responsible for the Amazon Q and Amazon Kendra connectors, ingestion, security, and other aspects of the Amazon Q and Amazon Kendra services.

Read More

Intelligent healthcare forms analysis with Amazon Bedrock

Intelligent healthcare forms analysis with Amazon Bedrock

Generative artificial intelligence (AI) provides an opportunity for improvements in healthcare by combining and analyzing structured and unstructured data across previously disconnected silos. Generative AI can help raise the bar on efficiency and effectiveness across the full scope of healthcare delivery.

The healthcare industry generates and collects a significant amount of unstructured textual data, including clinical documentation such as patient information, medical history, and test results, as well as non-clinical documentation like administrative records. This unstructured data can impact the efficiency and productivity of clinical services, because it’s often found in various paper-based forms that can be difficult to manage and process. Streamlining the handling of this information is crucial for healthcare providers to improve patient care and optimize their operations.

Handling large volumes of data, extracting unstructured data from multiple paper forms or images, and comparing it with the standard or reference forms can be a long and arduous process, prone to errors and inefficiencies. However, advancements in generative AI solutions have introduced automated approaches that offer a more efficient and reliable solution for comparing multiple documents.

Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available through an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case. Amazon Bedrock offers a serverless experience, so you can get started quickly, privately customize FMs with your own data, and quickly integrate and deploy them into your applications using the AWS tools without having to manage the infrastructure.

In this post, we explore using Anthropic Claude 3 on Amazon Bedrock, a large language model (LLM). Amazon Bedrock provides access to several LLMs, such as Anthropic Claude 3, which can be used to generate semi-structured data relevant to the healthcare industry. This can be particularly useful for creating various healthcare-related forms, such as patient intake forms, insurance claim forms, or medical history questionnaires.

Solution overview

To provide a high-level understanding of how the solution works before diving deeper into the specific elements and the services used, we discuss the architectural steps required to build our solution on AWS. We illustrate the key elements of the solution, giving you an overview of the various components and their interactions.

We then examine each of the key elements in more detail, exploring the specific AWS services that are used to build the solution, and discuss how these services work together to achieve the desired functionality. This provides a solid foundation for further exploration and implementation of the solution.

Part 1: Standard forms: Data extraction and storage

The following diagram highlights the key elements of a solution for data extraction and storage with standard forms.

Figure 1: Architecture – Standard Form – Data Extraction & Storage.

The standard form processing steps are as follows:

  1. A user uploads images of paper forms (PDF, PNG, JPEG) to Amazon Simple Storage Service (Amazon S3), a highly scalable and durable object storage service.
  2. Amazon Simple Queue Service (Amazon SQS) is used as the message queue. Whenever a new form is loaded, an event is invoked in Amazon SQS.
    1. If an S3 object is not processed, then after two tries it will be moved to the SQS dead-letter queue (DLQ), which can be configured further with an Amazon Simple Notification Service (Amazon SNS) topic to notify the user through email.
  3. The SQS message invokes an AWS Lambda function. The Lambda function is responsible for processing the new form data.
  4. The Lambda function reads the new S3 object and passes it to the Amazon Textract API to process the unstructured data and generate a hierarchical, structured output. Amazon Textract is an AWS service that can extract text, handwriting, and data from scanned documents and images. This approach allows for the efficient and scalable processing of complex documents, enabling you to extract valuable insights and data from various sources (a minimal sketch of steps 4 and 6 follows this list).
  5. The Lambda function passes the converted text to Anthropic Claude 3 on Amazon Bedrock to generate a list of questions.
  6. Lastly, the Lambda function stores the question list in Amazon S3.
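The following is a minimal sketch of steps 4 and 6, assuming the Lambda function has already resolved the S3 bucket and key of the uploaded form; the bucket and key names are placeholders, and multipage PDFs would typically require the asynchronous Amazon Textract APIs (such as StartDocumentTextDetection) instead of the synchronous call shown here.

import boto3

textract = boto3.client("textract")
s3 = boto3.client("s3")


def extract_form_text(bucket: str, key: str) -> str:
    """Run Amazon Textract on an S3 object and return the detected lines as plain text."""
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = [block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"]
    return "\n".join(lines)


def store_question_list(bucket: str, key: str, question_list: str) -> None:
    """Persist the question list generated by Anthropic Claude 3 for later comparison."""
    s3.put_object(Bucket=bucket, Key=key, Body=question_list.encode("utf-8"))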

Amazon Bedrock API call to extract form details

We call an Amazon Bedrock API twice in the process for the following actions:

  • Extract questions from the standard or reference form – The first API call is made to extract a list of questions and sub-questions from the standard or reference form. This list serves as a baseline or reference point for comparison with other forms. By extracting the questions from the reference form, we can establish a benchmark against which other forms can be evaluated.
  • Extract questions from the custom form – The second API call is made to extract a list of questions and sub-questions from the custom form or the form that needs to be compared against the standard or reference form. This step is necessary because we need to analyze the custom form’s content and structure to identify its questions and sub-questions before we can compare them with the reference form.

By having the questions extracted and structured separately for both the reference and custom forms, the solution can then pass these two lists to the Amazon Bedrock API for the final comparison step. This approach maintains the following:

  • Accurate comparison – The API has access to the structured data from both forms, making it straightforward to identify matches, mismatches, and provide relevant reasoning
  • Efficient processing – Separating the extraction process for the reference and custom forms helps avoid redundant operations and optimizes the overall workflow
  • Observability and interoperability – Keeping the questions separate enables better visibility, analysis, and integration of the questions from different forms
  • Hallucination avoidance – By following a structured approach and relying on the extracted data, the solution helps avoid generating or hallucinating content, providing integrity in the comparison process

This two-step approach uses the capabilities of the Amazon Bedrock API while optimizing the workflow, enabling accurate and efficient form comparison, and promoting observability and interoperability of the questions involved.

See the following code (API Call):

import json

import boto3
from botocore.config import Config


def get_response_from_claude3(context, prompt_data):
    # Build the Anthropic Messages API request body for Amazon Bedrock
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "system": """You are an expert form analyzer and can understand different sections and subsections within a form and can find all the questions being asked. You can find similarities and differences at the question level between different types of forms.""",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": f"""Given the following document(s): {context} \n {prompt_data}"""},
                ],
            }
        ],
    })
    modelId = 'anthropic.claude-3-sonnet-20240229-v1:0'
    # Allow long-running requests when analyzing large forms
    config = Config(read_timeout=1000)
    bedrock = boto3.client('bedrock-runtime', config=config)
    response = bedrock.invoke_model(body=body, modelId=modelId)
    response_body = json.loads(response.get("body").read())
    answer = response_body.get("content")[0].get("text")
    return answer

User prompt to extract fields and list them

We provide the following user prompt to Anthropic Claude 3 to extract the fields from the raw text and list them for comparison as shown in step 3B (of Figure 3: Data Extraction & Form Field comparison).

get_response_from_claude3(response, f""" Create a summary of the different sections in the form, then
                                         for each section create a list of all questions and sub questions asked in the
                                         whole form and group by section including signature, date, reviews and approvals.
                                         Then concatenate all questions and return a single numbered list. Be very detailed.""")

The following figure illustrates the output from Amazon Bedrock with a list of questions from the standard or reference form.

Figure 2:  Standard Form Sample Question List

Store this question list in Amazon S3 so it can be used for comparison with other forms, as shown in Part 2 of the process below.

Part 2: Data extraction and form field comparison

The following diagram illustrates the architecture for the next step, which is data extraction and form field comparison.

Figure 3: Data Extraction & Form Field comparison

Steps 1 and 2 are similar to those in Figure 1, but are repeated for the forms to be compared against the standard or reference forms. The next steps are as follows:

  1. The SQS message invokes a Lambda function. The Lambda function is responsible for processing the new form data.
    1. The raw text is extracted by Amazon Textract using a Lambda function. The extracted raw text is then passed to Step 3B for further processing and analysis.
    2. Anthropic Claude 3 generates a list of questions from the custom form that needs to be compared with the standard form. Both question lists are then passed to Amazon Bedrock, which compares the extracted raw text with the standard or reference raw text to identify differences and anomalies, and provides insights and recommendations relevant to the healthcare industry by respective category. It then generates the final output in JSON format for further processing and dashboarding. The Amazon Bedrock API call and user prompt from Step 5 (Figure 1: Architecture – Standard Form – Data Extraction & Storage) are reused for this step to generate a question list from the custom form.

We discuss Steps 4–6 in the next section.

The following screenshot shows the output from Amazon Bedrock with a list of questions from the custom form.

Figure 4:  Custom Form Sample Question List

Final comparison using Anthropic Claude 3 on Amazon Bedrock:

The following examples show the results from the comparison exercise using Amazon Bedrock with Anthropic Claude 3, showing one that matched and one that didn’t match with the reference or standard form.

The following is the user prompt for forms comparison:

categories = ['Personal Information','Work History','Medical History','Medications and Allergies','Additional Questions','Physical Examination','Job Description','Examination Results']
forms = f"Form 1 : {reference_form_question_list}, Form 2 : {custom_form_question_list}"

The following is the first call:

match_result = (get_response_from_claude3(forms, f""" Go through questions and sub questions {start}- {processed} in Form 2 return the question whether it matches with any question /sub question/field in Form 1 in terms of meaning and context and provide reasoning, or if it does not match with any question/sub question/field in Form 1 and provide reasoning. Treat each sub question as its own question and the final output should be a numbered list with the same length as the number of questions and sub questions in Form 2. Be concise"""))

The following is the second call:

get_response_from_claude3(match_result,
f""" Go through all the questions and sub questions in the Form 2 Results and turn this into a JSON object called 'All Questions' which has the keys 'Question' with only the matched or unmatched question, 'Match' with valid values of yes or no, 'Reason' which is the reason of match or no match, and 'Category' placing the question in one of the categories in this list: {categories}. Do not omit any questions in output.""")

The following screenshot shows the questions matched with the reference form.

The following screenshot shows the questions that didn’t match with the reference form.

The steps from the preceding architecture diagram continue as follows:

4. The SQS queue invokes a Lambda function.

5. The Lambda function invokes an AWS Glue job and monitors it for completion (a minimal sketch follows these steps).

a. The AWS Glue job processes the final JSON output from the Amazon Bedrock model into tabular format for reporting.

6. Amazon QuickSight is used to create interactive dashboards and visualizations, allowing healthcare professionals to explore the analysis, identify trends, and make informed decisions based on the insights provided by Anthropic Claude 3.
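The following is a minimal sketch of how the Lambda function in step 5 might start the AWS Glue job and wait for it to finish; the job name is a placeholder, and a production workflow would more likely track the job run through Step Functions or EventBridge notifications rather than polling inside Lambda.

import time

import boto3

glue = boto3.client("glue")


def run_glue_job(job_name: str) -> str:
    """Start the AWS Glue job that flattens the Bedrock JSON output and wait for a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        state = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]["JobRunState"]
        if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
            return state
        time.sleep(30)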

The following screenshot shows a sample QuickSight dashboard.


Next steps

Many healthcare providers are investing in digital technology, such as electronic health records (EHRs) and electronic medical records (EMRs), to streamline data collection and storage, allowing appropriate staff to access records for patient care. Additionally, digitized health records provide the convenience of electronic forms and remote data editing for patients. Electronic health records offer a more secure and accessible record system, reducing data loss and facilitating data accuracy. Similar solutions can help capture the data from these paper forms into EHRs.

Conclusion

Generative AI solutions like Amazon Bedrock with Anthropic Claude 3 can significantly streamline the process of extracting and comparing unstructured data from paper forms or images. By automating the extraction of form fields and questions, and intelligently comparing them against standard or reference forms, this solution offers a more efficient and accurate approach to handling large volumes of data. The integration of AWS services like Lambda, Amazon S3, Amazon SQS, and QuickSight provides a scalable and robust architecture for deploying this solution. As healthcare organizations continue to digitize their operations, such AI-powered solutions can play a crucial role in improving data management, maintaining compliance, and ultimately enhancing patient care through better insights and decision-making.


About the Authors

Satish Sarapuri is a Sr. Data Architect, Data Lake at AWS. He helps enterprise-level customers build high-performance, highly available, cost-effective, resilient, and secure generative AI, data mesh, data lake, and analytics platform solutions on AWS, through which customers can make data-driven decisions to gain impactful outcomes for their business and help them on their digital and data transformation journey. In his spare time, he enjoys spending time with his family and playing tennis.

Harpreet Cheema is a Machine Learning Engineer at the AWS Generative AI Innovation Center. He is very passionate in the field of machine learning and in tackling data-oriented problems. In his role, he focuses on developing and delivering machine learning focused solutions for customers across different domains.

Deborah Devadason is a Senior Advisory Consultant in the Professional Service team at Amazon Web Services. She is a results-driven and passionate Data Strategy specialist with over 25 years of consulting experience across the globe in multiple industries. She leverages her expertise to solve complex problems and accelerate business-focused journeys, thereby creating a stronger backbone for the digital and data transformation journey.

Read More

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

As the scale and complexity of data handled by organizations increase, traditional rules-based approaches to analyzing the data alone are no longer viable. Instead, organizations are increasingly looking to take advantage of transformative technologies like machine learning (ML) and artificial intelligence (AI) to deliver innovative products, improve outcomes, and gain operational efficiencies at scale. Furthermore, the democratization of AI and ML through AWS and AWS Partner solutions is accelerating its adoption across all industries.

For example, a health-tech company may be looking to improve patient care by predicting the probability that an elderly patient may become hospitalized by analyzing both clinical and non-clinical data. This will allow them to intervene early, personalize the delivery of care, and make the most efficient use of existing resources, such as hospital bed capacity and nursing staff.

AWS offers the broadest and deepest set of AI and ML services and supporting infrastructure, such as Amazon SageMaker and Amazon Bedrock, to help you at every stage of your AI/ML adoption journey, including adoption of generative AI. Splunk, an AWS Partner, offers a unified security and observability platform built for speed and scale.

As the diversity and volume of data increases, it is vital to understand how they can be harnessed at scale by using complementary capabilities of the two platforms. For organizations looking beyond the use of out-of-the-box Splunk AI/ML features, this post explores how Amazon SageMaker Canvas, a no-code ML development service, can be used in conjunction with data collected in Splunk to drive actionable insights. We also demonstrate how to use the generative AI capabilities of SageMaker Canvas to speed up your data exploration and help you build better ML models.

Use case overview

In this example, a health-tech company offering remote patient monitoring is collecting operational data from wearables using Splunk. These device metrics and logs are ingested into and stored in a Splunk index, a repository of incoming data. Within Splunk, this data is used to fulfill context-specific security and observability use cases by Splunk users, such as monitoring the security posture and uptime of devices and performing proactive maintenance of the fleet.

Separately, the company uses AWS data services, such as Amazon Simple Storage Service (Amazon S3), to store data related to patients, such as patient information, device ownership details, and clinical telemetry data obtained from the wearables. These could include exports from customer relationship management (CRM), configuration management database (CMDB), and electronic health record (EHR) systems. In this example, they have access to an extract of patient information and hospital admission records that reside in an S3 bucket.

The following table illustrates the different data explored in this example use case.

| Description | Feature Name | Storage | Example Source |
| --- | --- | --- | --- |
| Age of patient | age | AWS | EHR |
| Units of alcohol consumed by patient every week | alcohol_consumption | AWS | EHR |
| Tobacco usage by patient per week | tabacco_use | AWS | EHR |
| Average systolic blood pressure of patient | avg_systolic | AWS | Wearables |
| Average diastolic blood pressure of patient | avg_diastolic | AWS | Wearables |
| Average resting heart rate of patient | avg_resting_heartrate | AWS | Wearables |
| Patient admission record | admitted | AWS | EHR |
| Number of days the device has been active over a period | num_days_device_active | Splunk | Wearables |
| Average end of the day battery level over a period | avg_eod_device_battery_level | Splunk | Wearables |

This post describes an approach with two key components:

  • The two data sources are stored alongside each other using a common AWS data engineering pipeline. Data is presented to the personas that need access using a unified interface.
  • An ML model to predict hospital admissions (admitted) is developed using the combined dataset and SageMaker Canvas. Professionals without a background in ML are empowered to analyze the data using no-code tooling.

The solution allows custom ML models to be developed from a broader variety of clinical and non-clinical data sources to cater for different real-life scenarios. For example, it can be used to answer questions such as “If patients have a propensity to have their wearables turned off and there is no clinical telemetry data available, can the likelihood that they are hospitalized still be accurately predicted?”

AWS data engineering pipeline

The adaptable approach detailed in this post starts with an automated data engineering pipeline to make data stored in Splunk available to a wide range of personas, including business intelligence (BI) analysts, data scientists, and ML practitioners, through a SQL interface. This is achieved by using the pipeline to transfer data from a Splunk index into an S3 bucket, where it will be cataloged.

The approach is shown in the following diagram.

The diagram shows an architecture overview of the data engineering pipeline. The components marked in the diagram are listed below.

Figure 1: Architecture overview of data engineering pipeline

The automated AWS data pipeline consists of the following steps:

  1. Data from wearables is stored in a Splunk index where it can be queried by users, such as security operations center (SOC) analysts, using the Splunk search processing language (SPL). Splunk’s out-of-the-box AI/ML capabilities, such as the Splunk Machine Learning Toolkit (Splunk MLTK) and purpose-built models for security and observability use cases (for example, for anomaly detection and forecasting), can be utilized inside the Splunk Platform. Using these Splunk ML features allows you to derive contextualized insights quickly without the need for additional AWS infrastructure or skills.
  2. Some organizations may look to develop custom, differentiated ML models, or want to build AI-enabled applications using AWS services for their specific use cases. To facilitate this, an automated data engineering pipeline is built using AWS Step Functions. The Step Functions state machine is configured with an AWS Lambda function to retrieve data from the Splunk index using the Splunk Enterprise SDK for Python. The SPL query requested through this REST API call is scoped to only retrieve the data of interest.
      1. Lambda supports container images. This solution uses a Lambda function that runs a Docker container image. This allows larger data manipulation libraries, such as pandas and PyArrow, to be included in the deployment package.
      2. If a large volume of data is being exported, the code may need to run for longer than the maximum possible duration, or require more memory than supported by Lambda functions. If this is the case, Step Functions can be configured to directly run a container task on Amazon Elastic Container Service (Amazon ECS).
  3. For authentication and authorization, the Splunk bearer token is securely retrieved from AWS Secrets Manager by the Lambda function before calling the Splunk /search REST API endpoint. This bearer authentication token lets users access the REST endpoint using an authenticated identity.
  4. Data retrieved by the Lambda function is transformed (if required) and uploaded to the designated S3 bucket alongside other datasets. This data is partitioned, compressed, and stored in the storage- and performance-optimized Apache Parquet file format (a minimal sketch of steps 2–4 follows this list).
  5. As its last step, the Step Functions state machine runs an AWS Glue crawler to infer the schema of the Splunk data residing in the S3 bucket, and catalogs it for wider consumption as tables using the AWS Glue Data Catalog.
  6. Wearable data exported from Splunk is now available to users and applications through the Data Catalog as a table. Analytics tooling such as Amazon Athena can now be used to query the data using SQL.
  7. As data stored in your AWS environment grows, it is essential to have centralized governance in place. AWS Lake Formation allows you to simplify permissions management and data sharing to maintain security and compliance.
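The following is a minimal sketch of the export logic in steps 2–4. For brevity it calls the Splunk REST search endpoint directly with requests instead of the Splunk Enterprise SDK for Python used by the solution in the GitHub repository, and the host, secret name, SPL query, bucket, and key are placeholders.

import io
import json

import boto3
import pandas as pd
import requests


def export_splunk_to_s3(splunk_host: str, secret_name: str, spl_query: str,
                        bucket: str, key: str) -> None:
    # Retrieve the Splunk bearer token from AWS Secrets Manager (step 3)
    secrets = boto3.client("secretsmanager")
    token = secrets.get_secret_value(SecretId=secret_name)["SecretString"]

    # Run the scoped SPL query against the Splunk search REST endpoint (step 2)
    response = requests.post(
        f"https://{splunk_host}:8089/services/search/jobs/export",
        headers={"Authorization": f"Bearer {token}"},
        data={"search": spl_query, "output_mode": "json"},
        timeout=300,
    )
    response.raise_for_status()

    # Each line of the export response is a JSON object; keep only result rows
    results = []
    for line in response.text.splitlines():
        if line.strip():
            record = json.loads(line).get("result")
            if record:
                results.append(record)

    # Convert to Parquet (pandas/PyArrow are packaged in the container image) and upload (step 4)
    buffer = io.BytesIO()
    pd.DataFrame(results).to_parquet(buffer, index=False)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=buffer.getvalue())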

An AWS Serverless Application Model (AWS SAM) template is available to deploy all AWS resources required by this solution. This template can be found in the accompanying GitHub repository.

Refer to the README file for required prerequisites, deployment steps, and the process to test the data engineering pipeline solution.

AWS AI/ML analytics workflow

After the data engineering pipeline’s Step Functions state machine successfully completes and wearables data from Splunk is accessible alongside patient healthcare data using Athena, we use an example approach based on SageMaker Canvas to drive actionable insights.

SageMaker Canvas is a no-code visual interface that empowers you to prepare data, build, and deploy highly accurate ML models, streamlining the end-to-end ML lifecycle in a unified environment. You can prepare and transform data through point-and-click interactions and natural language, powered by Amazon SageMaker Data Wrangler. You can also tap into the power of automated machine learning (AutoML) and automatically build custom ML models for regression, classification, time series forecasting, natural language processing, and computer vision, supported by Amazon SageMaker Autopilot.

In this example, we use the service to classify whether a patient is likely to be admitted to a hospital over the next 30 days based on the combined dataset.

The approach is shown in the following diagram.

The diagram shows an architecture overview of ML development. Important components of the solution are listed below.

Figure 2: Architecture overview of ML development

The solution consists of the following steps:

  1. An AWS Glue crawler crawls the data stored in the S3 bucket. The Data Catalog exposes this data found in the folder structure as tables.
  2. Athena provides a query engine to allow people and applications to interact with the tables using SQL.
  3. SageMaker Canvas uses Athena as a data source to allow the data stored in the tables to be used for ML model development.

Solution overview

SageMaker Canvas allows you to build a custom ML model using a dataset that you have imported. In the following sections, we demonstrate how to create, explore, and transform a sample dataset, use natural language to query the data, check for data quality, create additional steps for the data flow, and build, test, and deploy an ML model.

Prerequisites

Before proceeding, refer to Getting started with using Amazon SageMaker Canvas to make sure you have the required prerequisites in place. Specifically, validate that the AWS Identity and Access Management (IAM) role your SageMaker domain is using has a policy attached with sufficient permissions to access Athena, AWS Glue, and Amazon S3 resources.

Create the dataset

SageMaker Canvas supports Athena as a data source. Data from wearables and patient healthcare data residing across your S3 bucket is accessed using Athena and the Data Catalog. This allows this tabular data to be directly imported into SageMaker Canvas to start your ML development.

To create your dataset, complete the following steps:

  1. On the SageMaker Canvas console, choose Data Wrangler in the navigation pane.
  2. On the Import and prepare dropdown menu, choose Tabular as the dataset type to denote that the imported data consists of rows and columns.
The screenshot shows how tabular data is imported using SageMaker Data Wrangler. Tabular from the import and prepare option is highlighted.

Figure 3: Importing tabular data using SageMaker Data Wrangler

  1. For Select a data source, choose Athena.

On this page, you will see your Data Catalog database and tables listed, named patient_data and splunk_ops_data.

  1. Join (inner join) the tables together using the user_id and id columns to create one overarching dataset that can be used during ML model development (an equivalent Athena SQL query is sketched after Figure 4).
  2. Under Import settings, enter unprocessed_data for Dataset name.
  3. Choose Import to complete the process.
The screenshot shows how tabular data is joined using SageMaker Data Wrangler. 2 tables discovered from Athena are highlighted, alongside the user id fields that are used to join the 2 tables together.

Figure 4: Joining data using SageMaker Data Wrangler

The combined dataset is now available to explore and transform using SageMaker Data Wrangler.
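For reference, the inner join performed in this step is equivalent to a SQL query you could run yourself through Athena. The following is a hedged sketch using boto3; the mapping of id to patient_data and user_id to splunk_ops_data, the database name, and the query results location are assumptions for illustration.

import boto3

athena = boto3.client("athena")

# Hypothetical equivalent of the Data Wrangler inner join
query = """
SELECT p.*, s.*
FROM patient_data p
JOIN splunk_ops_data s
    ON p.id = s.user_id
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "healthtech_db"},  # placeholder database
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},  # placeholder
)
print(response["QueryExecutionId"])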

Explore and transform the dataset

SageMaker Data Wrangler enables you to transform and analyze the source dataset through data flows while still maintaining a no-code approach.

The previous step automatically created a data flow in the SageMaker Canvas console which we have renamed to data_prep_data_flow.flow. Additionally, two steps are automatically generated, as listed in the following table.

| Step | Name | Description |
| --- | --- | --- |
| 1 | Athena Source | Sets the unprocessed_data dataset as the source of the data flow |
| 2 | Data types | Sets the column types of the unprocessed_data dataset |

Before we create additional transform steps, let’s explore two SageMaker Canvas features that can help us focus on the right actions.

Use natural language to query the data

SageMaker Data Wrangler also provides generative AI capabilities called Chat for data prep powered by a large language model (LLM). This feature allows you to explore your data using natural language without any background in ML or SQL. Furthermore, any contextualized recommendations returned by the generative AI model can be introduced directly back into the data flow without writing any code.

In this section, we present some example prompts to demonstrate this in action. These examples have been selected to illustrate the art of the possible. We recommend that you experiment with different prompts to gain the best results for your particular use cases.

Example 1: Identify Splunk default fields

In this first example, we want to know whether there are Splunk default fields that we could potentially exclude from our dataset prior to ML model development.

  1. In SageMaker Data Wrangler, open your data flow.
  2. Choose Step 2 Data types, and choose Chat for data prep.
  3. In the Chat for data prep pane, you can enter prompts in natural language to explore and transform the data. For example:
Prompt: This data has been exported from a Splunk index. Which columns are likely to be Splunk default fields? 
Response: _bkt, _cd, _indextime, _raw, _si, _sourcetype, _time are likely Splunk default fields based on the Summary section showing they are mostly metadata/index fields. 
Prompt: OK. Let's drop _bkt, _cd, _indextime, _raw, _si, _sourcetype, and _time. 
Response: The code drops the columns '_bkt', '_cd', '_indextime', '_raw', '_si', '_sourcetype' and '_time' from the dataframe df using the drop() method.

In this example, the generative AI LLM has correctly identified Splunk default fields that could be safely dropped from the dataset.

  1. Choose Add to steps to add this identified transformation to the data flow.
The screenshot shows how SageMaker Data Wrangler’s chat for data prep is used to identify Splunk’s default fields. Conversation has been captured above the screenshot. Add to steps button is highlighted in the screenshot.

Figure 5: Using SageMaker Data Wrangler’s chat for data prep to identify Splunk’s default fields

Example 2: Identify additional columns that could be dropped

We now want to identify any further columns that could be dropped without being too specific about what we’re looking for. We want the LLM to make the suggestions based on the data, and provide us with the rationale. For example:

Prompt: List out the names of other columns that we should drop before ML development. Give me a list of columns, along with the reasons why. 
Response: The code drops the timestamp, punct, id, index and linecount columns from the dataset. These columns do not contain useful information for machine learning and would add noise to the models.

In addition to the Splunk default fields identified earlier, the generative AI model is now proposing the removal of columns such as timestamp, punct, id, index, and linecount that don’t appear to be conducive to ML model development.

The screenshot shows how SageMaker Data Wrangler’s chat for data prep is used to identify additional fields that can be dropped. Conversation has been captured above the screenshot.

Figure 6: Using SageMaker Data Wrangler’s chat for data prep to identify additional fields that can be dropped

Example 3: Calculate average age column in dataset

You can also use the generative AI model to perform Text2SQL tasks in which you can simply ask questions of the data using natural language. This is useful if you want to validate the content of the dataset.

In this example, we want to know what the average patient age value is within the dataset:

Prompt: What is the average age column of the dataset? 
Response: The average age in the dataset is 67 years.

By expanding View code, you can see what SQL statements the LLM has constructed using its Text2SQL capabilities. This gives you full visibility into how the results are being returned.

The screenshot shows how SageMaker Data Wrangler’s chat for data prep is used to run SQL statements. Under view code, the screenshot shows SELECT AVG(age) FROM df; which is the SQL statement chat for data prep has returned.

Figure 7: Using SageMaker Data Wrangler’s chat for data prep to run SQL statements

Check for data quality

SageMaker Canvas also provides exploratory data analysis (EDA) capabilities that allow you to gain deeper insights into the data prior to the ML model build step. With EDA, you can generate visualizations and analyses to validate whether you have the right data, and whether your ML model build is likely to yield results that are aligned to your organization’s expectations.

Example 1: Create a Data Quality and Insights Report

Complete the following steps to create a Data Quality and Insights Report:

  1. While in the data flow step, choose the Analyses tab.
  2. For Analysis type, choose Data Quality and Insights Report.
  3. For Target column, choose admitted.
  4. For Problem type, choose Classification.

This performs an analysis of the data that you have and provides information such as the number of missing values and outliers.

The screenshot shows how SageMaker Data Wrangler’s data quality and insights report is used to perform analysis of the data. It shows a summary of dataset characteristics, such as number of features, number of rows, missing values, duplicated rows and data validity.

Figure 8: Running SageMaker Data Wrangler’s data quality and insights report

Refer to Get Insights On Data and Data Quality for details on how to interpret the results of this report.

Example 2: Create a Quick Model

In this second example, choose Quick Model for Analysis type and for Target column, choose admitted. The Quick Model estimates the expected predicted quality of the model.

By running the analysis, the estimated F1 score (a measure of predictive performance) of the model and feature importance scores are displayed.
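For reference, the F1 score is the harmonic mean of precision and recall, F1 = 2 × (precision × recall) / (precision + recall), so it is only high when the model keeps both false positives and false negatives low.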

The screenshot shows how SageMaker Data Wrangler’s quick model feature is used to assess the potential accuracy of the model. It has determined that the model achieved an F1 score of 0.76, and that systolic blood pressure, average end of day device battery level, average number of days the device is active, and age values all have an impact on the hospital admission prediction.

Figure 9: Running SageMaker Data Wrangler’s quick model feature to assess the potential accuracy of the model

SageMaker Canvas supports many other analysis types. By reviewing these analyses in advance of your ML model build, you can continue to engineer the data and features to gain sufficient confidence that the ML model will meet your business objectives.

Create additional steps in the data flow

In this example, we have decided to update our data_prep_data_flow.flow data flow to implement additional transformations. The following table summarizes these steps.

| Step | Transform | Description |
| --- | --- | --- |
| 3 | Chat for data prep | Removes the Splunk default fields identified. |
| 4 | Chat for data prep | Removes additional fields identified as being unhelpful to ML model development. |
| 5 | Group by | Groups the rows by user_id and calculates an average of the time-ordered numerical fields from Splunk. This converts the ML problem type from time series forecasting into a simple two-category prediction of the target feature (admitted) using averages of the input values over a given time period. Alternatively, SageMaker Canvas also supports time series forecasting. |
| 6 | Drop column (manage columns) | Drops remaining columns that are unnecessary for our ML development, such as columns with high cardinality (for example, user_id). |
| 7 | Parse column as type | Converts numerical value types, for example from Float to Long. This makes sure values, such as those in units of days, remain integers after calculations. |
| 8 | Parse column as type | Converts additional columns that need to be parsed (each column requires a separate step). |
| 9 | Drop duplicates (manage rows) | Drops duplicate rows to avoid overfitting. |

To create a new transform, view the data flow, then choose Add transform on the last step.

The screenshot shows how a transform can be added to a data flow in SageMaker Data Wrangler. The add transform option on the final step is highlighted.

Figure 10: Using SageMaker Data Wrangler to add a transform to a data flow

Choose Add transform, and proceed to choose a transform type and its configuration.

The screenshot shows how a transform can be added to a data flow in SageMaker Data Wrangler. The add transform option on the final step is highlighted.

Figure 11: Using SageMaker Data Wrangler to add a transform to a data flow

The following screenshot shows our newly updated end-to-end data flow featuring multiple steps. In this example, we ran the analyses at the end of the data flow.

The screenshot shows the end-to-end data flow in SageMaker Data Wrangler. The steps shown in the data flow are described in the table above.

Figure 12: Showing the end-to-end SageMaker Canvas Data Wrangler data flow

If you want to incorporate this data flow into a productionized ML workflow, SageMaker Canvas can create a Jupyter notebook that exports your data flow to Amazon SageMaker Pipelines.

Develop the ML model

To get started with ML model development, complete the following steps:

  1. Choose Create model directly from the last step of the data flow.
The screenshot shows how a model is created from the data flow in SageMaker Data Wrangler. Create model option is highlighted on the final data flow step.

Figure 13: Creating a model from the SageMaker Data Wrangler data flow

  1. For Dataset name, enter a name for your transformed dataset (for example, processed_data).
  2. Choose Export.
The screenshot shows how the exported dataset is named in SageMaker Data Wrangler. A name, processed_data, is being entered into the dataset name field.

Figure 14: Naming the exported dataset to be used by the model in SageMaker Data Wrangler

This step will automatically create a new dataset.

  1. After the dataset has been created successfully, choose Create model to begin the ML model creation.
The screenshot shows how the model is then created from the exported dataset using SageMaker Data Wrangler. The Create model link at the bottom of the screen is highlighted.

Figure 15: Creating the model in SageMaker Data Wrangler

  1. For Model name, enter a name for the model (for example, my_healthcare_model).
  2. For Problem type, select Predictive analysis.
  3. Choose Create.
The screenshot shows how the model is named and predictive analysis type is selected in SageMaker Canvas. Model name my_healthcare_model is being entered, and the predictive analysis option being selected.

Figure 16: Naming the model in SageMaker Canvas and selecting the predictive analysis type

You are now ready to progress through the Build, Analyze, Predict, and Deploy stages to develop and operationalize the ML model using SageMaker Canvas.

  1. On the Build tab, for Target column, choose the column you want to predict (admitted).
  2. Choose Quick build to build the model.

The Quick build option has a shorter build time, but the Standard build option generally yields higher accuracy.

The screenshot shows how the target column to predict for the model is selected in SageMaker Canvas. Field admitted has been chosen in the target column drop-down. The quick build button is highlighted.

Figure 17: Selecting the target column to predict in SageMaker Canvas

After a few minutes, on the Analyze tab, you will be able to view the accuracy of the model, along with column impact, scoring, and other advanced metrics. For example, we can see that a feature from the wearables data captured in Splunk—average_num_days_device_active—has a strong impact on whether the patient is likely to be admitted or not, along with their age. As such, the health-tech company may proactively reach out to elderly patients who tend to keep their wearables off to minimize the risk of their hospitalization.

The screenshot shows how the results from the model quick build is displayed in SageMaker Canvas. For the specific column impact selected, it shows that there is strong correlation between the average number of days a device has been active for and the probability of the patient’s admission. Model accuracy is 82% with a F1 score of 0.609.

Figure 18: Displaying the results from the model quick build in SageMaker Canvas

When you’re happy with the results from the Quick build, repeat the process with a Standard build to make sure you have an ML model with higher accuracy that can be deployed.

Test the ML model

Our ML model has now been built. If you’re satisfied with its accuracy, you can use this ML model to make predictions on net new data on the Predict tab. Predictions can be performed either in batch (a list of patients) or for a single entry (one patient).

Experiment with different values and choose Update prediction. The ML model will respond with a prediction for the new values that you have entered.

In this example, the ML model has identified a 64.5% probability that this particular patient will be admitted to hospital in the next 30 days. The health-tech company will likely want to prioritize the care of this patient.

The screenshot shows how the results from a single prediction using the developed model are displayed in SageMaker Canvas. A prediction has been made for an 88-year-old patient. The model has returned a 64.487% probability that they will be admitted into hospital.

Figure 19: Displaying the results from a single prediction using the model in SageMaker Canvas

Deploy the ML model

It is now possible for the health-tech company to build applications that use this ML model to make predictions. ML models developed in SageMaker Canvas can be operationalized using a broader set of SageMaker services.

To deploy the ML model, complete the following steps:

  1. On the Deploy tab, choose Create Deployment.
  2. Specify Deployment name, Instance type, and Instance count.
  3. Choose Deploy to make the ML model available as a SageMaker endpoint.

In this example, we reduced the instance type to ml.m5.4xlarge and instance count to 1 before deployment.

The screenshot shows how the developed model is deployed using SageMaker Canvas. The ml.m5.4xlarge instance type with an instance count of 1 has been selected.

Figure 20: Deploying the model using SageMaker Canvas

At any time, you can directly test the endpoint from SageMaker Canvas on the Test deployment tab of the deployed endpoint listed under Operations on the SageMaker Canvas console.
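Applications can also invoke the deployed endpoint directly through the SageMaker Runtime API. The following is a minimal sketch; the endpoint name and feature values are placeholders, and the payload must match the column order and format expected by your trained model (CSV input is assumed here, which is typical for SageMaker Canvas predictive models).

import boto3

runtime = boto3.client("sagemaker-runtime")

# Placeholder feature values; the order must match the columns the model was trained on
payload = "88,2,0,140,85,72,30,45"

response = runtime.invoke_endpoint(
    EndpointName="my-healthcare-model-endpoint",  # placeholder endpoint name
    ContentType="text/csv",
    Body=payload,
)
print(response["Body"].read().decode("utf-8"))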

Refer to the Amazon SageMaker Canvas Developer Guide for detailed steps to take your ML model development through its full development lifecycle and build applications that can consume the ML model to make predictions.

Clean up

Refer to the instructions in the README file to clean up the resources provisioned for the AWS data engineering pipeline solution.

SageMaker Canvas bills you for the duration of the session, and we recommend logging out of SageMaker Canvas when you are not using it. Refer to Logging out of Amazon SageMaker Canvas for more details. Furthermore, if you deployed a SageMaker endpoint, make sure you have deleted it.

Conclusion

This post explored a no-code approach involving SageMaker Canvas that can drive actionable insights from data stored across both Splunk and AWS platforms using AI/ML techniques. We also demonstrated how you can use the generative AI capabilities of SageMaker Canvas to speed up your data exploration and build ML models that are aligned to your business’s expectations.

Learn more about AI on Splunk and ML on AWS.


About the Authors

Alan Peaty

Alan Peaty is a Senior Partner Solutions Architect, helping Global Systems Integrators (GSIs), Global Independent Software Vendors (GISVs), and their customers adopt AWS services. Prior to joining AWS, Alan worked as an architect at systems integrators such as IBM, Capita, and CGI. Outside of work, Alan is a keen runner who loves to hit the muddy trails of the English countryside, and is an IoT enthusiast.

Brett Roberts

Brett Roberts is the Global Partner Technical Manager for AWS at Splunk, leading the technical strategy to help customers better secure and monitor their critical AWS environments and applications using Splunk. Brett was a member of the Splunk Trust and holds several Splunk and AWS certifications. Additionally, he co-hosts a community podcast and blog called Big Data Beard, exploring trends and technologies in the analytics and AI space.

Arnaud Lauer

Arnaud Lauer is a Principal Partner Solutions Architect in the Public Sector team at AWS. He enables partners and customers to understand how to best use AWS technologies to translate business needs into solutions. He brings more than 18 years of experience in delivering and architecting digital transformation projects across a range of industries, including public sector, energy, and consumer goods.

Read More