Driving Mobility Forward, Vay Brings Advanced Automotive Solutions to Roads With NVIDIA DRIVE AGX

Vay, a Berlin-based provider of automotive-grade remote driving (teledriving) technology, is offering an alternative approach to autonomous driving.

Through the company’s app, a user can hail a car, and a professionally trained teledriver will remotely drive the vehicle to the customer’s location. Once the car arrives, the user manually drives it.

After completing their trip, the user can end the rental in the app and pull over to a safe location to exit the car, away from traffic flow. There’s no need to park the vehicle, as the teledriver will handle the parking or drive the car to the next customer.

This system offers sustainable, door-to-door mobility, with the unique advantage of having a human driver remotely controlling the vehicle in real time.

Vay’s technology is built on the NVIDIA DRIVE AGX centralized compute platform, running the NVIDIA DriveOS operating system for safe, AI-defined autonomous vehicles.

These technologies enable Vay’s fleets to process large volumes of camera and other vehicle data over the air. DRIVE AGX’s real-time, low-latency video streaming capabilities provide enhanced situational awareness for teledrivers, while its automotive-grade design ensures reliability in any driving condition.

“By combining Vay’s innovative remote driving capabilities with the advanced AI and computing power of NVIDIA DRIVE AGX, we’re setting a new standard for remotely driven vehicles,” said Justin Spratt, chief business officer at Vay. “This collaboration helps us bring safe, reliable and accessible driverless options to the market and provides an adaptable solution that can be deployed in real-world environments now — not years from now.”

High-Quality Video Stream

Vay’s advanced technology stack includes NVIDIA DRIVE AGX software that’s optimized for latency and processing power. By harnessing NVIDIA GPUs specifically designed for autonomous driving, the company’s teledriving system can process and transmit high-definition video feeds in real time, delivering critical situational awareness to the teledriver, even in complex environments. In the event of an emergency, the vehicle can safely bring itself to a complete stop.

“Working with NVIDIA, Vay is setting a new standard in driverless technology,” said Bogdan Djukic, cofounder and vice president of engineering, teledrive experience and autonomy at Vay. “We are proud to not only accelerate the deployment of remotely driven and autonomous vehicles but also to expand the boundaries of what’s possible in urban transportation, logistics and beyond — transforming mobility for both businesses and communities.”

Reshaping Mobility With Teledriving

Vay’s technology enables professionally trained teledrivers to remotely drive vehicles from specialized teledrive stations equipped with industry-standard controls, such as a steering wheel and pedals.

The company’s teledrivers are totally immersed in the drive — road traffic sounds, such as those from emergency vehicles and other warning signals, are transmitted via microphones to the operator’s headphones. Camera sensors reproduce the car’s surroundings and transmit them to the screens of the teledrive station with minimum latency. The vehicles can operate at speeds of up to 26 mph.

Vay’s technology effectively addresses complex edge cases with human supervision, enhancing safety while significantly reducing costs and development challenges.

Vay is a member of NVIDIA Inception, a program that nurtures AI startups with go-to-market support, expertise and technology. Last year, Vay became the first and only company in Europe to teledrive a vehicle on public streets without a safety driver.

Since January, Vay has been operating its commercial services in Las Vegas. The startup recently secured a partnership with Bayanat, a provider of AI-powered geospatial solutions, and is working with Ush and Poppy, Belgium-based car-sharing companies, as well as Peugeot, a French automaker.

In October, Vay announced a $35 million investment from the European Investment Bank, which will help it roll out its technology across Europe and expand its development team.

Learn more about the NVIDIA DRIVE platform.

How AWS sales uses Amazon Q Business for customer engagement

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. In addition to planning considerations when building an AI application from the ground up, it focused on our Account Summaries use case, which allows account teams to quickly understand the state of a customer account, including recent trends in service usage, opportunity pipeline, and recommendations to help customers maximize the value they receive from AWS.

In the same spirit of using generative AI to equip our sales teams to most effectively meet customer needs, this post reviews how we’ve delivered an internally-facing conversational sales assistant using Amazon Q Business. We discuss how our sales teams are using it today, compare the benefits of Amazon Q Business as a managed service to the do-it-yourself option, review the data sources available and high-level technical design, and talk about some of our future plans.

Introducing Field Advisor

In April 2024, we launched our AI sales assistant, Field Advisor, which is powered by Amazon Q Business, and made it available to AWS employees in the Sales, Marketing, and Global Services organization. Since then, thousands of active users have asked hundreds of thousands of questions through Field Advisor, which we have embedded in our customer relationship management (CRM) system and also offer through a Slack application. The following screenshot shows an example of an interaction with Field Advisor.

Field Advisor serves four primary use cases:

  • AWS-specific knowledge search – With Amazon Q Business, we’ve made internal data sources as well as public AWS content available in Field Advisor’s index. This enables sales teams to interact with our internal sales enablement collateral, including sales plays and first-call decks, as well as customer references, customer- and field-facing incentive programs, and content on the AWS website, including blog posts and service documentation.
  • Document upload – When users need to provide context of their own, the chatbot supports uploading multiple documents during a conversation. We’ve seen our sales teams use this capability to do things like consolidate meeting notes from multiple team members, analyze business reports, and develop account strategies. For example, an account manager can upload a document representing their customer’s account plan, and use the assistant to help identify new opportunities with the customer.
  • General productivity – Amazon Q Business specializes in Retrieval Augmented Generation (RAG) over enterprise and domain-specific datasets, and can also perform general knowledge retrieval and content generation tasks. Our sales, marketing, and operations teams use Field Advisor to brainstorm new ideas, as well as generate personalized outreach that they can use with their customers and stakeholders.
  • Notifications and recommendations – To complement the conversational capabilities provided by Amazon Q, we’ve built a mechanism that allows us to deliver alerts, notifications, and recommendations to our field team members. These push-based notifications are available in our assistant’s Slack application, and we’re planning to make them available in our web experience as well. Example notifications we deliver include field-wide alerts in support of AWS summits like AWS re:Invent, reminders to generate an account summary when there’s an upcoming customer meeting, AI-driven insights around customer service usage and business data, and cutting-edge use cases like autonomous prospecting, which we’ll talk more about in an upcoming post.

Based on an internal survey, our field teams estimate that roughly a third of their time is spent preparing for their customer conversations, and another 20% (or more) is spent on administrative tasks. This time adds up individually, but also collectively at the team and organizational level. Using our AI assistant built on Amazon Q, team members are saving hours of time each week. Not only that, but our sales teams devise action plans that they otherwise might have missed without AI assistance.

Here’s a sampling of what some of our more active users had to say about their experience with Field Advisor:

“I use Field Advisor to review executive briefing documents, summarize meetings and outline actions, as well as analyze dense information into key points with prompts. Field Advisor continues to enable me to work smarter, not harder.” – Sales Director

“When I prepare for onsite customer meetings, I define which advisory packages to offer to the customer. We work backward from the customer’s business objectives, so I download an annual report from the customer website, upload it in Field Advisor, ask about the key business and tech objectives, and get a lot of valuable insights. I then use Field Advisor to brainstorm ideas on how to best position AWS services. Summarizing the business objectives alone saves me between 4–8 hours per customer, and we have around five customer meetings to prepare for per team member per month.” – AWS Professional Services, EMEA

“I benefit from getting notifications through Field Advisor that I would otherwise not be aware of. My customer’s Savings Plans were expiring, and the notification helped me kick off a conversation with them at the right time. I asked Field Advisor to improve the content and message of an email I needed to send their executive team, and it only took me a minute. Thank you!” – Startup Account Manager, North America

Amazon Q Business underpins this experience, reducing the time and effort it takes for internal teams to have productive conversations with their customers that drive them toward the best possible outcomes on AWS.

The rest of this post explores how we’ve built our AI assistant for sales teams using Amazon Q Business, and highlights some of our future plans.

Putting Amazon Q Business into action

We started our journey in building this sales assistant before Amazon Q Business was available as a fully managed service. AWS provides the primitives needed for building new generative AI applications from the ground up: services like Amazon Bedrock to provide access to several leading foundation models, several managed vector database options for semantic search, and patterns for using Amazon Simple Storage Service (Amazon S3) as a data lake to host knowledge bases that can be used for RAG. This approach works well for teams like ours with builders experienced in these technologies, as well as for teams who need deep control over every component of the tech stack to meet their business objectives.

When Amazon Q Business became generally available in April 2024, we quickly saw an opportunity to simplify our architecture, because the service was designed to meet the needs of our use case—to provide a conversational assistant that could tap into our vast (sales) domain-specific knowledge bases. By moving our core infrastructure to Amazon Q, we no longer needed to choose a large language model (LLM) and optimize our use of it, manage Amazon Bedrock agents, a vector database and semantic search implementation, or custom pipelines for data ingestion and management. In just a few weeks, we were able to cut over to Amazon Q and significantly reduce the complexity of our service architecture and operations. Not only that, we expected this move to pay dividends—and it has—as the Amazon Q Business service team has continued to add new features (like automatic personalization) and enhance performance and result accuracy.

The following diagram illustrates Field Advisor’s high-level architecture:

Architecture of AWS Field Advisor using Amazon Q Business

Solution overview

We built Field Advisor using the built-in capabilities of Amazon Q Business. This includes how we configured data sources that comprise our knowledge base, indexing documents and relevancy tuning, security (authentication, authorization, and guardrails), and Amazon Q’s APIs for conversation management and custom plugins. We deliver our chatbot experience through a custom web frontend, as well as through a Slack application.

Data management

As mentioned earlier in this post, our initial knowledge base comprises all of our internal sales enablement materials, as well as publicly available content including the AWS website, blog posts, and service documentation. Amazon Q Business provides a number of out-of-the-box connectors to popular data sources like relational databases, content management systems, and collaboration tools. In our case, where we have several applications built in-house, as well as third-party software backed by Amazon S3, we make heavy use of the Amazon Q connector for Amazon S3, as well as custom connectors we’ve written. Using the service’s built-in source connectors standardizes and simplifies the work needed to maintain data quality and manage the overall data lifecycle. Amazon Q gives us a templatized way to filter source documents when generating responses on a particular topic, making it straightforward for the application to produce a higher quality response. Not only that, but each time Amazon Q provides an answer using the knowledge base we’ve connected, it automatically cites sources, enabling our sellers to verify the authenticity of the information. Previously, we had to build and maintain custom logic to handle these tasks.
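As an illustration of the source-citation behavior described above, the following Python sketch shows one way a frontend could render an answer with a numbered, de-duplicated source list. It is not the Field Advisor implementation and does not call the Amazon Q Business API; the `Passage` type, the `format_answer_with_citations` helper, and the sample values in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A retrieved snippet plus the document it came from (hypothetical shape)."""
    text: str
    source_title: str
    source_uri: str


def format_answer_with_citations(answer, passages):
    """Append a numbered source list, de-duplicated by URI, to an answer."""
    sources = {}
    for passage in passages:
        # Keep the first title seen for each URI; dicts preserve insertion order.
        sources.setdefault(passage.source_uri, passage.source_title)

    lines = [answer, "", "Sources:"]
    for number, (uri, title) in enumerate(sources.items(), start=1):
        lines.append(f"[{number}] {title} ({uri})")
    return "\n".join(lines)
```

For example, two passages from the same (made-up) sales play and one from a blog post would produce two numbered sources, which a seller could open to verify the answer.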

Security

Amazon Q Business provides capabilities for authentication, authorization, and access control out of the box. For authentication, we use AWS IAM Identity Center for enterprise single sign-on (SSO), using our internal identity provider called Amazon Federate. After going through a one-time setup for identity management that governs access to our sales assistant application, Amazon Q is aware of the users and roles across our sales teams, making it effortless for our users to access Field Advisor across multiple delivery channels, like the web experience embedded in our CRM, as well as the Slack application.

Also, with our multi-tenant AI application serving thousands of users across multiple sales teams, it’s critical that end-users are only interacting with data and insights that they should be seeing. Like any large organization, we have information firewalls between teams that help us properly safeguard customer information and adhere to privacy and compliance rules. Amazon Q Business provides the mechanisms for protecting each individual document in its knowledge base, simplifying the work required to make sure we’re respecting permissions on the underlying content that’s accessible to a generative AI application. This way, when a user asks a question of the tool, the answer will be generated using only information that the user is permitted to access.

Web experience

As noted earlier, we built a custom web frontend rather than using the Amazon Q built-in web experience. The Amazon Q experience works great, with features like conversation history, sample quick prompts, and Amazon Q Apps. Amazon Q Business makes these features available through the service API, allowing for a customized look and feel on the frontend. We chose this path to have a more fluid integration with our other field-facing tools, control over branding, and sales-specific contextual hints that we’ve built into the experience. As an example, we’re planning to use Amazon Q Apps as the foundation for an integrated prompt library that is personalized for each user and field-facing role.

A look at what’s to come

Field Advisor has seen early success, but it’s still just the beginning, or Day 1 as we like to say here at Amazon. We’re continuing to work on bringing our field-facing teams and field support functions more generative AI across the board. With Amazon Q Business, we no longer need to manage each of the infrastructure components required to deliver a secure, scalable conversational assistant—instead, we can focus on the data, insights, and experience that benefit our salesforce and help them make our customers successful on AWS. As Amazon Q Business adds features, capabilities, and improvements (which we often have the privilege of being able to test in early access) we automatically reap the benefits.

The team that built this sales assistant has been focused on developing deeper integration with our CRM, which will be launching soon. This will enable teams across all roles to ask detailed questions about their customer and partner accounts, territories, leads and contacts, and sales pipeline. With an Amazon Q custom plugin built on an internal natural language to SQL (NL2SQL) library, the same one that powers generative SQL capabilities across some AWS database services like Amazon Redshift, we will provide the ability to aggregate and slice-and-dice the opportunity pipeline and trends in product consumption conversationally. Finally, a common request we get is to use the assistant to generate more hyper-personalized customer-facing collateral: think of a first-call deck about AWS products and solutions that’s specific to an individual customer, localized in their language, that draws from the latest available service options, competitive intelligence, and the customer’s existing usage in the AWS Cloud.

Conclusion

In this post, we reviewed how we’ve made a generative AI assistant available to AWS sales teams, powered by Amazon Q Business. As new capabilities land and usage continues to grow, we’re excited to see how our field teams use this, along with other AI solutions, to help customers maximize their value on the AWS Cloud.

The next post in this series will dive deeper into another recent generative AI use case and how we applied this to autonomous sales prospecting. Stay tuned for more, and reach out to us with any questions about how you can drive growth with AI at your business.


About the authors

Joe Travaglini is a Principal Product Manager on the AWS Field Experiences (AFX) team who focuses on helping the AWS salesforce deliver value to AWS customers through generative AI. Prior to AFX, Joe led the product management function for Amazon Elastic File System, Amazon ElastiCache, and Amazon MemoryDB.

Jonathan Garcia is a Sr. Software Development Manager based in Seattle with over a decade of experience at AWS. He has worked on a variety of products, including data visualization tools and mobile applications. He is passionate about serverless technologies, mobile development, leveraging Generative AI, and architecting innovative high-impact solutions. Outside of work, he enjoys golfing, biking, and exploring the outdoors.

Umesh Mohan is a Software Engineering Manager at AWS, where he has been leading a team of talented engineers for over three years. With more than 15 years of experience in building data warehousing products and software applications, he is now focusing on the use of generative AI to drive smarter and more impactful solutions. Outside of work, he enjoys spending time with his family and playing tennis.

Discover insights from your Amazon Aurora PostgreSQL database using the Amazon Q Business connector

Amazon Aurora PostgreSQL-Compatible Edition is a fully managed, PostgreSQL-compatible, ACID-compliant relational database engine that combines the speed, reliability, and manageability of Amazon Aurora with the simplicity and cost-effectiveness of open source databases. Aurora PostgreSQL-Compatible is a drop-in replacement for PostgreSQL and makes it simple and cost-effective to set up, operate, and scale your new and existing PostgreSQL deployments, freeing you to focus on your business and applications.

Effective data management and performance optimization are critical aspects of running robust and scalable applications. Aurora PostgreSQL-Compatible, a managed relational database service, has become an indispensable part of many organizations’ infrastructure to maintain the reliability and efficiency of their data-driven applications. However, extracting valuable insights from the vast amount of data stored in Aurora PostgreSQL-Compatible often requires manual efforts and specialized tooling. Users such as database administrators, data analysts, and application developers need to be able to query and analyze data to optimize performance and validate the success of their applications. Generative AI provides the ability to take relevant information from a data source and deliver well-constructed answers back to the user.

Building a generative AI-based conversational application that is integrated with the data sources that contain relevant content requires time, money, and people. You first need to build connectors to the data sources. Next, you need to index this data to make it available for a Retrieval Augmented Generation (RAG) approach, where relevant passages are delivered with high accuracy to a large language model (LLM). To do this, you need to select an index that provides the capabilities to index the content for semantic and vector search, build the infrastructure to retrieve and rank the answers, and build a feature-rich web application. You also need to hire and staff a large team to build, maintain, and manage such a system.

Amazon Q Business is a fully managed generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Amazon Q Business can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take action using the data and expertise found in your company’s information repositories, code, and enterprise systems (such as an Aurora PostgreSQL database, among others). Amazon Q provides out-of-the-box data source connectors that can index content into a built-in retriever and uses an LLM to provide accurate, well-written answers. A data source connector is a component of Amazon Q that helps integrate and synchronize data from multiple repositories into one index.

Amazon Q Business offers prebuilt connectors to a large number of data sources, including Aurora PostgreSQL-Compatible, Atlassian Confluence, Amazon Simple Storage Service (Amazon S3), Microsoft SharePoint, and Salesforce, and helps you create your generative AI solution with minimal configuration. For a full list of Amazon Q Business supported data source connectors, see Amazon Q Business connectors.

In this post, we walk you through configuring and integrating Amazon Q Business with Aurora PostgreSQL-Compatible to enable your database administrators, data analysts, application developers, leadership, and other teams to quickly get accurate answers to their questions related to the content stored in Aurora PostgreSQL databases.

Use cases

After you integrate Amazon Q Business with Aurora PostgreSQL-Compatible, users can ask questions directly about the database content. This enables the following use cases:

  • Natural language search – Users can search for specific data, such as records or entries, using conversational language. This makes it straightforward to find the necessary information without needing to remember exact keywords or filters.
  • Summarization – Users can request a concise summary of the data matching their search query, helping them quickly understand key points without manually reviewing each record.
  • Query clarification – If a user’s query is ambiguous or lacks sufficient context, Amazon Q Business can engage in a dialogue to clarify the intent, making sure the user receives the most relevant and accurate results.

Overview of the Amazon Q Business Aurora (PostgreSQL) connector

A data source connector is a mechanism for integrating and synchronizing data from multiple repositories into one container index. Amazon Q Business offers multiple data source connectors that can connect to your data sources and help you create your generative AI solution with minimal configuration.

A data source is a data repository or location that Amazon Q Business connects to in order to retrieve your data stored in the database. After the PostgreSQL data source is set up, you can create one or multiple data sources within Amazon Q Business and configure them to start indexing data from your Aurora PostgreSQL database. When you connect Amazon Q Business to a data source and initiate the sync process, Amazon Q Business crawls and adds documents from the data source to its index.

Types of documents

Let’s look at what counts as a document in the context of the Amazon Q Business Aurora (PostgreSQL) connector. A document is a collection of information that consists of a title, the content (or the body), metadata (data about the document), and access control list (ACL) information to make sure answers are provided from documents that the user has access to.

The Amazon Q Business Aurora (PostgreSQL) connector supports crawling of the following entities as a document:

  • Table data in a single database
  • View data in a single database

Each row in a table or view is considered a single document.

The Amazon Q Business Aurora (PostgreSQL) connector also supports field mappings. Field mappings allow you to map document attributes from your data sources to fields in your Amazon Q index. This includes both reserved or default field mappings created automatically by Amazon Q, as well as custom field mappings that you can create and edit.

Refer to Aurora (PostgreSQL) data source connector field mappings for more information.
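To make the row-as-document idea and field mappings concrete, here is a minimal Python sketch. It only illustrates the concept and is not how the connector is implemented. The `Document` class, the `row_to_document` helper, and the column and index field names in the usage note are hypothetical, and ACL information is left out of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Title, body, and metadata for one indexed row (hypothetical shape)."""
    title: str
    body: str
    metadata: dict = field(default_factory=dict)


def row_to_document(row, title_column, body_column, field_mappings):
    """Turn one table or view row into a Document.

    field_mappings maps source column names to index field names.
    Mapped columns that are missing from the row are skipped.
    """
    metadata = {
        index_field: row[column]
        for column, index_field in field_mappings.items()
        if column in row
    }
    return Document(
        title=str(row[title_column]),
        body=str(row[body_column]),
        metadata=metadata,
    )
```

For instance, a row such as `{"subject": "Reset password", "details": "Steps...", "team": "support"}` with the mapping `{"team": "owning_team"}` would become a document titled "Reset password" whose metadata carries `owning_team = "support"`.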

ACL crawling

Amazon Q Business supports crawling ACLs for document security by default. Turning off ACLs and identity crawling is no longer supported. In preparation for connecting Amazon Q Business applications to AWS IAM Identity Center, enable ACL indexing and identity crawling for secure querying and re-sync your connector. After you turn ACL and identity crawling on, you won’t be able to turn them off.

If you want to index documents without ACLs, make sure the documents are marked as public in your data source.

When you connect a database data source to Amazon Q, Amazon Q crawls user and group information from a column in the source table. You specify this column on the Amazon Q console or using the configuration parameter as part of the CreateDataSource operation.

If you activate ACL crawling, you can use that information to filter chat responses to your end-user’s document access level.

The following are important considerations for a database data source:

  • You can only specify an allow list for a database data source. You can’t specify a deny list.
  • You can only specify groups. You can’t specify individual users for the allow list.
  • The database column should be a string containing a semicolon-delimited list of groups.

Refer to How Amazon Q Business connector crawls Aurora (PostgreSQL) ACLs for more information.
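The considerations above (allow list only, groups only, semicolon-delimited string) can be sketched in a few lines of Python. This is an illustration of the data format, not the connector's actual logic; the function names are hypothetical, and treating an empty column as "no access" is an assumption of this sketch rather than documented behavior.

```python
def parse_group_column(value):
    """Split a semicolon-delimited ACL column into a set of group names."""
    if not value:
        return set()
    return {group.strip() for group in value.split(";") if group.strip()}


def user_can_access(row_acl, user_groups):
    """Allow-list check: the user needs at least one group named in the row.

    Assumption: a row with an empty ACL column grants access to nobody.
    """
    allowed = parse_group_column(row_acl)
    return bool(allowed & set(user_groups))
```

With a column value of `sales;finance`, a user in the `finance` group passes the check, and a user who is only in `hr` does not.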

Solution overview

In the following sections, we demonstrate how to set up the Amazon Q Business Aurora (PostgreSQL) connector. This connector allows you to query your Aurora PostgreSQL database through Amazon Q using natural language. Then we provide examples of how to use the AI-powered chat interface to gain insights from the connected data source.

After the configuration is complete, you can configure how often Amazon Q Business should synchronize with your Aurora PostgreSQL database to keep up to date with the database content. This enables you to perform complex searches and retrieve relevant information quickly and efficiently, leading to intelligent insights and informed decision-making. By centralizing search functionality and seamlessly integrating with other AWS services, the connector enhances operational efficiency and productivity, while enabling organizations to use the full capabilities of the AWS landscape for data management, analytics, and visualization.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • An AWS account where you can follow the instructions in this post
  • An Amazon Aurora PostgreSQL database
  • Your Aurora PostgreSQL-Compatible authentication credentials stored in an AWS Secrets Manager secret
  • Your Aurora PostgreSQL database user name and password (as a best practice, provide Amazon Q with read-only database credentials)
  • Your database host URL, port, and instance, which you can find on the Amazon RDS console
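Before moving to the console, it can help to collect these values in one place and sanity-check them. The following Python sketch is one possible way to do that; the `DatabasePrerequisites` class, the `find_problems` helper, and the sample values in the usage note are hypothetical, and the checks are basic format checks only (they do not contact the database or Secrets Manager).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabasePrerequisites:
    """Connection details gathered before configuring the connector."""
    host: str
    port: int
    instance: str      # database name where the tables and views live
    secret_name: str   # Secrets Manager secret holding the credentials


def find_problems(prereqs):
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    if not prereqs.host.strip():
        problems.append("host is empty")
    if not 1 <= prereqs.port <= 65535:
        problems.append("port must be between 1 and 65535")
    if not prereqs.instance.strip():
        problems.append("instance is empty")
    if not prereqs.secret_name.strip():
        problems.append("secret name is empty")
    return problems
```

For example, `find_problems(DatabasePrerequisites("db.example.internal", 5432, "postgres", "aurora-readonly"))` returns an empty list, while a blank host or an out-of-range port is reported by name.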

Create an Amazon Q Business application

In this section, we walk through the configuration steps for the Amazon Q Business Aurora (PostgreSQL) connector. For more information, see Creating an Amazon Q Business application environment. Complete the following steps to create your application:

  1. On the Amazon Q Business console, choose Applications in the navigation pane.
  2. Choose Create application.

  3. For Application name, enter a name (for example, aurora-connector).
  4. For Access management method, select AWS IAM Identity Center.
  5. For Advanced IAM Identity Center settings, select Enable cross-region calls to allow Amazon Q Business to connect to an AWS IAM Identity Center instance that exists in an AWS Region not already supported by Amazon Q Business. For more information, see Creating a cross-region IAM Identity Center integration.
  6. Then, you will see the following options based on whether you have an IAM Identity Center instance already configured, or need to create one.
    1. If you don’t have an IAM Identity Center instance configured, you see the following:
      1. The Region your Amazon Q Business application environment is in.
      2. Specify tags for IAM Identity Center – Add tags to keep track of your IAM Identity Center instance.
      3. Create IAM Identity Center – Select to create an IAM Identity Center instance. Depending on your setup, you may be prompted to create an account instance or an organization instance, or both. The console will display an ARN for your newly created resource after it’s created.
    2. If you have both an IAM Identity Center organization instance and an account instance configured, your instances will be auto-detected, and you see the following options:
      1. Organization instance of IAM Identity Center – Select this option to manage access to Amazon Q Business by assigning users and groups from the IAM Identity Center directory for your organization. If you have an IAM Identity Center organization instance configured, your organization instance will be auto-detected.
      2. Account instance of IAM Identity Center – Select this option to manage access to Amazon Q Business by assigning existing users and groups from your IAM Identity Center directory. If you have an IAM Identity Center account instance configured, your account instance will be auto-detected.
      3. The Region your Amazon Q Business application environment is in.
      4. IAM Identity Center – The ARN for your IAM Identity Center instance.

If your IAM Identity Center instance is configured in a Region Amazon Q Business isn’t available in, and you haven’t activated cross-Region IAM Identity Center calls, you will see a message saying that a connection is unavailable with an option to Switch Region. When you allow a cross-Region connection between Amazon Q Business and IAM Identity Center using Advanced IAM Identity Center settings, your cross-Region IAM Identity Center instance will be auto-detected by Amazon Q Business.

Create Application 2

  1. Keep everything else as default and choose Create.

Create Application 3

Create an Amazon Q Business retriever

After you create the application, you can create a retriever. Complete the following steps:

  1. On the application page, choose Data sources in the navigation pane.

Add Retriever 1

  1. Choose Select retriever.

Add Retriever 2

  1. For Retrievers, select your type of retriever. For this post, we select Native.
  2. For Index provisioning, select your index type. For this post, we select Enterprise.
  3. For Number of units, enter a number of index units. For this post, we use 1 unit, which can read up to 20,000 documents. This limit applies to the connectors you configure for this retriever.
  4. Choose Confirm.

Select Retriever

Connect data sources

After you create the retriever, complete the following steps to add a data source:

  1. On the Data sources page, choose Add data source.

Connect data sources

  1. Choose your data source. For this post, we choose Aurora (PostgreSQL).

You can configure up to 50 data sources per application.

Add data sources

  1. Under Name and description, enter a data source name. Your name can include hyphens (-) but not spaces. The name has a maximum of 1,000 alphanumeric characters.
  2. Under Source, enter the following information:
    1. For Host, enter the database host URL, for example http://instance URL.region.rds.amazonaws.com.
    2. For Port, enter the database port, for example 5432.
    3. For Instance, enter the name of the database that you want to connect with and where tables and views are created, for example postgres.

Configure data sources

  1. If you enable SSL Certificate Location, enter the Amazon S3 path to your SSL certificate file.
  2. For Authorization, Amazon Q Business crawls ACL information by default to make sure responses are generated only from documents your end-users have access to. See Authorization for more details.
  3. Under Authentication, if you have an existing Secrets Manager secret that has the database user name and password, you can use it; otherwise, enter the following information for your new secret:
    1. For Secret name, enter a name for your secret.
    2. For Database user name and Password, enter the authentication credentials you copied from your database.
    3. Choose Save.

Database Secrets

  1. For Configure VPC and security group, choose whether you want to use a virtual private cloud (VPC). For more information, see Virtual private cloud. If you do, enter the following information:
    1. For Virtual Private Cloud (VPC), choose the VPC where Aurora PostgreSQL-Compatible is present.
    2. For Subnets, choose up to six repository subnets that define the subnets and IP ranges the repository instance uses in the selected VPC.
    3. For VPC security groups, choose up to 10 security groups that allow access to your data source.

Make sure the security group allows incoming traffic from Amazon Elastic Compute Cloud (Amazon EC2) instances and devices outside your VPC. For databases, security group instances are required.

Authentication

  1. Keep the default setting for IAM role (Create a new service role) and a new role name is generated automatically. For more information, see IAM role for Aurora (PostgreSQL) connector.

IAM Role creation

  1. Under Sync scope, enter the following information:
    1. For SQL query, enter SQL query statements like SELECT and JOIN operations. SQL queries must be less than 1,000 characters and not contain any semi-colons (;). Amazon Q will crawl database content that matches your query.
    2. For Primary key column, enter the primary key for the database table. This identifies a table row within your database table. Each row in a table and view is considered a single document.
    3. For Title column, enter the name of the document title column in your database table.
    4. For Body column, enter the name of the document body column in your database table.
  2. Under Additional configuration, configure the following settings:
    1. For Change-detecting columns, enter the names of the columns that Amazon Q will use to detect content changes. Amazon Q will re-index content when there is a change in these columns.
    2. For Users’ IDs column, enter the name of the column that contains user IDs to be allowed access to content.
    3. For Groups column, enter the name of the column that contains groups to be allowed access to content.
    4. For Source URLs column, enter the name of the column that contains source URLs to be indexed.
    5. For Timestamp column, enter the name of the column that contains timestamps. Amazon Q uses timestamp information to detect changes in your content and sync only changed content.
    6. For Timestamp format of table, enter the name of the column that contains timestamp formats to use to detect content changes and re-sync your content.
    7. For Database time zone, enter the name of the column that contains time zones for the content to be crawled.
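
The SQL query constraints listed under Sync scope (fewer than 1,000 characters, no semicolons) are easy to check before you paste a query into the console. The following is a minimal sketch; the helper name `check_sync_query` and the column names in the sample query are assumptions for illustration, not part of the connector.

```python
MAX_QUERY_CHARS = 1000  # the post states queries must be less than 1,000 characters


def check_sync_query(query: str) -> list[str]:
    """Return a list of problems; an empty list means the query passes these two checks."""
    problems = []
    if len(query) >= MAX_QUERY_CHARS:
        problems.append(
            f"query has {len(query)} characters; it must be under {MAX_QUERY_CHARS}"
        )
    if ";" in query:
        problems.append("query contains a semicolon")
    return problems


# Hypothetical query; the table and column names are assumptions.
query = "SELECT id, title, description, updated_at FROM movie"
print(check_sync_query(query))  # prints []
```

This only covers the two limits named in the post; it doesn’t validate SQL syntax or confirm that the referenced columns exist.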

Sync Scope

  1. Under Sync mode, choose how you want to update your index when your data source content changes. When you sync your data source with Amazon Q for the first time, content is synced by default. For more details, see Sync mode.
    1. New, modified, or deleted content sync – Sync and index new, modified, or deleted content only.
    2. New or modified content sync – Sync and index new or modified content only.
    3. Full sync – Sync and index content regardless of previous sync status.
  2. Under Sync run schedule, for Frequency, choose how often Amazon Q will sync with your data source. For more details, see Sync run schedule.
  3. Under Tags, add tags to search and filter your resources or track your AWS costs. See Tags for more details.
  4. Under Field mappings, you can list data source document attributes to map to your index fields. Add the fields from the Data source details page after you finish adding your data source. For more information, see Field mappings. You can choose from two types of fields:
    1. Default – Automatically created by Amazon Q on your behalf based on common fields in your data source. You can’t edit these.
    2. Custom – Automatically created by Amazon Q on your behalf based on common fields in your data source. You can edit these. You can also create and add new custom fields.
  5. When you’re finished, choose Add data source.
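
To make the difference between the three sync modes concrete, the following sketch compares a previous and a current snapshot of a table and decides which rows would be indexed or removed. It is an illustration only, not how Amazon Q Business is implemented: the function name `plan_sync`, the mode labels, and the snapshot format (primary key mapped to a change-detection value) are assumptions.

```python
def plan_sync(previous: dict, current: dict, mode: str) -> dict:
    """Decide which primary keys to index or delete for one sync run.

    previous/current map a primary key to a change-detection value,
    for example a timestamp or the contents of a change-detecting column.
    """
    new = {k for k in current if k not in previous}
    modified = {k for k in current if k in previous and current[k] != previous[k]}
    deleted = {k for k in previous if k not in current}

    if mode == "new_modified_deleted":
        return {"index": new | modified, "delete": deleted}
    if mode == "new_modified":
        return {"index": new | modified, "delete": set()}
    if mode == "full":
        # Index everything regardless of earlier runs; deletions are not modeled here.
        return {"index": set(current), "delete": set()}
    raise ValueError(f"unknown sync mode: {mode}")


previous = {1: "2024-01-01", 2: "2024-01-01", 3: "2024-01-01"}
current = {1: "2024-01-01", 2: "2024-02-01", 4: "2024-02-01"}  # row 2 changed, 3 removed, 4 added

for mode in ("new_modified_deleted", "new_modified", "full"):
    print(mode, plan_sync(previous, current, mode))
```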

Add Data Source Final

  1. When the data source state is Active, choose Sync now.

Sync Now

Add groups and users

After you add the data source, you can add users and groups in the Amazon Q Business application to query the data ingested from the data source. Complete the following steps:

  1. On your application page, choose Manage user access.

Manage User Access

  1. Choose to add new users or assign existing users:
    1. Select Add new users to create new users in IAM Identity Center.
    2. Select Assign existing users and groups if you already have users and groups in IAM Identity Center. For this post, we select this option.
  2. Choose Next.

Assign existing users and groups

  1. Search for the users or groups you want to assign and choose Assign to add them to the application.

Assign Users and Groups

  1. After the users are added, choose Change subscription to assign either the Business Lite or Business Pro subscription plan.

Change Subscription

  1. Choose Confirm to confirm your subscription choice.

Confirm Subscription

Test the solution

To access the Amazon Q Business Web Experience, navigate to the Web experience settings tab and choose the link for Deployed URL.

Web Experience Settings

You will need to authenticate with the IAM Identity Center user details before you’re redirected to the chat interface.

Chat Interface

Our data source is the Aurora PostgreSQL database, which contains a Movie table. We have indexed this to our Amazon Q Business application, and we will ask questions related to this data. The following screenshot shows a sample of the data in this table.

Sample Data

For the first query, we ask Amazon Q Business to provide recommendations for kids’ movies in natural language, and it queries the indexed data to provide the response shown in the following screenshot.

First Query

For the second query, we ask Amazon Q Business to provide more details of a specific movie in natural language. It uses the indexed data from the column of our table to provide the response.

Second Query

Frequently asked questions

In this section, we answer frequently asked questions.

Amazon Q Business is unable to answer your questions

If you get the response “Sorry, I could not find relevant information to complete your request,” this may be due to a few reasons:

  • No permissions – ACLs applied to your account don’t allow you to query certain data sources. If this is the case, reach out to your application administrator to make sure your ACLs are configured to access the data sources. You can go to the Sync History tab to view the sync history, and then choose the View Report link, which opens an Amazon CloudWatch Logs Insights query that provides additional details like the ACL list, metadata, and other useful information that might help with troubleshooting. For more details, see Introducing document-level sync reports: Enhanced data sync visibility in Amazon Q Business.
  • Data connector sync failed – Your data connector may have failed to sync information from the source to the Amazon Q Business application. Verify the data connector’s sync run schedule and sync history to confirm the sync is successful.

If none of these reasons apply to your use case, open a support case and work with your technical account manager to get this resolved.

How to generate responses from authoritative data sources

If you want Amazon Q Business to only generate responses from authoritative data sources, you can configure this using the Amazon Q Business application global controls under Admin controls and guardrails.

  1. Log in to the Amazon Q Business console as an Amazon Q Business application administrator.
  2. Navigate to the application and choose Admin controls and guardrails in the navigation pane.
  3. Choose Edit in the Global controls section to set these options.

For more information, refer to Admin controls and guardrails in Amazon Q Business.
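
The same setting can also be scripted. The following is a minimal sketch, assuming the boto3 `qbusiness` client and its `update_chat_controls_configuration` operation with `responseScope` set to `ENTERPRISE_CONTENT_ONLY`. The helper name and the application ID are hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
def restrict_to_enterprise_content(client, application_id: str) -> dict:
    """Ask Amazon Q Business to answer only from indexed enterprise content."""
    return client.update_chat_controls_configuration(
        applicationId=application_id,
        responseScope="ENTERPRISE_CONTENT_ONLY",
    )


# Usage (requires the boto3 package and AWS credentials):
#   import boto3
#   client = boto3.client("qbusiness")
#   restrict_to_enterprise_content(client, "your-application-id")  # hypothetical ID
```

Check the current API reference for your SDK version before relying on this call; the console remains the path described in this post.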

Admin controls and guardrails

Amazon Q Business responds using old (stale) data even though your data source is updated

Each Amazon Q Business data connector can be configured with its own sync run schedule frequency. Checking the sync status and sync schedule frequency for your data connector shows when the last sync ran successfully. Your data connector’s sync run schedule could be set to sync at a scheduled time of day, week, or month. If it’s set to run on demand, the sync has to be manually invoked. When the sync run is complete, check the sync history to make sure the run has successfully synced the new content. Refer to Sync run schedule for more information about each option.
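
If you prefer to check the sync history programmatically, the following sketch picks the most recent successful run from a list of sync job records. It assumes records shaped like the `history` entries returned by the boto3 `qbusiness` operation `list_data_source_sync_jobs` (with `status`, `startTime`, and `executionId` fields). The sample records and the helper name are made up for illustration.

```python
from datetime import datetime, timezone


def last_successful_sync(history: list[dict]):
    """Return the most recent sync job whose status is SUCCEEDED, or None."""
    succeeded = [job for job in history if job.get("status") == "SUCCEEDED"]
    return max(succeeded, key=lambda job: job["startTime"], default=None)


# Illustrative records. With boto3 they would come from something like:
#   boto3.client("qbusiness").list_data_source_sync_jobs(
#       applicationId=app_id, indexId=index_id, dataSourceId=ds_id)["history"]
history = [
    {"executionId": "run-1", "status": "SUCCEEDED",
     "startTime": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"executionId": "run-2", "status": "SUCCEEDED",
     "startTime": datetime(2024, 5, 8, tzinfo=timezone.utc)},
    {"executionId": "run-3", "status": "FAILED",
     "startTime": datetime(2024, 5, 15, tzinfo=timezone.utc)},
]

job = last_successful_sync(history)
print(job["executionId"], job["startTime"].date())
```

If the last successful run predates the change you expect to see in responses, run the sync again or adjust the schedule.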

Sync Schedule

Using different IdPs such as Okta, Entra ID, or Ping Identity

For more information about how to set up Amazon Q Business with other identity providers (IdPs) as your SAML 2.0-aligned IdP, see Creating an Amazon Q Business application using Identity Federation through IAM.

Limitations

For more details about the limitations of the Amazon Q Business Aurora (PostgreSQL) connector, see Known limitations for the Aurora (PostgreSQL) connector.

Clean up

To avoid incurring future charges and to clean up unused roles and policies, delete the resources you created:

  1. If you created a Secrets Manager secret to store the database password, delete the secret.
  2. Delete the data source IAM role. You can find the role ARN on the data source page.
  3. Delete the Amazon Q application:
    1. On the Amazon Q console, choose Applications in the navigation pane.
    2. Select your application and on the Actions menu, choose Delete.
    3. To confirm deletion, enter delete in the field and choose Delete.
    4. Wait until you get the confirmation message; the process can take up to 15 minutes.
  4. Delete your IAM Identity Center instance.

Conclusion

Amazon Q Business unlocks powerful generative AI capabilities, allowing you to gain intelligent insights from your Aurora PostgreSQL-Compatible data through natural language querying and generation. By following the steps outlined in this post, you can seamlessly connect your Aurora PostgreSQL database to Amazon Q Business and empower your developers and end-users to interact with structured data in a more intuitive and conversational manner.

To learn more about the Amazon Q Business Aurora (PostgreSQL) connector, refer to Connecting Amazon Q Business to Aurora (PostgreSQL) using the console.


About the Authors

Moumita Dutta is a Technical Account Manager at Amazon Web Services. With a focus on financial services industry clients, she delivers top-tier enterprise support, collaborating closely with them to optimize their AWS experience. Additionally, she is a member of the AI/ML community and serves as a generative AI expert at AWS. In her leisure time, she enjoys gardening, hiking, and camping.

Manoj CS is a Solutions Architect at AWS, based in Atlanta, Georgia. He specializes in assisting customers in the telecommunications industry to build innovative solutions on the AWS platform. With a passion for generative AI, he dedicates his free time to exploring this field. Outside of work, Manoj enjoys spending quality time with his family, gardening, and traveling.

Gopal Gupta is a Software Development Engineer at Amazon Web Services. With a passion for software development and expertise in this domain, he designs and develops highly scalable software solutions.

How Tealium built a chatbot evaluation platform with Ragas and Auto-Instruct using AWS generative AI services

This post was co-written with Varun Kumar from Tealium

Retrieval Augmented Generation (RAG) pipelines are popular for generating domain-specific outputs based on external data that’s fed in as part of the context. However, there are challenges with evaluating and improving such systems. In this engagement, two open-source libraries, Ragas (a library for RAG evaluation) and Auto-Instruct, were used with Amazon Bedrock to power a framework that evaluates and improves a RAG system.

In this post, we illustrate the importance of generative AI in the collaboration between Tealium and the AWS Generative AI Innovation Center (GenAIIC) team by automating the following:

  • Evaluating the retriever and the generated answer of a RAG system based on the Ragas Repository powered by Amazon Bedrock.
  • Generating improved instructions for each question-and-answer pair using an automatic prompt engineering technique based on the Auto-Instruct Repository. An instruction refers to a general direction or command given to the model to guide generation of a response. These instructions were generated using Anthropic’s Claude on Amazon Bedrock.
  • Providing a UI for a human-based feedback mechanism that complements an evaluation system powered by Amazon Bedrock.

Amazon Bedrock is a fully managed service that makes popular foundation models (FMs) available through an API, so you can choose from a wide range of FMs to find the model that’s best suited for your use case. Because Amazon Bedrock is serverless, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications without having to manage any infrastructure.

Tealium background and use case

Tealium is a leader in real-time customer data integration and management. They empower organizations to build a complete infrastructure for collecting, managing, and activating customer data across channels and systems. Tealium uses AI capabilities to integrate data and derive customer insights at scale. Their AI vision is to provide their customers with an active system that continuously learns from customer behaviors and optimizes engagement in real time.

Tealium has built a question and answer (QA) bot using a RAG pipeline to help identify common issues and answer questions about using the platform. The bot is expected to act as a virtual assistant to answer common questions, identify and solve issues, monitor platform health, and provide best practice suggestions, all aimed at helping Tealium customers get the most value from their customer data platform.

The primary goal of this solution with Tealium was to evaluate and improve the RAG solution that Tealium uses to power their QA bot. This was achieved by building the following:

  • An evaluation pipeline.
  • An error correction mechanism to semi-automatically improve upon the metrics generated from evaluation. In this engagement, automatic prompt engineering was the only technique used, but others, such as different chunking strategies and using semantic instead of hybrid search, can be explored depending on your use case.
  • A human-in-the-loop feedback system that allows a human to approve or disapprove RAG outputs.

Amazon Bedrock was vital in powering an evaluation pipeline and error correction mechanism because of its flexibility in choosing a wide range of leading FMs and its ability to customize models for various tasks. This allowed for testing of many types of specialized models on specific data to power such frameworks. The value of Amazon Bedrock in text generation for automatic prompt engineering and text summarization for evaluation helped tremendously in the collaboration with Tealium. Lastly, Amazon Bedrock allowed for more secure generative AI applications, giving Tealium full control over their data while also encrypting it at rest and in transit.

Solution prerequisites

To test the Tealium solution, start with the following:

  1. Get access to an AWS account.
  2. Create a SageMaker domain instance.
  3. Obtain access to the following models on Amazon Bedrock: Anthropic’s Claude Instant, Claude v2, Claude 3 Haiku, and Titan Embeddings G1 – Text. The evaluation using Ragas can be performed using any foundation model (FM) that’s available on Amazon Bedrock. Automatic prompt engineering must use Anthropic’s Claude v2, v2.1, or Claude Instant.
  4. Obtain a golden set of question and answer pairs. Specifically, you need to provide examples of questions that you will ask the RAG bot and their expected ground truths.
  5. Clone automatic prompt engineering and human-in-the-loop repositories. If you want access to a Ragas repository with prompts favorable towards Anthropic Claude models available on Amazon Bedrock, clone and navigate through this repository and this notebook.

The code repositories allow for flexibility of various FMs and customized models with minimal updates, illustrating Amazon Bedrock’s value in this engagement.

Solution overview

The following diagram illustrates a sample solution architecture that includes an evaluation framework, error correction technique (Auto-Instruct and automatic prompt engineering), and human-in-the-loop. As you can see, generative AI is an important part of the evaluation pipeline and the automatic prompt engineering pipeline.

The workflow consists of the following steps:

  1. You first enter a query into the Tealium RAG QA bot. The RAG solution uses FAISS to retrieve an appropriate context for the specified query. Then, it outputs a response.
  2. Ragas takes in this query, context, answer, and a ground truth that you input, and calculates faithfulness, context precision, context recall, answer correctness, answer relevancy, and answer similarity. Ragas can be integrated with Amazon Bedrock (see the Ragas section of the linked notebook). This illustrates how Amazon Bedrock can be integrated into different frameworks.
  3. If any of the metrics are below a certain threshold, the specific question and answer pair is run by the Auto-Instruct library, which generates candidate instructions using Amazon Bedrock. Various FMs can be used for this text generation use case.
  4. The new instructions are appended to the original query so that it can be rerun by the Tealium RAG QA bot.
  5. The QA bot runs an evaluation to determine whether improvements have been made. Steps 3 and 4 can be iterated until all metrics are above a certain threshold. In addition, you can set a maximum number of times steps 3 and 4 are iterated to prevent an infinite loop.
  6. A human-in-the-loop UI is used to allow a subject matter expert (SME) to provide their own evaluation on given model outputs. This can also be used to provide guard rails against a system powered by generative AI.
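
Steps 3 through 5 form a bounded improvement loop. The following is a minimal sketch of that control flow, not the code from the engagement: `answer_fn`, `score_fn`, and `instruct_fn` are hypothetical callables standing in for the RAG QA bot, the Ragas evaluation, and the Auto-Instruct step, and the default threshold and round limit are arbitrary assumptions.

```python
from typing import Callable, Dict, Tuple


def improve_answer(
    question: str,
    answer_fn: Callable[[str], str],                    # stands in for the RAG QA bot
    score_fn: Callable[[str, str], Dict[str, float]],   # stands in for the Ragas evaluation
    instruct_fn: Callable[[str, str], str],             # stands in for instruction generation
    threshold: float = 0.7,                             # assumed value
    max_rounds: int = 3,                                # guards against an infinite loop
) -> Tuple[str, Dict[str, float], int]:
    """Re-prompt until every metric reaches the threshold or the rounds run out."""
    answer = answer_fn(question)
    scores = score_fn(question, answer)
    rounds = 0
    while min(scores.values()) < threshold and rounds < max_rounds:
        instruction = instruct_fn(question, answer)
        prompt = f"{question}\n\n{instruction}"  # append the instruction to the original query
        answer = answer_fn(prompt)
        scores = score_fn(question, answer)
        rounds += 1
    return answer, scores, rounds


# Toy stand-ins so the loop can be run without any model calls.
def toy_bot(prompt: str) -> str:
    return "specific answer" if "Be specific." in prompt else "vague answer"


def toy_scores(question: str, answer: str) -> Dict[str, float]:
    return {"faithfulness": 0.9 if answer.startswith("specific") else 0.4}


def toy_instruction(question: str, answer: str) -> str:
    return "Be specific."


print(improve_answer("How do I add a data source?", toy_bot, toy_scores, toy_instruction))
```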

In the following sections, we discuss how an example question, its context, its answer (RAG output) and ground truth (expected answer) can be evaluated and revised for a more ideal output. The evaluation is done using Ragas, a RAG evaluation library. Then, prompts and instructions are automatically generated based on their relevance to the question and answer. Lastly, you can approve or disapprove the RAG outputs based on the specific instruction generated from the automatic prompt engineering step.

Out-of-scope

Error correction and human-in-the-loop are two important aspects in this post. However, for each component, the following is out-of-scope, but can be improved upon in future iterations of the solution:

Error correction mechanism

  • Automatic prompt engineering is the only method used to correct the RAG solution. This engagement didn’t cover other techniques to improve the RAG solution, such as using Amazon Bedrock to find optimal chunking strategies, vector stores, models, semantic or hybrid search, and other mechanisms. Further testing is needed to evaluate whether FMs from Amazon Bedrock can be good decision makers for such parameters of a RAG solution.
  • Based on the technique presented for automatic prompt engineering, there might be opportunities to optimize the cost. This wasn’t analyzed during the engagement. Disclaimer: The technique described in this post might not be the most optimal approach in terms of cost.

Human-in-the-loop

  • SMEs provide their evaluation of the RAG solution by approving and disapproving FM outputs. This feedback is stored in the user’s file directory. There is an opportunity to improve upon the model based on this feedback, but this isn’t touched upon in this post.

Ragas – Evaluation of RAG pipelines

Ragas is a framework that helps evaluate a RAG pipeline. In general, RAG is a natural language processing technique that uses external data to augment an FM’s context. Therefore, this framework evaluates the ability for the bot to retrieve relevant context as well as output an accurate response to a given question. The collaboration between the AWS GenAIIC and the Tealium team showed the success of Amazon Bedrock integration with Ragas with minimal changes.

The inputs to Ragas include a set of questions, ground truths, answers, and contexts. For each question, an expected answer (ground truth), LLM output (answer), and a list of contexts (retrieved chunks) were inputted. Context recall, precision, answer relevancy, faithfulness, answer similarity, and answer correctness were evaluated using Anthropic’s Claude on Amazon Bedrock (any version). For your reference, here are the metrics that have been successfully calculated using Amazon Bedrock:

  • Faithfulness – This measures the factual consistency of the generated answer against the given context, so it requires the answer and retrieved context as an input. This is a two-step prompt where the generated answer is first broken down into multiple standalone statements and propositions. Then, the evaluation LLM validates the attribution of the generated statement to the context. If the attribution can’t be validated, it’s assumed that the statement is at risk of hallucination. The answer is scaled to a 0–1 range; the higher the better.
  • Context precision – This evaluates the relevancy of the context to the answer, or in other words, the retriever’s ability to capture the best context to answer your query. An LLM verifies whether the information in the given context is directly relevant to the question with a single “Yes” or “No” response. The context is passed in as a list, so if the list contains one chunk, the metric for context precision is either 0 (the context isn’t relevant to the question) or 1 (it is relevant). If the list contains multiple chunks, context precision is between 0–1, representing a weighted average precision calculation. The first chunk is weighted more heavily than the second chunk, which is weighted more heavily than the third chunk, and so on, so the metric takes into account the order in which the chunks are returned as contexts.
  • Context recall – This measures the alignment between the context and the expected RAG output, the ground truth. Similar to faithfulness, each statement in the ground truth is checked to see if it is attributed to the context (thereby evaluating the context).
  • Answer similarity – This assesses the semantic similarity between the RAG output (answer) and expected answer (ground truth), with a range between 0–1. A higher score signifies better performance. First, the embeddings of answer and ground truth are created, and then a score between 0–1 is predicted, representing the semantic similarity of the embeddings using a cross encoder Tiny BERT model.
  • Answer relevancy – This focuses on how pertinent the generated RAG output (answer) is to the question. A lower score is assigned to answers that are incomplete or contain redundant information. To calculate this score, the LLM is asked to generate multiple questions from a given answer. Then, using an Amazon Titan Embeddings model, embeddings are generated for the generated questions and the actual question. The metric is the mean cosine similarity between the generated questions and the actual question.
  • Answer correctness – This is the accuracy between the generated answer and the ground truth. This is calculated from the semantic similarity metric between the answer and the ground truth in addition to a factual similarity by looking at the context. A threshold value is used if you want to employ a binary 0 or 1 answer correctness score, otherwise a value between 0–1 is generated.
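
The arithmetic behind three of these metrics can be sketched in a few lines. The following is an illustration of the calculations described above, not the Ragas source code: the function names are assumptions, the Yes/No judgments and embedding vectors are supplied by hand instead of by an LLM or an embeddings model, and context precision is written as an average-precision calculation, which is one common way to realize the rank weighting described.

```python
import math


def context_precision(relevant: list[bool]) -> float:
    """Average precision over ranked chunks; relevant[i] is the Yes/No verdict for chunk i."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank, counted only for relevant chunks
    return total / hits if hits else 0.0


def faithfulness(supported: list[bool]) -> float:
    """Share of answer statements that could be attributed to the context."""
    return sum(supported) / len(supported) if supported else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def answer_relevancy(question_vec: list[float], generated_vecs: list[list[float]]) -> float:
    """Mean cosine similarity between the actual question and the generated questions."""
    return sum(cosine(question_vec, g) for g in generated_vecs) / len(generated_vecs)


print(context_precision([True, False]))   # 1.0: the relevant chunk is ranked first
print(context_precision([False, True]))   # 0.5: the same chunk ranked second scores lower
print(faithfulness([True, True, False, True]))  # 0.75
```

The two context precision calls show why chunk order matters: the same relevant chunk contributes less when it appears later in the list.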

AutoPrompt – Automatically generate instructions for RAG

Secondly, generative AI services were shown to successfully generate and select instructions for prompting FMs. In a nutshell, instructions are generated by an FM that best map a question and context to the RAG QA bot answer based on a certain style. This process was done using the Auto-Instruct library. The approach harnesses the ability of FMs to produce candidate instructions, which are then ranked using a scoring model to determine the most effective prompts.

First, you ask an Anthropic Claude model on Amazon Bedrock to generate an instruction for a set of inputs (question and context) that map to an output (answer). The FM is then asked to generate a specific type of instruction, such as a one-paragraph instruction, one-sentence instruction, or step-by-step instruction. Many candidate instructions are then generated. Look at the generate_candidate_prompts() function to see the logic in code.

Then, the resulting candidate instructions are tested against each other using an evaluation FM. To do this, first, each instruction is compared against all other instructions. Then, the evaluation FM is used to evaluate the quality of the prompts for a given task (query plus context to answer pairs). The evaluation logic for a sample pair of candidate instructions is shown in the test_candidate_prompts() function.

This outputs the most ideal prompt generated by the framework. For each question-and-answer pair, the output includes the best instruction, second best instruction, and third best instruction.
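
The comparison step can be pictured as a round-robin tournament. The sketch below is not the Auto-Instruct implementation or the logic in test_candidate_prompts(); it assumes a hypothetical `prefer` callable that returns whichever of two instructions the evaluation FM judges better, and uses a trivial stand-in (prefer the longer instruction) so that it runs without model calls.

```python
from itertools import combinations
from typing import Callable, List


def rank_instructions(
    candidates: List[str],
    prefer: Callable[[str, str], str],
    top_k: int = 3,
) -> List[str]:
    """Compare every pair once; each win earns a point; return the top_k candidates."""
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        wins[prefer(a, b)] += 1
    return sorted(candidates, key=lambda c: wins[c], reverse=True)[:top_k]


def toy_prefer(a: str, b: str) -> str:
    # Stand-in for the evaluation FM: simply prefers the longer instruction.
    return a if len(a) >= len(b) else b


candidates = [
    "Answer briefly.",
    "Answer step by step, citing the retrieved context.",
    "Answer in one short paragraph.",
    "Answer.",
]
for place, instruction in enumerate(rank_instructions(candidates, toy_prefer), start=1):
    print(place, instruction)
```

Comparing every pair costs a number of judge calls that grows quadratically with the number of candidates, which is one reason the cost of this phase is worth measuring.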

For a demonstration of performing automatic prompt engineering (and calling Ragas):

  • Navigate through the following notebook.
  • Code snippets for how candidate prompts are generated and evaluated are included in this source file with their associated prompts included in this config file.

You can review the full repository for automatic prompt engineering using FMs from Amazon Bedrock.

Human-in-the-loop evaluation

So far, you have learned about the applications of FMs in their generation of quantitative metrics and prompts. However, depending on the use case, they need to be aligned with human evaluators’ preferences to have ultimate confidence in these systems. This section presents a HITL web UI (Streamlit) demonstration, showing a side-by-side comparison of instructions and question inputs and RAG outputs. This is shown in the following image:

HITL-homepage-1

The structure of the UI is:

  • On the left, select an FM and two instruction templates (as marked by the index number) to test. After you choose Start, you will see the instructions on the main page.
  • The top text box on the main page is the query.
  • The text box below that is the first instruction sent to the LLM as chosen by the index number in the first bullet point.
  • The text box below the first instruction is the second instruction sent to the LLM as chosen by the index number in the first bullet point.
  • Then comes the model output for Prompt A, which is the output when the first instruction and query is sent to the LLM. This is compared against the model output for Prompt B, which is the output when the second instruction and query is sent to the LLM.
  • You can give your feedback for the two outputs, as shown in the following image.

After you input your results, they’re saved in a file in your directory. These can be used for further enhancement of the RAG solution.
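
One simple way to persist this kind of feedback is to append one JSON record per judgment to a local file. The following sketch is an assumption about what such a file could look like, not the format used by the repository: the function name, field names, and file name are all hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_feedback(path, query: str, instruction_index: int, output: str, approved: bool) -> dict:
    """Append one SME judgment to a JSON Lines file and return the stored record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "instruction_index": instruction_index,
        "output": output,
        "approved": approved,
    }
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Usage (hypothetical file name and values):
#   save_feedback("hitl_feedback.jsonl", "How do I add a data source?", 0, "Prompt A output", True)
```

Because each line is an independent record, the file can be loaded later to compare approval rates between instruction templates.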

Follow the instructions in this repository to run your own human-in-the-loop UI.

Chatbot live evaluation metrics

Amazon Bedrock has been used to continuously analyze the bot performance. The following are the latest results using Ragas:

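| Statistic | Context Utilization | Faithfulness | Answer Relevancy |
|---|---|---|---|
| Count | 714 | 704 | 714 |
| Mean | 0.85014 | 0.856887 | 0.7648831 |
| Standard Deviation | 0.357184 | 0.282743 | 0.304744 |
| Min | 0 | 0 | 0 |
| 25% | 1 | 1 | 0.786385 |
| 50% | 1 | 1 | 0.879644 |
| 75% | 1 | 1 | 0.923229 |
| Max | 1 | 1 | 1 |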

The Amazon Bedrock-based chatbot with Amazon Titan embeddings achieved 85% context utilization, 86% faithfulness, and 76% answer relevancy.

Conclusion

Overall, the AWS team was able to use various FMs on Amazon Bedrock with the Ragas library to evaluate Tealium’s RAG QA bot, given a query, RAG response, retrieved context, and expected ground truth. It did this by determining whether:

  1. The RAG response is attributed to the context.
  2. The context is attributed to the query.
  3. The ground truth is attributed to the context.
  4. The RAG response is relevant to the question and similar to the ground truth.

Therefore, it was able to evaluate a RAG solution’s ability to retrieve relevant context and answer the sample question accurately.

In addition, an FM was able to generate multiple instructions from a question-and-answer pair and rank them based on the quality of the responses. After instructions were generated, it was able to slightly reduce errors in the LLM response. The human-in-the-loop demonstration provides a side-by-side view of outputs for different prompts and instructions. This was an enhanced thumbs up/thumbs down approach to further improve inputs to the RAG bot based on human feedback.

Some next steps with this solution include the following:

  • Improving RAG performance using different models or different chunking strategies based on specific metrics
  • Testing out different strategies to optimize the cost (number of FM calls) to evaluate generated instructions in the automatic prompt engineering phase
  • Allowing SME feedback in the human evaluation step to automatically improve upon ground truth or instruction templates

The value of Amazon Bedrock was shown throughout the collaboration with Tealium. The flexibility of Amazon Bedrock in choosing a wide range of leading FMs and the ability to customize models for specific tasks allow Tealium to power the solution in specialized ways with minimal updates in the future. The importance of Amazon Bedrock in text generation and success in evaluation were shown in this engagement, providing potential and flexibility for Tealium to build on the solution. Its emphasis on security allows Tealium to be confident in building and delivering more secure applications.

As stated by Matt Gray, VP of Global Partnerships at Tealium,

“In collaboration with the AWS Generative AI Innovation Center, we have developed a sophisticated evaluation framework and an error correction system, utilizing Amazon Bedrock, to elevate the user experience. This initiative has resulted in a streamlined process for assessing the performance of the Tealium QA bot, enhancing its accuracy and reliability through advanced technical metrics and error correction methodologies. Our partnership with AWS and Amazon Bedrock is a testament to our dedication to delivering superior outcomes and continuing to innovate for our mutual clients.”

This is just one of the ways AWS enables builders to deliver generative AI-based solutions. You can get started with Amazon Bedrock and see how it can be integrated in example code bases today. If you’re interested in working with the AWS generative AI services, reach out to the GenAIIC.


About the authors

Suren Gunturu is a Data Scientist working in the Generative AI Innovation Center, where he works with various AWS customers to solve high-value business problems. He specializes in building ML pipelines using large language models, primarily through Amazon Bedrock and other AWS Cloud services.

Varun Kumar is a Staff Data Scientist at Tealium, leading its research program to provide high-quality data and AI solutions to its customers. He has extensive experience in training and deploying deep learning and machine learning models at scale. Additionally, he is accelerating Tealium’s adoption of foundation models in its workflow including RAG, agents, fine-tuning, and continued pre-training.

Vidya Sagar Ravipati is a Science Manager at the Generative AI Innovation Center, where he leverages his vast experience in large-scale distributed systems and his passion for machine learning to help AWS customers across different industry verticals accelerate their AI and cloud adoption.

EBSCOlearning scales assessment generation for their online learning content with generative AI

EBSCOlearning offers corporate learning and educational and career development products and services for businesses, educational institutions, and workforce development organizations. As a division of EBSCO Information Services, EBSCOlearning is committed to enhancing professional development and educational skills.

In this post, we illustrate how EBSCOlearning partnered with AWS Generative AI Innovation Center (GenAIIC) to use the power of generative AI in revolutionizing their learning assessment process. We explore the challenges faced in traditional question-answer (QA) generation and the innovative AI-driven solution developed to address them.

In the rapidly evolving landscape of education and professional development, the ability to effectively assess learners’ understanding of content is crucial. EBSCOlearning, a leader in the realm of online learning, recognized this need and embarked on an ambitious journey to transform their assessment creation process using cutting-edge generative AI technology.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. It is well positioned to address these types of tasks.

The challenge: Scaling quality assessments

EBSCOlearning’s learning paths—comprising videos, book summaries, and articles—form the backbone of a multitude of educational and professional development programs. However, the company faced a significant hurdle: creating high-quality, multiple-choice questions for these learning paths was a time-consuming and resource-intensive process.

Traditionally, subject matter experts (SMEs) would meticulously craft each question set to be relevant and accurate and to align with learning objectives. Although this approach guaranteed quality, it was slow, expensive, and difficult to scale. As EBSCOlearning’s content library continues to grow, so does the need for a more efficient solution.

Enter AI: A promising solution

Recognizing the potential of AI to address this challenge, EBSCOlearning partnered with the GenAIIC to develop an AI-powered question generation system. The goal was ambitious: to create an automated solution that could produce high-quality, multiple-choice questions at scale, while adhering to strict guidelines on bias, safety, relevance, style, tone, meaningfulness, clarity, and diversity, equity, and inclusion (DEI). The QA pairs had to be grounded in the learning content and test different levels of understanding, such as recall, comprehension, and application of knowledge. Additionally, explanations were needed to justify why an answer was correct or incorrect.

The team faced several key challenges:

  • Making sure AI-generated questions matched the quality of human-created ones
  • Developing a system that could handle diverse content types
  • Implementing robust quality control measures
  • Creating a scalable solution that could grow with EBSCOlearning’s needs

Crafting the AI solution

The GenAIIC team developed a sophisticated pipeline using the power of large language models (LLMs), specifically Anthropic’s Claude 3.5 Sonnet in Amazon Bedrock. This pipeline is illustrated in the following figure and consists of several key components: QA generation, multifaceted evaluation, and intelligent revision.

QA generation

The process begins with the QA generation component. This module takes in the learning content—which could be a video transcript, book summary, or article—and generates an initial set of multiple-choice questions using in-context learning.

EBSCOlearning experts and GenAIIC scientists worked together to develop a sophisticated prompt engineering approach using Anthropic’s Claude 3.5 Sonnet model in Amazon Bedrock. To align with EBSCOlearning’s high standards, the prompt includes:

  • Detailed guidelines on what constitutes a high-quality question, covering aspects such as relevance, clarity, difficulty level, and objectivity
  • Instructions to match the conversational style of the original content
  • Directives to include diversity and inclusivity in the language and scenarios used
  • Few-shot examples to enable in-context learning for the AI model

The system aims to generate up to seven questions for each piece of content, each with four answer choices including a correct answer, and detailed explanations for why each answer is correct or incorrect.
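
The output format described above (up to seven questions, four choices each, one correct answer, and an explanation per choice) lends itself to a simple structural check before any deeper evaluation. The following is a minimal sketch under stated assumptions: the `Question` dataclass, its field names, and `validate_question_set` are hypothetical and are not taken from the EBSCOlearning pipeline.

```python
from dataclasses import dataclass

MAX_QUESTIONS = 7   # "up to seven questions" per piece of content
NUM_CHOICES = 4     # four answer choices per question

@dataclass
class Question:
    text: str
    choices: dict        # label -> answer text, e.g. {"A": "...", "B": "..."}
    correct: str         # label of the correct choice
    explanations: dict   # label -> why that choice is correct or incorrect

def validate_question_set(questions):
    """Return a list of human-readable structural problems (empty if none)."""
    problems = []
    if len(questions) > MAX_QUESTIONS:
        problems.append(f"too many questions: {len(questions)}")
    for i, q in enumerate(questions, start=1):
        if len(q.choices) != NUM_CHOICES:
            problems.append(f"Q{i}: expected {NUM_CHOICES} choices, got {len(q.choices)}")
        if q.correct not in q.choices:
            problems.append(f"Q{i}: correct label {q.correct!r} is not a choice")
        missing = sorted(set(q.choices) - set(q.explanations))
        if missing:
            problems.append(f"Q{i}: no explanation for {missing}")
    return problems
```

A check like this only confirms the shape of the output. It says nothing about whether the questions are accurate or well written, which is what the evaluation stages address.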

Multifaceted evaluation

After the initial set of questions is generated, it undergoes a rigorous evaluation process. This multifaceted approach makes sure that the questions adhere to all quality standards and guidelines. The evaluation process includes three phases: LLM-based guideline evaluation, rule-based checks, and a final evaluation.

LLM-based guideline evaluation

In collaboration with EBSCOlearning, the GenAIIC team manually developed a comprehensive set of evaluation guidelines covering fundamental requirements for multiple-choice questions, such as validity, accuracy, and relevance. They also incorporated EBSCOlearning’s specific standards for diversity, equity, inclusion, and belonging (DEIB), as well as style and tone preferences. The AI system evaluates each question according to the established guidelines and generates a structured output that includes detailed reasoning along with a rating on a three-point scale, where 1 indicates invalid, 2 indicates partially valid, and 3 indicates valid. This rating is later used for revising the questions.
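
To make the idea of a structured evaluation output concrete, the following sketch parses one evaluation record and checks that its rating is on the three-point scale described above. The JSON field names (`guideline`, `reasoning`, `rating`) and the `parse_evaluation` helper are assumptions for illustration. The actual output schema used by the team is not given in this post.

```python
import json

# The three-point scale described in the text.
RATING_LABELS = {1: "invalid", 2: "partially valid", 3: "valid"}

def parse_evaluation(raw):
    """Parse one evaluation record and check the rating is 1, 2, or 3."""
    record = json.loads(raw)
    rating = record["rating"]
    # Reject booleans and out-of-range values explicitly.
    if type(rating) is not int or rating not in RATING_LABELS:
        raise ValueError(f"rating must be 1, 2, or 3, got {rating!r}")
    return {
        "guideline": record["guideline"],
        "reasoning": record["reasoning"],
        "rating": rating,
        "label": RATING_LABELS[rating],
    }

# Example record shaped like a model response (contents are made up).
raw = '{"guideline": "relevance", "reasoning": "Grounded in the summary.", "rating": 3}'
print(parse_evaluation(raw)["label"])
```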

This process presented several significant challenges. The primary challenge was making sure that the AI model could effectively assess multiple complex guidelines simultaneously without overlooking any crucial aspects. This was particularly difficult because the evaluation needed to consider so many different factors—all while maintaining consistency across questions.

To overcome the challenge of LLMs potentially overlooking guidelines when presented with them all at one time, the evaluation process was split into smaller manageable tasks by getting the AI model to focus on fewer guidelines at a time or evaluating smaller chunks of questions in parallel. This way, each guideline receives focused attention, resulting in a more accurate and comprehensive evaluation. Additionally, the system was designed with modularity in mind, streamlining the addition or removal of guidelines. Because of this flexibility, the evaluation process can adapt quickly to new requirements or changes in educational standards.
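
The splitting strategy described above can be sketched as follows: guidelines are grouped into small batches, each batch is evaluated in its own call, and the calls run in parallel. This is an illustrative sketch only. `evaluate_batch` is a stand-in for a real model call, and the batch size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def evaluate_batch(question, guidelines):
    """Stand-in for one LLM call that rates a question on a few guidelines.

    A real implementation would send the question and this small group of
    guidelines to the model; here every guideline simply gets a rating of 3.
    """
    return {g: 3 for g in guidelines}

def evaluate_question(question, guidelines, batch_size=2, evaluate=evaluate_batch):
    """Rate a question on all guidelines, a few guidelines per call, in parallel."""
    batches = chunk(guidelines, batch_size)
    ratings = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        for result in pool.map(lambda b: evaluate(question, b), batches):
            ratings.update(result)
    return ratings

print(evaluate_question("What is X?", ["validity", "accuracy", "relevance"]))
```

Because the guidelines live in a plain list, adding or removing one does not require changing the evaluation code, which mirrors the modularity the team describes.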

By generating detailed, structured feedback for each question, including numerical ratings, concise summaries, and in-depth reasoning, the system provides invaluable insights for continual improvement. This level of detail allows for a nuanced understanding of how well each question aligns with the established criteria, offering possibilities for targeted enhancements to the question generation process.

Rule-based checks

Some quantitative aspects of question quality proved challenging for the AI to consistently evaluate. For instance, the team noticed that correct answers were often longer than incorrect ones, making them straightforward to identify. To address this, they developed a custom algorithm that analyzes answer lengths and flags potential issues without relying on the LLM’s judgment.
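
The post does not describe the team's custom algorithm in detail. One simple way to implement a length check of this kind is to compare the correct answer's length with the average length of the incorrect answers. In the sketch below, the function name, the character-count measure, and the 1.5 ratio are all assumptions chosen for illustration.

```python
def flag_length_bias(choices, correct, ratio=1.5):
    """Flag a question whose correct answer is much longer than the distractors.

    `choices` maps labels to answer text; `correct` is the correct label.
    Returns True when the correct answer's character count exceeds `ratio`
    times the average character count of the incorrect answers.
    """
    distractors = [text for label, text in choices.items() if label != correct]
    if not distractors:
        return False  # nothing to compare against
    avg_distractor_len = sum(len(t) for t in distractors) / len(distractors)
    return len(choices[correct]) > ratio * avg_distractor_len

example = {
    "A": "Yes",
    "B": "No",
    "C": "A long and carefully qualified statement that stands out",
    "D": "Maybe",
}
print(flag_length_bias(example, correct="C"))
```

A flagged question could then be sent to the revision step with feedback asking for answer choices of similar length.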

Final evaluation

Beyond evaluating individual questions, the system also assesses the entire set of questions for a given piece of content. This step checks for duplicates, promotes diversity in the types of questions asked, and verifies that the set as a whole provides a comprehensive assessment of the learning material.
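
The post does not say how duplicates are detected. As one possible approach, the sketch below uses lexical similarity from Python's standard `difflib` module to find near-duplicate question texts. The 0.85 threshold and the `find_near_duplicates` helper are assumptions, and this method would not catch questions that are reworded but ask the same thing.

```python
from difflib import SequenceMatcher
from itertools import combinations

def find_near_duplicates(questions, threshold=0.85):
    """Return index pairs of questions whose text is highly similar."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(questions), 2):
        similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if similarity >= threshold:
            pairs.append((i, j))
    return pairs

questions = [
    "What does the model assert about human values?",
    "What does the model assert about human values?",
    "Which benefit comes from clarity about personal values?",
]
print(find_near_duplicates(questions))
```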

Intelligent revision

One key component of the pipeline is the intelligent revision module. This is where the iterative improvement happens. When the evaluation process flags issues with a question (whether a guideline violation or a structural problem), the question is sent back for revision. The AI model receives specific feedback on the violation and is directed to fix the issue by revising or replacing the QA pair.

The power of iteration

The whole pipeline goes through multiple iterations until the question aligns with all of the specified quality standards. If after several attempts a question still doesn’t meet the criteria, it’s flagged for human review. This iterative approach makes sure that the final output isn’t merely a raw AI generation, but a refined product that has gone through multiple checks and improvements.

Throughout the development process, the team maintained a strong focus on iterative tracking of changes. They implemented a unique history tracking system so they could monitor the evolution of each question through multiple rounds of generation, evaluation, and revision. This approach not only provided valuable insights into the AI model’s decision-making process, but also allowed for targeted improvements to the system over time. By closely tracking the AI model’s performance across multiple iterations, the team was able to refine its prompts and evaluation criteria, resulting in a significant improvement in output quality.
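
The evaluate-and-revise loop and the per-question history described in this section can be sketched as follows. This is a simplified illustration, not the team's implementation: `evaluate` and `revise` are caller-supplied stand-ins for the model-backed steps, and the attempt limit of three is an assumption.

```python
def refine_question(question, evaluate, revise, max_attempts=3):
    """Evaluate and revise a question until it passes or attempts run out.

    `evaluate(question)` returns a list of issues (empty means it passes);
    `revise(question, issues)` returns a new version of the question.
    Returns (final_question, status, history).
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        issues = evaluate(question)
        # Record every round so the question's evolution can be inspected later.
        history.append({"attempt": attempt, "question": question, "issues": issues})
        if not issues:
            return question, "accepted", history
        if attempt < max_attempts:
            question = revise(question, issues)
    # Still failing after the last attempt: hand off to a person.
    return question, "needs_human_review", history

# Toy stand-ins: the only "guideline" here is that a question ends with "?".
def toy_evaluate(q):
    return [] if q.endswith("?") else ["missing question mark"]

def toy_revise(q, issues):
    return q + "?"

final, status, history = refine_question("What is AI", toy_evaluate, toy_revise)
print(final, status, len(history))
```

Keeping the full history rather than only the final version is what makes it possible to see which feedback led to which change.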

Scalability and robustness

With EBSCOlearning’s vast content library in mind, the team built scalability into the core of their solution. They implemented multithreading capabilities, allowing the system to process multiple pieces of content simultaneously. They also developed sophisticated retry mechanisms to handle potential API failures or invalid outputs so the system remained reliable even when processing large volumes of content.
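
A common way to combine concurrency with retries is to wrap the per-item work in a retry helper and run the wrapped function in a thread pool. The sketch below shows that pattern under stated assumptions: the helper names, the exponential backoff schedule, and the worker count are illustrative, and a production system would likely catch specific API exceptions rather than every `Exception`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def with_retries(func, max_attempts=3, base_delay=0.1):
    """Wrap `func` so failed calls are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
                time.sleep(base_delay * 2 ** (attempt - 1))
    return wrapper

def process_all(contents, process_one, max_workers=4):
    """Process many pieces of content concurrently, retrying each on failure."""
    safe_process = with_retries(process_one)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_process, contents))

# Toy usage: "processing" a piece of content just upper-cases it.
print(process_all(["video transcript", "book summary", "article"], str.upper))
```

Threads suit this workload because most of the time is spent waiting on network calls to the model rather than on local computation.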

The results: A game-changer for learning assessment

By combining these components—intelligent generation, comprehensive evaluation, and adaptive revision—EBSCOlearning and the GenAIIC team created a system that not only automates the process of creating assessment questions but does so with a level of quality that rivals human-created content. This pipeline represents a significant leap forward in the application of AI to educational content creation, promising to revolutionize how learning assessments are developed and delivered.

The impact of this AI-powered solution on EBSCOlearning’s assessment creation process has been nothing short of transformative. Feedback from EBSCOlearning’s subject matter experts has been overwhelmingly positive, with the AI-generated questions meeting or exceeding the quality of manually created ones in many cases.

Key benefits of the new system include:

  • Dramatically reduced time and cost for assessment creation
  • Consistent quality across a wide range of subjects and content types
  • Improved scalability, allowing EBSCOlearning to keep pace with their growing content library
  • Enhanced learning experiences for end users, with more diverse and engaging assessments

“The Generative AI Innovation Center’s automated solution for generating multiple-choice questions and answers considerably accelerated the timeline for deployment of assessments for our online learning platform. Their approach of leveraging advanced language models and implementing carefully constructed guidelines in collaboration with our subject matter experts and product management team has resulted in assessment material that is accurate, relevant, and of high quality. This solution is saving us considerable time and effort, and will enable us to scale assessments across the wide range of skills development resources on our platform.”

—Michael Laddin, Senior Vice President & General Manager, EBSCOlearning.

Here are two examples of generated QA pairs.

Question 1: What does the Consumer Relevancy model proposed by Crawford and Mathews assert about human values in business transactions?

  • A. Human values are less important than monetary value.
  • B. Human values and monetary value are equally important in all transactions.
  • C. Human values are more important than traditional value propositions.
  • D. Human values are irrelevant compared to monetary value in business transactions.

Correct answer: C

Answer explanations:

  • A. This is contrary to what the Consumer Relevancy model asserts. The model emphasizes the importance of human values over traditional value propositions.
  • B. While this might seem balanced, it doesn’t reflect the emphasis placed on human values in the Consumer Relevancy model. The model asserts that human values are more important.
  • C. This correctly reflects the assertion of the Consumer Relevancy model as described in the Book Summary. The model emphasizes the importance of human values over traditional value propositions.
  • D. This is the opposite of what the Consumer Relevancy model asserts. The model emphasizes the importance of human values, not their irrelevance.

Overall: The Book Summary states that the Consumer Relevancy model asserts that human values are more important than traditional value propositions and that businesses must recognize this need for human values as the contemporary currency of commerce.

Question 2: According to Sara N. King, David G. Altman, and Robert J. Lee, what is the primary benefit of leaders having clarity about their own values?

  • A. It guarantees success in all leadership positions.
  • B. It helps leaders make more fulfilling career choices.
  • C. It eliminates all conflicts in the workplace environment.
  • D. It ensures leaders always make ethically perfect decisions.

Correct answer: B

Answer explanations:

  • A. The Book Summary does not suggest that clarity about values guarantees success in all leadership positions. This is an overstatement of the benefits.
  • B. This is the correct answer, as the Book Summary states that clarity about values allows leaders to make more fulfilling career choices and recognize conflicts with their core values.
  • C. While understanding one’s values can help in managing conflicts, the Book Summary does not claim it eliminates all workplace conflicts. This is an exaggeration of the benefits.
  • D. The Book Summary does not state that clarity about values ensures leaders always make ethically perfect decisions. This is an unrealistic expectation not mentioned in the content.

Overall: The Book Summary emphasizes that having clarity about one’s own values allows leaders to make more fulfilling career choices and helps them recognize when they are participating in actions that conflict with their core values.

Looking to the future

For EBSCOlearning, this project is the first step towards their goal of scaling assessments across their entire online learning platform. They’re already planning to expand the system’s capabilities, including:

  • Adapting the solution to handle more complex, technical content
  • Incorporating additional question types beyond multiple-choice
  • Exploring ways to personalize assessments based on individual learner profiles

The potential applications of this technology extend far beyond EBSCOlearning’s current use case. From personalized learning paths to adaptive testing, the possibilities for AI in education and professional development are vast and exciting.

Conclusion: A new era of learning assessment

The collaboration between EBSCOlearning and the GenAIIC demonstrates the transformative power of AI when applied thoughtfully to real-world challenges. By combining cutting-edge technology with deep domain expertise, they’ve created a solution that not only solves a pressing business need but also has the potential to enhance learning experiences for millions of people. This solution is slated to produce assessment questions for hundreds and eventually thousands of learning paths in EBSCOlearning’s curriculum.

As we look to the future of education and professional development, it’s clear that AI will play an increasingly important role. The success of this project serves as a compelling example of how AI can be used to create more efficient, effective, and engaging learning experiences.

For businesses and educational institutions alike, the message is clear: embracing AI isn’t just about keeping up with technology trends—it’s about unlocking new possibilities to better serve learners and drive innovation in education. As EBSCOlearning’s journey shows, the future of learning assessment is here, and it’s powered by AI. Consider how such a solution can enrich your own e-learning content and delight your customers with high quality and on-point assessments. To get started, contact your AWS account manager. If you don’t have an AWS account manager, contact sales. Visit Generative AI Innovation Center to learn more about our program.


About the authors

Yasin Khatami is a Senior Applied Scientist at the Generative AI Innovation Center. With more than a decade of experience in artificial intelligence (AI), he implements state-of-the-art AI products for AWS customers to drive innovation, efficiency and value for customer platforms. His expertise is in generative AI, large language models (LLM), multi-agent techniques, and multimodal learning.

Yifu Hu is an Applied Scientist at the Generative AI Innovation Center. He develops machine learning and generative AI solutions for diverse customer challenges across various industries. Yifu specializes in creative problem-solving, with expertise in AI/ML technologies, particularly in applications of large language models and AI agents.

Aude Genevay is a Senior Applied Scientist at the Generative AI Innovation Center, where she helps customers tackle critical business challenges and create value using generative AI. She holds a PhD in theoretical machine learning and enjoys turning cutting-edge research into real-world solutions.

Mike Laddin is Senior Vice President & General Manager of EBSCOlearning, a division of EBSCO Information Services. EBSCOlearning offers highly acclaimed online products and services for companies, educational institutions, and workforce development organizations. Mike oversees a team of professionals focused on unlocking the potential of people and organizations with on-demand upskilling and microlearning solutions. He has over 25 years of experience as both an entrepreneur and software executive in the information services industry. Mike received an MBA from the Lally School of Management at Rensselaer Polytechnic Institute, and outside of work he is an avid boater.

Alyssa Gigliotti is a Content Strategy and Product Operations Manager at EBSCOlearning, where she collaborates with her team to design top-tier microlearning solutions focused on enhancing business and power skills. With a background in English, professional writing, and technical communications from UMass Amherst, Alyssa combines her expertise in language with a strategic approach to educational content. Alyssa’s in-depth knowledge of the product and voice of the customer allows her to actively engage in product development planning to ensure her team continuously meets the needs of users. Outside the professional sphere, she is both a talented artist and a passionate reader, continuously seeking inspiration from creative and literary pursuits.

Built for the Era of AI, NVIDIA RTX AI PCs Enhance Content Creation, Gaming, Entertainment and More

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.

NVIDIA and GeForce RTX GPUs are built for the era of AI.

RTX GPUs feature specialized AI Tensor Cores that can deliver more than 1,300 trillion operations per second (TOPS) of processing power for cutting-edge performance in gaming, creating, everyday productivity and more. Today there are more than 600 deployed AI-powered games and apps that are accelerated by RTX.

RTX AI PCs can help anyone start their AI journey and supercharge their work.

Every RTX AI PC comes with regularly updated NVIDIA Studio Drivers — fine-tuned in collaboration with developers — that enhance performance in top creative apps and are tested extensively to deliver maximum stability. Download the December Studio Driver today.

The importance of large language models (LLMs) continues to grow. Two benchmarks were introduced this week to spotlight LLM performance on various hardware: MLPerf Client v0.5 and Procyon AI Text Generation. These LLM-based benchmarks, which internal tests have shown accurately replicate real-world performance, are easy to run.

This holiday season, content creators can participate in the #WinterArtChallenge, running through February. Share winter-themed art on Facebook, Instagram or X with #WinterArtChallenge for a chance to be featured on NVIDIA Studio social media channels.

Advanced AI

With NVIDIA and GeForce RTX GPUs, AI elevates everyday tasks and activities, as covered in our AI Decoded blog series. For example, AI can enable:

Faster creativity: With Stable Diffusion, users can quickly create and refine images from text prompts to achieve their desired output. When using an RTX GPU, these results can be generated up to 2.2x faster than on an NPU. And thanks to software optimizations using the NVIDIA TensorRT SDK, the applications used to run these models, like ComfyUI, get an additional 60% boost.

Greater gaming: NVIDIA DLSS technology boosts frame rates and improves image quality, using AI to automatically generate pixels in video games. With ongoing improvements, including to Ray Reconstruction, DLSS enables richer visual quality for more immersive gameplay.

Enhanced entertainment: RTX Video Super Resolution uses AI to enhance video by removing compression artifacts and sharpening edges while upscaling video quality. RTX Video HDR converts any standard dynamic range video into vibrant high dynamic range, enabling more vivid, dynamic colors when streamed in Google Chrome, Microsoft Edge, Mozilla Firefox or VLC media player.

Improved productivity: The NVIDIA ChatRTX tech demo app connects a large language model, like Meta’s Llama, to a user’s data for quickly querying notes, documents or images. Free for RTX GPU owners, the custom chatbot provides quick, contextually relevant answers. Since it runs locally on Windows RTX PCs and workstations, results are fast and private.

This snapshot of AI capabilities barely scratches the surface of the technology’s possibilities. With an NVIDIA or GeForce RTX GPU-powered system, users can also supercharge their STEM studies and research, and tap into the NVIDIA Studio suite of AI-powered tools.

Decisions, Decisions

More than 200 powerful RTX AI PCs are capable of running advanced AI.

ASUS’ Vivobook Pro 16X comes with up to a GeForce RTX 4070 Laptop GPU, as well as a superbright 550-nit panel, ultrahigh contrast ratio and ultrawide 100% DCI-P3 color gamut. It’s available on Amazon and ASUS.com.

Dell’s Inspiron 16 Plus 7640 comes with up to a GeForce RTX 4060 Laptop GPU and a 16:10 aspect ratio display, ideal for users working on multiple projects. It boasts military-grade testing for added reliability and an easy-to-use, built-in Trusted Platform Module to protect sensitive data. It’s available on Amazon and Dell.com.

GIGABYTE’s AERO 16 OLED, equipped with up to a GeForce RTX 4070 Laptop GPU, is designed for professionals, designers and creators. The 16:10 thin-bezel 4K+ OLED screen is certified by multiple third parties to provide the best visual experience with X-Rite 2.0 factory-by-unit color calibration and Pantone Validated color calibration. It’s available on Amazon and GIGABYTE.com.

MSI’s Creator M14 comes with up to a GeForce RTX 4070 Laptop GPU, delivering a quantum leap in performance with DLSS 3 to enable lifelike virtual worlds with full ray tracing. Plus, its Max-Q suite of technologies optimizes system performance, power, battery life and acoustics for peak efficiency. Purchase one on Amazon or MSI.com.

These are just a few of the many RTX AI PCs available, with some on sale, including the Acer Nitro V, ASUS TUF 16″, HP Envy 16″ and Lenovo Yoga Pro 9i.

Follow NVIDIA Studio on Facebook, Instagram and X. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. 

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.
