This blog post is co-written with Gene Arnold from Alation.
To build a generative AI-based conversational application integrated with relevant data sources, an enterprise needs to invest time, money, and people. First, you need to build connectors to the data sources. Next, you need to index this data to make it available for a Retrieval Augmented Generation (RAG) approach where relevant passages are delivered with high accuracy to a large language model (LLM). To do this, you need to select an index that provides the capabilities to index the content for semantic and vector search, build the infrastructure to retrieve data, rank the answers, and build a feature-rich web application. Additionally, you might need to hire and staff a large team to build, maintain, and manage such a system.
Amazon Q Business is a fully managed generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Amazon Q Business can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, code, and enterprise systems. To do this, Amazon Q Business provides out-of-the-box native data source connectors that can index content into a built-in retriever, and it uses an LLM to provide accurate, well-written answers. A data source connector is a component of Amazon Q Business that helps to integrate and synchronize data from multiple repositories into one index. Amazon Q Business offers multiple prebuilt connectors to a large number of data sources, including ServiceNow, Atlassian Confluence, Amazon Simple Storage Service (Amazon S3), Microsoft SharePoint, Salesforce, and many more. For a full list of supported data source connectors, see Amazon Q Business connectors.
However, many organizations store relevant information in the form of unstructured data on company intranets or within file systems on corporate networks that are inaccessible to Amazon Q Business using its native data source connectors. You can now use the custom data source connector within Amazon Q Business to upload content to your index from a wider range of data sources.
Using an Amazon Q Business custom data source connector, you can gain insights into your organization’s third-party applications with the integration of generative AI and natural language processing. This post shows how to configure an Amazon Q Business custom connector and derive insights by creating a generative AI-powered conversation experience on AWS using Amazon Q Business, while using access control lists (ACLs) to restrict access to documents based on user permissions.
Alation is a data intelligence company serving more than 600 global enterprises, including 40% of the Fortune 100. Customers rely on Alation to realize the value of their data and AI initiatives. Headquartered in Redwood City, California, Alation is an AWS Specialization Partner and AWS Marketplace Seller with Data and Analytics Competency. Organizations trust Alation’s platform for self-service analytics, cloud transformation, data governance, and AI-ready data, fostering innovation at scale. In this post, we will showcase a sample of how Alation’s business policies can be integrated with an Amazon Q Business application using a custom data source connector.
Finding accurate answers from content in custom data sources using Amazon Q Business
After you integrate Amazon Q Business with data sources such as Alation, users can ask questions about the content of the indexed documents. For example:
- What are the top sections of the HR benefits policies?
- Who are the data stewards for my proprietary database sources?
Overview of a custom connector
A data source connector is a mechanism for integrating and synchronizing data from multiple repositories into one container index. Amazon Q Business offers multiple pre-built data source connectors that can connect to your data sources and help you create your generative AI solution with minimal configuration. However, if you have valuable data residing in repositories that the pre-built connectors don’t cover, you can use a custom connector.
When you connect Amazon Q Business to a data source and initiate the data synchronization process, Amazon Q Business crawls and adds documents from the data source to its index.
You would typically use an Amazon Q Business custom connector when you have a repository that Amazon Q Business doesn’t yet provide a data source connector for. For a custom connector, Amazon Q Business only provides metric information that you can use to monitor your data source sync jobs. You must create and run the crawler that determines which documents your data source indexes. A simple architectural representation of the steps involved is shown in the following figure.
Solution overview
The integration of Alation’s business policies shown here is for demonstration purposes only. We recommend running similar scripts only on your own data sources after consulting with the team that manages them, and following the terms of service for any source that you fetch data from. The steps for other custom data sources are very similar, except for the part that connects to Alation and fetches data from it. To crawl and index content in Alation, you configure an Amazon Q Business custom connector as a data source in your Amazon Q Business application.
Prerequisites
For this walkthrough, you should have the following prerequisites:
- An AWS account
- Access to the Alation service with the ability to create new policies and access tokens. You can verify that you have access by navigating to https://[[your-domain]].alationcloud.com/admin/auth/ and confirming that you can see the OAuth Client Applications section. Alation admins can navigate to https://[[your-domain]].alationcloud.com/admin/users/ and change user access if needed.
- Privileges to create an Amazon Q Business application, AWS resources, and AWS Identity and Access Management (IAM) roles and policies.
- Basic knowledge of AWS services and working knowledge of Alation or other data sources of choice.
- Set up AWS IAM Identity Center integration with Amazon Q Business for user management.
- Set up an Amazon SageMaker Studio notebook and ensure that its execution role has the necessary privileges to access both the Amazon Q Business application (specifically the StartDataSourceSyncJob, BatchPutDocument, and StopDataSourceSyncJob permissions) and the AWS Secrets Manager secret (GetSecretValue). Additionally, it’s recommended that the policy restrict access to only the Amazon Q Business application Amazon Resource Name (ARN) and the Secrets Manager secret created in the following steps.
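As a rough illustration of the least-privilege recommendation in the last prerequisite, the following sketch builds such a policy document in Python. The Region, account ID, application ID, and secret name are placeholders, and the resource ARN patterns are assumptions that should be checked against the IAM documentation for Amazon Q Business and Secrets Manager before use.

```python
import json


def build_notebook_policy(region, account_id, application_id, secret_name):
    """Build a least-privilege IAM policy document for the notebook execution role.

    All argument values are placeholders; the ARN patterns are assumptions.
    """
    app_arn = f"arn:aws:qbusiness:{region}:{account_id}:application/{application_id}"
    # Secrets Manager appends a random suffix to secret ARNs, hence the wildcard.
    secret_arn = f"arn:aws:secretsmanager:{region}:{account_id}:secret:{secret_name}-*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QBusinessCustomConnectorSync",
                "Effect": "Allow",
                "Action": [
                    "qbusiness:StartDataSourceSyncJob",
                    "qbusiness:BatchPutDocument",
                    "qbusiness:StopDataSourceSyncJob",
                ],
                # Covers the application plus its index and data source sub-resources.
                "Resource": [app_arn, f"{app_arn}/*"],
            },
            {
                "Sid": "ReadAlationOAuthSecret",
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            },
        ],
    }


policy = build_notebook_policy(
    "us-east-1", "111122223333", "example-app-id", "alation-oauth-client"
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into an inline policy on the execution role once the placeholders are replaced with real values.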
Configure your Alation connection
In your Alation cloud account, create an OAuth2 client application that can be consumed from an Amazon Q Business application.
- In Alation, sign in as a user with administrator privileges, navigate to the settings page, and choose Authentication (https://[[your-domain]].alationcloud.com/admin/auth/).
- In the OAuth Client Applications section, choose Add.
- Enter an easily identifiable application name, and choose Save.
- Take note of the OAuth client application data—the Client ID and the Client Secret—created and choose Close.
- As a security best practice, we recommend storing the client application data in Secrets Manager. In the AWS Management Console, navigate to AWS Secrets Manager and add a new secret. Enter the Client_Id and Client_Secret values copied from the previous step.
- Provide a name and description for the secret and choose Next.
- Leave the defaults and choose Next.
- On the last page, choose Store.
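Once the secret is stored, the notebook can read it back at run time instead of hard-coding credentials. The following is a minimal sketch, assuming the secret was saved as key/value pairs named Client_Id and Client_Secret as described above; the secret name alation-oauth-client is a hypothetical placeholder, and the fake client exists only so the sketch runs without AWS credentials.

```python
import json


def load_alation_credentials(secrets_client, secret_id):
    """Return (client_id, client_secret) from a Secrets Manager JSON secret."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    # Key names match the ones entered in the console steps above.
    return secret["Client_Id"], secret["Client_Secret"]


class FakeSecretsClient:
    """Offline stand-in for boto3.client("secretsmanager")."""

    def get_secret_value(self, SecretId):
        payload = {"Client_Id": "demo-id", "Client_Secret": "demo-secret"}
        return {"SecretString": json.dumps(payload)}


client_id, client_secret = load_alation_credentials(
    FakeSecretsClient(), "alation-oauth-client"  # hypothetical secret name
)
print(client_id)
```

In the SageMaker Studio notebook, you would pass `boto3.client("secretsmanager")` in place of the fake client.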
Create sample Alation policies
In our example, you create three different sets of Alation policies for a fictional organization named Unicorn Rentals. Grouped as Workplace, HR, and Regulatory, each policy contains a roughly two-page summary of crucial organizational items of interest. You can find details on how to create policies in the Alation documentation.
On the Amazon Q Business side, let’s assume that we want to ensure that the following access policies are enforced. Users and access are set up through code illustrated in later sections.
# | First name | Last name | Policies authorized for access |
1 | Alejandro | Rosalez | Workplace, HR, and Regulatory |
2 | Sofia | Martinez | Workplace and HR |
3 | Diego | Ramirez | Workplace and Regulatory |
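The access table above can be expressed as a small mapping that is later turned into per-document ACLs. This is a sketch only: the email IDs are hypothetical, and the shape of the accessConfiguration value (including the DATASOURCE membership type) reflects our reading of the BatchPutDocument request format rather than code from this post.

```python
# Hypothetical email IDs; the access mapping mirrors the table above.
USER_POLICY_ACCESS = {
    "alejandro.rosalez@example.com": ["Workplace", "HR", "Regulatory"],
    "sofia.martinez@example.com": ["Workplace", "HR"],
    "diego.ramirez@example.com": ["Workplace", "Regulatory"],
}


def users_for_policy_type(policy_type, access=USER_POLICY_ACCESS):
    """Return the sorted user IDs that are allowed to read one policy type."""
    return sorted(user for user, types in access.items() if policy_type in types)


def build_access_configuration(policy_type, access=USER_POLICY_ACCESS):
    """Build an accessConfiguration value that allows only the listed users."""
    principals = [
        {"user": {"id": user_id, "access": "ALLOW", "membershipType": "DATASOURCE"}}
        for user_id in users_for_policy_type(policy_type, access)
    ]
    return {"accessControls": [{"principals": principals}]}


for policy_type in ("Workplace", "HR", "Regulatory"):
    print(policy_type, users_for_policy_type(policy_type))
```

Keeping the mapping keyed by user makes it easy to compare against the table, while the helper inverts it into the per-policy-type view that document uploads need.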
Create an Amazon Q Business application
- Sign in to the AWS Management Console and navigate to Amazon Q Business from the search bar at the top.
- On the Amazon Q Business console, choose Get Started.
- On the Applications page, choose Create application.
- In the first step of the Create application wizard, enter the default values. Additionally, you need to choose a list of users who require access to the Amazon Q Business application by including them through the IAM Identity Center settings.
- In the access management settings page, you would create and add users via AWS IAM Identity Center.
- Once all users are added, choose Create.
- After the application is created, take note of the Application ID value from the landing page.
- Next, choose an index type for the Amazon Q Business application. Choose the native retriever option.
- After the index is created, verify that the status has changed to Active. You can then take note of the Index ID.
- Next, add the custom data source.
- Search for Custom data source and choose the plus sign next to it.
- Provide a name and description for the custom data source.
- Once done, choose Add data source.
- After the data source is added and its status is Active, take note of the Data source ID.
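The Application ID, Index ID, and Data source ID noted above are the values that the Amazon Q Business API calls need. One way to keep them together, and to create the users with an alias on the custom data source, is sketched below. The ID values and email IDs are placeholders, the class and function names are our own, and the recording client stands in for `boto3.client("qbusiness")` so that the sketch runs offline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QBusinessTarget:
    """The three identifiers noted in the console steps above."""

    application_id: str
    index_id: str
    data_source_id: str

    def as_request_ids(self):
        """Keyword arguments shared by the data source sync API calls."""
        return {
            "applicationId": self.application_id,
            "indexId": self.index_id,
            "dataSourceId": self.data_source_id,
        }


def create_users(qbusiness_client, target, user_emails):
    """Create each user and alias the email ID to the custom data source."""
    for email in user_emails:
        qbusiness_client.create_user(
            applicationId=target.application_id,
            userId=email,
            userAliases=[
                {
                    "indexId": target.index_id,
                    "dataSourceId": target.data_source_id,
                    "userId": email,
                }
            ],
        )
    return list(user_emails)


class RecordingClient:
    """Offline stand-in that records the requests it receives."""

    def __init__(self):
        self.requests = []

    def create_user(self, **kwargs):
        self.requests.append(kwargs)
        return {}


target = QBusinessTarget("example-app-id", "example-index-id", "example-ds-id")
recorder = RecordingClient()
created = create_users(recorder, target, ["sofia.martinez@example.com"])
print(created, target.as_request_ids())
```

A real run would likely also need to handle the case where a user already exists, which this sketch leaves out.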
Load policy data from Alation to Amazon Q Business using the custom connector
Now let’s load the Alation data into Amazon Q Business using the correct access permissions. The code examples that follow are also available on the accompanying GitHub code repository.
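For orientation, the Alation side of the steps that follow can be sketched with the Python standard library. The two endpoint paths and the form-encoded client credentials grant are assumptions based on our understanding of the Alation API, not taken from the accompanying repository, so verify them against the Alation API documentation for your instance. The domain is a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Assumed paths; verify against the Alation API documentation for your instance.
TOKEN_PATH = "/oauth/v2/token/"
POLICIES_PATH = "/integration/v1/business_policies/"


def build_token_request(base_url, client_id, client_secret):
    """Build (but do not send) an OAuth client credentials token request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + TOKEN_PATH,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_policies_request(base_url, access_token):
    """Build (but do not send) a request that lists business policies."""
    return urllib.request.Request(
        base_url.rstrip("/") + POLICIES_PATH,
        headers={"Authorization": f"Bearer {access_token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_json(request, opener=urllib.request.urlopen):
    """Send a prepared request and decode the JSON response body."""
    with opener(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


token_request = build_token_request(
    "https://your-domain.alationcloud.com", "demo-id", "demo-secret"
)
print(token_request.get_method(), token_request.full_url)
```

Separating request construction from sending keeps the credentials handling easy to inspect and lets the network call be swapped out during testing.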
- With the connector ready, move over to the SageMaker Studio notebook and perform data synchronization operations by invoking Amazon Q Business APIs.
- To start, retrieve the Alation OAuth client application credentials stored in Secrets Manager.
- Next, initiate the connection using the OAuth client application credentials from Alation.
- You then configure policy type level user access. This section can be customized based on how user access information is stored in your data source. Here, we assume a pre-set access based on the user’s email IDs.
- You then pull individual policy details from Alation. This step can be repeated for all three policy types: Workplace, HR, and Regulatory.
- The next step is to define the Amazon Q Business application, index, and data source information that you created in the previous steps.
- Now you explicitly create the users in Amazon Q Business. Individual user access to different policy type data sets is configured later.
- For each policy type data set (Workplace, HR, and Regulatory), we execute the following three steps.
- Start an Amazon Q Business data source sync job.
- Encode and batch upload data with user access mapping.
- Stop the data source sync job and wait for the data set to be indexed.
- Go back to the Amazon Q Business console and see if the data uploads were successful.
- Find and open the custom data source from the list of data sources.
- Ensure the ingested documents are added in the Sync history tab and are in the Completed status.
- Also ensure the Last sync status for the custom data source connector is Completed.
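The start, upload, and stop cycle described in the steps above can be sketched as follows. This is not the code from the accompanying repository: the policy record keys (id, title, description) are hypothetical, the batch size of 10 reflects our understanding of the BatchPutDocument limit, and passing both the dataSourceSyncId parameter and the two underscore-prefixed attributes is an assumption based on our reading of the developer guide. The recording client replaces `boto3.client("qbusiness")` so the sketch runs offline.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_document(policy, access_configuration, data_source_id, execution_id):
    """Encode one policy record (hypothetical keys) as a BatchPutDocument entry."""
    return {
        "id": str(policy["id"]),
        "title": policy["title"],
        "contentType": "PLAIN_TEXT",
        "content": {"blob": policy["description"].encode("utf-8")},
        "attributes": [
            {"name": "_data_source_id", "value": {"stringValue": data_source_id}},
            {
                "name": "_data_source_sync_job_execution_id",
                "value": {"stringValue": execution_id},
            },
        ],
        "accessConfiguration": access_configuration,
    }


def sync_policies(client, ids, policies, access_configuration, batch_size=10):
    """Run one start / upload / stop cycle and return any failed documents."""
    execution_id = client.start_data_source_sync_job(**ids)["executionId"]
    failed = []
    try:
        documents = [
            to_document(p, access_configuration, ids["dataSourceId"], execution_id)
            for p in policies
        ]
        for batch in chunked(documents, batch_size):
            response = client.batch_put_document(
                applicationId=ids["applicationId"],
                indexId=ids["indexId"],
                dataSourceSyncId=execution_id,
                documents=batch,
            )
            failed.extend(response.get("failedDocuments", []))
    finally:
        # Always stop the job so the data source is not left in a syncing state.
        client.stop_data_source_sync_job(**ids)
    return failed


class RecordingClient:
    """Offline stand-in that records the order of API calls."""

    def __init__(self):
        self.calls = []

    def start_data_source_sync_job(self, **kwargs):
        self.calls.append(("start", kwargs))
        return {"executionId": "exec-1"}

    def batch_put_document(self, **kwargs):
        self.calls.append(("put", kwargs))
        return {"failedDocuments": []}

    def stop_data_source_sync_job(self, **kwargs):
        self.calls.append(("stop", kwargs))
        return {}


ids = {"applicationId": "app-id", "indexId": "index-id", "dataSourceId": "ds-id"}
acl = {"accessControls": [{"principals": [
    {"user": {"id": "sofia.martinez@example.com", "access": "ALLOW", "membershipType": "DATASOURCE"}}
]}]}
policies = [{"id": n, "title": f"HR policy {n}", "description": "Sample text"} for n in range(25)]
client = RecordingClient()
failed = sync_policies(client, ids, policies, acl)
print([name for name, _ in client.calls])
```

With 25 sample records and a batch size of 10, the recorded sequence is one start call, three upload calls, and one stop call. Running the function once per policy type, each with its own access configuration, matches the per-data-set loop described above.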
Run queries with the Amazon Q Business web experience
Now that the data synchronization is complete, you can start exploring insights from Amazon Q Business. With the newly created Amazon Q Business application, select the Web Application settings tab and navigate to the auto-created URL. This will open a new tab with a preview of the user interface and options that you can customize to fit your use case.
- Sign in as user Alejandro Rosalez. As you might recall, Alejandro has access to all three policy type data sets (Workplace, HR, and Regulatory).
- Start by asking a question about HR policy, such as “Per the HR Payroll Policy of Unicorn Rentals, what are some additional voluntary deductions taken from employee paychecks?” Note how Amazon Q Business provides an answer and also shows where it pulled the answer from.
- Next, ask a question about a Regulatory policy: “Per the PCI DSS compliance policy of Unicorn Rentals, how is the third-party service provider access to cardholder information protected?” The result includes the summarized answer on PCI DSS compliance and also shows sources where it gathered the data from.
- Lastly, see how Amazon Q Business responds when asked a question about a generic Workplace policy: “What does Unicorn Rentals do to protect information of children under the age of 13?” In this case, the application returns the answer and marks it as a Workplace policy question.
- Let’s next sign in as Sofia Martinez. Sofia has access to HR and Workplace policy types, but not to Regulatory policies.
- Start by asking a question about HR policy: “Per the HR Payroll Policy of Unicorn Rentals, list the additional voluntary deductions taken from employee paychecks.” Note how Amazon Q Business lists the deductions and cites the policy that the answer is gathered from.
- Next, ask a Regulatory policy question: “What are the record keeping requirements mentioned in the ECOA compliance policy of Unicorn Rentals?” Note how Amazon Q Business contextually answers the question, mentioning that Sofia does not have access to that data.
- Finally, sign in as Diego Ramirez. Diego has access to Workplace and Regulatory policies but not to HR policies.
- Start by asking the same Regulatory policy question: “Per the PCI DSS compliance policy of Unicorn Rentals, how is third-party service provider access to cardholder information protected?” Because Diego has access to Regulatory policy data, the expected answer is generated.
- Next, Diego asks a question about an HR policy: “Per the HR Compensation Policy of Unicorn Rentals, how is job pricing determined?” Note how Amazon Q Business contextually answers the question, mentioning that Diego does not have access to that data.
Troubleshooting
If you’re unable to get answers to any of your questions and get the message “Sorry, I could not find relevant information to complete your request,” check to see if any of the following issues apply:
- No permissions: The ACLs applied to your account don’t allow you to query certain data sources. If this is the case, reach out to your application administrator to ensure that your ACLs are configured to access the data sources.
- EmailID not matching UserID: In rare scenarios, a user might have a different email ID associated with the Amazon Q Business Identity Center connection than is associated in the data source’s user profile. Make sure that the Amazon Q Business user profile is updated to recognize the email ID using the update-user CLI command or the related API call.
- Data connector sync failed: The data connector failed to synchronize information from the source to the Amazon Q Business application. Verify the data connector’s sync run schedule and sync history to help ensure that the synchronization is successful.
- Empty or private data sources: Private or empty projects will not be crawled during the synchronization run.
If none of the above apply, open a support case to get the issue resolved.
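Two of the checks above can also be scripted. The following sketch looks up the most recent sync job through the ListDataSourceSyncJobs API and realigns a user alias through the UpdateUser API, which is the API counterpart of the update-user CLI command. The helper names, IDs, and email IDs are placeholders of our own, and the fake client stands in for `boto3.client("qbusiness")` so the sketch runs offline.

```python
from datetime import datetime, timezone


def latest_sync_job(client, ids):
    """Return the most recently started sync job record, or None if there is none."""
    history = client.list_data_source_sync_jobs(**ids).get("history", [])
    return max(history, key=lambda job: job["startTime"]) if history else None


def realign_user_alias(client, ids, user_id, data_source_user_id):
    """Point the user's alias for this data source at the ID used in the source."""
    return client.update_user(
        applicationId=ids["applicationId"],
        userId=user_id,
        userAliasesToUpdate=[
            {
                "indexId": ids["indexId"],
                "dataSourceId": ids["dataSourceId"],
                "userId": data_source_user_id,
            }
        ],
    )


class FakeQBusinessClient:
    """Offline stand-in returning a canned sync history."""

    def __init__(self):
        self.updates = []

    def list_data_source_sync_jobs(self, **kwargs):
        return {
            "history": [
                {"executionId": "exec-1", "status": "SUCCEEDED",
                 "startTime": datetime(2024, 1, 1, tzinfo=timezone.utc)},
                {"executionId": "exec-2", "status": "FAILED",
                 "startTime": datetime(2024, 1, 2, tzinfo=timezone.utc)},
            ]
        }

    def update_user(self, **kwargs):
        self.updates.append(kwargs)
        return {}


ids = {"applicationId": "app-id", "indexId": "index-id", "dataSourceId": "ds-id"}
fake = FakeQBusinessClient()
job = latest_sync_job(fake, ids)
print(job["executionId"], job["status"])
realign_user_alias(fake, ids, "diego.ramirez@example.com", "dramirez@example.org")
```

A latest job that is not in a succeeded state points to the sync failure case, while a mismatch between the two IDs passed to the second helper corresponds to the email ID case.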
Clean up
To avoid incurring future charges, clean up any resources created as part of this solution. Delete the Amazon Q Business custom connector data source and the Amazon Q Business application, and delete the OAuth client application created in Alation. Next, delete the Secrets Manager secret with the Alation OAuth client application credential data. Also, delete the user management setup in IAM Identity Center and the SageMaker Studio domain.
Conclusion
In this post, we discussed how to configure the Amazon Q Business custom connector to crawl and index business policies from Alation as a sample. We showed how you can use Amazon Q Business generative AI-based search to enable your business leaders and agents to discover insights from your enterprise data.
To learn more about the Amazon Q Business custom connector, see the Amazon Q Business developer guide. Alation Data Catalog is available for purchase through AWS Marketplace; speak to your Alation account representative for custom purchase options. For any additional information, contact your Alation business partner.
Alation – AWS Partner Spotlight
Alation is an AWS Specialization Partner that has pioneered the modern data catalog and is making the leap into a full-service source for data intelligence. Alation is passionate about helping enterprises create thriving data cultures where anyone can find, understand, and trust data.
Contact Alation | Partner Overview | AWS Marketplace
About the Authors
Gene Arnold is a Product Architect with Alation’s Forward Deployed Engineering team. A curious learner with over 25 years of experience, Gene focuses on how to sharpen selling skills and constantly explores new product lines.
Prabhakar Chandrasekaran is a Senior Technical Account Manager with AWS Enterprise Support. Prabhakar enjoys helping customers build cutting-edge AI/ML solutions on the cloud. He also works with enterprise customers providing proactive guidance and operational assistance, helping them improve the value of their solutions when using AWS. Prabhakar holds eight AWS and seven other professional certifications. With over 21 years of professional experience, Prabhakar was a data engineer and a program leader in the financial services space prior to joining AWS.
Sindhu Jambunathan is a Senior Solutions Architect at AWS, specializing in supporting ISV customers in the data and generative AI vertical to build scalable, reliable, secure, and cost-effective solutions on AWS. With over 13 years of industry experience, she joined AWS in May 2021 after a successful tenure as a Senior Software Engineer at Microsoft. Sindhu’s diverse background includes engineering roles at Qualcomm and Rockwell Collins, complemented by a Master’s of Science in Computer Engineering from the University of Florida. Her technical expertise is balanced by a passion for culinary exploration, travel, and outdoor activities.
Prateek Jain is a Sr. Solutions Architect with AWS, based out of Atlanta, Georgia. He is passionate about GenAI and helping customers build amazing solutions on AWS. In his free time, he enjoys spending time with family and playing tennis.