This post is co-written with Sherwin Chu from Alida.
Alida helps the world’s biggest brands create highly engaged research communities to gather feedback that fuels better customer experiences and product innovation.
Alida’s customers receive tens of thousands of engaged responses for a single survey, therefore the Alida team opted to leverage machine learning (ML) to serve their customers at scale. However, when employing the use of traditional natural language processing (NLP) models, they found that these solutions struggled to fully understand the nuanced feedback found in open-ended survey responses. The models often only captured surface-level topics and sentiment, and missed crucial context that would allow for more accurate and meaningful insights.
In this post, we learn about how Anthropic’s Claude Instant model on Amazon Bedrock enabled the Alida team to quickly build a scalable service that more accurately determines the topic and sentiment within complex survey responses. The new service achieved a 4-6 times improvement in topic assertion by tightly clustering on several dozen key topics vs. hundreds of noisy NLP keywords.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
Using Amazon Bedrock allowed Alida to bring their service to market faster than if they had used other machine learning (ML) providers or vendors.
The challenge
Surveys with a combination of multiple-choice and open-ended questions allow market researchers to get a more holistic view by capturing both quantitative and qualitative data points.
Multiple-choice questions are easy to analyze at scale, but lack nuance and depth. Set response options may also lead to biasing or priming participant responses.
Open-ended survey questions allow responders to provide context and unanticipated feedback. These qualitative data points deepen researchers’ understanding beyond what multiple-choice questions can capture alone. The challenge with the free-form text is that it can lead to complex and nuanced answers that are difficult for traditional NLP to fully understand. For example:
“I recently experienced some of life’s hardships and was really down and disappointed. When I went in, the staff were always very kind to me. It’s helped me get through some tough times!”
Traditional NLP methods will identify topics as “hardships,” “disappointed,” “kind staff,” and “get through tough times.” It can’t distinguish between the responder’s overall current negative life experiences and the specific positive store experiences.
Alida’s existing solution automatically process large volumes of open-ended responses, but they wanted their customers to gain better contextual comprehension and high-level topic inference.
Amazon Bedrock
Prior to the introduction of LLMs, the way forward for Alida to improve upon their existing single-model solution was to work closely with industry experts and develop, train, and refine new models specifically for each of the industry verticals that Alida’s customers operated in. This was both a time- and cost-intensive endeavor.
One of the breakthroughs that make LLMs so powerful is the use of attention mechanisms. LLMs use self-attention mechanisms that analyze the relationships between words in a given prompt. This allows LLMs to better handle the topic and sentiment in the earlier example and presents an exciting new technology that can be used to address the challenge.
With Amazon Bedrock, teams and individuals can immediately start using foundation models without having to worry about provisioning infrastructure or setting up and configuring ML frameworks. You can get started with the following steps:
- Verify that your user or role has permission to create or modify Amazon Bedrock resources. For details, see Identity-based policy examples for Amazon Bedrock
- Log in into the Amazon Bedrock console.
- On the Model access page, review the EULA and enable the FMs you’d like in your account.
- Start interacting with the FMs via the following methods:
- Directly in the Amazon Bedrock console using the Amazon Bedrock playgrounds.
- Programmatically using the Amazon Bedrock API and SDKs.
- In a console terminal using the Amazon Bedrock CLI.
Alida’s executive leadership team was eager to be an early adopter of the Amazon Bedrock because they recognized its ability to help their teams to bring new generative AI-powered solutions to market faster.
Vincy William, the Senior Director of Engineering at Alida who leads the team responsible for building the topic and sentiment analysis service, says,
“LLMs provide a big leap in qualitative analysis and do things (at a scale that is) humanly not possible to do. Amazon Bedrock is a game changer, it allows us to leverage LLMs without the complexity.”
The engineering team experienced the immediate ease of getting started with Amazon Bedrock. They could select from various foundation models and start focusing on prompt engineering instead of spending time on right-sizing, provisioning, deploying, and configuring resources to run the models.
Solution overview
Sherwin Chu, Alida’s Chief Architect, shared Alida’s microservices architecture approach. Alida built the topic and sentiment classification as a service with survey response analysis as its first application. With this approach, common LLM implementation challenges such as the complexity of managing prompts, token limits, request constraints, and retries are abstracted away, and the solution allows for consuming applications to have a simple and stable API to work with. This abstraction layer approach also enables the service owners to continually improve internal implementation details and minimize API-breaking changes. Finally, the service approach allows for a single point to implement any data governance and security policies that evolve as AI governance matures in the organization.
The following diagram illustrates the solution architecture and flow.
Alida evaluated LLMs from various providers, and found Anthropic’s Claude Instant to be the right balance between cost and performance. Working closely with the prompt engineering team, Chu advocated to implement a prompt chaining strategy as opposed to a single monolith prompt approach.
Prompt chaining enables you to do the following:
- Break down your objective into smaller, logical steps
- Build a prompt for each step
- Provide the prompts sequentially to the LLM
This creates additional points of inspection, which has the following benefits:
- It’s straightforward to systematically evaluate changes you make to the input prompt
- You can implement more detailed tracking and monitoring of the accuracy and performance at each step
Key considerations with this strategy include the increase in the number of requests made to the LLM and the resulting increase in the overall time it takes to complete the objective. For Alida’s use case they chose to batching a collection of open-ended responses in a single prompt to the LLM is what they chose to offset these effects.
NLP vs. LLM
Alida’s existing NLP solution relies on clustering algorithms and statistical classification to analyze open-ended survey responses. When applied to sample feedback for a coffee shop’s mobile app, it extracted topics based on word patterns but lacked true comprehension. The following table includes some examples comparing NLP responses vs. LLM responses.
Survey Response | Existing Traditional NLP | Amazon Bedrock with Claude Instant | |
Topic | Topic | Sentiment | |
I almost exclusively order my drinks through the app bc of convenience and it’s less embarrassing to order super customized drinks lol. And I love earning rewards! | [‘app bc convenience’, ‘drink’, ‘reward’] | Mobile Ordering Convenience | positive |
The app works pretty good the only complaint I have is that I can’t add Any number of money that I want to my gift card. Why does it specifically have to be $10 to refill?! | [‘complaint’, ‘app’, ‘gift card’, ‘number money’] | Mobile Order Fulfillment Speed | negative |
The example results show how the existing solution was able to extract relevant keywords, but isn’t able to achieve a more generalized topic group assignment.
In contrast, using Amazon Bedrock and Anthropic Claude Instant, the LLM with in-context training is able to assign the responses to pre-defined topics and assign sentiment.
In additional to delivering better answers for Alida’s customers, for this particular use-case, pursuing a solution using an LLM over traditional NLP methods saved a vast amount of time and effort in training and maintaining a suitable model. The following table compares training a traditional NLP model vs. in-context training of an LLM.
. | Data Requirement | Training Process | Model Adaptability |
Training a traditional NLP model | Thousands of human-labeled examples |
Combination of automated and manual feature engineering. Iterative train and evaluate cycles. |
Slower turnaround due to the need to retrain model |
In-context training of LLM | Several examples |
Trained on the fly within the prompt. Limited by context window size. |
Faster iterations by modifying the prompt. Limited retention due to context window size. |
Conclusion
Alida’s use of Anthropic’s Claude Instant model on Amazon Bedrock demonstrates the powerful capabilities of LLMs for analyzing open-ended survey responses. Alida was able to build a superior service that was 4-6 times more precise at topic analysis when compared to their NLP-powered service. Additionally, using in-context prompt engineering for LLMs significantly reduced development time, because they didn’t need to curate thousands of human-labeled data points to train a traditional NLP model. This ultimately allows Alida to give their customers richer insights sooner!
If you’re ready to start building your own foundation model innovation with Amazon Bedrock, checkout this link to Set up Amazon Bedrock. If you interested in reading about other intriguing Amazon Bedrock applications, see the Amazon Bedrock specific section of the AWS Machine Learning Blog.
About the authors
Kinman Lam is an ISV/DNB Solution Architect for AWS. He has 17 years of experience in building and growing technology companies in the smartphone, geolocation, IoT, and open source software space. At AWS, he uses his experience to help companies build robust infrastructure to meet the increasing demands of growing businesses, launch new products and services, enter new markets, and delight their customers.
Sherwin Chu is the Chief Architect at Alida, helping product teams with architectural direction, technology choice, and complex problem-solving. He is an experienced software engineer, architect, and leader with over 20 years in the SaaS space for various industries. He has built and managed numerous B2B and B2C systems on AWS and GCP.
Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build AI/ML and generative AI solutions. His focus since early 2023 has been leading solution architecture efforts for the launch of Amazon Bedrock, AWS’ flagship generative AI offering for builders. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. He has helped companies in insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services. Mark holds six AWS certifications, including the ML Specialty Certification.