OpenSearch is a scalable, flexible, and extensible open source software suite for search, analytics, security monitoring, and observability applications, licensed under the Apache 2.0 license. Amazon OpenSearch Service is a fully managed service that makes it straightforward to deploy, scale, and operate OpenSearch in the AWS Cloud.
OpenSearch uses a probabilistic ranking framework called BM25 to calculate relevance scores. If a distinctive keyword appears more frequently in a document, BM25 assigns a higher relevance score to that document. This framework, however, doesn’t consider user behavior such as click-through or purchase data, which can further improve relevance for individual users.
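To make the term-frequency behavior concrete, here is a minimal sketch of the textbook BM25 score for a single query term. It assumes the commonly cited defaults k1 = 1.2 and b = 0.75; the function name and arguments are illustrative and are not taken from OpenSearch internals, whose implementation may differ in detail.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one document with the textbook BM25 formula.

    tf       -- how many times the term occurs in the document
    doc_freq -- how many documents in the index contain the term
    """
    # Inverse document frequency: distinctive (rare) terms get a larger weight.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Term frequency saturates as tf grows and is normalized by document length.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


# A term that occurs more often scores higher, all else being equal.
once = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=10)
thrice = bm25_term_score(tf=3, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=10)
```

Nothing in this calculation depends on who is searching, which is the gap that personalization fills.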
Improving search is an integral part of enhancing the overall user experience and engagement on a website or application. Search traffic is considered high intent because users are actively seeking a particular item, and visitors who use site search have been found to convert up to two times more often, on average, than visitors who don’t. By using user interaction data such as clicks, likes, and purchases, businesses can improve search relevancy to capitalize on this traffic and reduce the number of users who abandon their sessions because they can’t find what they want. Better search results can improve customer engagement, satisfaction, and loyalty, and increase conversion rates.
Amazon Personalize allows you to add sophisticated personalization capabilities to your applications by using the same machine learning (ML) technology used on Amazon.com for over 20 years. No ML expertise is required.
Amazon Personalize supports the automatic adjustment of recommendations based on contextual information about your user, such as device type, location, time of day, or other information you provide. You supply Amazon Personalize with historical data about your users and their interactions within your application, such as purchase history, ratings, and likes. You can add data to Amazon Personalize in bulk by importing large historical datasets all at once from an Amazon Simple Storage Service (Amazon S3) CSV file, using a format required by Amazon Personalize. You can also add data incrementally by importing records using the Amazon Personalize console or API. After your historical data is imported, you can continue to provide new data in real time by sending user interaction events. Based on the use case you want to address, such as product recommendations, you select a pre-built recipe that is optimized for that goal. Amazon Personalize analyzes your data and trains a custom ML model based on the parameters in the recipe to generate personalized recommendations optimized for your users and application. After the model is trained, you can generate real-time personalized recommendations for your users.
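As a minimal sketch of the real-time path, the following builds the arguments for the `put_events` operation of the Amazon Personalize events API. The helper name, the `click` event type, and the sample IDs are hypothetical; the tracking ID comes from an event tracker that you create for your dataset group.

```python
from datetime import datetime, timezone


def build_put_events_request(tracking_id, user_id, session_id, item_id,
                             event_type="click", sent_at=None):
    """Return keyword arguments for one interaction event (hypothetical helper)."""
    return {
        "trackingId": tracking_id,   # ID of your Amazon Personalize event tracker
        "userId": str(user_id),
        "sessionId": session_id,
        "eventList": [
            {
                "eventType": event_type,
                "itemId": str(item_id),
                # Time of the interaction; defaults to now (UTC).
                "sentAt": sent_at or datetime.now(timezone.utc),
            }
        ],
    }


request = build_put_events_request(
    tracking_id="example-tracking-id",   # placeholder value
    user_id=12,
    session_id="session-1",
    item_id="tt0092099",                 # placeholder item ID
)
```

With boto3 installed and credentials configured, the request could then be sent with `boto3.client("personalize-events").put_events(**request)`.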
With the newly launched Amazon Personalize Search Ranking plugin for Amazon OpenSearch Service, you can use your users’ interaction histories and interests to enhance their search results. By using an Amazon Personalize recipe such as Personalized-Ranking, you can boost relevant items in the search results, based on user interests, at the time the results are retrieved from OpenSearch Service.
This post explains how to integrate the Amazon Personalize Search Ranking plugin with OpenSearch Service to enable personalized search experiences. To build Amazon Personalize artifacts in this post, we use a dataset from IMDb, the world’s most authoritative source for movie, TV, and celebrity content, available on AWS Marketplace, as well as the MovieLens dataset prepared by GroupLens Research at the University of Minnesota, consisting of user ratings for various movies.
Solution overview
The following diagram illustrates the solution architecture.
The workflow includes the following steps:
- A user issues a search request through their website or portal. This search request is sent to OpenSearch Service.
- The top N search results are returned from the OpenSearch Service index and sent to the plugin to preprocess and prepare the input for an Amazon Personalize campaign.
- The request is sent to Amazon Personalize to get the re-ranked search results.
- Amazon Personalize returns the personalized ranking of the search results with the relevance score for each result.
- The re-ranked hits are returned by the plugin to OpenSearch Service, with a weighting applied between the OpenSearch Service relevance score and the Amazon Personalize ranking score. You specify a weight parameter (between 0.0 and 1.0) that controls the balance between OpenSearch Service and Amazon Personalize when re-ranking results. A higher weight gives more influence to the Amazon Personalize ranking scores relative to the OpenSearch Service scores. This allows you to control how much personalization affects the final ranking returned to the user.
- The user gets personalized search results based on their preferences and interactions.
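The effect of the weight parameter can be illustrated with a small sketch. The plugin's exact combination formula is not described in this post, so the code below assumes one plausible scheme: min-max normalize the OpenSearch scores, then interpolate linearly with the Amazon Personalize scores. The function name and sample data are hypothetical.

```python
def blend_scores(hits, personalize_scores, weight):
    """Re-rank hits by mixing search relevance with personalization scores.

    hits               -- list of (item_id, opensearch_score) pairs
    personalize_scores -- dict of item_id -> score in [0, 1]
    weight             -- 0.0 keeps the search order, 1.0 uses only personalization
    ASSUMPTION: linear interpolation over normalized scores, for illustration only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    if not hits:
        return []
    scores = [score for _, score in hits]
    low, high = min(scores), max(scores)
    span = (high - low) or 1.0  # avoid dividing by zero when all scores are equal
    blended = []
    for item_id, score in hits:
        normalized = (score - low) / span
        personal = personalize_scores.get(item_id, 0.0)
        blended.append((item_id, (1 - weight) * normalized + weight * personal))
    # Highest combined score first.
    return sorted(blended, key=lambda pair: pair[1], reverse=True)


hits = [("a", 10.0), ("b", 8.0), ("c", 5.0)]
personal = {"a": 0.1, "b": 0.2, "c": 0.7}
search_only = [item for item, _ in blend_scores(hits, personal, 0.0)]
personal_only = [item for item, _ in blend_scores(hits, personal, 1.0)]
```

Under this assumed scheme, a weight of 0.0 reproduces the original search order and a weight of 1.0 orders the hits purely by the personalization score; values in between trade one off against the other.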
Prerequisites
You should have the following prerequisites:
- An AWS account.
- An AWS Identity and Access Management (IAM) role with appropriate access permissions. We provide AWS CloudFormation templates and Jupyter notebooks to help set up the required IAM role and access.
- To enable personalization in OpenSearch Service, you need to set up the required Amazon Personalize resources, including a dataset group, solution version, and campaign. We have provided a Jupyter notebook that creates all the Amazon Personalize resources, taking advantage of the fully managed Jupyter notebook instance capabilities of Amazon SageMaker.
Deploy the CloudFormation stack
The CloudFormation stack automates the deployment of the OpenSearch Service domain and SageMaker Notebook instance. Complete the following steps to deploy the stack:
- Sign in to the AWS Management Console with your credentials in the account where you want to deploy the CloudFormation stack.
- Launch the CloudFormation stack directly.
- On the Specify details page, provide any parameters required by the template, such as OpenSearch Service and SageMaker instance sizes.
- On the Configure stack options page, specify a stack name and any other options you want to set.
- Complete creating the stack and monitor the status on the stack details page.
- After the stack is created, open the SageMaker notebook instance from the console.
The notebook instance will already be preloaded with the required notebooks.
Set up and complete the Amazon Personalize workflow
Open the 1.Configure_Amazon_Personalize.ipynb notebook to set up the Amazon Personalize artifacts. This notebook walks you through the following steps:
- Download the dataset and preprocess the data to create the required input files for creating the datasets.
- Create a dataset group.
- Create datasets and schemas.
- Prepare and import data.
- Create a solution and a solution version.
- Create a campaign for the solution version.
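The steps above can be sketched with the Amazon Personalize API as follows. This is a simplified outline and not the notebook's code: it creates only an interactions dataset, and the resource names, the helper functions, and the S3 path and role arguments are assumptions. The `personalize` argument is expected to be a boto3 client created with `boto3.client("personalize")`.

```python
import json
import time

RANKING_RECIPE_ARN = "arn:aws:personalize:::recipe/aws-personalized-ranking"

# Minimal interactions schema: who interacted with what, and when.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def wait_until_active(describe, result_key, poll_seconds=30, **arn):
    """Poll a describe_* call until the resource is ACTIVE (hypothetical helper)."""
    while True:
        status = describe(**arn)[result_key]["status"]
        if status == "ACTIVE":
            return
        if status.endswith("FAILED"):
            raise RuntimeError(f"{result_key} ended with status {status}")
        time.sleep(poll_seconds)


def build_campaign(personalize, prefix, s3_path, role_arn):
    """Create dataset group -> dataset -> import -> solution version -> campaign."""
    group = personalize.create_dataset_group(name=f"{prefix}-group")["datasetGroupArn"]
    wait_until_active(personalize.describe_dataset_group, "datasetGroup",
                      datasetGroupArn=group)

    schema = personalize.create_schema(
        name=f"{prefix}-interactions-schema",
        schema=json.dumps(INTERACTIONS_SCHEMA))["schemaArn"]
    dataset = personalize.create_dataset(
        name=f"{prefix}-interactions", schemaArn=schema,
        datasetGroupArn=group, datasetType="Interactions")["datasetArn"]
    wait_until_active(personalize.describe_dataset, "dataset", datasetArn=dataset)

    job = personalize.create_dataset_import_job(
        jobName=f"{prefix}-import", datasetArn=dataset,
        dataSource={"dataLocation": s3_path}, roleArn=role_arn)["datasetImportJobArn"]
    wait_until_active(personalize.describe_dataset_import_job, "datasetImportJob",
                      datasetImportJobArn=job)

    solution = personalize.create_solution(
        name=f"{prefix}-solution", datasetGroupArn=group,
        recipeArn=RANKING_RECIPE_ARN)["solutionArn"]
    version = personalize.create_solution_version(
        solutionArn=solution)["solutionVersionArn"]
    wait_until_active(personalize.describe_solution_version, "solutionVersion",
                      solutionVersionArn=version)

    campaign = personalize.create_campaign(
        name=f"{prefix}-campaign", solutionVersionArn=version,
        minProvisionedTPS=1)["campaignArn"]
    wait_until_active(personalize.describe_campaign, "campaign", campaignArn=campaign)
    return campaign
```

Training a solution version can take a long time, so the polling interval here is deliberately coarse. The returned campaign ARN is the value you later supply to the search pipeline.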
Install the Amazon Personalize Search Ranking plugin using a Jupyter notebook
Open the 2.Configure_Amazon_OpenSearch.ipynb notebook and run through the instructions. This notebook walks you through the following steps:
- Ingest sample index data into the OpenSearch Service instance. Populating the index with representative data facilitates thorough testing and validation of the plugin.
- Install the plugin package in the OpenSearch Service domain. This integrates the personalization capabilities into the OpenSearch environment.
- Set up search pipelines to activate the plugin’s functionality. Search pipelines contain request processors and response processors that transform queries and results. When constructing a pipeline, specify the Amazon Personalize campaign ARN created earlier in a personalized_search_ranking response processor to enable personalized re-ranking. This configures the plugin to retrieve real-time personalized rankings from Amazon Personalize and apply them while processing search results.
Install the Amazon Personalize Search Ranking plugin using the console
You can also set up the Amazon Personalize search plugin from the console. You only need to do this if you have not installed the plugin using the Jupyter notebook from earlier.
To install the Amazon Personalize Search Ranking plugin on OpenSearch Service, complete the following steps:
- On the OpenSearch Service console, navigate to your domain.
- On the Packages tab, choose Associate package to associate the Amazon Personalize Search Ranking plugin with your OpenSearch Service domain. The plugin version must match the OpenSearch Service domain version.
The Amazon Personalize Search Ranking plugin can be installed on OpenSearch Service versions 2.9 and above.
- Locate the Amazon Personalize Search Ranking plugin in the list of available plugins.
- Choose Associate next to the plugin to install it and associate it with your existing OpenSearch Service domain.
After you have associated the plugin, it appears in the list of packages with the type plugin, and the installation is complete.
Enable the Amazon Personalize Search Ranking plugin
The Amazon Personalize Search Ranking plugin uses the search-pipeline feature of OpenSearch Service, available starting with version 2.9. The plugin depends on this feature to apply Amazon Personalize ranking to the search results returned by OpenSearch Service, and it must be set up as a search-pipeline response processor. The pipeline definition contains the configuration for the plugin, which includes the Amazon Personalize campaign to call for the ranking, the IAM role used to access Amazon Personalize resources, and the parameters defined in the following table.
Settings | Required | Default | Description |
campaign | Yes | None | Specify the ARN of the Amazon Personalize campaign to use to personalize results. |
recipe | Yes | None | Specify the name of the Amazon Personalize recipe to use. As of this writing, aws-personalized-ranking is the only supported value. |
item_id_field | No | “_id” | If the _id field for an indexed document in OpenSearch doesn’t correspond with your Amazon Personalize itemId, specify the name of the field that does. |
weight | Yes | None | Specify the emphasis that the response processor puts on personalization when it re-ranks results. Specify a value within a range of 0.0–1.0. The closer to 1.0 that it is, the more likely it is that results from Amazon Personalize rank higher. If you specify 0.0, no personalization occurs and OpenSearch Service takes precedence. |
tag | No | None | Specify an identifier for the processor. |
iam_role_arn | Yes | None | Specify the IAM role to access Amazon Personalize resources. This is required for OpenSearch Service, and optional for open source OpenSearch. |
aws_region | Yes | None | Specify the AWS Region where you created your Amazon Personalize campaign. |
ignore_failure | No | None | Specify whether the plugin ignores any processor failures. For values, specify true or false. For your production environments, we recommend that you specify true to avoid any interruptions for query responses. For test environments, you can specify false to view any errors that the plugin generates. |
external_account_iam_role_arn | No | None | If you use OpenSearch Service and your Amazon Personalize and OpenSearch Service resources exist in different accounts, specify the ARN of the role that has permission to access Amazon Personalize. |
The notebook that accompanies this post creates a search pipeline with a personalized_search_ranking response processor on your OpenSearch Service domain. You run this step one time.
Define search pipeline for personalized ranking
You can create a search pipeline with a personalized_search_ranking response processor on an OpenSearch Service domain from Python. In the request, use your own domain endpoint URL, for example https://<domain name>.<AWS region>.es.amazonaws.com.
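As a minimal sketch of such a pipeline definition, the following builds the request URL and body for a search pipeline with one personalized_search_ranking response processor. The helper names and the placeholder ARNs are hypothetical. We also assume that the campaign setting is sent under the key `campaign_arn` and that the weight is sent as a string; confirm both against the plugin documentation for your version.

```python
import json


def pipeline_url(domain_endpoint, pipeline_name):
    """URL of the search pipeline resource on the domain."""
    return f"{domain_endpoint.rstrip('/')}/_search/pipeline/{pipeline_name}"


def build_pipeline_body(campaign_arn, iam_role_arn, aws_region, weight=0.3,
                        item_id_field=None, tag="personalize-processor",
                        ignore_failure=True):
    """Build a pipeline definition with one personalized_search_ranking processor."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    processor = {
        "campaign_arn": campaign_arn,          # ASSUMED key name for the campaign ARN
        "recipe": "aws-personalized-ranking",  # only supported recipe per the table
        "weight": str(weight),                 # ASSUMED to be passed as a string
        "tag": tag,
        "iam_role_arn": iam_role_arn,
        "aws_region": aws_region,
        "ignore_failure": ignore_failure,
    }
    if item_id_field:
        # Only needed when _id does not match the Amazon Personalize itemId.
        processor["item_id_field"] = item_id_field
    return {
        "description": "Re-rank search results with Amazon Personalize",
        "response_processors": [{"personalized_search_ranking": processor}],
    }


url = pipeline_url("https://<domain name>.<AWS region>.es.amazonaws.com",
                   "personalized-ranking-pipeline")
body = build_pipeline_body(
    campaign_arn="<Amazon Personalize campaign ARN>",
    iam_role_arn="<IAM role ARN>",
    aws_region="<AWS region>",
)
print(json.dumps(body, indent=2))
```

To create the pipeline, send `body` as JSON in an HTTP PUT request to `url`, signed with AWS Signature Version 4 for the `es` service. One option is the `requests` library together with a SigV4 authentication helper such as `requests_auth_aws_sigv4`.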
Apply a search pipeline to an individual query
After you configure a search pipeline with a personalized_search_ranking response processor, you can apply the Amazon Personalize Search Ranking plugin to your OpenSearch queries and view the re-ranked results. In your query code, specify your domain endpoint, your OpenSearch Service index, the name of the pipeline you configured earlier, and your query (we use “Tom Cruise”). For user_id, specify the ID of the user that you’re getting search results for. This user must be in the data that you used to create your Amazon Personalize solution version.
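A query that uses the pipeline might be assembled as in the following sketch. The `search_pipeline` query parameter is standard OpenSearch; the `ext.personalize_request_parameters` block is how we assume the plugin receives the user ID, and the index name, field names, and pipeline name are hypothetical.

```python
def search_url(domain_endpoint, index):
    """URL of the _search endpoint for one index."""
    return f"{domain_endpoint.rstrip('/')}/{index}/_search"


def build_personalized_search(query_text, user_id, fields, pipeline_name, context=None):
    """Return (query-string params, request body) for a personalized search."""
    personalize_params = {"user_id": str(user_id)}
    if context:
        # Optional contextual metadata, for example {"DEVICE": "mobile"}.
        personalize_params["context"] = context
    body = {
        "query": {"multi_match": {"query": query_text, "fields": list(fields)}},
        # ASSUMED extension block read by the personalized_search_ranking processor.
        "ext": {"personalize_request_parameters": personalize_params},
    }
    # Applies the named search pipeline to this request only.
    params = {"search_pipeline": pipeline_name}
    return params, body


params, body = build_personalized_search(
    query_text="Tom Cruise",
    user_id=12,
    fields=["title", "plot"],                       # hypothetical index fields
    pipeline_name="personalized-ranking-pipeline",  # hypothetical pipeline name
)
url = search_url("https://<domain name>.<AWS region>.es.amazonaws.com", "movies")
```

Sending the same body without the `search_pipeline` parameter returns the non-personalized ranking, which is useful for the comparison in the next section.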
Evaluate the results
Open the 3.Testing.ipynb notebook and walk through the steps to test and compare the results for queries that use personalization and those that don’t. The Amazon Personalize Search Ranking plugin re-ranks the search results in the OpenSearch Service query response. It considers both the ranking from Amazon Personalize and the ranking from OpenSearch Service. This notebook walks you through the following steps:
- Define the necessary connection parameters to establish a connection with your OpenSearch Service domain. This involves specifying the domain endpoint, authentication credentials, and any additional configuration settings required for your specific OpenSearch Service setup.
- Create a set of sample queries, including queries with personalization parameters and queries without personalization parameters. These queries will be used to evaluate the impact of personalization on the search results.
- Run and compare the results for queries that use personalization and those that do not.
For our example, we used a query for “Tom Cruise” and, for the personalization parameter, a user with a recent history of viewing drama and romance films. The search results show how the plugin re-ranks titles based on the user’s observed viewing behavior, so that the results most likely to match this user’s preferences appear higher in the list.
Personalized vs. non-personalized results
Let’s consider personalizing results for a user with ID 12. First, we check this user’s recent interactions by running the code in the 3.Testing.ipynb notebook to retrieve their interaction history. This allows us to see what types of movies this user has reviewed recently, which can inform how we personalize recommendations for them.
In this example, we see that the user has expressed interest in drama, romance, and thriller movie genres. To provide personalized recommendations, we first run queries with personalization parameters enabled, utilizing the user’s genre preferences. We then run the same queries without personalization enabled, for comparison. The following results show the difference between the non-personalized and personalized recommendation outputs.
The first two columns display the default OpenSearch Service results for the query “Tom Cruise” on a movies index, showing a variety of Tom Cruise films across different genres. The next two columns showcase personalized OpenSearch Service results for the same “Tom Cruise” query, but customized for a user interested in drama, romance, and thriller genres. Compared to the generic results, the personalized results prominently feature Tom Cruise movies in the user’s preferred drama, romance, and thriller genres. The delta highlights how the personalized results have been re-ranked relative to the non-personalized results, prioritizing films that match the user’s genre preferences. This demonstrates how personalization can tailor OpenSearch Service results to individual users’ tastes and interests.
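The delta described above can be computed with a few lines of code. This is a sketch of one way to do it; the function name is hypothetical and the item IDs are placeholders, not results from the notebook.

```python
def rank_deltas(baseline_ids, personalized_ids):
    """Compare two result lists and report how far each item moved.

    Returns (item_id, baseline_rank, personalized_rank, delta) tuples, where a
    positive delta means the item moved up after personalization.
    """
    baseline_rank = {item_id: rank for rank, item_id in enumerate(baseline_ids, start=1)}
    rows = []
    for new_rank, item_id in enumerate(personalized_ids, start=1):
        old_rank = baseline_rank.get(item_id)
        # Items that were not in the baseline list have no delta.
        delta = None if old_rank is None else old_rank - new_rank
        rows.append((item_id, old_rank, new_rank, delta))
    return rows


rows = rank_deltas(["a", "b", "c"], ["c", "a", "b"])
for item_id, old_rank, new_rank, delta in rows:
    print(item_id, old_rank, new_rank, delta)
```

In this toy input, item `c` moves from rank 3 to rank 1, a delta of +2, while `a` and `b` each drop one position.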
This comparison demonstrates how Amazon Personalize can customize OpenSearch Service movie results to match an individual user’s interests. Although standard OpenSearch Service aims to universally serve relevant movie results for Tom Cruise, Amazon Personalize tailors the results to focus on Tom Cruise films it predicts this user will enjoy based on their unique viewing history and preferences.
The side-by-side results illustrate how Amazon Personalize provides a more targeted, user-centric search experience by personalizing the movie results to the individual.
Clean up
Complete the following steps to clean up your resources:
- Follow the steps in the 4.Cleanup.ipynb notebook to clean up the resources created through the notebook.
- On the AWS CloudFormation console, delete the stack that you created.
Conclusion
The Amazon Personalize Search Ranking plugin integrates with OpenSearch Service to enable personalized search experiences. By using user behavior data and the ML capabilities of Amazon Personalize, the plugin can reorder OpenSearch Service results to boost relevance for each user, surfacing the most relevant content higher in the results. The plugin’s weight setting lets you balance personalization against native OpenSearch Service scoring to fit diverse use cases. With a few configuration steps, you can start serving search results that reflect the individual interests and preferences of your users.
Additional resources
- Amazon Personalize Developer Guide
- Personalizing search results from OpenSearch
- Setting up Amazon Personalize
- Amazon Personalize workflow
- Configuring the plugin
- Configuring permissions when resources are in different accounts
- The IMDb dataset is available on AWS Data Exchange and provides over 1.6 billion user ratings; credits for more than 13 million cast and crew members; 10 million movie, TV, and entertainment titles; and global box office reporting data from more than 60 countries.
About the Authors
James Jory is a Principal Solutions Architect in Applied AI with AWS. He has a special interest in personalization and recommender systems and a background in ecommerce, marketing technology, and customer data analytics. In his spare time, he enjoys camping and auto racing simulations.
Reagan Rosario is a Solutions Architect at AWS, specializing in building scalable, highly available, and secure cloud solutions for education technology companies. With over 10 years of experience in software engineering and architecture roles, Reagan loves using his technical knowledge to help AWS customers architect robust cloud solutions that leverage the breadth and depth of AWS.