OfferUp improved local results by 54% and relevance recall by 27% with multimodal search on Amazon Bedrock and Amazon OpenSearch Service

This post is co-written with Andrés Vélez Echeveri and Sean Azlin from OfferUp.

OfferUp is an online, mobile-first marketplace designed to facilitate local transactions and discovery. Known for its user-friendly app and trust-building features, including user ratings and in-app chat, OfferUp enables users to buy and sell items and explore a broad range of jobs and local services. As part of its ongoing mission to enhance user experience and drive business growth, OfferUp constantly seeks to improve its search capabilities, making it faster and more intuitive for users to discover, transact, and connect in their local communities.

In this two-part blog post series, we explore the key opportunities OfferUp embraced on their journey to boost and transform their existing search solution from traditional lexical search to modern multimodal search powered by Amazon Bedrock and Amazon OpenSearch Service. OfferUp found that multimodal search improved relevance recall by 27%, reduced geographic spread (which means more local results) by 54%, and grew search depth by 6.5%. This series delves into strategies, architecture patterns, business benefits, and technical steps to modernize your own search solution.

Foundational search architecture

OfferUp hosts millions of active listings, with millions more added monthly by its users. Previously, OfferUp’s search engine was built with Elasticsearch (v7.10) on Amazon Elastic Compute Cloud (Amazon EC2), using a keyword search algorithm to find relevant listings. The following diagram illustrates the data pipeline for indexing and querying in the foundational search architecture.

Figure 1: Foundational search architecture

The data indexing workflow consists of the following steps:

  1. As an OfferUp user creates or updates a listing, any new images are uploaded directly to Amazon Simple Storage Service (Amazon S3) using signed upload URLs.
  2. The OfferUp user submits the new or updated listing details (title, description, image IDs) to a posting microservice.
  3. The posting microservice then persists the changes using the listing writer microservice in Amazon DynamoDB.
  4. The listing writer microservice publishes listing change events to an Amazon Simple Notification Service (Amazon SNS) topic, which an Amazon Simple Queue Service (Amazon SQS) queue subscribes to.
  5. The listing indexer AWS Lambda function continuously polls the queue and processes incoming listing updates.
  6. The indexer retrieves the full listing details through the listing reader microservice from the DynamoDB table.
  7. Finally, the indexer updates or inserts these listing details into Elasticsearch.

This flow makes sure that new or updated listings are indexed and made available for search queries in Elasticsearch.
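To make this concrete, the following is a minimal sketch of what such an indexer Lambda function can look like, assuming an SQS event source mapping, a hypothetical listing reader HTTP endpoint (LISTING_READER_URL), and the Elasticsearch 7.x Python client; the index and field names are illustrative rather than OfferUp's actual implementation.

import json
import os

import requests  # hypothetical HTTP access to the listing reader microservice
from elasticsearch import Elasticsearch  # 7.x client, matching Elasticsearch 7.10

es = Elasticsearch(os.environ["ES_ENDPOINT"])
LISTING_READER_URL = os.environ["LISTING_READER_URL"]  # placeholder endpoint

def handler(event, context):
    # SQS delivers batches of SNS-wrapped listing change events to the function
    for record in event["Records"]:
        envelope = json.loads(record["body"])     # SNS envelope
        change = json.loads(envelope["Message"])  # listing change event (assumed shape)

        # Retrieve the full, current listing details from the reader microservice
        listing = requests.get(
            f"{LISTING_READER_URL}/listings/{change['listing_id']}", timeout=5
        ).json()

        # Upsert: index() overwrites any existing document with the same ID
        es.index(
            index="listings",
            id=change["listing_id"],
            body={
                "title": listing["title"],
                "description": listing["description"],
                "location": listing.get("location"),
            },
        )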

The data query workflow consists of the following steps:

  1. OfferUp users perform text searches, such as “summer shirt” or “running shoes”.
  2. The search microservice processes the query requests and retrieves relevant listings from Elasticsearch using keyword search (with BM25 as the ranking algorithm); a sketch of such a query follows these steps.
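The sketch below expresses such a query as a multi_match request over the listings index from the previous sketch; the field names and boost are illustrative, and BM25 is the default similarity in Elasticsearch 7.x, so it needs no extra configuration.

# Keyword search scored with BM25, reusing the es client from the indexing sketch
response = es.search(
    index="listings",
    body={
        "query": {
            "multi_match": {
                "query": "summer shirt",
                "fields": ["title^2", "description"],  # weight title matches higher
            }
        },
        "size": 10,
    },
)
results = [hit["_source"] for hit in response["hits"]["hits"]]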

Challenges with the foundational search architecture

OfferUp continuously strives to enhance user experience, focusing specifically on improving search relevance, which directly impacts Engagement with Seller Response (EWSR) and drives ad impressions. Although the foundational search architecture effectively surfaced a broad and diverse inventory, OfferUp encountered several limitations that prevented it from achieving optimal outcomes. These challenges include:

  • Context understanding – Keyword searches don’t account for the context in which a term is used. This can lead to irrelevant results if the same keyword has different meanings or uses. Keywords alone can’t discern user intent. For instance, “apple” could refer to the fruit, the technology company, or the brand name in different contexts.
  • Synonym and variation awareness – Keyword searches might miss results if the search terms vary or if synonyms are used. For example, searching for “car” might not return results for “sedan”. Similarly, searching for “iPhone 11” can return results for “iPhone 10” and “iPhone 12”.
  • Complex query management – The foundational search approach struggled with complex, multi-concept queries like “red running shoes,” often returning results that included shoes in other colors or footwear not designed for running.

Keyword search, which uses BM25 as a ranking algorithm, lacks the ability to understand semantic relationships between words, often missing semantically relevant results if they don’t contain exact keywords.

Solution overview

To improve search quality, OfferUp explored various software and hardware solutions focused on boosting search relevance while maintaining cost-efficiency. Ultimately, OfferUp selected Amazon Titan Multimodal Embeddings and Amazon OpenSearch Service, fully managed services that support a robust multimodal search solution capable of delivering high accuracy and fast responses across search and recommendation use cases. This choice also simplifies the deployment and operation of large-scale search capabilities on the OfferUp app, meeting its high throughput and latency requirements.

Amazon Titan Multimodal Embeddings G1 model

This model is pre-trained on large datasets, so you can use it as-is or customize it by fine-tuning with your own data for a particular task. It supports use cases such as searching images by text, by image, or by a combination of text and image for similarity and personalization. The model translates the input image or text into an embedding that captures its semantic meaning, placing images and text in the same semantic space. By comparing embeddings, the model produces more relevant and contextual responses than keyword matching alone.

The Amazon Titan Multimodal Embeddings G1 model offers the following configurations:

  • Model ID – amazon.titan-embed-image-v1
  • Max input text tokens – 256
  • Max input image size – 25 MB
  • Output vector size – 1,024 (default), 384, 256
  • Inference types – On-Demand, Provisioned Throughput
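As a minimal sketch of how these configurations come together, the following invokes the model through the Amazon Bedrock Runtime API with boto3; the query text and image path are illustrative.

import base64
import json

import boto3

bedrock = boto3.client("bedrock-runtime")

# Encode a listing image as base64 (the path is illustrative)
with open("listing.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# The model accepts text, an image, or both in a single request
request_body = json.dumps({
    "inputText": "gray faux leather sofa",
    "inputImage": image_b64,
    "embeddingConfig": {"outputEmbeddingLength": 1024},  # 256 and 384 are also supported
})

response = bedrock.invoke_model(
    modelId="amazon.titan-embed-image-v1",
    body=request_body,
)
embedding = json.loads(response["body"].read())["embedding"]  # list of 1,024 floats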

OpenSearch Service’s vector database capabilities

Vector databases enable the storage and indexing of vectors alongside metadata, facilitating low-latency queries to discover assets based on similarity. These databases typically use k-nearest neighbor (k-NN) indexes built with advanced algorithms such as Hierarchical Navigable Small Worlds (HNSW) and Inverted File (IVF) systems. Beyond basic k-NN functionality, vector databases offer a robust foundation for applications that require data management, fault tolerance, resource access controls, and an efficient query engine.

OpenSearch is a powerful, open-source suite that provides scalable and flexible tools for search, analytics, security monitoring, and observability—all under the Apache 2.0 license. With Amazon OpenSearch Service, you get a fully managed solution that makes it simple to deploy, scale, and operate OpenSearch in the AWS Cloud. By using Amazon OpenSearch Service as a vector database, you can combine traditional search, analytics, and vector search into one comprehensive solution. OpenSearch’s vector capabilities help accelerate AI application development, making it easier for teams to operationalize, manage, and integrate AI-driven assets.
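For example, a k-NN index that stores the 1,024-dimension Titan embeddings alongside listing metadata can be created as in the following sketch with the opensearch-py client; the domain endpoint, index name, and field names are placeholders.

from opensearchpy import OpenSearch

# Authentication details omitted for brevity; the endpoint is a placeholder
client = OpenSearch(
    hosts=[{"host": "my-domain.us-west-2.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

client.indices.create(
    index="listings-multimodal",
    body={
        "settings": {"index.knn": True},  # enable k-NN search on this index
        "mappings": {
            "properties": {
                "listing_embedding": {
                    "type": "knn_vector",
                    "dimension": 1024,  # must match the Titan output vector size
                    "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
                },
                "title": {"type": "text"},
                "description": {"type": "text"},
                "image_base64": {"type": "binary"},  # source image for the ingest pipeline
            }
        },
    },
)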

To further boost these capabilities, OpenSearch offers advanced features, such as:

  • Connector for Amazon Bedrock – You can seamlessly integrate Amazon Bedrock machine learning (ML) models with OpenSearch through built-in connectors, enabling direct access to advanced ML features.
  • Ingest Pipeline – With ingest pipelines, you can process, transform, and route data efficiently, maintaining smooth data flows and real-time accessibility for search.
  • Neural Search – Neural search transforms text and images into vectors and facilitates vector search both at ingestion time and at search time. This allows end-to-end configuration of ingest pipelines, search pipelines, and the necessary connectors without having to leave OpenSearch.

Transformed multimodal search architecture

OfferUp transformed its foundational search architecture with Amazon Titan Multimodal Embeddings on Amazon Bedrock and Amazon OpenSearch Service.

The following diagram illustrates the data pipeline for indexing and querying in the transformed multimodal search architecture:

Figure 2: Transformed multimodal search architecture

The data indexing workflow consists of the following steps:

  1. As an OfferUp user creates or updates a listing, any new images are uploaded directly to Amazon Simple Storage Service (Amazon S3) using signed upload URLs.
  2. The OfferUp user submits the new or updated listing details (title, description, image IDs) to a posting microservice.
  3. The posting microservice then persists the changes using the listing writer microservice in Amazon DynamoDB.
  4. The listing writer microservice publishes listing change events to an Amazon Simple Notification Service (Amazon SNS) topic, which an Amazon Simple Queue Service (Amazon SQS) queue subscribes to.
  5. The listing indexer AWS Lambda function continuously polls the queue and processes incoming listing updates.
  6. The indexer retrieves the full listing details through the listing reader microservice from the DynamoDB table.
  7. The Lambda indexer relies on the image microservice to retrieve listing images and encode them in base64 format.
  8. The indexer Lambda function sends inserts and updates with listing details and base64-encoded images to an Amazon OpenSearch Service domain.
  9. An OpenSearch ingest pipeline invokes the OpenSearch connector for Amazon Bedrock. The Titan Multimodal Embeddings model generates multi-dimensional vector embeddings for the listing image and description (a minimal pipeline sketch follows these steps).
  10. Listing data and embeddings are then stored in an Amazon OpenSearch index.
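That pipeline and its index-default wiring can be sketched as follows, reusing the client and index from the earlier mapping sketch and assuming a Bedrock connector for the Titan model has already been registered and deployed in OpenSearch (its model ID below is a placeholder). The text_image_embedding processor from the neural search plugin generates the embedding at ingestion time and writes it to the knn_vector field:

# Step 9: an ingest pipeline that embeds each document as it is indexed
client.ingest.put_pipeline(
    id="listings-embedding-pipeline",
    body={
        "description": "Embed listing description and image with Titan Multimodal Embeddings",
        "processors": [
            {
                "text_image_embedding": {
                    "model_id": "<bedrock-connector-model-id>",  # placeholder
                    "embedding": "listing_embedding",  # target knn_vector field
                    "field_map": {
                        "text": "description",    # source text field
                        "image": "image_base64",  # source base64 image field
                    },
                }
            }
        ],
    },
)

# Step 10: set the pipeline as the index default so every insert or update is embedded
client.indices.put_settings(
    index="listings-multimodal",
    body={"index.default_pipeline": "listings-embedding-pipeline"},
)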

The data query workflow consists of the following steps:

  1. OfferUp users perform both text and image searches, such as “gray faux leather sofa” or “running shoes”.
  2. The search microservice captures the query and forwards it to the Amazon OpenSearch Service domain, which invokes a neural search pipeline. The neural search pipeline forwards each search request to the same Amazon Titan Multimodal Embeddings model to convert the text and images into multi-dimensional vector embeddings.
  3. OpenSearch Service then uses the vectors to find the k-nearest neighbors (k-NN) to the vectorized search term and image to retrieve the relevant listings.

After extensive A/B testing with various k values, OfferUp found that a k value of 128 delivers the best search results while optimizing compute resources.
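A query of this shape is sketched below against the index and model from the earlier sketches; the neural clause converts the query text (and, optionally, a base64 query image) into an embedding with the same Titan model and then runs an approximate k-NN search with k=128.

response = client.search(
    index="listings-multimodal",
    body={
        "size": 10,  # listings returned to the user
        "query": {
            "neural": {
                "listing_embedding": {
                    "query_text": "gray faux leather sofa",
                    # "query_image": image_b64,  # optionally search by image as well
                    "model_id": "<bedrock-connector-model-id>",  # placeholder
                    "k": 128,  # number of approximate nearest neighbors to retrieve
                }
            }
        },
    },
)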

OfferUp multimodal search migration path

OfferUp adopted a three-step process to integrate multimodal search functionality into their foundational search architecture:

  1. Identify the Designated Market Areas (DMAs) – OfferUp categorizes its DMAs into high density and low density. High DMA density represents geographic locations with a higher user concentration, whereas low DMA density refers to locations with fewer users. OfferUp initially identified three business-critical high-density locations where multimodal search solutions demonstrated promising results in offline experiments, making them ideal candidates for multimodal search.
  2. Set up infrastructure and necessary configurations – This includes the following:
    • OpenSearch Service – The OpenSearch Service domain is deployed across three Availability Zones (AZs) to provide high availability. The cluster comprises three cluster manager nodes (m6g.xlarge.search instances) dedicated to managing cluster operations. For data handling, 24 data nodes (r6gd.2xlarge.search instances) are used, optimized for both storage and processing. The index is configured with 12 shards and three read replicas to enhance read performance. Each shard consumes around 11.6 GB of memory.
    • Embeddings model – The infrastructure enables access to Amazon Titan Multimodal Embeddings G1 in Amazon Bedrock.
  3. Use backfilling – Backfilling converts an image of every active listing into a vector using Amazon Titan Multimodal Embeddings and stores the vectors in OpenSearch Service (a backfill sketch follows this list). In the first phase, OfferUp backfilled 12 million active listings.
    OfferUp then rolled out multimodal search experiments in these three DMAs, where input token size varied between 3 and 15.
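A backfill job along these lines is sketched below, reusing the index and default pipeline from the earlier sketches and assuming a hypothetical fetch_image_base64 helper plus an iterable of active listings; with the default pipeline attached, embeddings are generated as each batch is indexed.

from opensearchpy.helpers import bulk

def backfill(listings):
    # One index action per active listing; the index's default pipeline
    # generates the embedding for each document as it arrives
    actions = (
        {
            "_op_type": "index",
            "_index": "listings-multimodal",
            "_id": listing["id"],
            "title": listing["title"],
            "description": listing["description"],
            "image_base64": fetch_image_base64(listing["image_ids"][0]),  # hypothetical helper
        }
        for listing in listings
    )
    # bulk() batches requests, which matters at the scale of 12 million listings
    bulk(client, actions, chunk_size=500)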

Benefits of multimodal search

In this section, we discuss the benefits of multimodal search.

Business metrics

OfferUp evaluated the impact of multimodal search through A/B testing to manage traffic control and user experiment variations. In this experiment, the control group used the existing keyword-based search, and the variant group experienced the new multimodal search functionality. The test included a substantial user base, allowing for a robust comparison.

The results of the multimodal search implementation were compelling:

  • User engagement increased by 2.2%, and EWSR saw a 3.8% improvement, highlighting enhanced relevance in search outcomes.
  • Search depth grew by 6.5%, as users explored results more thoroughly, indicating improved relevance beyond the top search items.
  • Importantly, the need for fanout searches (broader search queries) decreased by 54.2%, showing that more users found relevant local results quickly.
  • Ad impressions also rose by 0.91%, sustaining ad visibility while enhancing search performance.

Technical metrics

OfferUp conducted additional experiments to assess technical metrics, utilizing 6 months of production system data to examine relevance recall with a focus on the top k=10 most relevant results within high-density and low-density DMAs. By segmenting these locations, OfferUp gained insights into how variations in user distribution across different market densities affect system performance, allowing for a deeper understanding of relevance recall efficiency in diverse markets.

Relevance recall (RR) = sum(listing relevance scores) / number of retrieved listings

Listing relevance is labeled as 1 or 0 based on how well the retrieved listing correlates with the query:

  • 1: Listing is relevant
  • 0: Listing is not relevant
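As a minimal worked example of the formula, with binary labels over the top k=10 retrieved listings:

def relevance_recall(labels):
    # labels: 1 if the retrieved listing is relevant to the query, else 0
    return sum(labels) / len(labels)

# 8 of the top 10 retrieved listings are relevant -> RR = 0.8
print(relevance_recall([1, 1, 0, 1, 1, 1, 0, 1, 1, 1]))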

Conclusion

In this post, we demonstrated how OfferUp transformed its foundational search architecture using Amazon Titan Multimodal Embeddings and OpenSearch Service, significantly increasing user engagement, improving search quality, and offering users the ability to search with both text and images. OfferUp selected Amazon Titan Multimodal Embeddings and Amazon OpenSearch Service for their fully managed capabilities, enabling the development of a robust multimodal search solution with high accuracy and a faster time to market for search and recommendation use cases.

We are excited to share these insights with the broader community and support organizations embarking on their own multimodal search journeys or seeking to improve search precision. Based on our experience, we highly recommend using Amazon Bedrock and Amazon OpenSearch Service to achieve similar outcomes.

In the next part of the series, we discuss how to build a multimodal search solution with an Amazon SageMaker Jupyter notebook, the Amazon Titan Multimodal Embeddings model, and OpenSearch Service.


About the authors

Purna Sanyal is a GenAI Specialist Solutions Architect at AWS, helping customers solve their business problems through successful adoption of cloud-native architecture and digital transformation. He specializes in data strategy, machine learning, and generative AI. He is passionate about building large-scale ML systems that can serve global users with optimal performance.

Andrés Vélez Echeveri is a Staff Data Scientist and Machine Learning Engineer at OfferUp, focused on enhancing the search experience by optimizing retrieval and ranking components within a recommendation system. He has a specialization in machine learning and generative AI. He is passionate about creating scalable AI systems that drive innovation and user impact.

Sean Azlin is a Principal Software Development Engineer at OfferUp, focused on leveraging technology to accelerate innovation, decrease time-to-market, and empower others to succeed and thrive. He is highly experienced in building cloud-native distributed systems at any scale. He is particularly passionate about GenAI and its many potential applications.
