Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. Recent developments in generative AI models have further accelerated the need for ML adoption across industries. However, implementing security, data privacy, and governance controls remains a key challenge for customers when implementing ML workloads at scale. Addressing these challenges builds the framework and foundations for mitigating risk and enabling responsible use of ML-driven products. Although generative AI may need additional controls in place, such as removing toxicity and preventing jailbreaking and hallucinations, it shares the same foundational components for security and governance as traditional ML.

We hear from customers that building a customized Amazon SageMaker ML platform implementation that delivers scalable, reliable, secure, and governed ML environments for their lines of business (LOBs) or ML teams requires specialized knowledge and an investment of up to 12 months. If you lack a framework for governing the ML lifecycle at scale, you may run into challenges such as team-level resource isolation, scaling experimentation resources, operationalizing ML workflows, scaling model governance, and managing security and compliance of ML workloads.

Governing ML lifecycle at scale is a framework to help you build an ML platform with embedded security and governance controls based on industry best practices and enterprise standards. This framework addresses challenges by providing prescriptive guidance through a modular framework approach extending an AWS Control Tower multi-account AWS environment and the approach discussed in the post Setting up secure, well-governed machine learning environments on AWS.

It provides prescriptive guidance for the following ML platform functions:

  • Multi-account, security, and networking foundations – This function uses AWS Control Tower and well-architected principles for setting up and operating a multi-account environment and the associated security and networking services.
  • Data and governance foundations – This function uses a data mesh architecture for setting up and operating the data lake, central feature store, and data governance foundations to enable fine-grained data access.
  • ML platform shared and governance services – This function enables setting up and operating common services such as CI/CD, AWS Service Catalog for provisioning environments, and a central model registry for model promotion and lineage.
  • ML team environments – This function enables setting up and operating environments for ML teams to develop, test, and deploy their use cases, with security and governance controls embedded.
  • ML platform observability – This function helps with troubleshooting and identifying the root cause of problems in ML models through centralization of logs and by providing tools for log analysis and visualization. It also provides guidance for generating cost and usage reports for ML use cases.

Although this framework can provide benefits to all customers, it’s most beneficial for large, mature, regulated, or global enterprise customers that want to scale their ML strategies in a controlled, compliant, and coordinated way across the organization. It helps enable ML adoption while mitigating risks. This framework is useful for the following customers:

  • Large enterprise customers that have many LOBs or departments interested in using ML. This framework allows different teams to build and deploy ML models independently while providing central governance.
  • Enterprise customers with a moderate to high maturity in ML. They have already deployed some initial ML models and are looking to scale their ML efforts. This framework can help accelerate ML adoption across the organization. These companies also recognize the need for governance to manage things like access control, data usage, model performance, and unfair bias.
  • Companies in regulated industries such as financial services, healthcare, chemicals, and the public sector. These companies need strong governance and auditability for any ML models used in their business processes. Adopting this framework can help facilitate compliance while still allowing for local model development.
  • Global organizations that need to balance centralized and local control. This framework’s federated approach allows the central platform engineering team to set some high-level policies and standards, but also gives LOB teams flexibility to adapt based on local needs.

In the first part of this series, we walk through the reference architecture for setting up the ML platform. In a later post, we will provide prescriptive guidance for how to implement the various modules in the reference architecture in your organization.

The capabilities of the ML platform are grouped into four categories, as shown in the following figure. These capabilities form the foundation of the reference architecture discussed later in this post:

  • Build ML foundations
  • Scale ML operations
  • Observable ML
  • Secure ML

Solution overview

The framework for governing the ML lifecycle at scale enables organizations to embed security and governance controls throughout the ML lifecycle, which in turn helps organizations reduce risk and accelerate infusing ML into their products and services. The framework helps optimize the setup and governance of secure, scalable, and reliable ML environments that can support an increasing number of models and projects. The framework enables the following features:

  • Account and infrastructure provisioning with organization policy compliant infrastructure resources
  • Self-service deployment of data science environments and end-to-end ML operations (MLOps) templates for ML use cases
  • LOB-level or team-level isolation of resources for security and privacy compliance
  • Governed access to production-grade data for experimentation and production-ready workflows
  • Management and governance for code repositories, code pipelines, deployed models, and data features
  • A model registry and feature store (local and central components) for improving governance
  • Security and governance controls for the end-to-end model development and deployment process

In this section, we provide an overview of prescriptive guidance to help you build this ML platform on AWS with embedded security and governance controls.

The functional architecture associated with the ML platform is shown in the following diagram. The architecture maps the different capabilities of the ML platform to AWS accounts.

The functional architecture with different capabilities is implemented using a number of AWS services, including AWS Organizations, SageMaker, AWS DevOps services, and a data lake. The reference architecture for the ML platform with various AWS services is shown in the following diagram.

This framework considers multiple personas and services to govern the ML lifecycle at scale. We recommend the following steps to organize your teams and services:

  1. Using AWS Control Tower and automation tooling, your cloud administrator sets up the multi-account foundations such as Organizations and AWS IAM Identity Center (successor to AWS Single Sign-On) and security and governance services such as AWS Key Management Service (AWS KMS) and Service Catalog. In addition, the administrator sets up a variety of organizational units (OUs) and initial accounts to support your ML and analytics workflows.
  2. Data lake administrators set up your data lake and data catalog, and set up the central feature store working with the ML platform admin.
  3. The ML platform admin provisions ML shared services such as AWS CodeCommit, AWS CodePipeline, Amazon Elastic Container Registry (Amazon ECR), a central model registry, SageMaker Model Cards, SageMaker Model Dashboard, and Service Catalog products for ML teams.
  4. The ML team lead federates via IAM Identity Center, uses Service Catalog products, and provisions resources in the ML team’s development environment.
  5. Data scientists from ML teams across different business units federate into their team’s development environment to build the model pipeline.
  6. Data scientists search and pull features from the central feature store catalog, build models through experiments, and select the best model for promotion.
  7. Data scientists create and share new features into the central feature store catalog for reuse.
  8. An ML engineer deploys the model pipeline into the ML team test environment using a shared services CI/CD process.
  9. After stakeholder validation, the ML model is deployed to the team’s production environment.
  10. Security and governance controls are embedded into every layer of this architecture using services such as AWS Security Hub, Amazon GuardDuty, Amazon Macie, and more.
  11. Security controls are centrally managed from the security tooling account using Security Hub.
  12. ML platform governance capabilities such as SageMaker Model Cards and SageMaker Model Dashboard are centrally managed from the governance services account.
  13. Amazon CloudWatch and AWS CloudTrail logs from each member account are made accessible centrally from an observability account using AWS native services.

Next, we dive deep into the modules of the reference architecture for this framework.

Reference architecture modules

The reference architecture comprises eight modules, each designed to solve a specific set of problems. Collectively, these modules address governance across various dimensions, such as infrastructure, data, model, and cost. Each module offers a distinct set of functions and interoperates with other modules to provide an integrated end-to-end ML platform with embedded security and governance controls. In this section, we present a short summary of each module’s capabilities.

Multi-account foundations

This module helps cloud administrators build an AWS Control Tower landing zone as a foundational framework. This includes building a multi-account structure, authentication and authorization via IAM Identity Center, a network hub-and-spoke design, centralized logging services, and new AWS member accounts with standardized security and governance baselines.

In addition, this module gives best practice guidance on OU and account structures that are appropriate for supporting your ML and analytics workflows. Cloud administrators will understand the purpose of the required accounts and OUs, how to deploy them, and key security and compliance services they should use to centrally govern their ML and analytics workloads.

A framework for vending new accounts is also covered, which uses automation for baselining new accounts when they are provisioned. By having an automated account provisioning process set up, cloud administrators can provide ML and analytics teams the accounts they need to perform their work more quickly, without sacrificing on a strong foundation for governance.

Data lake foundations

This module helps data lake admins set up a data lake to ingest data, curate datasets, and use the AWS Lake Formation governance model for managing fine-grained data access across accounts and users using a centralized data catalog, data access policies, and tag-based access controls. You can start small with one account for your data platform foundations for a proof of concept or a few small workloads. For medium-to-large-scale production workload implementation, we recommend adopting a multi-account strategy. In such a setting, LOBs can assume the role of data producers and data consumers using different AWS accounts, and the data lake governance is operated from a central shared AWS account. The data producer collects, processes, and stores data from their data domain, in addition to monitoring and ensuring the quality of their data assets. Data consumers consume the data from the data producer after the centralized catalog shares it using Lake Formation. The centralized catalog stores and manages the shared data catalog for the data producer accounts.
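For example, once the centralized governance account has tagged databases and tables, it can grant a consumer team's role access through Lake Formation tag-based access control. The following is a minimal boto3 sketch; the role ARN, tag key, and tag values are hypothetical and would reflect your own data domains.

import boto3

lakeformation = boto3.client("lakeformation")

# Grant SELECT and DESCRIBE on all tables carrying the LF-tag domain=sales to a
# consumer team's role, so access is governed by tags rather than per-table grants.
# The role ARN and tag values below are placeholders.
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/MLTeamDataAccessRole"
    },
    Resource={
        "LFTagPolicy": {
            "ResourceType": "TABLE",
            "Expression": [{"TagKey": "domain", "TagValues": ["sales"]}],
        }
    },
    Permissions=["SELECT", "DESCRIBE"],
)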

ML platform services

This module helps the ML platform engineering team set up shared services that are used by the data science teams on their team accounts. The services include a Service Catalog portfolio with products for SageMaker domain deployment, SageMaker domain user profile deployment, and data science model templates for model building and deployment. This module also provides a centralized model registry, model cards, a model dashboard, and the CI/CD pipelines used to orchestrate and automate model development and deployment workflows.

In addition, this module details how to implement the controls and governance required to enable persona-based self-service capabilities, allowing data science teams to independently deploy their required cloud infrastructure and ML templates.
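As an illustration of this self-service pattern, a team could launch one of the platform team's Service Catalog products programmatically instead of through the console. The following is a sketch only; the product name, version label, and parameters are assumptions that depend on the products your ML platform engineering team actually publishes.

import boto3

servicecatalog = boto3.client("servicecatalog")

# Provision a hypothetical "SageMaker domain user profile" product published by the
# ML platform engineering team. Product, version, and parameter names are placeholders.
response = servicecatalog.provision_product(
    ProductName="sagemaker-domain-user-profile",
    ProvisioningArtifactName="v1.0",
    ProvisionedProductName="ml-team-a-data-scientist-1",
    ProvisioningParameters=[
        {"Key": "DomainId", "Value": "d-xxxxxxxxxxxx"},
        {"Key": "UserProfileName", "Value": "data-scientist-1"},
    ],
)
print(response["RecordDetail"]["Status"])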

ML use case development

This module helps LOBs and data scientists access their team’s SageMaker domain in a development environment and instantiate a model building template to develop their models. In this module, data scientists work on a dev account instance of the template to interact with the data available on the centralized data lake, reuse and share features from a central feature store, create and run ML experiments, build and test their ML workflows, and register their models to a dev account model registry in their development environments.

Capabilities such as experiment tracking, model explainability reports, data and model bias monitoring, and a model registry are also implemented in the templates, allowing data scientists to rapidly adapt these solutions to their models.

ML operations

This module helps LOBs and ML engineers work on their dev instances of the model deployment template. After the candidate model is registered and approved, they set up CI/CD pipelines and run ML workflows in the team’s test environment, which registers the model into the central model registry running in a platform shared services account. When a model is approved in the central model registry, this triggers a CI/CD pipeline to deploy the model into the team’s production environment.
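One common way to wire this promotion step is an Amazon EventBridge rule that reacts when a model package is approved and starts the deployment pipeline. The following sketch assumes a hypothetical CodePipeline pipeline and IAM role; your CI/CD tooling and naming will differ.

import json
import boto3

events = boto3.client("events")

# Match a SageMaker model package transitioning to "Approved" in the central registry.
events.put_rule(
    Name="central-registry-model-approved",
    EventPattern=json.dumps({
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {"ModelApprovalStatus": ["Approved"]},
    }),
    State="ENABLED",
)

# Start the (hypothetical) deployment pipeline when the rule matches. EventBridge
# needs a role that is allowed to call codepipeline:StartPipelineExecution.
events.put_targets(
    Rule="central-registry-model-approved",
    Targets=[{
        "Id": "start-deploy-pipeline",
        "Arn": "arn:aws:codepipeline:us-east-1:111122223333:ml-team-deploy",
        "RoleArn": "arn:aws:iam::111122223333:role/EventBridgeStartPipelineRole",
    }],
)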

Centralized feature store

After the first models are deployed to production and multiple use cases start to share features created from the same data, a feature store becomes essential to ensure collaboration across use cases and reduce duplicate work. This module helps the ML platform engineering team set up a centralized feature store to provide storage and governance for ML features created by the ML use cases, enabling feature reuse across projects.
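For instance, a use case running in a team account can look up shared features from the central online store at training or inference time. The following minimal sketch assumes a hypothetical feature group name and record identifier.

import boto3

featurestore_runtime = boto3.client("sagemaker-featurestore-runtime")

# Retrieve the latest feature values for one entity from the central online store.
# The feature group name and record identifier are placeholders.
record = featurestore_runtime.get_record(
    FeatureGroupName="customer-profile-features",
    RecordIdentifierValueAsString="customer-1234",
)
features = {f["FeatureName"]: f["ValueAsString"] for f in record.get("Record", [])}
print(features)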

Logging and observability

This module helps LOBs and ML practitioners gain visibility into the state of ML workloads across ML environments through centralization of log activity such as CloudTrail, CloudWatch, VPC flow logs, and ML workload logs. Teams can filter, query, and visualize logs for analysis, which can help enhance security posture as well.
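For example, once logs are aggregated in the observability account, teams can run CloudWatch Logs Insights queries against them. The following sketch assumes an example SageMaker endpoint log group; the log group name and query are illustrative.

import time
import boto3

logs = boto3.client("logs")

# Search the last hour of a (hypothetical) endpoint log group for error messages.
query_id = logs.start_query(
    logGroupName="/aws/sagemaker/Endpoints/my-team-endpoint",
    startTime=int(time.time()) - 3600,
    endTime=int(time.time()),
    queryString="fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 20",
)["queryId"]

# Poll until the query finishes, then print the matching log lines.
while True:
    results = logs.get_query_results(queryId=query_id)
    if results["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)
print(results["results"])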

Cost and reporting

This module helps various stakeholders (cloud admin, platform admin, cloud business office) to generate reports and dashboards to break down costs at ML user, ML team, and ML product levels, and track usage such as number of users, instance types, and endpoints.
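One way to produce such a breakdown is to activate a cost allocation tag (for example, a hypothetical ml-team tag applied to SageMaker resources) and query AWS Cost Explorer grouped by that tag. The following is a sketch under that assumption.

import boto3

ce = boto3.client("ce")

# Monthly SageMaker spend grouped by a (hypothetical) ml-team cost allocation tag.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-09-01", "End": "2023-10-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    Filter={"Dimensions": {"Key": "SERVICE", "Values": ["Amazon SageMaker"]}},
    GroupBy=[{"Type": "TAG", "Key": "ml-team"}],
)
for group in response["ResultsByTime"][0]["Groups"]:
    print(group["Keys"], group["Metrics"]["UnblendedCost"]["Amount"])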

Customers have asked us to provide guidance on how many accounts to create and how to structure those accounts. In the next section, we provide a reference account structure that you can modify to suit your needs according to your enterprise governance requirements.

Reference account structure

In this section, we discuss our recommendation for organizing your account structure. We share a baseline reference account structure; however, we recommend ML and data admins work closely with their cloud admin to customize this account structure based on their organization controls.

We recommend organizing accounts by OU for security, infrastructure, workloads, and deployments. Furthermore, within each OU, organize accounts into non-production and production OUs, because the accounts and workloads deployed under them require different controls. Next, we briefly discuss those OUs.

Security OU

The accounts in this OU are managed by the organization’s cloud admin or security team for monitoring, identifying, protecting, detecting, and responding to security events.

Infrastructure OU

The accounts in this OU are managed by the organization’s cloud admin or network team for managing enterprise-level infrastructure shared resources and networks.

We recommend having the following accounts under the infrastructure OU:

  • Network – Set up a centralized networking infrastructure such as AWS Transit Gateway
  • Shared services – Set up centralized AD services and VPC endpoints

Workloads OU

The accounts in this OU are managed by the organization’s platform team admins. If you need different controls implemented for each platform team, you can nest additional levels of OUs for that purpose, such as an ML workloads OU, data workloads OU, and so on.

We recommend the following accounts under the workloads OU:

  • Team-level ML dev, test, and prod accounts – Set this up based on your workload isolation requirements
  • Data lake accounts – Partition accounts by your data domain
  • Central data governance account – Centralize your data access policies
  • Central feature store account – Centralize features for sharing across teams

Deployments OU

The accounts in this OU are managed by the organization’s platform team admins for deploying workloads and observability.

We recommend the following accounts under the deployments OU because the ML platform team can set up different sets of controls at this OU level to manage and govern deployments:

  • ML shared services accounts for test and prod – Hosts platform shared services CI/CD and model registry
  • ML observability accounts for test and prod – Hosts CloudWatch logs, CloudTrail logs, and other logs as needed

Next, we briefly discuss organization controls that need to be considered for embedding into member accounts for monitoring the infrastructure resources.

AWS environment controls

A control is a high-level rule, expressed in plain language, that provides ongoing governance for your overall AWS environment. In this framework, we use AWS Control Tower to implement the following controls that help you govern your resources and monitor compliance across groups of AWS accounts:

  • Preventive controls – A preventive control ensures that your accounts maintain compliance by disallowing actions that lead to policy violations; it is implemented using a service control policy (SCP). For example, you can set a preventive control that ensures CloudTrail is not deleted or stopped in AWS accounts or Regions (see the sketch after this list).
  • Detective controls – A detective control detects noncompliance of resources within your accounts, such as policy violations, provides alerts through the dashboard, and is implemented using AWS Config rules. For example, you can create a detective control that detects whether public read access is enabled on the Amazon Simple Storage Service (Amazon S3) buckets in the log archive shared account.
  • Proactive controls – A proactive control scans your resources before they are provisioned to make sure they comply with that control; it is implemented using AWS CloudFormation hooks. Resources that aren’t compliant will not be provisioned. For example, you can set a proactive control that checks that direct internet access is not allowed for a SageMaker notebook instance.
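To make the preventive control concrete, the following is a minimal sketch of the kind of SCP behind the CloudTrail example. AWS Control Tower creates and attaches such policies for you when you enable the control; creating one manually with boto3 is shown purely for illustration, and the OU ID is a placeholder.

import json
import boto3

organizations = boto3.client("organizations")

# A minimal service control policy that denies stopping or deleting CloudTrail trails.
scp_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
        "Resource": "*",
    }],
}

policy = organizations.create_policy(
    Content=json.dumps(scp_document),
    Description="Prevent CloudTrail from being stopped or deleted",
    Name="deny-cloudtrail-changes",
    Type="SERVICE_CONTROL_POLICY",
)

# Attach the policy to the OU that contains the ML workload accounts (placeholder OU ID).
organizations.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="ou-xxxx-xxxxxxxx",
)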

Interactions between ML platform services, ML use cases, and ML operations

Different personas, such as the head of data science (lead data scientist), data scientist, and ML engineer, operate modules 2–6 as shown in the following diagram for different stages of ML platform services, ML use case development, and ML operations along with data lake foundations and the central feature store.

The following list summarizes the ops flow activities and setup flow steps for each persona. When a persona initiates an ML activity as part of the ops flow, the services run as described in the corresponding setup flow steps.

Lead Data Scientist or ML Team Lead

  • Ops flow activity 1 – Uses Service Catalog in the ML platform services account and deploys the following:
    • ML infrastructure
    • SageMaker projects
    • SageMaker model registry
  • Setup flow step 1-A – Sets up the dev, test, and prod environments for LOBs and sets up SageMaker Studio in the ML platform services account
  • Setup flow step 1-B – Sets up SageMaker Studio with the required configuration

Data Scientist

  • Ops flow activity 2 – Conducts and tracks ML experiments in SageMaker notebooks
  • Setup flow step 2-A – Uses data from Lake Formation and saves features in the central feature store
  • Ops flow activity 3 – Automates successful ML experiments with SageMaker projects and pipelines
  • Setup flow step 3-A – Initiates SageMaker pipelines (preprocess, train, evaluate) in the dev account and initiates the build CI/CD process with CodePipeline in the dev account
  • Setup flow step 3-B – After the SageMaker pipelines run, saves the model in the local (dev) model registry

Lead Data Scientist or ML Team Lead

  • Ops flow activity 4 – Approves the model in the local (dev) model registry
  • Setup flow step 4-A – Writes the model metadata and model package from the local (dev) model registry to the central model registry
  • Ops flow activity 5 – Approves the model in the central model registry
  • Setup flow step 5-A – Initiates the deployment CI/CD process to create SageMaker endpoints in the test environment
  • Setup flow step 5-B – Writes the model information and metadata from the local (dev) account to the ML governance module (model card, model dashboard) in the ML platform services account

ML Engineer

  • Ops flow activity 6 – Tests and monitors the SageMaker endpoint in the test environment after CI/CD
  • Ops flow activity 7 – Approves deployment for SageMaker endpoints in the prod environment
  • Setup flow step 7-A – Initiates the deployment CI/CD process to create SageMaker endpoints in the prod environment
  • Ops flow activity 8 – Tests and monitors the SageMaker endpoint in the prod environment after CI/CD

Personas and interactions with different modules of the ML platform

Each module caters to particular target personas within specific divisions that utilize the module most often, granting them primary access. Secondary access is then permitted to other divisions that require occasional use of the modules. The modules are tailored towards the needs of particular job roles or personas to optimize functionality.

We discuss the following teams:

  • Central cloud engineering – This team operates at the enterprise cloud level across all workloads for setting up common cloud infrastructure services, such as setting up enterprise-level networking, identity, permissions, and account management
  • Data platform engineering – This team manages enterprise data lakes, data collection, data curation, and data governance
  • ML platform engineering – This team operates at the ML platform level across LOBs to provide shared ML infrastructure services such as ML infrastructure provisioning, experiment tracking, model governance, deployment, and observability

The following list details which divisions have primary and secondary access for each module according to the module’s target personas, along with the typical number of accounts involved.

  • Module 1: Multi-account foundations – Primary access: central cloud engineering; Secondary access: individual LOBs; Target personas: cloud admin, cloud engineers; Number of accounts: few
  • Module 2: Data lake foundations – Primary access: central cloud or data platform engineering; Secondary access: individual LOBs; Target personas: data lake admin, data engineers; Number of accounts: multiple
  • Module 3: ML platform services – Primary access: central cloud or ML platform engineering; Secondary access: individual LOBs; Target personas: ML platform admin, ML team lead, ML engineers, ML governance lead; Number of accounts: one
  • Module 4: ML use case development – Primary access: individual LOBs; Secondary access: central cloud or ML platform engineering; Target personas: data scientists, data engineers, ML team lead, ML engineers; Number of accounts: multiple
  • Module 5: ML operations – Primary access: central cloud or ML engineering; Secondary access: individual LOBs; Target personas: ML engineers, ML team leads, data scientists; Number of accounts: multiple
  • Module 6: Centralized feature store – Primary access: central cloud or data engineering; Secondary access: individual LOBs; Target personas: data engineers, data scientists; Number of accounts: one
  • Module 7: Logging and observability – Primary access: central cloud engineering; Secondary access: individual LOBs; Target personas: cloud admin, IT auditors; Number of accounts: one
  • Module 8: Cost and reporting – Primary access: individual LOBs; Secondary access: central platform engineering; Target personas: LOB executives, ML managers; Number of accounts: one

Conclusion

In this post, we introduced a framework for governing the ML lifecycle at scale that helps you implement well-architected ML workloads with embedded security and governance controls. We discussed how this framework takes a holistic approach to building an ML platform, considering data governance, model governance, and enterprise-level controls. We encourage you to experiment with the framework and concepts introduced in this post and share your feedback.


About the authors

Ram Vittal is a Principal ML Solutions Architect at AWS. He has over 3 decades of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure, scalable, reliable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks with his three-year-old sheep-a-doodle!

Sovik Kumar Nath is an AI/ML solution architect with AWS. He has extensive experience designing end-to-end machine learning and business analytics solutions in finance, operations, marketing, healthcare, supply chain management, and IoT. Sovik has published articles and holds a patent in ML model monitoring. He holds double master’s degrees from the University of South Florida and the University of Fribourg, Switzerland, and a bachelor’s degree from the Indian Institute of Technology, Kharagpur. Outside of work, Sovik enjoys traveling, taking ferry rides, and watching movies.

Maira Ladeira Tanke is a Senior Data Specialist at AWS. As a technical lead, she helps customers accelerate their achievement of business value through emerging technology and innovative solutions. Maira has been with AWS since January 2020. Prior to that, she worked as a data scientist in multiple industries focusing on achieving business value from data. In her free time, Maira enjoys traveling and spending time with her family someplace warm.

Ryan Lempka is a Senior Solutions Architect at Amazon Web Services, where he helps his customers work backwards from business objectives to develop solutions on AWS. He has deep experience in business strategy, IT systems management, and data science. Ryan is dedicated to being a lifelong learner, and enjoys challenging himself every day to learn something new.

Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data platform transformations across industry verticals. His core areas of expertise include technology strategy, data analytics, and data science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing tabla.


How Meesho built a generalized feed ranker using Amazon SageMaker inference

This is a guest post co-written by Rama Badrinath, Divay Jindal and Utkarsh Agrawal at Meesho.


Meesho is India’s fastest growing ecommerce company with a mission to democratize internet commerce for everyone and make it accessible to the next billion users of India. Meesho was founded in 2015 and today focuses on buyers and sellers across India. The Meesho marketplace provides micro, small, and medium businesses and individual entrepreneurs access to millions of customers, a selection from over 30 categories and more than 900 sub-categories, pan-India logistics, payment services, and customer support capabilities to efficiently run their businesses on the Meesho ecosystem.

As an ecommerce platform, Meesho aims to improve the user experience by offering personalized and relevant product recommendations. We wanted to create a generalized feed ranker that considers individual preferences and historical behavior to effectively display products in each user’s feed. Through this, we wanted to boost user engagement, conversion rates, and overall business growth by tailoring the shopping experience to each customer’s unique requirements and providing the best value for their money.

We used AWS machine learning (ML) services like Amazon SageMaker to develop a powerful generalized feed ranker (GFR). In this post, we discuss the key components of the GFR and how this ML-driven solution streamlined the ML lifecycle, ensuring efficient infra management, scalability, and reliability within the ecosystem.

Solution overview

To personalize users’ feeds, we analyzed extensive historical data, extracting insights into features that include browsing patterns and interests. These valuable features are used to construct ranking models. The GFR personalizes each user’s feed in real time, considering various factors like geography, prior shopping pattern, acquisition channels, and more. Several interaction-based features are also used to capture the affinity of the user towards an item, item category, or item properties like price, rating, or discount.

Several user-agnostic features and item-level scores are used as well, such as an item popularity score and an item propensity-to-buy score. All these features are inputs to a Learning to Rank (LTR) model that emits the probability of click (PCTR) and probability of purchase (PCVR).

For diverse and relevant recommendations, the GFR sources candidate products from multiple channels, including exploit (known user preferences), explore (novel and potentially interesting products), popularity (trending items), and recent (latest additions).

The following diagram illustrates the GFR architecture.

The architecture can be divided into two different components: model training and model deployment. In the following sections, we discuss each component and the AWS services used in more detail.

Model training

Meesho used Amazon EMR with Apache Spark to process hundreds of millions of data points, depending on the model’s complexity. One of the major challenges was to run distributed training at scale. We used Dask—a distributed data science computing framework that natively integrates with Python libraries—on Amazon EMR to scale out the training jobs across the cluster. The distributed training of the model helped cut down training time from days to hours and allowed us to schedule Spark jobs efficiently and cost-effectively. We used an offline feature store to maintain a historical record of all feature values that will be used for model training. Model artifacts from training are stored in Amazon Simple Storage Service (Amazon S3), providing convenient access and version management.
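The following is a simplified sketch of the kind of distributed feature computation Dask enables on such a cluster; the scheduler address, S3 paths, and column names are illustrative, not Meesho's actual pipeline.

import dask.dataframe as dd
from dask.distributed import Client

# Connect to the Dask scheduler running on the cluster (address is an example).
client = Client("tcp://scheduler-host:8786")

# Read interaction logs from Amazon S3 and compute per-user, per-category
# engagement features in parallel across the workers.
interactions = dd.read_parquet("s3://example-bucket/interactions/")
features = (
    interactions
    .groupby(["user_id", "item_category"])
    .agg({"clicked": "mean", "purchased": "mean"})
    .reset_index()
)
features.to_parquet("s3://example-bucket/features/", write_index=False)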

We used a time sampling strategy to create training, validation, and test datasets for model training. We kept track of various metrics to evaluate the performance of the model—the most important ones being area under the ROC curve and area under the precision recall curve. We also tracked calibration of the model to prevent overconfidence and underconfidence issues while predicting the probability scores.

Model deployment

Meesho used SageMaker inference endpoints with auto scaling enabled for deploying the trained model. SageMaker offered ease of deployment with support for various ML frameworks, allowing models to be served with low latency. Although AWS offers standard inference images suitable for most use cases, we built a custom inference image that caters specifically to our needs and pushed it to Amazon Elastic Container Registry (Amazon ECR).
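For reference, the following sketch shows how auto scaling can be attached to a SageMaker endpoint variant with boto3; the endpoint name, variant name, and thresholds are examples rather than Meesho's actual configuration.

import boto3

autoscaling = boto3.client("application-autoscaling")

# Scale the endpoint variant's instance count based on invocations per instance.
# Endpoint and variant names are placeholders.
resource_id = "endpoint/feed-ranker/variant/AllTraffic"

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=2,
    MaxCapacity=10,
)

autoscaling.put_scaling_policy(
    PolicyName="feed-ranker-invocations-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)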

We built an in-house A/B testing platform that facilitated live monitoring of A/B metrics, enabling us to make data-driven decisions promptly. We also used the A/B testing feature of SageMaker to deploy multiple production variants on an endpoint. Through A/B experiments, we observed an approximate 3.5% enhancement in the platform’s conversion rate and an increase in app open frequency of the users, highlighting the effectiveness of this approach.
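The following sketch shows how two production variants can share traffic on one SageMaker endpoint for an A/B test; the model names, instance types, and traffic weights are illustrative.

import boto3

sagemaker = boto3.client("sagemaker")

# Split traffic 90/10 between the current model and a challenger (names are placeholders).
sagemaker.create_endpoint_config(
    EndpointConfigName="feed-ranker-ab-config",
    ProductionVariants=[
        {
            "VariantName": "champion",
            "ModelName": "feed-ranker-v1",
            "InstanceType": "ml.c5.xlarge",
            "InitialInstanceCount": 2,
            "InitialVariantWeight": 0.9,
        },
        {
            "VariantName": "challenger",
            "ModelName": "feed-ranker-v2",
            "InstanceType": "ml.c5.xlarge",
            "InitialInstanceCount": 2,
            "InitialVariantWeight": 0.1,
        },
    ],
)

# Point the existing endpoint at the new configuration.
sagemaker.update_endpoint(
    EndpointName="feed-ranker",
    EndpointConfigName="feed-ranker-ab-config",
)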

We tracked various types of drift, such as feature drift and prior drift, multiple times a day after model deployment to prevent the model performance from deteriorating.

We used AWS Lambda to set up various automations and triggers that are required during model retraining, endpoint updates, and monitoring processes.

The recommendation workflow after model deployment works as follows (as noted in the solution architecture diagram); a simplified code sketch of steps 2–4 follows the list:

  1. The input requests with user context and interaction features are received at the application layer from Meesho’s mobile and web app.
  2. The application layer fetches additional features like historical data of the user from the online feature store and appends these to the input requests.
  3. The appended features are sent to the real-time endpoints for generating recommendations.
  4. The model predictions are sent back to the application layer.
  5. The application layer uses these predictions to personalize the user feeds on the mobile or web application.
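The following is a minimal sketch of steps 2–4 of this workflow; the feature group and endpoint names are placeholders, and a production application layer would add batching, caching, and error handling.

import json
import boto3

featurestore_runtime = boto3.client("sagemaker-featurestore-runtime")
sagemaker_runtime = boto3.client("sagemaker-runtime")

def rank_feed(user_id, request_features):
    # Step 2: fetch the user's historical features from the online feature store
    # (the feature group name is a placeholder).
    record = featurestore_runtime.get_record(
        FeatureGroupName="user-interaction-features",
        RecordIdentifierValueAsString=user_id,
    )
    historical = {f["FeatureName"]: f["ValueAsString"] for f in record.get("Record", [])}

    # Step 3: send the combined features to the real-time endpoint (placeholder name).
    payload = {**request_features, **historical}
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName="feed-ranker",
        ContentType="application/json",
        Body=json.dumps(payload),
    )

    # Step 4: return the model predictions to the application layer.
    return json.loads(response["Body"].read())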

Conclusion

Meesho successfully implemented a generalized feed ranker using SageMaker, which resulted in highly personalized product recommendations for each customer based on their preferences and historical behavior. This approach significantly improved user engagement and led to higher conversion rates, contributing to the company’s overall business growth. As a result of using AWS services, the runtime of our ML lifecycle was reduced significantly, from months to just weeks, leading to increased efficiency and productivity for our team.

With this advanced feed ranker, Meesho continues to deliver tailored shopping experiences, adding more value to its customers and fulfilling its mission to democratize ecommerce for everyone.

The team is grateful for the continuous support and guidance from Ravindra Yadav, Director of Data Science at Meesho, and Debdoot Mukherjee, Head of AI at Meesho, who played a key role in enabling this success.

To learn more about SageMaker, refer to the Amazon SageMaker Developer Guide.


About the Authors

Utkarsh Agrawal is currently working as a Senior Data Scientist at Meesho. He previously worked with Fractal Analytics and Trell on various domains, including recommender systems, time series, NLP, and more. He holds a master’s degree in Mathematics and Computing from Indian Institute of Technology Kharagpur (IIT), India.

Rama Badrinath is currently working as a Principal Data Scientist at Meesho. He previously worked with Microsoft and ShareChat on various domains, including recommender systems, image AI, NLP, and more. He holds a master’s degree in Machine Learning from Indian Institute of Science (IISc), India. He has also published papers in renowned conferences such as KDD and ECIR.

Divay Jindal is currently working as a Lead Data Scientist at Meesho. He previously worked with Bookmyshow on various domains, including recommender systems and dynamic pricing.

Venugopal Pai is a Solutions Architect at AWS. He lives in Bengaluru, India, and helps digital-native customers scale and optimize their applications on AWS.


Announcing Rekognition Custom Moderation: Enhance accuracy of pre-trained Rekognition moderation models with your data

Companies increasingly rely on user-generated images and videos for engagement. From ecommerce platforms encouraging customers to share product images to social media companies promoting user-generated videos and images, using user content for engagement is a powerful strategy. However, it can be challenging to ensure that this user-generated content is consistent with your policies and fosters a safe online community for your users.

Many companies currently depend on human moderators or respond reactively to user complaints to manage inappropriate user-generated content. These approaches don’t scale to effectively moderate millions of images and videos at sufficient quality or speed, which leads to a poor user experience, high costs to achieve scale, or even potential harm to brand reputation.

In this post, we discuss how to use the Custom Moderation feature in Amazon Rekognition to enhance the accuracy of your pre-trained content moderation API.

Content moderation in Amazon Rekognition

Amazon Rekognition is a managed artificial intelligence (AI) service that offers pre-trained and customizable computer vision capabilities to extract information and insights from images and videos. One such capability is Amazon Rekognition Content Moderation, which detects inappropriate or unwanted content in images and videos. Amazon Rekognition uses a hierarchical taxonomy to label inappropriate or unwanted content with 10 top-level moderation categories (such as violence, explicit, alcohol, or drugs) and 35 second-level categories. Customers across industries such as ecommerce, social media, and gaming can use content moderation in Amazon Rekognition to protect their brand reputation and foster safe user communities.

By using Amazon Rekognition for image and video moderation, human moderators have to review a much smaller set of content, typically 1–5% of the total volume, already flagged by the content moderation model. This enables companies to focus on more valuable activities and still achieve comprehensive moderation coverage at a fraction of their existing cost.

Introducing Amazon Rekognition Custom Moderation

You can now enhance the accuracy of the Rekognition moderation model for your business-specific data with the Custom Moderation feature. You can train a custom adapter with as few as 20 annotated images in less than 1 hour. These adapters extend the capabilities of the moderation model to detect the types of images used for training with higher accuracy. For this post, we use a sample dataset containing both safe images and images with alcoholic beverages (considered unsafe) to enhance the accuracy of the alcohol moderation label.

The unique ID of the trained adapter can be provided to the existing DetectModerationLabels API operation to process images using this adapter. Each adapter can only be used by the AWS account that was used for training the adapter, ensuring that the data used for training remains safe and secure in that AWS account. With the Custom Moderation feature, you can tailor the Rekognition pre-trained moderation model for improved performance on your specific moderation use case, without any machine learning (ML) expertise. You can continue to enjoy the benefits of a fully managed moderation service with a pay-per-use pricing model for Custom Moderation.

Solution overview

Training a custom moderation adapter involves five steps that you can complete using the AWS Management Console or the API interface:

  1. Create a project
  2. Upload the training data
  3. Assign ground truth labels to images
  4. Train the adapter
  5. Use the adapter

workflow diagram

Let’s walk through these steps in more detail using the console.

Create a project

A project is a container to store your adapters. You can train multiple adapters within a project with different training datasets to assess which adapter performs best for your specific use case. To create your project, complete the following steps:

  1. On the Amazon Rekognition console, choose Custom Moderation in the navigation pane.
  2. Choose Create project.

screenshot - list of tasks

  1. For Project name, enter a name for your project.
  2. For Adapter name, enter a name for your adapter.
  3. Optionally, enter a description for your adapter.

screenshot - create task

Upload training data

You can begin with as few as 20 sample images to adapt the moderation model to detect fewer false positives (images that are appropriate for your business but are flagged by the model with a moderation label). To reduce false negatives (images that are inappropriate for your business but don’t get flagged with a moderation label), you are required to start with 50 sample images.

You can provide the image datasets for adapter training in several ways, including importing images from an Amazon S3 bucket. Complete the following steps:

  1. For this post, select Import images from S3 bucket and enter your S3 URI.

screenshot - provide dataset

Like any ML training process, training a Custom Moderation adapter in Amazon Rekognition requires two separate datasets: one for training the adapter and another for evaluating the adapter. You can either upload a separate test dataset or choose to automatically split your training dataset for training and testing.

  1. For this post, select Autosplit.
  2. Select Enable auto-update to ensure that the system automatically retrains the adapter when a new version of the content moderation model is launched.
  3. Choose Create project.

screenshot - create project

Assign ground truth labels to images

If you uploaded unannotated images, you can use the Amazon Rekognition console to provide image labels as per the moderation taxonomy. In the following example, we train an adapter to detect hidden alcohol with higher accuracy, and label all such images with the label alcohol. Images not considered inappropriate can be labeled as Safe.

screenshot - label images

Train the adapter

After you label all the images, choose Start training to initiate the training process. Amazon Rekognition will use the uploaded image datasets to train an adapter model for enhanced accuracy on the specific type of images provided for training.

After the custom moderation adapter is trained, you can view all the adapter details (adapterID, test and training manifest files) in the Adapter performance section.

The Adapter performance section displays improvements in false positives and false negatives when compared to the pre-trained moderation model. The adapter we trained to enhance the detection of the alcohol label reduces the false negative rate in test images by 73%. In other words, the adapter now accurately predicts the alcohol moderation label for 73% more images compared to the pre-trained moderation model. However, no improvement is observed in false positives, as no false positive samples were used for training.

screenshot - accuracy

Use the adapter

You can perform inference using the newly trained adapter to achieve enhanced accuracy. To do this, call the Amazon Rekognition DetectModerationLabels API with an additional parameter, ProjectVersion, which takes the unique adapter ID (the ARN of the adapter). The following is a sample command using the AWS Command Line Interface (AWS CLI):

aws rekognition detect-moderation-labels \
--image 'S3Object={Bucket="<bucket>",Name="<key>"}' \
--project-version <ARN of the Adapter> \
--region us-east-1

The following is a sample code snippet using the Python Boto3 library:

import boto3

# Create a Rekognition client and call the standard moderation API, passing the
# adapter's ProjectVersion ARN so that inference uses the custom-trained adapter.
client = boto3.client('rekognition')
response = client.detect_moderation_labels(
    Image={
        "S3Object":{
            "Bucket":"<bucket>",
            "Name":"<key>"
        }
    },
    ProjectVersion="<ARN of the Adapter>"
)

Best practices for training

To maximize the performance of your adapter, the following best practices are recommended for training the adapter:

  • The sample image data should capture the representative errors that you want to improve the moderation model accuracy for
  • Instead of only bringing in error images for false positives and false negatives, you can also provide true positives and true negatives for improved performance
  • Supply as many annotated images as possible for training

Conclusion

In this post, we presented an in-depth overview of the new Amazon Rekognition Custom Moderation feature. Furthermore, we detailed the steps for performing training using the console, including best practices for optimal results. For additional information, visit the Amazon Rekognition console and explore the Custom Moderation feature.

Amazon Rekognition Custom Moderation is now generally available in all AWS Regions where Amazon Rekognition is available.

Learn more about content moderation on AWS. Take the first step towards streamlining your content moderation operations with AWS.


About the Authors

Shipra Kanoria is a Principal Product Manager at AWS. She is passionate about helping customers solve their most complex problems with the power of machine learning and artificial intelligence. Before joining AWS, Shipra spent over 4 years at Amazon Alexa, where she launched many productivity-related features on the Alexa voice assistant.

Aakash Deep is a Software Development Engineering Manager based in Seattle. He enjoys working on computer vision, AI, and distributed systems. His mission is to enable customers to address complex problems and create value with Amazon Rekognition. Outside of work, he enjoys hiking and traveling.

Lana Zhang is a Senior Solutions Architect at the AWS WWSO AI Services team, specializing in AI and ML for content moderation, computer vision, natural language processing, and generative AI. With her expertise, she is dedicated to promoting AWS AI/ML solutions and assisting customers in transforming their business solutions across diverse industries, including social media, gaming, ecommerce, media, and advertising & marketing.


Defect detection in high-resolution imagery using two-stage Amazon Rekognition Custom Labels models

High-resolution imagery is very prevalent in today’s world, from satellite imagery to drones and DSLR cameras. From this imagery, we can capture damage due to natural disasters, anomalies in manufacturing equipment, or very small defects such as those on printed circuit boards (PCBs) or semiconductors. Building anomaly detection models using high-resolution imagery can be challenging because modern computer vision models typically resize images to a lower resolution to fit into memory for training and running inference. Reducing the image resolution significantly means that visual information relating to the defect is degraded or completely lost.

One approach to overcome these challenges is to build two-stage models. Stage 1 models detect a region of interest, and Stage 2 models detect defects on the cropped region of interest, thereby maintaining sufficient resolution for small defects.

In this post, we go over how to build an effective two-stage defect detection system using Amazon Rekognition Custom Labels and compare results for this specific use case with one-stage models. Note that several one-stage models are effective even at lower or resized image resolutions, and others may accommodate large images in smaller batches.

Solution overview

For our use case, we use a dataset of images of PCBs with synthetically generated missing hole pins, as shown in the following example.

We use this dataset to demonstrate that a one-stage approach using object detection results in subpar detection performance for the missing hole pin defects. A two-stage model is preferred, in which we use Rekognition Custom Labels first for object detection to identify the pins and then a second-stage model to classify cropped images of the pins into pins with missing holes or normal pins.

The training process for a Rekognition Custom Labels model consists of several steps, as illustrated in the following diagram.

First, we use Amazon Simple Storage Service (Amazon S3) to store the image data. The data is ingested into Amazon SageMaker Jupyter notebooks, where typically a data scientist will inspect the images and preprocess them, removing any images that are of poor quality, such as blurred images or poor lighting conditions, and resizing or cropping the images. Then the data is split into training and test sets, and Amazon SageMaker Ground Truth labeling jobs are run to label the sets of images and output train and test manifest files. The manifest files are used by Rekognition Custom Labels for training.

One-stage model approach

The first approach we take to identifying missing holes on the PCB is to label the missing holes and train an object detection model to identify the missing holes. The following is an image example from the dataset.

We train a model with a dataset with 95 images used as training and 20 images used for testing. The following table summarizes our results.

Evaluation results:

  • F1 score: 0.468 | Average precision: 0.750 | Overall recall: 0.340
  • Training time: 1.791 hours | Training dataset: 1 label, 95 images | Testing dataset: 1 label, 20 images

Per label performance:

  • missing_hole – F1 score: 0.468 | Test images: 20 | Precision: 0.750 | Recall: 0.340 | Assumed threshold: 0.053

The resulting model has high precision but low recall, meaning that when we localize a region for a missing hole, we’re usually correct, but we’re missing a lot of missing holes that are present on the PCB. To build an effective defect detection system, we need to improve recall. The low performance of this model may be due to the defects being small relative to the high-resolution image of the PCB, so the model has no reference for what a healthy pin looks like.

Next, we explore splitting the image into four or six crops depending on the PCB size and labeling both healthy and missing holes. The following is an example of the resulting cropped image.

We train a model with 524 images used as training and 106 images used for testing. We maintain the same PCBs used in train and test as the full board model. The results for cropped healthy pins vs. missing holes are shown in the following table.

Evaluation results:

  • F1 score: 0.967 | Average precision: 0.989 | Overall recall: 0.945
  • Training time: 2.118 hours | Training dataset: 2 labels, 524 images | Testing dataset: 2 labels, 106 images

Per label performance:

  • missing_hole – F1 score: 0.949 | Test images: 42 | Precision: 0.980 | Recall: 0.920 | Assumed threshold: 0.536
  • pin – F1 score: 0.984 | Test images: 106 | Precision: 0.998 | Recall: 0.970 | Assumed threshold: 0.696

Both precision and recall have improved significantly. Training the model with zoomed-in cropped images and a reference to the model for healthy pins helped. However, recall is still at 92%, meaning that we would still miss 8% of the missing holes and let defects go by unnoticed.

Next, we explore a two-stage model approach in which we can improve the model performance further.

Two-stage model approach

For the two-stage model, we train two models: one for detecting pins and one for detecting if the pin is missing or not on zoomed-in cropped images of the pin. The following is an image from the pin detection dataset.

The data is similar to our previous experiment, in which we cropped the PCB into four or six cropped images. This time, we label all pins and don’t make any distinctions if the pin has a missing hole or not. We train this model with 522 images and test with 108 images, maintaining the same train/test split as previous experiments. The results are shown in the following table.

Evaluation results:

  • F1 score: 1.000 | Average precision: 0.999 | Overall recall: 1.000
  • Training time: 1.581 hours | Training dataset: 1 label, 522 images | Testing dataset: 1 label, 108 images

Per label performance:

  • pin – F1 score: 1.000 | Test images: 108 | Precision: 0.999 | Recall: 1.000 | Assumed threshold: 0.617

The model detects the pins perfectly on this synthetic dataset.

Next, we build the model to make the distinction for missing holes. We use cropped images of the holes to train the second stage of the model, as shown in the following examples. This model is separate from the previous models because it’s a classification model and will be focused on the narrow task of determining if the pin has a missing hole.

We train this second-stage model on 16,624 images and test on 3,266, maintaining the same train/test splits as the previous experiments. The following table summarizes our results.

Evaluation results:

  • F1 score: 1.000 | Average precision: 1.000 | Overall recall: 1.000
  • Training time: 6.660 hours | Training dataset: 2 labels, 16,624 images | Testing dataset: 2 labels, 3,266 images

Per label performance:

  • anomaly – F1 score: 1.000 | Test images: 88 | Precision: 1.000 | Recall: 1.000 | Assumed threshold: 0.960
  • normal – F1 score: 1.000 | Test images: 3,178 | Precision: 1.000 | Recall: 1.000 | Assumed threshold: 0.996

Again, we receive perfect precision and recall on this synthetic dataset. Combining the previous pin detection model with this second-stage missing hole classification model, we can build a model that outperforms any single-stage model.

The following summarizes the experiments we conducted.

  • Experiment 1 (one-stage model) – Object detection model to detect missing holes on full images: F1 score 0.468 | Precision 0.750 | Recall 0.340
  • Experiment 2 (one-stage model) – Object detection model to detect healthy pins and missing holes on cropped images: F1 score 0.967 | Precision 0.989 | Recall 0.945
  • Experiment 3 (two-stage model):
    • Stage 1 – Object detection on all pins: F1 score 1.000 | Precision 0.999 | Recall 1.000
    • Stage 2 – Image classification of healthy pin or missing holes: F1 score 1.000 | Precision 1.000 | Recall 1.000
    • End-to-end average: F1 score 1.000 | Precision 0.9995 | Recall 1.000

Inference pipeline

You can use the following architecture to deploy the one-stage and two-stage models that we described in this post. The main components involved are Amazon API Gateway, AWS Lambda, and Amazon Rekognition Custom Labels model endpoints.

For one-stage models, you can send an input image to the API Gateway endpoint, followed by Lambda for any basic image preprocessing, and route to the Rekognition Custom Labels trained model endpoint. In our experiments, we explored one-stage models that can detect only missing holes, and missing holes and healthy pins.

For two-stage models, you can similarly send an image to the API Gateway endpoint, followed by Lambda. Lambda acts as an orchestrator that first calls the object detection model (trained using Rekognition Custom Labels), which generates the region of interest. The original image is then cropped in the Lambda function, and sent to another Rekognition Custom Labels classification model for detecting defects in each cropped image.
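The following is a minimal sketch of such a Lambda orchestrator; the model version ARNs and the event shape are placeholders, and it assumes the Pillow library is packaged with the function.

import io
import json
import boto3
from PIL import Image

rekognition = boto3.client("rekognition")
s3 = boto3.client("s3")

PIN_DETECTOR_ARN = "<ARN of the stage 1 detection model version>"            # placeholder
DEFECT_CLASSIFIER_ARN = "<ARN of the stage 2 classification model version>"  # placeholder

def lambda_handler(event, context):
    bucket, key = event["bucket"], event["key"]

    # Stage 1: detect pin bounding boxes on the full-resolution image.
    detection = rekognition.detect_custom_labels(
        ProjectVersionArn=PIN_DETECTOR_ARN,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=50,
    )

    image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    image = Image.open(io.BytesIO(image_bytes))
    width, height = image.size

    results = []
    for label in detection["CustomLabels"]:
        box = label.get("Geometry", {}).get("BoundingBox")
        if not box:
            continue

        # Crop each detected pin at full resolution.
        left, top = int(box["Left"] * width), int(box["Top"] * height)
        right, bottom = left + int(box["Width"] * width), top + int(box["Height"] * height)
        crop = image.crop((left, top, right, bottom))
        buffer = io.BytesIO()
        crop.save(buffer, format="PNG")

        # Stage 2: classify the cropped pin as anomaly (missing hole) or normal.
        classification = rekognition.detect_custom_labels(
            ProjectVersionArn=DEFECT_CLASSIFIER_ARN,
            Image={"Bytes": buffer.getvalue()},
        )
        results.append({"box": box, "labels": classification["CustomLabels"]})

    return {"statusCode": 200, "body": json.dumps(results)}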

Conclusion

In this post, we trained one- and two-stage models to detect missing holes in PCBs using Rekognition Custom Labels. We reported results for various models; in our case, two-stage models outperformed other variants. We encourage customers with high-resolution imagery from other domains to test model performance with one- and two-stage models. Additionally, consider the following ways to expand the solution:

  • Sliding window crops for your actual datasets
  • Reusing your object detection models in the same pipeline
  • Pre-labeling workflows using bounding box predictions

About the authors

Andreas Karagounis is a Data Science Manager at Accenture. He holds a master’s degree in Computer Science from Brown University. He has a background in computer vision and works with customers to solve their business challenges using data science and machine learning.

Yogesh Chaturvedi is a Principal Solutions Architect at AWS with a focus in computer vision. He works with customers to address their business challenges using cloud technologies. Outside of work, he enjoys hiking, traveling, and watching sports.

Shreyas Subramanian is a Principal Data Scientist, and helps customers by using machine learning to solve their business challenges using the AWS platform. Shreyas has a background in large-scale optimization and machine learning, and in the use of machine learning and reinforcement learning for accelerating optimization tasks.

Selimcan “Can” Sakar is a cloud-first developer and Solutions Architect at AWS Accenture Business Group with a focus on emerging technologies such as GenAI, ML, and blockchain. When he isn’t watching models converge, he can be seen biking or playing the clarinet.


Automatically redact PII for machine learning using Amazon SageMaker Data Wrangler

Customers increasingly want to use deep learning approaches such as large language models (LLMs) to automate the extraction of data and insights. For many industries, data that is useful for machine learning (ML) may contain personally identifiable information (PII). To ensure customer privacy and maintain regulatory compliance while training, fine-tuning, and using deep learning models, it’s often necessary to first redact PII from source data.

This post demonstrates how to use Amazon SageMaker Data Wrangler and Amazon Comprehend to automatically redact PII from tabular data as part of your machine learning operations (ML Ops) workflow.

Problem: ML data that contains PII

PII is defined as any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. PII is information that either directly identifies an individual (name, address, social security number or other identifying number or code, telephone number, email address, and so on) or information that an agency intends to use to identify specific individuals in conjunction with other data elements, namely, indirect identification.

Customers in business domains such as financial, retail, legal, and government deal with PII data on a regular basis. Due to various government regulations and rules, customers have to find a mechanism to handle this sensitive data with appropriate security measures to avoid regulatory fines, possible fraud, and defamation. PII redaction is the process of masking or removing sensitive information from a document so it can be used and distributed, while still protecting confidential information.

Businesses need to deliver delightful customer experiences and better business outcomes by using ML. Redaction of PII data is often a key first step to unlock the larger and richer data streams needed to use or fine-tune generative AI models, without worrying about whether their enterprise data (or that of their customers) will be compromised.

Solution overview

This solution uses Amazon Comprehend and SageMaker Data Wrangler to automatically redact PII data from a sample dataset.

Amazon Comprehend is a natural language processing (NLP) service that uses ML to uncover insights and relationships in unstructured data, with no infrastructure to manage and no ML experience required. It provides functionality to locate various PII entity types within text, such as names or credit card numbers. Although the latest generative AI models have demonstrated some PII redaction capability, they generally don’t provide a confidence score for PII identification or structured data describing what was redacted. The PII functionality of Amazon Comprehend returns both, enabling you to create redaction workflows that are fully auditable at scale. Additionally, using Amazon Comprehend with AWS PrivateLink means that customer data never leaves the AWS network and is continuously secured with the same data access and privacy controls as the rest of your applications.
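
For reference, a minimal call looks like the following; the sample text is invented, and the response fields shown (Type, Score, and character offsets) are what make auditable redaction workflows possible:

import boto3

comprehend = boto3.client("comprehend")

text = "Please update the card ending 4242 for John Smith, john@example.com."
response = comprehend.detect_pii_entities(Text=text, LanguageCode="en")

# Each entity includes a PII type, a confidence score, and character offsets
for entity in response["Entities"]:
    print(entity["Type"], round(entity["Score"], 3), entity["BeginOffset"], entity["EndOffset"])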

Similar to Amazon Comprehend, Amazon Macie uses a rules-based engine to identify sensitive data (including PII) stored in Amazon Simple Storage Service (Amazon S3). However, its rules-based approach relies on having specific keywords that indicate sensitive data located close to that data (within 30 characters). In contrast, the NLP-based ML approach of Amazon Comprehend uses semantic understanding of longer chunks of text to identify PII, making it more useful for finding PII within unstructured data.

Additionally, for tabular data such as CSV or plain text files, Macie returns less detailed location information than Amazon Comprehend (either a row/column indicator or a line number, respectively, but not start and end character offsets). This makes Amazon Comprehend particularly helpful for redacting PII from unstructured text that may contain a mix of PII and non-PII words (for example, support tickets or LLM prompts) that is stored in a tabular format.

Amazon SageMaker provides purpose-built tools for ML teams to automate and standardize processes across the ML lifecycle. With SageMaker MLOps tools, teams can easily prepare, train, test, troubleshoot, deploy, and govern ML models at scale, boosting productivity of data scientists and ML engineers while maintaining model performance in production. The following diagram illustrates the SageMaker MLOps workflow.

SageMaker Pipelines

SageMaker Data Wrangler is a feature of Amazon SageMaker Studio that provides an end-to-end solution to import, prepare, transform, featurize, and analyze datasets stored in locations such as Amazon S3 or Amazon Athena, a common first step in the ML lifecycle. You can use SageMaker Data Wrangler to simplify and streamline dataset preprocessing and feature engineering by either using built-in, no-code transformations or customizing with your own Python scripts.

Using Amazon Comprehend to redact PII as part of a SageMaker Data Wrangler data preparation workflow keeps all downstream uses of the data, such as model training or inference, in alignment with your organization’s PII requirements. You can integrate SageMaker Data Wrangler with Amazon SageMaker Pipelines to automate end-to-end ML operations, including data preparation and PII redaction. For more details, refer to Integrating SageMaker Data Wrangler with SageMaker Pipelines. The rest of this post demonstrates a SageMaker Data Wrangler flow that uses Amazon Comprehend to redact PII from text stored in tabular data format.

This solution uses a public synthetic dataset along with a custom SageMaker Data Wrangler flow, available as a file in GitHub. The steps to use the SageMaker Data Wrangler flow to redact PII are as follows:

  1. Open SageMaker Studio.
  2. Download the SageMaker Data Wrangler flow.
  3. Review the SageMaker Data Wrangler flow.
  4. Add a destination node.
  5. Create a SageMaker Data Wrangler export job.

This walkthrough, including running the export job, should take 20–25 minutes to complete.

Prerequisites

For this walkthrough, you should have the following:

Open SageMaker Studio

To open SageMaker Studio, complete the following steps:

  1. On the SageMaker console, choose Studio in the navigation pane.
  2. Choose the domain and user profile.
  3. Choose Open Studio.

To get started with the new capabilities of SageMaker Data Wrangler, it’s recommended to upgrade to the latest release.

Download the SageMaker Data Wrangler flow

You first need to retrieve the SageMaker Data Wrangler flow file from GitHub and upload it to SageMaker Studio. Complete the following steps:

  1. Navigate to the SageMaker Data Wrangler redact-pii.flow file on GitHub.
  2. On GitHub, choose the download icon to download the flow file to your local computer.
  3. In SageMaker Studio, choose the file icon in the navigation pane.
  4. Choose the upload icon, then choose redact-pii.flow.

Upload Data Wrangler Flow

Review the SageMaker Data Wrangler flow

In SageMaker Studio, open redact-pii.flow. After a few minutes, the flow will finish loading and show the flow diagram (see the following screenshot). The flow contains six steps: an S3 Source step followed by five transformation steps.

Data Wrangler Flow steps

On the flow diagram, choose the last step, Redact PII. The All Steps pane opens on the right and shows a list of the steps in the flow. You can expand each step to view details, change parameters, and potentially add custom code.

Data Wrangler Flow step details

Let’s walk through each step in the flow.

Steps 1 (S3 Source) and 2 (Data types) are added by SageMaker Data Wrangler whenever data is imported for a new flow. In S3 Source, the S3 URI field points to the sample dataset, which is a CSV file stored in Amazon S3. The file contains roughly 116,000 rows, and the flow sets the value of the Sampling field to 1,000, which means that SageMaker Data Wrangler will sample 1,000 rows to display in the user interface. Data types sets the data type for each column of imported data.

Step 3 (Sampling) sets the number of rows SageMaker Data Wrangler will sample for an export job to 5,000, via the Approximate sample size field. Note that this is different from the number of rows sampled to display in the user interface (Step 1). To export data with more rows, you can increase this number or remove Step 3.

Steps 4, 5, and 6 use SageMaker Data Wrangler custom transforms. Custom transforms allow you to run your own Python or SQL code within a Data Wrangler flow. The custom code can be written in four ways:

  • In SQL, using PySpark SQL to modify the dataset
  • In Python, using a PySpark data frame and libraries to modify the dataset
  • In Python, using a pandas data frame and libraries to modify the dataset
  • In Python, using a user-defined function to modify a column of the dataset

The Python (pandas) approach requires your dataset to fit into memory and can only be run on a single instance, limiting its ability to scale efficiently. When working in Python with larger datasets, we recommend using either the Python (PySpark) or Python (user-defined function) approach. SageMaker Data Wrangler optimizes Python user-defined functions to provide performance similar to an Apache Spark plugin, without needing to know PySpark or pandas. To make this solution as accessible as possible, this post uses a Python user-defined function written in pure Python.

Expand Step 4 (Make PII column) to see its details. This step combines different types of PII data from multiple columns into a single phrase that is saved in a new column, pii_col. The following table shows an example row containing data.

customer_name | customer_job | billing_address              | customer_email
Katie         | Journalist   | 19009 Vang Squares Suite 805 | hboyd@gmail.com

This is combined into the phrase “Katie is a Journalist who lives at 19009 Vang Squares Suite 805 and can be emailed at hboyd@gmail.com”. The phrase is saved in pii_col, which this post uses as the target column to redact.

Step 5 (Prep for redaction) takes a column to redact (pii_col) and creates a new column (pii_col_prep) that is ready for efficient redaction using Amazon Comprehend. To redact PII from a different column, you can change the Input column field of this step.

There are two factors to consider to efficiently redact data using Amazon Comprehend:

  • The cost to detect PII is defined on a per-unit basis, where 1 unit = 100 characters, with a 3-unit minimum charge for each document. Because tabular data often contains small amounts of text per cell, it’s generally more time- and cost-efficient to combine text from multiple cells into a single document to send to Amazon Comprehend. Doing this avoids the accumulation of overhead from many repeated function calls and ensures that the data sent is always greater than the 3-unit minimum.
  • Because we’re doing redaction as one step of a SageMaker Data Wrangler flow, we will be calling Amazon Comprehend synchronously. Amazon Comprehend sets a 100 KB (100,000 character) limit per synchronous function call, so we need to ensure that any text we send is under that limit.
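
To make the first point concrete, here is a rough back-of-the-envelope comparison with made-up numbers (1,000 cells of about 60 characters each); it ignores the few extra characters added by the delimiter:

CHARS_PER_UNIT = 100
MIN_UNITS_PER_DOC = 3

cells = 1000          # hypothetical number of cells to redact
chars_per_cell = 60   # hypothetical average text length per cell

# One API call per cell: every short cell is billed at the 3-unit minimum
per_cell_units = cells * max(MIN_UNITS_PER_DOC, -(-chars_per_cell // CHARS_PER_UNIT))

# Cells combined into larger documents (chunks) before calling Amazon Comprehend
combined_units = -(-(cells * chars_per_cell) // CHARS_PER_UNIT)

print(per_cell_units, combined_units)   # 3000 units vs. 600 units in this example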

Given these factors, Step 5 prepares the data to send to Amazon Comprehend by appending a delimiter string to the end of the text in each cell. For the delimiter, you can use any string that doesn’t occur in the column being redacted (ideally, one that is as few characters as possible, because they’re included in the Amazon Comprehend character total). Adding this cell delimiter allows us to optimize the call to Amazon Comprehend, and will be discussed further in Step 6.

Note that if the text in any individual cell is longer than the Amazon Comprehend limit, the code in this step truncates it to 100,000 characters (roughly equivalent to 15,000 words or 30 single-spaced pages). Although this amount of text is unlikely to be stored in a single cell, you can modify the transformation code to handle this edge case another way if needed.

Step 6 (Redact PII) takes a column name to redact as input (pii_col_prep) and saves the redacted text to a new column (pii_redacted). When you use a Python custom function transform, SageMaker Data Wrangler defines an empty custom_func that takes a pandas series (a column of text) as input and returns a modified pandas series of the same length. The following screenshot shows part of the Redact PII step.

Data Wrangler custom function redaction code

The function custom_func contains two helper (inner) functions:

  • make_text_chunks – This function does the work of concatenating text from individual cells in the series (including their delimiters) into longer strings (chunks) to send to Amazon Comprehend.
  • redact_pii – This function takes text as input, calls Amazon Comprehend to detect PII, redacts any that is found, and returns the redacted text. Redaction is done by replacing any PII text with the type of PII found in square brackets, for example John Smith would be replaced with [NAME]. You can modify this function to replace PII with any string, including the empty string (“”) to remove it. You also could modify the function to check the confidence score of each PII entity and only redact if it’s above a specific threshold.

After the inner functions are defined, custom_func uses them to do the redaction, as shown in the following code excerpt. When the redaction is complete, it converts the chunks back into original cells, which it saves in the pii_redacted column.

# concatenate text from cells into longer chunks
chunks = make_text_chunks(series, COMPREHEND_MAX_CHARS)

redacted_chunks = []
# call Comprehend once for each chunk, and redact
for text in chunks:
  redacted_text = redact_pii(text)
  redacted_chunks.append(redacted_text)
  
# join all redacted chunks into one text string
redacted_text = ''.join(redacted_chunks)

# split back to list of the original rows
redacted_rows = redacted_text.split(CELL_DELIM)
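
For illustration, the two helpers could be sketched roughly as follows. This is a simplified version that assumes a CELL_DELIM constant and the boto3 Amazon Comprehend client; the actual code in the flow file may differ:

import boto3

comprehend = boto3.client("comprehend")
CELL_DELIM = "<|>"                  # assumed delimiter; must not occur in the column being redacted
COMPREHEND_MAX_CHARS = 100000       # synchronous API limit per call

def make_text_chunks(series, max_chars):
    # Concatenate cell text (plus delimiter) into chunks that stay under the size limit
    chunks, current = [], ""
    for cell in series:
        piece = str(cell) + CELL_DELIM
        if len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks

def redact_pii(text):
    # Detect PII entities, then replace each span with its type in square brackets.
    # Replacing from the end backwards keeps earlier offsets valid.
    entities = comprehend.detect_pii_entities(Text=text, LanguageCode="en")["Entities"]
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[: entity["BeginOffset"]] + f"[{entity['Type']}]" + text[entity["EndOffset"]:]
    return text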

Add a destination node

To see the result of your transformations, SageMaker Data Wrangler supports exporting to Amazon S3, SageMaker Pipelines, Amazon SageMaker Feature Store, and Python code. To export the redacted data to Amazon S3, we first need to create a destination node:

  1. In the SageMaker Data Wrangler flow diagram, choose the plus sign next to the Redact PII step.
  2. Choose Add destination, then choose Amazon S3.
  3. Provide an output name for your transformed dataset.
  4. Browse or enter the S3 location to store the redacted data file.
  5. Choose Add destination.

You should now see the destination node at the end of your data flow.

Create a SageMaker Data Wrangler export job

Now that the destination node has been added, we can create the export job to process the dataset:

  1. In SageMaker Data Wrangler, choose Create job.
  2. The destination node you just added should already be selected. Choose Next.
  3. Accept the defaults for all other options, then choose Run.

This creates a SageMaker Processing job. To view the status of the job, navigate to the SageMaker console. In the navigation pane, expand the Processing section and choose Processing jobs. Redacting all 116,000 cells in the target column using the default export job settings (two ml.m5.4xlarge instances) takes roughly 8 minutes and costs approximately $0.25. When the job is complete, download the output file with the redacted column from Amazon S3.

Clean up

The SageMaker Data Wrangler application runs on an ml.m5.4xlarge instance. To shut it down, in SageMaker Studio, choose Running Terminals and Kernels in the navigation pane. In the RUNNING INSTANCES section, find the instance labeled Data Wrangler and choose the shutdown icon next to it. This shuts down the SageMaker Data Wrangler application running on the instance.

Conclusion

In this post, we discussed how to use custom transformations in SageMaker Data Wrangler and Amazon Comprehend to redact PII data from your ML dataset. You can download the SageMaker Data Wrangler flow and start redacting PII from your tabular data today.

For other ways to enhance your MLOps workflow using SageMaker Data Wrangler custom transformations, check out Authoring custom transformations in Amazon SageMaker Data Wrangler using NLTK and SciPy. For more data preparation options, check out the blog post series that explains how to use Amazon Comprehend to redact, translate, and analyze text from either Amazon Athena or Amazon Redshift.


About the Authors

Tricia Jamison is a Senior Prototyping Architect on the AWS Prototyping and Cloud Acceleration (PACE) Team, where she helps AWS customers implement innovative solutions to challenging problems with machine learning, internet of things (IoT), and serverless technologies. She lives in New York City and enjoys basketball, long distance treks, and staying one step ahead of her children.

Neelam Koshiya is an Enterprise Solutions Architect at AWS. With a background in software engineering, she organically moved into an architecture role. Her current focus is helping enterprise customers with their cloud adoption journey for strategic business outcomes with the area of depth being AI/ML. She is passionate about innovation and inclusion. In her spare time, she enjoys reading and being outdoors.

Adeleke Coker is a Global Solutions Architect with AWS. He works with customers globally to provide guidance and technical assistance in deploying production workloads at scale on AWS. In his spare time, he enjoys learning, reading, gaming and watching sport events.

Read More

Optimize pet profiles for Purina’s Petfinder application using Amazon Rekognition Custom Labels and AWS Step Functions

Optimize pet profiles for Purina’s Petfinder application using Amazon Rekognition Custom Labels and AWS Step Functions

Purina US, a subsidiary of Nestle, has a long history of enabling people to more easily adopt pets through Petfinder, a digital marketplace of over 11,000 animal shelters and rescue groups across the US, Canada, and Mexico. As the leading pet adoption platform, Petfinder has helped millions of pets find their forever homes.

Purina consistently seeks ways to make the Petfinder platform even better for shelters, rescue groups, and pet adopters. One challenge they faced was adequately reflecting the specific breed of animals up for adoption. Because many shelter animals are mixed breed, identifying breeds and attributes correctly in the pet profile required manual effort, which was time consuming. Purina used artificial intelligence (AI) and machine learning (ML) to automate animal breed detection at scale.

This post details how Purina used Amazon Rekognition Custom Labels, AWS Step Functions, and other AWS Services to create an ML model that detects the pet breed from an uploaded image and then uses the prediction to auto-populate the pet attributes. The solution focuses on the fundamental principles of developing an AI/ML application workflow of data preparation, model training, model evaluation, and model monitoring.

Solution overview

Predicting animal breeds from an image needs custom ML models. Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources, often taking months to complete. Additionally, it often requires thousands or tens of thousands of hand-labeled images to provide the model with enough data to accurately make decisions. Setting up a workflow for auditing or reviewing model predictions to validate adherence to your requirements can further add to the overall complexity.

With Rekognition Custom Labels, which is built on the existing capabilities of Amazon Rekognition, you can identify the objects and scenes in images that are specific to your business needs. It is already trained on tens of millions of images across many categories. Instead of thousands of images, you can upload a small set of training images (typically a few hundred images or less per category) that are specific to your use case.

The solution uses the following services:

  • Amazon API Gateway is a fully managed service that makes it easy for developers to publish, maintain, monitor, and secure APIs at any scale.
  • The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework for defining cloud infrastructure as code with modern programming languages and deploying it through AWS CloudFormation.
  • AWS CodeBuild is a fully managed continuous integration service in the cloud. CodeBuild compiles source code, runs tests, and produces packages that are ready to deploy.
  • Amazon DynamoDB is a fast and flexible nonrelational database service for any scale.
  • AWS Lambda is an event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers.
  • Amazon Rekognition offers pre-trained and customizable computer vision (CV) capabilities to extract information and insights from your images and videos. With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs.
  • AWS Step Functions is a fully managed service that makes it easier to coordinate the components of distributed applications and microservices using visual workflows.
  • AWS Systems Manager is a secure end-to-end management solution for resources on AWS and in multicloud and hybrid environments. Parameter Store, a capability of Systems Manager, provides secure, hierarchical storage for configuration data management and secrets management.

Purina’s solution is deployed as an API Gateway HTTP endpoint, which routes the requests to obtain pet attributes. It uses Rekognition Custom Labels to predict the pet breed. The ML model is trained from pet profiles pulled from Purina’s database, assuming the primary breed label is the true label. DynamoDB is used to store the pet attributes. Lambda is used to process the pet attributes request by orchestrating between API Gateway, Amazon Rekognition, and DynamoDB.

The architecture is implemented as follows:

  1. The Petfinder application routes the request to obtain the pet attributes via API Gateway.
  2. API Gateway calls the Lambda function to obtain the pet attributes.
  3. The Lambda function calls the Rekognition Custom Label inference endpoint to predict the pet breed.
  4. The Lambda function uses the predicted pet breed information to perform a pet attributes lookup in the DynamoDB table. It collects the pet attributes and sends them back to the Petfinder application.

The following diagram illustrates the solution workflow.
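
A minimal sketch of the Lambda orchestration (steps 2-4) could look like the following; the environment variable names, table name, and event fields are placeholders rather than Purina's actual implementation:

import os
import boto3

rekognition = boto3.client("rekognition")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ.get("ATTRIBUTES_TABLE", "pet-breed-attributes"))  # assumed table name

def handler(event, context):
    # Step 3: predict the breed from the uploaded image with Rekognition Custom Labels
    labels = rekognition.detect_custom_labels(
        ProjectVersionArn=os.environ["MODEL_VERSION_ARN"],
        Image={"S3Object": {"Bucket": event["bucket"], "Name": event["image_key"]}},
        MaxResults=int(event.get("max_results", 1)),
    )["CustomLabels"]
    if not labels:
        return {"breed": None, "attributes": {}}

    breed = labels[0]["Name"]
    # Step 4: look up the attributes for the predicted breed in DynamoDB
    item = table.get_item(Key={"breed": breed}).get("Item", {})
    return {"breed": breed, "confidence": labels[0]["Confidence"], "attributes": item}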

The Petfinder team at Purina wants an automated solution that they can deploy with minimal maintenance. To deliver this, we use Step Functions to create a state machine that trains the models with the latest data, checks their performance on a benchmark set, and redeploys the models if they have improved. Model retraining is triggered when the number of breed corrections made by users submitting profile information reaches a configurable threshold.

Model training

Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources. Additionally, it often requires thousands or tens of thousands of hand-labeled images to provide the model with enough data to accurately make decisions. Generating this data can take months to gather and requires a large effort to label it for use in machine learning. A technique called transfer learning helps produce higher-quality models by borrowing the parameters of a pre-trained model, and allows models to be trained with fewer images.

Our challenge is that our data is not perfectly labeled: humans who enter the profile data can and do make mistakes. However, we found that for large enough data samples, the mislabeled images accounted for a sufficiently small fraction that model accuracy was not impacted by more than 2%.

ML workflow and state machine

The Step Functions state machine is developed to aid in the automatic retraining of the Amazon Rekognition model. Feedback is gathered during profile entry—each time a breed that has been inferred from an image is modified by the user to a different breed, the correction is recorded. This state machine is triggered from a configurable threshold number of corrections and additional pieces of data.

The state machine runs through several steps to create a solution:

  1. Create train and test manifest files containing the list of Amazon Simple Storage Service (Amazon S3) image paths and their labels for use by Amazon Rekognition.
  2. Create an Amazon Rekognition dataset using the manifest files.
  3. Train an Amazon Rekognition model version after the dataset is created.
  4. Start the model version when training is complete.
  5. Evaluate the model and produce performance metrics.
  6. If performance metrics are satisfactory, update the model version in Parameter Store.
  7. Wait for the new model version to propagate in the Lambda functions (20 minutes), then stop the previous model.
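
One way to picture the API calls behind these steps is the following sketch, which uses the Rekognition Custom Labels and Systems Manager APIs directly; the project ARN, bucket, threshold, and parameter names are placeholders, and in the real solution each step is invoked by the Step Functions state machine:

import boto3

rekognition = boto3.client("rekognition")
ssm = boto3.client("ssm")

PROJECT_ARN = "arn:aws:rekognition:us-east-1:111122223333:project/petfinder-breeds/1"  # placeholder
VERSION_NAME = "retrain-2023-10-01"

# Steps 1-2: register train/test datasets from manifest files stored in Amazon S3
for dataset_type, manifest_key in [("TRAIN", "manifests/train.manifest"), ("TEST", "manifests/test.manifest")]:
    rekognition.create_dataset(
        ProjectArn=PROJECT_ARN,
        DatasetType=dataset_type,
        DatasetSource={"GroundTruthManifest": {"S3Object": {"Bucket": "petfinder-training", "Name": manifest_key}}},
    )

# Step 3: train a new model version
version_arn = rekognition.create_project_version(
    ProjectArn=PROJECT_ARN,
    VersionName=VERSION_NAME,
    OutputConfig={"S3Bucket": "petfinder-training", "S3KeyPrefix": "output/"},
)["ProjectVersionArn"]

# Step 4: start the trained model version (after waiting for training to complete)
rekognition.start_project_version(ProjectVersionArn=version_arn, MinInferenceUnits=1)

# Step 5: read the evaluation metrics produced during training
described = rekognition.describe_project_versions(
    ProjectArn=PROJECT_ARN, VersionNames=[VERSION_NAME]
)["ProjectVersionDescriptions"]
f1 = described[0]["EvaluationResult"]["F1Score"]

# Step 6: if performance is satisfactory, publish the new version ARN to Parameter Store
if f1 >= 0.90:   # illustrative threshold
    ssm.put_parameter(Name="/petfinder/model-version-arn", Value=version_arn, Type="String", Overwrite=True)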

Model evaluation

We use a random 20% holdout set taken from our data sample to validate our model. Because the breeds we detect are configurable, we don’t use a fixed dataset for validation during training, but we do use a manually labeled evaluation set for integration testing. The overlap of the manually labeled set and the model’s detectable breeds is used to compute metrics. If the model’s breed detection accuracy is above a specified threshold, we promote the model to be used in the endpoint.

The following are a few screenshots of the pet prediction workflow from Rekognition Custom Labels.

Deployment with the AWS CDK

The Step Functions state machine and associated infrastructure (including Lambda functions, CodeBuild projects, and Systems Manager parameters) are deployed with the AWS CDK using Python. The AWS CDK code synthesizes a CloudFormation template, which it uses to deploy all infrastructure for the solution.
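
A minimal AWS CDK (Python, v2) sketch of such a stack might wire a Lambda-backed training task and the 20-minute propagation wait into a state machine like this; the construct names and Lambda asset path are illustrative only:

from aws_cdk import App, Stack, Duration
from aws_cdk import aws_lambda as _lambda
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks

class RetrainingStack(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        # Lambda function that kicks off Rekognition Custom Labels training (code not shown)
        train_fn = _lambda.Function(
            self, "TrainModelFn",
            runtime=_lambda.Runtime.PYTHON_3_10,
            handler="train.handler",
            code=_lambda.Code.from_asset("lambda"),
        )

        train_task = tasks.LambdaInvoke(self, "TrainModelVersion", lambda_function=train_fn)
        # Wait for the new model version ARN to propagate before stopping the previous one
        wait = sfn.Wait(self, "WaitForPropagation", time=sfn.WaitTime.duration(Duration.minutes(20)))

        sfn.StateMachine(self, "RetrainingStateMachine", definition=train_task.next(wait))

app = App()
RetrainingStack(app, "PetfinderRetrainingStack")
app.synth()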

Integration with the Petfinder application

The Petfinder application accesses the image classification endpoint through the API Gateway endpoint using a POST request containing a JSON payload with fields for the Amazon S3 path to the image and the number of results to be returned.
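
The request could look roughly like the following; the endpoint URL and JSON field names are placeholders, because the post doesn't publish the actual API contract:

import requests

payload = {
    "image_s3_path": "s3://petfinder-uploads/shelter-123/pet-456.jpg",  # assumed field name
    "max_results": 3,                                                   # assumed field name
}
response = requests.post(
    "https://example.execute-api.us-east-1.amazonaws.com/prod/breeds",  # placeholder endpoint
    json=payload,
    timeout=10,
)
print(response.json())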

KPIs to be impacted

To justify the added cost of running the image inference endpoint, we ran experiments to determine the value that the endpoint adds for Petfinder. The use of the endpoint offers two main types of improvement:

  • Reduced effort for pet shelters who are creating the pet profiles
  • More complete pet profiles, which are expected to improve search relevance

Metrics for measuring effort and profile completeness include the number of auto-filled fields that are corrected, total number of fields filled, and time to upload a pet profile. Improvements to search relevance are indirectly inferred from measuring key performance indicators related to adoption rates. According to Purina, after the solution went live, the average time for creating a pet profile on the Petfinder application was reduced from 7 minutes to 4 minutes. That is a huge improvement and time savings because in 2022, 4 million pet profiles were uploaded.

Security

The data that flows through the architecture diagram is encrypted in transit and at rest, in accordance with the AWS Well-Architected best practices. During all AWS engagements, a security expert reviews the solution to ensure a secure implementation is provided.

Conclusion

With their solution based on Rekognition Custom Labels, the Petfinder team is able to accelerate the creation of pet profiles for pet shelters, reducing administrative burden on shelter personnel. The deployment based on the AWS CDK deploys a Step Functions workflow to automate the training and deployment process. To start using Rekognition Custom Labels, refer to Getting Started with Amazon Rekognition Custom Labels. You can also check out some Step Functions examples and get started with the AWS CDK.


About the Authors

Mason Cahill is a Senior DevOps Consultant with AWS Professional Services. He enjoys helping organizations achieve their business goals, and is passionate about building and delivering automated solutions on the AWS Cloud. Outside of work, he loves spending time with his family, hiking, and playing soccer.

Matthew Chasse is a Data Science consultant at Amazon Web Services, where he helps customers build scalable machine learning solutions.  Matthew has a Mathematics PhD and enjoys rock climbing and music in his free time.

Rushikesh Jagtap is a Solutions Architect with 5+ years of experience in AWS Analytics services. He is passionate about helping customers to build scalable and modern data analytics solutions to gain insights from the data. Outside of work, he loves watching Formula1, playing badminton, and racing Go Karts.

Tayo Olajide is a seasoned Cloud Data Engineering generalist with over a decade of experience in architecting and implementing data solutions in cloud environments. With a passion for transforming raw data into valuable insights, Tayo has played a pivotal role in designing and optimizing data pipelines for various industries, including finance, healthcare, and auto industries. As a thought leader in the field, Tayo believes that the power of data lies in its ability to drive informed decision-making and is committed to helping businesses leverage the full potential of their data in the cloud era. When he’s not crafting data pipelines, you can find Tayo exploring the latest trends in technology, hiking in the great outdoors, or tinkering with gadgetry and software.

Read More

Learn how Amazon Pharmacy created their LLM-based chat-bot using Amazon SageMaker

Learn how Amazon Pharmacy created their LLM-based chat-bot using Amazon SageMaker

Amazon Pharmacy is a full-service pharmacy on Amazon.com that offers transparent pricing, clinical and customer support, and free delivery right to your door. Customer care agents play a crucial role in quickly and accurately retrieving information related to pharmacy information, including prescription clarifications and transfer status, order and dispensing details, and patient profile information, in real time. Amazon Pharmacy provides a chat interface where customers (patients and doctors) can talk online with customer care representatives (agents). One challenge that agents face is finding the precise information when answering customers’ questions, because the diversity, volume, and complexity of healthcare’s processes (such as explaining prior authorizations) can be daunting. Finding the right information, summarizing it, and explaining it takes time, slowing down the speed to serve patients.

To tackle this challenge, Amazon Pharmacy built a generative AI question and answering (Q&A) chatbot assistant to empower agents to retrieve information with natural language searches in real time, while preserving the human interaction with customers. The solution is HIPAA compliant, ensuring customer privacy. In addition, agents submit their feedback related to the machine-generated answers back to the Amazon Pharmacy development team, so that it can be used for future model improvements.

In this post, we describe how Amazon Pharmacy implemented its customer care agent assistant chatbot solution using AWS AI products, including foundation models in Amazon SageMaker JumpStart to accelerate its development. We start by highlighting the overall experience of the customer care agent with the addition of the large language model (LLM)-based chatbot. Then we explain how the solution uses the Retrieval Augmented Generation (RAG) pattern for its implementation. Finally, we describe the product architecture. This post demonstrates how generative AI is integrated into an already working application in a complex and highly regulated business, improving the customer care experience for pharmacy patients.

The LLM-based Q&A chatbot

The following figure shows the process flow of a patient contacting Amazon Pharmacy customer care via chat (Step 1). Agents use a separate internal customer care UI to ask questions to the LLM-based Q&A chatbot (Step 2). The customer care UI then sends the request to a service backend hosted on AWS Fargate (Step 3), where the queries are orchestrated through a combination of models and data retrieval processes, collectively known as the RAG process. This process is the heart of the LLM-based chatbot solution and its details are explained in the next section. At the end of this process, the machine-generated response is returned to the agent, who can review the answer before providing it back to the end-customer (Step 4). It should be noted that agents are trained to exercise judgment and use the LLM-based chatbot solution as a tool that augments their work, so they can dedicate their time to personal interactions with the customer. Agents also label the machine-generated response with their feedback (for example, positive or negative). This feedback is then used by the Amazon Pharmacy development team to improve the solution (through fine-tuning or data improvements), forming a continuous cycle of product development with the user (Step 5).

Process flow and high level architecture

The following figure shows an example from a Q&A chatbot and agent interaction. Here, the agent was asking about a claim rejection code. The Q&A chatbot (Agent AI Assistant) answers the question with a clear description of the rejection code. It also provides the link to the original documentation for the agents to follow up, if needed.

Example screenshot from Q&A chatbot

Accelerating the ML model development

In the previous figure depicting the chatbot workflow, we skipped the details of how to train the initial version of the Q&A chatbot models. To do this, the Amazon Pharmacy development team benefited from using SageMaker JumpStart. SageMaker JumpStart allowed the team to experiment quickly with different models, running different benchmarks and tests, failing fast as needed. Failing fast is a practice in which scientists and developers quickly build a solution that is as realistic as possible, learn from it, and improve it in the next iteration. After the team decided on the model and performed any necessary fine-tuning and customization, they used SageMaker hosting to deploy the solution. The reuse of the foundation models in SageMaker JumpStart allowed the development team to cut months of work that otherwise would have been needed to train models from scratch.
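
As an illustration of how quickly a candidate model can be stood up for this kind of experimentation, the SageMaker Python SDK exposes JumpStart foundation models directly; the model ID and instance type below are examples, not necessarily the ones Amazon Pharmacy selected:

from sagemaker.jumpstart.model import JumpStartModel

# Example JumpStart model ID; swap in whichever foundation model you are benchmarking
model = JumpStartModel(model_id="huggingface-llm-falcon-7b-instruct-bf16")
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")

response = predictor.predict({"inputs": "What does prior authorization mean for a prescription?"})
print(response)

# Tear down the endpoint when the experiment is finished
predictor.delete_endpoint()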

The RAG design pattern

One core part of the solution is the use of the Retrieval Augmented Generation (RAG) design pattern for implementing Q&A solutions. The first step in this pattern is to identify a set of known question and answer pairs, which is the initial ground truth for the solution. The next step is to convert the questions to a better representation for the purpose of similarity and searching, which is called embedding (we embed a higher-dimensional object into a hyperplane with fewer dimensions). This is done through an embedding-specific foundation model. These embeddings are used as indexes to the answers, much like how a database index maps a primary key to a row.

We’re now ready to support new queries coming from the customer. As explained previously, customers send their queries to agents, who then interface with the LLM-based chatbot. Within the Q&A chatbot, the query is converted to an embedding and then used as a search key for a matching index (from the previous step). The matching criterion is based on a similarity model, such as FAISS or Amazon OpenSearch Service (for more details, refer to Amazon OpenSearch Service’s vector database capabilities explained).

When there are matches, the top answers are retrieved and used as the prompt context for the generative model. This corresponds to the second step in the RAG pattern: the generative step. In this step, the prompt is sent to the LLM (the generator foundation model), which composes the final machine-generated response to the original question. This response is provided back through the customer care UI to the agent, who validates the answer, edits it if needed, and sends it back to the patient. The following diagram illustrates this process.

Rag flow
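
The following is a condensed sketch of that flow, assuming two SageMaker endpoints (one for embeddings, one for generation) and a FAISS index built offline over the known question embeddings; the endpoint names and payload formats are placeholders that depend on the chosen models:

import json
import boto3
import numpy as np
import faiss  # similarity index over the known question embeddings, built offline

smr = boto3.client("sagemaker-runtime")

EMBED_ENDPOINT = "embedding-endpoint"   # placeholder endpoint names
LLM_ENDPOINT = "generator-endpoint"

def embed(text):
    # Call the embedding foundation model endpoint; the response shape depends on the model chosen
    resp = smr.invoke_endpoint(EndpointName=EMBED_ENDPOINT,
                               ContentType="application/json",
                               Body=json.dumps({"text_inputs": [text]}))
    return np.array(json.loads(resp["Body"].read())["embedding"], dtype="float32")

def answer(question, index: faiss.IndexFlatIP, answers: list, k=3):
    # Retrieve the top-k known answers most similar to the question embedding
    query = embed(question).reshape(1, -1)
    _, idx = index.search(query, k)
    context = "\n".join(answers[i] for i in idx[0])

    # Generative step: send the retrieved context plus the question to the LLM endpoint
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    resp = smr.invoke_endpoint(EndpointName=LLM_ENDPOINT,
                               ContentType="application/json",
                               Body=json.dumps({"inputs": prompt}))
    return json.loads(resp["Body"].read())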

Managing the knowledge base

As we learned with the RAG pattern, the first step in performing Q&A consists of retrieving the data (the question and answer pairs) to be used as context for the LLM prompt. This data is referred to as the chatbot’s knowledge base. Examples of this data are Amazon Pharmacy internal standard operating procedures (SOPs) and information available in Amazon Pharmacy Help Center. To facilitate the indexing and the retrieval process (as described previously), it’s often useful to gather all this information, which may be hosted across different solutions such as in wikis, files, and databases, into a single repository. In the particular case of the Amazon Pharmacy chatbot, we use Amazon Simple Storage Service (Amazon S3) for this purpose because of its simplicity and flexibility.

Solution overview

The following figure shows the solution architecture. The customer care application and the LLM-based Q&A chatbot are deployed in their own VPC for network isolation. The connection between the VPC endpoints is realized through AWS PrivateLink, guaranteeing their privacy. The Q&A chatbot likewise has its own AWS account for role separation, isolation, and ease of monitoring for security, cost, and compliance purposes. The Q&A chatbot orchestration logic is hosted in Fargate with Amazon Elastic Container Service (Amazon ECS). To set up PrivateLink, a Network Load Balancer proxies the requests to an Application Load Balancer, which terminates the end-client TLS connection and hands requests off to Fargate. The primary storage service is Amazon S3. As mentioned previously, the related input data is imported into the desired format inside the Q&A chatbot account and persisted in S3 buckets.

Solutions architecture

When it comes to the machine learning (ML) infrastructure, Amazon SageMaker is at the center of the architecture. As explained in the previous sections, two models are used, the embedding model and the LLM model, and these are hosted in two separate SageMaker endpoints. By using the SageMaker data capture feature, we can log all inference requests and responses for troubleshooting purposes, with the necessary privacy and security constraints in place. Next, the feedback taken from the agents is stored in a separate S3 bucket.
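
When endpoints are deployed with the SageMaker Python SDK, enabling that logging looks roughly like the following; the bucket name is a placeholder:

from sagemaker.model_monitor import DataCaptureConfig

capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,                                  # log every request and response
    destination_s3_uri="s3://example-bucket/datacapture/",    # placeholder bucket
)

# Passed to model.deploy(...) so inference requests and responses are persisted to Amazon S3, e.g.:
# predictor = model.deploy(initial_instance_count=1,
#                          instance_type="ml.g5.2xlarge",
#                          data_capture_config=capture_config)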

The Q&A chatbot is designed to be a multi-tenant solution and support additional health products from Amazon Health Services, such as Amazon Clinic. For example, the solution is deployed with AWS CloudFormation templates for infrastructure as code (IaC), allowing different knowledge bases to be used.

Conclusion

This post presented the technical solution for Amazon Pharmacy generative AI customer care improvements. The solution consists of a question answering chatbot implementing the RAG design pattern on SageMaker and foundation models in SageMaker JumpStart. With this solution, customer care agents can assist patients more quickly, while providing precise, informative, and concise answers.

The architecture uses modular microservices with separate components for knowledge base preparation and loading, chatbot (instruction) logic, embedding indexing and retrieval, LLM content generation, and feedback supervision. The latter is especially important for ongoing model improvements. The foundation models in SageMaker JumpStart are used for fast experimentation with model serving being done with SageMaker endpoints. Finally, the HIPAA-compliant chatbot server is hosted on Fargate.

In summary, we saw how Amazon Pharmacy is using generative AI and AWS to improve customer care while prioritizing responsible AI principles and practices.

You can start experimenting with foundation models in SageMaker JumpStart today to find the right foundation models for your use case and start building your generative AI application on SageMaker.


About the author

Burak Gozluklu is a Principal AI/ML Specialist Solutions Architect located in Boston, MA. He helps global customers adopt AWS technologies and specifically AI/ML solutions to achieve their business objectives. Burak has a PhD in Aerospace Engineering from METU, an MS in Systems Engineering, and a post-doc in system dynamics from MIT in Cambridge, MA. Burak is passionate about yoga and meditation.

Jangwon Kim is a Sr. Applied Scientist at Amazon Health Store & Tech. He has expertise in LLM, NLP, Speech AI, and Search. Prior to joining Amazon Health, Jangwon was an applied scientist at Amazon Alexa Speech. He is based out of Los Angeles.

Alexandre Alves is a Sr. Principal Engineer at Amazon Health Services, specializing in ML, optimization, and distributed systems. He helps deliver wellness-forward health experiences.

Nirvay Kumar is a Sr. Software Dev Engineer at Amazon Health Services, leading architecture within Pharmacy Operations after many years in Fulfillment Technologies. With expertise in distributed systems, he has cultivated a growing passion for AI’s potential. Nirvay channels his talents into engineering systems that solve real customer needs with creativity, care, security, and a long-term vision. When not hiking the mountains of Washington, he focuses on thoughtful design that anticipates the unexpected. Nirvay aims to build systems that withstand the test of time and serve customers’ evolving needs.

Read More

Keeping an eye on your cattle using AI technology

Keeping an eye on your cattle using AI technology

At Amazon Web Services (AWS), not only are we passionate about providing customers with a variety of comprehensive technical solutions, but we’re also keen on deeply understanding our customers’ business processes. We adopt a third-party perspective and objective judgment to help customers sort out their value propositions, collect pain points, propose appropriate solutions, and create the most cost-effective and usable prototypes to help them systematically achieve their business goals.

This method is called working backwards at AWS. It means putting aside technology and solutions, starting from the expected results of customers, confirming their value, and then deducing what needs to be done in reverse order before finally implementing a solution. During the implementation phase, we also follow the concept of minimum viable product and strive to quickly form a prototype that can generate value within a few weeks, and then iterate on it.

Today, let’s review a case study where AWS and New Hope Dairy collaborated to build a smart farm on the cloud. This post gives you a deeper understanding of what AWS can provide for building a smart farm and how to build smart farm applications on the cloud with AWS experts.

Project background

Milk is a nutritious beverage. In consideration of national health, China has been actively promoting the development of the dairy industry. According to data from Euromonitor International, the sale of dairy products in China reached 638.5 billion RMB in 2020 and is expected to reach 810 billion RMB in 2025. In addition, the compound annual growth rate in the past 14 years has also reached 10 percent, showing rapid development.

On the other hand, as of 2022, most of the revenue in the Chinese dairy industry still comes from liquid milk. Sixty percent of the raw milk is used for liquid milk and yogurt, and another 20 percent is milk powder—a derivative of liquid milk. Only a very small amount is used for highly processed products such as cheese and cream.

Liquid milk is a lightly processed product and its output, quality, and cost are closely linked to raw milk. This means that if the dairy industry wants to free capacity to focus on producing highly processed products, create new products, and conduct more innovative biotechnology research, it must first improve and stabilize the production and quality of raw milk.

As a dairy industry leader, New Hope Dairy has been thinking about how to improve the efficiency of its ranch operations and increase the production and quality of raw milk. New Hope Dairy hopes to use the third-party perspective and technological expertise of AWS to facilitate innovation in the dairy industry. With support and promotion from Liutong Hu, VP and CIO of New Hope Dairy, the AWS customer team began to organize operations and potential innovation points for the dairy farms.

Dairy farm challenges

AWS is an expert in the field of cloud technology, but to implement innovation in the dairy industry, professional advice from dairy subject matter experts is necessary. Therefore, we conducted several in-depth interviews with Liangrong Song, the Deputy Director of Production Technology Center of New Hope Dairy, the ranch management team, and nutritionists to understand some of the issues and challenges facing the farm.

First is taking inventory of reserve cows

The dairy cows on the ranch are divided into two types: dairy cows and reserve cows. Dairy cows are mature and continuously produce milk, while reserve cows are cows that have not yet reached the age to produce milk. Large and medium-sized farms usually provide reserve cows with a larger open activity area to create a more comfortable growing environment.

However, both dairy cows and reserve cows are assets of the farm and need to be inventoried monthly. Dairy cows are milked every day, and because they are relatively still during milking, inventory tracking is easy. However, reserve cows are in an open space and roam freely, which makes it inconvenient to inventory them. Each time inventory is taken, several workers count the reserve cows repeatedly from different areas, and finally, the numbers are checked. This process consumes one to two days for several workers, and often there are problems with aligning the counts or uncertainties about whether each cow has been counted.

Significant time can be saved if we have a way to inventory reserve cows quickly and accurately.

Second is identifying lame cattle

Currently, most dairy companies use a breed named Holstein to produce milk. Holsteins are the black and white cows most of us are familiar with. Despite most dairy companies using the same breed, there are still differences in milk production quantity and quality among different companies and ranches. This is because the health of dairy cows directly affects milk production.

However, cows cannot express discomfort on their own like humans can, and it isn’t practical for veterinarians to give thousands of cows physical examinations regularly. Therefore, we have to use external indicators to quickly judge the health status of cows.

smart ranch with aws

The external indicators of a cow’s health include body condition score and lameness degree. Body condition score is largely related to the cow’s body fat percentage and is a long-term indicator, while lameness is a short-term indicator caused by leg problems or foot infections and other issues that affect the cow’s mood, health, and milk production. Additionally, adult Holstein cows can weigh over 500 kg, which can cause significant harm to their feet if they aren’t stable. Therefore, when lameness occurs, veterinarians should intervene as soon as possible.

According to a 2014 study, the proportion of severely lame cows in China can be as high as 31 percent. Although the situation might have improved since the study, the veterinarian count on farms is extremely limited, making it difficult to monitor cows regularly. When lameness is detected, the situation is often severe, and treatment is time-consuming and difficult, and milk production is already affected.

If we have a way to timely detect lameness in cows and prompt veterinarians to intervene at the mild lameness stage, the overall health and milk production of the cows will increase, and the performance of the farm will improve.

Lastly, there is feed cost optimization

Within the livestock industry, feed is the biggest variable cost. To ensure the quality and inventory of feed, farms often need to purchase feed ingredients from domestic and overseas suppliers and deliver them to feed formulation factories for processing. There are many types of modern feed ingredients, including soybean meal, corn, alfalfa, oat grass, and so on, which means that there are many variables at play. Each type of feed ingredient has its own price cycle and price fluctuations. During significant fluctuations, the total cost of feed can fluctuate by more than 15 percent, causing a significant impact.

Feed costs fluctuate, but dairy product prices are relatively stable over the long term. Consequently, under otherwise unchanged conditions, the overall profit can fluctuate significantly purely due to feed cost changes.

To avoid this fluctuation, it’s necessary to consider storing more ingredients when prices are low. But stocking also needs to consider whether the price is genuinely at the trough and what quantity of feed should be purchased according to the current consumption rate.

If we have a way to precisely forecast feed consumption and combine it with the overall price trend to suggest the best time and quantity of feed to purchase, we can reduce costs and increase efficiency on the farm.

It’s evident that these issues are directly related to the customer’s goal of improving farm operational efficiency, and the corresponding levers are, respectively, freeing up labor, increasing production, and reducing costs. Through discussions on the difficulty and value of solving each issue, we chose increasing production as the starting point and prioritized solving the problem of lame cows.

Research

Before discussing technology, research had to be conducted. The research was jointly conducted by the AWS customer team, the AWS Generative AI Innovation Center, which managed the machine learning algorithm models, and AWS AI Shanghai Lablet, which provides algorithm consultation on the latest computer vision research and the expert farming team from New Hope Dairy. The research was divided into several parts:

  • Understanding the traditional paper-based identification method of lame cows and developing a basic understanding of what lame cows are.
  • Confirming existing solutions, including those used in farms and in the industry.
  • Conducting farm environment research to understand the physical situation and limitations.

Through studying materials and observing on-site videos, the teams gained a basic understanding of lame cows. Readers can also get a basic idea of the posture of lame cows through the animated image below.

Lame Cows

In contrast to a relatively healthy cow.

healthy cow

Lame cows have visible differences in posture and gait compared to healthy cows.

Regarding existing solutions, most ranches rely on visual inspection by veterinarians and nutritionists to identify lame cows. In the industry, there are solutions that use wearable pedometers and accelerometers for identification, as well as solutions that use partitioned weighbridges for identification, but both are relatively expensive. For the highly competitive dairy industry, we need to minimize identification costs and the costs and dependence on non-generic hardware.

After discussing and analyzing the information with ranch veterinarians and nutritionists, the AWS Generative AI Innovation Center experts decided to use computer vision (CV) for identification, relying only on ordinary hardware: civilian surveillance cameras, which don’t add any additional burden to the cows and reduce costs and usage barriers.

After deciding on this direction, we visited a medium-sized farm with thousands of cows on site, investigated the ranch environment, and determined the location and angle of camera placement.

Initial proposal

Now, for the solution. The core of our CV-based solution consists of the following steps:

  • Cow identification: Identify multiple cows in a single frame of video and mark the position of each cow.
  • Cow tracking: While video is recording, we need to continuously track cows as the frames change and assign a unique number to each cow.
  • Posture marking: Reduce the dimensionality of cow movements by converting cow images to marked points.
  • Anomaly identification: Identify anomalies in the marked points’ dynamics.
  • Lame cow algorithm: Normalize the anomalies to obtain a score to determine the degree of cow lameness.
  • Threshold determination: Obtain a threshold based on expert inputs.

According to the judgment of the AWS Generative AI Innovation Center experts, the first few steps are generic requirements that can be solved using open-source models, while the latter steps require us to use mathematical methods and expert intervention.

Difficulties in the solution

To balance cost and performance, we chose the YOLOv5l model, a medium-sized pre-trained model for cow recognition, with an input width of 640 pixels, which provides good value for this scene.

While YOLOv5 is responsible for recognizing and tagging cows in a single image, in reality, videos consist of multiple images (frames) that change continuously. YOLOv5 cannot identify that cows in different frames belong to the same individual. To track and locate a cow across multiple images, another model called SORT is needed.

SORT stands for simple online and realtime tracking, where online means it considers only the current and previous frames to track without consideration of any other frames, and realtime means it can identify the object’s identity immediately.

After the development of SORT, many engineers implemented and optimized it, leading to the development of OC-SORT, which considers the appearance of the object, DeepSORT (and its upgraded version, StrongSORT), which includes human appearance, and ByteTrack, which uses a two-stage association linker to consider low-confidence recognition. After testing, we found that for our scene, DeepSORT’s appearance tracking algorithm is more suitable for humans than for cows, and ByteTrack’s tracking accuracy is slightly weaker. As a result, we ultimately chose OC-SORT as our tracking algorithm.
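
A minimal sketch of the detection step with the pre-trained YOLOv5l model is shown below; the video path is a placeholder, and the hand-off to the OC-SORT tracker is only indicated in a comment because its exact interface depends on the implementation used:

import cv2
import torch

# Load the pre-trained YOLOv5l detector from the Ultralytics hub (medium-sized model, 640 px input)
model = torch.hub.load("ultralytics/yolov5", "yolov5l")
model.conf = 0.4  # confidence threshold; tune for the ranch footage

cap = cv2.VideoCapture("ranch_walkway.mp4")  # placeholder video path
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert BGR (OpenCV) to RGB and run detection; results.xyxy[0] is (x1, y1, x2, y2, conf, class)
    results = model(frame[:, :, ::-1], size=640)
    detections = results.xyxy[0].cpu().numpy()
    # Each detection row would next be handed to the OC-SORT tracker, which assigns a
    # persistent ID per cow across frames (tracker API omitted in this sketch)
    print(frame_idx, len(detections), "cows detected")
    frame_idx += 1
cap.release()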

Next, we use DeepLabCut (DLC for short) to mark the skeletal points of the cows. DLC is a markerless model, which means that although different points, such as the head and limbs, might have different meanings, they are all just points for DLC, which only requires us to mark the points and train the model.

This leads to a new question: how many points should we mark on each cow and where should we mark them? The answer to this question affects the workload of marking, training, and subsequent inference efficiency. To solve this problem, we must first understand how to identify lame cows.

Based on our research and the inputs of our expert clients, lame cows in videos exhibit the following characteristics:

  • An arched back: The neck and back are curved, forming a triangle with the root of the neck bone (arched-back).
  • Frequent nodding: Each step can cause the cow to lose balance or slip, resulting in frequent nodding (head bobbing).
  • Unstable gait: The cow’s gait changes after a few steps, with slight pauses (gait pattern change).

Comparison between healthy cow and lame cow

With regards to neck and back curvature as well as nodding, experts from AWS Generative AI Innovation Center have determined that marking only seven back points (one on the head, one at the base of the neck, and five on the back) on cattle can result in good identification. Since we now have a frame of identification, we should also be able to recognize unstable gait patterns.

Next, we use mathematical expressions to represent the identification results and form algorithms.

Human identification of these problems isn’t difficult, but precise algorithms are required for computer identification. For example, how does a program know the degree of curvature of a cow’s back given a set of cow back coordinate points? How does it know if a cow is nodding?

In terms of back curvature, we first consider treating the cow’s back as an angle and then we find the vertex of that angle, which allows us to calculate the angle. The problem with this method is that the spine might have bidirectional curvature, making the vertex of the angle difficult to identify. This requires switching to other algorithms to solve the problem.

key-points-of-a-cow
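
As a simple illustration of that first idea, the angle at a chosen vertex can be computed from three back key points; the coordinates below are made up:

import numpy as np

def angle_at_vertex(p_before, vertex, p_after):
    # Angle (in degrees) formed at `vertex` by the two segments to its neighbors
    v1 = np.asarray(p_before, dtype=float) - np.asarray(vertex, dtype=float)
    v2 = np.asarray(p_after, dtype=float) - np.asarray(vertex, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Made-up key points: base of the neck, a mid-back point chosen as the vertex, and the rear of the back
neck_base, mid_back, rear_back = (120, 210), (200, 185), (280, 205)
print(angle_at_vertex(neck_base, mid_back, rear_back))   # a smaller angle suggests a more arched back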

In terms of nodding, we first considered using the Fréchet distance to determine if the cow is nodding by comparing the difference in the curve of the cow’s overall posture. However, the problem is that the cow’s skeletal points might be displaced, causing significant distance between similar curves. To solve this problem, we need to take out the position of the head relative to the recognition box and normalize it.

After normalizing the position of the head, we encountered a new problem. In the image that follows, the graph on the left shows the change in the position of the cow’s head. We can see that due to recognition accuracy issues, the position of the head point will constantly shake slightly. We need to remove these small movements and find the relatively large movement trend of the head. This is where some knowledge of signal processing is needed. By using a Savitzky-Golay filter, we can smooth out a signal and obtain its overall trend, making it easier for us to identify nodding, as shown by the orange curve in the graph on the right.

key points curve

Additionally, after dozens of hours of video recognition, we found that some cows scored as having extremely high back curvature did not actually have arched backs. Further investigation revealed that most of the cows used to train the DLC model were black or black and white, and there were few cows that were mostly white or nearly pure white, so the model made mistakes on cows with large white areas on their bodies, as shown by the red arrow in the following figure. This can be corrected through further model training.

In addition to solving the preceding problems, there were other generic problems that needed to be solved:

  • There are two paths in the video frame, and cows in the distance might also be detected, which causes confusion.
  • The paths also have some curvature, so a cow’s apparent body length becomes shorter at the sides of the path, making its posture easy to misjudge.
  • Because of overlap between multiple cows or occlusion from the fence, the same cow might be identified as two cows.
  • Because of tracking parameters and occasional frame skipping by the camera, cows are sometimes not tracked correctly, resulting in ID confusion.

In the short term, in line with our agreement with New Hope Dairy to deliver a minimum viable product and then iterate on it, these problems can usually be handled by outlier-judgment algorithms combined with confidence filtering. Whatever cannot be handled this way is treated as invalid data, which requires additional training as we continuously iterate on our algorithms and models.
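The exact rules are tuned per scene, but the gist of the confidence filtering is sketched below; the thresholds are illustrative assumptions, not production values.

```python
import numpy as np

CONF_THRESHOLD = 0.6       # minimum keypoint confidence (illustrative)
MAX_BAD_FRAME_RATIO = 0.3  # discard a clip if too many frames are unreliable

def is_valid_clip(keypoint_confidences: np.ndarray) -> bool:
    """keypoint_confidences: (T, K) per-frame, per-keypoint confidence scores
    from the pose model. A frame is 'bad' if any keypoint falls below the
    threshold; a clip with too many bad frames is treated as invalid data
    and routed back for re-labeling and retraining."""
    bad_frames = (keypoint_confidences < CONF_THRESHOLD).any(axis=1)
    return bad_frames.mean() <= MAX_BAD_FRAME_RATIO
```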

In the long term, AWS AI Shanghai Lablet provided suggestions for future experiments to solve the preceding problems based on their object-centric research: Bridging the Gap to Real-World Object-Centric Learning and Self-supervised Amodal Video Object Segmentation. Besides discarding the outlier data, the issues can also be addressed by developing more precise object-level models for pose estimation, amodal segmentation, and supervised tracking. However, traditional vision pipelines for these tasks typically require extensive labeling. Object-centric learning focuses on tackling the binding problem of pixels to objects without additional supervision. The binding process not only provides information on the location of objects but also results in robust and adaptable object representations for downstream tasks. Because the object-centric pipeline focuses on self-supervised or weakly supervised settings, we can improve performance without significantly increasing labeling costs for our customers.

After solving this series of problems, and combining the scores given by the farm veterinarian and nutritionist, we obtained a comprehensive lameness score that classifies cows as severely, moderately, or mildly lame. The solution also identifies multiple body posture attributes of each cow, which helps with further analysis and judgment.

Within weeks, we developed an end-to-end solution for identifying lame cows. The hardware camera for this solution cost only 300 RMB, and Amazon SageMaker batch inference on a g4dn.xlarge instance took about 50 hours for 2 hours of video, also totaling only about 300 RMB. In production, if five batches of cows are detected per week (roughly 10 hours of video), and including storage for the rolling videos and data, the monthly detection cost for a medium-sized ranch with several thousand cows is less than 10,000 RMB.

Currently, our machine learning model process is as follows:

  1. Raw video is recorded.
  2. Cows are detected and identified.
  3. Each cow is tracked, and key points are detected.
  4. Each cow’s movement is analyzed.
  5. A lameness score is determined.

identification process

Model deployment

We’ve described the machine learning solution for identifying lame cows. Now we need to deploy these models on SageMaker, as shown in the following architecture diagram.

Architecture diagram
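As a rough sketch of the batch inference path, the SageMaker Python SDK lets us package the models and run a batch transform on a g4dn.xlarge instance; the image URI, S3 paths, and IAM role below are placeholders, not the actual deployment values.

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder

# Package the container image and trained model artifacts as a SageMaker model.
model = Model(
    image_uri="111122223333.dkr.ecr.us-west-2.amazonaws.com/cow-lameness:latest",  # placeholder
    model_data="s3://example-bucket/models/cow-lameness/model.tar.gz",  # placeholder
    role=role,
    sagemaker_session=session,
)

# Batch transform: each input object is a recorded walkway video,
# and scores are written back to S3.
transformer = model.transformer(
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    output_path="s3://example-bucket/lameness-scores/",  # placeholder
)
transformer.transform(data="s3://example-bucket/raw-videos/", content_type="video/mp4")
transformer.wait()
```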

Business implementation

Of course, what we’ve discussed so far is just the core of our technical solution. To integrate the entire solution into the business process, we also must address the following issues:

  • Data feedback: For example, we must provide veterinarians with an interface to filter and view lame cows that need to be processed and collect data during this process to use as training data.
  • Cow identification: After a veterinarian sees a lame cow, they also need to know the cow’s identity, such as its number and pen.
  • Cow positioning: In a pen with hundreds of cows, quickly locate the target cow.
  • Data mining: For example, find out how the degree of lameness affects feeding, rumination, rest, and milk production.
  • Data-driven: For example, identify the genetic, physiological, and behavioral characteristics of lame cows to achieve optimal breeding and reproduction.

Only by addressing these issues can the solution truly solve the business problem, and the collected data can generate long-term value. Some of these problems are system integration issues, while others are technology and business integration issues. We will share further information about these issues in future articles.

Summary

In this article, we briefly explained how the AWS Customer Solutions team innovates quickly based on the customer’s business. This mechanism has several characteristics:

  • Business led: Prioritize understanding the customer’s industry and business processes on site and in person before discussing technology, and then delve into the customer’s pain points, challenges, and problems to identify important issues that can be solved with technology.
  • Immediately available: Provide a simple but complete and usable prototype directly to the customer for testing, validation, and rapid iteration within weeks, not months.
  • Minimal cost: Minimize or even eliminate the customer’s costs before the value is truly validated, avoiding concerns about the future. This aligns with the AWS frugality leadership principle.

In our collaborative innovation project with the dairy industry, we not only started from the business perspective, identifying specific business problems with business experts, but also conducted on-site investigations at the farm and factory with the customer. We determined the camera placement on site, installed and deployed the cameras, and deployed the video streaming solution. Experts from the AWS Generative AI Innovation Center then dissected the customer’s requirements and developed the algorithm, and a solutions architect engineered the end-to-end solution around it.

With each inference, we could obtain thousands of decomposed and tagged cow walking videos, each with the original video ID, cow ID, lameness score, and various detailed scores. The complete calculation logic and raw gait data were also retained for subsequent algorithm optimization.

Lameness data can not only be used for early intervention by veterinarians, but can also be combined with milking machine data for cross-analysis, providing an additional validation dimension and answering further business questions, such as: What are the physical characteristics of cows with the highest milk yield? What is the effect of lameness on milk production? What is the main cause of lameness, and how can it be prevented? This information will provide new ideas for farm operations.

The story of identifying lame cows ends here, but the story of farm innovation has just begun. In subsequent articles, we will continue to discuss how we work closely with customers to solve other problems.


About the Authors


Hao Huang
is an applied scientist at the AWS Generative AI Innovation Center. He specializes in computer vision (CV) and vision-language models (VLMs). Recently, he has developed a strong interest in generative AI technologies and has already collaborated with customers to apply these cutting-edge technologies to their business. He is also a reviewer for AI conferences such as ICCV and AAAI.


Peiyang He
is a senior data scientist at the AWS Generative AI Innovation Center. She works with customers across a diverse spectrum of industries to solve their most pressing and innovative business needs leveraging GenAI/ML solutions. In her spare time, she enjoys skiing and traveling.


Xuefeng Liu
leads a science team at the AWS Generative AI Innovation Center in the Asia Pacific and Greater China regions. His team partners with AWS customers on generative AI projects, with the goal of accelerating customers’ adoption of generative AI.


Tianjun Xiao
is a senior applied scientist at the AWS AI Shanghai Lablet, co-leading the computer vision efforts. Presently, his primary focus lies in the realms of multimodal foundation models and object-centric learning. He is actively investigating their potential in diverse applications, including video analysis, 3D vision and autonomous driving.


Zhang Dai
is an AWS senior solutions architect for the China Geo Business Sector. He helps companies of various sizes achieve their business goals by providing consultancy on business processes, user experience, and cloud technology. He is a prolific blog writer and also the author of two books: The Modern Autodidact and Designing Experience.


Jianyu Zeng
is a senior customer solutions manager at AWS, whose responsibility is to support customers, such as New Hope group, during their cloud transition and assist them in realizing business value through cloud-based technology solutions. With a strong interest in artificial intelligence, he is constantly exploring ways to leverage AI to drive innovative changes in our customer’s businesses.


Carol Tong Min
is a senior business development manager, responsible for Key Accounts in GCR GEO West, including two important enterprise customers: Jiannanchun Group and New Hope Group. She is customer obsessed, and always passionate about supporting and accelerating customers’ cloud journey.

Nick Jiang is a senior sales specialist on the AIML SSO team in China. He focuses on bringing innovative AIML solutions to customers and helping them build AI-related workloads on AWS.


Personalize your search results with Amazon Personalize and Amazon OpenSearch Service integration


Amazon Personalize has launched a new integration with Amazon OpenSearch Service that enables you to personalize search results for each user and assists in predicting their search needs. The Amazon Personalize Search Ranking plugin within OpenSearch Service allows you to improve the end-user engagement and conversion from your website and app search by taking advantage of the deep learning capabilities offered by Amazon Personalize. This feature is also available with self-managed OpenSearch.

Search is crucial in engaging users because it brings high-intent traffic from individuals seeking specific products or categories. Previously, customers found it challenging to capitalize on this traffic and provide relevant search results to their users due to infrastructure limitations or lack of ML expertise. This led to increased instances of users failing to find the items they were searching for. With the Amazon Personalize Search Ranking plugin, customers of OpenSearch Service version 2.9.0 or later can go beyond the traditional keyword matching approach and boost relevant items in an individual user’s search results based on their interests, context, and past interactions in real time. You can also fine-tune the level of personalization for every search query to ensure flexibility and control over the search experience.

AWS Partners like Cognizant are excited by the personalization possibilities that the Amazon Personalize Search Ranking plugin will unlock for their media and retail customers.

“Amazon Personalize has been proven to be highly impactful for many businesses with its cost-effective and streamlined implementation. With the release of the new Amazon Personalize Search Ranking plugin within Amazon OpenSearch Service, we can now rapidly deploy and implement real-time user personalization to search results. We are highly confident that it will deliver improved customer experience and satisfaction as well as increase conversion and clickthrough rates by two to three times. Personalized search is a differentiator, especially for media and retail platforms. We are really excited to be a launch partner with AWS on this release and are looking forward to helping businesses deliver personalized search solutions powered by Amazon Personalize.”

– Andy Huang, Head of AI/ML at Cognizant Servian.

In this post, we show you how search results are personalized for each user and how they change when you adjust the personalization weight. You specify a value closer to 0 to place less emphasis on personalization, and a value closer to 1 to re-rank search results with a higher level of personalization.
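To make this concrete, the following sketch creates a search pipeline with the Amazon Personalize ranking response processor and issues a personalized query. The domain endpoint, campaign ARN, IAM role, and index/field names are placeholders, and the exact processor fields should be verified against the Personalizing search results from OpenSearch documentation.

```python
import requests

ENDPOINT = "https://my-opensearch-domain.example.com"  # placeholder
AUTH = ("user", "password")  # placeholder; use SigV4 or your domain's auth method

# Create a search pipeline that re-ranks results with Amazon Personalize.
pipeline = {
    "description": "Personalized re-ranking",
    "response_processors": [
        {
            "personalized_search_ranking": {
                "campaign_arn": "arn:aws:personalize:us-west-2:111122223333:campaign/reranking",  # placeholder
                "item_id_field": "ITEM_ID",
                "recipe": "aws-personalized-ranking",
                "weight": "0.3",  # 0.0 = no personalization, 1.0 = fully personalized
                "iam_role_arn": "arn:aws:iam::111122223333:role/personalize-opensearch",  # placeholder
                "aws_region": "us-west-2",
            }
        }
    ],
}
requests.put(f"{ENDPOINT}/_search/pipeline/personalized_ranking", json=pipeline, auth=AUTH)

# Query with the pipeline; the user ID tells Amazon Personalize whose
# interaction history to use when re-ranking.
query = {
    "query": {"match": {"item_name": "grooming"}},
    "ext": {"personalize_request_parameters": {"user_id": "user-123"}},
}
requests.post(
    f"{ENDPOINT}/products/_search",
    params={"search_pipeline": "personalized_ranking"},
    json=query,
    auth=AUTH,
)
```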

Example use cases

To explore the impact of this new feature in greater detail, let’s review an example using a dataset from the Retail Demo Store.

First, we use OpenSearch Service to get search results for the search query “Grooming.” When the personalization weight is set to 0.0, no personalization takes place. As shown in the following table, the top five search results from OpenSearch Service show the grooming items with a higher gender affinity towards women (refer to the Gender_Affinity column, where M stands for male and F stands for female).

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 2 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 3 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |
| 4 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 5 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 6 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 7 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |
| 8 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 9 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 10 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |

Let’s suppose that a user with gender M (male) performs a search using the same query for “Grooming.” When the personalization weight is set to 0.3, the items with a gender affinity towards men get a subtle boost in ranking. In this example, Premium Men’s Razor, which was originally ranked number 6 in the previous table by OpenSearch Service, gets boosted to rank 2 in the updated table. Similarly, Razor Brand for Men shows up higher in position (rank 6) despite being the lowest-ranked item in the previous table.

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 2 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 3 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 4 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |
| 5 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 6 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |
| 7 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 8 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 9 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 10 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |

Next, we fine-tune the personalization weight to a value of 0.8 to get more personalized search results for “Grooming.” In the following table, the top four items in the search results are highly suited for men. Premium Men’s Razor and Razor Brand for Men shoot up further in rank. We also see other grooming items such as Minimalistic Razor and Fusion5 Razers for Men surfaced at the top of the search results even though they had a lower ranking in our first query.

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 2 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |
| 3 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 4 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 5 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 6 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 7 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |
| 8 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 9 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 10 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |

For more details on how to implement personalized search with OpenSearch Service, refer to Personalizing search results from OpenSearch.

Conclusion

With the new Amazon Personalize Search Ranking plugin, customers of both self-managed OpenSearch and OpenSearch Service v2.9 and above can boost relevant items in their search results by including signals from each user’s history, context, and preferences. The plugin enables you to exercise greater control over the level of personalization for each user and query type, and improve the overall search experience for your users.

For more details on Amazon Personalize, refer to the Amazon Personalize Developer Guide.


About the Authors


Shreeya Sharma
is a Sr. Technical Product Manager working with AWS AI/ML on the Amazon Personalize team. She has a background in computer science engineering, technology consulting, and data analytics.

Ketan Kulkarni is a Software Development Engineer with the Amazon Personalize team focused on building AI-powered recommender systems at scale. In his spare time, he enjoys reading and traveling.

Prashant Mishra is a Software Development Engineer on the Amazon Personalize team.

Branislav Kveton is a Principal Scientist at AWS AI Labs. He proposes, analyzes, and applies algorithms that learn incrementally, run in real time, and converge to near optimal solutions as the number of observations increases.
