Optimize pet profiles for Purina’s Petfinder application using Amazon Rekognition Custom Labels and AWS Step Functions

Purina US, a subsidiary of Nestle, has a long history of enabling people to more easily adopt pets through Petfinder, a digital marketplace of over 11,000 animal shelters and rescue groups across the US, Canada, and Mexico. As the leading pet adoption platform, Petfinder has helped millions of pets find their forever homes.

Purina consistently seeks ways to make the Petfinder platform even better for both shelters and rescue groups and pet adopters. One challenge they faced was adequately reflecting the specific breed of animals up for adoption. Because many shelter animals are mixed breed, identifying breeds and attributes correctly in the pet profile required manual effort, which was time consuming. Purina used artificial intelligence (AI) and machine learning (ML) to automate animal breed detection at scale.

This post details how Purina used Amazon Rekognition Custom Labels, AWS Step Functions, and other AWS Services to create an ML model that detects the pet breed from an uploaded image and then uses the prediction to auto-populate the pet attributes. The solution focuses on the fundamental principles of developing an AI/ML application workflow of data preparation, model training, model evaluation, and model monitoring.

Solution overview

Predicting animal breeds from an image needs custom ML models. Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources, often taking months to complete. Additionally, it often requires thousands or tens of thousands of hand-labeled images to provide the model with enough data to accurately make decisions. Setting up a workflow for auditing or reviewing model predictions to validate adherence to your requirements can further add to the overall complexity.

With Rekognition Custom Labels, which is built on the existing capabilities of Amazon Rekognition, you can identify the objects and scenes in images that are specific to your business needs. It is already trained on tens of millions of images across many categories. Instead of thousands of images, you can upload a small set of training images (typically a few hundred images or less per category) that are specific to your use case.

The solution uses the following services:

Amazon API Gateway is a fully managed service that makes it easy for developers to publish, maintain, monitor, and secure APIs at any scale.
The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework for defining cloud infrastructure as code with modern programming languages and deploying it through AWS CloudFormation.
AWS CodeBuild is a fully managed continuous integration service in the cloud. CodeBuild compiles source code, runs tests, and produces packages that are ready to deploy.
Amazon DynamoDB is a fast and flexible nonrelational database service for any scale.
AWS Lambda is an event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers.
Amazon Rekognition offers pre-trained and customizable computer vision (CV) capabilities to extract information and insights from your images and videos. With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs.
AWS Step Functions is a fully managed service that makes it easier to coordinate the components of distributed applications and microservices using visual workflows.
AWS Systems Manager is a secure end-to-end management solution for resources on AWS and in multicloud and hybrid environments. Parameter Store, a capability of Systems Manager, provides secure, hierarchical storage for configuration data management and secrets management.

Purina’s solution is deployed as an API Gateway HTTP endpoint, which routes the requests to obtain pet attributes. It uses Rekognition Custom Labels to predict the pet breed. The ML model is trained from pet profiles pulled from Purina’s database, assuming the primary breed label is the true label. DynamoDB is used to store the pet attributes. Lambda is used to process the pet attributes request by orchestrating between API Gateway, Amazon Rekognition, and DynamoDB.

The architecture is implemented as follows:

The Petfinder application routes the request to obtain the pet attributes via API Gateway.
API Gateway calls the Lambda function to obtain the pet attributes.
The Lambda function calls the Rekognition Custom Label inference endpoint to predict the pet breed.
The Lambda function uses the predicted pet breed information to perform a pet attributes lookup in the DynamoDB table. It collects the pet attributes and sends it back to the Petfinder application.

The following diagram illustrates the solution workflow.

The Petfinder team at Purina wants an automated solution that they can deploy with minimal maintenance. To deliver this, we use Step Functions to create a state machine that trains the models with the latest data, checks their performance on a benchmark set, and redeploys the models if they have improved. The model retraining is triggered from the number of breed corrections made by users submitting profile information.

Model training

Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources. Additionally, it often requires thousands or tens of thousands of hand-labeled images to provide the model with enough data to accurately make decisions. Generating this data can take months to gather and requires a large effort to label it for use in machine learning. A technique called transfer learning helps produce higher-quality models by borrowing the parameters of a pre-trained model, and allows models to be trained with fewer images.

Our challenge is that our data is not perfectly labeled: humans who enter the profile data can and do make mistakes. However, we found that for large enough data samples, the mislabeled images accounted for a sufficiently small fraction and model performance was not impacted more than 2% in accuracy.

ML workflow and state machine

The Step Functions state machine is developed to aid in the automatic retraining of the Amazon Rekognition model. Feedback is gathered during profile entry—each time a breed that has been inferred from an image is modified by the user to a different breed, the correction is recorded. This state machine is triggered from a configurable threshold number of corrections and additional pieces of data.

The state machine runs through several steps to create a solution:

Create train and test manifest files containing the list of Amazon Simple Storage Service (Amazon S3) image paths and their labels for use by Amazon Rekognition.
Create an Amazon Rekognition dataset using the manifest files.
Train an Amazon Rekognition model version after the dataset is created.
Start the model version when training is complete.
Evaluate the model and produce performance metrics.
If performance metrics are satisfactory, update the model version in Parameter Store.
Wait for the new model version to propagate in the Lambda functions (20 minutes), then stop the previous model.

Model evaluation

We use a random 20% holdout set taken from our data sample to validate our model. Because the breeds we detect are configurable, we don’t use a fixed dataset for validation during training, but we do use a manually labeled evaluation set for integration testing. The overlap of the manually labeled set and the model’s detectable breeds is used to compute metrics. If the model’s breed detection accuracy is above a specified threshold, we promote the model to be used in the endpoint.

The following are a few screenshots of the pet prediction workflow from Rekognition Custom Labels.

Deployment with the AWS CDK

The Step Functions state machine and associated infrastructure (including Lambda functions, CodeBuild projects, and Systems Manager parameters) are deployed with the AWS CDK using Python. The AWS CDK code synthesizes a CloudFormation template, which it uses to deploy all infrastructure for the solution.

Integration with the Petfinder application

The Petfinder application accesses the image classification endpoint through the API Gateway endpoint using a POST request containing a JSON payload with fields for the Amazon S3 path to the image and the number of results to be returned.

KPIs to be impacted

To justify the added cost of running the image inference endpoint, we ran experiments to determine the value that the endpoint adds for Petfinder. The use of the endpoint offers two main types of improvement:

Reduced effort for pet shelters who are creating the pet profiles
More complete pet profiles, which are expected to improve search relevance

Metrics for measuring effort and profile completeness include the number of auto-filled fields that are corrected, total number of fields filled, and time to upload a pet profile. Improvements to search relevance are indirectly inferred from measuring key performance indicators related to adoption rates. According to Purina, after the solution went live, the average time for creating a pet profile on the Petfinder application was reduced from 7 minutes to 4 minutes. That is a huge improvement and time savings because in 2022, 4 million pet profiles were uploaded.

Security

The data that flows through the architecture diagram is encrypted in transit and at rest, in accordance with the AWS Well-Architected best practices. During all AWS engagements, a security expert reviews the solution to ensure a secure implementation is provided.

Conclusion

With their solution based on Rekognition Custom Labels, the Petfinder team is able to accelerate the creation of pet profiles for pet shelters, reducing administrative burden on shelter personnel. The deployment based on the AWS CDK deploys a Step Functions workflow to automate the training and deployment process. To start using Rekognition Custom Labels, refer to Getting Started with Amazon Rekognition Custom Labels. You can also check out some Step Functions examples and get started with the AWS CDK.

About the Authors

Mason Cahill is a Senior DevOps Consultant with AWS Professional Services. He enjoys helping organizations achieve their business goals, and is passionate about building and delivering automated solutions on the AWS Cloud. Outside of work, he loves spending time with his family, hiking, and playing soccer.

Matthew Chasse is a Data Science consultant at Amazon Web Services, where he helps customers build scalable machine learning solutions. Matthew has a Mathematics PhD and enjoys rock climbing and music in his free time.

Rushikesh Jagtap is a Solutions Architect with 5+ years of experience in AWS Analytics services. He is passionate about helping customers to build scalable and modern data analytics solutions to gain insights from the data. Outside of work, he loves watching Formula1, playing badminton, and racing Go Karts.

Tayo Olajide is a seasoned Cloud Data Engineering generalist with over a decade of experience in architecting and implementing data solutions in cloud environments. With a passion for transforming raw data into valuable insights, Tayo has played a pivotal role in designing and optimizing data pipelines for various industries, including finance, healthcare, and auto industries. As a thought leader in the field, Tayo believes that the power of data lies in its ability to drive informed decision-making and is committed to helping businesses leverage the full potential of their data in the cloud era. When he’s not crafting data pipelines, you can find Tayo exploring the latest trends in technology, hiking in the great outdoors, or tinkering with gadgetry and software.

Vedere AI