Speaking the Language of the Genome: Gordon Bell Finalist Applies Large Language Models to Predict New COVID Variants

A finalist for the Gordon Bell special prize for high performance computing-based COVID-19 research has taught large language models (LLMs) a new lingo — gene sequences — that can unlock insights in genomics, epidemiology and protein engineering.

Published in October, the groundbreaking work is a collaboration by more than two dozen academic and commercial researchers from Argonne National Laboratory, NVIDIA, the University of Chicago and others.

The research team trained an LLM to track genetic mutations and predict variants of concern in SARS-CoV-2, the virus behind COVID-19. While most LLMs applied to biology to date have been trained on datasets of small molecules or proteins, this project is one of the first models trained on raw nucleotide sequences — the smallest units of DNA and RNA.

“We hypothesized that moving from protein-level to gene-level data might help us build better models to understand COVID variants,” said Arvind Ramanathan, computational biologist at Argonne, who led the project. “By training our model to track the entire genome and all the changes that appear in its evolution, we can make better predictions about not just COVID, but any disease with enough genomic data.”

The Gordon Bell awards, regarded as the Nobel Prize of high performance computing, will be presented at this week’s SC22 conference by the Association for Computing Machinery, which represents around 100,000 computing experts worldwide. Since 2020, the group has awarded a special prize for outstanding research that advances the understanding of COVID with HPC.

Training LLMs on a Four-Letter Language

LLMs have long been trained on human languages, which usually comprise a couple dozen letters that can be arranged into tens of thousands of words, and joined together into longer sentences and paragraphs. The language of biology, on the other hand, has only four letters representing nucleotides — A, T, G and C in DNA, or A, U, G and C in RNA — arranged into different sequences as genes.

While fewer letters may seem like a simpler challenge for AI, language models for biology are actually far more complicated. That’s because the genome — made up of over 3 billion nucleotides in humans, and about 30,000 nucleotides in coronaviruses — is difficult to break down into distinct, meaningful units.
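
Because the genome has no natural word boundaries, genomic language models typically impose their own. Below is a minimal sketch of one common approach, fixed-length k-mer tokenization; it is illustrative only and not necessarily the tokenizer used in this work.

```python
# Minimal sketch of fixed-length k-mer tokenization for nucleotide
# sequences: one common way genomic language models impose "word"
# boundaries on a genome. Illustrative only, not the paper's tokenizer.

def kmer_tokenize(sequence: str, k: int = 3) -> list:
    """Split a nucleotide sequence into non-overlapping k-mer tokens."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, k)]

print(kmer_tokenize("ATGGCGTACGTT"))
# ['ATG', 'GCG', 'TAC', 'GTT']
```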

“When it comes to understanding the code of life, a major challenge is that the sequencing information in the genome is quite vast,” Ramanathan said. “The meaning of a nucleotide sequence can be affected by another sequence that’s much further away than the next sentence or paragraph would be in human text. It could reach over the equivalent of chapters in a book.”

NVIDIA collaborators on the project designed a hierarchical diffusion method that enabled the LLM to treat long strings of around 1,500 nucleotides as if they were sentences.

“Standard language models have trouble generating coherent long sequences and learning the underlying distribution of different variants,” said paper co-author Anima Anandkumar, senior director of AI research at NVIDIA and Bren professor in the computing + mathematical sciences department at Caltech. “We developed a diffusion model that operates at a higher level of detail that allows us to generate realistic variants and capture better statistics.”

Predicting COVID Variants of Concern

Using open-source data from the Bacterial and Viral Bioinformatics Resource Center, the team first pretrained its LLM on more than 110 million gene sequences from prokaryotes, which are single-celled organisms like bacteria. It then fine-tuned the model using 1.5 million high-quality genome sequences for the COVID virus.

By pretraining on a broader dataset, the researchers also ensured their model could generalize to other prediction tasks in future projects — making it one of the first whole-genome-scale models with this capability.

Once fine-tuned on COVID data, the LLM was able to distinguish between genome sequences of the virus’ variants. It was also able to generate its own nucleotide sequences, predicting potential mutations of the COVID genome that could help scientists anticipate future variants of concern.
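
To make the generation step concrete, here is a toy sketch of autoregressive sampling over the four-nucleotide alphabet. The trained model is stubbed out with a uniform distribution; everything here is illustrative and is not the team's code.

```python
# Toy sketch of autoregressive nucleotide generation. The trained LLM is
# stubbed out with a uniform distribution; in a real setting,
# next_token_probs would come from the model's softmax output.
import numpy as np

VOCAB = ["A", "T", "G", "C"]
rng = np.random.default_rng(0)

def next_token_probs(context: str) -> np.ndarray:
    # Placeholder standing in for the trained model's predicted distribution.
    return np.full(len(VOCAB), 1.0 / len(VOCAB))

def generate(prompt: str, n_tokens: int = 30) -> str:
    sequence = prompt
    for _ in range(n_tokens):
        probs = next_token_probs(sequence)
        sequence += rng.choice(VOCAB, p=probs)
    return sequence

print(generate("ATG"))  # random continuation under the uniform stub
```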

Trained on a year’s worth of SARS-CoV-2 genome data, the model can infer the distinction between various viral strains. Each dot on the left corresponds to a sequenced SARS-CoV-2 viral strain, color-coded by variant. The figure on the right zooms into one particular strain of the virus, which captures evolutionary couplings across the viral proteins specific to this strain. Image courtesy of Argonne National Laboratory’s Bharat Kale, Max Zvyagin and Michael E. Papka. 

“Most researchers have been tracking mutations in the spike protein of the COVID virus, specifically the domain that binds with human cells,” Ramanathan said. “But there are other proteins in the viral genome that go through frequent mutations and are important to understand.”

The model could also integrate with popular protein-structure-prediction models like AlphaFold and OpenFold, the paper stated, helping researchers simulate viral structure and study how genetic mutations impact a virus’ ability to infect its host. OpenFold is one of the pretrained language models included in the NVIDIA BioNeMo LLM service for developers applying LLMs to digital biology and chemistry applications.

Supercharging AI Training With GPU-Accelerated Supercomputers

The team developed its AI models on supercomputers powered by NVIDIA A100 Tensor Core GPUs — including Argonne’s Polaris, the U.S. Department of Energy’s Perlmutter, and NVIDIA’s in-house Selene system. By scaling up to these powerful systems, they achieved performance of more than 1,500 exaflops in training runs, creating the largest biological language models to date.

“We’re working with models today that have up to 25 billion parameters, and we expect this to significantly increase in the future,” said Ramanathan. “The model size, the genetic sequence lengths and the amount of training data needed means we really need the computational complexity provided by supercomputers with thousands of GPUs.”

The researchers estimate that training a version of their model with 2.5 billion parameters took over a month on around 4,000 GPUs. The team, which was already investigating LLMs for biology, spent about four months on the project before publicly releasing the paper and code. The GitHub page includes instructions for other researchers to run the model on Polaris and Perlmutter.
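
As a rough sanity check on that scale, the figures in the text imply a compute budget of nearly three million GPU-hours. The arithmetic below is an illustrative back-of-envelope only; the exact schedule and utilization are not in the article.

```python
# Back-of-envelope compute budget implied by the figures above:
# roughly 4,000 GPUs running for about a month.
gpus = 4_000
days = 30
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")  # 2,880,000 GPU-hours
```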

The NVIDIA BioNeMo framework, available in early access on the NVIDIA NGC hub for GPU-optimized software, supports researchers scaling large biomolecular language models across multiple GPUs. Part of the NVIDIA Clara Discovery collection of drug discovery tools, the framework will support chemistry, protein, DNA and RNA data formats.

Find NVIDIA at SC22.

Image at top represents COVID strains analyzed by the researchers’ LLM. Each dot is color-coded by COVID variant. Image courtesy of Argonne National Laboratory’s Bharat Kale, Max Zvyagin and Michael E. Papka.

Going the Distance: NVIDIA Platform Solves HPC Problems at the Edge

Collaboration among researchers, like the scientific community itself, spans the globe.

Universities and enterprises sharing work over long distances require a common language and secure pipeline to get every device — from microscopes and sensors to servers and campus networks — to see and understand the data each is transmitting. The increasing amount of data that needs to be stored, transmitted and analyzed only compounds the challenge.

To overcome this problem, NVIDIA has introduced a high performance computing platform that combines edge computing and AI to capture and consolidate streaming data from scientific edge instruments, and then allow the devices to talk to each other over long distances.

The platform consists of three major components. NVIDIA Holoscan is a software development kit that data scientists and domain experts can use to build GPU-accelerated pipelines for sensors that stream data. MetroX-3 is a new long-haul system that extends the connectivity of the NVIDIA Quantum-2 InfiniBand platform. And NVIDIA BlueField-3 DPUs provide secure and intelligent data migration.

Researchers can use the new NVIDIA platform for HPC edge computing to securely communicate and collaborate on solving problems, bringing their disparate devices and algorithms together to operate as one large supercomputer.

Holoscan for HPC at the Edge

Accelerated by GPU computing platforms — including NVIDIA IGX, HGX and DGX systems — NVIDIA Holoscan delivers the extreme performance required to process the massive streams of data generated by the world’s scientific instruments.

NVIDIA Holoscan for HPC includes new APIs for C++ and Python that HPC researchers can use to build sensor data processing workflows that are flexible enough for non-image formats and scalable enough to translate raw data into real-time insights.

Holoscan also manages memory allocation to ensure zero-copy data exchanges, so developers can focus on the workflow logic and not worry about managing file and memory I/O.
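
For a sense of the programming model, here is a sketch of a minimal source-to-sink pipeline written in the style of Holoscan's Python API. The operators and their logic are hypothetical, and class and method names reflect later public SDK releases, so treat this as an illustration of the pattern rather than the exact 0.4 API.

```python
# Sketch of a minimal source-to-sink pipeline in the style of the
# Holoscan Python API. Operator names and bodies are hypothetical.
from holoscan.conditions import CountCondition
from holoscan.core import Application, Operator, OperatorSpec

class SensorSourceOp(Operator):
    """Emits raw (non-image) sensor readings into the pipeline."""
    def setup(self, spec: OperatorSpec):
        spec.output("raw")

    def compute(self, op_input, op_output, context):
        op_output.emit([0.1, 0.2, 0.3], "raw")  # stand-in for real sensor data

class AnalyzeOp(Operator):
    """Turns raw readings into a simple real-time insight."""
    def setup(self, spec: OperatorSpec):
        spec.input("raw")

    def compute(self, op_input, op_output, context):
        samples = op_input.receive("raw")
        print("mean reading:", sum(samples) / len(samples))

class EdgeApp(Application):
    def compose(self):
        # Run the source a fixed number of times so the app terminates.
        src = SensorSourceOp(self, CountCondition(self, 5), name="source")
        sink = AnalyzeOp(self, name="analyze")
        self.add_flow(src, sink)  # Holoscan manages the data exchange

if __name__ == "__main__":
    EdgeApp().run()
```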

The new features in Holoscan will be available to all HPC developers next month. Sign up to be notified of early access to the Holoscan 0.4 SDK.

MetroX-3 Goes the Distance

The NVIDIA MetroX-3 long-haul system, available next month, extends the latest cloud-native capabilities of the NVIDIA Quantum-2 InfiniBand platform from the edge to the HPC data center core. It enables GPUs between sites to securely share data over the InfiniBand network up to 25 miles (40 km) away.

Taking advantage of native remote direct memory access, users can easily migrate data and compute jobs from one InfiniBand-connected mini-cluster to the main data center, or combine geographically dispersed compute clusters for higher overall performance and scalability.
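
One reason the 25-mile figure matters: propagation delay in fiber grows with distance, and RDMA performance is sensitive to round-trip latency. The back-of-envelope estimate below assumes light travels at roughly two-thirds of c in glass and ignores switching latency.

```python
# Rough one-way propagation delay over the 40 km MetroX-3 reach.
# Illustrative arithmetic only.
c_fiber_km_per_s = 200_000  # ~2/3 of the vacuum speed of light
distance_km = 40
delay_ms = distance_km / c_fiber_km_per_s * 1_000
print(f"~{delay_ms:.2f} ms one-way")  # ~0.20 ms
```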

Using NVIDIA Unified Fabric Manager to manage their MetroX-3 systems, data center operators can efficiently provision, monitor and operate across all of their InfiniBand-connected data center networks.

BlueField for Secure, Efficient HPC

NVIDIA BlueField data processing units offload, accelerate and isolate advanced networking, storage and security services to boost performance and efficiency for modern HPC.

During SC22, system software company Zettar is demonstrating its data migration and storage offload solution based on BlueField-3. Zettar software can consolidate data migration tasks to a data center footprint of 4U rack space, which today requires 13U with x86-based solutions.

Learn more about the new NVIDIA platform for HPC at the edge.

Supercomputing Superpowers: NVIDIA Brings Digital Twin Simulation to HPC Data Center Operators

The technologies powering the world’s 7 million data centers are changing rapidly. The latest advances have allowed IT organizations to reduce costs even while dealing with exponential data growth.

Simulation and digital twins can help data center designers, builders and operators create highly efficient and performant facilities. But building a digital twin that can accurately represent all components of an AI supercomputing facility is a massive, complex undertaking.

The NVIDIA Omniverse simulation platform helps address this challenge by streamlining the process for collaborative virtual design. An Omniverse demo at SC22 showcased how the people behind data centers can use this open development platform to enhance the design and development of complex supercomputing facilities.

Omniverse, for the first time, lets data center operators aggregate real-time data inputs from their core third-party computer-aided design, simulation and monitoring applications so they can see and work with their complete datasets in real time.

The demo showed how Omniverse allows users to tap into the power of accelerated computing, simulation and operational digital twins connected to real-time monitoring and AI. This enables teams to streamline facility design, accelerate construction and deployment, and optimize ongoing operations.

The demo also highlighted NVIDIA Air, a data center simulation platform designed to work in conjunction with Omniverse to simulate the network — the central nervous system of the data center. With NVIDIA Air, teams can model the entire network stack, allowing them to automate and validate network hardware and software prior to bring-up.

Creating Digital Twins to Elevate Design and Simulation

In planning and constructing one of NVIDIA’s latest AI supercomputers, multiple engineering CAD datasets were collected from third-party industry tools such as Autodesk Revit, PTC Creo and Trimble SketchUp. This allowed designers and engineers to view the Universal Scene Description-based model in full fidelity and collaboratively iterate on the design in real time.

PATCH MANAGER is an enterprise software application for planning cabling, assets and physical layer point-to-point connectivity in network domains. With PATCH MANAGER connected to Omniverse, the complex topology of port-to-port connections, rack and node layouts, and cabling can be integrated directly into the live model. This enables data center engineers to see the full view of the model and its dependencies.

To predict airflow and heat transfers, engineers used Cadence 6SigmaDCX, computational fluid dynamics software. Engineers can also use AI surrogates trained with NVIDIA Modulus for “what-if” analysis in near-real time, letting teams simulate changes in complex thermals and cooling and see the results instantly.

And with NVIDIA Air, the exact network topology — including protocols, monitoring and automation — can be simulated and prevalidated.

Once construction of a data center is complete, its sensors, control system and telemetry can be connected to the digital twin inside Omniverse, enabling real-time monitoring of operations.

With a perfectly synchronized digital twin, engineers can simulate common dangers such as power peaking or cooling system failures. Operators can benefit from AI-recommended changes that optimize for key priorities like boosting energy efficiency and reducing carbon footprint. The digital twin also allows them to test and validate software and component upgrades before deploying to the physical data center.

Catch up on the latest announcements by watching NVIDIA’s SC22 special address, and learn more about NVIDIA Omniverse.

NVIDIA and Dell Technologies Deliver AI and HPC Performance in Leaps and Bounds With Hopper, at SC22

Whether focused on tiny atoms or the immensity of outer space, supercomputing workloads benefit from the flexibility that the largest systems provide scientists and researchers.

To meet the needs of organizations with such large AI and high performance computing (HPC) workloads, Dell Technologies today unveiled the Dell PowerEdge XE9680 system — its first system with eight NVIDIA GPUs interconnected with NVIDIA NVLink — at SC22, an international supercomputing conference running through Friday.

The Dell PowerEdge XE9680 system is built on the NVIDIA HGX H100 architecture and packs eight NVIDIA H100 Tensor Core GPUs to serve the growing demand for large-scale AI and HPC workflows.

These include large language models for communications, chemistry and biology, as well as simulation and research in industries spanning aerospace, agriculture, climate, energy and manufacturing.

The XE9680 system is arriving alongside other new Dell servers announced today with NVIDIA Hopper architecture GPUs, including the Dell PowerEdge XE8640.

“Organizations working on advanced research and development need both speed and efficiency to accelerate discovery,” said Ian Buck, vice president of Hyperscale and High Performance Computing, NVIDIA. “Whether researchers are building more efficient rockets or investigating the behavior of molecules, Dell Technologies’ new PowerEdge systems provide the compute power and efficiency needed for massive AI and HPC workloads.”

“Dell Technologies and NVIDIA have been working together to serve customers for decades,” said Rajesh Pohani, vice president of portfolio and product management for PowerEdge, HPC and Core Compute at Dell Technologies. “As enterprise needs have grown, the forthcoming Dell PowerEdge servers with NVIDIA Hopper Tensor Core GPUs provide leaps in performance, scalability and security to accelerate the largest workloads.”

NVIDIA H100 to Turbocharge Dell Customer Data Centers

Fresh off setting world records in the MLPerf AI training benchmarks earlier this month, NVIDIA H100 is the world’s most advanced GPU. It’s packed with 80 billion transistors and features major advances to accelerate AI, HPC, memory bandwidth and interconnects at data center scale.

H100 is the engine of AI factories that organizations use to process and refine large datasets to produce intelligence and accelerate their AI-driven businesses. It features a dedicated Transformer Engine and fourth-generation NVIDIA NVLink interconnect to accelerate exascale workloads.

Each system built on the NVIDIA HGX H100 platform features four or eight Hopper GPUs to deliver the highest AI performance with 3.5x more energy efficiency compared with the prior generation, saving development costs while accelerating discoveries.

Powerful Performance and Customer Options for AI, HPC Workloads

Dell systems power the work of leading organizations, and the forthcoming Hopper-based systems will broaden Dell’s portfolio of solutions for its customers around the world.

With its enhanced, air-cooled design and support for eight NVIDIA H100 GPUs with built-in NVLink connectivity, the PowerEdge XE9680 is purpose-built for optimal performance to help modernize operations and infrastructure to drive AI initiatives.

The PowerEdge XE8640, Dell’s new HGX H100 system with four Hopper GPUs, enables businesses to develop, train and deploy AI and machine learning models. A 4U rack system, the XE8640 delivers faster AI training performance and increased core capabilities with up to four PCIe Gen5 slots, NVIDIA Multi-Instance GPU (MIG) technology and NVIDIA GPUDirect Storage support.

Availability

The Dell PowerEdge XE9680 and XE8640 will be available from Dell starting in the first half of 2023.

Customers can now try NVIDIA H100 GPUs on Dell PowerEdge servers on NVIDIA LaunchPad, which provides free hands-on experiences and gives companies access to the latest hardware and NVIDIA AI software.

To take a first look at Dell’s new servers with NVIDIA H100 GPUs at SC22, visit Dell in booth 2443.

Amazon SageMaker Studio Lab continues to democratize ML with more scale and functionality

To make machine learning (ML) more accessible, Amazon launched Amazon SageMaker Studio Lab at AWS re:Invent 2021. Today, tens of thousands of customers use it every day to learn and experiment with ML for free. We made it simple to get started with just an email address, without the need for installs, setups, credit cards, or an AWS account.

SageMaker Studio Lab resonates with customers who want to learn in either an informal or formal setting, as indicated by a recent survey suggesting that 49% of our current customers are learning on their own, while 21% are taking a formal ML class. Higher learning institutions have started to adopt it, because it helps them teach ML fundamentals beyond the notebook, like environment and resource management, which are critical areas for successful ML projects. Enterprise partners like Hugging Face, Snowflake, and Roboflow are using SageMaker Studio Lab to showcase their own ML capabilities.

In this post, we discuss new features in SageMaker Studio Lab, and share some customer success stories.

New features in SageMaker Studio Lab

We have continued to develop new features and mechanisms to delight, protect, and enable our ML community. Here are the latest enhancements:

  • To safeguard CPU and GPU capacity from potential usage abuse, we launched 2-step verification, increasing the size of the community we can serve. Going forward, every customer will be required to link their account to a mobile phone number.
  • In October 2022, we rolled out automated account approvals, enabling you to get a SageMaker Studio Lab account in less than a day.
  • We tripled capacity for GPU and CPU, enabling most of our customers to get an instance when they need it.
  • A safe mode was introduced to help you move forward if your environment becomes unstable. Although this is rare, it typically happens when customers exceed their storage limits.
  • We’ve added support for the Jupyter-LSP (Language Server Protocol) extension, providing you with code completion functionality. Note that if you got your account before November 2022, you can get this functionality by following a few simple instructions (see the FAQ for details).

Customer success stories

We continue to be customer obsessed, offering important features to customers based on their feedback. Here are some highlights from key institutions and partners:

“SageMaker Studio Lab solves a real problem in the classroom in that it provides an industrial-strength hosted Jupyter solution with GPU that goes beyond just a hosted notebook alone. The ability to add packages, configure an environment, and open a terminal has opened up many new learning opportunities for students. Finally, fine-tuning Hugging Face models with powerful GPUs has been an amazing emerging workflow to present to students. LLMs (large language models) are the future of AI, and SageMaker Studio Lab has enabled me to teach the future of AI.”

—Noah Gift, Executive in Residence at Duke MIDS (Data Science)

“SageMaker Studio Lab has been used by my team since it was in beta because of its powerful experience for ML developers. It effortlessly integrates with Snowpark, Snowflake’s developer framework, to provide an easy-to-get-started notebook interface for Snowflake Python developers. I’ve used it for multiple demos with customers and partners, and the response has been overwhelmingly favorable.”

—Eda Johnson, Partner Industry Solutions Manager at Snowflake

“Roboflow empowers developers to build their own computer vision applications, no matter their skillset or experience. With SageMaker Studio Lab, our large community of computer vision developers can access our models and data in an environment that closely resembles a local JupyterLab, which is what they are most accustomed to. The persistent storage of SageMaker Studio Lab is a game changer, because you don’t need to start from the beginning for each user session. SageMaker Studio Lab has personally become my go-to notebook platform of choice.”

—Mark McQuade, Field Engineering at Roboflow

“RPI owns one of the most powerful supercomputers in the world, but it (AiMOS) has a steep learning curve. We needed a way for our students to get started effectively, and frugally. SageMaker Studio Lab’s intuitive interface enabled our students to get started quickly, and provided powerful GPUs, enabling them to work with complex deep learning models for their capstone projects.”

—Mohammed J. Zaki, Professor of Computer Science at Rensselaer Polytechnic Institute

“I use SageMaker Studio Lab in basic machine learning and Python-related courses that are designed to give students a solid foundation in many cloud technologies. Studio Lab enables our students to get hands-on experience with real-world data science projects, without them having to get bogged down in setups or configurations. Unlike other vendors, it is a Linux machine for students, and students can do many more coding exercises indeed!”

—Cyrus Wong, Senior Lecturer, Higher Diploma in Cloud and Data Centre Administration at the Department of Information Technology, IVE (LWL)

“Students in Northwestern Engineering’s Master of Science in Artificial Intelligence (MSAI) program were given a quick tour of SageMaker Studio Lab before using it in a 5-hour hackathon to apply what they learned to a real-world situation. We expected the students to naturally hit some obstacles during the very short time period. Instead, the students exceeded our expectations by not only completing all the projects but also giving very good presentations in which they showcased fascinating solutions to important real-world problems.”

—Mohammed Alam, Deputy Director of the MSAI program at Northwestern University

Get started with SageMaker Studio Lab

SageMaker Studio Lab is a great entry point for anyone interested in learning more about ML and data science. Amazon continues to invest in this free service, as well as other training assets and scholarship programs, to make ML accessible to all.

Get started with SageMaker Studio Lab today!


About the author

Michele Monclova is a principal product manager at AWS on the SageMaker team. She is a native New Yorker and Silicon Valley veteran. She is passionate about innovations that improve our quality of life.

How Prodege saved $1.5 million in annual human review costs using low-code computer vision AI

This post was co-authored by Arun Gupta, the Director of Business Intelligence at Prodege, LLC.

Prodege is a data-driven marketing and consumer insights platform comprising consumer brands—Swagbucks, MyPoints, Tada, ySense, InboxDollars, InboxPounds, DailyRewards, PollFish, and Upromise—along with a complementary suite of business solutions for marketers and researchers. Prodege has 120 million users and has paid $2.1 billion in rewards since 2005. In 2021, Prodege launched Magic Receipts, a new way for its users to earn cash back and redeem gift cards, just by shopping in-store at their favorite retailers and uploading a receipt.

Remaining on the cutting edge of customer satisfaction requires constant focus and innovation.

Building a data science team from scratch is a great investment, but takes time, and often there are opportunities to create immediate business impact with AWS AI services. According to Gartner, by the end of 2024, 75% of enterprises will shift from piloting to operationalizing AI. With the reach of AI and machine learning (ML) growing, teams need to focus on how to create a low-cost, high-impact solution that can be easily adopted by an organization.

In this post, we share how Prodege improved their customer experience by infusing AI and ML into its business. Prodege wanted to find a way to reward its customers faster after uploading their receipts. They didn’t have an automated way to visually inspect the receipts for anomalies before issuing rebates. Because the volume of receipts was in the tens of thousands per week, the manual process of identifying anomalies wasn’t scalable.

Using Amazon Rekognition Custom Labels, Prodege rewarded their customers 5 times faster after uploading receipts, increased the correct classification of anomalous receipts from 70% to 99%, and saved $1.5 million in annual human review costs.

The challenge: Detecting anomalies in receipts quickly and accurately at scale

Prodege’s commitment to top-tier customer experience required an increase in the speed at which customers receive rewards for its massively popular Magic Receipts product. To do that, Prodege needed to detect receipt anomalies faster. Prodege investigated building their own deep learning models using Keras. This solution was promising in the long term, but couldn’t be implemented at Prodege’s desired speed for the following reasons:

  • Required a large dataset – Prodege realized they would need tens of thousands of images to train the model, along with heavy GPU compute power.
  • Time consuming and costly – Prodege had hundreds of human-labeled valid and anomalous receipts, and the anomalies were all visual. Adding additional labeled images created operational expenses and could only function during normal business hours.
  • Required custom code and high maintenance – Prodege would have to develop custom code to train and deploy the custom model and maintain its lifecycle.

Overview of solution: Rekognition Custom Labels

Prodege worked with the AWS account team to first identify the business use case of being able to efficiently process receipts in an automated way so that their business was only issuing rebates to valid receipts. The Prodege data science team wanted a solution that required a small dataset to get started, could create immediate business impact, and required minimal code and low maintenance.

Based on these inputs, the account team identified Rekognition Custom Labels as a potential solution to train a model to identify which receipts are valid and which ones have anomalies. Rekognition Custom Labels provides a computer vision AI capability with a visual interface to automatically train and deploy models with as few as a couple of hundred images of uploaded labeled data.

The first step was to train a model using the labeled receipts from Prodege. The receipts were categorized into two labels: valid and anomalous. Approximately a hundred receipts of each kind were carefully selected by the Prodege business team, who had knowledge of the anomalies. The key to a good model in Rekognition Custom Labels is having accurate training data. The next step was to set up training of the model with a few clicks on the Rekognition Custom Labels console. The F1 score, which is used to gauge the accuracy and quality of the model, came in at 97%. This encouraged Prodege to do some additional testing in their sandbox and use the trained model to infer if new receipts were valid or had anomalies. Setting up inference with Rekognition Custom Labels is an easy one-click process, and it provides sample code to set up programmatic inference as well.
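
For reference, the F1 score mentioned above is the harmonic mean of precision and recall, so a 97% score requires both to be high at once:

```latex
F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
```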

Encouraged by the accuracy of the model, Prodege set up a pilot batch inference pipeline. The pipeline would start the model, run hundreds of receipts against the model, store the results, and then shut down the model every week. The compliance team would then evaluate the receipts to check for accuracy. The accuracy remained as high for the pilot as it was during the initial testing. The Prodege team also set up a pipeline to train new receipts in order to maintain and improve the accuracy of the model.
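
A sketch of what such a weekly batch run could look like with the boto3 Rekognition Custom Labels API is below. The ARN, bucket and object keys are hypothetical placeholders, and production code would also poll describe_project_versions until the model reaches the RUNNING state and add error handling.

```python
# Sketch of the weekly batch pipeline described above, using the boto3
# Rekognition Custom Labels API. Placeholders are hypothetical.
import boto3

rekognition = boto3.client("rekognition")
MODEL_ARN = "arn:aws:rekognition:us-east-1:111122223333:project/receipts/version/v1/1"  # placeholder
BUCKET = "example-receipt-uploads"  # placeholder

def classify_receipts(receipt_keys, min_confidence=80.0):
    # Start the model for the batch run (the model bills while running).
    rekognition.start_project_version(
        ProjectVersionArn=MODEL_ARN, MinInferenceUnits=1)
    results = {}
    try:
        for key in receipt_keys:
            resp = rekognition.detect_custom_labels(
                ProjectVersionArn=MODEL_ARN,
                Image={"S3Object": {"Bucket": BUCKET, "Name": key}},
                MinConfidence=min_confidence,
            )
            labels = resp["CustomLabels"]  # e.g. [{"Name": "valid", "Confidence": 98.7}]
            results[key] = labels[0]["Name"] if labels else "needs_human_review"
    finally:
        # Shut the model down after the batch, as in the pilot pipeline.
        rekognition.stop_project_version(ProjectVersionArn=MODEL_ARN)
    return results
```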

Finally, the Prodege business intelligence team worked with the application team, with support from the AWS account and product teams, to set up an inference endpoint that would work with their application to predict the validity of uploaded receipts in real time and provide its users a best-in-class consumer rewards experience. The solution is highlighted in the following figure. Based on the prediction and confidence score from Rekognition Custom Labels, the Prodege business intelligence team applied business logic to either process a receipt automatically or route it for additional scrutiny. By introducing a human in the loop, Prodege is able to monitor the quality of the predictions and retrain the model as needed.

Figure: Prodege anomaly detection architecture.

Results

With Rekognition Custom Labels, Prodege increased the correct classification of anomalous receipts from 70% to 99% and saved $1.5 million in annual human review costs. This allowed Prodege to reward its customers 5 times faster after uploading their receipts. The best part of Rekognition Custom Labels was that it was easy to set up and required only a small set of pre-classified images to train the ML model for high confidence image detection (approximately 200 images vs. 50,000 required to train a model from scratch). The model’s endpoints could be easily accessed using the API. Rekognition Custom Labels has been an extremely effective solution for Prodege to enable the smooth functioning of their validated receipt scanning product, and helped Prodege save a lot of time and resources performing manual detection.

Conclusion

Remaining on the cutting edge of customer satisfaction requires constant focus and innovation, and is a strategic goal for businesses today. AWS computer vision services allowed Prodege to create immediate business impact with a low-cost and low-code solution. In partnership with AWS, Prodege continues to innovate and remain on the cutting edge of customer satisfaction. You can get started today with Rekognition Custom Labels and improve your business outcomes.


About the Authors

Arun Gupta is the Director of Business Intelligence at Prodege LLC. He is passionate about applying Machine Learning technologies to provide effective solutions across diverse business problems.

Prashanth Ganapathy is a Senior Solutions Architect in the Small Medium Business (SMB) segment at AWS. He enjoys learning about AWS AI/ML services and helping customers meet their business outcomes by building solutions for them. Outside of work, Prashanth enjoys photography, travel, and trying out different cuisines.

Amit Gupta is an AI Services Solutions Architect at AWS. He is passionate about enabling customers with well-architected machine learning solutions at scale.

Nick Ramos is a Senior Account Manager with AWS. He is passionate about helping customers solve their most complex business challenges, infusing AI/ML into customers’ businesses, and helping customers grow top-line revenue.

Subspace Recovery from Heterogeneous Data with Non-isotropic Noise

*= Equal Contributions
Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. Our data come from $n$ users with user $i$ contributing data samples from a $d$-dimensional distribution with mean $\mu_i$. Our goal is to recover the linear subspace shared by $\mu_1, \ldots, \mu_n$ using the data points from all users, where every data point from user $i$ is formed by adding an independent…
Apple Machine Learning Research
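
For context, below is a minimal numpy sketch of the textbook baseline this setup builds on: estimate each user's mean and recover the shared subspace from the top singular directions of the stacked means. The toy data are hypothetical, and this is vanilla PCA, not the paper's noise-robust estimator.

```python
# Minimal numpy sketch of vanilla PCA subspace recovery from user means.
import numpy as np

def recover_subspace(user_samples, r):
    """user_samples: list of (m_i, d) arrays; returns a (d, r) orthonormal basis."""
    means = np.stack([x.mean(axis=0) for x in user_samples])  # (n, d)
    _, _, vt = np.linalg.svd(means, full_matrices=False)
    return vt[:r].T  # top-r principal directions

rng = np.random.default_rng(0)
basis = np.linalg.qr(rng.standard_normal((5, 2)))[0]  # true 2-dim subspace in R^5
users = []
for _ in range(8):
    mu = basis @ rng.standard_normal(2)                     # mean lies in the subspace
    users.append(mu + 0.1 * rng.standard_normal((50, 5)))   # noisy samples around it

U = recover_subspace(users, r=2)
print(U.shape)  # (5, 2)
```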