nVidia AI – Page 3

Accelerate DeepSeek Reasoning Models With NVIDIA GeForce RTX 50 Series AI PCs

January 31, 2025

by Annamalai Chockalingam nVidia AI

The recently released DeepSeek-R1 model family has brought a new wave of excitement to the AI community, allowing enthusiasts and developers to run state-of-the-art reasoning models with problem-solving, math and code capabilities, all from the privacy of local PCs.

With up to 3,352 trillion operations per second of AI horsepower, NVIDIA GeForce RTX 50 Series GPUs can run the DeepSeek family of distilled models faster than anything on the PC market.

A New Class of Models That Reason

Reasoning models are a new class of large language models (LLMs) that spend more time on “thinking” and “reflecting” to work through complex problems, while describing the steps required to solve a task.

The fundamental principle is that any problem can be solved with deep thought, reasoning and time, just like how humans tackle problems. By spending more time — and thus compute — on a problem, the LLM can yield better results. This phenomenon is known as test-time scaling, where a model dynamically allocates compute resources during inference to reason through problems.

Reasoning models can enhance user experiences on PCs by deeply understanding a user’s needs, taking actions on their behalf and allowing them to provide feedback on the model’s thought process — unlocking agentic workflows for solving complex, multi-step tasks such as analyzing market research, performing complicated math problems, debugging code and more.

The DeepSeek Difference

The DeepSeek-R1 family of distilled models is based on a large 671-billion-parameter mixture-of-experts (MoE) model. MoE models consist of multiple smaller expert models for solving complex problems. DeepSeek models further divide the work and assign subtasks to smaller sets of experts.

DeepSeek employed a technique called distillation to build a family of six smaller student models — ranging from 1.5-70 billion parameters — from the large DeepSeek 671-billion-parameter model. The reasoning capabilities of the larger DeepSeek-R1 671-billion-parameter model were taught to the smaller Llama and Qwen student models, resulting in powerful, smaller reasoning models that run locally on RTX AI PCs with fast performance.

Peak Performance on RTX

Inference speed is critical for this new class of reasoning models. GeForce RTX 50 Series GPUs, built with dedicated fifth-generation Tensor Cores, are based on the same NVIDIA Blackwell GPU architecture that fuels world-leading AI innovation in the data center. RTX fully accelerates DeepSeek, offering maximum inference performance on PCs.

Throughput performance of the Deepseek-R1 distilled family of models across GPUs on the PC.

Experience DeepSeek on RTX in Popular Tools

NVIDIA’s RTX AI platform offers the broadest selection of AI tools, software development kits and models, opening access to the capabilities of DeepSeek-R1 on over 100 million NVIDIA RTX AI PCs worldwide, including those powered by GeForce RTX 50 Series GPUs.

High-performance RTX GPUs make AI capabilities always available — even without an internet connection — and offer low latency and increased privacy because users don’t have to upload sensitive materials or expose their queries to an online service.

Experience the power of DeepSeek-R1 and RTX AI PCs through a vast ecosystem of software, including Llama.cpp, Ollama, LM Studio, AnythingLLM, Jan.AI, GPT4All and OpenWebUI, for inference. Plus, use Unsloth to fine-tune the models with custom data.

DeepSeek-R1 Now Live With NVIDIA NIM

January 30, 2025

by Erik Pounds nVidia AI

DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, conducting chain-of-thought, consensus and search methods to generate the best answer.

Performing this sequence of inference passes — using reason to arrive at the best answer — is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.

As models are allowed to iteratively “think” through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments.

R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.

To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.

Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.

The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.

DeepSeek-R1 — a Perfect Example of Test-Time Scaling

DeepSeek-R1 is a large mixture-of-experts (MoE) model. It incorporates an impressive 671 billion parameters — 10x more than many other popular open-source LLMs — supporting a large input context length of 128,000 tokens. The model also uses an extreme number of experts per layer. Each layer of R1 has 256 experts, with each token routed to eight separate experts in parallel for evaluation.

Delivering real-time answers for R1 requires many GPUs with high compute performance, connected with high-bandwidth and low-latency communication to route prompt tokens to all the experts for inference. Combined with the software optimizations available in the NVIDIA NIM microservice, a single server with eight H200 GPUs connected using NVLink and NVLink Switch can run the full, 671-billion-parameter DeepSeek-R1 model at up to 3,872 tokens per second. This throughput is made possible by using the NVIDIA Hopper architecture’s FP8 Transformer Engine at every layer — and the 900 GB/s of NVLink bandwidth for MoE expert communication.

Getting every floating point operation per second (FLOPS) of performance out of a GPU is critical for real-time inference. The next-generation NVIDIA Blackwell architecture will give test-time scaling on reasoning models like DeepSeek-R1 a giant boost with fifth-generation Tensor Cores that can deliver up to 20 petaflops of peak FP4 compute performance and a 72-GPU NVLink domain specifically optimized for inference.

Get Started Now With the DeepSeek-R1 NIM Microservice

Developers can experience the DeepSeek-R1 NIM microservice, now available on build.nvidia.com. Watch how it works:

With NVIDIA NIM, enterprises can deploy DeepSeek-R1 with ease and ensure they get the high efficiency needed for agentic AI systems.

See notice regarding software product information.

Lights, Camera, Action: New NVIDIA Broadcast AI Features Now Streaming With GeForce RTX 50 Series GPUs

January 30, 2025

by Gerardo Delgado nVidia AI

New GeForce RTX 5090 and RTX 5080 GPUs — built on the NVIDIA Blackwell architecture — are now available to power generative AI content creation and accelerate creative performance.

GeForce RTX 5090 and RTX 5080 GPUs feature fifth-generation Tensor Cores with support for FP4, reducing the VRAM requirements to run generative AI models while doubling performance. For example, Black Forest Labs’ FLUX models — available on Hugging Face this week — at FP4 precision require less than 10GB of VRAM, compared with over 23GB at FP16. With a GeForce RTX 5090 GPU, the FLUX.1 [dev] model can generate images in just over five seconds, compared with 15 seconds on FP16 or 10 seconds on FP8 on a GeForce RTX 4090 GPU.

GeForce RTX 50 Series GPUs also come equipped with ninth-generation encoders and sixth-generation decoders that add support for 4:2:2 and increase encoding quality for HEVC and AV1. Fourth-generation RT Cores paired with DLSS 4 provide creators with super-smooth 3D rendering viewports.

“The GeForce RTX 5090 is a content creation powerhouse.” — PC World

The GeForce RTX 5090 GPU includes 32GB of ultra-fast GDDR7 memory and 1,792 GB/sec of total memory bandwidth — a 77% bandwidth increase over the GeForce RTX 4090 GPU. It also includes three encoders and two decoders, reducing export times by a third compared with the prior generation.

The GeForce RTX 5080 GPU features 16GB of GDDR7 memory, providing up to 960 GB/sec of total memory bandwidth — a 34% increase over the GeForce RTX 4080 GPU. And it includes two encoders and two decoders to boost video editing workloads.

“The NVIDIA GeForce RTX 5080 FE is notable on its own as a viable powerhouse option for any creative pro…” — Creative Bloq

The latest version of the NVIDIA Broadcast app is now available, adding two new beta AI effects — Studio Voice and Virtual Key Light — and improvements to existing ones, along with an updated user interface for better usability.

In addition, the January NVIDIA Studio Driver with support for the GeForce RTX 5090 and 5080 GPUs is ready for installation today. For automatic Studio Driver notifications, download the NVIDIA app, including an update for RTX Video Super Resolution — expanding the lineup of GeForce RTX GPUs that can run RTX Video Super Resolution for higher-quality video.

Use the GeForce RTX graphics card product finder to pick up GeForce RTX 5090 and RTX 5080 GPUs or a prebuilt system today.

Lights, Camera, Broadcast

The latest NVIDIA Broadcast app release features two new AI effects — Studio Voice and Virtual Key Light — both currently in beta.

Studio Voice enhances a user’s microphone to match that of a high-quality microphone. Virtual Key Light relights subjects to deliver even lighting, as if a physical key light was defining the form and dimension of an individual. The new effects require a GeForce RTX 4080 or 5080 GPU or higher, and are designed for chatting streams and podcasts — these are not recommended for gaming.

The app update also improves voice quality with the Background Noise Removal feature, adds gaze stability and subtle random eye movements for a more natural appearance with Eye Contact, and improves foreground and background separation with Virtual Background.

The updated NVIDIA Broadcast app interface.

There’s also an updated user interface that allows users to apply more effects simultaneously and includes a side-by-side camera preview option, a GPU utilization meter and more.

Developers can integrate these effects directly into applications with NVIDIA Maxine Windows software development kits (SDKs) or by accessing them as an NVIDIA NIM microservice.

The updated NVIDIA Broadcast app is available for download today.

Accelerating Creative Workflows

For video editors, all GeForce RTX 50 Series GPUs include 4:2:2 hardware support and can decode a single video source at up to 8K at 75 frames per second (fps) or nine video sources at 4K at 30 fps per decoder, enabling smooth multi-camera video editing.

“The GeForce RTX 5090 is currently unmatched in the consumer GPU market — nothing can touch it in terms of performance, with virtually any workload — AI, content creation, gaming, you name it.” — Hot Hardware

The GeForce RTX 5090 is equipped with three encoders and two decoders. These multi-encoder and -decoder setups enable the GeForce RTX 5090 GPU to export video 40% faster than the GeForce RTX 4090 GPU and at 4x speed compared with the GeForce RTX 3090 GPU.

GeForce RTX 50 Series GPUs also feature the ninth-generation NVIDIA Encoder (NVENC) with a 5% improvement in video quality on HEVC and AV1 encoding. The new AV1 Ultra Quality mode achieves 5% more compression at the same quality versus the previous generation, and the sixth-generation NVIDIA decoder achieves 2x decode speeds for H.264 over the prior version. The AV1 Ultra Quality mode will also be available to GeForce RTX 40 Series users.

Video editing applications Blackmagic Design’s DaVinci Resolve and Wondershare Filmora have integrated these technologies.

Livestreamers also benefit from the ninth-generation NVENC with a 5% video quality improvement for HEVC and AV1 — meaning that video quality looks like it used 5% more bitrate — in Twitch with the Twitch Enhanced Broadcasting beta, YouTube or Discord. This improvement is measured using BD-BR PSNR, the standard for measuring video quality by comparing what bitrate matches the same video quality between two encoders.

3D artists benefit from the 32GB of memory in GeForce RTX 5090 GPUs, allowing them to work on massive 3D projects and across multiple platforms simultaneously with smooth viewport movement. GeForce RTX 50 Series GPUs with fourth-generation RT Cores run 3D applications 40% faster.

DLSS 4 is now available in D5 Render and is coming in February to Chaos Vantage, two popular professional-grade 3D apps for architects, animators and designers. D5 Render will support DLSS 4’s new Multi Frame Generation feature to boost frame rates by using AI to generate up to three frames per rendered frame. This enables animators to smoothly navigate a scene with 4x as many frames, or render 3D content at 60 fps or more.

Developers can learn more about integrating these new tools into their apps via SDKs.

Stay tuned for more updates on the GeForce RTX 50 Series, app performance and compatibility, and emerging AI technologies.

Every month brings new creative app updates and optimizations powered by the NVIDIA Studio. Follow NVIDIA Studio on Instagram, X and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.

See notice regarding software product information.

GeForce NOW Celebrates Five Years of Cloud Gaming With AAA Blockbusters

January 30, 2025

by GeForce NOW Community nVidia AI

GeForce NOW turns five this February. Five incredible years of high-performance gaming have been made possible thanks to the members who’ve joined the cloud gaming platform on its remarkable journey.

Since exiting beta in 2020, GeForce NOW has changed how gamers access and enjoy their favorite titles. The cloud has come a long way, introducing groundbreaking new features and supporting over 2,000 games from celebrated publishers for members to play.

Five years of cloud gaming excellence deserves a celebration. As part of an epic February lineup of 17 games coming this month, every week, GeForce NOW will deliver a major game release in the cloud. This includes the highly anticipated Kingdom Come: Deliverance II from Warhorse Studios, Avowed from Obsidian Entertainment and Sid Meier’s Civilization VII from 2K Games. Make sure to stay tuned to GFN Thursdays to see what else is in store.

This GFN Thursday, check out the nine titles available to stream this week, including standout title Pax Dei, a medieval massively multiplayer online (MMO) game from Mainframe Industries. Whether seeking mythical exploration or heart-pounding sci-fi combat thrills, GeForce NOW provides unforgettable experiences for every kind of gamer.

Magical New Games

Pax Dei on GeForce NOW — *Build a kingdom one medieval dream at a time.*

Pax Dei is a vast, social sandbox MMO where myths are real, ghosts wander and magic shapes a breathtaking medieval world. Choose a path and forge a legacy as a master builder, fearless explorer, skilled warrior or dedicated craftsman. Build thriving villages in the Heartlands, craft resources alongside Clans and venture into the dangerous Wilderness to battle dark forces, uncover ancient secrets and vie for power. The further one goes, the greater the challenges and the rewards. In Pax Dei, every action shapes the story in a dynamic, living world. The Steam version arrives this week in the cloud, with the Epic Games Store version coming soon.

Look for the following games available to stream in the cloud this week:

Space Engineers 2 (New release on Steam, Jan. 27)
Eternal Strands (New release on Steam, Jan. 28)
Orcs Must Die! Deathtrap (New release on Steam, Jan. 28)
Sniper Elite: Resistance (New release on Steam and Xbox, available on PC Game Pass, Jan. 30)
Heart of the Machine (New release on Steam, Jan. 31)
Citizen Sleeper 2: Starward Vector (New release on Steam, Jan. 31)
Dead Island 2 (Xbox, available on PC Game Pass)
Pax Dei (Steam)
Sifu (Steam)

Here’s what to expect for the rest of February:

Kingdom Come Deliverance II (New release on Steam, Feb. 4)
Ambulance Life: A Paramedic Simulator (New Release on Steam, Feb. 6)
SWORN (New release on Steam, Feb. 6)
Sid Meier’s Civilization VII (New release on Steam and Epic Games Store, Feb. 11)
Legacy: Steel & Sorcery (New release on Steam, Feb. 12)
Tomb Raider IV-VI Remastered (New release on Steam, Feb. 14)
Avowed (New release on Steam, Battle.net and Xbox, available on PC Game Pass, Feb. 18)
Lost Records: Bloom & Rage (New release on Steam, Feb. 18)
Abiotic Factor (Steam)
Alan Wake (Xbox, available on the Microsoft Store)
Ashes of the Singularity: Escalation (Xbox, available on the Microsoft Store)
The Dark Crystal: Age of Resistance Tactics (Xbox, available on the Microsoft Store)
HUMANITY (Steam)
Murky Divers (Steam)
Somerville (Xbox, available on the Microsoft Store)
Songs of Silence (Steam)
UNDER NIGHT IN-BIRTH II Sys:Celes (Steam)

Joyful January

In addition to the 14 games announced last month, 15 more joined the GeForce NOW library:

Road 96 (New release on Xbox, available on PC Game Pass, Jan. 7)
Aloft (New release on Steam, Jan. 15)
Assetto Corsa EVO (New release on Steam, Jan. 16)
Among Us (Xbox, available on PC Game Pass)
Amnesia: Collection (Xbox, available on the Microsoft Store)
DREDGE (Epic Games Store)
Generation Zero (Xbox, available on PC Game Pass)
HOT WHEELS UNLEASHED 2 – Turbocharged (Xbox, available on PC Game Pass)
Kingdom Come: Deliverance (Xbox, available on the Microsoft Store)
Lawn Mowing Simulator (Xbox, available on the Microsoft Store)
Marvel Rivals (Steam)
Sins of a Solar Empire: Rebellion (Xbox, available on the Microsoft Store)
SMITE 2 (Steam)
STORY OF SEASONS: Friends of Mineral Town (Xbox, available on the Microsoft Store)
Townscaper (Xbox, available on the Microsoft Store)

What are you planning to play this weekend? Let us know on X or in the comments below.

morning!

predict your most played game of 2025 below

— NVIDIA GeForce NOW (@NVIDIAGFN) January 27, 2025

Leveling Up User Experiences With Agentic AI, From Bots to Autonomous Agents

January 29, 2025

by Noah Kravitz nVidia AI

AI agents with advanced perception and cognition capabilities are making digital experiences more dynamic and personalized across retail, finance, entertainment and other industries.

In this episode of the NVIDIA AI Podcast, Chris Covert, director of product experiences at Inworld AI, highlights how intelligent digital humans and characters are reshaping interactive experiences, from gaming to healthcare.

With expertise on the intersection of autonomous systems and human-centered design, Covert explains the different stages of AI agents — from basic conversational interfaces to fully autonomous systems. He emphasizes that the key to developing meaningful AI experiences is focusing on user value rather than technology alone.

The AI Podcast · AI Agents Take Digital Experiences to the Next Level in Gaming and Beyond, Featuring Chris Covert from Inworld AI – Episode 243

In addition, Covert discusses how livestreaming and recording software company Streamlabs announced a collaboration with Inworld and NVIDIA at this year’s CES trade show, unveiling an AI-powered streaming assistant that can provide real-time commentary, clip gameplay moments and interact dynamically with streamers thanks to NVIDIA ACE integrations.

Learn more about the latest advancements in agentic AI and other technologies by registering for NVIDIA GTC, the conference for the era of AI, taking place March 17-21 at the San Jose Convention Center.

Time Stamps

5:34 — The definition of digital humans and their current state in industries.

10:30 — The evolution of AI agents.

18:10 — The design philosophy behind building digital humans and why teams should start with a “moonshot” approach.

You Might Also Like…

How World Foundation Models Will Advance Physical AI

World foundation models are powerful neural networks that can simulate and predict outcomes in physical environments, enabling teams to enhance AI workflows and development. Ming-Yu Liu, vice president of research at NVIDIA and an IEEE Fellow, joined the NVIDIA AI Podcast to discuss how world foundation models will impact various industries.

How Roblox Uses Generative AI to Enhance User Experiences

Roblox is a colorful online platform that aims to reimagine the way that people come together. Now, generative AI is augmenting that vision. Anupam Singh, vice president of AI and growth engineering at Roblox, explains how the company uses the technology to enhance virtual experiences, power coding assistants to help creators, and increase inclusivity and user safety.

Exploring AI-Powered Filmmaking With Cuebric’s Pinar Seyhan Demirdag

Cuebric is on a mission to offer new solutions in filmmaking and content creation through immersive, two-and-a-half-dimensional cinematic environments. The company’s AI-powered application aims to help creators quickly bring their ideas to life, making high-quality production more accessible. Pinar Seyhan Demirdag, cofounder and CEO of Cuebric, talks about the current landscape of content creation and the role of AI in simplifying the creative process.

Amphitrite Rides AI Wave to Boost Maritime Shipping, Ocean Cleanup With Real-Time Weather Prediction and Simulation

January 27, 2025

by Bhoomi Gadhia nVidia AI

Named after Greek mythology’s goddess of the sea, France-based startup Amphitrite is fusing satellite data and AI to simulate and predict oceanic currents and weather.

It’s work that’s making waves in maritime-shipping and oceanic litter-collection operations.

Amphitrite’s AI models — powered by the NVIDIA AI and Earth-2 platforms — provide insights on positioning vessels to best harness the power of ocean currents, helping ships know when best to travel, as well as the optimal course. This helps users reduce travel times, fuel consumption and, ultimately, carbon emissions.

“We’re at a turning point on the modernization of oceanic atmospheric forecasting,” said Alexandre Stegner, cofounder and CEO of Amphitrite. “There’s a wide portfolio of applications that can use these domain-specific oceanographic AI models — first and foremost, we’re using them to help foster the energy transition and alleviate environmental issues.”

Optimizing Routes Based on Currents and Weather

Founded by expert oceanographers, Amphitrite — a member of the NVIDIA Inception program for cutting-edge startups — distinguishes itself from other weather modeling companies with its domain-specific expertise.

Amphitrite’s fine-tuned, three-kilometer-scale AI models focus on analyzing one parameter at a time, making them more accurate than global numerical modeling methods for the variable of interest. Read more in this paper showcasing the AI method, dubbed ORCAst, trained on NVIDIA GPUs.

Depending on the user’s needs, such variables include the current of the ocean within the first 10 meters of the surface — critical in helping ships optimize their travel and minimize fuel consumption — as well as the impacts of extreme waves and wind.

“It’s only with NVIDIA accelerated computing that we can achieve optimal performance and parallelization when analyzing data on the whole ocean,” said Evangelos Moschos, cofounder and chief technology officer of Amphitrite.

Using the latest NVIDIA AI technologies to predict ocean currents and weather in detail, ships can ride or avoid waves, optimize routes and enhance safety while saving energy and fuel.

“The amount of public satellite data that’s available is still much larger than the number of ways people are using this information,” Moschos said. “Fusing AI and satellite imagery, Amphitrite can improve the accuracy of global ocean current analyses by up to 2x compared with traditional methods.”

Fine-Tuned to Handle Oceans of Data

The startup’s AI models, tuned to handle seas of data on the ocean, are based on public data from NASA and the European Space Agency — including its Sentinel-3 satellite.

Plus, Amphitrite offers the world’s first forecast model incorporating data from the Surface Water and Ocean Topography (SWOT) mission — a satellite jointly developed and operated by NASA and French space agency CNES, in collaboration with the Canadian Space Agency and UK Space Agency.

“SWOT provides an unprecedented resolution of the ocean surface,” Moschos said.

While weather forecasting technologies have traditionally relied on numerical modeling and computational fluid dynamics, these approaches are harder to apply to the ocean, Moschos explained. This is because oceanic currents often deal with nonlinear physics. There’s also simply less observational data available on the ocean than on atmospheric weather.

Computer vision and AI, working with real-time satellite data, offer higher reliability for oceanic current and weather modeling than traditional methods.

Amphitrite trains and runs its AI models using NVIDIA H100 GPUs on premises and in the cloud — and is building on the FourCastNet model, part of Earth-2, to develop its computer vision models for wave prediction.

According to a case study along the Mediterranean Sea, the NVIDIA-powered Amphitrite fine-scale routing solution helped reduce one shipping line’s carbon emissions by 10%.

Through NVIDIA Inception, Amphitrite gained technical support when building its on-premises infrastructure, free cloud credits for NVIDIA GPU instances on Amazon Web Services, as well as opportunities to collaborate with NVIDIA experts on using the latest simulation technologies, like Earth-2 and FourCastNet.

Customers Set Sail With Amphitrite’s Models

Enterprises and organizations across the globe are using Amphitrite’s AI models to optimize their operations and make them more sustainable.

CMA-CGM, Genavir, Louis Dreyfus Armateurs and Orange Marine are among the shipping and oceanographic companies analyzing currents using the startup’s solutions.

In addition, Amphitrite is working with a nongovernmental organization to help track and remove pollution in the Pacific Ocean. The initiative uses Amphitrite’s models to analyze currents and follow plastics that drift from a garbage patch off the coast of California.

Moschos noted that another way the startup sets itself apart is by having an AI team — led by computer vision scientist Hannah Bull — that comprises majority women, some of whom are featured in the image above.

“This is still rare in the industry, but it’s something we’re really proud of on the technical front, especially since we founded the company in honor of Amphitrite, a powerful but often overlooked female figure in history,” Moschos said.

Learn more about NVIDIA Earth-2.

AI Maps Titan’s Methane Clouds in Record Time

January 24, 2025

by Brian Caulfield nVidia AI

Methane clouds on Titan, Saturn’s largest moon, are more than just a celestial oddity — they’re a window into one of the solar system’s most complex climates.

Until now, mapping them has been slow and grueling work. Enter AI: a team from NASA, UC Berkeley and France’s Observatoire des Sciences de l’Univers just changed the game.

Using NVIDIA GPUs, the researchers trained a deep learning model to analyze years of Cassini data in seconds. Their approach could reshape planetary science, turning what took days into moments.

“We were able to use AI to greatly speed up the work of scientists, increasing productivity and enabling questions to be answered that would otherwise be impractical,” said Zach Yahn, Georgia Tech PhD student and lead author of the study.

Read the full paper, “Rapid Automated Mapping of Clouds on Titan With Instance Segmentation.”

How It Works

At the project’s core is Mask R-CNN — a deep learning model that doesn’t just detect objects. It outlines them pixel by pixel. Trained on hand-labeled images of Titan, it mapped the moon’s elusive clouds: patchy, streaky and barely visible through a smoggy atmosphere.

The team used transfer learning, starting with a model trained on COCO (a dataset of everyday images), and fine-tuned it for Titan’s unique challenges. This saved time and demonstrated how “planetary scientists, who may not always have access to the vast computing resources necessary to train large models from scratch, can still use technologies like transfer learning to apply AI to their data and projects,” Yahn explained.

The model’s potential goes far beyond Titan. “Many other Solar System worlds have cloud formations of interest to planetary science researchers, including Mars and Venus. Similar technology might also be applied to volcanic flows on Io, plumes on Enceladus, linea on Europa and craters on solid planets and moons,” he added.

Fast Science, Powered by NVIDIA

NVIDIA GPUs made this speed possible, processing high-resolution images and generating cloud masks with minimal latency — work that traditional hardware would struggle to handle.

NVIDIA GPUs have become a mainstay for space scientists. They’ve helped analyze Webb Telescope data, model Mars landings and scan for extraterrestrial signals. Now, they’re helping researchers decode Titan.

What’s Next

This AI leap is just the start. Missions like NASA’s Europa Clipper and Dragonfly will flood researchers with data. AI can help handle it, processing it onboard, mid-mission, and even prioritizing findings in real time. Challenges remain, like creating hardware fit for space’s harsh conditions, but the potential is undeniable.

Methane clouds on Titan hold mysteries. Researchers are now unraveling them faster than ever with help from new AI tools accelerated by NVIDIA GPUs.

Read the full paper, “Rapid Automated Mapping of Clouds on Titan With Instance Segmentation.”

Image Credit: NASA Jet Propulsion Laboratory

Fast, Low-Cost Inference Offers Key to Profitable AI

January 23, 2025

by Dave Salvator nVidia AI

Businesses across every industry are rolling out AI services this year. For Microsoft, Oracle, Perplexity, Snap and hundreds of other leading companies, using the NVIDIA AI inference platform — a full stack comprising world-class silicon, systems and software — is the key to delivering high-throughput and low-latency inference and enabling great user experiences while lowering cost.

NVIDIA’s advancements in inference software optimization and the NVIDIA Hopper platform are helping industries serve the latest generative AI models, delivering excellent user experiences while optimizing total cost of ownership. The Hopper platform also helps deliver up to 15x more energy efficiency for inference workloads compared to previous generations.

AI inference is notoriously difficult, as it requires many steps to strike the right balance between throughput and user experience.

But the underlying goal is simple: generate more tokens at a lower cost. Tokens represent words in a large language model (LLM) system — and with AI inference services typically charging for every million tokens generated, this goal offers the most visible return on AI investments and energy used per task.

Full-stack software optimization offers the key to improving AI inference performance and achieving this goal.

Cost-Effective User Throughput

Businesses are often challenged with balancing the performance and costs of inference workloads. While some customers or use cases may work with an out-of-the-box or hosted model, others may require customization. NVIDIA technologies simplify model deployment while optimizing cost and performance for AI inference workloads. In addition, customers can experience flexibility and customizability with the models they choose to deploy.

NVIDIA NIM microservices, NVIDIA Triton Inference Server and the NVIDIA TensorRT library are among the inference solutions NVIDIA offers to suit users’ needs:

NVIDIA NIM inference microservices are prepackaged and performance-optimized for rapidly deploying AI foundation models on any infrastructure — cloud, data centers, edge or workstations.
NVIDIA Triton Inference Server, one of the company’s most popular open-source projects, allows users to package and serve any model regardless of the AI framework it was trained on.
NVIDIA TensorRT is a high-performance deep learning inference library that includes runtime and model optimizations to deliver low-latency and high-throughput inference for production applications.

Available in all major cloud marketplaces, the NVIDIA AI Enterprise software platform includes all these solutions and provides enterprise-grade support, stability, manageability and security.

With the framework-agnostic NVIDIA AI inference platform, companies save on productivity, development, and infrastructure and setup costs. Using NVIDIA technologies can also boost business revenue by helping companies avoid downtime and fraudulent transactions, increase e-commerce shopping conversion rates and generate new, AI-powered revenue streams.

Cloud-Based LLM Inference

To ease LLM deployment, NVIDIA has collaborated closely with every major cloud service provider to ensure that the NVIDIA inference platform can be seamlessly deployed in the cloud with minimal or no code required. NVIDIA NIM is integrated with cloud-native services such as:

Amazon SageMaker AI, Amazon Bedrock Marketplace, Amazon Elastic Kubernetes Service
Google Cloud’s Vertex AI, Google Kubernetes Engine
Microsoft Azure AI Foundry coming soon, Azure Kubernetes Service
Oracle Cloud Infrastructure’s data science tools, Oracle Cloud Infrastructure Kubernetes Engine

Plus, for customized inference deployments, NVIDIA Triton Inference Server is deeply integrated into all major cloud service providers.

For example, using the OCI Data Science platform, deploying NVIDIA Triton is as simple as turning on a switch in the command line arguments during model deployment, which instantly launches an NVIDIA Triton inference endpoint.

Similarly, with Azure Machine Learning, users can deploy NVIDIA Triton either with no-code deployment through the Azure Machine Learning Studio or full-code deployment with Azure Machine Learning CLI. AWS provides one-click deployment for NVIDIA NIM from SageMaker Marketplace and Google Cloud provides a one-click deployment option on Google Kubernetes Engine (GKE). Google Cloud provides a one-click deployment option on Google Kubernetes Engine, while AWS offers NVIDIA Triton on its AWS Deep Learning containers.

The NVIDIA AI inference platform also uses popular communication methods for delivering AI predictions, automatically adjusting to accommodate the growing and changing needs of users within a cloud-based infrastructure.

From accelerating LLMs to enhancing creative workflows and transforming agreement management, NVIDIA’s AI inference platform is driving real-world impact across industries. Learn how collaboration and innovation are enabling the organizations below to achieve new levels of efficiency and scalability.

Serving 400 Million Search Queries Monthly With Perplexity AI

Perplexity AI, an AI-powered search engine, handles over 435 million monthly queries. Each query represents multiple AI inference requests. To meet this demand, the Perplexity AI team turned to NVIDIA H100 GPUs, Triton Inference Server and TensorRT-LLM.

Supporting over 20 AI models, including Llama 3 variations like 8B and 70B, Perplexity processes diverse tasks such as search, summarization and question-answering. By using smaller classifier models to route tasks to GPU pods, managed by NVIDIA Triton, the company delivers cost-efficient, responsive service under strict service level agreements.

Through model parallelism, which splits LLMs across GPUs, Perplexity achieved a threefold cost reduction while maintaining low latency and high accuracy. This best-practice framework demonstrates how IT teams can meet growing AI demands, optimize total cost of ownership and scale seamlessly with NVIDIA accelerated computing.

Reducing Response Times With Recurrent Drafter (ReDrafter)

Open-source research advancements are helping to democratize AI inference. Recently, NVIDIA incorporated Redrafter, an open-source approach to speculative decoding published by Apple , into NVIDIA TensorRT-LLM.

ReDrafter uses smaller “draft” modules to predict tokens in parallel, which are then validated by the main model. This technique significantly reduces response times for LLMs, particularly during periods of low traffic.

Transforming Agreement Management With Docusign

Docusign, a leader in digital agreement management, turned to NVIDIA to supercharge its Intelligent Agreement Management platform. With over 1.5 million customers globally, Docusign needed to optimize throughput and manage infrastructure expenses while delivering AI-driven insights.

NVIDIA Triton provided a unified inference platform for all frameworks, accelerating time to market and boosting productivity by transforming agreement data into actionable insights. Docusign’s adoption of the NVIDIA inference platform underscores the positive impact of scalable AI infrastructure on customer experiences and operational efficiency.

“NVIDIA Triton makes our lives easier,” said Alex Zakhvatov, senior product manager at Docusign. “We no longer need to deploy bespoke, framework-specific inference servers for our AI models. We leverage Triton as a unified inference server for all AI frameworks and also use it to identify the right production scenario to optimize cost- and performance-saving engineering efforts.”

Enhancing Customer Care in Telco With Amdocs

Amdocs, a leading provider of software and services for communications and media providers, built amAIz, a domain-specific generative AI platform for telcos as an open, secure, cost-effective and LLM-agnostic framework. Amdocs is using NVIDIA DGX Cloud and NVIDIA AI Enterprise software to provide solutions based on commercially available LLMs as well as domain-adapted models, enabling service providers to build and deploy enterprise-grade generative AI applications.

Using NVIDIA NIM, Amdocs reduced the number of tokens consumed for deployed use cases by up to 60% in data preprocessing and 40% in inferencing, offering the same level of accuracy with a significantly lower cost per token, depending on various factors and volumes used. The collaboration also reduced query latency by approximately 80%, ensuring that end users experience near real-time responses. This acceleration enhances user experiences across commerce, customer service, operations and beyond.

Revolutionizing Retail With AI on Snap

Shopping for the perfect outfit has never been easier, thanks to Snap’s Screenshop feature. Integrated into Snapchat, this AI-powered tool helps users find fashion items seen in photos. NVIDIA Triton played a pivotal role in enabling Screenshop’s pipeline, which processes images using multiple frameworks, including TensorFlow and PyTorch.

By consolidating its pipeline onto a single inference serving platform, Snap significantly reduced development time and costs while ensuring seamless deployment of updated models. The result is a frictionless user experience powered by AI.

“We didn’t want to deploy bespoke inference serving platforms for our Screenshop pipeline, a TF-serving platform for TensorFlow and a TorchServe platform for PyTorch,” explained Ke Ma, a machine learning engineer at Snap. “Triton’s framework-agnostic design and support for multiple backends like TensorFlow, PyTorch and ONNX was very compelling. It allowed us to serve our end-to-end pipeline using a single inference serving platform, which reduces our inference serving costs and the number of developer days needed to update our models in production.”

Following the successful launch of the Screenshop service on NVIDIA Triton, Ma and his team turned to NVIDIA TensorRT to further enhance their system’s performance. By applying the default NVIDIA TensorRT settings during the compilation process, the Screenshop team immediately saw a 3x surge in throughput, estimated to deliver a staggering 66% cost reduction.

Financial Freedom Powered by AI With Wealthsimple

Wealthsimple, a Canadian investment platform managing over C$30 billion in assets, redefined its approach to machine learning with NVIDIA’s AI inference platform. By standardizing its infrastructure, Wealthsimple slashed model delivery time from months to under 15 minutes, eliminating downtime and empowering teams to deliver machine learning as a service.

By adopting NVIDIA Triton and running its models through AWS, Wealthsimple achieved 99.999% uptime, ensuring seamless predictions for over 145 million transactions annually. This transformation highlights how robust AI infrastructure can revolutionize financial services.

“NVIDIA’s AI inference platform has been the linchpin in our organization’s ML success story, revolutionizing our model deployment, reducing downtime and enabling us to deliver unparalleled service to our clients,” said Mandy Gu, senior software development manager at Wealthsimple.

Elevating Creative Workflows With Let’s Enhance

AI-powered image generation has transformed creative workflows and can be applied to enterprise use cases such as creating personalized content and imaginative backgrounds for marketing visuals. While diffusion models are powerful tools for enhancing creative workflows, the models can be computationally expensive.

To optimize its workflows using the Stable Diffusion XL model in production, Let’s Enhance, a pioneering AI startup, chose the NVIDIA AI inference platform.

Product images with backgrounds created using Let’s Enhance platform powered by SDXL.

Let’s Enhance’s latest product, AI Photoshoot, uses the SDXL model to transform plain product photos into beautiful visual assets for e-commerce websites and marketing campaigns.

With NVIDIA Triton’s robust support for various frameworks and backends, coupled with its dynamic batching feature set, Let’s Enhance was able to seamlessly integrate the SDXL model into existing AI pipelines with minimal involvement from engineering teams, freeing up their time for research and development efforts.

Accelerating Cloud-Based Vision AI With OCI

Oracle Cloud Infrastructure (OCI) integrated NVIDIA Triton to power its Vision AI service, enhancing prediction throughput by up to 76% and reducing latency by 51%. These optimizations improved customer experiences with applications including automating toll billing for transit agencies and streamlining invoice recognition for global businesses.

With Triton’s hardware-agnostic capabilities, OCI has expanded its AI services portfolio, offering robust and efficient solutions across its global data centers.

“Our AI platform is Triton-aware for the benefit of our customers,” said Tzvi Keisar, a director of product management for OCI’s data science service, which handles machine learning for Oracle’s internal and external users.

Real-Time Contextualized Intelligence and Search Efficiency With Microsoft

Azure offers one of the widest and broadest selections of virtual machines powered and optimized by NVIDIA AI. These virtual machines encompass multiple generations of NVIDIA GPUs, including NVIDIA Blackwell and NVIDIA Hopper systems.

Building on this rich history of engineering collaboration, NVIDIA GPUs and NVIDIA Triton now help accelerate AI inference in Copilot for Microsoft 365. Available as a dedicated physical keyboard key on Windows PCs, Microsoft 365 Copilot combines the power of LLMs with proprietary enterprise data to deliver real-time contextualized intelligence, enabling users to enhance their creativity, productivity and skills.

Microsoft Bing also used NVIDIA inference solutions to address challenges including latency, cost and speed. By integrating NVIDIA TensorRT-LLM techniques, Microsoft significantly improved inference performance for its Deep Search feature, which powers optimized web results.

Deep search walkthrough courtesy of Microsoft

Microsoft Bing Visual Search enables people around the world to find content using photographs as queries. The heart of this capability is Microsoft’s TuringMM visual embedding model that maps images and text into a shared high-dimensional space. Because it operates on billions of images across the web, performance is critical.

Microsoft Bing optimized the TuringMM pipeline using NVIDIA TensorRT and NVIDIA acceleration libraries including CV-CUDA and nvImageCodec. These efforts resulted in a 5.13x speedup and significant TCO reduction.

Unlocking the Full Potential of AI Inference With Hardware Innovation

Improving the efficiency of AI inference workloads is a multifaceted challenge that demands innovative technologies across hardware and software.

NVIDIA GPUs are at the forefront of AI enablement, offering high efficiency and performance for AI models. They’re also the most energy efficient: NVIDIA accelerated computing on the NVIDIA Blackwell architecture has cut the energy used per token generation by 100,000x in the past decade for inference of trillion-parameter AI models.

The NVIDIA Grace Hopper Superchip, which combines NVIDIA Grace CPU and Hopper GPU architectures using NVIDIA NVLink-C2C, delivers substantial inference performance improvements across industries.

Meta Andromeda is using the superchip for efficient and high-performing personalized ads retrieval. By creating deep neural networks with increased compute complexity and parallelism, on Facebook and Instagram it has achieved an 8% ad quality improvement on select segments and a 6% recall improvement.

With optimized retrieval models and low-latency, high-throughput and memory-IO aware GPU operators, Andromeda offers a 100x improvement in feature extraction speed compared to previous CPU-based components. This integration of AI at the retrieval stage has allowed Meta to lead the industry in ads retrieval, addressing challenges like scalability and latency for a better user experience and higher return on ad spend.

As cutting-edge AI models continue to grow in size, the amount of compute required to generate each token also grows. To run state-of-the-art LLMs in real time, enterprises need multiple GPUs working in concert. Tools like the NVIDIA Collective Communication Library, or NCCL, enable multi-GPU systems to quickly exchange large amounts of data between GPUs with minimal communication time.

Future AI Inference Innovations

The future of AI inference promises significant advances in both performance and cost.

The combination of NVIDIA software, novel techniques and advanced hardware will enable data centers to handle increasingly complex and diverse workloads. AI inference will continue to drive advancements in industries such as healthcare and finance by enabling more accurate predictions, faster decision-making and better user experiences.

Learn more about how NVIDIA is delivering breakthrough inference performance results and stay up to date with the latest AI inference performance updates.

‘Baldur’s Gate 3’ Mod Support Launches in the Cloud

January 23, 2025

by GeForce NOW Community nVidia AI

GeForce NOW is expanding mod support for hit game Baldur’s Gate 3 in collaboration with Larian Studios and mod.io for Ultimate and Performance members.

This expanded mod support arrives alongside seven new games joining the cloud this week.

Level Up Gaming

Time to roll for initiative — adventurers in the Forgotten Realms can now enjoy a range of curated mods uploaded to mod.io for Baldur’s Gate 3. Ultimate and Performance members can enhance their Baldur’s Gate 3 journeys across realms and devices with a wide array of customization options. Stay tuned to GFN Thursday for more information on expanding mod support for more of the game’s PC mods at a later time.

Downloading mods is easy — choose the desired mods from the Baldur’s Gate 3 in-game mod menu, and they’ll stay enabled across sessions. Or subscribe to the mods via mod.io to load them automatically when launching the game from GeForce NOW. Read more details in the knowledge article.

Learn more about how curated mods are made available for Baldur’s Gate 3 players and read the curation guidelines.

GeForce NOW members can bring their unique adventures across devices, including NVIDIA SHIELD TVs, underpowered laptops, Macs, Chromebooks and handheld devices like the Steam Deck. Whether battling mind flayers in the living room or crafting spells on the go, GeForce NOW delivers experiences that are seamless, immersive and portable as a Bag of Holding.

NOW Playing

Hordes of Hel on GeForce NOW — *Ring around the rosie.*

Jötunnslayer: Hordes of Hel is a gripping roguelike horde-survivor game set in the dark realms of Norse Mythology. Fight waves of enemies to earn divine blessings of ancient Viking Deities, explore hostile worlds and face powerful bosses. Become a god-like warrior in this ultimate showdown.

Jötunnslayer: Hordes of Hel (New release on Jan 21, Steam)
Among Us (Xbox, available on PC Game Pass)
Amnesia: Collection (Xbox, available on the Microsoft Store)
Lawn Mowing Simulator (Xbox, available on the Microsoft Store)
Sins of a Solar Empire: Rebellion (Xbox, available on the Microsoft Store)
STORY OF SEASONS: Friends of Mineral Town (Xbox, available on the Microsoft Store)
Townscaper (Xbox, available on the Microsoft Store)

What are you planning to play this weekend? Let us know on X or in the comments below.

what’s the last game you ever beat?

— NVIDIA GeForce NOW (@NVIDIAGFN) January 22, 2025

How AI Helps Fight Fraud in Financial Services, Healthcare, Government and More

January 22, 2025

by Dan Rowinski nVidia AI

Companies and organizations are increasingly using AI to protect their customers and thwart the efforts of fraudsters around the world.

Voice security company Hiya found that 550 million scam calls were placed per week in 2023, with INTERPOL estimating that scammers stole $1 trillion from victims that same year. In the U.S., one of four noncontact-list calls were flagged as suspected spam, with fraudsters often luring people into Venmo-related or extended warranty scams.

Traditional methods of fraud detection include rules-based systems, statistical modeling and manual reviews. These methods have struggled to scale to the growing volume of fraud in the digital era without sacrificing speed and accuracy. For instance, rules-based systems often have high false-positive rates, statistical modeling can be time-consuming and resource-intensive, and manual reviews can’t scale rapidly enough.

In addition, traditional data science workflows lack the infrastructure required to analyze the volumes of data involved in fraud detection, leading to slower processing times and limiting real-time analysis and detection.

Plus, fraudsters themselves can use large language models (LLMs) and other AI tools to trick victims into investing in scams, giving up their bank credentials or buying cryptocurrency.

But AI — coupled with accelerated computing systems— can be used to check AI and help mitigate all of these issues.

Businesses that integrate robust AI fraud detection tools have seen up to a 40% improvement in fraud detection accuracy — helping reduce financial and reputational damage to institutions.

These technologies offer robust infrastructure and solutions for analyzing vast amounts of transactional data and can quickly and efficiently recognize fraud patterns and identify abnormal behaviors.

AI-powered fraud detection solutions provide higher detection accuracy by looking at the whole picture instead of individual transactions, catching fraud patterns that traditional methods might overlook. AI can also help reduce false positives, tapping into quality data to provide context about what constitutes a legitimate transaction. And, importantly, AI and accelerated computing provide better scalability, capable of handling massive data networks to detect fraud in real time.

How Financial Institutions Use AI to Detect Fraud

Financial services and banking are the front lines of the battle against fraud such as identity theft, account takeover, false or illegal transactions, and check scams. Financial losses worldwide from credit card transaction fraud are expected to reach $43 billion by 2026.

AI is helping enhance security and address the challenge of escalating fraud incidents.

Banks and other financial service institutions can tap into NVIDIA technologies to combat fraud. For example, the NVIDIA RAPIDS Accelerator for Apache Spark enables faster data processing to handle massive volumes of transaction data. Banks and financial service institutions can also use the new NVIDIA AI workflow for fraud detection — harnessing AI tools like XGBoost and graph neural networks (GNNs) with NVIDIA RAPIDS, NVIDIA Triton and NVIDIA Morpheus — to detect fraud and reduce false positives.

BNY Mellon improved fraud detection accuracy by 20% using NVIDIA DGX systems. PayPal improved real-time fraud detection by 10% running on NVIDIA GPU-powered inference, while lowering server capacity by nearly 8x. And Swedbank trained generative adversarial networks on NVIDIA GPUs to detect suspicious activities.

US Federal Agencies Fight Fraud With AI

The United States Government Accountability Office estimates that the government loses up to $521 billion annually due to fraud, based on an analysis of fiscal years 2018 to 2022. Tax fraud, check fraud and improper payments to contractors, in addition to improper payments under the Social Security and Medicare programs have become a massive drag on the government’s finances.

While some of this fraud was inflated by the recent pandemic, finding new ways to combat fraud has become a strategic imperative. As such, federal agencies have turned to AI and accelerated computing to improve fraud detection and prevent improper payments.

For example, the U.S. Treasury Department began using machine learning in late 2022 to analyze its trove of data and mitigate check fraud. The department estimated that AI helped officials prevent or recover more than $4 billion in fraud in fiscal year 2024.

Along with the Treasury Department, agencies such as the Internal Revenue Service have looked to AI and machine learning to close the tax gap — including tax fraud — which was estimated at $606 billion in tax year 2022. The IRS has explored the use of NVIDIA’s accelerated data science frameworks such as RAPIDS and Morpheus to identify anomalous patterns in taxpayer records, data access and common vulnerability and exposures. LLMs combined with retrieval-augmented generation and RAPIDS have also been used to highlight records that may not be in alignment with policies.

How AI Can Help Healthcare Stem Potential Fraud

According to the U.S. Department of Justice, healthcare fraud, waste and abuse may account for as much as 10% of all healthcare expenditures. Other estimates have deemed that percentage closer to 3%. Medicare and Medicaid fraud could be near $100 billion. Regardless, healthcare fraud is a problem worth hundreds of billions of dollars.

The additional challenge with healthcare fraud is that it can come from all directions. Unlike the IRS or the financial services industry, the healthcare industry is a fragmented ecosystem of hospital systems, insurance companies, pharmaceutical companies, independent medical or dental practices, and more. Fraud can occur at both provider and patient levels, putting pressure on the entire system.

Common types of potential healthcare fraud include:

Billing for services not rendered
Upcoding: billing for a more expensive service than the one rendered
Unbundling: multiple bills for the same service
Falsifying records
Using someone else’s insurance
Forged prescriptions

The same AI technologies that help combat fraud in financial services and the public sector can also be applied to healthcare. Insurance companies can use pattern and anomaly detection to look for claims that seem atypical, either from the provider or the patient, and scrutinize billing data for potentially fraudulent activity. Real-time monitoring can detect suspicious activity at the source, as it’s happening. And automated claims processing can help reduce human error and detect inconsistencies while improving operational efficiency.

Data processing through NVIDIA RAPIDS can be combined with machine learning and GNNs or other types of AI to help better detect fraud at every layer of the healthcare system, assisting patients and practitioners everywhere dealing with high costs of care.

AI for Fraud Detection Could Save Billions of Dollars

Financial services, the public sector and the healthcare industry are all using AI for fraud detection to provide a continuous defense against one of the world’s biggest drains on economic activity.

The NVIDIA AI platform supports the entire fraud detection and identity verification pipeline — from data preparation to model training to deployment — with tools like NVIDIA RAPIDS, NVIDIA Triton Inference Server and NVIDIA Morpheus on the NVIDIA AI Enterprise software platform.

Learn more about NVIDIA solutions for fraud detection with AI and accelerated computing.

A New Class of Models That Reason

The DeepSeek Difference

Peak Performance on RTX

Experience DeepSeek on RTX in Popular Tools

DeepSeek-R1 — a Perfect Example of Test-Time Scaling

Get Started Now With the DeepSeek-R1 NIM Microservice

Lights, Camera, Broadcast

Accelerating Creative Workflows

Magical New Games

Joyful January

Time Stamps

You Might Also Like…

Optimizing Routes Based on Currents and Weather

Fine-Tuned to Handle Oceans of Data

Customers Set Sail With Amphitrite’s Models

How It Works

Fast Science, Powered by NVIDIA

What’s Next

Cost-Effective User Throughput

Cloud-Based LLM Inference

Serving 400 Million Search Queries Monthly With Perplexity AI

Reducing Response Times With Recurrent Drafter (ReDrafter)

Transforming Agreement Management With Docusign

Enhancing Customer Care in Telco With Amdocs

Revolutionizing Retail With AI on Snap

Financial Freedom Powered by AI With Wealthsimple

Elevating Creative Workflows With Let’s Enhance

Accelerating Cloud-Based Vision AI With OCI

Real-Time Contextualized Intelligence and Search Efficiency With Microsoft

Unlocking the Full Potential of AI Inference With Hardware Innovation

Future AI Inference Innovations

Level Up Gaming

NOW Playing

How Financial Institutions Use AI to Detect Fraud

US Federal Agencies Fight Fraud With AI

How AI Can Help Healthcare Stem Potential Fraud

AI for Fraud Detection Could Save Billions of Dollars

Navigation

GenAI Vision Endless Possibilities

"I'm interested in things that change the world or that affect the future and wondrous, new technology where you see it, and you're like, 'Wow, how did that even happen? How is that possible?'" -- Elon Musk

Copyright © 2019-2025 Vedere AI. All Rights Reserved.