Nerding About NeRFs: How Neural Radiance Fields Transform 2D Images Into Hyperrealistic 3D Models

Nerding About NeRFs: How Neural Radiance Fields Transform 2D Images Into Hyperrealistic 3D Models

Let’s talk about NeRFs — no, not the neon-colored foam dart blasters, but neural radiance fields, a technology that might just change the nature of images forever. In this episode of NVIDIA’s AI Podcast recorded live at GTC, host Noah Kravitz speaks with Michael Rubloff, founder and managing editor of radiancefields.com, about radiance field-based technologies. NeRFs allow users to take a series of 2D images or video to create a hyperrealistic 3D model — something like a photograph of a scene, but that can be looked at from multiple angles. Tune in to learn more about the technology’s creative and commercial applications and how it might transform the way people capture and experience the world.

Watch the replay of Rubloff’s GTC session on the intersection of generative AI and extended reality.

Time Stamps

1:18: What are NeRFs?
3:01: How do NeRFs work?
3:44: What’s the difference between NeRFs and Gaussian splatting?
4:36: How are NeRFs being used?
7:22: What is a radiance field?
14:18: How might radiance fields affect creative applications?
17:50: Examples of NeRFs in action in the media right now
21:00: Rubloff’s insight on where NeRFs will go in the future

You Might Also Like…

Media.Monks’ Lewis Smithingham on Enhancing Media and Marketing With AI – Ep. 222

Meet Media.Monks’ Wormhole, an alien-like, conversational robot with a quirky personality and the ability to offer keen marketing expertise. Lewis Smithingham, senior vice president of innovation and special ops at Media.Monks, a global marketing and advertising company, discusses the creation of Wormhole and AI’s potential to enhance media and entertainment.

Living Optics CEO Robin Wang on Democratizing Hyperspectral Imaging – Ep. 219

Step into the realm of the unseen with Robin Wang, CEO of Living Optics. The startup cofounder discusses the power of Living Optics’ hyperspectral imaging camera, which can capture visual data across 96 colors, reveals details invisible to the human eye.

Exploring Filmmaking With Cuebric’s AI: Insights from Pinar Seyhan Demirdag – Ep. 214

Pinar Seyhan Demirdag, co-founder and CEO of Cuebric, which is on a mission to offer new filmmaking and content creation solutions through immersive, two-and-a-half-dimensional cinematic environments.

Deepdub’s Ofir Krakowski on Redefining Dubbing From Hollywood to Bollywood – Ep. 202

Deepdub acts as a digital bridge, providing access to content by using generative AI to break down language and cultural barriers. The Israel-based startup’s co-founder and CEO, Ofir Krakowski, describes how Deepdub uses AI-driven dubbing to help entertainment companies boost efficiency and cut costs while increasing accessibility.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

Read More

Why Accelerated Data Processing Is Crucial for AI Innovation in Every Industry

Why Accelerated Data Processing Is Crucial for AI Innovation in Every Industry

Across industries, AI is supercharging innovation with machine-powered computation. In finance, bankers are using AI to detect fraud more quickly and keep accounts safe, telecommunications providers are improving networks to deliver superior service, scientists are developing novel treatments for rare diseases, utility companies are building cleaner, more reliable energy grids and automotive companies are making self-driving cars safer and more accessible.

The backbone of top AI use cases is data. Effective and precise AI models require training on extensive datasets. Enterprises seeking to harness the power of AI must establish a data pipeline that involves extracting data from diverse sources, transforming it into a consistent format and storing it efficiently.

Data scientists work to refine datasets through multiple experiments to fine-tune AI models for optimal performance in real-world applications. These applications, from voice assistants to personalized recommendation systems, require rapid processing of large data volumes to deliver real-time performance.

As AI models become more complex and begin to handle diverse data types such as text, audio, images, and video, the need for rapid data processing becomes more critical. Organizations that continue to rely on legacy CPU-based computing are struggling with hampered innovation and performance due to data bottlenecks, escalating data center costs, and insufficient computing capabilities.

Many businesses are turning to accelerated computing to integrate AI into their operations. This method leverages GPUs, specialized hardware, software, and parallel computing techniques to boost computing performance by as much as 150x and increase energy efficiency by up to 42x.

Leading companies across different sectors are using accelerated data processing to spearhead groundbreaking AI initiatives.

Finance Organizations Detect Fraud in a Fraction of a Second

Financial organizations face a significant challenge in detecting patterns of fraud due to the vast amount of transactional data that requires rapid analysis. Additionally, the scarcity of labeled data for actual instances of fraud poses a difficulty in training AI models. Conventional data science pipelines lack the required acceleration to handle the large data volumes associated with fraud detection. This leads to slower processing times that hinder real-time data analysis and fraud detection capabilities.

To overcome these challenges, American Express, which handles more than 8 billion transactions per year, uses accelerated computing to train and deploy long short-term memory (LSTM) models. These models excel in sequential analysis and detection of anomalies, and can adapt and learn from new data, making them ideal for combating fraud.

Leveraging parallel computing techniques on GPUs, American Express significantly speeds up the training of its LSTM models. GPUs also enable live models to process huge volumes of transactional data to make high-performance computations to detect fraud in real time.

The system operates within two milliseconds of latency to better protect customers and merchants, delivering a 50x improvement over a CPU-based configuration. By combining the accelerated LSTM deep neural network with its existing methods, American Express has improved fraud detection accuracy by up to 6% in specific segments.

Financial companies can also use accelerated computing to reduce data processing costs. Running data-heavy Spark3 workloads on NVIDIA GPUs, PayPal confirmed the potential to reduce cloud costs by up to 70% for big data processing and AI applications.

By processing data more efficiently, financial institutions can detect fraud in real time, enabling faster decision-making without disrupting transaction flow and minimizing the risk of financial loss.

Telcos Simplify Complex Routing Operations

Telecommunications providers generate immense amounts of data from various sources, including network devices, customer interactions, billing systems, and network performance and maintenance.

Managing national networks that handle hundreds of petabytes of data every day requires complex technician routing to ensure service delivery. To optimize technician dispatch, advanced routing engines perform trillions of computations, taking into account factors like weather, technician skills, customer requests and fleet distribution. Success in these operations depends on meticulous data preparation and sufficient computing power.

AT&T, which operates one of the nation’s largest field dispatch teams to service its customers, is enhancing data-heavy routing operations with NVIDIA cuOpt, which relies on heuristics, metaheuristics and optimizations to calculate complex vehicle routing problems.

In early trials, cuOpt delivered routing solutions in 10 seconds, achieving a 90% reduction in cloud costs and enabling technicians to complete more service calls daily. NVIDIA RAPIDS, a suite of software libraries that enables acceleration of data science and analytics pipelines, further accelerates cuOpt, allowing companies to integrate local search heuristics and metaheuristics like Tabu search for continuous route optimization.

AT&T is adopting NVIDIA RAPIDS Accelerator for Apache Spark to enhance the performance of Spark-based AI and data pipelines. This has helped the company boost operational efficiency on everything from training AI models to maintaining network quality to reducing customer churn and improving fraud detection. With RAPIDS Accelerator, AT&T is reducing its cloud computing spend for target workloads while enabling faster performance and reducing its carbon footprint.

Accelerated data pipelines and processing will be critical as telcos seek to improve operational efficiency while delivering the highest possible service quality.

Biomedical Researchers Condense Drug Discovery Timelines

As researchers utilize technology to study the roughly 25,000 genes in the human genome to understand their relationship with diseases, there has been an explosion of medical data and peer-reviewed research papers. Biomedical researchers rely on these papers to narrow down the field of study for novel treatments. However, conducting literature reviews of such a massive and expanding body of relevant research has become an impossible task.

AstraZeneca, a leading pharmaceutical company, developed a Biological Insights Knowledge Graph (BIKG) to aid scientists across the drug discovery process, from literature reviews to screen hit rating, target identification and more. This graph integrates public and internal databases with information from scientific literature, modeling between 10 million and 1 billion complex biological relationships.

BIKG has been effectively used for gene ranking, aiding scientists in hypothesizing high-potential targets for novel disease treatments. At NVIDIA GTC, the AstraZeneca team presented a project that successfully identified genes linked to resistance in lung cancer treatments.

To narrow down potential genes, data scientists and biological researchers collaborated to define the criteria and gene features ideal for targeting in treatment development. They trained a machine learning algorithm to search the BIKG databases for genes with the designated features mentioned in literature as treatable. Utilizing NVIDIA RAPIDS for faster computations, the team reduced the initial gene pool from 3,000 to just 40 target genes, a task that previously took months but now takes mere seconds.

By supplementing drug development with accelerated computing and AI, pharmaceutical companies and researchers can finally use the enormous troves of data building up in the medical field to develop novel drugs faster and more safely, ultimately having a life-saving impact.

Utility Companies Build the Future of Clean Energy 

There’s been a significant push to shift to carbon-neutral energy sources in the energy sector. With the cost of harnessing renewable resources such as solar energy falling drastically over the last 10 years, the opportunity to make real progress toward a clean energy future has never been greater.

However, this shift toward integrating clean energy from wind farms, solar farms and home batteries has introduced new complexities in grid management. As energy infrastructure diversifies and two-way power flows must be accommodated, managing the grid has become more data-intensive. New smart grids are now required to handle high-voltage areas for vehicle charging. They must also manage the availability of distributed stored energy sources and adapt to variations in usage across the network.

Utilidata, a prominent grid-edge software company, has collaborated with NVIDIA to develop a distributed AI platform, Karman, for the grid edge using a custom NVIDIA Jetson Orin edge AI module. This custom chip and platform, embedded in electricity meters, transforms each meter into a data collection and control point, capable of handling thousands of data points per second.

Karman processes real-time, high-resolution data from meters at the network’s edge. This enables utility companies to gain detailed insights into grid conditions, predict usage and seamlessly integrate distributed energy resources in seconds, rather than minutes or hours. Additionally, with inference models on edge devices, network operators can anticipate and quickly identify line faults to predict potential outages and conduct preventative maintenance to increase grid reliability.

Through the integration of AI and accelerated data analytics, Karman helps utility providers transform existing infrastructure into efficient smart grids. This allows for tailored, localized electricity distribution to meet fluctuating demand patterns without extensive physical infrastructure upgrades, facilitating a more cost-effective modernization of the grid.

Automakers Enable Safer, More Accessible, Self-Driving Vehicles

As auto companies strive for full self-driving capabilities, vehicles must be able to detect objects and navigate in real time. This requires high-speed data processing tasks, including feeding live data from cameras, lidar, radar and GPS into AI models that make navigation decisions to keep roads safe.

The autonomous driving inference workflow is complex and includes multiple AI models along with necessary preprocessing and postprocessing steps. Traditionally, these steps were handled on the client side using CPUs. However, this can lead to significant bottlenecks in processing speeds, which is an unacceptable drawback for an application where fast processing equates to safety.

To enhance the efficiency of autonomous driving workflows, electric vehicle manufacturer NIO integrated NVIDIA Triton Inference Server into its inference pipeline. NVIDIA Triton is open-source, multi-framework, inference-serving software. By centralizing data processing tasks, NIO reduced latency by 6x in some core areas and increased overall data throughput by up to 5x.

NIO’s GPU-centric approach made it easier to update and deploy new AI models without the need to change anything on the vehicles themselves. Additionally, the company could use multiple AI models at the same time on the same set of images without having to send data back and forth over a network, which saved on data transfer costs and improved performance.

By using accelerated data processing, autonomous vehicle software developers ensure they can reach a high-performance standard to avoid traffic accidents, lower transportation costs and improve mobility for users.

Retailers Improve Demand Forecasting

In the fast-paced retail environment, the ability to process and analyze data quickly is critical to adjusting inventory levels, personalizing customer interactions and optimizing pricing strategies on the fly. The larger a retailer is and the more products it carries, the more complex and compute-intensive its data operations will be.

Walmart, the largest retailer in the world, turned to accelerated computing to significantly improve forecasting accuracy for 500 million item-by-store combinations across 4,500 stores.

As Walmart’s data science team built more robust machine learning algorithms to take on this mammoth forecasting challenge, the existing computing environment began to falter, with jobs failing to complete or generating inaccurate results. The company found that data scientists were having to remove features from algorithms just so they would run to completion.

To improve its forecasting operations, Walmart started using NVIDIA GPUs and RAPIDs. The company now uses a forecasting model with 350 data features to predict sales across all product categories. These features encompass sales data, promotional events, and external factors like weather conditions and major events like the Super Bowl, which influence demand.

Advanced models helped Walmart improve forecast accuracy from 94% to 97% while eliminating an estimated $100 million in fresh produce waste and reducing stockout and markdown scenarios. GPUs also ran models 100x faster with jobs complete in just four hours, an operation that would’ve taken several weeks in a CPU environment.

By shifting data-intensive operations to GPUs and accelerated computing, retailers can lower both their cost and their carbon footprint while delivering best-fit choices and lower prices to shoppers.

Public Sector Improves Disaster Preparedness 

Drones and satellites capture huge amounts of aerial image data that public and private organizations use to predict weather patterns, track animal migrations and observe environmental changes. This data is invaluable for research and planning, enabling more informed decision-making in fields like agriculture, disaster management and efforts to combat climate change. However, the value of this imagery can be limited if it lacks specific location metadata.

A federal agency working with NVIDIA needed a way to automatically pinpoint the location of images missing geospatial metadata, which is essential for missions such as search and rescue, responding to natural disasters and monitoring the environment. However, identifying a small area within a larger region using an aerial image without metadata is extremely challenging, akin to locating a needle in a haystack. Algorithms designed to help with geolocation must address variations in image lighting and differences due to images being taken at various times, dates and angles.

To identify non-geotagged aerial images, NVIDIA, Booz Allen and the government agency collaborated on a solution that uses computer vision algorithms to extract information from image pixel data to scale the image similarity search problem.

When attempting to solve this problem, an NVIDIA solutions architect first used a Python-based application. Initially running on CPUs, processing took more than 24 hours. GPUs supercharged this to just minutes, performing thousands of data operations in parallel versus only a handful of operations on a CPU. By shifting the application code to CuPy, an open-sourced GPU-accelerated library, the application experienced a remarkable 1.8-million-x speedup, returning results in 67 microseconds.

With a solution that can process images and the data of large land masses in just minutes, organizations can gain access to the critical information needed to respond more quickly and effectively to emergencies and plan proactively, potentially saving lives and safeguarding the environment.

Accelerate AI Initiatives and Deliver Business Results

Companies using accelerated computing for data processing are advancing AI initiatives and positioning themselves to innovate and perform at higher levels than their peers.

Accelerated computing handles larger datasets more efficiently, enables faster model training and selection of optimal algorithms, and facilitates more precise results for live AI solutions.

Enterprises that use it can achieve superior price-performance ratios compared to traditional CPU-based systems and enhance their ability to deliver outstanding results and experiences to customers, employees and partners.

Learn how accelerated computing helps organizations achieve AI objectives and drive innovation. 

Read More

Here Comes a New Challenger: ‘Street Fighter 6’ Joins GeForce NOW

Here Comes a New Challenger: ‘Street Fighter 6’ Joins GeForce NOW

Capcom’s latest entry in the iconic Street Fighter series, Street Fighter 6, punches its way into the cloud this GFN Thursday. The game, along with Ubisoft’s XDefiant, leads six new games joining the GeForce NOW library.

A new reward makes its way to the cloud gaming service’s Ultimate and Priority members. For a limited time, GeForce NOW members who are new to Xbox PC Game Pass can get three months of Microsoft’s subscription service free, just by opting into the GeForce NOW Rewards program.

Plus, make sure to follow @NVIDIAGFN on X to see picturesque in-game locations from where members are sending their #GreetingsfromGFN.

Get Ready to Rumble

Street Fighter 6 on GeForce NOW
Are Ryu ready?

Unleash the ultimate Hadoken with Street Fighter 6 on GeForce NOW. The renowned 2D fighting game returns with intense battles, special moves, combos and Super Art attacks to defeat opponents. With a roster of 22 iconic fighters, including classic World Warriors like Ryu, Chun-Li, Guile and Akuma, plus all-new characters like Kimberly, Jamie, Marisa and Manon, there’s no better time to hit the streets.

The newest installment introduces innovative features and enhanced visuals across three distinct game modes — Fighting Ground, World Tour and Battle Hub — for gamers to level up and put their skills to the test. The game’s blend of classic mechanics and fresh enhancements is captivating longtime fans and newcomers alike.

Become a World Warrior in the cloud with a GeForce NOW Ultimate membership and stream all the fighting glory at up to stunning 4K resolution. Witness every punch, kick and Hadoken with others by hopping online for some head-to-head competition.

Defying Gravity

XDefiant on GeForce NOW
Discover which faction will reign supreme in “XDefiant.”

XDefiant, a free-to-play first-person shooter, combines intense gunplay with strategic team dynamics. Set in a world where factions inspired by iconic Ubisoft franchises clash, the game enables players to customize their loadouts and engage in fast-paced battles. Choose stealthy tactics or all-out aggression for a diverse and thrilling multiplayer experience.

Prepare for adrenaline-fueled firefights and tactical showdowns at up to 240 frames per second with an Ultimate membership. Every frame counts in the fight against other factions.

Get in the Pass Lane

PC Game Pass member reward on GeForce NOW
It’s rewarding to be a GeForce NOW member.

Get ready for a summer of gaming. GeForce NOW Ultimate and Priority members new to PC Game Pass and part of the GeForce NOW Rewards program can now receive three free months of Microsoft’s service.

With PC Game Pass and GeForce NOW, members can play high-quality Xbox PC titles with the power of an NVIDIA GeForce RTX server in the cloud. Jump into the action in iconic franchises like Starfield, Forza Motorsport and Remnant II with support for more titles added every GFN Thursday.

This special offer is available for a limited time, and only for GeForce NOW members new to PC Game Pass.

Mischief Managed

Sneak out on GeForce NOW
Hide and seek on an epic scale.

Get into all kinds of mischief and fun in Sneak Out from Kinguin Studios. Enter the Haunted Castle and prepare to hunt, hide or prank, causing all kinds of hilarious mayhem while trying to win a deadly game of hide and seek.

Check out the list of new games this week:

  • Killer Klowns from Outer Space: The Game (New release on Steam, June 4)
  • Autopsy Simulator (New release on Steam, June 6)
  • Chornobyl Liquidators (New release on Steam, June 6)
  • Sneak Out (New release on Steam, June 6)
  • Farm Together 2 (Steam)
  • Street Fighter 6 (Steam)
  • XDefiant (Ubisoft)

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

Creativity Accelerated: New RTX-Powered AI Hardware and Software Announced at COMPUTEX

Creativity Accelerated: New RTX-Powered AI Hardware and Software Announced at COMPUTEX

NVIDIA launched NVIDIA Studio at COMPUTEX in 2019. Five years and more than 500 NVIDIA RTX-accelerated apps and games later, it’s bringing AI to even more creators with an array of new RTX technology integrations announced this week at COMPUTEX 2024.

Newly announced NVIDIA GeForce RTX AI laptops — including the ASUS ProArt PX13 and P16 and MSI Stealth 16 AI+ laptops — will feature dedicated RTX Tensor Cores to accelerate AI performance and power-efficient systems-on-a-chip with Windows 11 AI PC features. They join over 200 laptops already accelerated with RTX AI technology.

NVIDIA RTX Video, a collection of technologies including RTX Video Super Resolution and RTX Video HDR that enhance video content streamed in browsers like Google Chrome, Microsoft Edge and Mozilla Firefox, is coming to the free VLC Media Player. And for the first time in June, creators can enjoy these AI-enhanced video effects in popular creative apps like DaVinci Resolve and Wondershare Filmora.

DaVinci Resolve and Cyberlink PowerDirector are adding NVIDIA’s new H.265 Ultra-High-Quality (UHQ) mode, which uses the NVIDIA NVENC to increase high-efficiency video coding (HEVC) and encoding efficiency by 10%.

NVIDIA RTX Remix, a modding platform for remastering classic games with RTX, will soon be made open source, allowing more modders to streamline how assets are replaced and scenes are relit. RTX Remix will also be made accessible via a new REST application programming interface (API) to connect the platform to other modding tools like Blender and Hammer.

Creative apps are continuing to adopt AI-powered NVIDIA DLSS for higher-quality ray-traced visuals in the viewport, with 3D modeling platform Womp being the latest to integrate DLSS 3.5 with Ray Reconstruction.

NVIDIA unveiled Project G-Assist, an RTX-powered AI-assistant technology demo that provides context-aware help for PC games and apps.

The new NVIDIA app beta update adds 120 frames per second AV1 video capture and one-click performance-tuning.

And the latest Game Ready Driver and NVIDIA Studio Driver are available for installation today.

Video Gets the AI Treatment

RTX Video is a collection of real-time, AI-based video enhancements — powered by RTX GPUs equipped with AI Tensor Cores — to dramatically improve video quality.

It includes RTX Video Super Resolution — an upscaling technology that removes compression artifacts and generates additional pixels to improve video sharpness and clarity up to 4K — and RTX Video HDR, which transforms standard dynamic range videos into stunning high-dynamic range on HDR10 displays.

NVIDIA has released the RTX Video software development kit, which allows app developers to add RTX Video effects to creator workflows.

Blackmagic Design’s DaVinci Resolve, a powerful video editing app with color correction, visual effects, graphics and audio post-production capabilities, will be one of the first to integrate RTX Video. The integration is being demoed on the COMPUTEX show floor.

Wondershare Filmora, a video editing app with AI tools and pro-level social media video editing features, will support RTX Video HDR, coming soon.

Wondershare Filmora will soon support RTX Video HDR.

VLC Media Player — an open-source, cross-platform media player, has added RTX Video HDR in its latest beta release, following its recently added support for Mozilla Firefox.

NVIDIA hardware encoders deliver a generational boost in encoding efficiency to HEVC. Performance tested on dual Xeon Gold-6140@2.3GHz running NVIDIA L4 Tensor Core GPUs with driver 520.65.

NVIDIA also released a new UHQ mode in NVENC, a dedicated hardware encoder on RTX GPUs, for the HEVC video compression standard (also known as H.265). The new mode increases compression by 10% without diminishing quality, making NVENC HEVC 34% more efficient than the typically used x264 Medium compression standard.

DaVinci Resolve and Cyberlink PowerDirector video editing software will be adding support for the new UHQ mode in their next updates. Stay tuned for official launch dates.

RTX Remix Open Sources Creator Toolkit

NVIDIA RTX Remix allows modders to easily capture game assets, automatically enhance materials with generative AI tools and create stunning RTX remasters with full ray tracing.

RTX Remix open beta recently added DLSS 3.5 support featuring Ray Reconstruction, an AI model that creates higher-quality images for intensive ray-traced games and apps.

Later this month, NVIDIA will make the RTX Remix Toolkit open source, allowing more modders to streamline how assets are replaced and scenes are relit. The company is also increasing the supported file formats for RTX Remix’s asset ingestor and bolstering RTX Remix’s AI Texture Tools with new models.

The RTX Remix toolkit is now completely open source.

NVIDIA is also making the capabilities of RTX Remix accessible via a new powerful REST API, allowing modders to livelink RTX Remix to other DCC tools such as Blender and modding tools such as Hammer. NVIDIA is also providing an SDK for the RTX Remix runtime to allow modders to deploy RTX Remix’s renderer into other applications and games beyond DirectX 8 and 9 classics.

Catch Some Rays

NVIDIA DLSS 3.5 with Ray Reconstruction enhances ray-traced image quality on NVIDIA RTX and GeForce RTX GPUs by replacing hand-tuned denoisers with an NVIDIA supercomputer-trained AI network that generates higher-quality pixels in between sampled rays.

Previewing content in the viewport, even with high-end hardware, can sometimes offer less-than-ideal image quality, as traditional denoisers require hand-tuning for every scene. With DLSS 3.5, the AI neural network recognizes a wide variety of scenes, producing high-quality preview images and drastically reducing time spent rendering.

The free browser-based 3D modeling platform Womp has added DLSS 3.5 to enhance interactive, photorealistic modeling in the viewport.

DLSS 3.5 with Ray Reconstruction unlocks sharper visuals in the viewport.

Chaos Vantage and D5 Render, two popular professional-grade 3D apps that feature real-time preview modes with ray tracing, have also seen drastic performance increases with DLSS 3.5 — up to a 60% boost from Ray Reconstruction and 4x from all DLSS technologies.

Tools That Accelerate AI Apps

The vast ecosystem of open-source AI models currently available are usually pretrained for general purposes and run in data centers.

To create more effective app-specific AI tools that run on local PCs, NVIDIA has introduced the RTX AI Toolkit — an end-to-end workflow for the customization, optimization and deployment of AI models on RTX AI PCs.

Partners such as Adobe, Topaz and Blackmagic Design are integrating RTX AI Toolkit within their popular creative apps to accelerate AI performance on RTX PCs.

Developers can learn more on the NVIDIA Technical Blog.

Read More

Yotta CEO Sunil Gupta on Supercharging India’s Fast-Growing AI Market

Yotta CEO Sunil Gupta on Supercharging India’s Fast-Growing AI Market

India’s AI market is expected to be massive. Yotta Data Services is setting its sights on supercharging it. In this episode of NVIDIA’s AI Podcast, Sunil Gupta, cofounder, managing director and CEO of Yotta Data Services, speaks with host Noah Kravitz about the company’s Shakti Cloud offering, which provides scalable GPU services for enterprises of all sizes. Yotta is the first Indian cloud services provider in the NVIDIA Partner Network, and its Shakti Cloud is India’s fastest AI supercomputing infrastructure, with 16 exaflops of compute capacity supported by over 16,000 NVIDIA H100 Tensor Core GPUs. Tune in to hear Gupta’s insights on India’s potential as a major AI market and how to balance data center growth with sustainability and energy efficiency.

Stay tuned for more AI Podcast episodes recorded live from GTC.

Time Stamps

1:18: Background on Yotta
2:58: What is Shakti Cloud?
6:44: What does Shakti Cloud mean for India’s tech sector?
10:36: The self-service, scalable capabilities of Shakti Cloud
19:48: Balancing data center growth with sustainability
24:35: Yotta’s work with NVIDIA
27:48: What’s next for Yotta?

You Might Also Like…

Performance AI: Insights From Arthur’s Adam Wenchel – Ep. 221

In this episode of the NVIDIA AI Podcast, recorded live at the GTC 2024, host Noah Kravitz sits down with Adam Wenchel, co-founder and CEO of Arthur, about the company’s technology, which enhances the performance of AI systems across various metrics like accuracy, explainability and fairness.

How the Ohio Supercomputing Center Drives the Future of Computing – Ep. 213

NASCAR races are all about speed, but even the fastest cars need to factor in safety, especially as rules and tracks change. The Ohio Supercomputer Center is ready to help. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Alan Chalker, the director of strategic programs at the OSC, about all things supercomputing.

Replit CEO Amjad Masad on Empowering the Next Billion Software Creators – Ep. 201

Replit aims to empower the next billion software creators. In this week’s episode of NVIDIA’s AI Podcast, host Noah Kraviz dives into a conversation with Replit CEO Amjad Masad about bridging the gap between ideas and software, a task simplified by advances in generative AI.

Rendered.ai CEO Nathan Kundtz on Using AI to Build Better AI – Ep. 177

Data is the fuel that makes artificial intelligence run. Training machine learning and AI systems requires data, but compiling quality real-world data for AI and ML can be difficult and expensive. That’s where synthetic data comes in. In this episode of NVIDIA’s AI Podcast, Nathan Kundtz, founder and CEO of Rendered.ai, speaks with host Noah Kravtiz about a platform as a service for creating synthetic data to train AI models.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

Read More

SAP and NVIDIA Create AI for ‘The Most Valuable Language,’ CEOs Unveil at Sapphire Orlando

SAP and NVIDIA Create AI for ‘The Most Valuable Language,’ CEOs Unveil at Sapphire Orlando

German enterprise cloud leader SAP is harnessing generative AI and industrial digital twins in the development of next-generation enterprise applications for its customers.

At SAP’s Sapphire event today, in Orlando, Florida, NVIDIA and the enterprise software company unveiled generative AI efforts featuring NVIDIA AI Enterprise software to offer two new capabilities for Joule, SAP’s generative AI copilot — SAP Consulting capabilities and ABAP Developer capabilities.

To help streamline the selling and buying process of complex products, SAP is also integrating NVIDIA Omniverse Cloud application programming interfaces, or APIs, into its Intelligent Product Recommendation solution. This will enable salespeople to visualize 3D product digital twins directly in SAP Intelligent Product Recommendation. NVIDIA Omniverse, based on OpenUSD, lets users simulate how configurable equipment might physically fit into and operate in the real world, saving time and costs while improving efficiency and safety.

These new advances come after the companies announced AI-focused collaborations at GTC in March.

“One of the most amazing breakthroughs in large language models and generative AI is that we’ve managed to learn the representation of almost any language,” said NVIDIA founder and CEO Jensen Huang, speaking by video link at the SAP Sapphire conference. “In our company, the most valuable language is CUDA. At SAP, the most valuable language is ABAP.”

“We’ve created some amazing generative APIs that understand the language of SAP, that understand the language of supply chains, that understand the language of ERP systems, so that we have a specialist, an agent and AI to help us to do all the things that we want to do to manage our business,” said SAP CEO Christian Klein.

NVIDIA AI Helps to Accelerate Joule, SAP’s Generative AI Copilot  

Joule, which is embedded across SAP’s portfolio of cloud solutions and applications, allows users to work faster while gaining smarter insights grounded in business data. At SAP Sapphire, the company announced a new consulting capability for Joule leveraging NVIDIA NeMo Retriever microservices.

NVIDIA NeMo Retriever connects AI applications with proprietary data to drive retrieval-augmented generation, or RAG. This brings domain expertise and knowledge of the business to LLMs so that AI copilots and coding assistants can give more accurate and relevant responses.

With the new AI-powered SAP Consulting capabilities — connected to 200,000 pages of SAP learning content, help, product documentation, and community content — tens of thousands of SAP consultants can be more productive, while accessing the latest product and process details when working with end customers. Joule’s SAP Consulting capabilities are already being previewed by internal SAP consultants as well as leading system integrators.

NVIDIA-powered AI is also coming to assist the more than 5 million developers using SAP’s ABAP programming language. SAP’s custom model was trained on over 250 million lines of proprietary ABAP code using NVIDIA HGX H100 systems and will be deployed using NVIDIA NIM inference microservices for optimal runtime performance. Joule’s ABAP Developer capabilities can help developers generate, complete and explain code as well as create software unit tests, which allows them to spend more time building and deploying features rather than manually writing lines of code.

As announced at GTC 2024, this NVIDIA AI Enterprise software, including NeMo and NIM, is expected to be accessible in the generative AI hub in SAP AI Core for developers and customers to leverage.

Omniverse to Help Optimize Sales Processes for Complex Products

Using NVIDIA AI and Omniverse, SAP is building the future of connectivity and is reimagining key business processes, including the sales process for manufacturers.

SAP Intelligent Product Recommendation software utilizes generative AI to analyze requirements described in natural language and generate recommendations for the most suitable configurable products. With Omniverse Cloud APIs integrated with the application, users can then interact with physically accurate models of complex products in a digital twin of the environment where they will be installed. This lets sales teams generate quotes more quickly and helps sales representatives recommend the best solutions for their customers’ needs.

In addition, SAP Intelligent Product Recommendation can use NVIDIA AI to analyze text from multiple sources, including emails and notes captured in customer relationship management software or from product specifications documents. This lets the software generate  recommendations based on key, custom parameters, such as power requirements, lead times and expected carbon footprint.

Watch the replay of the Sapphire Orlando keynote here.

Read More

NVIDIA and Cisco Weave Fabric for Generative AI

NVIDIA and Cisco Weave Fabric for Generative AI

Building and deploying AI applications at scale requires a new class of computing infrastructure — one that can handle the massive amounts of data, compute power and networking bandwidth needed by generative AI models.

To better ensure these models perform optimally and efficiently, NVIDIA is teaming with Cisco to enable enterprise generative AI infrastructure.

Cisco’s new Nexus HyperFabric AI cluster solution, developed in collaboration with NVIDIA, provides a path for enterprises to operationalize generative AI. Cisco HyperFabric is an enterprise-ready, end-to-end infrastructure solution to scale generative AI workloads. It combines NVIDIA accelerated computing and AI software with Cisco AI-native networking and a robust VAST Data Platform.

“Enterprise applications are transforming into generative AI applications, significantly increasing data processing requirements and overall infrastructure complexity,” said Kevin Wollenweber, senior vice president and general manager of data center and provider connectivity at Cisco. “Together, Cisco and NVIDIA are advancing HyperFabric to advance generative AI for the world’s enterprises so they can use their data and domain expertise to transform productivity and insight.”

Powering a Full-Stack AI Fabric

Foundational to the solution are NVIDIA Tensor Core GPUs, which provide the accelerated computing needed to process massive datasets. The solution utilizes NVIDIA AI Enterprise, a cloud-native software platform that acts as the operating system for enterprise AI. NVIDIA AI Enterprise streamlines the development and deployment of production-grade AI copilots and other generative AI applications, ensuring optimized performance, security and application programming interface stability.

Included with NVIDIA AI Enterprise, NVIDIA NIM inference microservices accelerate the deployment of foundation models while ensuring data security. NIM microservices are designed to bridge the gap between complex AI development and enterprise operational needs. As organizations across various industries embark on their AI journeys, the combination of NVIDIA NIM and the Cisco Nexus HyperFabric AI cluster supports the entire process, from ideation to development and deployment of production-scale AI applications.

The Cisco Nexus HyperFabric AI cluster solution integrates NVIDIA Tensor Core GPUs and NVIDIA BlueField-3 SuperNICs and DPUs to enhance system performance and security. The SuperNICs offer advanced network capabilities, ensuring seamless, high-speed connectivity across the infrastructure. BlueField-3 DPUs offload, accelerate and isolate the infrastructure services, creating a more efficient AI solution.

BlueField-3 DPUs can also run security services like the Cisco Hypershield solution. It enables an AI-native, hyperdistributed security architecture, where security shifts closer to the workloads needing protection. Cisco Hypershield is another notable area of collaboration between the companies, focusing on creating AI-powered security solutions.

Join NVIDIA at Cisco Live

Learn more about how Cisco and NVIDIA power generative AI at Cisco Live — running through June 6 in Las Vegas — where the companies will showcase NVIDIA AI technologies at the Cisco AI Hub and share best practices for enterprises to get started with AI.

Attend these sessions to discover how to accelerate generative AI with NVIDIA, Cisco and other ecosystem partners:

  • Keynote Deep Dive: “Harness a Bold New Era: Transform Data Center and Service Provider Connectivity” with NVIDIA’s Kevin Deierling and Cisco’s Jonathan Davidson, Kevin Wollenweber, Jeremy Foster and Bill Gartner — Wednesday, June 5, from 1-2 p.m. PT
  • AI Hub Theater Presentation: “Accelerate, Deploy Generative AI Anywhere With NVIDIA Inference Microservices” with Marty Jain, vice president of sales and business development at NVIDIA — Tuesday, June 4, from 2:15-2:45 p.m. PT
  • WWT AI Hub Booth: Thought leadership interview with NVIDIA’s Jain and WWT Vice President of Cloud, Infrastructure and AI Solutions Neil Anderson — Wednesday, June 5, from 10-11 a.m. PT
  • NetApp Theater: “Accelerating Gen AI With NVIDIA Inference Microservices on FlexPod” with Sicong Ji, strategic platforms and solutions lead at NVIDIA — Wednesday, June 5, from 1:30-1:40 p.m. PT
  • Pure Storage Theater: “Accelerating Gen AI With NVIDIA Inference Microservices on FlashStack” with Joslyn Shakur, sales alliance manager at NVIDIA — Wednesday, June 5, from 2-2:10 p.m. PT ​

Sign up for generative AI news to stay up to date on the latest breakthroughs, developments and technologies.

Read More

Digital Bank Debunks Financial Fraud With Generative AI

Digital Bank Debunks Financial Fraud With Generative AI

European neobank bunq is debunking financial fraudsters with the help of NVIDIA accelerated computing and AI.

Dubbed “the bank of the free,” bunq offers online banking anytime, anywhere. Through the bunq app, users can handle all their financial needs exclusively online, without needing to visit a physical bank.

With more than 12 million customers and 8 billion euros’ worth of deposits made to date, bunq has become one of the largest neobanks in the European Union. Founded in 2012, it was the first bank to obtain a European banking license in over three decades.

To meet growing customer needs, bunq turned to generative AI to help detect fraud and money laundering. Its automated transaction-monitoring system, powered by NVIDIA accelerated computing, greatly improved its training speed.

“AI has enormous potential to help humanity in so many ways, and this is a great example of how human intelligence can be coupled with AI,” said Ali el Hassouni, head of data and AI at bunq.

Faster Fraud Detection

Financial fraud is more prevalent than ever, el Hassouni said in a recent talk at NVIDIA GTC.

Traditional transaction-monitoring systems are rules based, meaning algorithms flag suspicious transactions according to a set of criteria that determine if an activity presents risk of fraud or money laundering. These criteria must be manually set, resulting in high false-positive rates and making such systems labor intensive and difficult to scale.

Instead, using supervised and unsupervised learning, bunq’s AI-powered transaction-monitoring system is completely automated and easily scalable.

Bunq achieved this using NVIDIA GPUs, which accelerated its data processing pipeline more than 5x.

In addition, compared with previous methods, bunq trained its fraud-detection model nearly 100x faster using the open-source NVIDIA RAPIDS suite of GPU-accelerated data science libraries.

RAPIDS is part of the NVIDIA AI Enterprise software platform, which accelerates data science pipelines and streamlines the development and deployment of production-grade generative AI applications.

“We chose NVIDIA’s advanced, GPU-optimized software, as it enables us to use larger datasets and speed the training of new models — sometimes by an order of magnitude — resulting in improved model accuracy and reduced false positives,” said el Hassouni.

AI Across the Bank

Bunq is seeking to tap AI’s potential across its operations.

“We’re constantly looking for new ways to apply AI for the benefit of our users,” el Hassouni said. “More than half of our user tickets are handled automatically. We also use AI to spot fake IDs when onboarding new users, automate our marketing efforts and much more.”

Finn, a personal AI assistant available to bunq customers, is powered by the company’s proprietary large language model and generative AI. It can answer user questions like, “How much did I spend on groceries last month?” and “What’s the name of the Indian restaurant I ate at last week?”

The company is exploring NVIDIA NeMo Retriever, a collection of generative AI microservices available in early access, to further improve Finn’s accuracy. NeMo Retriever is a part of NVIDIA NIM inference microservices, which provide models as optimized containers, available with NVIDIA AI Enterprise.

“Our initial testing of NeMo Retriever embedding NIM has been extremely positive, and our collaboration with NVIDIA on LLMs is poised to help us to take Finn to the next level and enhance customer experience,” el Hassouni said. 

Plus, for the digital bank’s marketing efforts, AI helps analyze consumer engagement metrics to inform future campaigns.

“We’re creating a borderless banking experience for our users, always keeping them at the heart of everything we do,” el Hassouni said.

Watch bunq’s NVIDIA GTC session on demand and subscribe to NVIDIA financial services news

Learn more about AI and financial services at Money20/20 Europe, a fintech conference running June 4-6 in Amsterdam, where NVIDIA will host an AI Summit in collaboration with AWS, and where bunq will present on a panel about AI for fraud detection.

Read More

‘Accelerate Everything,’ NVIDIA CEO Says Ahead of COMPUTEX

‘Accelerate Everything,’ NVIDIA CEO Says Ahead of COMPUTEX

“Generative AI is reshaping industries and opening new opportunities for innovation and growth,” NVIDIA founder and CEO Jensen Huang said in an address ahead of this week’s COMPUTEX technology conference in Taipei.

“Today, we’re at the cusp of a major shift in computing,” Huang told the audience, clad in his trademark black leather jacket. “The intersection of AI and accelerated computing is set to redefine the future.”

Huang spoke ahead of one of the world’s premier technology conferences to an audience of more than 6,500 industry leaders, press, entrepreneurs, gamers, creators and AI enthusiasts gathered at the glass-domed National Taiwan University Sports Center set in the verdant heart of Taipei.

The theme: NVIDIA accelerated platforms are in full production, whether through AI PCs and consumer devices featuring a host of NVIDIA RTX-powered capabilities or enterprises building and deploying AI factories with NVIDIA’s full-stack computing platform.

“The future of computing is accelerated,” Huang said. “With our innovations in AI and accelerated computing, we’re pushing the boundaries of what’s possible and driving the next wave of technological advancement.”
 

‘One-Year Rhythm’

More’s coming, with Huang revealing a roadmap for new semiconductors that will arrive on a one-year rhythm. Revealed for the first time, the Rubin platform will succeed the upcoming Blackwell platform, featuring new GPUs, a new Arm-based CPU — Vera — and advanced networking with NVLink 6, CX9 SuperNIC and the X1600 converged InfiniBand/Ethernet switch.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and push everything to technology limits,” Huang explained.

NVIDIA’s creative team used AI tools from members of the NVIDIA Inception startup program, built on NVIDIA NIM and NVIDIA’s accelerated computing, to create the COMPUTEX keynote. Packed with demos, this showcase highlighted these innovative tools and the transformative impact of NVIDIA’s technology.

‘Accelerated Computing Is Sustainable Computing’

NVIDIA is driving down the cost of turning data into intelligence, Huang explained as he began his talk.

“Accelerated computing is sustainable computing,” he emphasized, outlining how the combination of GPUs and CPUs can deliver up to a 100x speedup while only increasing power consumption by a factor of three, achieving 25x more performance per Watt over CPUs alone.

“The more you buy, the more you save,” Huang noted, highlighting this approach’s significant cost and energy savings.

Industry Joins NVIDIA to Build AI Factories to Power New Industrial Revolution

Leading computer manufacturers, particularly from Taiwan, the global IT hub, have embraced NVIDIA GPUs and networking solutions. Top companies include ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, Pegatron, QCT, Supermicro, Wistron and Wiwynn, which are creating cloud, on-premises and edge AI systems.

The NVIDIA MGX modular reference design platform now supports Blackwell, including the GB200 NVL2 platform, designed for optimal performance in large language model inference, retrieval-augmented generation and data processing.

AMD and Intel are supporting the MGX architecture with plans to deliver, for the first time, their own CPU host processor module designs. Any server system builder can use these reference designs to save development time while ensuring consistency in design and performance.

Next-Generation Networking with Spectrum-X

In networking, Huang unveiled plans for the annual release of Spectrum-X products to cater to the growing demand for high-performance Ethernet networking for AI.

NVIDIA Spectrum-X, the first Ethernet fabric built for AI, enhances network performance by 1.6x more than traditional Ethernet fabrics. It accelerates the processing, analysis and execution of AI workloads and, in turn, the development and deployment of AI solutions.

CoreWeave, GMO Internet Group, Lambda, Scaleway, STPX Global and Yotta are among the first AI cloud service providers embracing Spectrum-X to bring extreme networking performance to their AI infrastructures.

NVIDIA NIM to Transform Millions Into Gen AI Developers

With NVIDIA NIM, the world’s 28 million developers can now easily create generative AI applications. NIM — inference microservices that provide models as optimized containers — can be deployed on clouds, data centers or workstations.

NIM also enables enterprises to maximize their infrastructure investments. For example, running Meta Llama 3-8B in a NIM produces up to 3x more generative AI tokens on accelerated infrastructure than without NIM.


Nearly 200 technology partners — including Cadence, Cloudera, Cohesity, DataStax, NetApp, Scale AI, and Synopsys — are integrating NIM into their platforms to speed generative AI deployments for domain-specific applications, such as copilots, code assistants, digital human avatars and more. Hugging Face is now offering NIM — starting with Meta Llama 3.

“Today we just posted up in Hugging Face the Llama 3 fully optimized, it’s available there for you to try. You can even take it with you,” Huang said. “So you could run it in the cloud, run it in any cloud, download this container, put it into your own data center, and you can host it to make it available for your customers.”

NVIDIA Brings AI Assistants to Life With GeForce RTX AI PCs

NVIDIA’s RTX AI PCs, powered by RTX technologies, are set to revolutionize consumer experiences with over 200 RTX AI laptops and more than 500 AI-powered apps and games.

The RTX AI Toolkit and newly available PC-based NIM inference microservices for the NVIDIA ACE digital human platform underscore NVIDIA’s commitment to AI accessibility.

Project G-Assist, an RTX-powered AI assistant technology demo, was also announced, showcasing context-aware assistance for PC games and apps.

And Microsoft and NVIDIA are collaborating to help developers bring new generative AI capabilities to their Windows native and web apps with easy API access to RTX-accelerated SLMs that enable RAG capabilities that run on-device as part of Windows Copilot Runtime.

NVIDIA Robotics Adopted by Industry Leaders

NVIDIA is spearheading the $50 trillion industrial digitization shift, with sectors embracing autonomous operations and digital twins — virtual models that enhance efficiency and cut costs. Through its Developer Program, NVIDIA offers access to NIM, fostering AI innovation.

Taiwanese manufacturers are transforming their factories using NVIDIA’s technology, with Huang showcasing Foxconn’s use of NVIDIA Omniverse, Isaac and Metropolis to create digital twins, combining vision AI and robot development tools for enhanced robotic facilities.

“The next wave of AI is physical AI. AI that understands the laws of physics, AI that can work among us,” Huang said, emphasizing the importance of robotics and AI in future developments.

The NVIDIA Isaac platform provides a robust toolkit for developers to build AI robots, including AMRs, industrial arms and humanoids, powered by AI models and supercomputers like Jetson Orin and Thor.

“Robotics is here. Physical AI is here. This is not science fiction, and it’s being used all over Taiwan. It’s just really, really exciting,” Huang added.

Global electronics giants are integrating NVIDIA’s autonomous robotics into their factories, leveraging simulation in Omniverse to test and validate this new wave of AI for the physical world. This includes over 5 million preprogrammed robots worldwide.

“All the factories will be robotic. The factories will orchestrate robots, and those robots will be building products that are robotic,” Huang explained.

Huang emphasized NVIDIA Isaac’s role in boosting factory and warehouse efficiency, with global leaders like BYD Electronics, Siemens, Teradyne Robotics and Intrinsic adopting its advanced libraries and AI models.

NVIDIA AI Enterprise on the IGX platform, with partners like ADLINK, Advantech and ONYX, delivers edge AI solutions meeting strict regulatory standards, essential for medical technology and other industries.

Huang ended his keynote on the same note he began it on, paying tribute to Taiwan and NVIDIA’s many partners there. “Thank you,” Huang said. “I love you guys.”

Read More

KServe Providers Dish Up NIMble Inference in Clouds and Data Centers

KServe Providers Dish Up NIMble Inference in Clouds and Data Centers

Deploying generative AI in the enterprise is about to get easier than ever.

NVIDIA NIM, a set of generative AI inference microservices, will work with KServe, open-source software that automates putting AI models to work at the scale of a cloud computing application.

The combination ensures generative AI can be deployed like any other large enterprise application. It also makes NIM widely available through platforms from dozens of companies, such as Canonical, Nutanix and Red Hat.

The integration of NIM on KServe extends NVIDIA’s technologies to the open-source community, ecosystem partners and customers. Through NIM, they can all access the performance, support and security of the NVIDIA AI Enterprise software platform with an API call — the push-button of modern programming.

Serving AI on Kubernetes

KServe got its start as part of Kubeflow, a machine learning toolkit based on Kubernetes, the open-source system for deploying and managing software containers that hold all the components of large distributed applications.

As Kubeflow expanded its work on AI inference, what became KServe was born and ultimately evolved into its own open-source project.

Many companies have contributed to and adopted the KServe software that runs today at companies including AWS, Bloomberg, Canonical, Cisco, Hewlett Packard Enterprise, IBM, Red Hat, Zillow and NVIDIA.

Under the Hood With KServe

KServe is essentially an extension of Kubernetes that runs AI inference like a powerful cloud application. It uses a standard protocol, runs with optimized performance and supports PyTorch, Scikit-learn, TensorFlow and XGBoost without users needing to know the details of those AI frameworks.

The software is especially useful these days, when new large language models (LLMs) are emerging rapidly.

KServe lets users easily go back and forth from one model to another, testing which one best suits their needs. And when an updated version of a model gets released, a KServe feature called “canary rollouts” automates the job of carefully validating and gradually deploying it into production.

Another feature, GPU autoscaling, efficiently manages how models are deployed as demand for a service ebbs and flows, so customers and service providers have the best possible experience.

An API Call to Generative AI

The goodness of KServe will now be available with the ease of NVIDIA NIM.

With NIM, a simple API call takes care of all the complexities. Enterprise IT admins get the metrics they need to ensure their application is running with optimal performance and efficiency, whether it’s in their data center or on a remote cloud service — even if they change the AI models they’re using.

NIM lets IT professionals become generative AI pros, transforming their company’s operations. That’s why a host of enterprises such as Foxconn and ServiceNow are deploying NIM microservices.

NIM Rides Dozens of Kubernetes Platforms

Thanks to its integration with KServe, users will be able access NIM on dozens of enterprise platforms such as Canonical’s Charmed KubeFlow and Charmed Kubernetes, Nutanix GPT-in-a-Box 2.0, Red Hat’s OpenShift AI and many others.

“Red Hat has been working with NVIDIA to make it easier than ever for enterprises to deploy AI using open source technologies,” said KServe contributor Yuan Tang, a principal software engineer at Red Hat. “By enhancing KServe and adding support for NIM in Red Hat OpenShift AI, we’re able to provide streamlined access to NVIDIA’s generative AI platform for Red Hat customers.”

“Through the integration of NVIDIA NIM inference microservices with Nutanix GPT-in-a-Box 2.0, customers will be able to build scalable, secure, high-performance generative AI applications in a consistent way, from the cloud to the edge,” said the vice president of engineering at Nutanix, Debojyoti Dutta, whose team contributes to KServe and Kubeflow.

“As a company that also contributes significantly to KServe, we’re pleased to offer NIM through Charmed Kubernetes and Charmed Kubeflow,” said Andreea Munteanu, MLOps product manager at Canonical. “Users will be able to access the full power of generative AI, with the highest performance, efficiency and ease thanks to the combination of our efforts.”

Dozens of other software providers can feel the benefits of NIM simply because they include KServe in their offerings.

Serving the Open-Source Community

NVIDIA has a long track record on the KServe project. As noted in a recent technical blog, KServe’s Open Inference Protocol is used in NVIDIA Triton Inference Server, which helps users run many AI models simultaneously across many GPUs, frameworks and operating modes.

With KServe, NVIDIA focuses on use cases that involve running one AI model at a time across many GPUs.

As part of the NIM integration, NVIDIA plans to be an active contributor to KServe, building on its portfolio of contributions to open-source software that includes Triton and TensorRT-LLM. NVIDIA is also an active member of the Cloud Native Computing Foundation, which supports open-source code for generative AI and other projects.

Try the NIM API on the NVIDIA API Catalog using the Llama 3 8B or Llama 3 70B LLM models today. Hundreds of NVIDIA partners worldwide are using NIM to deploy generative AI.

Watch NVIDIA founder and CEO Jensen Huang’s COMPUTEX keynote to get the latest on AI and more.

Read More