Lightweight Champ: NVIDIA Releases Small Language Model With State-of-the-Art Accuracy

Developers of generative AI typically face a tradeoff between model size and accuracy. But a new language model released by NVIDIA delivers the best of both, providing state-of-the-art accuracy in a compact form factor.

Mistral-NeMo-Minitron 8B — a miniaturized version of the open Mistral NeMo 12B model released by Mistral AI and NVIDIA last month — is small enough to run on an NVIDIA RTX-powered workstation while still excelling across multiple benchmarks for AI-powered chatbots, virtual assistants, content generators and educational tools. Minitron models are distilled by NVIDIA using NVIDIA NeMo, an end-to-end platform for developing custom generative AI.

“We combined two different AI optimization methods — pruning to shrink Mistral NeMo’s 12 billion parameters into 8 billion, and distillation to improve accuracy,” said Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. “By doing so, Mistral-NeMo-Minitron 8B delivers comparable accuracy to the original model at lower computational cost.”

Unlike their larger counterparts, small language models can run in real time on workstations and laptops. This makes it easier for organizations with limited resources to deploy generative AI capabilities across their infrastructure while optimizing for cost, operational efficiency and energy use. Running language models locally on edge devices also delivers security benefits, since data doesn’t need to be passed to a server from an edge device.

Developers can get started with Mistral-NeMo-Minitron 8B packaged as an NVIDIA NIM microservice with a standard application programming interface (API) — or they can download the model from Hugging Face. A downloadable NVIDIA NIM, which can be deployed on any GPU-accelerated system in minutes, will be available soon.
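
As a rough sketch of the Hugging Face route, the snippet below loads the checkpoint with the Transformers library and generates a short completion. The model identifier and the assumption of a CUDA GPU with enough memory for bfloat16 weights are ours, not taken from NVIDIA's documentation.

```python
# Minimal sketch: load the Hugging Face checkpoint and generate text.
# The model ID and hardware assumptions (CUDA GPU, ~16 GB for bf16 weights)
# are illustrative, not official guidance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Mistral-NeMo-Minitron-8B-Base"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain pruning and distillation in one sentence:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```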

State-of-the-Art for 8 Billion Parameters

For a model of its size, Mistral-NeMo-Minitron 8B leads on nine popular benchmarks for language models. These benchmarks cover a variety of tasks including language understanding, common sense reasoning, mathematical reasoning, summarization, coding and ability to generate truthful answers.

Packaged as an NVIDIA NIM microservice, the model is optimized for low latency, which means faster responses for users, and high throughput, which corresponds to higher computational efficiency in production.

In some cases, developers may want an even smaller version of the model to run on a smartphone or an embedded device like a robot. To do so, they can download the 8-billion-parameter model and, using NVIDIA AI Foundry, prune and distill it into a smaller, optimized neural network customized for enterprise-specific applications.

The AI Foundry platform and service offers developers a full-stack solution for creating a customized foundation model packaged as a NIM microservice. It includes popular foundation models, the NVIDIA NeMo platform and dedicated capacity on NVIDIA DGX Cloud. Developers using NVIDIA AI Foundry can also access NVIDIA AI Enterprise, a software platform that provides security, stability and support for production deployments.

Since the original Mistral-NeMo-Minitron 8B model starts with a baseline of state-of-the-art accuracy, versions downsized using AI Foundry would still offer users high accuracy with a fraction of the training data and compute infrastructure.

Harnessing the Perks of Pruning and Distillation 

To achieve high accuracy with a smaller model, the team used a process that combines pruning and distillation. Pruning downsizes a neural network by removing the model weights that contribute the least to accuracy. During distillation, the team then retrained the pruned model on a small dataset to recover the accuracy lost in pruning.
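
As a purely illustrative sketch of those two ideas, not NVIDIA's actual structured-pruning and distillation recipe, the snippet below zeroes out the smallest-magnitude weights in a PyTorch model and defines a standard knowledge-distillation loss that pulls the pruned student toward the teacher's output distribution.

```python
# Illustrative only: magnitude pruning plus a distillation loss in PyTorch.
# NVIDIA's actual recipe uses structured pruning of the 12B Mistral NeMo model.
import torch
import torch.nn.functional as F

def magnitude_prune(model: torch.nn.Module, sparsity: float = 0.3) -> None:
    """Zero out the smallest-magnitude weights in every linear layer."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            w = module.weight.data
            k = max(1, int(sparsity * w.numel()))
            threshold = w.abs().flatten().kthvalue(k).values
            w[w.abs() < threshold] = 0.0

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between softened teacher and student token distributions."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2
```

In practice, the pruned student would then be retrained on the distillation dataset against the frozen teacher's logits to recover accuracy.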

The end result is a smaller, more efficient model with the predictive accuracy of its larger counterpart.

This technique means that only a fraction of the original dataset is required to train each additional model within a family of related models. Pruning and distilling a larger model in this way can cut compute costs by up to 40x compared with training a smaller model from scratch.

Read the NVIDIA Technical Blog and a technical report for details.

NVIDIA also announced this week Nemotron-Mini-4B-Instruct, another small language model optimized for low memory usage and faster response times on NVIDIA GeForce RTX AI PCs and laptops. The model is available as an NVIDIA NIM microservice for cloud and on-device deployment and is part of NVIDIA ACE, a suite of digital human technologies that provide speech, intelligence and animation powered by generative AI.

Experience both models as NIM microservices from a browser or an API at ai.nvidia.com.

See notice regarding software product information.

SLMming Down Latency: How NVIDIA’s First On-Device Small Language Model Makes Digital Humans More Lifelike

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC and workstation users.

At Gamescom this week, NVIDIA announced that NVIDIA ACE — a suite of technologies for bringing digital humans to life with generative AI — now includes the company’s first on-device small language model (SLM), powered locally by RTX AI.

The model, called Nemotron-4 4B Instruct, provides better role-play, retrieval-augmented generation and function-calling capabilities, so game characters can more intuitively comprehend player instructions, respond to gamers, and perform more accurate and relevant actions.

Available as an NVIDIA NIM microservice for cloud and on-device deployment by game developers, the model is optimized for low memory usage, offering faster response times and providing developers a way to take advantage of over 100 million GeForce RTX-powered PCs and laptops and NVIDIA RTX-powered workstations.

The SLM Advantage

An AI model’s accuracy and performance depend on the size and quality of the dataset used for training. Large language models are trained on vast amounts of data, but are typically general-purpose and contain excess information for most uses.

SLMs, on the other hand, focus on specific use cases. So even with less data, they’re capable of delivering more accurate responses, more quickly — critical elements for conversing naturally with digital humans.

Nemotron-4 4B was first distilled from the larger Nemotron-4 15B LLM. This process requires the smaller model, called a “student,” to mimic the outputs of the larger model, appropriately called a “teacher.” During this process, noncritical weights of the student model are pruned, or removed, to shrink its parameter count. Then, the SLM is quantized, which reduces the precision of the model’s weights.
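
The quantization step can be pictured with a simple symmetric int8 scheme, shown below purely as an illustration rather than the exact method used for Nemotron-4 4B: each weight tensor is mapped to 8-bit integers plus a single scale factor, roughly quartering its memory footprint compared with 32-bit floats.

```python
# Illustrative symmetric int8 weight quantization; not the exact scheme used for Nemotron-4 4B.
import torch

def quantize_int8(weight: torch.Tensor):
    """Map a float tensor to int8 values plus a per-tensor scale factor."""
    scale = weight.abs().max() / 127.0
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float tensor for computation."""
    return q.to(torch.float32) * scale

w = torch.randn(4096, 4096)            # stand-in for one weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print((w - w_hat).abs().max())         # quantization error stays small relative to the scale
```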

With fewer parameters and less precision, Nemotron-4 4B has a lower memory footprint and faster time to first token — how quickly a response begins — than the larger Nemotron-4 LLM while still maintaining a high level of accuracy due to distillation. Its smaller memory footprint also means games and apps that integrate the NIM microservice can run locally on more of the GeForce RTX AI PCs and laptops and NVIDIA RTX AI workstations that consumers own today.

This new, optimized SLM is also purpose-built with instruction tuning, a technique for fine-tuning models on instructional prompts to better perform specific tasks. This can be seen in Mecha BREAK, a video game in which players can converse with a mechanic game character and instruct it to switch and customize mechs.

ACEs Up

ACE NIM microservices allow developers to deploy state-of-the-art generative AI models through the cloud or on RTX AI PCs and workstations to bring AI to their games and applications. With ACE NIM microservices, non-playable characters (NPCs) can dynamically interact and converse with players in the game in real time.

ACE consists of key AI models for speech-to-text, language, text-to-speech and facial animation. It’s also modular, allowing developers to choose the NIM microservice needed for each element in their particular process.

NVIDIA Riva automatic speech recognition (ASR) processes a user’s spoken language and uses AI to deliver a highly accurate transcription in real time. The technology builds fully customizable conversational AI pipelines using GPU-accelerated multilingual speech and translation microservices. Other supported ASRs include OpenAI’s Whisper, an open-source neural net that approaches human-level robustness and accuracy on English speech recognition.

Once translated to digital text, the transcription goes into an LLM — such as Google’s Gemma, Meta’s Llama 3 or now NVIDIA Nemotron-4 4B — to start generating a response to the user’s original voice input.

Next, another piece of Riva technology — text-to-speech — generates an audio response. ElevenLabs’ proprietary AI speech and voice technology is also supported and has been demoed as part of ACE.

Finally, NVIDIA Audio2Face (A2F) generates facial expressions that can be synced to dialogue in many languages. With the microservice, digital avatars can display dynamic, realistic emotions streamed live or baked in during post-processing.

The AI network automatically animates face, eyes, mouth, tongue and head motions to match the selected emotional range and level of intensity. And A2F can automatically infer emotion directly from an audio clip.

Finally, the full character or digital human is animated in a renderer, like Unreal Engine or the NVIDIA Omniverse platform.
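
One turn of that loop can be summarized in a short sketch. The helper functions below are hypothetical placeholders standing in for the actual Riva, language model and Audio2Face microservice calls, each of which has its own client API.

```python
# Hypothetical sketch of one turn of an ACE-style digital human pipeline.
# Each stub stands in for a real microservice call; none are actual NVIDIA APIs.

def riva_transcribe(audio_chunk: bytes) -> str:
    return "switch to a heavier mech"            # placeholder for Riva ASR / Whisper

def generate_reply(text: str) -> str:
    return f"Roger that: {text}."                # placeholder for an LLM or SLM such as Nemotron-4 4B Instruct

def synthesize_speech(reply: str) -> bytes:
    return reply.encode()                        # placeholder for Riva TTS / ElevenLabs

def audio2face_animate(speech: bytes) -> list:
    return [0.0] * 52                            # placeholder facial animation weights from A2F

def handle_player_utterance(audio_chunk: bytes) -> dict:
    text = riva_transcribe(audio_chunk)          # 1. speech -> text
    reply = generate_reply(text)                 # 2. text -> response
    speech = synthesize_speech(reply)            # 3. response -> audio
    animation = audio2face_animate(speech)       # 4. audio -> facial animation
    return {"text": reply, "audio": speech, "animation": animation}

print(handle_player_utterance(b"...")["text"])
```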

AI That’s NIMble

In addition to its modular support for various NVIDIA-powered and third-party AI models, ACE allows developers to run inference for each model in the cloud or locally on RTX AI PCs and workstations.

The NVIDIA AI Inference Manager software development kit allows for hybrid inference based on various needs such as experience, workload and costs. It streamlines AI model deployment and integration for PC application developers by preconfiguring the PC with the necessary AI models, engines and dependencies. Apps and games can then orchestrate inference seamlessly across a PC or workstation to the cloud.
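
Conceptually, the routing decision resembles the simple policy sketched below. This is purely illustrative and does not use the actual NVIDIA AI Inference Manager API; the thresholds and field names are assumptions.

```python
# Purely illustrative hybrid-inference policy; not the AI Inference Manager API.
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    model_vram_gb: float      # memory the model needs to run locally
    latency_budget_ms: float  # how quickly the app needs a response

def choose_backend(req: InferenceRequest, local_free_vram_gb: float) -> str:
    """Pick 'local' or 'cloud' for a single request (assumed thresholds)."""
    if req.model_vram_gb <= local_free_vram_gb and req.latency_budget_ms < 200:
        return "local"   # on-device RTX inference: lowest latency, no round trip
    return "cloud"       # offload large models or relaxed-latency work to the cloud

print(choose_backend(InferenceRequest(model_vram_gb=4, latency_budget_ms=100), local_free_vram_gb=8))
```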

ACE NIM microservices run locally on RTX AI PCs and workstations, as well as in the cloud. Current microservices running locally include Audio2Face, in the Covert Protocol tech demo, and the new Nemotron-4 4B Instruct and Whisper ASR in Mecha BREAK.

To Infinity and Beyond

Digital humans go far beyond NPCs in games. At last month’s SIGGRAPH conference, NVIDIA previewed “James,” an interactive digital human that can connect with people using emotions, humor and more. James is based on a customer-service workflow using ACE.

Interact with James at ai.nvidia.com.

Changes in communication methods between humans and technology over the decades eventually led to the creation of digital humans. The future of the human-computer interface will have a friendly face and require no physical inputs.

Digital humans drive more engaging and natural interactions. According to Gartner, 80% of conversational offerings will embed generative AI by 2025, and 75% of customer-facing applications will have conversational AI with emotion. Digital humans will transform multiple industries and use cases beyond gaming, including customer service, healthcare, retail, telepresence and robotics.

Users can get a glimpse of this future now by interacting with James in real time at ai.nvidia.com.

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

How Snowflake Is Unlocking the Value of Data With Large Language Models

Snowflake is using AI to help enterprises transform data into insights and applications. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz and Baris Gultekin, head of AI at Snowflake, discuss how the company’s AI Data Cloud platform enables customers to access and manage data at scale. By separating the storage of data from compute, Snowflake has allowed organizations across the world to connect via cloud technology and work on a unified platform — eliminating data silos and streamlining collaborative workflows.

Time Stamps

1:45: What does Snowflake do?
3:18: Snowflake’s AI and data strategies — building a platform with natural language analysis
5:30: How to efficiently access large language models with Snowflake Cortex
11:49: Snowflake’s open-source LLM: Arctic
16:18: Gultekin’s journey in AI and data science
23:05: The AI industry in three to five years — real-world applications of Snowflake technology
27:54: Gultekin’s advice for professionals interested in AI

You Might Also Like:

How Roblox Uses Generative AI to Enhance User Experiences – Ep. 227

Roblox is a colorful online platform reimagining the way people come together. Anupam Singh, vice president of AI and growth engineering at Roblox, discusses how the company uses generative AI to enhance virtual experiences and bolster inclusivity and user safety.

NVIDIA’s Jim Fan Delves Into Large Language Models and Their Industry Impact – Ep. 204

Most know Minecraft as the popular blocky sandbox game, but for Jim Fan, senior AI scientist at NVIDIA, Minecraft was the perfect place to test the decision-making agency of AI models. Fan discusses how he used large language models to research open-ended AI agents and create Voyager, an AI bot built with GPT-4 that can autonomously play Minecraft.

Media.Monks’ Lewis Smithingham on Enhancing Media and Marketing With AI – Ep. 222

Media.Monks’ platform Wormhole streamlines marketing and content creation workflows with AI-powered insights. Lewis Smithingham, senior vice president of innovation and special operations at Media.Monks, addresses AI’s potential in the entertainment and advertisement industries.

NVIDIA’s Annamalai Chockalingam on the Rise of LLMs – Ep. 206

LLMs are in the spotlight, capable of tasks like generation, summarization, translation, instruction and chatting. Annamalai Chockalingam, senior product manager of developer marketing at NVIDIA, discusses how a combination of these modalities and actions can build applications to solve any problem.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

High-Tech Highways: India Uses NVIDIA Accelerated Computing to Ease Tollbooth Traffic

India is home to the globe’s second-largest road network, spanning nearly 4 million miles, and has over a thousand tollbooths, most of them run manually.

Traditional booths like these, wherever in the world they’re deployed, can contribute to massive traffic delays, long commute times and serious road congestion.

To help automate tollbooths across India, Calsoft, an Indian-American technology company, helped a client implement a broad range of NVIDIA technologies integrated with the country’s dominant payment system, known as the unified payments interface, or UPI.

Manual tollbooths demand more time and labor compared to automated ones. However, automating India’s toll systems faces an extra complication: the diverse range of license plates.

India’s non-standardized plates pose a significant challenge to the accuracy of automatic number plate recognition (ANPR) systems. Any implementation would need to address these plate variations, which include divergent color, sizing, font styles and placement upon vehicles, as well as many different languages.

The solution Calsoft helped build automatically reads passing vehicle plates and charges the associated driver’s UPI account. This approach reduces the need for manual toll collection and is a massive step toward addressing traffic in the region.

Automation in Action

As part of a pilot program, this solution has been deployed in several leading metropolitan cities. The solution provides about 95% accuracy in its ability to read plates through the use of an ANPR pipeline that detects and classifies the plates as they roll through tollbooths.

NVIDIA’s technology has been crucial in this effort, according to Vipin Shankar, senior vice president of technology at Calsoft. “Particularly challenging was night-time detection,” he said. “Another challenge was model accuracy improvement on pixel distortions due to environmental impacts like fog, heavy rains, reflections due to bright sunshine, dusty winds and more.”

The solution uses NVIDIA Metropolis to track and detect vehicles throughout the process. Metropolis is an application framework, a set of developer tools and a partner ecosystem that brings visual data and AI together to improve operational efficiency and safety across a range of industries.

Calsoft engineers used NVIDIA Triton Inference Server software to deploy and manage their AI models. The team also used the NVIDIA DeepStream software development kit to build a real-time streaming platform. This was key for processing and analyzing data streams efficiently, incorporating advanced capabilities such as real-time object detection and classification.
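
As a rough sketch of how such a pipeline might query a plate-detection model served by Triton, the snippet below uses the Triton HTTP client. The model name, tensor names and input shape are assumptions for illustration, not Calsoft's actual deployment, and a running Triton server is required.

```python
# Hypothetical Triton client call for a license-plate detector; model and
# tensor names are assumptions, not Calsoft's actual configuration.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Stand-in for a preprocessed video frame from the tollbooth camera stream.
frame = np.random.rand(1, 3, 544, 960).astype(np.float32)

inputs = httpclient.InferInput("input", list(frame.shape), "FP32")
inputs.set_data_from_numpy(frame)

result = client.infer(model_name="anpr_plate_detector", inputs=[inputs])
detections = result.as_numpy("output")  # assumed output tensor of plate bounding boxes
print(detections.shape)
```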

Calsoft uses NVIDIA hardware, including NVIDIA Jetson edge AI modules and NVIDIA A100 Tensor Core GPUs in its AI solutions. Calsoft’s tollbooth solution is also scalable, meaning it’s designed to accommodate future growth and expansion needs, and can better ensure sustained performance and adaptability as traffic conditions evolve.

Learn how NVIDIA Metropolis has helped other municipalities, like Raleigh, North Carolina, better manage traffic flow and enhance pedestrian safety. 

NVIDIA Showcases New AI Capabilities With ACE, RTX Games and More at Gamescom 2024

At Gamescom, the world’s biggest gaming expo, NVIDIA has once again pushed the boundaries of gaming technology to ensure that gamers have incredibly immersive experiences and can enjoy enhanced performance and visual fidelity.

The company’s announcements today include its first on-device small language model for digital human technologies, showcased in its first game tech demo, Mecha BREAK; a milestone celebration of 600 RTX games and applications, with 20 new RTX titles announced; and new games on GeForce NOW.

Alongside these, NVIDIA announced a collaboration with MediaTek that brings G-SYNC display technologies to more gamers.

Gamescom, held every year in Cologne, Germany, is where innovators from across the gaming community showcase their latest creations. In 2018, NVIDIA founder and CEO Jensen Huang introduced NVIDIA RTX at the event, bringing real-time ray tracing and AI to gaming and setting a new standard for graphics performance.

NVIDIA ACE: Advancing AI-Powered Game Characters

Leading NVIDIA’s announcements at Gamescom was NVIDIA ACE, a revolutionary suite of technologies for bringing digital humans to life with generative AI.

The first game to showcase ACE and digital human technologies is Amazing Seasun Games’ Mecha BREAK, a fast-paced mech combat game that demonstrates the potential of AI-powered game characters.

https://www.youtube.com/watch?v=d5z7oIXhVqg

The ACE suite also expanded with NVIDIA’s first digital human technologies on-device small language model (SLM), NVIDIA Nemotron-4 4B Instruct, improving conversation for game characters. This new on-device model provides better role-play, retrieval-augmented generation and function-calling capabilities, allowing game characters to more intuitively comprehend player instructions, respond to gamers and perform more accurate and relevant actions.

Perfect World Games is advancing its ACE and digital human tech demo, Legends, with new AI-powered vision capabilities, unlocking a new level of immersion and accessibility for PC games.

Celebrating 600 RTX Games and Apps With 20 New RTX Titles

NVIDIA RTX continues to revolutionize the ways people play and create with ray tracing, DLSS and AI-powered technologies. Today marks another RTX milestone: 600 RTX-enhanced games and applications are now available.

This week, NVIDIA announced 20 new RTX and DLSS titles to join this impressive roster, including high-profile games such as Indiana Jones and the Great Circle, Dune: Awakening and Dragon Age: The Veilguard.

Game Science’s much-anticipated Black Myth: Wukong also launches today, featuring full ray tracing and DLSS 3, delivering the ultimate RTX experience for GeForce RTX 40 Series gamers.

https://www.youtube.com/watch?v=97egUiMlLZM

Half-Life 2 RTX: An RTX Remix Project Unveils Remastered Nova Prospekt

Half-Life 2 RTX: An RTX Remix Project from Orbifold Studios is a community remaster of Valve’s classic game. Now boasting over 100 contributing artists, Orbifold Studios unveiled a remaster of one of Half-Life 2’s most iconic levels, Nova Prospekt.

Using NVIDIA RTX Remix, Orbifold Studios has remastered Nova Prospekt with full ray tracing, DLSS 3.5 with Ray Reconstruction and Reflex. The Nova Prospekt trailer also reveals remasters of Gordon’s revolver, shotgun and Overwatch Standard Issue Pulse Rifle, remasters of the Combine soldiers and Antlions, and the addition of new geometry and detail that uses the capabilities of modern PCs to increase realism.

https://www.youtube.com/watch?v=R0-F8sPprmA

NVIDIA and MediaTek Bring G-SYNC Display Technologies to More Gamers 

NVIDIA and MediaTek are collaborating to make the industry’s best gaming display technologies more accessible.

The companies’ collaboration integrates the full suite of NVIDIA G-SYNC technologies into the world’s most popular scalers, allowing for the creation of feature-rich G-SYNC monitors at a more affordable price.

A highlight of this collaboration is the introduction of G-SYNC Pulsar, a new technology that offers 4x the effective motion clarity alongside a smooth and tear-free variable refresh rate (VRR) experience. G-SYNC Pulsar will debut on newly announced monitors, including the ASUS ROG Swift 360Hz PG27AQNR, Acer Predator XB273U F5 and AOC AGON PRO AG276QSG2.

GeForce NOW Raises the Bar for Cloud Gaming

Each week, GFN Thursday brings top-tier PC games to geforcenow.com, streaming at peak performance from GeForce RTX SuperPODs in the cloud, along with new features and updates for members.

For Gamescom, GeForce NOW is adding the highly anticipated action role-playing game Black Myth: Wukong from Game Science, as well as a demo for the upcoming PC release of FINAL FANTASY XVI from Square Enix. A new update brings Xbox automatic sign-in, making it easy for members to quickly jump into their PC games across devices by linking their account just once.

These latest GeForce NOW updates — available today — raise the bar for cloud gaming and build on recent milestones, including added support for mods, new data centers in Japan and Poland, and 2,000 games available in the cloud. Check out GeForce NOW’s Gamescom blog for more details.

Star Wars Outlaws: GeForce RTX 40 Series Bundle 

In collaboration with Ubisoft, Massive Entertainment and Lucasfilm Games, NVIDIA is launching a new Star Wars Outlaws GeForce RTX 40 Series Bundle. Gamers will experience the first-ever open-world Star Wars game, set between the events of Star Wars: The Empire Strikes Back and Star Wars: Return of the Jedi, enhanced with NVIDIA DLSS 3.5, ray tracing and Reflex technologies. It’ll also be available in the cloud on GeForce NOW.

For all the news and details on NVIDIA’s latest Gamescom announcements, visit GeForce News

At Gamescom 2024, GeForce NOW Brings ‘Black Myth: Wukong’ and ‘FINAL FANTASY XVI Demo’ to the Cloud

Each week, GeForce NOW elevates cloud gaming by bringing top PC games and new updates to the cloud.

Starting today, members can stream the highly anticipated action role-playing game (RPG) Black Myth: Wukong from Game Science, as well as a demo for the upcoming PC release of FINAL FANTASY XVI from Square Enix.

Experience these triple-A releases now — at peak performance, even on low-powered devices — along with Xbox automatic sign-in coming Aug. 22 on GFN Thursday. The feature will make it easy for members to dive into their favorite PC games.

These latest GeForce NOW updates — announced today at the annual Gamescom conference — build on recent milestones, including added support for mods, new data centers in Japan and Poland, and 2,000 games available in the cloud.

Get Your Game Face On

GeForce NOW recently celebrated 2,000 games in the cloud, thanks to strong collaborations with publishers that have brought their titles to the platform.

New games are added weekly from popular digital gaming stores Steam, Epic Games Store, Ubisoft Connect, Battle.net and Xbox, including over 140 PC Game Pass titles. Powered by GeForce RTX-class GPUs in the cloud, GeForce NOW is bringing even more top titles from celebrated publishers to members.

Black Myth Wukong on GeForce NOW
No monkey business in the cloud — just high-performance gameplay.

Members can be among the first to play Black Myth: Wukong without waiting for downloads or worrying about system requirements.

Take on the role of Sun Wukong the Monkey King, wielding the powers of a magical staff throughout a richly woven narrative filled with mythical creatures and formidable foes. Learn to harness Wukong’s unique abilities and combat styles, reminiscent of the Soulslike game genre. As the “Destined One,” take on various challenges to uncover the truth hidden beneath the veil of a glorious historical legend.

Witness the beauty of ancient China — from snow-draped mountains to intricate cave networks — in exquisite detail elevated by NVIDIA RTX technologies, including full ray tracing and DLSS 3. The GeForce NOW Ultimate membership brings this visual splendor to life at 4K resolution and 120 frames per second, even on underpowered devices, streaming from GeForce RTX 4080 SuperPODs in the cloud.

FINAL FANTASY XVI demo on GeForce NOW
Questing looks so good in the cloud.

The latest mainline numbered entry in the renowned RPG series from Square Enix, FINAL FANTASY XVI is coming soon to the cloud. Get a taste of the title’s high-octane action, breathtaking world and epic storytelling in its PC demo on GeForce NOW. Members can learn to master their Eikonic abilities while battling colossal foes and exploring the stunning realm of Valisthea.

Try the demo on GeForce NOW to jump right into an epic adventure without worrying about system specs.

GeForce NOW members can play these titles and more at high performance. Ultimate members can stream at up to 4K resolution and 120 fps with support for NVIDIA DLSS and NVIDIA Reflex technology, and experience the action even on low-powered devices. Keep an eye out on GFN Thursdays for the latest on game release dates in the cloud.

Set It and Forget It

Xbox SSO on GeForce NOW
Link-credible.

GeForce NOW makes gaming more convenient by letting members link their gaming accounts directly to the cloud service. Starting Aug. 22, such support extends to Xbox accounts — alongside existing support for Epic Games and Ubisoft automatic sign-in — enabling seamless access across devices.

After linking their Xbox profiles just once, members will be signed in automatically across their devices for all future GeForce NOW sessions — speeding access to their favorite PC games, from Fallout 76 to Starfield to Forza Horizon 5.

This new feature joins Xbox game library sync on GeForce NOW, which allows members to sync their supported Xbox Game Pass and Microsoft Store games to their cloud streaming library. Members will enjoy a more unified gaming experience with the ability to instantly play over 140 Xbox Game Pass titles across devices.

With a steady drumbeat of quality games from top publishers and new features to continuously improve the service, GeForce NOW is a gamer’s gateway to high-performance gaming from the cloud, enabling play on any device with real-time ray tracing and high resolutions. Check back every GFN Thursday for more news on upcoming game launches, new game releases, service updates and more.

NVIDIA Announces First Digital Human Technologies On-Device Small Language Model, Improving Conversation for Game Characters

NVIDIA’s first digital human technology small language model is being demonstrated in Mecha BREAK, a new multiplayer mech game developed by Amazing Seasun Games, to bring its characters to life and provide a more dynamic and immersive gameplay experience on GeForce RTX AI PCs.

The new on-device model, called Nemotron-4 4B Instruct, improves the conversation abilities of game characters, allowing them to more intuitively comprehend players and respond naturally.

NVIDIA has optimized ACE technology to run directly on GeForce RTX AI PCs and laptops. This greatly improves developers’ ability to deploy state-of-the-art digital human technology in next-generation games such as Mecha BREAK.

A Small Language Model Purpose-Built for Role-Playing

NVIDIA Nemotron-4 4B Instruct provides better role-play, retrieval-augmented generation and function-calling capabilities, allowing game characters to more intuitively comprehend player instructions, respond to gamers and perform more accurate and relevant actions.

The model is available as an NVIDIA NIM microservice, which provides a streamlined path for developing and deploying generative AI-powered applications. The NIM is optimized for low memory usage, offering faster response times and providing developers a way to take advantage of over 100 million GeForce RTX-powered PCs and laptops.
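
Because NIM microservices expose an OpenAI-compatible endpoint, calling the model from an application can look like the sketch below. The local base URL and model identifier are assumptions for illustration, not a documented configuration.

```python
# Minimal sketch: chat with a locally deployed NIM through its OpenAI-compatible
# endpoint. The base URL and model name below are assumptions for illustration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-for-local-nim")

response = client.chat.completions.create(
    model="nvidia/nemotron-mini-4b-instruct",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a mech workshop mechanic helping a pilot."},
        {"role": "user", "content": "Which mech should I pick for close-range brawling?"},
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```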

Nemotron-4 4B Instruct is part of NVIDIA ACE, a suite of digital human technologies that provide speech, intelligence and animation powered by generative AI. It’s available as a NIM for cloud and on-device deployment by game developers.

First Game Showcases ACE NIM Microservices

Mecha BREAK, developed by Amazing Seasun Games, a Kingsoft Corporation game subsidiary, is showcasing the NVIDIA Nemotron-4 4B Instruct NIM running on device in the first display of ACE-powered game interactions. The NVIDIA Audio2Face-3D NIM and Whisper, OpenAI’s automatic speech recognition model, provide facial animation and speech recognition running on-device. ElevenLabs powers the character’s voice through the cloud.

In this demo, shown first at Gamescom, one of the world’s biggest gaming expos, NVIDIA ACE and digital human technologies allow players to interact with a mechanic non-playable character (NPC) that can help them choose from a diverse range of mechanized robots, or mechs, to complement their playstyle or team needs, assist in appearance customization and give advice on how to best prepare their colossal war machine for battle.

“We’re excited to showcase the power and potential of ACE NIM microservices in Mecha BREAK, using Audio2Face and Nemotron-4 4B Instruct to dramatically enhance in-game immersion,” said Kris Kwok, CEO of Amazing Seasun Games.

Perfect World Games Explores Latest Digital Human Technologies

NVIDIA ACE and digital human technologies continue to expand their footprint in the gaming industry.

Global game publisher and developer Perfect World Games is advancing its NVIDIA ACE and digital human technology demo, Legends, with new AI-powered vision capabilities. Within the demo, the character Yun Ni can see gamers and identify people and objects in the real world through the computer’s camera, powered by GPT-4o, adding an augmented reality layer to the gameplay experience. These capabilities unlock a new level of immersion and accessibility for PC games.

Learn more about NVIDIA ACE and download the NIM to begin building game characters powered by generative AI. 

Level Up: NVIDIA, MediaTek to Bring G-SYNC Display Technologies to More Gamers

Picture this: NVIDIA and MediaTek are working together to make the industry’s best gaming display technologies more accessible to gamers globally.

The companies’ collaboration, announced today at the Gamescom gaming gathering in Cologne, Germany, integrates the full suite of NVIDIA G-SYNC technologies into the world’s most popular scalers.

Gamers can expect superior image quality, unmatched motion clarity, ultra-low latency, highly accurate colors, and more cutting-edge benefits on their displays.

G-SYNC Pulsar: The Star of New Display Technologies

A highlight of this collaboration is the introduction of G-SYNC Pulsar, a new technology that offers 4x the effective motion clarity alongside a smooth and tear-free variable refresh rate (VRR) gaming experience.

G-SYNC Pulsar will debut on newly announced monitors, including the ASUS ROG Swift 360Hz PG27AQNR, Acer Predator XB273U F5 and AOC AGON PRO AG276QSG2.

These monitors, expected later this year, feature 2560×1440 resolution, a 360Hz refresh rate and HDR support.

Integrating G-SYNC into MediaTek scalers eliminates the need for a separate G-SYNC module, streamlining the production process and reducing costs.

This allows for the creation of feature-rich G-SYNC monitors at a more affordable price. And by expanding the availability of these premium gaming products to a broader audience, more gamers will be able to enjoy the best in motion clarity, image quality and performance.

AI Chases the Storm: New NVIDIA Research Boosts Weather Prediction, Climate Simulation

As hurricanes, tornadoes and other extreme weather events occur with increased frequency and severity, it’s more important than ever to improve and accelerate climate research and prediction using the latest technologies.

Amid peaks in the current Atlantic hurricane season, NVIDIA Research today announced a new generative AI model, dubbed StormCast, for emulating high-fidelity atmospheric dynamics. This means the model can enable reliable weather prediction at mesoscale — a scale larger than storms but smaller than cyclones — which is critical for disaster planning and mitigation.

Detailed in a paper written in collaboration with the Lawrence Berkeley National Laboratory and the University of Washington, StormCast arrives as extreme weather phenomena are taking lives, destroying homes and causing more than $150 billion in damage annually in the U.S. alone.

It’s just one example of how generative AI is supercharging thundering breakthroughs in climate research and actionable extreme weather prediction, helping scientists tackle challenges of the highest stakes: saving lives and the world.

NVIDIA Earth-2 — a digital twin cloud platform that combines the power of AI, physical simulations and computer graphics — enables simulation and visualization of weather and climate predictions at a global scale with unprecedented accuracy and speed.

At COMPUTEX in June, NVIDIA founder and CEO Jensen Huang announced CorrDiff, available through Earth-2.

In Taiwan, for example, the National Science and Technology Center for Disaster Reduction predicts fine-scale details of typhoons using CorrDiff, an NVIDIA generative AI model offered as part of Earth-2.

CorrDiff can super-resolve 25-kilometer-scale atmospheric data by 12.5x down to 2 kilometers — 1,000x faster and using 3,000x less energy for a single inference than traditional methods.

That means the center’s potentially lifesaving work, which previously cost nearly $3 million on CPUs, can be accomplished using about $60,000 on a single system with an NVIDIA H100 Tensor Core GPU. It’s a massive reduction that shows how generative AI and accelerated computing increase energy efficiency and lower costs.

The center also plans to use CorrDiff to predict downwash — when strong winds funnel down to street level, damaging buildings and affecting pedestrians — in urban areas.

Now, StormCast adds hourly autoregressive prediction capabilities to CorrDiff, meaning it can predict future outcomes based on past ones.
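
Conceptually, hourly autoregressive prediction means feeding each predicted atmospheric state back in as the input for the next hour. The toy sketch below uses a placeholder step function rather than StormCast itself.

```python
# Toy autoregressive rollout: each hourly prediction becomes the next input.
# `step_model` is a placeholder, not the actual StormCast diffusion model.
import numpy as np

def step_model(state: np.ndarray) -> np.ndarray:
    """Placeholder one-hour forecast step."""
    return state + 0.01 * np.random.randn(*state.shape).astype(np.float32)

def rollout(initial_state: np.ndarray, hours: int = 6) -> list:
    """Roll the model forward hour by hour, e.g., for a six-hour lead time."""
    states = [initial_state]
    for _ in range(hours):
        states.append(step_model(states[-1]))
    return states

# e.g., 100+ variables on a fine grid would be a (channels, height, width) array
forecast = rollout(np.zeros((100, 128, 128), dtype=np.float32))
print(len(forecast) - 1, "hourly steps predicted")
```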

A Global Impact From a Regional Focus

Global climate research begins at a regional level.

Physical hazards of weather and climate change can vary dramatically on regional scales. But reliable numerical weather prediction at this level comes with substantial computational costs. This is due to the high spatial resolution needed to represent the underlying fluid-dynamic motions at mesoscale.

Regional weather prediction models — often referred to as convection-allowing models, or CAMs — have traditionally forced researchers to face varying tradeoffs in resolution, ensemble size and affordability.

CAMs are useful to meteorologists for tracking the evolution and structure of storms, as well as for monitoring a storm’s convective mode, or how a storm is organized when it forms. For example, the likelihood of a tornado is based on a storm’s structure and convective mode.

A mesoscale convective system visualized using NOAA’s Geostationary Operational Environmental Satellite. Image courtesy of NOAA.

CAMs also help researchers understand the implications for weather-related physical hazards at the infrastructure level.

For example, global climate model simulations can be used to inform CAMs, helping them translate slow changes in the moisture content of large atmospheric rivers into flash-flooding projections in vulnerable coastal areas.

At lower resolutions, machine learning models trained on global data have emerged as useful emulators of numerical weather prediction models that can be used to improve early-warning systems for severe events. These machine learning models typically have a spatial resolution of about 30 kilometers and a temporal resolution of six hours.

Now, with the help of generative diffusion, StormCast enables this at a 3-kilometer, hourly scale.

Despite being in its infancy, the model — when applied with precipitation radars — already offers forecasts with lead times of up to six hours that are up to 10% more accurate than the U.S. National Oceanic and Atmospheric Administration (NOAA)’s state-of-the-art 3-kilometer operational CAM.

Plus, outputs from StormCast exhibit physically realistic heat and moisture dynamics, and can predict over 100 variables, such as temperature, moisture concentration, wind and rainfall radar reflectivity values at multiple, finely spaced altitudes. This enables scientists to confirm the realistic 3D evolution of a storm’s buoyancy — a first-of-its-kind accomplishment in AI weather simulation.

NVIDIA researchers trained StormCast on approximately three-and-a-half years of NOAA climate data from the central U.S., using NVIDIA accelerated computing to speed calculations.

More Innovations Brewing

Scientists are already looking to harness the model’s benefits.

“Given both the outsized impacts of organized thunderstorms and winter precipitation, and the major challenges in forecasting them with confidence, the production of computationally tractable storm-scale ensemble weather forecasts represents one of the grand challenges of numerical weather prediction,” said Tom Hamill, head of innovation at The Weather Company. “StormCast is a notable model that addresses these challenges, and The Weather Company is excited to collaborate with NVIDIA on developing, evaluating and potentially using these deep learning forecast models.”

“Developing high-resolution weather models requires AI algorithms to resolve convection, which is a huge challenge,” said Imme Ebert-Uphoff, machine learning lead at Colorado State University’s Cooperative Institute for Research in the Atmosphere. “The new NVIDIA research explores the potential of accomplishing this with diffusion models like StormCast, which presents a significant step toward the development of future AI models for high-resolution weather prediction.”

Alongside the acceleration and visualization of physically accurate climate simulations, as well as a digital twin of our planet, such research breakthroughs signify how NVIDIA Earth-2 is enabling a new, vital era of climate research.

Learn more about sustainable computing and NVIDIA Research, a global team of hundreds of scientists and engineers focused on topics including climate AI, computer graphics, computer vision, self-driving cars and robotics.

Featured image courtesy of NASA.

See notice regarding software product information.

GeForce NOW and CurseForge Bring Mod Support to ‘World of Warcraft: The War Within’ in the Cloud

Time to be wowed: GeForce NOW members can now stream World of Warcraft on supported devices with in-game mods powered by the CurseForge platform for WoW customization. With support for top mods, even the most hardcore raid leaders can play like a hero, thanks to the cloud.

Embark on a new adventure in Azeroth when the upcoming World of Warcraft expansion, The War Within, launches on Monday, Aug. 26, at 3 p.m. PT. GeForce NOW members who purchase the Epic Edition of The War Within will get early streaming access on Thursday, Aug. 22, at 3 p.m. PT.

And check out the five new games available in the ever-expanding GeForce NOW library this week, including the Psychonauts series from Double Fine Productions.

For those looking to upgrade their play or try out GeForce NOW for the first time, the Summer Sale is offering new one- and six-month Ultimate and Priority memberships at 50% off through this Sunday, Aug. 18.

Play Your Way

Return to World of Warcraft in style across supported devices with GeForce NOW and support for top CurseForge Addons. One of the most popular platforms for WoW Addons, CurseForge includes user interface customization, combat Addons, action bars, quest helpers and many more categories.

GeForce NOW has collaborated with CurseForge to include over 25 of the top Addons from its platform, available to Ultimate and Priority members — including Day Pass users — to seamlessly customize their WoW experience. The Addons are just as easy to implement as they would be while playing locally on a PC gaming rig, and a CurseForge account isn’t required — just launch WoW and enable Addons from the game’s menu.

CurseForge WoW Addons on GeForce NOW
Customize and conquer.

After clicking the new “Addons” button from the in-game menu, paying GeForce NOW members can choose the Addons they’d like to enable or opt to enable all. After that, CurseForge will ensure Addons update automatically, and GeForce NOW will remember the Addons selected on each game launch. Check out this article for more details.

Stream World of Warcraft, along with mods that normally work only on PC, across supported devices, including SHIELD TVs, underpowered laptops, Chromebooks and handhelds like the Steam Deck. Addons will work across all supported WoW experiences on GeForce NOW, including Classic and Cataclysm Classic. From ancient wonders to perilous dungeons, GeForce NOW is the best way for game veterans and newcomers alike to experience unparalleled adventure in the heart of Azeroth.

Members can show off their World of Warcraft Addons in the cloud by sharing a screenshot on social media using #ModsonGFN for a chance to be featured on GeForce NOW’s channels.

Dive to New Depths

Get ready to embark on a journey to the heart of Azeroth in The War Within and unveil the mysteries lying beneath the world’s surface, with early streaming access for those who purchase the Epic Edition. GeForce NOW members will be able to jump right in without waiting around for game updates.

The War Within coming soon to GeForce NOW
Dig deep.

Experience the thrill of Warbands and an account-wide progression system, and soar through the skies with the new Skyriding feature. Dive into the Radiant Echoes event, collect Residual Memories and gear up with new items for Warband collections. With class and system updates, this expansion sets the stage for the epic tales and battles that await in the depths of Azeroth. Answer the call to arms when the saga of The War Within begins.

The full update for The War Within promises to delve deeper into the unexplored depths of Azeroth, continuing the story and expanding the gameplay experience. Look forward to new zones, dungeons and raids, as well as class and system updates — all at high performance, streaming on GeForce NOW.

Mind Over Matter

Psychonauts 2 on GeForce NOW
The mind is a dangerous place.

The Psychonauts franchise is the beloved action-adventure platformer from Double Fine Productions. Follow the story of Razputin “Raz” Aquato, a young psychic who runs away from the circus to join a summer camp for psychic spies-in-training. Experience unique level design and explore the minds of various characters, each filled with imaginative and often bizarre landscapes that reflect their psychological states.

Psychonauts 2 on GeForce NOW
Brains over brawn.

In Psychonauts 2, Raz is a full-fledged Psychonaut embarking on his first official mission. The sequel delves deeper into Raz’s family background and the workings of the Psychonauts organization. Explore intricately designed mental worlds, all while engaging in platforming and puzzle-solving gameplay.

With a GeForce NOW Ultimate or Priority account, the mind-bending world of Psychonauts is just a click away. Dive in, explore the depths of the human psyche and uncover the secrets that lie within.

Get With the Newness

Hunt: Showdown 1896 on GeForce NOW
The hunt is on.

Hunt: Showdown 1896 from Crytek marks a new era for the high-stakes, tactical first-person extraction game. The new PC update, available for members today, moves from the original game’s swamps of Louisiana to the sprawling mountains of Colorado. The brand-new Mammon’s Gulch map brings mountains, stunning vistas and grueling mines to the gaming experience, providing all-new elevation points and strategic angles for players to hide out and take out enemy Hunters.

In addition, members can look for the following:

  • Level Zero: Extraction (New release on Steam, Aug. 13)
  • shapez 2 (New release on Steam, Aug. 15)
  • Car Manufacture (Steam)
  • Psychonauts (Steam)
  • Psychonauts 2 (Steam and Xbox, available on PC Game Pass)

What are you planning to play this weekend? Let us know on X or in the comments below.
