“Everybody Will Have An AI Assistant,” NVIDIA CEO Tells SIGGRAPH Audience

The generative AI revolution — with deep roots in visual computing — is amplifying human creativity even as accelerated computing promises significant gains in energy efficiency, NVIDIA founder and CEO Jensen Huang said Monday.

That makes this week’s SIGGRAPH professional graphics conference, in Denver, the logical venue to discuss what’s next.

“Everybody will have an AI assistant,” Huang said. “Every single company, every single job within the company, will have AI assistance.”

But even as generative AI promises to amplify human productivity, Huang said the accelerated computing technology that underpins it promises to make computing more energy efficient.

“Accelerated computing helps you save so much energy, 20 times, 50 times, and doing the same processing,” Huang said. “The first thing we have to do, as a society, is accelerate every application we can: this reduces the amount of energy being used all over the world.”

The conversation follows a spate of announcements from NVIDIA today.

NVIDIA introduced a new suite of NIM microservices tailored for diverse workflows, including OpenUSD, 3D modeling, physics, materials, robotics, industrial digital twins and physical AI.
These advancements aim to enhance developer capabilities, particularly with the integration of Hugging Face Inference-as-a-Service on DGX Cloud.

In addition, Shutterstock has launched a Generative 3D Service, while Getty Images has upgraded its offerings using NVIDIA Edify technology.

In the realm of AI and graphics, NVIDIA has revealed new OpenUSD NIM microservices and reference workflows designed for generative physical AI applications.

This includes a program for accelerating humanoid robotics development through new NIM microservices for robotics simulation and more.

Finally, WPP, the world’s largest advertising agency, is using Omniverse-driven generative AI for The Coca-Cola Company, helping drive brand authenticity, showcasing the practical applications of NVIDIA’s advancements in AI technology across various industries.

Huang and WIRED Senior Writer Lauren Goode started their conversation by exploring how visual computing gave rise to everything from computer games to digital animation to GPU-accelerated computing and, most recently, generative AI powered by industrial-scale AI factories.

All these advancements build on one another. Robotics, for example, requires advanced AI and photorealistic virtual worlds where AI can be trained before being deployed into next-generation humanoid robots.

Huang explained that robotics requires three computers: one to train the AI, one to test the AI in a physically accurate simulation, and one within the robot itself.

“Just about every industry is going to be affected by this, whether it’s scientific computing trying to do a better job predicting the weather with a lot less energy, to augmenting and collaborating with creators to generate images, or generating virtual scenes for industrial visualization,” Huang said. “Robotics, self-driving cars are all going to be transformed by generative AI.”

Likewise, NVIDIA Omniverse systems — built around the OpenUSD standard — will also be key to harnessing generative AI to create assets that the world’s largest brands can use.

By pulling from brand assets that live in Omniverse, these systems can capture and replicate carefully curated brand magic.

Finally, all these systems — visual computing, simulation and large-language models — will come together to create digital humans who can help people interact with digital systems of all kinds.

“One of the things that we’re announcing here this week is the concept of digital agents, digital AIs that will augment every single job in the company,” Huang said.

“And so one of the most important use cases that people are discovering is customer service,” Huang said. “In the future, my guess is that it’s going to be human still, but AI in the loop.”

All of this, like any new tool, promises to amplify human productivity and creativity. “Imagine the stories that you’re going to be able to tell with these tools,” Huang said.

Read More

Recipe for Magic: WPP and NVIDIA Omniverse Help The Coca-Cola Company Scale Generative AI Content That Pops With Brand Authenticity

When The Coca-Cola Company produces thirst-quenching marketing, the creative elements of campaigns aren’t just left to chance — there’s a recipe for the magic. Now, the beverage company, through its partnership with WPP Open X, is beginning to scale its global campaigns with generative AI from NVIDIA Omniverse and NVIDIA NIM microservices.

“With NVIDIA, we can personalize and customize Coke and meals imagery across 100-plus markets, delivering on hyperlocal relevance with speed and at global scale,” said Samir Bhutada, global vice president of StudioX Digital Transformation at The Coca-Cola Company.

Coca-Cola has been working with WPP to develop digital twin tools and roll out Prod X — a custom production studio experience created specifically for the beverage maker to use globally.

WPP announced today at SIGGRAPH that The Coca-Cola Company will be an early adopter for integrating the new NVIDIA NIM microservices for Universal Scene Description (aka OpenUSD) into its Prod X roadmap. OpenUSD is a 3D framework that enables interoperability between software tools and data types for building virtual worlds. NIM inference microservices provide models as optimized containers.

The USD Search NIM allows WPP to tap into a large archive of models to create on-brand assets, and the USD Code NIM can be used to assemble them into scenes.

These NIM microservices will enable Prod X users to create 3D advertising assets that contain culturally relevant elements on a global scale, using prompt engineering to quickly make adjustments to AI-generated images so that brands can better target their products at local markets.

Tapping Into NVIDIA NIM Microservices to Deploy Generative AI 

WPP said that the NVIDIA NIM microservices will have a lasting impact on the 3D engineering and art world.

The USD Search NIM can make WPP’s massive visual asset libraries quickly available via written prompts. The USD Code NIM allows developers to enter prompts and get Python code to create novel 3D worlds.
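
To make that workflow concrete, here is a minimal sketch of how a developer might prompt a USD Code-style NIM microservice over its API. The endpoint URL, model identifier and payload fields are illustrative assumptions, not WPP's actual integration; the real request contract is published in the NVIDIA API catalog at build.nvidia.com.

```python
# Hypothetical sketch: prompting a USD Code-style NIM microservice for Python
# that assembles a scene. The URL and model name below are placeholders --
# verify the actual request format in the NVIDIA API catalog.
import os
import requests

api_key = os.environ["NVIDIA_API_KEY"]  # assumed environment variable

response = requests.post(
    "https://ai.api.nvidia.com/v1/chat/completions",  # illustrative endpoint
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "nvidia/usdcode",  # illustrative model identifier
        "messages": [{
            "role": "user",
            "content": "Write USD Python that places three bottles on a table prim.",
        }],
        "temperature": 0.2,
    },
    timeout=60,
)
response.raise_for_status()

# The reply is expected to carry generated Python targeting the USD API,
# which an artist or pipeline tool could review and run.
print(response.json()["choices"][0]["message"]["content"])
```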

“The beauty of the solution is that it compresses multiple phases of the production process into a single interface and process,” said Perry Nightingale, senior vice president of creative AI at WPP, of the new NIM microservices. “It empowers artists to get more out of the technology and create better work.”

Redefining Content Production With Production Studio

WPP recently announced the release of Production Studio on WPP Open, the company’s intelligent marketing operating system powered by AI. Co-developed with its production company, Hogarth, Production Studio taps into the Omniverse development platform and OpenUSD for its generative AI-enabled product configurator workflows.

Production Studio can streamline and automate multilingual text, image and video creation, simplifying content creation for advertisers and marketers, and directly addresses the challenges advertisers continue to face in producing brand-compliant and product-accurate content at scale.

“Our groundbreaking research with NVIDIA Omniverse for the past few years, and the research and development associated with having built our own core USD pipeline and decades of experience in 3D workflows, is what made it possible for us to stand up a tailored experience like this for The Coca-Cola Company,” said Priti Mhatre, managing director for strategic consulting and AI at Hogarth.

SIGGRAPH attendees can hear more about WPP’s efforts by joining the company’s session on “Robotics, Generative AI, and OpenUSD: How WPP Is Building the Future of Creativity.”

NVIDIA founder and CEO Jensen Huang will also be featured at the event in fireside chats with Meta founder and CEO Mark Zuckerberg and WIRED Senior Writer Lauren Goode. Watch the talks and other sessions from NVIDIA at SIGGRAPH 2024 on demand.

Photo credit: WPP, The Coca-Cola Company

See notice regarding software product information.

Read More

Reality Reimagined: NVIDIA Introduces fVDB to Build Bigger Digital Models of the World

NVIDIA announced at SIGGRAPH fVDB, a new deep-learning framework for generating AI-ready virtual representations of the real world.

fVDB is built on top of OpenVDB, the industry-standard library for simulating and rendering sparse volumetric data such as water, fire, smoke and clouds.

Generative physical AI, such as autonomous vehicles and robots that inhabit the real world, needs to have “spatial intelligence” — the ability to understand and operate in 3D space.

Capturing the large scale and super-fine details of the world around us is essential. But converting reality into a virtual representation to train AI is hard.

Raw data for real-world environments can be collected through many different techniques, like neural radiance fields (NeRFs) and lidar. fVDB translates this data into massive, AI-ready environments rendered in real time.

Building on a decade of innovation in the OpenVDB standard, the introduction of fVDB at SIGGRAPH represents a significant leap forward in how industries can benefit from digital twins of the real world.

Reality-scale virtual environments are used for training autonomous agents. City-scale 3D models are captured by drones for climate science and disaster planning. Today, 3D generative AI is even used to plan urban spaces and smart cities.

fVDB enables industries to tap into spatial intelligence on a larger scale and with higher resolution than ever before, making physical AI even smarter.

The framework builds NVIDIA-accelerated AI operators on top of NanoVDB, a GPU-accelerated data structure for efficient 3D simulations. These operators include convolution, pooling, attention and meshing, all of which are designed for high-performance 3D deep learning applications.

AI operators allow businesses to build complex neural networks for spatial intelligence, like large-scale point cloud reconstruction and 3D generative modeling.
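
To see why sparsity matters at this scale, here is a plain-PyTorch illustration of the idea VDB-family frameworks build on: storing only occupied voxels instead of a dense grid. This is a conceptual sketch, not fVDB's actual API; fVDB's real operators are documented with the framework itself.

```python
# Conceptual sketch of sparse voxelization: quantize a point cloud to voxels
# and keep only the occupied ones, so memory scales with surface detail
# rather than bounding-box volume. Plain PyTorch, for illustration only --
# fVDB's GPU-optimized operators (convolution, pooling, attention, meshing)
# run on top of structures like this.
import torch

points = torch.rand(100_000, 3) * 100.0   # e.g. lidar returns over a 100 m scene
voxel_size = 0.25                          # meters per voxel

# Map each point to an integer voxel coordinate, then deduplicate.
voxel_ids = torch.floor(points / voxel_size).to(torch.int64)
occupied = torch.unique(voxel_ids, dim=0)

dense_total = int((100.0 / voxel_size) ** 3)
print(f"{len(points)} points -> {occupied.shape[0]} occupied voxels")
print(f"a dense grid of the same extent would need {dense_total:,} voxels")
```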

fVDB is the result of a long-running effort by NVIDIA’s research team and is already used to support NVIDIA Research, NVIDIA DRIVE and NVIDIA Omniverse projects that require high-fidelity models of large, complex real-world spaces.

Key Advantages of fVDB

  • Larger: 4x larger spatial scale than prior frameworks
  • Faster: 3.5x faster than prior frameworks
  • Interoperable: fVDB reads VDB datasets into full-sized, AI-ready 3D environments rendered in real time, letting businesses fully tap massive real-world datasets to build physical AI with spatial intelligence.
  • More powerful: 10x more operators than prior frameworks. fVDB simplifies processes by combining functionalities that previously required multiple deep-learning libraries.

fVDB will soon be available as NVIDIA NIM inference microservices. A trio of the microservices will enable businesses to incorporate fVDB into OpenUSD workflows, generating AI-ready OpenUSD geometry in NVIDIA Omniverse, a development platform for industrial digitalization and generative physical AI applications. They are:

  • fVDB Mesh Generation NIM — Generates digital 3D environments of the real world
  • fVDB NeRF-XL NIM — Generates large-scale NeRFs in OpenUSD using Omniverse Cloud APIs
  • fVDB Physics Super-Res NIM — Performs super-resolution to generate an OpenUSD-based, high-resolution physics simulation

Over the past decade, OpenVDB, housed at the Academy Software Foundation, has earned multiple Academy Awards as a core technology used throughout the visual-effects industry. It has since grown beyond entertainment to industrial and scientific uses, like industrial design and robotics.

NVIDIA continues to enhance the open-source OpenVDB library. Four years ago, the company introduced NanoVDB, which added GPU support to OpenVDB. This delivered an order-of-magnitude speed-up, enabling faster performance and easier development, and opening the door to real-time simulation and rendering.

Two years ago, NVIDIA introduced NeuralVDB, which builds machine learning on top of NanoVDB to compress the memory footprint of VDB volumes up to 100x, allowing creators, developers and researchers to interact with extremely large and complex datasets.

fVDB builds AI operators on top of NanoVDB to unlock spatial intelligence at the scale of reality. Apply to the early-access program for the fVDB PyTorch extension. fVDB will also be available as part of the OpenVDB GitHub repository.

Dive deeper into fVDB in this technical blog and watch how accelerated computing and generative AI are transforming industries and creating new opportunities for innovation and growth in NVIDIA founder and CEO Jensen Huang’s two fireside chats at SIGGRAPH.

See notice regarding software product information.

Read More

NVIDIA Supercharges Digital Marketing With Greater Control Over Generative AI

The world’s brands and agencies are using generative AI to create advertising and marketing content, but it doesn’t always provide the desired outputs.

NVIDIA offers a comprehensive set of technologies — bringing together generative AI, NVIDIA NIM microservices, NVIDIA Omniverse and Universal Scene Description (OpenUSD) — to allow developers to build applications and workflows that enable brand-accurate, targeted and efficient advertising at scale.

Developers can use the USD Search NIM microservice to provide artists access to a vast archive of OpenUSD-based, brand-approved assets — such as products, props and environments — and when integrated with the USD Code NIM microservice, assembly of these scenes can be accelerated. Teams can also use the NVIDIA Edify-powered Shutterstock Generative 3D service to rapidly generate new 3D assets using AI.

The scenes, once constructed, can be rendered to a 2D image and used as input to direct an AI-powered image generator to create precise, brand-accurate visuals.

Global agencies, developers and production studios are tapping these technologies to revolutionize every aspect of the advertising process, from creative production and content supply chain to dynamic creative optimization.

WPP announced at SIGGRAPH its adoption of the technologies, naming The Coca-Cola Company the first brand to embrace generative AI with Omniverse and NVIDIA NIM microservices.

Agencies and Service Providers Increase Adoption of Omniverse

The NVIDIA Omniverse development platform has seen widespread adoption for its ability to build accurate digital twins of products. These virtual replicas allow brands and agencies to create ultra-photorealistic and physically accurate 3D product configurators, helping to increase personalization, customer engagement and loyalty, and average selling prices, and reducing return rates.

Digital twins can also serve many purposes and be updated to meet shifting consumer preferences with minimal time, cost and effort, helping flexibly scale content production.

Image courtesy of Monks, Hatch.

Global marketing and technology services company Monks developed Monks.Flow, an AI-centric professional managed service that uses the Omniverse platform to help brands virtually explore different customizable product designs and unlock scale and hyper-personalization across any customer journey.

“NVIDIA Omniverse and OpenUSD’s interoperability accelerates connectivity between marketing, technology and product development,” said Lewis Smithingham, executive vice president of strategic industries at Monks. “Combining Omniverse with Monks’ streamlined marketing and technology services, we infuse AI throughout the product development pipeline and help accelerate technological and creative possibilities for clients.”

Collective World, a creative and technology company, is an early adopter of real-time 3D, OpenUSD and NVIDIA Omniverse, using them to create high-quality digital campaigns for customers like Unilever and EE. The technologies allow Collective to develop digital twins, delivering consistent, high-quality product content at scale to streamline advertising and marketing campaigns.

Building on its use of NVIDIA technologies, Collective World announced at SIGGRAPH that it has joined the NVIDIA Partner Network.

Product digital twin configurator and content generation tool built by Collective on NVIDIA Omniverse.

INDG is using Omniverse to introduce new capabilities into Grip, its popular software tool. Grip uses OpenUSD and generative AI to streamline and enhance the creation process, delivering stunning, high-fidelity marketing content faster than ever.

“This integration helps bring significant efficiencies to every brand by delivering seamless interoperability and enabling real-time visualization,” said Frans Vriendsendorp, CEO of INDG. “Harnessing the potential of USD to eliminate the lock-in to proprietary formats, the combination of Grip and Omniverse is helping set new standards in the realm of digital content creation.”

Image generated with Grip, copyright Beiersdorf

To get started building applications and services using OpenUSD, Omniverse and NVIDIA AI, check out the product configurator developer resources and the generative AI workflow for content creation reference architecture, or submit a contact form to learn more or connect with NVIDIA’s ecosystem of service providers.

Watch NVIDIA founder and CEO Jensen Huang’s fireside chats, as well as other on-demand sessions from NVIDIA at SIGGRAPH.

Stay up to date by subscribing to our newsletter, and following NVIDIA Omniverse on Instagram, LinkedIn, Medium and X.

Read More

Hugging Face Offers Developers Inference-as-a-Service Powered by NVIDIA NIM

One of the world’s largest AI communities — comprising 4 million developers on the Hugging Face platform — is gaining easy access to NVIDIA-accelerated inference on some of the most popular AI models.

New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI models with optimization from NVIDIA NIM microservices running on NVIDIA DGX Cloud.

Announced today at the SIGGRAPH conference, the service will help developers quickly prototype with open-source AI models hosted on the Hugging Face Hub and deploy them in production. Enterprise Hub users can tap serverless inference for increased flexibility, minimal infrastructure overhead and optimized performance with NVIDIA NIM.

The inference service complements Train on DGX Cloud, an AI training service already available on Hugging Face.

Developers facing a growing number of open-source models can benefit from a hub where they can easily compare options. These training and inference tools give Hugging Face developers new ways to experiment with, test and deploy cutting-edge models on NVIDIA-accelerated infrastructure. They’re made easily accessible using the “Train” and “Deploy” drop-down menus on Hugging Face model cards, letting users get started with just a few clicks.

Get started with inference-as-a-service powered by NVIDIA NIM.

Beyond a Token Gesture — NVIDIA NIM Brings Big Benefits

NVIDIA NIM is a collection of AI microservices — including NVIDIA AI foundation models and open-source community models — optimized for inference using industry-standard application programming interfaces, or APIs.

NIM offers users higher efficiency in processing tokens — the units of data used and generated by a language model. The optimized microservices also improve the efficiency of the underlying NVIDIA DGX Cloud infrastructure, which can increase the speed of critical AI applications.

This means developers see faster, more robust results from an AI model accessed as a NIM compared with other versions of the model. The 70-billion-parameter version of Llama 3, for example, delivers up to 5x higher throughput when accessed as a NIM compared with off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems.
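
As a rough sketch of what that access looks like, NIM endpoints in the NVIDIA API catalog follow an OpenAI-compatible interface, so querying the Llama 3 70B microservice can look like the snippet below. The base URL and model ID follow the catalog's published pattern but should be verified at build.nvidia.com; Hugging Face's “Deploy” menu generates equivalent snippets for its service.

```python
# Hedged sketch: calling a NIM-hosted Llama 3 70B endpoint through an
# OpenAI-compatible client. Confirm the base URL and model ID at
# build.nvidia.com before relying on them.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # assumed environment variable
)

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Explain NIM microservices in one sentence."}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```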

Near-Instant Access to DGX Cloud Provides Accessible AI Acceleration

The NVIDIA DGX Cloud platform is purpose-built for generative AI, offering developers easy access to reliable accelerated computing infrastructure that can help them bring production-ready applications to market faster.

The platform provides scalable GPU resources that support every step of AI development, from prototype to production, without requiring developers to make long-term AI infrastructure commitments.

Hugging Face inference-as-a-service on NVIDIA DGX Cloud powered by NIM microservices offers easy access to compute resources that are optimized for AI deployment, enabling users to experiment with the latest AI models in an enterprise-grade environment.

More on NVIDIA NIM at SIGGRAPH 

At SIGGRAPH, NVIDIA also introduced generative AI models and NIM microservices for the OpenUSD framework to accelerate developers’ abilities to build highly accurate virtual worlds for the next evolution of AI.

To experience more than 100 NVIDIA NIM microservices with applications across industries, visit ai.nvidia.com.

Read More

New NVIDIA Digital Human Technologies Enhance Customer Interactions Across Industries

Generative AI is unlocking new ways for enterprises to engage customers through digital human avatars.

At SIGGRAPH, NVIDIA previewed “James,” an interactive digital human that can connect with people using emotions, humor and more. James is based on a customer-service workflow using NVIDIA ACE, a reference design for creating custom, hyperrealistic, interactive avatars. Users will soon be able to talk with James in real time at ai.nvidia.com.

NVIDIA also showcased at the computer graphics conference the latest advancements to the NVIDIA Maxine AI platform, including Maxine 3D and Audio2Face-2D for an immersive telepresence experience.

Developers can use Maxine and NVIDIA ACE digital human technologies to make customer interactions with digital interfaces more engaging and natural. ACE technologies enable digital human development with AI models for speech and translation, vision, intelligence, lifelike animation and behavior, and realistic appearance.

Companies across industries are using Maxine and ACE to deliver immersive virtual customer experiences.

Meet James, a Digital Brand Ambassador

Built on top of NVIDIA NIM microservices, James is a virtual assistant that can provide contextually accurate responses.

Using retrieval-augmented generation (RAG), James can accurately tell users about the latest NVIDIA technologies. ACE allows developers to use their own data to create domain-specific avatars that can communicate relevant information to customers.
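
The retrieval step of a RAG pipeline can be sketched in a few lines. In this minimal illustration, TF-IDF stands in for a production embedding model and the LLM call is omitted; it shows the pattern, not NVIDIA ACE's implementation.

```python
# Minimal RAG retrieval sketch: find the knowledge-base snippet most relevant
# to a question, then ground the LLM prompt with it. TF-IDF is a stand-in for
# a real embedding model; the actual LLM call is omitted.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "NVIDIA ACE is a reference design for interactive digital human avatars.",
    "NIM microservices package optimized AI models behind standard APIs.",
    "Audio2Face animates a character's face directly from an audio track.",
]
question = "How can I animate a digital human's face from speech?"

vectorizer = TfidfVectorizer().fit(knowledge_base + [question])
scores = cosine_similarity(
    vectorizer.transform([question]),
    vectorizer.transform(knowledge_base),
)
context = knowledge_base[scores.argmax()]

# The grounded prompt a domain-specific avatar's LLM would receive:
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```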

James is powered by the latest NVIDIA RTX rendering technologies for advanced, lifelike animations. His natural-sounding voice is powered by ElevenLabs. NVIDIA ACE lets developers customize animation, voice and language when building avatars tailored for different use cases.

NVIDIA Maxine Enhances Digital Humans in Telepresence

Maxine, a platform for deploying cutting-edge AI features that enhance the audio and video quality of digital humans, enables the use of real-time, photorealistic 2D and 3D avatars with video-conferencing devices.

Maxine 3D converts 2D video portrait inputs into 3D avatars, allowing the integration of highly realistic digital humans in video conferencing and other two-way communication applications. The technology will soon be available in early access.

Audio2Face-2D, currently in early access, animates static portraits based on audio input, creating dynamic, speaking digital humans from a single image. Try the technology at ai.nvidia.com.

Companies Embracing Digital Human Applications

HTC, Looking Glass, Reply and UneeQ are among the latest companies using NVIDIA ACE and Maxine across a broad range of use cases, including customer service agents, and telepresence experiences in entertainment, retail and hospitality.

At SIGGRAPH, digital human technology developer UneeQ is showcasing two new demos.

The first spotlights cloud-rendered digital humans powered by NVIDIA GPUs with local, in-browser computer vision for enhanced scalability and privacy, and animated using the Audio2Face-3D NVIDIA NIM microservice. UneeQ’s Synapse technology processes anonymized user data and feeds it to a large language model (LLM) for more accurate, responsive interactions.

The second demo runs on a single NVIDIA RTX GPU-powered laptop, featuring an advanced digital human powered by Gemma 7B LLM, RAG and the NVIDIA Audio2Face-3D NIM microservice.

Both demos showcase UneeQ’s NVIDIA-powered efforts to develop digital humans that can react to users’ facial expressions and actions, pushing the boundaries of realism in virtual customer service experiences.

HTC Viverse has integrated the Audio2Face-3D NVIDIA NIM microservice into its VIVERSE AI agent for dynamic facial animation and lip sync, allowing for more natural and immersive user interactions.

Hologram technology company Looking Glass’ Magic Mirror demo at SIGGRAPH uses a simple camera setup and Maxine’s advanced 3D AI capabilities to generate a real-time holographic feed of users’ faces on its newly launched, group-viewable Looking Glass 16-inch and 32-inch Spatial Displays.

Reply is unveiling an enhanced version of Futura, its cutting-edge digital human developed for Costa Crociere’s Costa Smeralda cruise ship. Powered by Audio2Face-3D NVIDIA NIM and Riva ASR NIM microservices, Futura’s speech-synthesis capabilities tap advanced technologies including GPT-4o, LlamaIndex for RAG and Microsoft Azure text-to-speech services.

Futura also incorporates Reply’s proprietary affective computing technology, alongside Hume AI and MorphCast, for comprehensive emotion recognition. Built using Unreal Engine 5.4.3 and MetaHuman Creator with NVIDIA ACE-powered facial animation, Futura supports six languages. The intelligent assistant can help plan personalized port visits, suggest tailored itineraries and facilitate tour bookings.

In addition, Futura refines recommendations based on guest feedback and uses a specially created knowledge base to provide informative city presentations, enhancing tourist itineraries. Futura aims to enhance customer service and offer immersive interactions in real-world scenarios, leading to streamlined operations and driving business growth.

Learn more about NVIDIA ACE and NVIDIA Maxine

Discover how accelerated computing and generative AI are transforming industries and creating new opportunities for innovation by watching NVIDIA founder and CEO Jensen Huang’s fireside chats at SIGGRAPH.

See notice regarding software product information.

Read More

AI Gets Physical: New NVIDIA NIM Microservices Bring Generative AI to Digital Environments

Millions of people already use generative AI to assist in writing and learning. Now, the technology can also help them more effectively navigate the physical world.

NVIDIA announced at SIGGRAPH generative physical AI advancements including the NVIDIA Metropolis reference workflow for building interactive visual AI agents and new NVIDIA NIM microservices that will help developers train physical machines and improve how they handle complex tasks.

These include three fVDB NIM microservices that support NVIDIA’s new deep learning framework for 3D worlds, as well as the USD Code, USD Search and USD Validate NIM microservices for working with Universal Scene Description (aka OpenUSD).

The NVIDIA OpenUSD NIM microservices work together with the world’s first generative AI models for OpenUSD development — also developed by NVIDIA — to enable developers to incorporate generative AI copilots and agents into USD workflows and broaden the possibilities of 3D worlds.

NVIDIA NIM Microservices Transform Physical AI Landscapes

Physical AI uses advanced simulations and learning methods to help robots and other industrial automation more effectively perceive, reason and navigate their surroundings. The technology is transforming industries like manufacturing and healthcare, and advancing smart spaces with robots, factory and warehouse technologies, surgical AI agents and cars that can operate more autonomously and precisely.

NVIDIA offers a broad range of NIM microservices customized for specific models and industry domains. NVIDIA’s suite of NIM microservices tailored for physical AI supports capabilities for speech and translation, vision and intelligence, and realistic animation and behavior.

Turning Visual AI Agents Into Visionaries With NVIDIA NIM

Visual AI agents use computer vision capabilities to perceive and interact with the physical world and perform reasoning tasks.

Highly perceptive and interactive visual AI agents are powered by a new class of generative AI models called vision language models (VLMs), which bridge digital perception and real-world interaction in physical AI workloads to enable enhanced decision-making, accuracy, interactivity and performance. With VLMs, developers can build vision AI agents that can more effectively handle challenging tasks, even in complex environments.

Generative AI-powered visual AI agents are rapidly being deployed across hospitals, factories, warehouses, retail stores, airports, traffic intersections and more.

To help physical AI developers more easily build high-performing, custom visual AI agents, NVIDIA offers NIM microservices and reference workflows for physical AI. The NVIDIA Metropolis reference workflow provides a simple, structured approach for customizing, building and deploying visual AI agents, as detailed in the blog.

NVIDIA NIM Helps K2K Make Palermo More Efficient, Safe and Secure

City traffic managers in Palermo, Italy, deployed visual AI agents using NVIDIA NIM to uncover physical insights that help them better manage roadways.

K2K, an NVIDIA Metropolis partner, is leading the effort, integrating NVIDIA NIM microservices and VLMs into AI agents that analyze the city’s live traffic cameras in real time. City officials can ask the agents questions in natural language and receive fast, accurate insights on street activity and suggestions on how to improve the city’s operations, like adjusting traffic light timing.
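
The interaction pattern, a camera frame plus a natural-language question sent to a VLM, can be sketched as follows. The endpoint and model name are assumptions modeled on the NVIDIA API catalog's VLM examples, not K2K's production setup; check build.nvidia.com for the current contract.

```python
# Hypothetical sketch of querying a hosted VLM about a traffic-camera frame.
# The endpoint URL and model are placeholders patterned on the NVIDIA API
# catalog; the image is passed inline as base64, per the catalog's examples.
import base64
import os
import requests

with open("intersection.jpg", "rb") as f:   # a saved traffic-camera frame
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://ai.api.nvidia.com/v1/vlm/nvidia/neva-22b",  # illustrative URL
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json={
        "messages": [{
            "role": "user",
            "content": "How many vehicles are waiting at the light? "
                       f'<img src="data:image/jpeg;base64,{image_b64}" />',
        }],
        "max_tokens": 128,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```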

Leading global electronics giants Foxconn and Pegatron have adopted physical AI, NIM microservices and Metropolis reference workflows to more efficiently design and run their massive manufacturing operations.

The companies are building virtual factories in simulation to save significant time and costs. They’re also running more thorough tests and refinements for their physical AI — including AI multi-camera and visual AI agents — in digital twins before real-world deployment, improving worker safety and leading to operational efficiencies.

Bridging the Simulation-to-Reality Gap With Synthetic Data Generation

Many AI-driven businesses are now adopting a “simulation-first” approach for generative physical AI projects involving real-world industrial automation.

Manufacturing, factory logistics and robotics companies need to manage intricate human-worker interactions, advanced facilities and expensive equipment. NVIDIA physical AI software, tools and platforms — including physical AI and VLM NIM microservices, reference workflows and fVDB — can help them streamline the highly complex engineering required to create digital representations or virtual environments that accurately mimic real-world conditions.

VLMs are seeing widespread adoption across industries because of their ability to generate highly realistic imagery. However, these models can be challenging to train because of the immense volume of data required to create an accurate physical AI model.

Synthetic data generated from digital twins using computer simulations offers a powerful alternative to real-world datasets, which can be expensive — and sometimes impossible — to acquire for model training, depending on the use case.

Tools like NVIDIA NIM microservices and Omniverse Replicator let developers build generative AI-enabled synthetic data pipelines to accelerate the creation of robust, diverse datasets for training physical AI. This enhances the adaptability and performance of models such as VLMs, enabling them to generalize more effectively across industries and use cases.
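
As an example of such a pipeline, the snippet below follows the randomization pattern from the Omniverse Replicator documentation: re-pose an object every frame and write out labeled images. It runs inside an Omniverse application's script editor rather than as a standalone script, and the specific values are illustrative.

```python
# Condensed domain-randomization sketch in the style of the Omniverse
# Replicator docs; runs inside an Omniverse app, values are illustrative.
import omni.replicator.core as rep

with rep.new_layer():
    crate = rep.create.cube(semantics=[("class", "crate")])
    camera = rep.create.camera(position=(0, 0, 8))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Write RGB frames plus tight 2D boxes: labeled data for training
    # detectors or fine-tuning VLMs.
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="/tmp/synthetic_crates",
                      rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])

    # Randomize the object's pose on every rendered frame.
    with rep.trigger.on_frame(num_frames=500):
        with crate:
            rep.modify.pose(
                position=rep.distribution.uniform((-3, -3, 0), (3, 3, 0)),
                rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360)),
            )
```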

Availability

Developers can access state-of-the-art, open and NVIDIA-built foundation AI models and NIM microservices at ai.nvidia.com. The Metropolis NIM reference workflow is available in the GitHub repository, and Metropolis VIA microservices are available for download in developer preview.

OpenUSD NIM microservices are available in preview through the NVIDIA API catalog.

Watch how accelerated computing and generative AI are transforming industries and creating new opportunities for innovation and growth in NVIDIA founder and CEO Jensen Huang’s fireside chats at SIGGRAPH.

See notice regarding software product information.

Read More

For Your Edification: Shutterstock Releases Generative 3D, Getty Images Upgrades Service Powered by NVIDIA

Designers and artists have new and improved ways to boost their productivity with generative AI trained on licensed data.

Shutterstock, a leading platform for creative content, launched its Generative 3D service in commercial beta. It lets creators quickly prototype 3D assets and generate 360 HDRi backgrounds that light scenes, using just text or image prompts.

Getty Images, a premier visual content creator and marketplace, turbocharged its Generative AI by Getty Images service so it creates images twice as fast, improves output quality, brings advanced controls and enables fine-tuning.

The services are built with NVIDIA’s visual AI foundry using NVIDIA Edify, a multimodal generative AI architecture. The AI models are then optimized and packaged for maximum performance with NVIDIA NIM, a set of accelerated microservices for AI inference.

Edify enables service providers to train responsible generative models on their licensed data and scale them quickly with NVIDIA DGX Cloud, the cloud-first way to get the best of NVIDIA AI.

Generative AI Speeds 3D Modeling

Available now for enterprises in commercial beta, Shutterstock’s service lets designers and artists quickly create 3D objects that help them prototype or populate virtual environments. For example, tapping generative AI, they can quickly create the silverware and plates on a dining room table so they can focus on designing the characters around it.

The 3D assets the service generates are ready to edit using digital content creation tools and available in a variety of popular file formats. Their clean geometry and layout give artists an advanced starting point for adding their own flair.

An example of a 3D mesh from Shutterstock Generative 3D.

The AI model first delivers a preview of a single asset in as little as 10 seconds. If users like it, the preview can be turned into a higher-quality 3D asset, complete with physically based rendering materials like concrete, wood or leather.

At this year’s SIGGRAPH computer graphics conference, designers will see just how fast they can make their ideas come to life.

Shutterstock will demo a workflow in Blender that lets artists generate objects directly within their 3D environment. In the Shutterstock booth at SIGGRAPH, HP will show 3D prints and physical prototypes of the kinds of assets attendees can design on the show floor using Generative 3D.

Shutterstock is also working with global marketing and communications services company WPP to bring ideas to life with Edify 3D generation for virtual production.

Explore Generative 3D by Shutterstock on the company’s website, or test-drive the application programming interface (API) at build.nvidia.com/.

Virtual Lighting Gets Real

Lighting a virtual scene with accurate reflections can be a complicated task. Creatives need to operate expensive 360-degree camera rigs and go on set to create backgrounds from scratch, or search vast libraries for something that approximates what they want.

With Shutterstock’s Generative 3D service, users can now simply describe the exact environment they need in text or with an image, and out comes a high-dynamic-range panoramic image, aka 360 HDRi, in brilliant 16K resolution.

Want that beautiful new sports car shown in a desert, a tropical beach or maybe on a winding mountain road? With generative AI, designers can shift gears fast.

Three companies plan to integrate Shutterstock’s 360 HDRi APIs directly into their workflows — WPP, CGI studio Katana and Dassault Systèmes, developer of the 3DEXCITE applications for creating high-end visualizations and 3D content for virtual worlds.

Examples from Generative AI by Getty Images.

Great Images Get a Custom Fit

Generative AI by Getty Images has upgraded to a more powerful Edify AI model with a portfolio of new features that let artists control image composition and style.

Want a red beach ball floating above that perfect shot of a coral reef in Fiji? Getty Images’ service can get it done in a snap.

The new model is twice as fast, boosts image quality and prompt accuracy, and lets users control camera settings like the depth of field or focal length of a shot. Users can generate four images in about six seconds and scale them up to 4K resolution.

An example of the camera controls in Generative AI by Getty Images.

In addition, the commercially safe foundational model now serves as the basis for a fine-tuning capability that lets companies customize the AI with their own data. That lets them generate images tailored to the creative style of their specific brands.

New controls in the service support the use of a sketch or depth map to guide the composition or structure of an image.

Creatives at Omnicom, a global leader in marketing and sales solutions, are using Getty Images’ service to streamline advertising workflows and safely create on-brand content. The collaboration with Getty Images is part of Omnicom’s strategy to infuse generative AI into every facet of its business, helping teams move from ideas to outcomes faster.

Generative AI by Getty Images is available through the Getty Images and iStock websites, and via an API.

For more about NVIDIA’s offerings, read about the AI foundry for visual generative AI built on NVIDIA DGX Cloud, and try it on ai.nvidia.com.

To get the big picture, listen to NVIDIA founder and CEO Jensen Huang in two fireside chats at SIGGRAPH.

See notice regarding software product information.

Read More

Unleash the Dragonborn: ‘Elder Scrolls V: Skyrim Special Edition’ Joins GeForce NOW

“Hey, you. You’re finally awake.”

It’s the summer of Elder Scrolls — whether a seasoned Dragonborn or a new adventurer, dive into the legendary world of Tamriel this GFN Thursday as The Elder Scrolls V: Skyrim Special Edition joins the cloud.

Epic adventures await, along with nine new games joining the GeForce NOW library this week.

Plus, make sure to catch the GeForce NOW Summer Sale for 50% off new Ultimate and Priority memberships.

Unleash the Dragonborn

Skyrim on GeForce NOW
Taking an arrow to the knee won’t stop gamers from questing in the cloud.

Experience the legendary adventures, breathtaking landscapes and immersive storytelling of the iconic role-playing game The Elder Scrolls V: Skyrim Special Edition from Bethesda Game Studios — now accessible on any device from the cloud. Become the Dragonborn and defeat Alduin the World-Eater, a dragon prophesied to destroy the world. 

Explore a vast landscape, complete quests and improve skills to develop characters in the open world of Skyrim. The Special Edition includes add-ons with all-new features, such as remastered art and effects, and brings new Bethesda Game Studios creations: new quests, environments, characters, dialogue, armor and weapons.

Get ready to embark on unforgettable quests, battle fearsome foes and uncover the rich lore of the Elder Scrolls universe, all with the power and convenience of GeForce NOW. “Fus Ro Dah” with an Ultimate membership to stream at up to 4K resolution and 120 frames per second with up to eight-hour gaming sessions for the ultimate immersive experience throughout the realms of Tamriel.

All Hands on Deck

World of Warships members rewards on GeForce NOW
Get those sea legs ready for a reward.

Wargaming is bringing back an in-game event exclusively for GeForce NOW members this week.

Through Tuesday, July 30, members who complete the quest while streaming World of Warships can earn up to five GeForce NOW one-day Priority codes — one for each day of the challenge. Aspiring admirals can learn more on the World of Warships blog and social channels.

Shiny and New

Conscript on GeForce NOW
Rendezvous with death.

Take on classic survival horror in CONSCRIPT from Jordan Mochi and Team17. Inspired by legendary games in the genre, the game is set in 1916 during the Great War. CONSCRIPT blends all the punishing mechanics of older horror games into a cohesive, tense and unique experience. Play as a French soldier searching for his missing-in-action brother during the Battle of Verdun. Search through twisted trenches, navigate overrun forts and cross no-man’s-land to find him.

Here’s the full list of new games this week:

  • Cataclismo (New release on Steam, July 22)
  • CONSCRIPT (New release on Steam, July 23)
  • F1 Manager 2024 (New release on Steam, July 23)
  • EARTH DEFENSE FORCE 6 (New release on Steam, July 25)
  • The Elder Scrolls V: Skyrim (Steam)
  • The Elder Scrolls V: Skyrim Special Edition (Steam, Epic Games Store and Xbox, available on PC Game Pass)
  • Gang Beasts (Steam and Xbox, available on PC Game Pass)
  • Kingdoms and Castles (Steam)
  • The Settlers: New Allies (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

Demystifying AI-Assisted Artistry With Adobe Apps Using NVIDIA RTX

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC users.

Adobe Creative Cloud applications, which tap NVIDIA RTX GPUs, are designed to enhance the creativity of users, empowering them to work faster and focus on their craft.

These tools seamlessly integrate into existing creator workflows, enabling greater productivity and delivering power and precision.

Look to the Light

Generative AI creates new data in forms such as images or text by learning from existing data. It effectively visualizes and generates content to match what a user describes and helps open up fresh avenues for creativity.

Adobe Firefly is Adobe’s family of creative generative AI models, offering new ways to ideate and create while assisting creative workflows. The models are designed to be safe for commercial use and were trained, using NVIDIA GPUs, on licensed content such as Adobe Stock images and public-domain content where copyright has expired.

Firefly features are integrated in Adobe’s most popular creative apps.

Adobe Photoshop features the Generative Fill tool, which uses simple description prompts to easily add or remove content in images. With the Reference Image feature, currently in beta, users can also upload a sample image to get results closer to their desired output.

Use Generative Fill to add content and Reference Image to refine it.

Generative Expand allows artists to extend the border of their image with the Crop tool, filling in bigger canvases with new content that automatically blends in with the existing image.

Bigger canvas? Not a problem.

RTX-accelerated Neural Filters, such as Photo Restoration, enable complex adjustments such as colorizing black-and-white photos and performing style transfers using AI. The Smart Portrait filter, which allows non-destructive editing with filters, is based on work from NVIDIA Research.

The brand-new Generative Shape Fill (beta) in Adobe Illustrator, powered by the latest Adobe Firefly Vector Model, allows users to accelerate design workflows by quickly filling shapes with detail and color in their own styles. With Generative Shape Fill, designers can easily match the style and color of their own artwork to create a wide variety of editable and scalable vector graphic options.

Adobe Illustrator’s Generative Recolor feature lets creators type in a text prompt to explore custom color palettes and themes for their vector artwork in seconds.

Color us impressed.

NVIDIA will continue working with Adobe to support advanced generative AI models, with a focus on deep integration into the apps the world’s leading creators use.

Making Moves on Video

Adobe Premiere Pro is one of the most popular and powerful video editing solutions.

Its Enhance Speech tool, accelerated by RTX, uses AI to remove unwanted noise and improve the quality of dialogue clips so they sound professionally recorded. It’s up to 4.5x faster on RTX PCs.

Adobe Premiere Pro’s AI-powered Enhance Speech tool removes unwanted noise and improves dialogue quality.

Auto Reframe, another Adobe Premiere feature, uses GPU acceleration to identify and track the most relevant elements in a video, and intelligently reframes video content for different aspect ratios. Scene Edit Detection automatically finds the original edit points in a video, a necessary step before the video editing stage begins.

Visual Effects

Separating a foreground object from a background is a crucial step in many visual effects and compositing workflows.

Adobe After Effects has a new feature that uses a matte to isolate an object, enabling capabilities including background replacement and the selective application of effects to the foreground.

Using the Roto Brush tool, artists can draw strokes on representative areas of the foreground and background elements. After Effects uses that information to create a segmentation boundary between the foreground and background elements, delivering cleaner cutouts with fewer clicks.

Creating 3D Product Shots

The Substance 3D Collection is Adobe’s solution for 3D material authoring, texturing and rendering, enabling users to rapidly create stunningly photorealistic 3D content, including models, materials and lighting.

Visualizing products and designs in the context of a space is compelling, but it can be time-consuming to find the right environment for the objects to live in. Substance 3D Stager’s Generative Background feature, powered by Adobe Firefly, solves this issue by letting artists quickly explore generated backgrounds to composite 3D models.

Once an environment is selected, Stager can automatically match the perspective and lighting to the generated background.

Material Authoring With AI

Adobe Substance 3D Sampler, also part of the Substance 3D Collection, is designed to transform images of surfaces and objects into photorealistic physically based rendering (PBR) materials, 3D models and high-dynamic range environment lights. With the recent introduction of new generative workflows powered by Adobe Firefly, Sampler is making it easier than ever for artists to explore variations when creating materials for everything from product visualization projects to the latest AAA games.

Sampler’s Text-to-Texture feature allows users to generate tiled images from detailed text prompts. These generated images can then be edited and transformed into photorealistic PBR materials using the machine learning-powered Image-to-Material feature or any Sampler filter.

Image-to-Texture similarly enables the creation of tiled textures from reference images, providing an alternate way to prompt and generate variations from existing visual content.

Adobe 3D Sampler’s Image-to-Texture feature.

Sampler’s Text-to-Pattern feature uses text prompts to generate tiling patterns, which can be used as base colors or inputs for various filters, such as the Cloth Weave filter for creating original fabric materials.

All of these generative AI features in the Substance 3D Collection, supercharged with RTX GPUs, are designed to help 3D creators ideate and create faster.

Photo-tastic Features

Adobe Lightroom’s AI-powered Raw Details feature produces crisp detail and more accurate renditions of edges, improves color rendering and reduces artifacts, enhancing the image without changing its original resolution. This feature is handy for large displays and prints, where fine details are visible.

Enhance, enhance, enhance.

Super Resolution helps create an enhanced image with similar results as Raw Details but with 2x the linear resolution. This means that the enhanced image will have 2x the width and height of the original image — or 4x the total pixel count. This is especially useful for increasing the resolution of cropped imagery.

For faster editing, AI-powered, RTX-accelerated masking tools like Select Subject, which isolates people from an image, and Select Sky, which captures skies, enable users to create complex masks with the click of a button.

Visit Adobe’s AI features page for a complete list of AI features using RTX.

Looking for more AI-powered content creation apps? Consider NVIDIA Broadcast, which transforms any room into a home studio, free for RTX GPU owners. 

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

Read More