In celebration of Zoox’s 10th anniversary, NVIDIA founder and CEO Jensen Huang recently joined the robotaxi company’s CEO, Aicha Evans, and its cofounder and CTO, Jesse Levinson, to discuss the latest in autonomous vehicle (AV) innovation and experience a ride in the Zoox robotaxi.
In a fireside chat at Zoox’s headquarters in Foster City, Calif., the trio reflected on the two companies’ decade of collaboration. Evans and Levinson highlighted how Zoox pioneered the concept of a robotaxi purpose-built for ride-hailing and created groundbreaking innovations along the way, using NVIDIA technology.
“The world has never seen a robotics company like this before,” said Huang. “Zoox started out solely as a sustainable robotics company that delivers robots into the world as a fleet.”
Since 2014, Zoox has been on a mission to create fully autonomous, bidirectional vehicles purpose-built for ride-hailing services. This sets it apart in an industry largely focused on retrofitting existing cars with self-driving technology.
A decade later, the company is operating its robotaxi, powered by NVIDIA GPUs, on public roads.
Computing at the Core
Zoox robotaxis are, at their core, supercomputers on wheels. They’re built on multiple NVIDIA GPUs dedicated to processing the enormous amounts of data generated in real time by their sensors.
The sensor array includes cameras, lidar, radar, long-wave infrared sensors and microphones. The onboard computing system rapidly processes the raw sensor data collected and fuses it to provide a coherent understanding of the vehicle’s surroundings.
The processed data then flows through a perception engine and prediction module to planning and control systems, enabling the vehicle to navigate complex urban environments safely.
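To picture that flow, here is a minimal, illustrative Python sketch of a perceive-predict-plan-act loop. The class and function names are hypothetical stand-ins for Zoox's proprietary systems, shown only to make the pipeline concrete.

```python
from dataclasses import dataclass

@dataclass
class WorldModel:
    """Fused, time-aligned view of the vehicle's surroundings."""
    objects: list          # detected agents (vehicles, pedestrians, cyclists)
    free_space: object     # drivable-area estimate
    timestamp: float

def drive_step(sensors, fusion, perception, prediction, planner, controller):
    """One illustrative iteration of the perceive -> predict -> plan -> act cycle."""
    raw = sensors.read()                     # cameras, lidar, radar, long-wave IR, microphones
    fused = fusion.combine(raw)              # single coherent view of the scene
    world: WorldModel = perception.detect(fused)
    futures = prediction.forecast(world)     # likely trajectories of nearby agents
    trajectory = planner.plan(world, futures)
    controller.execute(trajectory)           # steering, acceleration and braking commands
```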
NVIDIA GPUs deliver the immense computing power required for the Zoox robotaxis’ autonomous capabilities and continuous learning from new experiences.
Using Simulation as a Virtual Proving Ground
Key to Zoox’s AV development process is its extensive use of simulation. The company uses NVIDIA GPUs and software tools to run a wide array of simulations, testing its autonomous systems in virtual environments before real-world deployment.
These simulations range from synthetic scenarios to replays of real-world scenarios created using data collected from test vehicles. Zoox uses retrofitted Toyota Highlanders equipped with the same sensor and compute packages as its robotaxis to gather driving data and validate its autonomous technology.
This data is then fed back into simulation environments, where it can be used to create countless variations and replays of scenarios and agent interactions.
Zoox also uses what it calls “adversarial simulations,” carefully crafted scenarios designed to test the limits of the autonomous systems and uncover potential edge cases.
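As a rough illustration of how a logged scenario can be turned into many variations, including deliberately difficult ones, consider the sketch below. The scenario fields and parameter ranges are hypothetical and stand in for whatever a real simulator would expose.

```python
import random

def vary_scenario(base: dict, n: int = 100, adversarial: bool = False, seed: int = 0) -> list:
    """Create n perturbed copies of a recorded scenario for simulation replay.

    `base` is assumed to hold agent speeds and gaps; adversarial mode biases
    sampling toward harder cases, such as tighter, faster cut-ins.
    """
    rng = random.Random(seed)
    variations = []
    for _ in range(n):
        scenario = dict(base)
        scenario["agent_speed_mps"] = base["agent_speed_mps"] * rng.uniform(0.7, 1.3)
        scenario["cut_in_gap_m"] = base["cut_in_gap_m"] * rng.uniform(0.5, 1.5)
        if adversarial:
            # Push parameters toward the edge of the distribution to probe limits.
            scenario["cut_in_gap_m"] *= 0.5
            scenario["agent_speed_mps"] *= 1.2
        variations.append(scenario)
    return variations
```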
The company’s comprehensive approach to simulation allows it to rapidly iterate and improve its autonomous driving software, bolstering AV safety and performance.
“We’ve been using NVIDIA hardware since the very start,” said Levinson. “It’s a huge part of our simulator, and we rely on NVIDIA GPUs in the vehicle to process everything around us in real time.”
A Neat Way to Seat
Zoox’s robotaxi, with its unique bidirectional design and carriage-style seating, is optimized for autonomous operation and passenger comfort, eliminating traditional concepts of a car’s “front” and “back” and providing equal comfort and safety for all occupants.
“I came to visit you when you were zero years old, and the vision was compelling,” Huang said, reflecting on Zoox’s evolution over the years. “The challenge was incredible. The technology, the talent — it is all world-class.”
Using NVIDIA GPUs and tools, Zoox is poised to redefine urban mobility, pioneering a future of safe, efficient and sustainable autonomous transportation for all.
From Testing Miles to Market Projections
As the AV industry gains momentum, recent projections highlight the potential for explosive growth in the robotaxi market. Guidehouse Insights forecasts over 5 million robotaxi deployments by 2030, with numbers expected to surge to almost 34 million by 2035.
The regulatory landscape reflects this progress, with 38 companies currently holding valid permits to test AVs with safety drivers in California. Zoox is currently one of only six companies permitted to test AVs without safety drivers in the state.
As the industry advances, Zoox has created a next-generation robotaxi by combining cutting-edge onboard computing with extensive simulation and development.
In the image at top, NVIDIA founder and CEO Jensen Huang stands with Zoox CEO Aicha Evans and Zoox cofounder and CTO Jesse Levinson in front of a Zoox robotaxi.
NVIDIA researchers used NVIDIA Edify, a multimodal architecture for visual generative AI, to build a detailed 3D desert landscape within a few minutes in a live demo at SIGGRAPH’s Real-Time Live event on Tuesday.
During the event — one of the prestigious graphics conference’s top sessions — NVIDIA researchers showed how, with the support of an AI agent, they could build and edit a desert landscape from scratch within five minutes. The live demo highlighted how generative AI can act as an assistant to artists by accelerating ideation and generating custom secondary assets that would otherwise have been sourced from a repository.
By drastically decreasing ideation time, these AI technologies will empower 3D artists to be more productive and creative — giving them the tools to explore concepts faster and expedite parts of their workflows. They could, for example, generate the background assets or 360 HDRi environments that the scene needs in minutes, instead of spending hours finding or creating them.
From Idea to 3D Scene in Three Minutes
Creating a full 3D scene is a complex, time-consuming task. Artists must support their hero asset with plenty of background objects to create a rich scene, then find an appropriate background and an environment map to light it. Due to time constraints, they’ve often had to make a trade-off between rapid results and creative exploration.
With the support of AI agents, creative teams can achieve both goals: quickly bring concepts to life and continue iterating to achieve the right look.
In the Real-Time Live demo, the researchers used an AI agent to instruct an NVIDIA Edify-powered model to generate dozens of 3D assets, including cacti, rocks and the skull of a bull — with previews produced in just seconds.
They next directed the agent to harness other models to create potential backgrounds and a layout of how the objects would be placed in the scene — and showcased how the agent could adapt to last-minute changes in creative direction by quickly swapping the rocks for gold nuggets.
With a design plan in place, they prompted the agent to create full-quality assets and render the scene as a photorealistic image in NVIDIA Omniverse USD Composer, an app for virtual world-building.
NVIDIA Edify Accelerates Environment Generation
NVIDIA Edify models can help creators focus on hero assets while accelerating the creation of background environments and objects using AI-powered scene generation tools. The Real-Time Live demo showcased two Edify models:
Edify 3D generates ready-to-edit 3D meshes from text or image prompts. Within seconds, the model can generate previews, including rotating animations of each object, to help creators rapidly prototype before committing to a specific design.
Edify 360 HDRi uses text or image prompts to generate up to 16K high-dynamic range images (HDRi) of nature landscapes, which can be used as backgrounds and to light scenes.
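As an illustration of how a text prompt becomes an asset in such a workflow, here is a minimal Python sketch of a request to an Edify-style generation service. The endpoint URL, request fields and response format are hypothetical; the actual models are exposed through partners such as Shutterstock and Getty Images and through NVIDIA's API catalog.

```python
import requests

API_URL = "https://example-edify-service.invalid/v1/generate"  # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <API_KEY>"}

def generate_asset(prompt: str, kind: str = "mesh") -> bytes:
    """Request a 3D mesh ('mesh') or a 360-degree HDRi ('hdri') from a text prompt."""
    payload = {"prompt": prompt, "output": kind}
    response = requests.post(API_URL, json=payload, headers=HEADERS, timeout=120)
    response.raise_for_status()
    return response.content  # e.g. mesh data or an HDR image, depending on the service

# Example: background props and an environment map for a desert scene.
cactus = generate_asset("a weathered saguaro cactus, desert style", kind="mesh")
sky = generate_asset("clear desert sky at golden hour", kind="hdri")
```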
During the demo, the researchers also showcased an AI agent powered by a large language model, and USD Layout, an AI model that generates scene layouts using OpenUSD, a platform for 3D workflows.
At SIGGRAPH, NVIDIA also announced that two leading creative content companies are giving designers and artists new ways to boost productivity with generative AI using tools powered by NVIDIA Edify.
Shutterstock has launched its Generative 3D service in commercial beta, letting creators quickly prototype and generate 3D assets using text or image prompts. Its Edify-based 360 HDRi generator has also entered early access.
Getty Images updated its Generative AI by Getty Images service with the latest version of NVIDIA Edify. Users can now create images twice as fast, with improved output quality and prompt adherence, and advanced controls and fine-tuning.
Harnessing Universal Scene Description in NVIDIA Omniverse
The 3D objects, environment maps and layouts generated using Edify models are structured with USD, a standard format for describing and composing 3D worlds. This compatibility allows artists to immediately import Edify-powered creations into Omniverse USD Composer.
Within Composer, they can use popular digital content creation tools to further modify the scene by, for example, changing the position of objects, modifying their appearance or adjusting lighting.
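To make the USD hand-off concrete, here is a small sketch using the open-source pxr Python bindings to assemble generated assets into a stage that USD Composer can open. The file names and prim paths are placeholders for whatever the generation step produces.

```python
from pxr import Usd, UsdGeom, Gf

# Create a new stage and reference generated assets into it.
stage = Usd.Stage.CreateNew("desert_scene.usda")
UsdGeom.Xform.Define(stage, "/World")

cactus = UsdGeom.Xform.Define(stage, "/World/Cactus")
cactus.GetPrim().GetReferences().AddReference("./assets/cactus.usd")  # placeholder path
UsdGeom.XformCommonAPI(cactus).SetTranslate(Gf.Vec3d(2.0, 0.0, -1.5))

rocks = UsdGeom.Xform.Define(stage, "/World/Rocks")
rocks.GetPrim().GetReferences().AddReference("./assets/rocks.usd")    # placeholder path

stage.GetRootLayer().Save()  # open the resulting .usda in Omniverse USD Composer
```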
Real-Time Live is one of the most anticipated events at SIGGRAPH, featuring about a dozen real-time applications including generative AI, virtual reality and live performance capture technology. Watch the replay below.
Enterprises are rapidly adopting generative AI, large language models (LLMs), advanced graphics and digital twins to increase operational efficiencies, reduce costs and drive innovation.
However, to adopt these technologies effectively, enterprises need access to state-of-the-art, full-stack accelerated computing platforms. To meet this demand, Oracle Cloud Infrastructure (OCI) today announced NVIDIA L40S GPU bare-metal instances available to order and the upcoming availability of a new virtual machine accelerated by a single NVIDIA H100 Tensor Core GPU. This new VM expands OCI’s existing H100 portfolio, which includes an NVIDIA HGX H100 8-GPU bare-metal instance.
Paired with NVIDIA networking and running the NVIDIA software stack, these platforms deliver powerful performance and efficiency, enabling enterprises to advance generative AI.
NVIDIA L40S Now Available to Order on OCI
The NVIDIA L40S is a universal data center GPU designed to deliver breakthrough multi-workload acceleration for generative AI, graphics and video applications. Equipped with fourth-generation Tensor Cores and support for the FP8 data format, the L40S GPU excels in training and fine-tuning small- to mid-size LLMs and in inference across a wide range of generative AI use cases.
The L40S GPU also has best-in-class graphics and media acceleration. Its third-generation NVIDIA Ray Tracing Cores (RT Cores) and multiple encode/decode engines make it ideal for advanced visualization and digital twin applications.
The L40S GPU delivers up to 3.8x the real-time ray-tracing performance of its predecessor, and supports NVIDIA DLSS 3 for faster rendering and smoother frame rates. This makes the GPU ideal for developing applications on the NVIDIA Omniverse platform, enabling real-time, photorealistic 3D simulations and AI-enabled digital twins. With Omniverse on the L40S GPU, enterprises can develop advanced 3D applications and workflows for industrial digitalization that will allow them to design, simulate and optimize products, processes and facilities in real time before going into production.
OCI will offer the L40S GPU in its BM.GPU.L40S.4 bare-metal compute shape, featuring four NVIDIA L40S GPUs, each with 48GB of GDDR6 memory. This shape includes local NVMe drives with 7.38TB capacity, 4th Generation Intel Xeon CPUs with 112 cores and 1TB of system memory.
With OCI’s bare-metal compute architecture, these shapes eliminate virtualization overhead for high-throughput, latency-sensitive AI and machine learning workloads. The accelerated compute shape features the NVIDIA BlueField-3 DPU for improved server efficiency, offloading data center tasks from CPUs to accelerate networking, storage and security workloads. The use of BlueField-3 DPUs furthers OCI’s strategy of off-box virtualization across its entire fleet.
OCI Supercluster with NVIDIA L40S enables ultra-high performance with 800Gbps of internode bandwidth and low latency for up to 3,840 GPUs. OCI’s cluster network uses NVIDIA ConnectX-7 NICs over RoCE v2 to support high-throughput and latency-sensitive workloads, including AI training.
“We chose OCI AI infrastructure with bare-metal instances and NVIDIA L40S GPUs for 30% more efficient video encoding,” said Sharon Carmel, CEO of Beamr Cloud. “Videos processed with Beamr Cloud on OCI will have up to 50% reduced storage and network bandwidth consumption, speeding up file transfers by 2x and increasing productivity for end users. Beamr will provide OCI customers video AI workflows, preparing them for the future of video.”
Single-GPU H100 VMs Coming Soon on OCI
The VM.GPU.H100.1 compute virtual machine shape, accelerated by a single NVIDIA H100 Tensor Core GPU, is coming soon to OCI. This will provide cost-effective, on-demand access for enterprises looking to use the power of NVIDIA H100 GPUs for their generative AI and HPC workloads.
A single H100 provides a good platform for smaller workloads and LLM inference. For example, one H100 GPU can generate more than 27,000 tokens per second for Llama 3 8B (up to 4x more throughput than a single A100 GPU at FP16 precision) with NVIDIA TensorRT-LLM at an input and output sequence length of 128 and FP8 precision.
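To put that figure in context, a quick back-of-the-envelope calculation using the numbers above:

```python
tokens_per_second = 27_000       # H100, Llama 3 8B, FP8, TensorRT-LLM (figure above)
output_tokens_per_request = 128  # output sequence length used in the measurement

requests_per_second = tokens_per_second / output_tokens_per_request
print(f"~{requests_per_second:.0f} completed 128-token responses per second")
# -> roughly 211 responses per second from a single GPU under these conditions
```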
The VM.GPU.H100.1 shape includes 2×3.4TB of NVMe drive capacity, 13 cores of 4th Gen Intel Xeon processors and 246GB of system memory, making it well-suited for a range of AI tasks.
“Oracle Cloud’s bare-metal compute with NVIDIA H100 and A100 GPUs, low-latency Supercluster and high-performance storage delivers up to 20% better price-performance for Altair’s computational fluid dynamics and structural mechanics solvers,” said Yeshwant Mummaneni, chief engineer of data management analytics at Altair. “We look forward to leveraging these GPUs with virtual machines for the Altair Unlimited virtual appliance.”
GH200 Bare-Metal Instances Available for Validation
OCI has also made available the BM.GPU.GH200 compute shape for customer testing. It features the NVIDIA Grace Hopper Superchip and NVLink-C2C, a high-bandwidth, cache-coherent 900GB/s connection between the NVIDIA Grace CPU and NVIDIA Hopper GPU. This provides over 600GB of accessible memory, enabling up to 10x higher performance for applications running terabytes of data compared to the NVIDIA A100 GPU.
Optimized Software for Enterprise AI
Enterprises have a wide variety of NVIDIA GPUs to accelerate their AI, HPC and data analytics workloads on OCI. However, maximizing the full potential of these GPU-accelerated compute instances requires an optimized software layer.
NVIDIA NIM inference microservices, pre-built containers optimized for NVIDIA GPUs, offer developers lower total cost of ownership, faster time to market and security. NIM microservices for popular community models, found in the NVIDIA API Catalog, can be deployed easily on OCI.
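Because NIM microservices expose an OpenAI-compatible API, a container deployed on an OCI instance can be queried with standard clients. A minimal sketch follows, assuming a Llama 3 NIM is already running on the instance at its default port; the host address is a placeholder.

```python
from openai import OpenAI

# Point a standard OpenAI-compatible client at the NIM endpoint on the OCI instance.
client = OpenAI(base_url="http://<oci-instance-ip>:8000/v1", api_key="not-used")

completion = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # model ID as exposed by the deployed NIM
    messages=[{"role": "user", "content": "Summarize our quarterly sales data."}],
    max_tokens=200,
)
print(completion.choices[0].message.content)
```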
Performance will continue to improve over time with upcoming GPU-accelerated instances, including NVIDIA H200 Tensor Core GPUs and NVIDIA Blackwell GPUs.
Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC users.
NVIDIA is spotlighting the latest NVIDIA RTX-powered tools and apps at SIGGRAPH, an annual trade show at the intersection of graphics and AI.
These AI technologies provide advanced ray-tracing and rendering techniques, enabling highly realistic graphics and immersive experiences in gaming, virtual reality, animation and cinematic special effects. RTX AI PCs and workstations are helping drive the future of interactive digital media, content creation, productivity and development.
ACE’s AI Magic
During a SIGGRAPH fireside chat, NVIDIA founder and CEO Jensen Huang introduced “James” — an interactive digital human built on NVIDIA NIM microservices — that showcases the potential of AI-driven customer interactions.
Using NVIDIA ACE technology and based on a customer-service workflow, James is a virtual assistant that can connect with people using emotions, humor and contextually accurate responses. Soon, users will be able to interact with James in real time at ai.nvidia.com.
James is a virtual assistant in NVIDIA ACE.
NVIDIA also introduced the latest advancements in the NVIDIA Maxine AI platform for telepresence, as well as companies adopting NVIDIA ACE, a suite of technologies for bringing digital humans to life with generative AI. These technologies enable digital human development with AI models for speech and translation, vision, intelligence, realistic animation and behavior, and lifelike appearance.
Maxine features two AI technologies that enhance the digital human experience in telepresence scenarios: Maxine 3D and Audio2Face-2D.
Developers can harness Maxine and ACE technologies to drive more engaging and natural interactions for people using digital interfaces across customer service, gaming and other interactive experiences.
Tapping advanced AI, NVIDIA ACE technologies allow developers to design avatars that can respond to users in real time with lifelike animations, speech and emotions. RTX GPUs provide the necessary computational power and graphical fidelity to render ACE avatars with stunning detail and fluidity.
With ongoing advancements and increasing adoption, ACE is setting new benchmarks for building virtual worlds and sparking innovation across industries. Developers tapping into the power of ACE with RTX GPUs can build more immersive applications and advanced, AI-based, interactive digital media experiences.
RTX Updates Unleash AI-rtistry for Creators
NVIDIA GeForce RTX PCs and NVIDIA RTX workstations are getting an upgrade with GPU accelerations that provide users with enhanced AI content-creation experiences.
For video editors, RTX Video HDR is now available through Wondershare Filmora and DaVinci Resolve. With this technology, users can transform any content into high dynamic range video with richer colors and greater detail in light and dark scenes — making it ideal for gaming videos, travel vlogs or event filmmaking. Combining RTX Video HDR with RTX Video Super Resolution further improves visual quality by removing encoding artifacts and enhancing details.
RTX Video HDR requires an RTX GPU connected to an HDR10-compatible monitor or TV. Users with an RTX GPU-powered PC can send files to the Filmora desktop app and continue to edit with local RTX acceleration, doubling the speed of the export process with dual encoders on GeForce RTX 4070 Ti or above GPUs. In June, the popular media player VLC added support for RTX Video Super Resolution and RTX Video HDR, bringing AI-enhanced video playback.
In addition, 3D artists are gaining more AI applications and tools from Replikant, Adobe, Topaz and Getty Images that simplify and enhance workflows.
Replikant, an AI-assisted 3D animation platform, is integrating NVIDIA Audio2Face, an ACE technology, to enable improved lip sync and facial animation. By taking advantage of NVIDIA-accelerated generative models, users can enjoy real-time visuals enhanced by RTX and NVIDIA DLSS technology. Replikant is now available on Steam.
Adobe Substance 3D Modeler has added Search Asset Library by Shape, an AI-powered feature designed to streamline the replacement and enhancement of complex shapes using existing 3D models. This new capability significantly accelerates prototyping and enhances design workflows.
New AI features in Adobe Substance 3D integrate advanced generative AI capabilities, enhancing its texturing and material-creation tools. Adobe has launched the first integration of its Firefly generative AI capabilities into Substance 3D Sampler and Stager, making 3D workflows more seamless and productive for industrial designers, game developers and visual effects professionals.
For tasks like text-to-texture generation and prompt descriptions, Substance 3D users can generate photorealistic or stylized textures. These textures can then be applied directly to 3D models. The new Text to Texture and Generative Background features significantly accelerate traditionally time-consuming and intricate 3D texturing and staging tasks.
Powered by NVIDIA RTX Tensor Cores, Substance 3D can significantly accelerate computations and allows for more intuitive and creative design processes. This development builds on Adobe’s innovation with Firefly-powered Creative Cloud upgrades in Substance 3D workflows.
Topaz AI has added NVIDIA TensorRT acceleration for multi-GPU workflows, enabling parallelization across multiple GPUs for supercharged rendering speeds — up to 2x faster with two GPUs over a single GPU system, and scaling further with additional GPUs.
Getty Images has updated its Generative AI by iStock service with new features to enhance image generation and quality. Powered by NVIDIA Edify models, the latest update delivers generation speeds of around six seconds for four images, double the performance of the previous model and among the fastest in the industry. The improved Text-2-Image and Image-2-Image functionalities provide higher-quality results and greater adherence to user prompts.
Generative AI by iStock users can now also designate camera settings such as focal length (narrow, standard or wide) and depth of field (near or far). Improvements to generative AI super-resolution enhance image quality by using AI to create new pixels, significantly improving resolution without over-sharpening the image.
LLM-azing AI
ChatRTX — a tech demo that connects a large language model (LLM), like Meta’s Llama, to a user’s data for quickly querying notes, documents or images — is getting a user interface (UI) makeover, offering a cleaner, more polished experience.
ChatRTX also serves as an open-source reference project that shows developers how to build powerful, local, retrieval-augmented generation (RAG) applications accelerated by RTX.
ChatRTX is getting a user interface (UI) makeover.
The latest version of ChatRTX, released today, uses the Electron + Material UI framework, which lets developers more easily add their own UI elements or extend the technology’s functionality. The update also includes a new architecture that simplifies the integration of different UIs and streamlines the building of new chat and RAG applications on top of the ChatRTX backend application programming interface.
End users can download the latest version of ChatRTX from the ChatRTX web page. Developers can find the source code for the new release on the ChatRTX GitHub repository.
Meta Llama 3.1-8B models are now optimized for inference on NVIDIA GeForce RTX PCs and NVIDIA RTX workstations. These models are natively supported with NVIDIA TensorRT-LLM, open-source software that accelerates LLM inference performance.
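For developers who want to try such a model locally, the high-level LLM API in recent TensorRT-LLM releases offers a compact entry point. This is a minimal sketch only; exact class and argument names can differ between TensorRT-LLM versions, so check the release documentation.

```python
from tensorrt_llm import LLM, SamplingParams

# Build or load a TensorRT-LLM engine for Llama 3.1 8B Instruct
# (weights are fetched on first run if not cached locally).
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)
outputs = llm.generate(
    ["Explain retrieval-augmented generation in two sentences."], params
)

for out in outputs:
    print(out.outputs[0].text)
```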
Dell’s AI Chatbots: Harnessing RTX Rocket Fuel
Dell is presenting how enterprises can boost AI development with an optimized RAG chatbot using NVIDIA AI Workbench and an NVIDIA NIM microservice for Llama 3. Using the NVIDIA AI Workbench Hybrid RAG Project, Dell is demonstrating how the chatbot can be used to converse with enterprise data that’s embedded in a local vector database, with inference running in one of three ways:
Locally on a Hugging Face TGI server
In the cloud using NVIDIA inference endpoints
On self-hosted NVIDIA NIM microservices
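All three backends speak an OpenAI-compatible chat API (TGI via its Messages API, NVIDIA's hosted endpoints at integrate.api.nvidia.com, and NIM), so switching between them is largely a matter of changing the base URL. The sketch below illustrates the idea; the hostnames and model ID are placeholders, not the project's actual configuration.

```python
from openai import OpenAI

# Each backend exposes an OpenAI-compatible chat endpoint, so only the base URL changes.
BACKENDS = {
    "local_tgi": "http://localhost:8080/v1",                # Hugging Face TGI server
    "nvidia_cloud": "https://integrate.api.nvidia.com/v1",  # NVIDIA hosted inference endpoints
    "self_hosted_nim": "http://nim-host:8000/v1",           # placeholder NIM address
}

def ask(backend: str, prompt: str, model: str = "meta/llama3-8b-instruct") -> str:
    client = OpenAI(base_url=BACKENDS[backend], api_key="<key-or-placeholder>")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,
    )
    return resp.choices[0].message.content

print(ask("self_hosted_nim", "What does our vector database say about Q3 revenue?"))
```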
Learn more about the AI Workbench Hybrid RAG Project. SIGGRAPH attendees can experience this technology firsthand at Dell Technologies’ booth 301.
HP AI Studio: Innovate Faster With CUDA-X and Galileo
At SIGGRAPH, HP is presenting the Z by HP AI Studio, a centralized data science platform. Announced in October 2023, AI Studio has now been enhanced with the latest NVIDIA CUDA-X libraries as well as HP’s recent partnership with Galileo, a generative AI trust-layer company. Key benefits include:
Deploy projects faster: Configure, connect and share local and remote projects quickly.
Collaborate with ease: Access and share data, templates and experiments effortlessly.
Work your way: Choose where to work on your data, easily switching between online and offline modes.
Designed to enhance productivity and streamline AI development, AI Studio allows data science teams to focus on innovation. Visit HP’s booth 501 to see how AI Studio with RAPIDS cuDF can boost data preprocessing to accelerate AI pipelines. Apply for early access to AI Studio.
An RTX Speed Surge for Stable Diffusion
Stable Diffusion 3.0, the latest model from Stability AI, has been optimized with TensorRT to provide a 60% speedup.
A NIM microservice for Stable Diffusion 3 with optimized performance is available for preview on ai.nvidia.com.
There’s still time to join NVIDIA at SIGGRAPH to see how RTX AI is transforming the future of content creation and visual media experiences. The conference runs through Aug. 1.
Generative AI is transforming graphics and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.
In a highly anticipated fireside chat at SIGGRAPH 2024, NVIDIA founder and CEO Jensen Huang and Meta founder and CEO Mark Zuckerberg discussed the transformative potential of open source AI and AI assistants.
Zuckerberg kicked off the discussion by announcing the launch of AI Studio, a new platform that allows users to create, share and discover AI characters, making AI more accessible to millions of creators and small businesses.
“Every single restaurant, every single website will probably, in the future, have these AIs …” Huang said.
“…just like every business has an email address and a website and a social media account, I think, in the future, every business is going to have an AI,” Zuckerberg responded.
Zuckerberg has gotten it right before. Huang credited Zuckerberg and Meta with being leaders in AI, even if, until recently, only some had noticed.
“You guys have done amazing AI work,” Huang said, citing advancements from Meta in computer vision, language models and real-time translation. “We all use PyTorch, that comes out of Meta.”
The Importance of Open Source in Advancing AI
Zuckerberg highlighted the importance of open source in advancing AI — with the two business leaders emphasizing the importance of open platforms for innovation.
Meta has rapidly emerged as a leader in AI, putting it to work throughout its businesses — most notably with Meta AI, which is used across Facebook, Instagram and WhatsApp — and advancing open-source AI throughout the industry, most recently with the release of Llama 3.1.
The open-source model represents a significant investment of time and training resources. The largest version of Llama boasts 405 billion parameters and was trained on over 16,000 NVIDIA H100 GPUs.
“One of the things that drives quality improvements is it used to be that you have a different model for each type of content,” Zuckerberg explained.
“As the models get bigger and more general, that gets better and better. So, I kind of dream of one day like you can almost imagine all of Facebook or Instagram being like a single AI model that has unified all these different content types and systems together,” he added.
Zuckerberg sees collaboration as key to more advancements. In a blog post released last week, Zuckerberg wrote that the release of Llama 3.1 promises to be an “inflection point” in adopting open source in AI.
These advancements promise more tools to foster engagement, create compelling and personalized content — such as digital avatars — and build virtual worlds.
More broadly, the advancement of AI across a broad ecosystem promises to supercharge human productivity, for example by giving every person on Earth a digital assistant, or several, that they can interact with quickly and fluidly, allowing people to live richer lives.
“I feel like I’m collaborating with WhatsApp,” Huang said. “Imagine I’m sitting here typing, and it’s generating the images as I’m going. I go back and change my words, and it’s generating other images.”
Vision for the Future
Looking ahead, both CEOs shared their visions for the future.
Zuckerberg expressed optimism about bringing AI together with the real world through eyeglasses — noting his company’s collaboration with eyewear maker Luxottica — that can be used to help transform education, entertainment and work.
Huang emphasized how interacting with AIs is becoming more fluid, moving beyond just text-based interactions.
“Today’s AI is kind of turn-based. You say something, it says something back to you,” Huang said. In the future, AI could contemplate multiple options, or come up with a tree of options and simulate outcomes, making it much more powerful.
Throughout the conversation, the two leaders playfully bantered about everything from fashion to steak sandwiches, ending the discussion by exchanging leather jackets.
Zuckerberg gave Huang a black leather shearling jacket with an enormous hood.
Huang gave Zuckerberg his own leather jacket, which he got from his wife, Lori, just for SIGGRAPH, quipping that it was just “two hours old.”
“Well this one’s yours,” Zuckerberg said with a smile. “This is worth more because it’s used.”
The generative AI revolution — with deep roots in visual computing — is amplifying human creativity even as accelerated computing promises significant gains in energy efficiency, NVIDIA founder and CEO Jensen Huang said Monday.
That makes this week’s SIGGRAPH professional graphics conference, in Denver, the logical venue to discuss what’s next.
“Everybody will have an AI assistant,” Huang said. “Every single company, every single job within the company, will have AI assistance.”
But even as generative AI promises to amplify human productivity, Huang said the accelerated computing technology that underpins it promises to make computing more energy efficient.
“Accelerated computing helps you save so much energy, 20 times, 50 times, and doing the same processing,” Huang said. “The first thing we have to do, as a society, is accelerate every application we can: this reduces the amount of energy being used all over the world.”
The conversation follows a spate of announcements from NVIDIA today.
Among them: WPP, the world’s largest advertising agency, is using Omniverse-driven generative AI for The Coca-Cola Company, helping drive brand authenticity and showcasing practical applications of NVIDIA’s AI technology across industries.
Huang and WIRED Senior Writer Lauren Goode started their conversation by exploring how visual computing gave rise to everything from computer games to digital animation to GPU-accelerated computing and, most recently, generative AI powered by industrial-scale AI factories.
All these advancements build on one another. Robotics, for example, requires advanced AI and photorealistic virtual worlds where AI can be trained before being deployed into next-generation humanoid robots.
Huang explained that robotics requires three computers: one to train the AI, one to test the AI in a physically accurate simulation, and one within the robot itself.
“Just about every industry is going to be affected by this, whether it’s scientific computing trying to do a better job predicting the weather with a lot less energy, to augmenting and collaborating with creators to generate images, or generating virtual scenes for industrial visualization,” Huang said. “Robotic self-driving cars are all going to be transformed by generative AI.”
Likewise, NVIDIA Omniverse systems — built around the OpenUSD standard — will also be key to harnessing generative AI to create assets that the world’s largest brands can use.
By pulling from brand assets that live in Omniverse, these systems can capture and replicate carefully curated brand magic.
Finally, all these systems — visual computing, simulation and large-language models — will come together to create digital humans who can help people interact with digital systems of all kinds.
“One of the things that we’re announcing here this week is the concept of digital agents, digital AIs that will augment every single job in the company,” Huang said.
“And so one of the most important use cases that people are discovering is customer service,” Huang said. “In the future, my guess is that it’s going to be human still, but AI in the loop.”
All of this, like any new tool, promises to amplify human productivity and creativity. “Imagine the stories that you’re going to be able to tell with these tools,” Huang said.
When The Coca-Cola Company produces thirst-quenching marketing, the creative elements of campaigns aren’t just left to chance — there’s a recipe for the magic. Now, the beverage company, through its partnership with WPP Open X, is beginning to scale its global campaigns with generative AI from NVIDIA Omniverse and NVIDIA NIM microservices.
“With NVIDIA, we can personalize and customize Coke and meals imagery across 100-plus markets, delivering on hyperlocal relevance with speed and at global scale,” said Samir Bhutada, global vice president of StudioX Digital Transformation at The Coca-Cola Company.
Coca-Cola has been working with WPP to develop digital twin tools and roll out Prod X — a custom production studio experience created specifically for the beverage maker to use globally.
WPP announced today at SIGGRAPH that The Coca-Cola Company will be an early adopter for integrating the new NVIDIA NIM microservices for Universal Scene Description (aka OpenUSD) into its Prod X roadmap. OpenUSD is a 3D framework that enables interoperability between software tools and data types for building virtual worlds. NIM inference microservices provide models as optimized containers.
The USD Search NIM allows WPP to tap into a large archive of models to create on-brand assets, and the USD Code NIM can be used to assemble them into scenes.
These NIM microservices will enable Prod X users to create 3D advertising assets that contain culturally relevant elements on a global scale, using prompt engineering to quickly make adjustments to AI-generated images so that brands can better target their products at local markets.
Tapping Into NVIDIA NIM Microservices to Deploy Generative AI
WPP said that the NVIDIA NIM microservices will have a lasting impact on the 3D engineering and art world.
The USD Search NIM can make WPP’s massive visual asset libraries quickly available via written prompts. The USD Code NIM allows developers to enter prompts and get Python code to create novel 3D worlds.
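As a rough sketch of what calling the USD Code NIM could look like, the snippet below uses the OpenAI-compatible endpoint of NVIDIA's API catalog. The model identifier shown is an assumption for illustration and should be checked against the catalog listing.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="<NVIDIA_API_KEY>",
)

resp = client.chat.completions.create(
    model="nvidia/usdcode",  # hypothetical model identifier for the USD Code NIM
    messages=[{
        "role": "user",
        "content": "Write USD Python code that places three bottles on a shelf prim at /World/Shelf.",
    }],
    max_tokens=512,
)
print(resp.choices[0].message.content)  # returns Python that builds the requested scene
```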
“The beauty of the solution is that it compresses multiple phases of the production process into a single interface and process,” said Perry Nightingale, senior vice president of creative AI at WPP, of the new NIM microservices. “It empowers artists to get more out of the technology and create better work.”
Redefining Content Production With Production Studio
WPP recently announced the release of Production Studio on WPP Open, the company’s intelligent marketing operating system powered by AI. Co-developed with its production company, Hogarth, Production Studio taps into the Omniverse development platform and OpenUSD for its generative AI-enabled product configurator workflows.
Production Studio can streamline and automate multilingual text, image and video creation, simplifying content creation for advertisers and marketers and directly addressing the challenges advertisers continue to face in producing brand-compliant and product-accurate content at scale.
“Our groundbreaking research with NVIDIA Omniverse for the past few years, and the research and development associated with having built our own core USD pipeline and decades of experience in 3D workflows, is what made it possible for us to stand up a tailored experience like this for The Coca-Cola Company,” said Priti Mhatre, managing director for strategic consulting and AI at Hogarth.
NVIDIA founder and CEO Jensen Huang will also be featured at the event in fireside chats with Meta founder and CEO Mark Zuckerberg and WIRED Senior Writer Lauren Goode. Watch the talks and other sessions from NVIDIA at SIGGRAPH 2024 on demand.
Photo credit: WPP, The Coca-Cola Company
NVIDIA announced at SIGGRAPH fVDB, a new deep-learning framework for generating AI-ready virtual representations of the real world.
fVDB is built on top of OpenVDB, the industry-standard library for simulating and rendering sparse volumetric data such as water, fire, smoke and clouds.
Generative physical AI, such as autonomous vehicles and robots that inhabit the real world, needs to have “spatial intelligence” — the ability to understand and operate in 3D space.
Capturing the large scale and super-fine details of the world around us is essential. But converting reality into a virtual representation to train AI is hard.
Raw data for real-world environments can be collected through many different techniques, like neural radiance fields (NeRFs) and lidar. fVDB translates this data into massive, AI-ready environments rendered in real time.
Building on a decade of innovation in the OpenVDB standard, the introduction of fVDB at SIGGRAPH represents a significant leap forward in how industries can benefit from digital twins of the real world.
Reality-scale virtual environments are used for training autonomous agents. City-scale 3D models are captured by drones for climate science and disaster planning. Today, 3D generative AI is even used to plan urban spaces and smart cities.
fVDB enables industries to tap into spatial intelligence on a larger scale and with higher resolution than ever before, making physical AI even smarter.
The framework builds NVIDIA-accelerated AI operators on top of NanoVDB, a GPU-accelerated data structure for efficient 3D simulations. These operators include convolution, pooling, attention and meshing, all of which are designed for high-performance 3D deep learning applications.
AI operators allow businesses to build complex neural networks for spatial intelligence, like large-scale point cloud reconstruction and 3D generative modeling.
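Since fVDB is distributed as a PyTorch extension (see the early-access note below), such operators slot into familiar training code. The sketch that follows is purely illustrative: it uses dense PyTorch 3D ops as stand-ins for where fVDB's sparse, NanoVDB-backed convolution and pooling would go, and the class name is hypothetical.

```python
import torch
import torch.nn as nn

class VoxelEncoder(nn.Module):
    """Dense stand-in for a sparse 3D encoder; fVDB would supply sparse equivalents."""

    def __init__(self, in_channels: int = 3, hidden: int = 64):
        super().__init__()
        self.conv1 = nn.Conv3d(in_channels, hidden, kernel_size=3, padding=1)
        self.pool = nn.MaxPool3d(2)
        self.conv2 = nn.Conv3d(hidden, hidden, kernel_size=3, padding=1)

    def forward(self, voxel_grid: torch.Tensor) -> torch.Tensor:
        # voxel_grid: (N, C, D, H, W), a dense stand-in for a sparse VDB grid
        x = torch.relu(self.conv1(voxel_grid))
        x = self.pool(x)
        return torch.relu(self.conv2(x))

encoder = VoxelEncoder()
features = encoder(torch.rand(1, 3, 64, 64, 64))  # toy voxelized point-cloud chunk
print(features.shape)
```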
fVDB is the result of a long-running effort by NVIDIA’s research team and is already used to support NVIDIA Research, NVIDIA DRIVE and NVIDIA Omniverse projects that require high-fidelity models of large, complex real-world spaces.
Key Advantages of fVDB
Larger: 4x larger spatial scale than prior frameworks
Faster: 3.5x faster than prior frameworks
Interoperable: fVDB reads VDB datasets into full-sized, AI-ready 3D environments rendered in real time, letting businesses fully tap into massive real-world datasets for building physical AI with spatial intelligence.
More powerful: 10x more operators than prior frameworks. fVDB simplifies processes by combining functionalities that previously required multiple deep-learning libraries.
fVDB will soon be available as NVIDIA NIM inference microservices. A trio of the microservices will enable businesses to incorporate fVDB into OpenUSD workflows, generating AI-ready OpenUSD geometry in NVIDIA Omniverse, a development platform for industrial digitalization and generative physical AI applications. They are:
fVDB Mesh Generation NIM — Generates digital 3D environments of the real world
fVDB NeRF-XL NIM — Generates large-scale NeRFs in OpenUSD using Omniverse Cloud APIs
fVDB Physics Super-Res NIM — Performs super-resolution to generate an OpenUSD-based, high-resolution physics simulation
Over the past decade, OpenVDB, housed at the Academy Software Foundation, has earned multiple Academy Awards as a core technology used throughout the visual-effects industry. It has since grown beyond entertainment to industrial and scientific uses, like industrial design and robotics.
NVIDIA continues to enhance the open-source OpenVDB library. Four years ago, the company introduced NanoVDB, which added GPU support to OpenVDB. This delivered an order-of-magnitude speed-up, enabling faster performance and easier development, and opening the door to real-time simulation and rendering.
Two years ago, NVIDIA introduced NeuralVDB, which builds machine learning on top of NanoVDB to compress the memory footprint of VDB volumes up to 100x, allowing creators, developers and researchers to interact with extremely large and complex datasets.
fVDB builds AI operators on top of NanoVDB to unlock spatial intelligence at the scale of reality. Apply to the early-access program for the fVDB PyTorch extension. fVDB will also be available as part of the OpenVDB GitHub repository.
Dive deeper into fVDB in this technical blog and watch how accelerated computing and generative AI are transforming industries and creating new opportunities for innovation and growth in NVIDIA founder and CEO Jensen Huang’s two fireside chats at SIGGRAPH.
The world’s brands and agencies are using generative AI to create advertising and marketing content, but it doesn’t always provide the desired outputs.
NVIDIA offers a comprehensive set of technologies — bringing together generative AI, NVIDIA NIM microservices, NVIDIA Omniverse and Universal Scene Description (OpenUSD) — to allow developers to build applications and workflows that enable brand-accurate, targeted and efficient advertising at scale.
Once 3D scenes are constructed with these tools, they can be rendered to 2D images and used as input to direct an AI-powered image generator to create precise, brand-accurate visuals.
Global agencies, developers and production studios are tapping these technologies to revolutionize every aspect of the advertising process, from creative production and content supply chain to dynamic creative optimization.
Agencies and Service Providers Increase Adoption of Omniverse
The NVIDIA Omniverse development platform has seen widespread adoption for its ability to build accurate digital twins of products. These virtual replicas allow brands and agencies to create ultra-photorealistic and physically accurate 3D product configurators, helping to increase personalization, customer engagement and loyalty, and average selling prices, and reducing return rates.
Digital twins can also serve many purposes and be updated to meet shifting consumer preferences with minimal time, cost and effort, helping flexibly scale content production.
Image courtesy of Monks, Hatch.
Global marketing and technology services company Monks developed Monks.Flow, an AI-centric professional managed service that uses the Omniverse platform to help brands virtually explore different customizable product designs and unlock scale and hyper-personalization across any customer journey.
“NVIDIA Omniverse and OpenUSD’s interoperability accelerates connectivity between marketing, technology and product development,” said Lewis Smithingham, executive vice president of strategic industries at Monks. “Combining Omniverse with Monks’ streamlined marketing and technology services, we infuse AI throughout the product development pipeline and help accelerate technological and creative possibilities for clients.”
Collective World, a creative and technology company, is an early adopter of real-time 3D, OpenUSD and NVIDIA Omniverse, using them to create high-quality digital campaigns for customers like Unilever and EE. The technologies allow Collective to develop digital twins, delivering consistent, high-quality product content at scale to streamline advertising and marketing campaigns.
Building on its use of NVIDIA technologies, Collective World announced at SIGGRAPH that it has joined the NVIDIA Partner Network.
Product digital twin configurator and content generation tool built by Collective on NVIDIA Omniverse.
INDG is using Omniverse to introduce new capabilities into Grip, its popular software tool. Grip uses OpenUSD and generative AI to streamline and enhance the creation process, delivering stunning, high-fidelity marketing content faster than ever.
“This integration helps bring significant efficiencies to every brand by delivering seamless interoperability and enabling real-time visualization,” said Frans Vriendsendorp, CEO of INDG. “Harnessing the potential of USD to eliminate the lock-in to proprietary formats, the combination of Grip and Omniverse are helping set new standards in the realm of digital content creation.”
One of the world’s largest AI communities — comprising 4 million developers on the Hugging Face platform — is gaining easy access to NVIDIA-accelerated inference on some of the most popular AI models.
New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI models with optimization from NVIDIA NIM microservices running on NVIDIA DGX Cloud.
Announced today at the SIGGRAPH conference, the service will help developers quickly prototype with open-source AI models hosted on the Hugging Face Hub and deploy them in production. Enterprise Hub users can tap serverless inference for increased flexibility, minimal infrastructure overhead and optimized performance with NVIDIA NIM.
The inference service complements Train on DGX Cloud, an AI training service already available on Hugging Face.
Developers facing a growing number of open-source models can benefit from a hub where they can easily compare options. These training and inference tools give Hugging Face developers new ways to experiment with, test and deploy cutting-edge models on NVIDIA-accelerated infrastructure. They’re made easily accessible using the “Train” and “Deploy” drop-down menus on Hugging Face model cards, letting users get started with just a few clicks.
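For a sense of what that looks like in code, here is a minimal sketch using the huggingface_hub client. The model ID is one of the Llama 3 variants mentioned above; routing to serverless, NVIDIA-accelerated inference is assumed to happen on the Hugging Face side for eligible accounts.

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct", token="<HF_TOKEN>")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Give me three taglines for a robotics startup."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```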
Beyond a Token Gesture — NVIDIA NIM Brings Big Benefits
NVIDIA NIM is a collection of AI microservices — including NVIDIA AI foundation models and open-source community models — optimized for inference using industry-standard application programming interfaces, or APIs.
NIM offers users higher efficiency in processing tokens — the units of data used and generated by a language model. The optimized microservices also improve the efficiency of the underlying NVIDIA DGX Cloud infrastructure, which can increase the speed of critical AI applications.
This means developers see faster, more robust results from an AI model accessed as a NIM compared with other versions of the model. The 70-billion-parameter version of Llama 3, for example, delivers up to 5x higher throughput when accessed as a NIM compared with off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems.
Near-Instant Access to DGX Cloud Provides Accessible AI Acceleration
The NVIDIA DGX Cloud platform is purpose-built for generative AI, offering developers easy access to reliable accelerated computing infrastructure that can help them bring production-ready applications to market faster.
The platform provides scalable GPU resources that support every step of AI development, from prototype to production, without requiring developers to make long-term AI infrastructure commitments.
Hugging Face inference-as-a-service on NVIDIA DGX Cloud powered by NIM microservices offers easy access to compute resources that are optimized for AI deployment, enabling users to experiment with the latest AI models in an enterprise-grade environment.