October 2024 – Page 15

Zoom’s AI-First Transformation to Boost Business Productivity, Collaboration

Zoom, a company that helped change the way people work during the COVID-19 pandemic, is continuing to reimagine the future of work by transforming itself into an AI-first communications and productivity platform. In this episode of NVIDIA’s AI Podcast, Zoom CTO Xuedong (XD) Huang shares how the company is reshaping productivity with AI, including through its Zoom AI Companion 2.0, unveiled recently at the Zoomtopia conference. Designed to be a productivity partner, the AI companion is central to Zoom’s “federated AI” strategy, which focuses on integrating multiple large language models.

Huang also introduces the concept of “AUI,” combining conversational AI and graphical user interfaces (GUIs) to streamline collaboration and supercharge business performance.

The AI Podcast · Zoom’s AI-First Transformation to Boost Business Productivity, Collaboration – Ep. 235

Time Stamps

6:49: The fundamental capabilities of generative AI

8:20: Zoom’s approach to AI, including the use of small language models

11:20: Zoom’s “federated AI” strategy, integrating multiple AI models

13:10: Introducing the concept of “AUI.”

20:00: Huang on how AI will impact productivity and everyday tasks

29:00: How Zoom helps business leaders understand AI and return on investment of AI projects

32:50: Huang’s near-term outlook on the development of AI

You Might Also Like…

How SonicJobs Uses AI Agents to Connect the Internet, Starting With Jobs – Ep. 233

Mikhil Raja, cofounder and CEO of SonicJobs, shares how the company has built AI agents to enable candidates to complete applications directly on job platforms, without redirection, boosting completion rates. Raja delves deep into SonicJobs’ cutting-edge technology, which merges traditional AI with large language models, to understand and interact with job application web flows.

Yotta CEO Sunil Gupta on Supercharging India’s Fast-Growing AI Market – Ep. 225

Sunil Gupta, cofounder, managing director and CEO of Yotta Data Services, talks about the company’s Shakti Cloud offering, which provides scalable GPU services for enterprises. Gupta also shares insights on India’s potential as a major AI market and the importance of balancing data center growth with sustainability and energy efficiency.

Replit CEO Amjad Masad on Empowering the Next Billion Software Creators – Ep. 201

Amjad Masad, CEO of Replit, aims to bridge the gap between ideas and software using the latest advancements in generative AI. Masad talks about the future of AI and how it can function as a collaborator that can conduct high-level tasks and even manage resources.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

ExecuTorch Beta: On-Device AI and LLMs, Stability, and Acceleration with Partners

TLDR

ExecuTorch has achieved Beta status with the release of v0.4, providing stable APIs and runtime, as well as extensive kernel coverage.
ExecuTorch is the recommended on-device inference engine for Llama 3.2 1B/3B models, offering enhanced performance and memory efficiency for both original and quantized models.
There has been a significant increase in adoption and ecosystem growth for ExecuTorch, and the focus is now on improving reliability, performance, and coverage for non-CPU backends as the next steps.

Current On-Device AI Market

The on-device AI market has been rapidly expanding, and is revolutionizing the way we interact with technology. It is unlocking new experiences, enabling personalization, and reducing latency. Traditionally, computer vision and speech recognition have been the primary use-cases for on-device AI, particularly in IoT, industrial applications, and mobile devices. However, the emergence of Large Language Models (LLMs) has made Generative AI the fastest growing sector in AI, subsequently highlighting the importance of on-device Generative AI. IDC forecasts that close to 1 billion GenAI-capable smartphones will be shipped worldwide by 2028.

LLMs are getting not only smaller but also more powerful. This has led to the creation of a new class of applications that leverage multiple models for intelligent agents and streamlined workflows. The community is rapidly adopting and contributing to these new models, with quantized versions being created within hours of model release. Several leading technology companies are investing heavily in small LLMs, even deploying Low-Rank Adaptation (LoRA) at scale on-device to transform user experiences.

However, this rapid progress comes at a cost. The fragmentation of our on-device AI landscape creates complexity and inefficiency when going from model authoring to edge deployment. This is where PyTorch’s ExecuTorch comes in – our Beta announcement marks an important milestone in addressing these challenges and empowering developers to create innovative, AI-powered applications.

What’s New Today

It’s been exactly one year since we first open sourced ExecuTorch and six months since the Alpha release. Today, we’re excited to announce three main developments:

1. Beta. ExecuTorch has reached Beta status starting from v0.4! It is now widely adopted and used in production environments across Meta. Through this adoption process we’ve identified and addressed feature gaps, improved stability, and expanded kernel and accelerator coverage. These improvements make us confident to promote ExecuTorch from Alpha to Beta status, and we are happy to welcome the community to adopt it in their own production settings. Here are three concrete enhancements:

Developers can write application code and include the latest ExecuTorch as a dependency, updating when needed with a clean API contract. This is possible due to our API stabilization efforts, as well as our explicit API lifecycle and backwards compatibility policy.
ExecuTorch on CPUs has reached the necessary performance, portability and coverage. In particular, we have implemented more than 85% of all core ATen operators as part of our portable CPU kernels library, so that running a model on ExecuTorch just works in most cases and missing ops are the exception rather than the norm. Moreover, we integrated and extensively tested our XNNPACK delegate for high performance on a wide range of CPU architectures. It is used in a number of production cases today.
In addition to the low-level ExecuTorch components for greater portability, we built extensions and higher-level abstractions to support more common use-cases such as developer tooling to support on-device debugging and profiling, and Module.h extension to simplify deployment for mobile devices.
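The goal of making missing ops the exception can be pictured as a set comparison between the operators a model uses and the operators a kernel library implements. The following is a minimal sketch under stated assumptions, not ExecuTorch code: `SUPPORTED_OPS` and `coverage_report` are hypothetical names, and the operator strings are only illustrative examples written in ATen style.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical set of operators implemented by a portable kernel library.
# These names are illustrative; they are not ExecuTorch's actual kernel list.
SUPPORTED_OPS: FrozenSet[str] = frozenset({
    "aten.add.Tensor",
    "aten.mul.Tensor",
    "aten.mm.default",
    "aten.relu.default",
})


def coverage_report(
    model_ops: Iterable[str],
    supported: FrozenSet[str] = SUPPORTED_OPS,
) -> Tuple[float, List[str]]:
    """Return (fraction of the model's distinct ops that are supported, missing ops)."""
    used = set(model_ops)
    if not used:
        # A model with no operators is trivially fully covered.
        return 1.0, []
    missing = sorted(used - supported)
    coverage = (len(used) - len(missing)) / len(used)
    return coverage, missing


# Hypothetical model that uses four distinct operators, one of them unsupported.
model_ops = [
    "aten.mm.default",
    "aten.add.Tensor",
    "aten.relu.default",
    "aten.gelu.default",
]
coverage, missing = coverage_report(model_ops)
print(f"coverage: {coverage:.0%}, missing: {missing}")
```

A check like this, run before deployment, would flag the single missing operator in the hypothetical model so it can be handled by a custom kernel or a delegate.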

2. On-Device Large-Language Models (LLMs). There has been growing interest in the community in deploying LLMs on edge devices, which offers improved privacy and offline capabilities. However, these models are quite large, pushing the limits of what is possible. Fortunately, ExecuTorch can support these models, and we’ve enhanced the overall framework with numerous optimizations.

ExecuTorch is the recommended framework for running the latest Llama models on-device with excellent performance today. The Llama 3.2 1B/3B models are well-suited for mobile deployment. This is especially true of the official quantized 1B/3B model releases from Meta, which provide a great balance between performance, accuracy, and size. When deploying Llama 3.2 1B/3B quantized models, decode latency improved by 2.5x and prefill latency improved by 4.2x on average, while model size decreased by 56% and memory usage was reduced by 41% on average when benchmarked on an Android OnePlus 12 device (we’ve also verified similar relative performance on Samsung S24+ for 1B and 3B, and Samsung S22 for 1B). For the Llama 3.2 1B quantized model, for example, ExecuTorch is able to achieve 50.2 tokens/s for decoding and 260 tokens/s for prefill on the OnePlus 12, using the latest CPU kernels from XNNPACK and Kleidi libraries. These quantized models allow developers to integrate LLMs into memory and power-constrained devices while still maintaining quality and safety.
One of the value propositions of ExecuTorch is being able to use accelerators on mobile devices seamlessly. In fact, ExecuTorch has also been showcased with accelerators to achieve even greater performance running Llama across the Apple MPS backend, Qualcomm AI Accelerator, and MediaTek AI Accelerator.
There has been growing community and industry interest in multimodal models that go beyond text-only LLMs, evidenced by Meta’s Llama 3.2 11B/90B vision models and open-source models like Llava. We have so far enabled the Llava 1.5 7B model on phones via ExecuTorch, making many optimizations, notably reducing runtime memory from 11GB all the way down to 5GB.
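To get a feel for what the reported throughput figures mean for a user, the sketch below turns them into a rough end-to-end latency estimate and computes the relative memory saving for the Llava example. It is only back-of-the-envelope arithmetic: it assumes throughput stays constant regardless of prompt or output length, which real devices will not guarantee, and the function names and the 520/100 token counts are hypothetical.

```python
# Reported rates for the Llama 3.2 1B quantized model on the OnePlus 12.
PREFILL_TOKENS_PER_S = 260.0
DECODE_TOKENS_PER_S = 50.2


def estimate_latency_s(prompt_tokens: int, new_tokens: int) -> float:
    """Rough latency: time to prefill the prompt plus time to decode the output.

    Assumes constant throughput, which is a simplification.
    """
    prefill_s = prompt_tokens / PREFILL_TOKENS_PER_S
    decode_s = new_tokens / DECODE_TOKENS_PER_S
    return prefill_s + decode_s


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity shrank, e.g. 0.5 for a 50% reduction."""
    return (before - after) / before


# Hypothetical request: a 520-token prompt followed by 100 generated tokens.
total_s = estimate_latency_s(prompt_tokens=520, new_tokens=100)
print(f"estimated latency: {total_s:.2f} s")

# Llava 1.5 7B runtime memory went from 11GB to 5GB.
llava_saving = relative_reduction(before=11.0, after=5.0)
print(f"Llava runtime memory reduction: {llava_saving:.1%}")
```

Under these assumptions the prompt takes about two seconds to prefill and the response about two seconds to decode, and the Llava memory reduction works out to a little over half.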

3. Ecosystem and Community Adoption
Now that ExecuTorch is in Beta, it is mature enough to be used in production. It is being increasingly used at Meta across various product surfaces. For instance, ExecuTorch already powers various ML inference use cases across Meta’s Ray-Ban Meta Smart Glasses and Quest 3 VR headsets as well as Instagram and WhatsApp.

We also partnered with Hugging Face to provide native ExecuTorch support for models being exported using torch.export. This collaboration ensures exported artifacts can directly be lowered and run efficiently on various mobile and edge devices. Models like gemma-2b and phi3-mini are already supported, and support for more foundation models is in progress.

With stable APIs and Gen AI support, we’re excited to build and grow ExecuTorch with the community. The on-device AI community is growing rapidly and finding ways to adopt ExecuTorch across various fields. For instance, ExecuTorch is being used in a mobile app built by Digica to streamline inventory management in hospitals. As another example, Software Mansion developed an app, EraserAI, to remove unwanted objects from a photo with EfficientSAM running on-device with ExecuTorch via Core ML delegate.

Towards General Availability (GA):
Since the original release of ExecuTorch alpha, we’ve seen a growing interest within the community in using ExecuTorch in various production environments. To that end, we have made great progress towards more stabilized and matured APIs and have made a significant investment in community support, adoption and contribution to ExecuTorch. As we are getting close to GA, we are investing our efforts in the following areas:

Non-CPU backends: Bringing non-CPU backends to even greater robustness, coverage and performance is our next goal. From day one of our original launch, we have partnered with Apple (for Core ML and MPS), Arm (for EthosU NPU) and Qualcomm (for Hexagon NPU) on accelerator integration with ExecuTorch, and we’ve since then expanded our partnership to MediaTek (NPU) and Cadence (XTensa DSP). We’re also building Vulkan GPU integration in-house. In terms of feature coverage, we’ve successfully implemented the core functionalities with our partners, ensured seamless integration with our developer tooling, and showcased successful LLM integration with many of the accelerators. Our next big step is to thoroughly validate the performance and reliability of the system in real-world, production use-cases. This stage will help us fine-tune the experience and ensure the stability needed for smooth operations.
Benchmarking infra: As part of our ongoing testing efforts, we’ve developed a benchmarking infrastructure along with a public dashboard to showcase our progress toward on-device model inference benchmarking. This allows us to transparently track and display model coverage across various backends, giving our community real-time insights into how we’re advancing towards our goals.
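The kind of number such a benchmarking setup reports, tokens per second, can be measured with a very small harness. The sketch below is a generic illustration and not the project's benchmarking infrastructure: `fake_decode` is a hypothetical stand-in for a real model's decode loop, and the clock is passed in as a parameter so the measurement can be tested deterministically.

```python
import time
from typing import Callable, Iterable, Iterator


def fake_decode(num_tokens: int) -> Iterator[str]:
    """Hypothetical stand-in for a decode loop; yields placeholder tokens."""
    for i in range(num_tokens):
        yield f"tok{i}"


def tokens_per_second(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume the stream and return tokens produced per second of wall time."""
    start = clock()
    count = sum(1 for _ in token_stream)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive to compute a rate")
    return count / elapsed


# Results keyed by (model, backend), the shape a dashboard might display.
# The model and backend labels here are placeholders.
results = {("toy-model", "cpu"): tokens_per_second(fake_decode(10_000))}
for (model, backend), rate in results.items():
    print(f"{model} on {backend}: {rate:.0f} tokens/s")
```

Injecting the clock keeps the timing logic separate from the system timer, so the same function can be exercised with fixed timestamps in a unit test.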

We’re excited to share these developments with you and look forward to continued improvements in collaboration with our partners and the community! We welcome community contributions to help us make ExecuTorch the clear choice for deploying AI models and LLMs on-device. We invite you to start using ExecuTorch in your on-device projects or, even better, to consider contributing to it. You can also report any issues on our GitHub page.

The Three Computer Solution: Powering the Next Wave of AI Robotics

ChatGPT marked the big bang moment of generative AI. Answers can be generated in response to nearly any query, helping transform digital work such as content creation, customer service, software development and business operations for knowledge workers.

Physical AI, the embodiment of artificial intelligence in humanoids, factories and other devices within industrial systems, has yet to experience its breakthrough moment.

This has held back industries such as transportation and mobility, manufacturing, logistics and robotics. But that’s about to change thanks to three computers bringing together advanced training, simulation and inference.

The Rise of Multimodal, Physical AI

For 60 years, “Software 1.0” — serial code written by human programmers — ran on general-purpose computers powered by CPUs.

Then, in 2012, Alex Krizhevsky, mentored by Ilya Sutskever and Geoffrey Hinton, won the ImageNet computer image recognition competition with AlexNet, a revolutionary deep learning model for image classification.

This marked the industry’s first contact with AI. The breakthrough of machine learning — neural networks running on GPUs — jump-started the era of Software 2.0.

Today, software writes software. The world’s computing workloads are shifting from general-purpose computing on CPUs to accelerated computing on GPUs, leaving Moore’s law far behind.

With generative AI, multimodal transformer and diffusion models have been trained to generate responses.

Large language models are one-dimensional, able to predict the next token, in modes like letters or words. Image- and video-generation models are two-dimensional, able to predict the next pixel.

None of these models can understand or interpret the three-dimensional world. And that’s where physical AI comes in.

Physical AI models can perceive, understand, interact with and navigate the physical world with generative AI. With accelerated computing, multimodal physical AI breakthroughs and large-scale physically based simulations are allowing the world to realize the value of physical AI through robots.

A robot is a system that can perceive, reason, plan, act and learn. Robots are often thought of as autonomous mobile robots (AMRs), manipulator arms or humanoids. But there are many more types of robotic embodiments.

In the near future, everything that moves, or that monitors things that move, will be autonomous robotic systems. These systems will be capable of sensing and responding to their environments.

Everything from surgical rooms to data centers, warehouses to factories, even traffic control systems or entire smart cities will transform from static, manually operated systems to autonomous, interactive systems embodied by physical AI.

The Next Frontier: Humanoid Robots

Humanoid robots are an ideal general-purpose robotic manifestation because they can operate efficiently in environments built for humans, while requiring minimal adjustments for deployment and operation.

The global market for humanoid robots is expected to reach $38 billion by 2035, a more than sixfold increase from the roughly $6 billion for the period forecast nearly two years ago, according to Goldman Sachs.

Researchers and developers around the world are racing to build this next wave of robots.

Three Computers to Develop Physical AI

To develop humanoid robots, three accelerated computer systems are required to handle physical AI and robot training, simulation and runtime. Two computing advancements are accelerating humanoid robot development: multimodal foundation models and scalable, physically based simulations of robots and their worlds.

Breakthroughs in generative AI are bringing 3D perception, control, skill planning and intelligence to robots. Robot simulation at scale lets developers refine, test and optimize robot skills in a virtual world that mimics the laws of physics — helping reduce real-world data acquisition costs and ensuring they can perform in safe, controlled settings.

NVIDIA has built three computers and accelerated development platforms to enable developers to create physical AI.

First, models are trained on a supercomputer. Developers can use NVIDIA NeMo on the NVIDIA DGX platform to train and fine-tune powerful foundation and generative AI models. They can also tap into NVIDIA Project GR00T, an initiative to develop general-purpose foundation models for humanoid robots to enable them to understand natural language and emulate movements by observing human actions.

Second, NVIDIA Omniverse, running on NVIDIA OVX servers, provides the development platform and simulation environment for testing and optimizing physical AI with application programming interfaces and frameworks like NVIDIA Isaac Sim.

Developers can use Isaac Sim to simulate and validate robot models, or generate massive amounts of physically-based synthetic data to bootstrap robot model training. Researchers and developers can also use NVIDIA Isaac Lab, an open-source robot learning framework that powers robot reinforcement learning and imitation learning, to help accelerate robot policy training and refinement.

Lastly, trained AI models are deployed to a runtime computer. NVIDIA Jetson Thor robotics computers are specifically designed for compact, on-board computing needs. An ensemble of models consisting of control policy, vision and language models composes the robot brain and is deployed on a power-efficient, on-board edge computing system.

Depending on their workflows and challenge areas, robot makers and foundation model developers can use as many of the accelerated computing platforms and systems as needed.

Building the Next Wave of Autonomous Facilities

Robotic facilities result from a culmination of all of these technologies.

Manufacturers like Foxconn or logistics companies like Amazon Robotics can orchestrate teams of autonomous robots to work alongside human workers and monitor factory operations through hundreds or thousands of sensors.

These autonomous warehouses, plants and factories will have digital twins. The digital twins are used for layout planning and optimization, operations simulation and, most importantly, robot fleet software-in-the-loop testing.

Built on Omniverse, “Mega” is a blueprint for factory digital twins that enables industrial enterprises to test and optimize their robot fleets in simulation before deploying them to physical factories. This helps ensure seamless integration, optimal performance and minimal disruption.

Mega lets developers populate their factory digital twins with virtual robots and their AI models, or the brains of the robots. Robots in the digital twin execute tasks by perceiving their environment, reasoning, planning their next motion and, finally, completing planned actions.

These actions are simulated in the digital environment by the world simulator in Omniverse, and the results are perceived by the robot brains through Omniverse sensor simulation.

With sensor simulations, the robot brains decide the next action, and the loop continues, all while Mega meticulously tracks the state and position of every element within the factory digital twin.
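The perceive, decide, act and track cycle described above can be sketched as a toy loop. This is a deliberately simplified illustration, not Omniverse or Mega code: `WorldSim`, `robot_brain` and `run_loop` are hypothetical names, and the "factory" is a single robot moving along a one-dimensional track.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorldSim:
    """Toy stand-in for a world simulator: one robot on a 1-D track."""
    position: int = 0

    def apply(self, action: int) -> None:
        """Simulate the effect of an action on the world."""
        self.position += action

    def sense(self) -> int:
        """Toy sensor simulation: report the robot's current position."""
        return self.position


def robot_brain(observation: int, goal: int) -> int:
    """Decide the next action: step toward the goal, or 0 to stop."""
    if observation < goal:
        return 1
    if observation > goal:
        return -1
    return 0


def run_loop(goal: int, max_steps: int = 100) -> List[int]:
    """Run the software-in-the-loop cycle and return the tracked positions."""
    world = WorldSim()
    history = [world.sense()]  # track the state at every step
    for _ in range(max_steps):
        action = robot_brain(world.sense(), goal)  # perceive and decide
        if action == 0:
            break  # goal reached, nothing left to do
        world.apply(action)  # act in the simulated world
        history.append(world.sense())  # record the resulting state
    return history


print(run_loop(goal=3))  # prints [0, 1, 2, 3]
```

Because the brain only ever sees what the simulated sensor reports, the same decision code could in principle be pointed at a real sensor feed later, which is the appeal of testing in the loop first.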

This advanced software-in-the-loop testing methodology enables industrial enterprises to simulate and validate changes within the safe confines of the Omniverse digital twin, helping them anticipate and mitigate potential issues to reduce risk and costs during real-world deployment.

Empowering the Developer Ecosystem With NVIDIA Technology

NVIDIA accelerates the work of the global ecosystem of robotics developers and robot foundation model builders with three computers.

Universal Robots, a Teradyne Robotics company, used NVIDIA Isaac Manipulator, Isaac accelerated libraries and AI models, and NVIDIA Jetson Orin to build UR AI Accelerator, a ready-to-use hardware and software toolkit that enables cobot developers to build applications, accelerate development and reduce the time to market of AI products.

RGo Robotics used NVIDIA Isaac Perceptor to help its wheel.me AMRs work everywhere, all the time, and make intelligent decisions by giving them human-like perception and visual-spatial information.

Humanoid robot makers including 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Fourier, Galbot, Mentee, Sanctuary AI, Unitree Robotics and XPENG Robotics are adopting NVIDIA’s robotics development platform.

Boston Dynamics is using Isaac Sim and Isaac Lab to build quadrupeds and humanoid robots to augment human productivity, tackle labor shortages and prioritize safety in warehouses.

Fourier is tapping into Isaac Sim to train humanoid robots to operate in fields that demand high levels of interaction and adaptability, such as scientific research, healthcare and manufacturing.

Using Isaac Lab and Isaac Sim, Galbot advanced the development of a large-scale robotic dexterous grasp dataset called DexGraspNet that can be applied to different dexterous robotic hands, as well as a simulation environment for evaluating dexterous grasping models.

Field AI developed risk-bounded multitask and multipurpose foundation models for robots to safely operate in outdoor field environments, using the Isaac platform and Isaac Lab.

The era of physical AI is here — and it’s transforming the world’s heavy industries and robotics.

Get started with NVIDIA Robotics.

The Three Computer Solution: Powering the Next Wave of AI Robotics

Physical AI, the embodiment of artificial intelligence in humanoids, factories and other devices within industrial systems, has yet to experience its breakthrough moment.

The Rise of Multimodal, Physical AI

For 60 years, “Software 1.0” — serial code written by human programmers — ran on general-purpose computers powered by CPUs.

This marked the industry’s first contact with AI. The breakthrough of machine learning — neural networks running on GPUs — jump-started the era of Software 2.0.

Today, software writes software. The world’s computing workloads are shifting from general-purpose computing on CPUs to accelerated computing on GPUs, leaving Moore’s law far behind.

With generative AI, multimodal transformer and diffusion models have been trained to generate responses.

Large language models are one-dimensional, able to predict the next token, in modes like letters or words. Image- and video-generation models are two-dimensional, able to predict the next pixel.

None of these models can understand or interpret the three-dimensional world. And that’s where physical AI comes in.

In the near future, everything that moves, or that monitors things that move, will be autonomous robotic systems. These systems will be capable of sensing and responding to their environments.

The Next Frontier: Humanoids Robots

Researchers and developers around the world are racing to build this next wave of robots.

Three Computers to Develop Physical AI

NVIDIA has built three computers and accelerated development platforms to enable developers to create physical AI.

Depending on their workflows and challenge areas, robot makers and foundation model developers can use as many of the accelerated computing platforms and systems as needed.

Building the Next Wave of Autonomous Facilities

Robotic facilities result from a culmination of all of these technologies.

These actions are simulated in the digital environment by the world simulator in Omniverse, and the results are perceived by the robot brains through Omniverse sensor simulation.

With sensor simulations, the robot brains decide the next action, and the loop continues, all while Mega meticulously tracks the state and position of every element within the factory digital twin.

Empowering the Developer Ecosystem With NVIDIA Technology

NVIDIA accelerates the work of the global ecosystem of robotics developers and robot foundation model builders with three computers.

RGo Robotics used NVIDIA Isaac Perceptor to help its wheel.me AMRs work everywhere, all the time, and make intelligent decisions by giving them human-like perception and visual-spatial information.

Boston Dynamics is using Isaac Sim and Isaac Lab to build quadrupeds and humanoid robots to augment human productivity, tackle labor shortages and prioritize safety in warehouses.

Fourier is tapping into Isaac Sim to train humanoid robots to operate in fields that demand high levels of interaction and adaptability, such as scientific research, healthcare and manufacturing.

Field AI developed risk-bounded multitask and multipurpose foundation models for robots to safely operate in outdoor field environments, using the Isaac platform and Isaac Lab.

The era of physical AI is here — and it’s transforming the world’s heavy industries and robotics.

Get started with NVIDIA Robotics.

The Three Computer Solution: Powering the Next Wave of AI Robotics

Physical AI, the embodiment of artificial intelligence in humanoids, factories and other devices within industrial systems, has yet to experience its breakthrough moment.

The Rise of Multimodal, Physical AI

For 60 years, “Software 1.0” — serial code written by human programmers — ran on general-purpose computers powered by CPUs.

This marked the industry’s first contact with AI. The breakthrough of machine learning — neural networks running on GPUs — jump-started the era of Software 2.0.

Today, software writes software. The world’s computing workloads are shifting from general-purpose computing on CPUs to accelerated computing on GPUs, leaving Moore’s law far behind.

With generative AI, multimodal transformer and diffusion models have been trained to generate responses.

Large language models are one-dimensional, able to predict the next token, in modes like letters or words. Image- and video-generation models are two-dimensional, able to predict the next pixel.

None of these models can understand or interpret the three-dimensional world. And that’s where physical AI comes in.

In the near future, everything that moves, or that monitors things that move, will be autonomous robotic systems. These systems will be capable of sensing and responding to their environments.

The Next Frontier: Humanoid Robots

Researchers and developers around the world are racing to build this next wave of robots.

Three Computers to Develop Physical AI

NVIDIA has built three computers and accelerated development platforms to enable developers to create physical AI.

Depending on their workflows and challenge areas, robot makers and foundation model developers can use as many of the accelerated computing platforms and systems as needed.

Building the Next Wave of Autonomous Facilities

Robotic facilities result from a culmination of all of these technologies.

These actions are simulated in the digital environment by the world simulator in Omniverse, and the results are perceived by the robot brains through Omniverse sensor simulation.




Copyright © 2019-2025 Vedere AI. All Rights Reserved.