NVIDIA Isaac Nova Orin Opens New Era of Innovation for Autonomous Mobile Robots

Next-day packages. New vehicle deliveries. Fresh organic produce. Each of these modern conveniences is accelerated by fleets of mobile robots.

NVIDIA today is announcing updates to Nova Orin — an autonomous mobile robot (AMR) reference platform — that advance its roadmap. We’re releasing details of three reference platform configurations. Two use a single Jetson AGX Orin — which runs the NVIDIA Isaac robotics stack and GPU-accelerated Robot Operating System (ROS) packages — and one relies on two Orin modules.

The Nova Orin platform is designed to improve reliability and reduce development costs worldwide for building and deploying AMRs.

AMRs are like self-driving cars but for unstructured environments. They don’t need fixed, preprogrammed tracks and are capable of avoiding obstacles. This makes them ideal in logistics for moving items in warehouses, distribution centers and factories, or for applications in hospitality, cleaning, roaming security and last-mile delivery.

For years, AMR manufacturers have been designing robots by sourcing and integrating compute hardware, software and sensors in house. This time-consuming effort demands years of engineering resources, lengthens go-to-market pipelines and distracts from developing domain-specific applications.

Nova Orin offers a better way forward with tested, industrial-grade configurations of sensors, software and GPU-computing capabilities. Tapping into the NVIDIA AI platform frees developers to focus on building their unique software stack of robot applications.

Much is at stake for intralogistics enabled by AMRs across industries, a market expected to increase nearly 6x to $46 billion by 2030, up from $8 billion in 2021, according to estimates from ABI Research.

Designing a Highly Capable, Flexible Reference Architecture 

The Nova Orin reference architecture designs are provided for specific use cases. One single-Orin design omits safety-certified sensors; another includes them, along with a safety programmable logic controller. The third is a dual-Orin design that relies on vision AI to enable functional safety.

Sensor support is included for stereo cameras, lidars, ultrasonic sensors and inertial measurement units. The sensors were selected to balance performance, price and reliability for industrial applications. The suite of sensors provides the multimodal diversity of coverage required for developing and deploying safe and collaborative AMRs.

The stereo cameras and fisheye cameras are custom designed by NVIDIA in coordination with camera partners. All sensors are calibrated and time synchronized, and come with drivers for reliable data capture. These sensors allow AMRs to detect objects and obstacles across a wide range of situations while also enabling simultaneous localization and mapping (SLAM).

NVIDIA provides two lidar options, one for applications that don’t need sensors certified for functional safety, and the other for those that do. In addition to these 2D lidars, Nova Orin supports 3D lidar for mapping and ground-truth data collection.

Building a Comprehensive AI Platform for OEMs, ISVs

NVIDIA is driving the Nova Orin platform forward with extensive software support in addition to the hardware and integration tools.

The base OS includes drivers and firmware for all the hardware, along with adaptation tools and design guides for integrating the platform with robots. Nova can be integrated easily with ROS-based robot applications.

The sensors will have validated models in Isaac Sim for application development and testing without the need for an actual robot.

The cloud-native data acquisition tools eliminate the arduous task of setting up data pipelines for the vast amount of sensor data needed for training models, debugging and analytics. State-of-the-art GEMs developed for Nova sensors are GPU accelerated with the Jetson Orin platform, providing key building blocks such as visual SLAM, stereo depth estimation, obstacle detection, 3D reconstruction, semantic segmentation and pose estimation.

Nova Orin also addresses the need to quickly create high-fidelity, city-scale 3D maps for indoor environments in the cloud. These generated maps allow robot navigation, fleet planning and simulation. Plus, the maps can be continuously updated using data from the robots.

AMRs That Are Ready for Industries

As robotics systems evolve, the need for secure deployment and management of the critical AI software on board is paramount for future AMRs.

Nova Orin supports secure over-the-air updates, as well as device management and monitoring, to enable easy deployment and reduce the cost of maintenance. Its open, modular design enables developers to use some or all capabilities of the platform and extend it to quickly develop robotics applications.

NVIDIA is working closely with regulatory bodies to develop vision-enabled safety technology to further reduce the cost and improve reliability of AMRs. And we’re providing a software development kit for navigation, so developers can quickly develop applications.

Improving productivity for factories and warehouses will depend on AMRs working safely and efficiently side by side at scale. High levels of autonomy driven by 3D perception from Nova Orin will help drive that revolution.

Learn more about Nova Orin and sign up to be notified of its availability.


On Track: Digitale Schiene Deutschland Building Digital Twin of Rail Network in NVIDIA Omniverse

Deutsche Bahn’s rail network consists of 5,700 stations and 33,000 kilometers of track, making it the largest in Western Europe.

Digitale Schiene Deutschland (Digital Rail for Germany, or DSD), part of Germany’s national railway operator Deutsche Bahn, is working to increase the network’s capacity without building new tracks. It’s striving to create a powerful railway system in which trains are automated, safely run with less headway between each other and are optimally steered through the network.

In collaboration with NVIDIA, DSD is beginning to build the first country-scale digital twin to fully simulate automatic train operation across an entire network. That means creating a photorealistic and physically accurate emulation of the entire rail system. It will include tracks running through cities and countrysides, and many details from sources such as station platform measurements and vehicle sensors.

Using the AI-enabled digital twin created with NVIDIA Omniverse, DSD can develop highly capable perception and incident prevention and management systems to optimally detect and react to irregular situations during day-to-day railway operation.

“With NVIDIA technologies, we’re able to begin realizing the vision of a fully automated train network,” said Ruben Schilling, who leads the perception group at DB Netz, part of Deutsche Bahn. The envisioned future railway system improves the capacity, quality and efficiency of the network.

This is the basis for satisfied passengers and cargo customers, leading to more traffic on the tracks and thereby reducing the carbon footprint of the mobility sector.

Data, Data and More Data

Creating a digital twin at such a large scale is a massive undertaking. It needs a custom-built 3D pipeline that connects computer-aided design datasets that are built, for example, within the Siemens JT ecosystem with DSD’s high-definition 3D maps and various simulation tools. Using the Universal Scene Description 3D framework, DSD can connect and combine data sources into a single shared virtual model.

With its network perfectly synchronized with the real world, DSD can run optimization tests and “what if” scenarios to test and validate changes in the railway system, such as reactions to unforeseen situations.

Running on NVIDIA OVX, the computing system for running Omniverse simulations, DSD will be able to operate the persistent simulation, which is regularly improved by data stream updates from the physical world.

Watch the demo to see the digital twin in action.

Future computer vision-powered systems could continually perform route observation and incident recognition, automatically warning of and reacting to potential hazards.

The AI sensor models will be trained and optimized with a combination of real-world and synthetic data, some of which will be generated by the Omniverse Replicator software development kit framework. This will ensure models can perceive, plan and act when faced with everyday and unexpected scenarios.

The Future of Rail

With its pioneering approach to rail network optimization, DSD is contributing to the future of Europe’s rail system and industry development. Sharing its data pool across countries allows for continuous improvement and deployment across future vehicles, resulting in the highest possible quality while reducing costs.

Watch the GTC keynote on demand to see all of NVIDIA’s latest announcements, and register free for the conference, running through Thursday, Sept. 22, to explore how digital twins are transforming industries.


Reinventing Retail: Lowe’s Teams With NVIDIA and Magic Leap to Create Interactive Store Digital Twins

With tens of millions of weekly transactions across its more than 2,000 stores, Lowe’s helps customers achieve their home-improvement goals. Now, the Fortune 50 retailer is experimenting with high-tech methods to elevate both the associate and customer experience.

Using NVIDIA Omniverse Enterprise to visualize and interact with a store’s digital data, Lowe’s is testing digital twins in Mill Creek, Wash., and Charlotte, N.C. Its ultimate goal is to empower its retail associates to better serve customers, collaborate with one another in new ways and optimize store operations.

“At Lowe’s, we are always looking for ways to reimagine store operations and remove friction for our customers,” said Seemantini Godbole, executive vice president and chief digital and information officer at Lowe’s. “With NVIDIA Omniverse, we’re pulling data together in ways that have never been possible, giving our associates superpowers.”

Augmented Reality Restocking and ‘X-Ray Vision’

With its interactive digital twin, Lowe’s is exploring a variety of novel augmented reality use cases, including reconfiguring layouts, restocking support, real-time collaboration and what it calls “X-ray vision.”

Wearing a Magic Leap 2 AR headset, store associates can interact with the digital twin. This AR experience helps an associate compare what a store shelf should look like with what it actually looks like, and ensure it’s stocked with the right products in the right configurations.

And this isn’t just a single-player activity. Store associates on the ground can communicate and collaborate with centralized store planners via AR. For example, if a store associate notices an improvement that could be made to a proposed planogram for their store, they can flag it on the digital twin with an AR “sticky note.”

Lastly, a benefit of the digital twin and Magic Leap 2 headset is the ability to explore “X-ray vision.” Traditionally, a store associate might need to climb a ladder to scan or read small labels on cardboard boxes held in a store’s top stock. With an AR headset and the digital twin, the associate could look up at a partially obscured cardboard box from ground level, and, thanks to computer vision and Lowe’s inventory application programming interfaces, “see” what’s inside via an AR overlay.

Store Data Visualization and Simulation

Home-improvement retail is a tactile business. And when making decisions about how to create a new store display, a common way for retailers to see what works is to build a physical prototype, put it out into a brick-and-mortar store and examine how customers react.

With NVIDIA Omniverse and AI, Lowe’s is exploring more efficient ways to approach this process.

Just as e-commerce sites gather analytics to optimize the customer shopping experience online, the digital twin enables new ways of viewing sales performance and customer traffic data to optimize the in-store experience. 3D heatmaps and visual indicators that show the physical distance of items frequently bought together can help associates put these objects near each other. Within a 100,000 square-foot location, for example, minimizing the number of steps needed to pick up an item is critical.

Using historical order and product location data, Lowe’s can also use NVIDIA Omniverse to simulate what might happen when a store is set up differently. Using AI avatars created in Lowe’s Innovation Labs, the retailer can simulate how far customers and associates might need to walk to pick up items that are often bought together.

NVIDIA Omniverse allows for hundreds of simulations to be run in a fraction of the time that it takes to build a physical store display, Godbole said.

Expanding Into the Metaverse

Lowe’s also announced today at NVIDIA GTC that it will soon make more than 600 photorealistic 3D product assets from its home-improvement library freely available for other Omniverse creators to use in their virtual worlds. All of these products will be available in the Universal Scene Description format on which Omniverse is built, and can be used in any metaverse created by developers using NVIDIA Omniverse Enterprise.

For Lowe’s, the future of home improvement is one in which AI, digital twins and mixed reality play a part in the daily lives of its associates, Godbole said. With NVIDIA Omniverse, the retailer is taking steps to build this future – and there’s a lot more to come as it tests new strategies.

Join a GTC panel discussion on Wednesday, Sept. 21, with Lowe’s Innovation Labs VP Cheryl Friedman and Senior Director of Creative Technology Mason Sheffield, who will discuss how Lowe’s is using AI and NVIDIA Omniverse to make the home-improvement retail experience even better.

Watch the GTC keynote on demand to see all of NVIDIA’s latest announcements, and register free for the conference — running through Thursday, Sept. 22 — to explore how digital twins are transforming industries.


Experience the Future of Vehicle Infotainment: NVIDIA DRIVE Concierge Brings Customized AI to Every Seat

With NVIDIA DRIVE, in-vehicle infotainment, or IVI, is so much more than just giving directions and playing music.

NVIDIA founder and CEO Jensen Huang demonstrated the capabilities of a true IVI experience during today’s GTC keynote. Using centralized, high-performance compute, the NVIDIA DRIVE Concierge platform spans traditional cockpit and cluster capabilities, as well as personalized, AI-powered safety, convenience and entertainment features for every occupant.

Drivers in the U.S. spend an average of nearly 450 hours in their car every year. With just a traditional cockpit and infotainment display, those hours can seem even longer.

DRIVE Concierge makes time in vehicles more enjoyable, convenient and safe, extending intelligent features to every passenger using the DRIVE AGX compute platform, DRIVE IX software stack and Omniverse Avatar Cloud Engine (ACE).

These capabilities include crystal-clear graphics and visualizations in the cockpit and cluster, intelligent digital assistants, driver and occupant monitoring, and streaming content such as games and movies.

With DRIVE Concierge, every passenger can enjoy their own intelligent experience.

Cockpit Capabilities

By running on the cross-domain DRIVE platform, DRIVE Concierge can virtualize and host multiple virtual machines on a single chip — rather than on distributed computers — for streamlined development.

With this centralized architecture, DRIVE Concierge seamlessly orchestrates driver information, cockpit and infotainment functions. It supports the Android Automotive operating system, so automakers can easily customize and scale their IVI offerings.

And digital cockpit and cluster features are just the beginning. DRIVE Concierge extends this premium functionality to the entire vehicle, with world-class confidence view, video-conferencing capabilities, digital assistants, gaming and more.

Visualizing Intelligence

Speed, fuel range and distance traveled are key data for human drivers to be aware of. When AI is at the wheel, however, a detailed view of the vehicle’s perception and planning layers is also crucial.

DRIVE Concierge is tightly integrated with the DRIVE Chauffeur platform to provide high-quality, 360-degree, 4D visualization with low latency. Drivers and passengers can always see what’s in the mind of the vehicle’s AI, with beautiful 3D graphics.

This visualization is critical to building trust between the autonomous vehicle and its passengers, so occupants can be confident in the AV system’s perception and planned path.

How May AI Help You?

In addition to revolutionizing driving, AI is creating a more intelligent vehicle interior with personalized digital assistants.

Omniverse ACE is a collection of cloud-based AI models and services for developers to easily build, customize and deploy interactive avatars.

With ACE, AV developers can create in-vehicle assistants that are easily customizable with speech AI, computer vision, natural language understanding, recommendation engines and simulation technologies.

These avatars can help make recommendations, book reservations, access vehicle controls and provide alerts for situations such as a valuable item left behind.

Game On

With software-defined capabilities, cars are becoming living spaces, complete with the same entertainment available at home.

NVIDIA DRIVE Concierge lets passengers watch videos and experience high-performance gaming wherever they go. Users can choose from their favorite apps and stream videos and games on any vehicle screen.

By using the NVIDIA GeForce NOW cloud gaming service, passengers can access more than 1,400 titles without the need for downloads, benefitting from automatic updates and unlimited cloud storage.

Safety and Security

Intelligent interiors provide an added layer of safety to vehicles, in addition to convenience and entertainment.

DRIVE Concierge uses interior sensors and dedicated deep neural networks for driver monitoring, which ensures attention is on the road in situations where the human is in control.

It can also perform passenger monitoring to make sure that occupants are safe and no precious cargo is left behind.

Using NVIDIA DRIVE Sim on Omniverse, developers can collaborate to design passenger interactions with such cutting-edge features in the vehicle.

By tapping into NVIDIA’s heritage of infotainment technology, DRIVE Concierge is revolutionizing the future of in-vehicle experiences.


NVIDIA DRIVE Thor Strikes AI Performance Balance, Uniting AV and Cockpit on a Single Computer

The next generation of autonomous vehicle computing is improving performance and efficiency at the speed of light.

During today’s GTC keynote, NVIDIA founder and CEO Jensen Huang unveiled DRIVE Thor, a superchip of epic proportions. The automotive-grade system-on-a-chip (SoC) is built on the latest CPU and GPU advances to deliver 2,000 teraflops of performance while reducing overall system costs.

DRIVE Thor succeeds NVIDIA DRIVE Orin in the company’s product lineup, incorporating the newest compute technology to accelerate industry deployment of intelligent-vehicle technology, targeting automakers’ 2025 models.

DRIVE Thor is the next generation in the NVIDIA AI compute roadmap.

Geely-owned premium EV maker ZEEKR will be the first customer for the next-generation platform, with production starting in 2025.

DRIVE Thor unifies traditionally distributed functions in vehicles — including digital cluster, infotainment, parking and assisted driving — for greater efficiency in development and faster software iteration.

Manufacturers can configure the DRIVE Thor superchip in multiple ways. They can dedicate all of the platform’s 2,000 teraflops to the autonomous driving pipeline, or use a portion for in-cabin AI and infotainment and another portion for driver assistance.

Like the current-generation NVIDIA DRIVE Orin, DRIVE Thor uses the productivity of the NVIDIA DRIVE software development kit, is designed to be ASIL-D functionally safe, and is built on a scalable architecture, so developers can seamlessly port their past software development to the latest platform.

Lightning Fast

In addition to raw performance, DRIVE Thor delivers an incredible leap in deep neural network accuracy.

DRIVE Thor marks the first inclusion of a transformer engine in the AV platform family. The transformer engine is a new component of the NVIDIA GPU Tensor Core. Transformer networks process video data as a single perception frame, enabling the compute platform to process more data over time.

With 8-bit floating point (FP8) precision, the SoC introduces a new data type for automotive. Traditionally, AV developers see a loss in accuracy when moving from 32-bit floating point to 8-bit integer data formats. FP8 precision eases this transition, making it possible for developers to transfer data types without sacrificing accuracy.

Additionally, DRIVE Thor uses updated ARM Poseidon AE cores, making it one of the highest performance processors in the industry.

Multi-Domain Computing

DRIVE Thor is as efficient as it is powerful.

The SoC is capable of multi-domain computing, meaning it can partition tasks for autonomous driving and in-vehicle infotainment. This multi-compute domain isolation lets concurrent time-critical processes run without interruption. On one computer, the vehicle can simultaneously run Linux, QNX and Android.

Typically, these types of functions are controlled by tens of electronic control units distributed throughout a vehicle. Rather than relying on these distributed ECUs, manufacturers can now consolidate vehicle functions using DRIVE Thor’s ability to isolate specific tasks.

With DRIVE Thor, automakers can consolidate intelligent vehicle functions on a single SoC.

All vehicle displays, sensors and more can connect to this single SoC, simplifying what has been an incredibly complex supply chain for automakers.

Two Is Always Better Than One

If one DRIVE Thor seems incredible, try two.

Customers can use one DRIVE Thor SoC, or they can connect two via the latest NVLink-C2C chip interconnect technology to serve as a monolithic platform that runs a single operating system.

This capability provides automakers with the compute headroom and flexibility to build software-defined vehicles that are continuously upgradeable through secure, over-the-air updates.

Designed with the best of NVIDIA GPU technology, DRIVE Thor is truly an AV SoC of heroic proportions.


HEAVY.AI Delivers Digital Twin for Telco Network Planning and Operations Based on NVIDIA Omniverse

Telecoms began touting the benefits of 5G networks six years ago. Yet the race to deliver ultrafast wireless internet today resembles a contest between the tortoise and the hare, as some mobile network operators struggle with costly and complex network requirements.

Advanced data analytics company HEAVY.AI today unveiled solutions to put carriers on more even footing. Its initial product, HeavyRF, delivers a next-generation network planning and operations tool based on the NVIDIA Omniverse platform for creating digital twins.

“Building out 5G networks globally will cost trillions of dollars over the next decade, and our telco network customers are rightly worried about how much of that is money not well spent,” said Jon Kondo, CEO of HEAVY.AI. “Using HEAVY advanced analytics and NVIDIA Omniverse-based real-time simulations, they’ll see big savings in time and money.”

HEAVY.AI also announced that Charter Communications is collaborating on incorporating the tool into its modeling and planning operations for its Spectrum telco network, which has 32 million customers in 41 U.S. states. The collaboration extends HEAVY’s relationship with Charter, expanding its existing analytics work to 5G network planning.

“HEAVY.AI’s new digital twin capabilities give us a way to explore and fine-tune our expanding 5G networks in ways that weren’t possible before,” said Jared Ritter, senior director of analytics and automation at Charter Communications.

Without the digital twin approach, telco operators must either physically place microcell towers in densely populated areas to understand the interaction between radio transmitters, the environment, and humans and devices on the move — or use tools that offer less detail about key factors such as tree density or high-rise interference.

Early deployments of 5G needed 300% more base stations for the same level of coverage offered by the previous generation, called Long Term Evolution (LTE), because of higher spectrum bands. A 5G site will consume 300% more power and cost 4x more than an LTE site if they’re deployed in the same way, according to researcher Analysys Mason.

Those sobering figures are prompting the industry to look for efficiencies. Harnessing GPU-accelerated analytics and real-time geophysical mapping, HEAVY.AI’s digital twin solution allows telcos to test radio frequency (RF) propagation scenarios in seconds, powered by the HeavyRF module. This results in significant time and cost savings, because the base stations and microcells can be more accurately placed and tuned at first installation.

The HeavyRF module supports telcos’ goals to plan, build and operate new networks more efficiently by tightly integrating key business information such as mobility and parcels data, as well as customer experience data, within RF planning workflows.

An RF-synchronized digital twin would enable planners at Charter Communications to optimize capacity and coverage, and to interactively see how changes in deployment patterns translate into customer acquisition and retention at the household level.

The goal is to use machine learning and big data pipelines to continuously mirror existing real-world conditions.

The digital twin will use the parallel computing capabilities of modern GPUs for visual simulation, as well as to generate physical simulations of RF signals using real-time RTX ray tracing, powered by NVIDIA Omniverse’s RTX Renderer.

For telcos, it’s not just about investing in traditional networks. With the rise of AI applications and services, these companies seek to lay the foundation for 5G-enabled devices, autonomous vehicles, appliances, robots and city infrastructure.

Watch the GTC keynote on demand to see all of NVIDIA’s latest announcements, and register for the conference — running through Thursday, Sept. 22 — to explore how digital twins are transforming industries.


Reconstructing the Real World in DRIVE Sim With AI

Autonomous vehicle simulation poses two challenges: generating a world with enough detail and realism that the AI driver perceives the simulation as real, and creating simulations at a large enough scale to cover all the cases on which the AI driver needs to be fully trained and tested.

To address these challenges, NVIDIA researchers have created new AI-based tools to build simulations directly from real-world data. NVIDIA founder and CEO Jensen Huang previewed the breakthrough during the GTC keynote.

This research includes award-winning work first published at SIGGRAPH, a computer graphics conference held last month.

Neural Reconstruction Engine

The Neural Reconstruction Engine is a new AI toolset for the NVIDIA DRIVE Sim simulation platform that uses multiple AI networks to turn recorded video data into simulation.

The new pipeline uses AI to automatically extract the key components needed for simulation, including the environment, 3D assets and scenarios. These pieces are then reconstructed into simulation scenes that have the realism of data recordings, but are fully reactive and can be manipulated as needed. Achieving this level of detail and diversity by hand is costly, time consuming and not scalable.

Environments and Assets

A simulation needs an environment in which to operate. The AI pipeline converts 2D video data from a real-world drive to a dynamic, 3D digital twin environment that can be loaded into DRIVE Sim.

A 3D simulation environment generated from recorded driving data using AI.

The DRIVE Sim AI pipeline follows a similar process to reconstruct other 3D assets. Engineers can use the assets to reconstruct the current scene or place them in a larger library of assets to be used in any simulation.

Using the asset-harvesting pipeline is key to growing the DRIVE Sim library and ensuring it matches the diversity and distribution of the real world.

Assets can be harvested from real-world data, turned into 3D objects and reused in other scenes. Here, the tow truck is reconstructed from the scene on the left and used in a different simulation shown on the right.

Scenarios

Scenarios are the events that take place during a simulation in an environment combined with assets.

The Neural Reconstruction Engine assigns AI-based behaviors to the actors in the scene, so that when presented with the original events, they behave precisely as they did in the real drive. However, since they have an AI behavior model, the figures in the simulation can respond and react to changes by the AV or other scene elements.

Because these scenarios are all occurring in simulation, they can also be manipulated to add new situations. Timing and location of events can be altered. Developers can even incorporate entirely new elements, synthetic or real, to make a scenario more challenging, such as the addition of a child chasing a ball to the scene below.

Synthetic objects can be mixed with real-world scenarios.

Integration Into DRIVE Sim

Once the environment, assets and scenario have been extracted, they’re reassembled in DRIVE Sim to create a 3D simulation of the recorded scene or mixed with other assets to create a completely new scene.

DRIVE Sim provides the tools for developers to adjust dynamic and static objects, the vehicle’s path, and the location, orientation and parameters of the vehicle sensors.

The same scenes in DRIVE Sim are also used to generate pre-labeled synthetic data to train perception systems. Randomizations are applied on top of recreated scenes to add diversity to the training data. Building scenes out of real-world data greatly reduces the sim-to-real gap.

Reconstructed scenes can be augmented with synthetic assets and used to produce new data with ground truth for training AV perception systems.

The ability to mix and match simulation formats is a significant advantage in comprehensively testing and validating AVs at scale. Engineers can manipulate events in a world that is responsive and matches their needs precisely.

The Neural Reconstruction Engine is the result of work by the research team at NVIDIA, and will be integrated into future releases of DRIVE Sim. This breakthrough will enable developers to take advantage of both physics-based and neural-driven simulation on the same cloud-based platform.


Parallel data processing with RStudio on Amazon SageMaker

Last year, we announced the general availability of RStudio on Amazon SageMaker, the industry’s first fully managed RStudio Workbench integrated development environment (IDE) in the cloud. You can quickly launch the familiar RStudio IDE, and dial up and down the underlying compute resources without interrupting your work, making it easy to build machine learning (ML) and analytics solutions in R at scale.

With ever-increasing volumes of data being generated, datasets used for ML and statistical analysis are growing in tandem. This brings the challenges of increased development time and compute infrastructure management. To solve these challenges, data scientists have looked to implement parallel data processing techniques. Parallel data processing, or data parallelization, takes large existing datasets and distributes them across multiple processors or nodes to operate on the data simultaneously. This allows for faster processing of larger datasets, along with optimized compute usage. It can help ML practitioners create reusable patterns for dataset generation, and also help reduce compute infrastructure load and cost.

Solution overview

Within Amazon SageMaker, many customers use SageMaker Processing to help implement parallel data processing. With SageMaker Processing, you can use a simplified, managed experience on SageMaker to run your data processing workloads, such as feature engineering, data validation, model evaluation, and model interpretation. This brings many benefits because there’s no long-running infrastructure to manage—processing instances spin down when jobs are complete, environments can be standardized via containers, data within Amazon Simple Storage Service (Amazon S3) is natively distributed across instances, and infrastructure settings are flexible in terms of memory, compute, and storage.

SageMaker Processing offers options for how to distribute data. For parallel data processing, you must use the ShardedByS3Key option for the S3DataDistributionType. When this parameter is selected, SageMaker Processing takes the provided n instances and distributes 1/n of the objects from the input data source to each instance. For example, if two instances are provided with four data objects, each instance receives two objects.
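
For illustration, the following minimal sketch shows how this option appears on a processing input when using the SageMaker Python SDK from R via reticulate. The bucket and prefix are placeholders; the full configuration used in this walkthrough appears later in the processing steps.

library(reticulate)
sagemaker <- import('sagemaker')

# Illustrative only: shard the objects under an S3 prefix evenly across instances
sharded_input <- sagemaker$processing$ProcessingInput(
  source="s3://<your-bucket>/<your-prefix>/",   # placeholder S3 prefix
  destination="/opt/ml/processing/input",
  s3_data_type="S3Prefix",
  s3_data_distribution_type="ShardedByS3Key")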

SageMaker Processing requires three components to run processing jobs:

  • A container image that has your code and dependencies to run your data processing workloads
  • A path to an input data source within Amazon S3
  • A path to an output data source within Amazon S3

The process is depicted in the following diagram.

In this post, we show you how to use RStudio on SageMaker to interface with a series of SageMaker Processing jobs to create a parallel data processing pipeline using the R programming language.

The solution consists of the following steps:

  1. Set up the RStudio project.
  2. Build and register the processing container image.
  3. Run the two-step processing pipeline:
    1. The first step takes multiple data files and processes them across a series of processing jobs.
    2. The second step concatenates the output files and splits them into train, test, and validation datasets.

Prerequisites

Complete the following prerequisites:

  1. Set up the RStudio on SageMaker Workbench. For more information, refer to Announcing Fully Managed RStudio on Amazon SageMaker for Data Scientists.
  2. Create a user with RStudio on SageMaker with appropriate access permissions.

Set up the RStudio project

To set up the RStudio project, complete the following steps:

  1. Navigate to your Amazon SageMaker Studio control panel on the SageMaker console.
  2. Launch your app in the RStudio environment.
  3. Start a new RStudio session.
  4. For Session Name, enter a name.
  5. For Instance Type and Image, use the default settings.
  6. Choose Start Session.
  7. Navigate into the session.
  8. Choose New Project, Version control, and then Select Git.
  9. For Repository URL, enter https://github.com/aws-samples/aws-parallel-data-processing-r.git
  10. Leave the remaining options as default and choose Create Project.

You can navigate to the aws-parallel-data-processing-R directory on the Files tab to view the repository. The repository contains the following files:

  • Container_Build.rmd
  • /dataset

    • bank-additional-full-data1.csv
    • bank-additional-full-data2.csv
    • bank-additional-full-data3.csv
    • bank-additional-full-data4.csv
  • /docker
  • Dockerfile-Processing
  • Parallel_Data_Processing.rmd
  • /preprocessing

    • filter.R
    • process.R

Build the container

In this step, we build our processing container image and push it to Amazon Elastic Container Registry (Amazon ECR). Complete the following steps:

  1. Navigate to the Container_Build.rmd file.
  2. Install the SageMaker Studio Image Build CLI by running the following cell. Make sure you have the required permissions prior to completing this step; this is a CLI designed to push and register container images from within Studio.
    pip install sagemaker-studio-image-build

  3. Run the next cell to build and register our processing container:
    /home/sagemaker-user/.local/bin/sm-docker build . --file ./docker/Dockerfile-Processing --repository sagemaker-rstudio-parallel-processing:1.0

After the job has successfully run, you receive an output that looks like the following:

Image URI: <Account_Number>.dkr.ecr.<Region>.amazonaws.com/sagemaker-rstudio-parallel-processing:1.0

Run the processing pipeline

After you build the container, navigate to the Parallel_Data_Processing.rmd file. This file contains a series of steps that helps us create our parallel data processing pipeline using SageMaker Processing. The following diagram depicts the steps of the pipeline that we complete.

Start by running the package import step. Import the required RStudio packages along with the SageMaker SDK:

suppressWarnings(library(dplyr))
suppressWarnings(library(reticulate))
suppressWarnings(library(readr))
path_to_python <- system('which python', intern = TRUE)

use_python(path_to_python)
sagemaker <- import('sagemaker')

Now set up your SageMaker execution role and environment details:

role = sagemaker$get_execution_role()
session = sagemaker$Session()
bucket = session$default_bucket()
account_id <- session$account_id()
region <- session$boto_region_name
local_path <- dirname(rstudioapi::getSourceEditorContext()$path)

Initialize the container that we built and registered in the earlier step:

container_uri <- paste(account_id, "dkr.ecr", region, "amazonaws.com/sagemaker-rstudio-parallel-processing:1.0", sep=".")
print(container_uri)

From here we dive into each of the processing steps in more detail.

Upload the dataset

For our example, we use the Bank Marketing dataset from UCI. We have already split the dataset into multiple smaller files. Run the following code to upload the files to Amazon S3:

local_dataset_path <- paste0(local_path,"/dataset/")

dataset_files <- list.files(path=local_dataset_path, pattern="\\.csv$", full.names=TRUE)
for (file in dataset_files){
  session$upload_data(file, bucket=bucket, key_prefix="sagemaker-rstudio-example/split")
}

input_s3_split_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/split")

After the files are uploaded, move to the next step.

Perform parallel data processing

In this step, we take the data files and perform feature engineering to filter out certain columns. This job is distributed across a series of processing instances (for our example, we use two).

We use the filter.R file to process the data, and configure the job as follows:

filter_processor <- sagemaker$processing$ScriptProcessor(command=list("Rscript"),
                                                        image_uri=container_uri,
                                                        role=role,
                                                        instance_count=2L,
                                                        instance_type="ml.m5.large")

output_s3_filter_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/filtered")
s3_filter_input <- sagemaker$processing$ProcessingInput(source=input_s3_split_location,
                                                        destination="/opt/ml/processing/input",
                                                        s3_data_distribution_type="ShardedByS3Key",
                                                        s3_data_type="S3Prefix")
s3_filter_output <- sagemaker$processing$ProcessingOutput(output_name="bank-additional-full-filtered",
                                                         destination=output_s3_filter_location,
                                                         source="/opt/ml/processing/output")

filtering_step <- sagemaker$workflow$steps$ProcessingStep(name="FilterProcessingStep",
                                                      code=paste0(local_path, "/preprocessing/filter.R"),
                                                      processor=filter_processor,
                                                      inputs=list(s3_filter_input),
                                                      outputs=list(s3_filter_output))

As mentioned earlier, when running a parallel data processing job, you must specify how the data will be sharded and the type of the input data. Therefore, we set the S3 data distribution type to ShardedByS3Key and the data type to S3Prefix:

s3_data_distribution_type="ShardedByS3Key",
s3_data_type="S3Prefix"

After you insert these parameters, SageMaker Processing will equally distribute the data across the number of instances selected.

Adjust the parameters as necessary, and then run the cell to instantiate the job.

Generate training, test, and validation datasets

In this step, we take the processed data files, combine them, and split them into test, train, and validation datasets. This allows us to use the data for building our model.

We use the process.R file to process the data, and configure the job as follows:

script_processor <- sagemaker$processing$ScriptProcessor(command=list("Rscript"),
                                                         image_uri=container_uri,
                                                         role=role,
                                                         instance_count=1L,
                                                         instance_type="ml.m5.large")

output_s3_processed_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/processed")
s3_processed_input <- sagemaker$processing$ProcessingInput(source=output_s3_filter_location,
                                                         destination="/opt/ml/processing/input",
                                                         s3_data_type="S3Prefix")
s3_processed_output <- sagemaker$processing$ProcessingOutput(output_name="bank-additional-full-processed",
                                                         destination=output_s3_processed_location,
                                                         source="/opt/ml/processing/output")

processing_step <- sagemaker$workflow$steps$ProcessingStep(name="ProcessingStep",
                                                      code=paste0(local_path, "/preprocessing/process.R"),
                                                      processor=script_processor,
                                                      inputs=list(s3_processed_input),
                                                      outputs=list(s3_processed_output),
                                                      depends_on=list(filtering_step))

Adjust the parameters as necessary, and then run the cell to instantiate the job.

Run the pipeline

After all the steps are instantiated, start the processing pipeline to run each step by running the following cell:

pipeline = sagemaker$workflow$pipeline$Pipeline(
  name="BankAdditionalPipelineUsingR",
  steps=list(filtering_step, processing_step)
)

upserted <- pipeline$upsert(role_arn=role)
execution <- pipeline$start()

execution$describe()
execution$wait()

The time each of these jobs takes will vary based on the instance size and count selected.

Navigate to the SageMaker console to see all your processing jobs.

We start with the filtering job, as shown in the following screenshot.

When that’s complete, the pipeline moves to the data processing job.

When both jobs are complete, navigate to your S3 bucket. Look within the sagemaker-rstudio-example folder, under processed. You can see the files for the train, test and validation datasets.
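
If you prefer to verify the outputs from the R session rather than the S3 console, the following is a minimal sketch that lists the processed objects. It assumes the session and bucket objects created earlier in the walkthrough, and that the list_s3_files method is available in your installed version of the SageMaker Python SDK.

# List the processed output objects directly from the R session
processed_keys <- session$list_s3_files(bucket=bucket,
                                        key_prefix="sagemaker-rstudio-example/processed")
print(processed_keys)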

Conclusion

As the amount of data required to build increasingly sophisticated models grows, we need to change our approach to how we process data. Parallel data processing is an efficient method for accelerating dataset generation, and when coupled with modern cloud environments and tooling such as RStudio on SageMaker and SageMaker Processing, it can remove much of the undifferentiated heavy lifting of infrastructure management, boilerplate code generation, and environment management. In this post, we walked through how you can implement parallel data processing within RStudio on SageMaker. We encourage you to try it out by cloning the GitHub repository, and if you have suggestions on how to make the experience better, please submit an issue or a pull request.

To learn more about the features and services used in this solution, refer to RStudio on Amazon SageMaker and Amazon SageMaker Processing.


About the authors

Raj Pathak is a Solutions Architect and Technical advisor to Fortune 50 and Mid-Sized FSI (Banking, Insurance, Capital Markets) customers across Canada and the United States. Raj specializes in Machine Learning with applications in Document Extraction, Contact Center Transformation and Computer Vision.

Jake Wen is a Solutions Architect at AWS with a passion for ML training and natural language processing. Jake helps small and medium-sized business customers with design and thought leadership to build and deploy applications at scale. Outside of work, he enjoys hiking.

Aditi Rajnish is a first-year software engineering student at University of Waterloo. Her interests include computer vision, natural language processing, and edge computing. She is also passionate about community-based STEM outreach and advocacy. In her spare time, she can be found rock climbing, playing the piano, or learning how to bake the perfect scone.

Sean Morgan is an AI/ML Solutions Architect at AWS. He has experience in the semiconductor and academic research fields, and uses his experience to help customers reach their goals on AWS. In his free time, Sean is an active open-source contributor and maintainer, and is the special interest group lead for TensorFlow Add-ons.

Paul Wu is a Solutions Architect working in AWS’ Greenfield Business in Texas. His areas of expertise include containers and migrations.


AI Models vs. AI Systems: Understanding Units of Performance Assessment

As AI becomes more deeply integrated into every aspect of our lives, it is essential that AI systems perform appropriately for their intended use. We know AI models can never be perfect, so how do we decide when AI performance is ‘good enough’ for use in a real life application? Is level of accuracy a sufficient gauge? What else matters? These are questions Microsoft Research tackles every day as part of our mission to follow a responsible, human-centered approach to building and deploying future-looking AI systems.

To answer the question, “what is good enough?”, it becomes necessary to distinguish between an AI model and an AI system as the unit of performance assessment. An AI model typically involves some input data, a pattern-matching algorithm, and an output classification. For example, a radiology scan of the patient’s chest might be shown to an AI model to predict whether a patient has COVID-19. An AI system, by contrast, would evaluate a broader range of information about the patient, beyond the COVID-19 prediction, to inform a clinical decision and treatment plan.

Research has shown that human-AI collaboration can achieve higher accuracy than AI models alone (reference). In this blog, we share key learnings from the recently retired Project Talia, a prior collaboration between Microsoft Research and SilverCloud Health, to understand how thinking about the AI system as a whole—beyond the AI model—can help more precisely define and enumerate ‘good enough’ for real-life application.

In Project Talia, we developed two AI models to predict treatment outcomes for patients receiving human-supported, internet-delivered cognitive behavioral treatment (iCBT) for symptoms of depression and anxiety. These AI models have the potential to assist the work practices of iCBT coaches. These iCBT coaches are practicing behavioral health professionals specifically trained to guide patients on the use of the treatment platform, recommend specific treatments, and help the patient work through identified difficulties.

Project Talia offers an illustration of the distinction between the AI model produced during research and a resulting AI system that could potentially get implemented to support real-life patient treatment. In this scenario, we demonstrate every system element that must be considered to ensure effective system outcomes, not just AI model outcomes.

Project Talia: Improving Mental Health Outcomes

SilverCloud Health (acquired by Amwell in 2021) is an evidence-based, digital, on-demand mental health platform that delivers iCBT-based programs to patients in combination with limited but regular contact from the iCBT coach. The platform offers more than thirty iCBT programs, predominantly for treating mild-to-moderate symptoms of depression, anxiety, and stress.

Patients work through the program(s) independently and with regular assistance from the iCBT coach, who provides guidance and encouragement through weekly reviews and feedback on the treatment journey.

Previous research (reference) has shown that involving a human coach within iCBT leads to more effective treatment outcomes for patients than unsupported interventions. To maximize the effects and outcomes of human support in this format, AI models were developed to dynamically predict the likelihood of a patient achieving a reliable improvement[1] in their depression and anxiety symptoms by the end of the treatment program (typically 8 to 14 weeks in length).
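
As a concrete illustration of the prediction target, the short R sketch below encodes one reading of the reliable-improvement definition given in footnote [1]. The function name and the example scores are hypothetical and are not part of the Project Talia models.

# Illustrative only: reliable improvement per footnote [1], from pre/post clinical scores
reliable_improvement <- function(phq9_pre, phq9_post, gad7_pre, gad7_post) {
  (phq9_pre - phq9_post) >= 6 | (gad7_pre - gad7_post) >= 4
}
reliable_improvement(phq9_pre=18, phq9_post=10, gad7_pre=14, gad7_post=12)  # TRUE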

Existing literature on feedback-informed therapy (reference) and Project Talia research (reference) suggest that having access to these predictions could provide reassurance for those patients ‘on track’ toward achieving a positive outcome from treatment, or prompt iCBT coaches to make appropriate adjustments therein to better meet those patients’ needs.

AI Model vs. AI System

Figure 1: Differentiating AI performance assessment on the level of AI model versus AI system.

The figure above illustrates the distinction between the AI model and AI system in this example (figure 1). The AI model takes in a clinical score calculated from a patient’s responses to standardized clinical questionnaires that assess symptoms of depression and anxiety at each treatment session. After three treatment sessions, the AI model predicts whether or not the patient will achieve a clinically significant reduction in mental health symptoms at completion of the treatment program. The AI model itself is trained on fully anonymized clinical scores of nearly 50,000 previous SilverCloud Health patients and achieved an acceptable accuracy of 87% (reference).

The outcome prediction could then be embedded into the clinical management interface that guides iCBT coaches in their efforts to make more informed decisions about that patient’s treatment journey (for example, increasing the level and frequency of support from the coach).

When AI models are introduced into human contexts such as this, they rarely work in isolation. In this case, the clinical score is entered into a model with parameters tuned to a particular healthcare context. This example illustrates how AI model performance metrics are not sufficient to determine whether an AI system is ‘good enough’ for real-life application. We must examine challenges that arise throughout every element of the AI system.

Following are two specific examples of how AI system performance can be altered while retaining the same model: contextual model parameters and user interface and workflow integration.

Contextual Model Parameters: Which Error Type is Most Costly

Examining overall performance metrics exclusively can limit visibility into the different types of errors an AI model can make, which can have (potentially negative) implications on the AI system as a whole. For example, an AI model’s false positive and false negative errors can impact the AI system differently. A false positive error could mean a patient who needed extra help might not receive it; a false negative would mean a patient may receive unnecessary care. In this case, false positive errors would have a much bigger impact on a patient than false negative errors. But false negative errors can also be problematic when they cause unnecessary resource allocation.

Contextual model parameters can be tuned to change the balance between error types while maintaining the overall accuracy of the model. The clinical team could define these contextual model parameters to minimize false positive errors that could be more detrimental to patients, for example by specifying that the model produce a false positive rate of no more than 5%. Choosing this parameter could come, however, at the expense of a higher false negative rate, which would require monitoring how AI model performance might then impact service costs or staff burnout.
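
To make the tradeoff concrete, here is a minimal sketch in R, using synthetic scores rather than any real patient data, of how a decision threshold could be chosen so that the false positive rate on a validation set stays at or below a target such as 5%, and what that choice implies for the false negative rate. All names and numbers are illustrative.

set.seed(42)
n     <- 1000
truth <- rbinom(n, 1, 0.6)                 # 1 = patient achieves reliable improvement
score <- plogis(2 * truth - 1 + rnorm(n))  # synthetic model scores, not Talia outputs

fpr_at <- function(t) { p <- score >= t; sum(p & truth == 0) / sum(truth == 0) }
fnr_at <- function(t) { p <- score >= t; sum(!p & truth == 1) / sum(truth == 1) }

# Smallest threshold whose validation FPR is at or below the 5% target
thresholds <- sort(unique(score))
chosen <- min(thresholds[sapply(thresholds, fpr_at) <= 0.05])
cat(sprintf("threshold=%.3f FPR=%.3f FNR=%.3f\n", chosen, fpr_at(chosen), fnr_at(chosen)))

A stricter false positive target pushes the threshold up, which typically raises the false negative rate; that is exactly the tradeoff the clinical team must weigh against service costs and staff workload.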

This example illustrates the challenging decisions domain experts, who may know little about the details of AI, must make and the implications these decisions can have on AI system performance. In this example, we provided a prototype visualization tool to help the clinical team in their understanding of the implications of their choices across different patient groups.

We are moving into a world in which domain experts and business decision makers, who embed AI into their work practices, will bear increasing responsibility in assessing and debugging AI systems to improve quality, functionality, and relevance to their work.

User Interface and Workflow Integration

AI model predictions need to be contextualized for a given workflow. Research on iCBT coaches has shown that listing the predictions for all patients of a coach in a single screen outside the normal patient-centered workflow can be demotivating (reference). If a coach saw that the majority of their patients were predicted to not improve, or if their patients’ outcomes were predicted to be worse than those of their colleagues, this could lead coaches to question their own competence or invite competitive thoughts about their colleagues’ performances—both unhelpful in this context.

Displaying the AI model prediction inside the individual patient’s profile, as in the illustration below (figure 2), provides a useful indicator of how well the person is doing and therefore can guide clinical practice. It also deliberately encourages the use of the AI model prediction within the context of other relevant clinical information.

Situating the AI output with other patient information can nurture a more balanced relationship between AI-generated insight and coaches’ own patient assessments, which can counterbalance effects of over-reliance and over-trust in potentially fallible AI predictions (also referred to as automation bias).

Figure 2: User interface for clinical coaches that shows the integration of treatment outcome prediction (here defined as reliable improvement) within the patient status. It shows: A) a general explanation of the prediction; B) visual charts with a text label to convey the prediction results for depression and anxiety symptoms (via PHQ-9 and GAD-7 clinical scores); C) drop-down menus for numerical percentages of the prediction; and D) other contextual information about the patient that are considered in their review, including their clinical score trajectories over time. 

This example illustrates the importance of user interface design and workflow integration in how well AI model predictions are understood and can contribute to the success or failure of an AI system as a whole. Domain experts, user research, and service designers start to play a far more important role in the development of AI systems than the typical focus on data scientists.

Final Thoughts

Aggregate performance metrics, such as accuracy, area-under-the-curve (AUC) scores, or mean square error, are easy to calculate on an AI model, but they indicate little about the utility or function of the entire AI system in practice. So, how do we decide when AI system performance is ‘good enough’ for use in real-life application? It is clear that high levels of AI model performance alone are not sufficient—we must consider every element of the AI system.

Contextual model parameters and interface and workflow design present just two examples of how preparing domain experts with expectations, skills, and tools is necessary for optimal benefit from the incorporation of AI systems into human contexts.


[1] Defined as an improvement of 6 or more points on the PHQ-9 depression scale, or 4 or more points on the GAD-7 anxiety scale.
