Enterprises Ignite Big Savings With NVIDIA-Accelerated Apache Spark

Tens of thousands of companies worldwide rely on Apache Spark to crunch massive datasets to support critical operations, as well as predict trends, customer behavior, business performance and more. The faster a company can process and understand its data, the more it stands to make and save.

That’s why companies with massive datasets — including the world’s largest retailers and banks — have adopted NVIDIA RAPIDS Accelerator for Apache Spark. The open-source software runs on top of the NVIDIA accelerated computing platform to significantly accelerate the processing of end-to-end data science and analytics pipelines — without any code changes.
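
For illustration, enabling the accelerator is a matter of Spark configuration rather than code rewrites. The PySpark sketch below assumes an illustrative jar path, input bucket and resource sizing; the plugin class and configuration keys follow the RAPIDS Accelerator documentation.

from pyspark.sql import SparkSession

# A minimal sketch of enabling RAPIDS Accelerator on an existing PySpark job.
# The jar path, bucket name and resource amounts are illustrative assumptions.
spark = (
    SparkSession.builder
    .appName("etl-pipeline")
    # Load the RAPIDS Accelerator plugin; existing DataFrame/SQL code is unchanged.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.jars", "/opt/sparkRapidsPlugin/rapids-4-spark.jar")  # assumed path
    # Schedule one GPU per executor, shared across four concurrent tasks.
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "0.25")
    .getOrCreate()
)

# The same Spark SQL as before now runs on GPUs where operations are supported.
df = spark.read.parquet("s3://example-bucket/transactions/")  # hypothetical input
df.groupBy("merchant_id").sum("amount").show()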

To make it even easier for companies to get value out of NVIDIA-accelerated Spark, NVIDIA today unveiled Project Aether — a collection of tools and processes that automatically qualify, test, configure and optimize Spark workloads for GPU acceleration at scale.

Project Aether Completes a Year’s Worth of Work in Less Than a Week 

Customers using Spark in production often manage tens of thousands of complex jobs, or more. Migrating from CPU-only to GPU-powered computing offers numerous and significant benefits, but can be a manual and time-consuming process.

Project Aether automates the myriad steps that companies previously have done manually, including analyzing all of their Spark jobs to identify the best candidates for GPU acceleration, as well as staging and performing test runs of each job. It uses AI to fine-tune the configuration of each job to obtain the maximum performance.
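
The first of those steps is available today in the open-source spark-rapids-tools project, whose qualification command analyzes Spark event logs and ranks jobs by projected GPU benefit. The sketch below invokes it from Python; the command name, flags and event-log location are assumptions based on that project's documentation, and Project Aether's own tooling may differ.

import subprocess

# Hedged sketch: run the open-source qualification tool over a history
# server's event logs to shortlist GPU-acceleration candidates. The CLI
# name, flags and paths are assumptions; consult the spark-rapids-tools docs.
subprocess.run(
    [
        "spark_rapids", "qualification",
        "--eventlogs", "hdfs:///spark-history/",  # assumed event-log location
        "--platform", "onprem",                   # assumed target platform
    ],
    check=True,
)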

To understand the impact of Project Aether, consider an enterprise that has 100 Spark jobs to complete. With Project Aether, each of these jobs can be configured and optimized for NVIDIA GPU acceleration in as little as four days. The same process done manually by a single data engineer could take up to an entire year.

CBA Drives AI Transformation With NVIDIA-Accelerated Apache Spark

Running Apache Spark on NVIDIA accelerated computing helps enterprises around the world complete jobs faster and with less hardware compared with using CPUs only — saving time, space, power and cooling, as well as capital expenses on premises and operational costs in the cloud.

Australia’s largest financial institution, the Commonwealth Bank of Australia, is responsible for processing 60% of the continent’s financial transactions. CBA was experiencing challenges with the latency and costs of running its Spark workloads. The bank estimates that, with CPU-only computing clusters, it faced nearly nine years of processing time for its training backlog — on top of handling already taxing daily data demands.

“With 40 million inferencing transactions a day, it was critical we were able to process these in a timely, reliable manner,” said Andrew McMullan, chief data and analytics officer at CBA.

Running RAPIDS Accelerator for Apache Spark on GPU-powered infrastructure provided CBA with a 640x performance boost, allowing the bank to process a training dataset of 6.3 billion transactions in just five days. Additionally, on its daily volume of 40 million transactions, CBA is now able to conduct inference in 46 minutes and reduce costs by more than 80% compared with using a CPU-based solution.

McMullan says another benefit of NVIDIA-accelerated Apache Spark is the compute efficiency it gives his team to cost-effectively build models that can help CBA deliver better customer service, anticipate when customers may need assistance with home loans and more quickly detect fraudulent transactions.

CBA also plans to use NVIDIA-accelerated Apache Spark to better pinpoint where customers commonly end their digital journeys, enabling the bank to remediate when needed to reduce the rate of abandoned applications.

Global Ecosystem

RAPIDS Accelerator for Apache Spark is available through a global network of partners. It runs on Amazon Web Services, Cloudera, Databricks, Dataiku, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure.

Dell Technologies today also announced the integration of RAPIDS Accelerator for Apache Spark with Dell Data Lakehouse.

To get assistance through NVIDIA Project Aether with a large-scale migration of Apache Spark workloads, apply for access.

To learn more, register for NVIDIA GTC and attend key sessions featuring Walmart, Capital One, CBA and other industry leaders.

Driving Impact: NVIDIA Expands Automotive Ecosystem to Bring Physical AI to the Streets

The autonomous vehicle (AV) revolution is here — and NVIDIA is at its forefront, bringing more than two decades of automotive computing, software and safety expertise to power innovation from the cloud to the car.

At NVIDIA GTC, a global AI conference taking place this week in San Jose, California, dozens of transportation leaders are showcasing their latest advancements with NVIDIA technologies that span passenger cars, trucks, commercial vehicles and more.

Mobility leaders are increasingly turning to NVIDIA’s three core accelerated compute platforms: NVIDIA DGX systems for training the AI-based stack in the data center, NVIDIA Omniverse and NVIDIA Cosmos running on NVIDIA OVX systems for simulation and synthetic data generation, and the NVIDIA DRIVE AGX in-vehicle computer to process real-time sensor data for safe, highly automated and autonomous driving capabilities.

For manufacturers and developers in the multitrillion-dollar auto industry, this unlocks new possibilities for designing, manufacturing and deploying functionally safe, intelligent mobility solutions — offering consumers safer, smarter and more enjoyable experiences.

Transforming Passenger Vehicles 

The U.S.’s largest automaker, General Motors (GM), is collaborating with NVIDIA to develop and build its next-generation vehicles, factories and robots using NVIDIA’s accelerated compute platforms. GM has been investing in NVIDIA GPU platforms for training AI models.

The companies’ collaboration now expands to include optimizing factory planning using Omniverse with Cosmos and deploying next-generation vehicles at scale, accelerated by NVIDIA DRIVE AGX. This will help GM build physical AI systems tailored to its company vision, craft and know-how, and ultimately enable mobility that’s safer, smarter and more accessible than ever.

Volvo Cars, which is using the NVIDIA DRIVE AGX in-vehicle computer in its next-generation electric vehicles, and its subsidiary Zenseact use the NVIDIA DGX platform to analyze and contextualize sensor data, unlock new insights and train future safety models that will enhance overall vehicle performance and safety.

Lenovo has teamed with robotics company Nuro to create a robust end-to-end system for level 4 autonomous vehicles that prioritize safety, reliability and convenience. The system is built on NVIDIA DRIVE AGX in-vehicle compute.

Advancements in Trucking

NVIDIA’s AI-driven technologies are also supercharging trucking, helping address pressing challenges like driver shortages, rising e-commerce demands and high operational costs. NVIDIA DRIVE AGX delivers the computational muscle needed for safe, reliable and efficient autonomous operations — improving road safety and logistics on a massive scale.

Gatik is integrating DRIVE AGX for the onboard AI processing necessary for its freight-only class 6 and 7 trucks, manufactured by Isuzu Motors, which offer driverless middle-mile delivery of a wide range of goods to Fortune 500 customers including Tyson Foods, Kroger and Loblaw.

Uber Freight is also adopting DRIVE AGX as the AI computing backbone of its current and future carrier fleets, sustainably enhancing efficiency and cutting costs for shippers.

Torc is developing a scalable, physical AI compute system for autonomous trucks. The system uses NVIDIA DRIVE AGX in-vehicle compute and the NVIDIA DriveOS operating system with Flex’s Jupiter platform and manufacturing capabilities to support Torc’s productization and scaled market entry in 2027.

Growing Demand for DRIVE AGX

The NVIDIA DRIVE AGX Orin platform is the AI brain behind today’s intelligent fleets — and the next wave of mobility is already arriving, as production vehicles built on the NVIDIA DRIVE AGX Thor centralized car computer start to hit the roads.

Magna is a key global automotive supplier helping to meet the surging demand for the NVIDIA Blackwell architecture-based DRIVE Thor platform — designed for the most demanding processing workloads, including those involving generative AI, vision language models and large language models (LLMs). Magna will develop driving systems built with DRIVE AGX Thor for integration in automakers’ vehicle roadmaps, delivering active safety and comfort functions along with interior cabin AI experiences.

Simulation and Data: The Backbone of AV Development

Earlier this year, NVIDIA announced the Omniverse Blueprint for AV simulation, a reference workflow for creating rich 3D worlds for autonomous vehicle training, testing and validation. The blueprint is expanding to include NVIDIA Cosmos world foundation models (WFMs) to amplify photoreal data variation.

Unveiled at the CES trade show in January, Cosmos is already being adopted in automotive, including by Plus, which is embedding Cosmos physical AI models into its SuperDrive technology, accelerating the development of level 4 self-driving trucks.

Foretellix is extending its integration of the blueprint, using the Cosmos Transfer WFM to add conditions like weather and lighting to its sensor simulation scenarios to achieve greater situation diversity. Mcity is integrating the blueprint into the digital twin of its AV testing facility to enable physics-based modeling of camera, lidar, radar and ultrasonic sensor data.

CARLA, which offers an open-source AV simulator, has integrated the blueprint to deliver high-fidelity sensor simulation. Global systems integrator Capgemini will be the first to use CARLA’s Omniverse integration for enhanced sensor simulation in its AV development platform.

NVIDIA is using Nexar’s extensive, high-quality, edge-case data to train and fine-tune NVIDIA Cosmos’ simulation capabilities. Nexar is tapping into Cosmos, neural infrastructure models and the NVIDIA DGX Cloud platform to supercharge its AI development, refining AV training, high-definition mapping and predictive modeling.

Enhancing In-Vehicle Experiences With NVIDIA AI Enterprise

Mobility leaders are integrating the NVIDIA AI Enterprise software platform, running on DRIVE AGX, to enhance in-vehicle experiences with generative and agentic AI.

At GTC, Cerence AI is showcasing Cerence xUI, its new LLM-based AI assistant platform that will advance the next generation of agentic in-vehicle user experiences. The Cerence xUI hybrid platform runs in the cloud as well as onboard the vehicle, optimized first on NVIDIA DRIVE AGX Orin.

As the foundation for Cerence xUI, the CaLLM family of language models is based on open-source foundation models and fine-tuned on Cerence AI’s automotive dataset. Tapping into NVIDIA AI Enterprise, including the NVIDIA TensorRT-LLM library and NVIDIA NeMo, to bolster inference performance, Cerence AI has optimized CaLLM to serve as the central agentic orchestrator facilitating enriched driver experiences at the edge and in the cloud.

SoundHound will also be demonstrating its next-generation in-vehicle voice assistant, which uses generative AI at the edge with NVIDIA DRIVE AGX, enhancing the in-car experience by bringing cloud-based LLM intelligence directly to vehicles.

The Complexity of Autonomy and NVIDIA’s Safety-First Solution

Safety is the cornerstone in deploying highly automated and autonomous vehicles to the roads at scale. But building AVs is one of today’s most complex computing challenges. It demands immense computational power, precision and an unwavering commitment to safety.

AVs and highly automated cars promise to extend mobility to those who need it most, reducing accidents and saving lives. To help deliver on this promise, NVIDIA has developed NVIDIA Halos, a full-stack comprehensive safety system that unifies vehicle architecture, AI models, chips, software, tools and services for the safe development of AVs from the cloud to the car.

NVIDIA will host its inaugural AV Safety Day at GTC today, featuring in-depth discussions on automotive safety frameworks and implementation.

In addition, NVIDIA will host Automotive Developer Day on Thursday, March 20, offering sessions on the latest advancements in end-to-end AV development and beyond.

New Tools for AV Developers

NVIDIA also released new NVIDIA NIM microservices for automotive — designed to accelerate development and deployment of end-to-end stacks from cloud to car. The new NIM microservices for in-vehicle applications, which utilize the nuScenes dataset by Motional, include:

  • BEVFormer, a state-of-the-art transformer-based model that fuses multi-frame camera data into a unified bird’s-eye-view representation for 3D perception.
  • SparseDrive, an end-to-end autonomous driving model that performs motion prediction and planning simultaneously, outputting a safe planning trajectory.

For automotive enterprise applications, NVIDIA offers a variety of models, including NV-CLIP, a multimodal transformer model that generates embeddings from images and text; Cosmos Nemotron, a vision language model that queries and summarizes images and videos for multimodal understanding and AI-powered perception; and many more.
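
As a hedged illustration of calling one of these models, the sketch below requests an NV-CLIP text embedding over an OpenAI-style embeddings endpoint; the endpoint URL and model name follow NVIDIA's hosted API catalog conventions and should be treated as assumptions, and NVIDIA_API_KEY is a placeholder credential.

import os
import requests

# Hedged sketch: request an NV-CLIP embedding from a hosted NIM endpoint.
# Endpoint URL and model name are assumptions based on NVIDIA's API catalog.
resp = requests.post(
    "https://integrate.api.nvidia.com/v1/embeddings",
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json={
        "input": ["forklift crossing a warehouse aisle"],
        "model": "nvidia/nvclip",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["data"][0]["embedding"][:8])  # first few embedding values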

Learn more about NVIDIA’s latest automotive news by watching the NVIDIA GTC keynote and registering for sessions from NVIDIA and industry leaders at the show, which runs through March 21.

NVIDIA Unveils AI-Q Blueprint to Connect AI Agents for the Future of Work

AI agents are the new digital workforce, transforming business operations, automating complex tasks and unlocking new efficiencies. Now, with the ability to collaborate, these agents can work together to solve complex problems and drive even greater impact.

Businesses across industries, including sports and finance, can more quickly harness these benefits with AI-Q — a new NVIDIA Blueprint for developing agentic systems that can use reasoning to unlock knowledge in enterprise data.

Smarter Agentic AI Systems With NVIDIA AI-Q and AgentIQ Toolkit

AI-Q provides an easy-to-follow reference for integrating NVIDIA accelerated computing, partner storage platforms, and software and tools — including the new NVIDIA Llama Nemotron reasoning models. AI-Q offers a powerful foundation for enterprises to build digital workforces that break down agentic silos and are capable of handling complex tasks with high accuracy and speed.

AI-Q integrates fast multimodal extraction and world-class retrieval, using NVIDIA NeMo Retriever, NVIDIA NIM microservices and AI agents.

The blueprint is powered by the new NVIDIA AgentIQ toolkit for seamless, heterogeneous connectivity between agents, tools and data. Released today on GitHub, AgentIQ is an open-source software library for connecting, profiling and optimizing teams of AI agents fueled by enterprise data to create multi-agent, end-to-end systems. It can be easily integrated with existing multi-agent systems — either in parts or as a complete solution — with a simple onboarding process that’s 100% opt-in.

The AgentIQ toolkit also enhances transparency with full system traceability and profiling — enabling organizations to monitor performance, identify inefficiencies and gain fine-grained understanding of how business intelligence is generated. This profiling data can be used with NVIDIA NIM and the NVIDIA Dynamo open-source library to optimize the performance of agentic systems.
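
To picture what that profiling yields, the toy sketch below times each step of a two-agent pipeline. It is illustrative only, not AgentIQ's actual API; the agent functions are stand-ins.

import time
from functools import wraps

# Illustrative only: a toy profiler in the spirit of AgentIQ's traceability,
# not the toolkit's real API. Each agent step records its wall-clock cost so
# pipeline inefficiencies can be spotted.
TRACE = []

def profiled(step_name):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({"step": step_name, "seconds": time.perf_counter() - start})
            return result
        return wrapper
    return decorator

@profiled("retrieve")
def retrieve(query):
    return ["doc-1", "doc-2"]  # stand-in for a retrieval agent

@profiled("summarize")
def summarize(docs):
    return f"summary of {len(docs)} documents"  # stand-in for an LLM agent

summarize(retrieve("quarterly revenue"))
print(TRACE)  # per-step latency, the raw material for optimization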

The New Enterprise AI Agent Workforce

As AI agents become digital employees, IT teams will support onboarding and training. The AI-Q blueprint and AgentIQ toolkit support digital employees by enabling collaboration between agents and optimizing performance across different agentic frameworks.

Enterprises using these tools will be able to more easily connect AI agent teams across solutions — like Salesforce’s Agentforce, Atlassian Rovo in Confluence and Jira, and the ServiceNow AI platform for business transformation — to break down silos, streamline tasks and cut response times from days to hours.

AgentIQ also integrates with frameworks and tools like CrewAI, LangGraph, Llama Stack, Microsoft Azure AI Agent Service and Letta, letting developers work in their preferred environment.

Azure AI Agent Service is integrated with AgentIQ to enable more efficient AI agents and orchestration of multi-agent frameworks using Semantic Kernel, which is fully supported in AgentIQ.

A wide range of industries are integrating visual perception and interactive capabilities into their agents and copilots.

Financial services leader Visa is using AI agents to streamline cybersecurity, automating phishing email analysis at scale. Using the profiler feature of AI-Q, Visa can optimize agent performance and costs, maximizing AI’s role in efficient threat response.

Get Started With AI-Q and AgentIQ

AI-Q integration into the NVIDIA Metropolis VSS blueprint is enabling multimodal agents, combining visual perception with speech, translation and data analytics for enhanced intelligence.

Developers can use the AgentIQ toolkit open-source library today and sign up for this hackathon to build hands-on skills for advancing agentic systems.

Plus, learn how an NVIDIA solutions architect used the AgentIQ toolkit to improve AI code generation.

Agentic systems built with AI-Q require a powerful AI data platform. NVIDIA partners are delivering these customized platforms that continuously process data to let AI agents quickly access knowledge to reason and respond to complex queries.

NVIDIA Unveils Open Physical AI Dataset to Advance Robotics and Autonomous Vehicle Development

Teaching autonomous robots and vehicles how to interact with the physical world requires vast amounts of high-quality data. To give researchers and developers a head start, NVIDIA is releasing a massive, open-source dataset for building the next generation of physical AI.

Announced at NVIDIA GTC, a global AI conference taking place this week in San Jose, California, this commercial-grade, pre-validated dataset can help researchers and developers kickstart physical AI projects that can be prohibitively difficult to start from scratch. Developers can either directly use the dataset for model pretraining, testing and validation — or use it during post-training to fine-tune world foundation models, accelerating the path to deployment.

The initial dataset is now available on Hugging Face, offering developers 15 terabytes of data representing more than 320,000 trajectories for robotics training, plus up to 1,000 Universal Scene Description (OpenUSD) assets, including a SimReady collection. Dedicated data to support end-to-end autonomous vehicle (AV) development — which will include 20-second clips of diverse traffic scenarios spanning over 1,000 cities across the U.S. and two dozen European countries — is coming soon.
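
For developers exploring the release, the minimal sketch below pulls a slice of the dataset with the huggingface_hub library; the repository identifier and file pattern are placeholders, so substitute the actual names from the NVIDIA organization on Hugging Face.

from huggingface_hub import snapshot_download

# Minimal sketch: download only part of the dataset. The repo_id and
# allow_patterns values are placeholders, not the actual identifiers.
local_dir = snapshot_download(
    repo_id="nvidia/physical-ai-dataset",  # hypothetical repository name
    repo_type="dataset",
    allow_patterns=["robotics/*"],         # fetch only a robotics subset
)
print("Dataset downloaded to", local_dir)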

The NVIDIA Physical AI Dataset includes hundreds of SimReady assets for rich scenario building.

This dataset will grow over time to become the world’s largest unified and open dataset for physical AI development. It could be applied to develop AI models to power robots that safely maneuver warehouse environments, humanoid robots that support surgeons during procedures and AVs that can navigate complex traffic scenarios like construction zones.

The NVIDIA Physical AI Dataset is slated to contain a subset of the real-world and synthetic data NVIDIA uses to train, test and validate physical AI for the NVIDIA Cosmos world model development platform, the NVIDIA DRIVE AV software stack, the NVIDIA Isaac AI robot development platform and the NVIDIA Metropolis application framework for smart cities.

Early adopters include the Berkeley DeepDrive Center at the University of California, Berkeley, the Carnegie Mellon Safe AI Lab and the Contextual Robotics Institute at the University of California, San Diego.

“We can do a lot of things with this dataset, such as training predictive AI models that help autonomous vehicles better track the movements of vulnerable road users like pedestrians to improve safety,” said Henrik Christensen, director of multiple robotics and autonomous vehicle labs at UCSD. “A dataset that provides a diverse set of environments and longer clips than existing open-source resources will be tremendously helpful to advance robotics and AV research.”

Addressing the Need for Physical AI Data

The NVIDIA Physical AI Dataset can help developers scale AI performance during pretraining, where more data helps build a more robust model — and during post-training, where an AI model is trained on additional data to improve its performance for a specific use case.

Collecting, curating and annotating a dataset that covers diverse scenarios and accurately represents the physics and variation of the real world is time-consuming, presenting a bottleneck for most developers. For academic researchers and small enterprises, running a fleet of vehicles over months to gather data for autonomous vehicle AI is impractical and costly — and, since much of the footage collected is uneventful, typically just 10% of data is used for training.

But this scale of data collection is essential to building safe, accurate, commercial-grade models. NVIDIA Isaac GR00T robotics models require thousands of hours of video clips for post-training — the GR00T N1 model, for example, was trained on an expansive humanoid dataset of real and synthetic data. The NVIDIA DRIVE AV end-to-end AI model for autonomous vehicles requires tens of thousands of hours of driving data to develop.

This open dataset, comprising thousands of hours of multicamera video at unprecedented diversity, scale and geographic coverage, will particularly benefit the field of safety research by enabling new work on identifying outliers and assessing model generalization performance. The effort contributes to NVIDIA Halos, NVIDIA’s full-stack AV safety system.

In addition to harnessing the NVIDIA Physical AI Dataset to help meet their data needs, developers can further boost AI development with tools like NVIDIA NeMo Curator, which processes vast datasets efficiently for model training and customization. Using NeMo Curator, 20 million hours of video can be processed in just two weeks on NVIDIA Blackwell GPUs, compared with 3.4 years on unoptimized CPU pipelines.

Robotics developers can also tap the new NVIDIA Isaac GR00T blueprint for synthetic manipulation motion generation, a reference workflow built on NVIDIA Omniverse and NVIDIA Cosmos that uses a small number of human demonstrations to create massive amounts of synthetic motion trajectories for robot manipulation.

University Labs Set to Adopt Dataset for AI Development

The robotics labs at UCSD include teams focused on medical applications, humanoids and in-home assistive technology. Christensen anticipates that the Physical AI Dataset’s robotics data could help develop semantic AI models that understand the context of spaces like homes, hotel rooms and hospitals.

“One of our goals is to achieve a level of understanding where, if a robot was asked to put your groceries away, it would know exactly which items should go in the fridge and what goes in the pantry,” he said.

In the field of autonomous vehicles, Christensen’s lab could apply the dataset to train AI models to understand the intention of various road users and predict the best action to take. His research teams could also use the dataset to support the development of digital twins that simulate edge cases and challenging weather conditions. These simulations could be used to train and test autonomous driving models in situations that are rare in real-world environments.

At Berkeley DeepDrive, a leading research center on AI for autonomous systems, the dataset could support the development of policy models and world foundation models for autonomous vehicles.

“Data diversity is incredibly important to train foundation models,” said Wei Zhan, codirector of Berkeley DeepDrive. “This dataset could support state-of-the-art research for public and private sector teams developing AI models for autonomous vehicles and robotics.”

Researchers at Carnegie Mellon University’s Safe AI Lab plan to use the dataset to advance their work evaluating and certifying the safety of self-driving cars. The team plans to test how a physical AI foundation model trained on this dataset performs in a simulation environment with rare conditions — and compare its performance to an AV model trained on existing datasets.

“This dataset covers different types of roads and geographies, different infrastructure, different weather environments,” said Ding Zhao, associate professor at CMU and head of the Safe AI Lab. “Its diversity could be quite valuable in helping us train a model with causal reasoning capabilities in the physical world that understands edge cases and long-tail problems.”

Access the NVIDIA Physical AI Dataset on Hugging Face. Build foundational knowledge with courses such as the Learn OpenUSD learning path and Robotics Fundamentals learning path. And to learn more about the latest advancements in physical AI, watch the GTC keynote by NVIDIA founder and CEO Jensen Huang.

New NVIDIA Software for Blackwell Infrastructure Runs AI Factories at Light Speed

The industrial age was fueled by steam. The digital age brought a shift through software. Now, the AI age is marked by the development of generative AI, agentic AI and AI reasoning, which enables models to process more data to learn and reason to solve complex problems.

Just as industrial factories transform raw materials into goods, modern businesses require AI factories to quickly transform data into insights that are scalable, accurate and reliable.

Orchestrating this new infrastructure is far more complex than building steam-powered factories. State-of-the-art models demand supercomputing-scale resources. Any downtime risks derailing weeks of progress and reducing GPU utilization.

To enable enterprises and developers to manage and run AI factories at light speed, NVIDIA today announced NVIDIA Mission Control at the NVIDIA GTC global AI conference — the only unified operations and orchestration software platform that automates the complex management of AI data centers and workloads.

NVIDIA Mission Control enhances every aspect of AI factory operations. From configuring deployments to validating infrastructure to operating developer workloads, its capabilities help enterprises get frontier models up and running faster.

It is designed to easily transition NVIDIA Blackwell-based systems from pretraining to post-training — and now test-time scaling — with speed and efficiency. The software enables enterprises to easily pivot between training and inference workloads on their Blackwell-based NVIDIA DGX systems and NVIDIA Grace Blackwell systems, dynamically reallocating cluster resources to match shifting priorities.

In addition, Mission Control includes NVIDIA Run:ai technology to streamline operations and job orchestration for development, training and inference, boosting infrastructure utilization by up to 5x.

Mission Control’s autonomous recovery capabilities, supported by rapid checkpointing and automated tiered restart features, can deliver up to 10x faster job recovery compared with traditional methods that rely on manual intervention, boosting AI training and inference efficiency to keep AI applications in operation.

Built on decades of NVIDIA supercomputing expertise, Mission Control lets enterprises simply run models by minimizing time spent managing AI infrastructure. It automates the lifecycle of AI factory infrastructure for all NVIDIA Blackwell-based NVIDIA DGX systems and NVIDIA Grace Blackwell systems from Dell Technologies, Hewlett Packard Enterprise (HPE), Lenovo and Supermicro to make advanced AI infrastructure more accessible to the world’s industries.

Enterprises can further simplify and speed deployments of NVIDIA DGX GB300 and DGX B300 systems by using Mission Control with the NVIDIA Instant AI Factory service preconfigured in Equinix AI-ready data centers across 45 markets globally.

Advanced Software Provides Enterprises Uninterrupted Infrastructure Oversight  

Mission Control automates end-to-end infrastructure management — including provisioning, monitoring and error diagnosis — to deliver uninterrupted operations. Plus, it continuously monitors every layer of the application and infrastructure stack to predict and identify sources of downtime and inefficiency — saving time, energy and costs.

Additional NVIDIA Mission Control software benefits include:

  • Simplified cluster setup and provisioning with new automation and standardized application programming interfaces to speed time to deployment with integrated inventory management and visualizations.
  • Seamless workload orchestration for simplified Slurm and Kubernetes workflows.
  • Energy-optimized power profiles to balance power requirements and tune GPU performance for various workload types with developer-selectable controls.
  • Autonomous job recovery to identify, isolate and recover from inefficiencies without manual intervention to maximize developer productivity and infrastructure resiliency.
  • Customizable dashboards that track key performance indicators with access to critical telemetry data about clusters.
  • On-demand health checks to validate hardware and cluster performance throughout the infrastructure lifecycle.
  • Building management integration for enhanced coordination with building management systems to provide more control for power and cooling events, including rapid leakage detection.

Leading System Makers Bring NVIDIA Mission Control to Grace Blackwell Servers  

Leading system makers plan to offer NVIDIA GB200 NVL72 and GB300 NVL72 systems with NVIDIA Mission Control.

Dell plans to offer NVIDIA Mission Control software as part of the Dell AI Factory with NVIDIA.

“The AI industrial revolution demands efficient infrastructure that adapts as fast as business evolves, and the Dell AI Factory with NVIDIA delivers with comprehensive compute, networking, storage and support,” said Ihab Tarazi, chief technology officer and senior vice president at Dell Technologies. “Pairing NVIDIA Mission Control software and Dell PowerEdge XE9712 and XE9680 servers helps enterprises scale models effortlessly to meet the demands of both training and inference, turning data into actionable insights faster than ever before.”

HPE will offer the NVIDIA GB200 NVL72 by HPE and GB300 NVL72 by HPE systems with NVIDIA Mission Control software.

“We are helping service providers and cutting-edge enterprises to rapidly deploy, scale, and optimize complex AI clusters capable of training trillion parameter models,” said Trish Damkroger, senior vice president and general manager, HPC & AI Infrastructure Solutions at HPE. “As part of our collaboration with NVIDIA, we will deliver NVIDIA Grace Blackwell rack-scale systems and Mission Control software with HPE’s global services and direct liquid cooling expertise to power the new AI era.”

Lenovo plans to update its Lenovo Hybrid AI Advantage with NVIDIA systems to include NVIDIA Mission Control software.

“Bringing NVIDIA Mission Control software to Lenovo Hybrid AI Advantage with NVIDIA systems empowers enterprises to navigate the demands of generative and agentic AI workloads with unmatched agility,” said Brian Connors, worldwide vice president and general manager of enterprise and SMB segment and AI, infrastructure solutions group, at Lenovo. “By automating infrastructure orchestration and enabling seamless transitions between training and inference workloads, Lenovo and NVIDIA are helping customers scale AI innovation at the speed of business.”

Supermicro plans to incorporate NVIDIA Mission Control software into its Supercluster systems.

“Supermicro is proud to team with NVIDIA on a Grace Blackwell NVL72 system that is fully supported by NVIDIA Mission Control software,” said Cenly Chen, chief growth officer at Supermicro. “Running on Supermicro’s AI SuperCluster systems with NVIDIA Grace Blackwell, NVIDIA Mission Control software provides customers with a seamless management software suite to maximize performance on both current NVIDIA GB200 NVL72 systems and future platforms such as NVIDIA GB300 NVL72.”

Base Command Manager Offers Free Kickstart for AI Cluster Management

To help enterprises with infrastructure management, NVIDIA Base Command Manager software is expected to soon be available for free for up to eight accelerators per system, for any cluster size, with the option to purchase NVIDIA Enterprise Support separately.

Availability

NVIDIA Mission Control for NVIDIA DGX GB200 and DGX B200 systems is available now. NVIDIA GB200 NVL72 systems with Mission Control are expected to soon be available from Dell, HPE, Lenovo and Supermicro.

NVIDIA Mission Control is expected to become available for the latest NVIDIA DGX GB300 and DGX B300 systems, as well as GB300 NVL72 systems from leading global providers, later this year.

Where AI and Graphics Converge: NVIDIA Blackwell Universal Data Center GPU Accelerates Demanding Enterprise Workloads

The first NVIDIA Blackwell-powered data center GPU built for both enterprise AI and visual computing — the NVIDIA RTX PRO 6000 Blackwell Server Edition — is designed to accelerate the most demanding AI and graphics applications for every industry.

Compared to the previous-generation NVIDIA Ada Lovelace architecture L40S GPU, the RTX PRO 6000 Blackwell Server Edition GPU will deliver a multifold increase in performance across a wide array of enterprise workloads — up to 5x higher large language model (LLM) inference throughput for agentic AI applications, nearly 7x faster genomics sequencing, 3.3x speedups for text-to-video generation, nearly 2x faster inference for recommender systems and over 2x speedups for rendering.

It’s part of the NVIDIA RTX PRO Blackwell series of workstation and server GPUs announced today at NVIDIA GTC, the global AI conference taking place through Friday, March 21, in San Jose, California. The RTX PRO lineup includes desktop, laptop and data center GPUs that support AI and creative workloads across industries.

With the RTX PRO 6000 Blackwell Server Edition, enterprises across various sectors — including architecture, automotive, cloud services, financial services, game development, healthcare, manufacturing, media and entertainment, and retail — can achieve breakthrough performance for workloads such as multimodal generative AI, data analytics, engineering simulation and visual computing.

Content creation, semiconductor manufacturing and genomics analysis companies are already set to harness its capabilities to accelerate compute-intensive, AI-enabled workflows.

Universal GPU Delivers Powerful Capabilities for AI and Graphics 

The RTX PRO 6000 Blackwell Server Edition packages powerful RTX AI and graphics capabilities in a passively cooled form factor designed to run 24/7 in data center environments. With 96GB of ultrafast GDDR7 memory and support for Multi-Instance GPU, or MIG, each RTX PRO 6000 can be partitioned into as many as four fully isolated instances with 24GB each to run simultaneous AI and graphics workloads.
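
As a rough illustration of how such a partition is created, the hedged sketch below drives nvidia-smi's documented MIG commands from Python; the GPU index and the four-way profile name are assumptions, and the profiles your GPU actually supports should be checked with the list command first.

import subprocess

# Hedged sketch: partition one GPU into MIG instances. The MIG subcommands
# are documented nvidia-smi features; the GPU index and profile name below
# are assumptions, so list the profiles supported by your GPU first.
subprocess.run(["nvidia-smi", "-i", "0", "-mig", "1"], check=True)  # enable MIG mode
subprocess.run(["nvidia-smi", "mig", "-lgip"], check=True)          # list GPU instance profiles
# Create four equal GPU instances and their compute instances ("-C").
# "1g.24gb" is an assumed profile name for a four-way split of 96GB.
subprocess.run(
    ["nvidia-smi", "mig", "-cgi", "1g.24gb,1g.24gb,1g.24gb,1g.24gb", "-C"],
    check=True,
)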

RTX PRO 6000 is the first universal GPU to enable secure AI with NVIDIA Confidential Computing, which protects AI models and sensitive data from unauthorized access with strong, hardware-based security — providing a physically isolated trusted execution environment to secure the entire workload while data is in use.

To support enterprise-scale deployments, the RTX PRO 6000 can be configured in high-density accelerated computing platforms for distributed inference workloads — or used to deliver virtual workstations with NVIDIA vGPU software to power AI development and graphics-intensive applications.

The RTX PRO 6000 GPU delivers supercharged inferencing performance across a broad range of AI models and accelerates real-time, photorealistic ray tracing of complex virtual environments. It includes the latest Blackwell hardware and software innovations like fifth-generation Tensor Cores, fourth-generation RT Cores, DLSS 4, a fully integrated media pipeline and second-generation Transformer Engine with support for FP4 precision.

Enterprises can run the NVIDIA Omniverse and NVIDIA AI Enterprise platforms at scale on RTX PRO 6000 Blackwell Server Edition GPUs to accelerate the development and deployment of agentic and physical AI applications, such as image and video generation, LLM inference, recommender systems, computer vision, digital twins and robotics simulation.

Accelerated AI Inference and Visual Computing for Any Industry

Black Forest Labs, creator of the popular FLUX image generation AI, aims to develop and optimize state-of-the-art text-to-image models using RTX PRO 6000 Server Edition GPUs.

“With the powerful multimodal inference capabilities of the RTX PRO 6000 Server Edition, our customers will be able to significantly reduce latency for image generation workflows,” said name, title at Black Forest Labs. “We anticipate that, with the server edition GPUs’ support for FP4 precision, our Flux models will run faster, enabling interactive, AI-accelerated content creation.”

Cloud graphics company OTOY will optimize its OctaneRender real-time rendering application for NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs.

“The new NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs unlock brand-new workflows that were previously out of reach for 3D content creators,” said Jules Urbach, CEO of OTOY and founder of the Render Network. “With 96 GB of VRAM, the new server-edition GPUs can run complex neural rendering models within OctaneRender’s GPU path-tracer, enabling artists to tap into incredible new features and tools that blend the precision of traditional CGI augmented with frontier generative AI technology.”

Semiconductor equipment manufacturer KLA plans to use the RTX PRO 6000 Blackwell Server Edition to accelerate inference workloads powering the wafer manufacturing process — the creation of thin discs of semiconductor materials that are core to integrated circuits.

KLA and NVIDIA have worked together since 2008 to advance KLA’s physics-based AI with optimized high-performance computing solutions. KLA’s industry-leading inspection and metrology systems capture and process images by running complex AI algorithms at lightning-fast speeds to find the most critical semiconductor defects.

“Based on early results, we expect great performance from the RTX PRO 6000 Blackwell Server Edition,” said Kris Bhaskar, senior fellow and vice president of AI initiatives at KLA. “The increased memory capacity, FP4 reduced precision and new computational capabilities of NVIDIA Blackwell are going to be particularly helpful to KLA and its customers.”

Boosting Genomics and Drug Discovery Workloads

The RTX PRO 6000 Blackwell Server Edition also demonstrates game-changing acceleration for genomic analysis and drug discovery inference workloads, enabled by a new class of dynamic programming instructions.

On a single RTX PRO 6000 Blackwell Server Edition GPU, Fastq2bam and DeepVariant — elements of the NVIDIA Parabricks pipeline for germline analysis — run up to 1.5x faster compared with using an L40S GPU, and 1.75x faster compared with using an NVIDIA H100 GPU.
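
For context on how those pipeline stages are invoked, a hedged sketch using the Parabricks pbrun launcher follows; the file names are placeholders and the flags are based on the Parabricks documentation, so verify them against the current release.

import subprocess

# Hedged sketch: align reads, then call variants, with NVIDIA Parabricks.
# File paths are placeholders; flags follow the pbrun docs and may vary by release.
subprocess.run(
    ["pbrun", "fq2bam",
     "--ref", "GRCh38.fa",  # reference genome (placeholder)
     "--in-fq", "sample_1.fq.gz", "sample_2.fq.gz",
     "--out-bam", "sample.bam"],
    check=True,
)
subprocess.run(
    ["pbrun", "deepvariant",
     "--ref", "GRCh38.fa",
     "--in-bam", "sample.bam",
     "--out-variants", "sample.vcf"],
    check=True,
)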

For Smith-Waterman, a core algorithm used in many sequence alignment and variant calling applications, RTX PRO 6000 Blackwell Server Edition GPUs accelerate throughput up to 6.8x compared with L40S GPUs.

And for OpenFold2, an AI model that predicts protein structures for drug discovery research, RTX PRO 6000 Blackwell Server Edition GPUs boost inference performance by up to 4.8x compared with L40S GPUs.

Genomics company Oxford Nanopore Technologies is collaborating with NVIDIA to bring the latest AI and accelerated computing technologies to its sequencing systems.

“The NVIDIA Blackwell architecture will help us drive the real-time sequencing analysis of anything, by anyone, anywhere,” said Chris Seymour, vice president of advanced platform development at Oxford Nanopore Technologies. “With the RTX PRO 6000 Blackwell Server Edition, we have seen up to a 2x improvement in basecalling speed across our Dorado platform.”

Availability via Global Network of Cloud Providers and System Partners

Platforms featuring the RTX PRO 6000 Blackwell Server Edition will be available from a global ecosystem of partners starting in May.

AWS, Google Cloud, Microsoft Azure, IBM Cloud, CoreWeave, Crusoe, Lambda, Nebius and Vultr will be among the first cloud service providers and GPU cloud providers to offer instances featuring the RTX PRO 6000 Blackwell Server Edition.

Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro are expected to deliver a wide range of servers featuring the RTX PRO 6000 Blackwell Server Edition, as are Advantech, Aetina, Aivres, ASRockRack, ASUS, Compal, Foxconn, GIGABYTE, Inventec, MSI, Pegatron, Quanta Cloud Technology (QCT), MiTAC Computing, NationGate, Wistron and Wiwynn.

To learn more about the NVIDIA RTX PRO Blackwell series and other advancements in AI, watch the GTC keynote by NVIDIA founder and CEO Jensen Huang.

AI Factories, Built Smarter: New Omniverse Blueprint Advances AI Factory Design and Simulation

AI is now mainstream and driving unprecedented demand for AI factories — purpose-built infrastructure dedicated to AI training and inference — and the production of intelligence.

Many of these AI factories will be gigawatt-scale. Bringing up a single gigawatt AI factory is an extraordinary act of engineering and logistics — requiring tens of thousands of workers across suppliers, architects, contractors and engineers to build, ship and assemble nearly 5 billion components and over 210,000 miles of fiber cable.

To help design and optimize these AI factories, NVIDIA today unveiled at GTC the NVIDIA Omniverse Blueprint for AI factory design and operations.

During his GTC keynote, NVIDIA founder and CEO Jensen Huang showcased how NVIDIA’s data center engineering team developed an application on the Omniverse Blueprint to plan, optimize and simulate a 1-gigawatt AI factory. With connections to leading simulation tools such as the Cadence Reality Digital Twin Platform and ETAP, engineering teams can test and optimize power, cooling and networking long before construction starts.

Engineering AI Factories: A Simulation-First Approach

The NVIDIA Omniverse Blueprint for AI factory design and operations uses OpenUSD libraries that enable developers to aggregate 3D data from disparate sources such as the building itself, NVIDIA accelerated computing systems and power or cooling units from providers such as Schneider Electric and Vertiv.
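
A minimal sketch of that aggregation pattern, using the OpenUSD Python API with illustrative asset file names, shows how each source remains its own asset and is composed by reference into a single factory stage:

from pxr import Usd

# Minimal sketch: compose disparate vendor assets into one AI factory stage.
# All file names are illustrative; each referenced .usd asset would come from
# a different source (architecture firm, NVIDIA, power or cooling providers).
stage = Usd.Stage.CreateNew("ai_factory.usda")

building = stage.DefinePrim("/Factory/Building")
building.GetReferences().AddReference("building_shell.usd")      # building data

racks = stage.DefinePrim("/Factory/ComputeRow01")
racks.GetReferences().AddReference("gb300_nvl72_rack.usd")       # compute systems

cooling = stage.DefinePrim("/Factory/CoolingPlant")
cooling.GetReferences().AddReference("vendor_cooling_unit.usd")  # cooling supplier

stage.GetRootLayer().Save()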

By unifying the design and simulation of billions of components, the blueprint helps engineers address complex challenges like:

  • Component integration and space optimization — Unifying the design and simulation of NVIDIA DGX SuperPODs, GB300 NVL72 systems and their 5 billion components.
  • Cooling system performance and efficiency — Using Cadence Reality Digital Twin Platform, accelerated by NVIDIA CUDA and Omniverse libraries, to simulate and evaluate hybrid air- and liquid-cooling solutions from Vertiv and Schneider Electric.
  • Power distribution and reliability — Designing scalable, redundant electrical systems with ETAP to simulate power-block efficiency and reliability.
  • Networking topology and logic — Fine-tuning high-bandwidth infrastructure with NVIDIA Spectrum-X networking and the NVIDIA Air platform.

Breaking Down Engineering Silos With Omniverse

One of the biggest challenges in AI factory construction is that different teams — power, cooling and networking — operate in silos, leading to inefficiencies and potential failures.

Using the blueprint, engineers can now:

  • Collaborate in full context — Multiple disciplines can iterate in parallel, sharing live simulations that reveal how changes in one domain affect another.
  • Optimize energy usage — Real-time simulation updates enable teams to find the most efficient designs for AI workloads.
  • Eliminate failure points — By validating redundancy configurations before deployment, organizations reduce the risk of costly downtime.
  • Model real-world conditions — Predict and test how different AI workloads will impact cooling, power stability and network congestion.

By integrating real-time simulation across disciplines, the blueprint allows engineering teams to explore various configurations to model cost of ownership and optimize power utilization.

Real-Time Simulations for Faster Decision-Making

In Huang’s demo, engineers adjust AI factory configurations in real time — and instantly see the impact.

For example, a small tweak in cooling layout significantly improved efficiency — a detail that could have been missed on paper. And instead of waiting hours for simulation results, teams could test and refine strategies in just seconds.

Once an optimal design was finalized, Omniverse streamlined communication with suppliers and construction teams — ensuring that what gets built matches the model, down to the last detail.

Future-Proofing AI Factories

AI workloads aren’t static. The next wave of AI applications will push power, cooling and networking demands even further. The Omniverse Blueprint for AI factory design and operations helps ensure AI factories are ready by offering:

  • Workload-aware simulation — Predict how changes in AI workloads will affect power and cooling at data center scale.
  • Failure scenario testing — Model grid failures, cooling leaks and power spikes to ensure resilience.
  • Scalable upgrades — Plan for AI factory expansions and estimate infrastructure needs years ahead.

And when planning for retrofits and upgrades, users can easily test and simulate cost and downtime — delivering a future-proof AI factory.

For AI factory operators, staying ahead isn’t just about efficiency — it’s about preventing infrastructure failures that could cost millions of dollars per day.

For a 1 gigawatt AI factory, every day of downtime can cost over $100 million. By solving infrastructure challenges in advance, the blueprint reduces both risk and time to deployment.

Road to Agentic AI for AI Factory Operation

NVIDIA is working on the next evolution of the blueprint to expand into AI-enabled operations, working with key companies such as Vertech and Phaidra.

Vertech is collaborating with the NVIDIA data center engineering team on NVIDIA’s advanced AI factory control system, which integrates IT and operational technology data to enhance resiliency and operational visibility.

Phaidra is working with NVIDIA to integrate reinforcement-learning AI agents into Omniverse. These agents optimize thermal stability and energy efficiency through real-time scenario simulation, creating digital twins that continuously adapt to changing hardware and environmental conditions.

The AI Data Center Boom

AI is reshaping the global data center landscape. With $1 trillion projected for AI-driven data center upgrades, digital twin technology is no longer optional — it’s essential.

The NVIDIA Omniverse Blueprint for AI factory design and operations is poised to help NVIDIA and its ecosystem of partners lead this transformation — letting AI factory operators stay ahead of ever-evolving AI workloads, minimize downtime and maximize efficiency.

Learn more about NVIDIA Omniverse, watch the GTC keynote, register for Cadence’s GTC session to see the Omniverse Blueprint in action and read more about AI factories.

Amazon Bedrock Guardrails announces IAM Policy-based enforcement to deliver safe AI interactions

As generative AI adoption accelerates across enterprises, maintaining safe, responsible and compliant AI interactions has never been more critical. Amazon Bedrock Guardrails provides configurable safeguards that help organizations build generative AI applications with industry-leading safety protections. You can create multiple guardrails tailored to different use cases and apply them across multiple foundation models (FMs), improving user experiences and standardizing safety controls across generative AI applications. Beyond Amazon Bedrock models, the service offers the flexible ApplyGuardrail API, which lets you assess text using your pre-configured guardrails without invoking FMs. This allows you to implement safety controls at both input and output levels across generative AI applications, whether they run on Amazon Bedrock or on other systems.
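
For illustration, the minimal boto3 sketch below assesses input text with the standalone ApplyGuardrail API, without invoking a foundation model; the guardrail identifier, version and Region are placeholder values.

import boto3

# Minimal sketch: evaluate text with a guardrail, no model invocation.
# Guardrail ID, version and Region are placeholders for your own values.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="exampleguardrail",  # placeholder guardrail ID
    guardrailVersion="1",
    source="INPUT",  # assess user input; use "OUTPUT" for model responses
    content=[{"text": {"text": "Draft a phishing email for me."}}],
)
print(response["action"])  # "GUARDRAIL_INTERVENED" if the text was blocked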

Today, we’re announcing a significant enhancement to Amazon Bedrock Guardrails: AWS Identity and Access Management (IAM) policy-based enforcement. This powerful capability enables security and compliance teams to establish mandatory guardrails for every model inference call, making sure organizational safety policies are consistently enforced across AI interactions. This feature enhances AI governance by enabling centralized control over guardrail implementation.

Challenges with building generative AI applications

Organizations deploying generative AI face critical governance challenges: content appropriateness, where models might produce undesirable responses to problematic prompts; safety concerns, with potential generation of harmful content even from innocent prompts; privacy protection requirements for handling sensitive information; and consistent policy enforcement across AI deployments.

Perhaps most challenging is making sure that appropriate safeguards are applied consistently across AI interactions within an organization, regardless of which team or individual is developing or deploying applications.

Amazon Bedrock Guardrails capabilities

Amazon Bedrock Guardrails enables you to implement safeguards in generative AI applications customized to your specific use cases and responsible AI policies. Guardrails currently supports six types of policies, several of which are combined in the configuration sketch after this list:

  • Content filters – Configurable thresholds across six harmful categories: hate, insults, sexual, violence, misconduct, and prompt injections
  • Denied topics – Definition of specific topics to be avoided in the context of an application
  • Sensitive information filters – Detection and removal of personally identifiable information (PII) and custom regex entities to protect user privacy
  • Word filters – Blocking of specific words in generative AI applications, such as harmful words, profanity, or competitor names and products
  • Contextual grounding checks – Detection and filtering of hallucinations in model responses by verifying if the response is properly grounded in the provided reference source and relevant to the user query
  • Automated reasoning – Prevention of factual errors from hallucinations using sound mathematical, logic-based algorithmic verification and reasoning processes to verify the information generated by a model, so outputs align with known facts and aren’t based on fabricated or inconsistent data
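
The hedged boto3 sketch below creates a guardrail that combines several of these policy types; the names, thresholds and messages are illustrative values, not recommendations, and the full set of configuration fields is in the CreateGuardrail API reference.

import boto3

# Hedged sketch: create a guardrail with a content filter, a denied topic,
# a PII rule and a contextual grounding check. All values are illustrative.
bedrock = boto3.client("bedrock", region_name="us-east-1")

guardrail = bedrock.create_guardrail(
    name="exampleguardrail",
    contentPolicyConfig={"filtersConfig": [
        {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
    ]},
    topicPolicyConfig={"topicsConfig": [{
        "name": "investment-advice",
        "definition": "Recommendations about specific securities or portfolios.",
        "type": "DENY",
    }]},
    sensitiveInformationPolicyConfig={"piiEntitiesConfig": [
        {"type": "EMAIL", "action": "ANONYMIZE"},
    ]},
    contextualGroundingPolicyConfig={"filtersConfig": [
        {"type": "GROUNDING", "threshold": 0.75},
    ]},
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)
print(guardrail["guardrailId"], guardrail["version"])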

Policy-based enforcement of guardrails

Security teams often have organizational requirements to enforce the use of Amazon Bedrock Guardrails for every inference call to Amazon Bedrock. To support this requirement, Amazon Bedrock Guardrails provides the new IAM condition key bedrock:GuardrailIdentifier, which can be used in IAM policies to enforce the use of a specific guardrail for model inference. The condition key in the IAM policy can be applied to the following APIs:

  • InvokeModel
  • InvokeModelWithResponseStream
  • Converse
  • ConverseStream

The following diagram illustrates the policy-based enforcement workflow.

If the guardrail configured in your IAM policy doesn’t match the guardrail specified in the request, the request will be rejected with an access denied exception, enforcing compliance with organizational policies.
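
A request satisfies such a policy by passing the same guardrail that the condition key requires. The minimal boto3 sketch below, with an illustrative model ID and placeholder account values, shows the shape of a compliant call:

import json
import boto3

# Minimal sketch: pass the enforced guardrail with the inference request so
# the bedrock:GuardrailIdentifier condition in the IAM policy is satisfied.
# Model ID, Region and account ID are placeholders.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model
    guardrailIdentifier="arn:aws:bedrock:us-east-1:111122223333:guardrail/exampleguardrail",
    guardrailVersion="1",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Summarize our returns policy."}],
    }),
)
print(json.loads(response["body"].read())["content"][0]["text"])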

Policy examples

In this section, we present several policy examples demonstrating how to enforce guardrails for model inference.

Example 1: Enforce the use of a specific guardrail and its numeric version

The following example illustrates the enforcement of exampleguardrail and its numeric version 1 during model inference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeFoundationModelStatement1",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringEquals": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail:1"
                }
            }
        },
        {
            "Sid": "InvokeFoundationModelStatement2",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail:1"
                }
            }
        },
        {
            "Sid": "ApplyGuardrail",
            "Effect": "Allow",
            "Action": [
                "bedrock:ApplyGuardrail"
            ],
            "Resource": [
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
            ]
        }
    ]
}

The added explicit deny rejects calls to the listed actions with any other GuardrailIdentifier and GuardrailVersion values, irrespective of other permissions the user might have.

Example 2: Enforce the use of a specific guardrail and its draft version

The following example illustrates the enforcement of exampleguardrail and its draft version during model inference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeFoundationModelStatement1",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringEquals": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
                }
            }
        },
        {
            "Sid": "InvokeFoundationModelStatement2",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
                }
            }
        },
        {
            "Sid": "ApplyGuardrail",
            "Effect": "Allow",
            "Action": [
                "bedrock:ApplyGuardrail"
            ],
            "Resource": [
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
            ]
        }
    ]
}
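
On the request side, the working draft is selected by passing the literal version string DRAFT rather than a number. A minimal sketch, with an illustrative model and prompt:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
    guardrailConfig={
        "guardrailIdentifier": "exampleguardrail",  # guardrail ID or full ARN
        "guardrailVersion": "DRAFT",  # the working draft, not a numbered version
    },
)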

Example 3: Enforce the use of a specific guardrail and its numeric versions

The following example illustrates the enforcement of exampleguardrail and its numeric versions during model inference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeFoundationModelStatement1",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringLike": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail:*"
                }
            }
        },
        {
            "Sid": "InvokeFoundationModelStatement2",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringNotLike": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail:*"
                }
            }
        },
        {
            "Sid": "ApplyGuardrail",
            "Effect": "Allow",
            "Action": [
                "bedrock:ApplyGuardrail"
            ],
            "Resource": [
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
            ]
        }
    ]
}

Example 4: Enforce the use of a specific guardrail and its versions, including the draft

The following example illustrates the enforcement of exampleguardrail and its versions, including the draft, during model inference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeFoundationModelStatement1",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringLike": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail*"
                }
            }
        },
        {
            "Sid": "InvokeFoundationModelStatement2",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringNotLike": {
                    "bedrock:GuardrailIdentifier": "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail*"
                }
            }
        },
        {
            "Sid": "ApplyGuardrail",
            "Effect": "Allow",
            "Action": [
                "bedrock:ApplyGuardrail"
            ],
            "Resource": [
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail"
            ]
        }
    ]
}

Example 5: Enforce the use of a specific guardrail and version pair from a list of guardrail and version pairs

The following example illustrates the enforcement of exampleguardrail1 and its version 1, or exampleguardrail2 and its version 2, or the draft version of exampleguardrail3 during model inference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeFoundationModelStatement1",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringEquals": {
                    "bedrock:GuardrailIdentifier": [
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail1:1",
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail2:2",
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail3"
                    ]
                }
            }
        },
        {
            "Sid": "InvokeFoundationModelStatement2",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:region::foundation-model/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "bedrock:GuardrailIdentifier": [
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail1:1",
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail2:2",
                        "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail3"
                    ]
                }
            }
        },
        {
            "Sid": "ApplyGuardrail",
            "Effect": "Allow",
            "Action": [
                "bedrock:ApplyGuardrail"
            ],
            "Resource": [
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail1",
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail2",
                "arn:aws:bedrock:<region>:<account-id>:guardrail/exampleguardrail3"
            ]
        }
    ]
}

Known limitations

When implementing policy-based guardrail enforcement, be aware of these limitations:

  • At the time of this writing, Amazon Bedrock Guardrails doesn’t support resource-based policies for cross-account access.
  • If a user assumes a role that enforces a specific guardrail through the bedrock:GuardrailIdentifier condition key, the user can strategically use input tags to keep guardrail checks from being applied to parts of their prompt. Input tags mark the specific sections of text that guardrails should process, leaving untagged sections unprocessed. For example, a user could deliberately place sensitive or potentially harmful content outside the tagged sections so that those portions are never evaluated against the guardrail policies. However, regardless of how the prompt is structured or tagged, the guardrail is still fully applied to the model’s response (see the sketch after this list).
  • If a user has a role configured with a specific guardrail requirement (using the bedrock:GuardrailIdentifier condition), they shouldn’t use that same role to access services like Amazon Bedrock Knowledge Bases RetrieveAndGenerate or Amazon Bedrock Agents InvokeAgent. These higher-level services work by making multiple InvokeModel calls behind the scenes on the user’s behalf. Although some of these calls might include the required guardrail, others don’t. When the system attempts to make these guardrail-free calls using a role that requires guardrails, it results in AccessDenied errors, breaking the functionality of these services. To help avoid this issue, organizations should separate permissions—using different roles for direct model access with guardrails versus access to these composite Amazon Bedrock services.
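
The following sketch illustrates the input-tagging behavior described in the second bullet, using the Converse API's guardContent blocks; the model and prompt are illustrative, and the tagging syntax differs when calling InvokeModel directly:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# When any guardContent block is present, only the tagged content is
# evaluated against the guardrail's input policies; untagged text in the
# same message is not. The model's response is still fully evaluated.
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"text": "This untagged text bypasses the guardrail's input checks."},
            {"guardContent": {"text": {"text": "Only this tagged text is checked."}}},
        ],
    }],
    guardrailConfig={
        "guardrailIdentifier": "exampleguardrail",
        "guardrailVersion": "1",
    },
)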

Conclusion

The new IAM policy-based guardrail enforcement in Amazon Bedrock represents a crucial advancement in AI governance as generative AI becomes integrated into business operations. By enabling centralized policy enforcement, security teams can maintain consistent safety controls across AI applications regardless of who develops or deploys them, effectively mitigating risks related to harmful content, privacy violations, and bias. This approach offers significant advantages: it scales efficiently as organizations expand their AI initiatives without creating administrative bottlenecks, helps prevent technical debt by standardizing safety implementations, and enhances the developer experience by allowing teams to focus on innovation rather than compliance mechanics.

This capability demonstrates organizational commitment to responsible AI practices through comprehensive monitoring and audit mechanisms. Organizations can use model invocation logging in Amazon Bedrock to capture complete request and response data in Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3) buckets, including specific guardrail trace documentation showing when and how content was filtered. Combined with AWS CloudTrail integration that records guardrail configurations and policy enforcement actions, businesses can confidently scale their generative AI initiatives with appropriate safety mechanisms protecting their brand, customers, and data—striking the essential balance between innovation and ethical responsibility needed to build trust in AI systems.

Get started today with Amazon Bedrock Guardrails and implement configurable safeguards that balance innovation with responsible AI governance across your organization.


About the Authors

Shyam Srinivasan is on the Amazon Bedrock Guardrails product team. He cares about making the world a better place through technology and loves being part of this journey. In his spare time, Shyam likes to run long distances, travel around the world, and experience new cultures with family and friends.

Antonio Rodriguez is a Principal Generative AI Specialist Solutions Architect at AWS. He helps companies of all sizes solve their challenges, embrace innovation, and create new business opportunities with Amazon Bedrock. Apart from work, he loves to spend time with his family and play sports with his friends.

Satveer Khurpa is a Sr. WW Specialist Solutions Architect, Amazon Bedrock at Amazon Web Services. In this role, he uses his expertise in cloud-based architectures to develop innovative generative AI solutions for clients across diverse industries. Satveer’s deep understanding of generative AI technologies allows him to design scalable, secure, and responsible applications that unlock new business opportunities and drive tangible value.

Read More

NVIDIA Launches NVIDIA Halos, a Full-Stack, Comprehensive Safety System for Autonomous Vehicles

Physical AI is unlocking new possibilities at the intersection of autonomy and robotics — accelerating, in particular, the development of autonomous vehicles (AVs). The right technology and frameworks are crucial to ensuring the safety of drivers, passengers and pedestrians.

That’s why NVIDIA today announced NVIDIA Halos — a comprehensive safety system bringing together NVIDIA’s lineup of automotive hardware and software safety solutions with its cutting-edge AI research in AV safety.

Halos spans chips and software to tools and services to help ensure safe development of AVs from the cloud to the car, with a focus on AI-based, end-to-end AV stacks.

“With the launch of Halos, we’re empowering partners and developers to choose the state-of-the-art technology elements they need to build their own unique offerings, driving forward a shared mission to create safe and reliable autonomous vehicles,” said Riccardo Mariani, vice president of industry safety at NVIDIA. “Halos complements existing safety practices and can potentially accelerate standardization and regulatory compliance.”

At the Heart of Halos

Halos is a holistic safety system on three different but complementary levels.

At the technology level, it spans platform, algorithmic and ecosystem safety. At the development level, it includes design-time, deployment-time and validation-time guardrails. And at the computational level, it spans AI training to deployment, using three powerful computers — NVIDIA DGX for AI training, NVIDIA Omniverse and NVIDIA Cosmos running on NVIDIA OVX for simulation, and NVIDIA DRIVE AGX for deployment.

“Halos’ holistic approach to safety is particularly critical in a setting where companies want to harness the power of generative AI for increasingly capable AV systems developed end to end, which preclude traditional compositional design and verification,” said Marco Pavone, lead AV researcher at NVIDIA.

AI Systems Inspection Lab

Serving as an entry point to Halos is the NVIDIA AI Systems Inspection Lab, which allows automakers and developers to verify the safe integration of their products with NVIDIA technology.

The AI Systems Inspection Lab, announced at the CES trade show earlier this year, is the first worldwide program to be accredited by the ANSI National Accreditation Board for an inspection plan integrating functional safety, cybersecurity, AI safety and regulations into a unified safety framework.

Inaugural members of the AI Systems Inspection Lab include Ficosa, OMNIVISION, onsemi and Continental.

“Being a member of the AI Systems Inspection Lab means working at the forefront of automotive systems innovation and integrity,” said Cristian Casorran Hontiyuelo, advanced driver-assistance system engineering and product manager at Ficosa.

“Cars are so much more than just transportation,” said Paul Wu, head of product marketing for automotive at OMNIVISION. “They’ve also become our entertainment and information hubs. Vehicles must continually evolve in their ability to keep us safe. We are pleased to join NVIDIA’s new AI Systems Safety Lab as a demonstration of our commitment to achieving the highest levels of safety in our product offerings.”

“We are delighted to be working with NVIDIA and included in the launch of the NVIDIA AI Systems Inspection Lab,” said Geoff Ballew, general manager of the automotive sensing division at onsemi. “This unique initiative will improve road safety in an innovative way. We look forward to the advancements it will bring.”

“We are pleased to participate in the newly launched NVIDIA Drive AI Systems Inspection Lab and to further intensify the fruitful, ongoing collaboration between our two companies,” said Nobert Hammerschmidt, head of components business at Continental.

Key Elements of Halos

Halos is built on three focus areas: platform safety, algorithmic safety and ecosystem safety.

Platform Safety

Halos features a safety-assessed system-on-a-chip (SoC) with hundreds of built-in safety mechanisms.

It also includes NVIDIA DriveOS software, a safety-certified operating system that extends from CPU to GPU; a safety-assessed base platform that delivers the foundational computer needed to enable safe systems for all types of applications; and DRIVE AGX Hyperion, a hardware platform that connects SoC, DriveOS and sensors in an electronic control unit architecture.

Algorithmic Safety

Halos includes libraries for safety data loading and accelerators, and application programming interfaces for safety data creation, curation and reconstruction to filter out, for example, undesirable behaviors and biases before training.

It also features rich training, simulation and validation environments harnessing the NVIDIA Omniverse Blueprint for AV simulation with NVIDIA Cosmos world foundation models to train, test and validate AVs. In addition, it boasts a diverse AV stack combining modular components with end-to-end AI models to ensure safety with cutting-edge AI models in the loop.

Ecosystem Safety

Halos includes safety datasets with diverse, unbiased data; safe deployment workflows comprising triaging processes and automated safety evaluations; and a data flywheel for continual safety improvements — demonstrating leadership in AV safety standardization and regulation.

Safety Track Record

Halos brings together a vast amount of safety-focused technology research, development, deployment, partnerships and collaborations by NVIDIA, including:

  • 15,000+ engineering years invested in vehicle safety
  • 10,000+ hours of contributions to international standards committees
  • 1,000+ AV-safety patents filed
  • 240+ AV-safety research papers published
  • 30+ safety and cybersecurity certificates

It also dovetails with recent significant safety certifications and assessments of NVIDIA automotive products, including:

  • The NVIDIA DriveOS 6.0 operating system conforms with ISO 26262 automotive safety integrity level (ASIL D) standards.
  • TÜV SÜD granted the ISO/SAE 21434 Cybersecurity Process certification to NVIDIA for its automotive SoC, platform and software engineering processes.
  • TÜV Rheinland performed an independent safety assessment of NVIDIA DRIVE AV for the United Nations Economic Commission for Europe related to safety requirements for complex electronic systems.

To learn more about NVIDIA’s approach to automotive safety, attend AV Safety Day today at NVIDIA GTC, a global AI conference running through Friday, March 21.

See notice regarding software product information.

Read More

NVIDIA Accelerates Science and Engineering With CUDA-X Libraries Powered by GH200 and GB200 Superchips

Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips.

As announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources — enabled by CUDA-X working with these latest superchip architectures — resulting in up to 11x speedups for computational engineering tools and 5x larger calculations compared with traditional accelerated computing architectures.

This greatly accelerates and improves workflows in engineering simulation, design optimization and more, helping scientists and researchers reach groundbreaking results faster.

NVIDIA released CUDA in 2006, opening up a world of applications to the power of accelerated computing. Since then, NVIDIA has built more than 900 domain-specific NVIDIA CUDA-X libraries and AI models, making it easier to adopt accelerated computing and driving incredible scientific breakthroughs. Now, CUDA-X brings accelerated computing to a broad new set of engineering disciplines, including astronomy, particle physics, quantum physics, automotive, aerospace and semiconductor design.

The NVIDIA Grace CPU architecture delivers a significant boost to memory bandwidth while reducing power consumption. And NVIDIA NVLink-C2C interconnects provide such high bandwidth that the GPU and CPU can share memory, allowing developers to write less-specialized code, run larger problems and improve application performance.

Accelerating Engineering Solvers With NVIDIA cuDSS

NVIDIA’s superchip architectures allow users to extract greater performance from the same underlying GPU by making more efficient use of CPU and GPU processing capabilities.

The NVIDIA cuDSS library is used to solve large engineering simulation problems involving sparse matrices for applications such as design optimization, electromagnetic simulation workflows and more. cuDSS uses Grace CPU memory and the high-bandwidth NVLink-C2C interconnect to factorize and solve large matrices that normally wouldn’t fit in GPU device memory. This enables users to solve extremely large problems in a fraction of the time.

The coherent memory shared between the GPU and the Grace CPU minimizes data movement, significantly reducing overhead for large systems. For a range of large computational engineering problems, tapping the Grace CPU memory and superchip architecture accelerated the most heavy-duty solution steps by up to 4x on the same GPU when using cuDSS hybrid memory.

Ansys has integrated cuDSS into its HFSS solver, delivering significant performance enhancements for electromagnetic simulations. With cuDSS, HFSS software achieves up to an 11x speed improvement for the matrix solver.

Altair OptiStruct has also adopted the cuDSS Direct Sparse Solver library, substantially accelerating its finite element analysis workloads.

These performance gains are achieved by optimizing key operations on the GPU while intelligently using CPUs for shared memory and heterogeneous CPU and GPU execution. cuDSS automatically detects areas where CPU utilization provides additional benefits, further enhancing efficiency.

Scaling Up at Warp Speed With Superchip Memory

Scaling memory-limited applications on a single GPU becomes possible with the GB200 and GH200 architectures’ NVLink-C2C interconnects, which provide CPU and GPU memory coherency.

Many engineering simulations are limited by scale and require massive simulations to produce the resolution necessary to design equipment with intricate components, such as aircraft engines. By tapping into the ability to seamlessly read and write between CPU and GPU memories, engineers can easily implement out-of-core solvers to process larger data.

For example, using NVIDIA Warp — a Python-based framework for accelerating data generation and spatial computing applications — Autodesk performed simulations of up to 48 billion cells using eight GH200 nodes. This is more than 5x larger than the simulations possible using eight NVIDIA H100 nodes.
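
To give a flavor of the programming model (this is not Autodesk's solver), here is a minimal Warp kernel that scales a field on the GPU; the problem size and values are arbitrary:

import numpy as np
import warp as wp

wp.init()

@wp.kernel
def scale(field: wp.array(dtype=float), s: float):
    # One thread per cell: multiply each value in place.
    i = wp.tid()
    field[i] = field[i] * s

n = 1_000_000
field = wp.array(np.ones(n, dtype=np.float32), device="cuda")
wp.launch(scale, dim=n, inputs=[field, 2.0])
print(field.numpy()[:3])  # [2. 2. 2.]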

Powering Quantum Computing Research With NVIDIA cuQuantum

Quantum computers promise to accelerate problems that are core to many science and industry disciplines. Shortening the time to useful quantum computing rests heavily on the ability to simulate extremely complex quantum systems.

Simulations allow researchers to develop new algorithms today that will run at scales suitable for tomorrow’s quantum computers. They also play a key role in improving quantum processors, running complex simulations of performance and noise characteristics of new qubit designs.

So-called state vector simulations of quantum algorithms require matrix operations to be performed on exponentially large vector objects that must be stored in memory. Tensor network simulations, on the other hand, simulate quantum algorithms through tensor contractions and can enable hundreds or thousands of qubits to be simulated for certain important classes of applications.

The NVIDIA cuQuantum library accelerates these workloads. cuQuantum is integrated with every leading quantum computing framework, so all quantum researchers can tap into simulation performance with no code changes.
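
As a small illustration of the tensor-network path (assuming the cuQuantum Python package is installed), an einsum-style contraction can be dispatched to cuTensorNet in a single call; the shapes and subscripts here are arbitrary:

import numpy as np
from cuquantum import contract  # cuQuantum Python

# Contract two small tensors; cuTensorNet finds a contraction path and
# executes it on the GPU. Real workloads contract networks representing
# quantum circuits with many more tensors.
a = np.random.rand(2, 2, 2)
b = np.random.rand(2, 2, 2)
result = contract("ijk,kjl->il", a, b)
print(result.shape)  # (2, 2)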

Simulations of quantum algorithms are generally limited in scale by memory requirements. The GB200 and GH200 architectures provide an ideal platform for scaling up quantum simulations, as they enable large CPU memory to be used without bottlenecking performance. A GH200 system is up to 3x faster than an H100 system with x86 on quantum computing benchmarks.

Learn more about CUDA-X libraries, attend the GTC session on how math libraries can help accelerate applications on NVIDIA Blackwell GPUs and watch NVIDIA founder and CEO Jensen Huang’s GTC keynote.

Read More