Best practices for viewing and querying Amazon SageMaker service quota usage

Amazon SageMaker customers can view and manage their quota limits through Service Quotas. In addition, they can view near real-time utilization metrics and use Amazon CloudWatch to monitor and programmatically query SageMaker quota usage.

SageMaker helps you build, train, and deploy machine learning (ML) models with ease. To learn more, refer to Getting started with Amazon SageMaker. Service Quotas simplifies limit management by allowing you to view and manage your quotas for SageMaker from a central location.

With Service Quotas, you can view the maximum number of resources, actions, or items in your AWS account or AWS Region. You can also use Service Quotas to request an increase for adjustable quotas.
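
For example, the following snippet shows one way to list SageMaker quotas and their applied values programmatically with the AWS SDK for Python (Boto3); the Region shown is a placeholder.

import boto3

# List SageMaker quotas and their applied values in the chosen Region
sq = boto3.client('service-quotas', region_name='us-east-1')

paginator = sq.get_paginator('list_service_quotas')
for page in paginator.paginate(ServiceCode='sagemaker'):
    for quota in page['Quotas']:
        print(f"{quota['QuotaName']}: {quota['Value']} (adjustable: {quota['Adjustable']})")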

With the increasing adoption of MLOps practices, and the resulting demand for resources dedicated to ML model experimentation and retraining, more customers need to run multiple instances, often of the same instance type, at the same time.

Many data science teams often work in parallel, using several instances for processing, training, and tuning concurrently. Previously, users would sometimes reach an adjustable account limit for some particular instance type and have to manually request a limit increase from AWS.

To request quota increases manually from the Service Quotas UI, you can choose the quota from the list and choose Request quota increase. For more information, refer to Requesting a quota increase.

In this post, we show how you can use the new features to automatically request limit increases when resource utilization reaches a preconfigured threshold.

Solution overview

The following diagram illustrates the solution architecture.

This architecture includes the following workflow:

  1. A CloudWatch metric monitors the usage of the resource. A CloudWatch alarm triggers when the resource usage goes beyond a certain preconfigured threshold.
  2. A message is sent to Amazon Simple Notification Service (Amazon SNS).
  3. The message is received by an AWS Lambda function.
  4. The Lambda function requests the quota increase.

Aside from requesting a quota increase for the specific account, the Lambda function can also add the quota increase to the organization template (up to 10 quotas). This way, any new account created under a given AWS Organization has the increased quota requests by default.
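
The following is a minimal sketch of what such a Lambda handler could look like; it is not the exact function from the sample repository, and the quota code, desired value, and Region are placeholders you would replace.

import boto3

SERVICE_CODE = 'sagemaker'
QUOTA_CODE = '<QUOTA-CODE>'  # placeholder; look up the code with list_service_quotas
DESIRED_VALUE = 4            # placeholder; the new limit to request

def lambda_handler(event, context):
    sq = boto3.client('service-quotas')

    # Request the increase for this account
    sq.request_service_quota_increase(
        ServiceCode=SERVICE_CODE,
        QuotaCode=QUOTA_CODE,
        DesiredValue=DESIRED_VALUE)

    # Optionally add the same increase to the organization quota request template
    # (up to 10 quotas) so that new member accounts get it by default
    sq.put_service_quota_increase_request_into_template(
        ServiceCode=SERVICE_CODE,
        QuotaCode=QUOTA_CODE,
        AwsRegion='us-east-1',  # placeholder Region
        DesiredValue=DESIRED_VALUE)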

Prerequisites

Complete the following prerequisite steps:

  1. Set up an AWS account and create an AWS Identity and Access Management (IAM) user. For instructions, refer to Secure Your AWS Account.
  2. Install the AWS SAM CLI.

Deploy using AWS Serverless Application Model

To deploy the application using the GitHub repo, run the following command in the terminal:

git clone https://github.com/aws-samples/sagemaker-quotas-alarm.git
cd sagemaker-quotas-alarm
sam build && sam deploy --stack-name usage --region us-east-1 --resolve-s3 --capabilities CAPABILITY_IAM --parameter-overrides ResourceUsageThreshold=50 SecurityGroupIds=<SECURITY-GROUP-IDS> SubnetIds=<SUBNETS>

After the solution is deployed, you should have a new alarm on the CloudWatch console. This alarm monitors usage of SageMaker notebook instances of the ml.t3.medium instance type.


If your resource usage reaches more than 50%, the alarm triggers and the Lambda function requests an increase.
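
The deployed stack creates this alarm for you, so the following Boto3 sketch is only illustrative; the AWS/Usage metric dimensions for ml.t3.medium notebook instances and the SNS topic ARN shown are assumptions and placeholders.

import boto3

cw = boto3.client('cloudwatch', region_name='us-east-1')

cw.put_metric_alarm(
    AlarmName='sagemaker-notebook-ml-t3-medium-utilization',  # placeholder name
    AlarmActions=['arn:aws:sns:us-east-1:111122223333:quota-topic'],  # placeholder SNS topic
    EvaluationPeriods=1,
    ComparisonOperator='GreaterThanThreshold',
    Threshold=50.0,  # percent of the quota, matching ResourceUsageThreshold=50
    TreatMissingData='notBreaching',
    Metrics=[
        {
            'Id': 'usage',
            'ReturnData': False,
            'MetricStat': {
                'Metric': {
                    'Namespace': 'AWS/Usage',
                    'MetricName': 'ResourceCount',
                    # Dimensions assumed for the ml.t3.medium notebook instance usage metric
                    'Dimensions': [
                        {'Name': 'Service', 'Value': 'SageMaker'},
                        {'Name': 'Type', 'Value': 'Resource'},
                        {'Name': 'Resource', 'Value': 'notebook-instance/ml.t3.medium'},
                        {'Name': 'Class', 'Value': 'None'},
                    ],
                },
                'Period': 300,
                'Stat': 'Maximum',
            },
        },
        {
            # Express usage as a percentage of the applied quota
            'Id': 'utilization',
            'Expression': '(usage / SERVICE_QUOTA(usage)) * 100',
            'Label': 'Utilization (%)',
            'ReturnData': True,
        },
    ])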


If the account you have is part of an AWS Organization and you have the quota request template enabled, you should also see those increases on the template, if the template has available slots. This way, new accounts from that organization also have the increases configured upon creation.


Deploy using the CloudWatch console

To deploy the application using the CloudWatch console, complete the following steps:

  1. On the CloudWatch console, choose All alarms in the navigation pane.
  2. Choose Create alarm.
  3. Choose Select metric.
  4. Choose Usage.
  5. Select the metric you want to monitor.
  6. Select the condition of when you would like the alarm to trigger.

For more configuration options for the alarm, see Create a CloudWatch alarm based on a static threshold.


  7. Configure the SNS topic to be notified about the alarm.

You can also use Amazon SNS to trigger a Lambda function when the alarm is triggered. See Using AWS Lambda with Amazon SNS for more information.


  8. For Alarm name, enter a name.
  9. Choose Next.
  10. Choose Create alarm.

Clean up

To clean up the resources created as part of this post, make sure to delete all the created stacks. To do that, run the following command:

sam delete --stack-name usage --region us-east-1

Conclusion

In this post, we showed how you can use the new integration from SageMaker with Service Quotas to automate the requests for quota increases for SageMaker resources. This way, data science teams can effectively work in parallel and reduce issues related to unavailability of instances.

You can learn more about Amazon SageMaker quotas by accessing the documentation. You can also learn more about Service Quotas in its documentation.


About the authors

Bruno Klein is a Machine Learning Engineer in the AWS ProServe team. He particularly enjoys creating automations and improving the lifecycle of models in production. In his free time, he likes to spend time outdoors and hiking.

Paras Mehra is a Senior Product Manager at AWS. He is focused on helping build Amazon SageMaker Training and Processing. In his spare time, Paras enjoys spending time with his family and road biking around the Bay Area. You can find him on LinkedIn.

Read More

Build custom code libraries for your Amazon SageMaker Data Wrangler Flows using AWS CodeCommit

As organizations grow in size and scale, the complexities of running workloads increase, and the need to develop and operationalize processes and workflows becomes critical. Therefore, organizations have adopted technology best practices, including microservice architecture, MLOps, DevOps, and more, to improve delivery time, reduce defects, and increase employee productivity. This post introduces a best practice for managing custom code within your Amazon SageMaker Data Wrangler workflow.

Data Wrangler is a low-code tool that facilitates data analysis, preprocessing, and visualization. It contains over 300 built-in data transformation steps to aid with feature engineering, normalization, and cleansing to transform your data without having to write any code.

In addition to the built-in transforms, Data Wrangler contains a custom code editor that allows you to implement custom code written in Python, PySpark, or SparkSQL.

When using Data Wrangler custom transform steps to implement your custom functions, you need to implement best practices around developing and deploying code in Data Wrangler flows.

This post shows how you can use code stored in AWS CodeCommit in the Data Wrangler custom transform step. This provides you with additional benefits, including:

  • Improve productivity and collaboration across personnel and teams
  • Version your custom code
  • Modify your Data Wrangler custom transform step without having to log in to Amazon SageMaker Studio to use Data Wrangler
  • Reference parameter files in your custom transform step
  • Scan code in CodeCommit using Amazon CodeGuru or any third-party application for security vulnerabilities before using it in Data Wrangler flows

Solution overview

This post demonstrates how to build a Data Wrangler flow file with a custom transform step. Instead of hardcoding the custom function into your custom transform step, you pull a script containing the function from CodeCommit, load it, and call the loaded function in your custom transform step.

For this post, we use the bank-full.csv data from the University of California Irvine Machine Learning Repository to demonstrate these functionalities. The data is related to the direct marketing campaigns of a banking institution. Often, more than one contact with the same client was required to assess if the product (bank term deposit) would be subscribed (yes) or not subscribed (no).

The following diagram illustrates this solution.

The workflow is as follows:

  1. Create a Data Wrangler flow file and import the dataset from Amazon Simple Storage Service (Amazon S3).
  2. Create a series of Data Wrangler transformation steps:
    • A custom transform step to implement a custom code stored in CodeCommit.
    • Two built-in transform steps.

We keep the transformation steps to a minimum so as not to detract from the aim of this post, which is focused on the custom transform step. For more information about available transformation steps and implementation, refer to Transform Data and the Data Wrangler blog.

  3. In the custom transform step, write code to pull the script and configuration file from CodeCommit, load the script as a Python module, and call a function in the script. The function takes a configuration file as an argument.
  4. Run a Data Wrangler job and set Amazon S3 as the destination.

Destination options also include Amazon SageMaker Feature Store.

Prerequisites

As a prerequisite, we set up the CodeCommit repository, Data Wrangler flow, and CodeCommit permissions.

Create a CodeCommit repository

For this post, we use an AWS CloudFormation template to set up a CodeCommit repository and copy the required files into this repository. Complete the following steps:

  1. Choose Launch Stack:

  2. Select the Region where you want to create the CodeCommit repository.
  3. Enter a name for Stack name.
  4. Enter a name for the repository to be created for RepoName.
  5. Choose Create stack.


AWS CloudFormation takes a few seconds to provision your CodeCommit repository. After the CREATE_COMPLETE status appears, navigate to the CodeCommit console to see your newly created repository.

Set up Data Wrangler

Download the bank.zip dataset from the University of California Irvine Machine Learning Repository. Then, extract the contents of bank.zip and upload bank-full.csv to Amazon S3.

To create a Data Wrangler flow file and import the bank-full.csv dataset from Amazon S3, complete the following steps:

  1. Onboard to SageMaker Studio using the quick start for users new to Studio.
  2. Select your SageMaker domain and user profile and on the Launch menu, choose Studio.

  3. On the Studio console, on the File menu, choose New, then choose Data Wrangler Flow.
  4. Choose Amazon S3 for Data sources.
  5. Navigate to the S3 bucket containing the file and select the bank-full.csv file.

A Preview Error will be thrown because the dataset uses a semicolon delimiter.

  6. In the Details pane on the right, change Delimiter to SEMICOLON.

A preview of the dataset will be displayed in the result window.

  7. In the Details pane, on the Sampling drop-down menu, choose None.

This is a relatively small dataset, so you don’t need to sample.

  8. Choose Import.

Configure CodeCommit permissions

You need to provide Studio with permission to access CodeCommit. We use a CloudFormation template to provision an AWS Identity and Access Management (IAM) policy that gives your Studio role permission to access CodeCommit. Complete the following steps:

  1. Choose Launch Stack:

  2. Select the Region you are working in.
  3. Enter a name for Stack name.
  4. Enter your Studio domain ID for SageMakerDomainID. The domain information is available on the SageMaker console Domains page, as shown in the following screenshot.

  5. Enter your Studio domain user profile name for SageMakerUserProfileName. You can view your user profile name by navigating into your Studio domain. If you have multiple user profiles in your Studio domain, enter the name for the user profile used to launch Studio.

  6. Select the acknowledgement box.

The IAM resources used by this CloudFormation template provide the minimum permissions to successfully create the IAM policy attached to your Studio role for CodeCommit access.

  7. Choose Create stack.

Transformation steps

Next, we add transformations to process the data.

Custom transform step

In this post, we calculate the variance inflation factor (VIF) for each numerical feature and drop features that exceed a VIF threshold. We do this in the custom transform step because Data Wrangler doesn’t have a built-in transform for this task as of this writing.

However, we don’t hardcode this VIF function. Instead, we pull this function from the CodeCommit repository into the custom transform step. Then we run the function on the dataset.

  1. On the Data Wrangler console, navigate to your data flow.
  2. Choose the plus sign next to Data types and choose Add transform.

  3. Choose + Add step.
  4. Choose Custom transform.
  5. Optionally, enter a name in the Name field.
  6. Choose Python (PySpark) on the drop-down menu.
  7. For Your custom transform, enter the following code (provide the name of the CodeCommit repository and Region where the repository is located):
# Table is available as variable `df`
import boto3
import os
import json
import numpy as np
from importlib import reload
import sys

# Initialize variables
repo_name = <Enter Name of Repository>  # Name of repository in CodeCommit
region = <Name of region where repository is located>  # Name of AWS Region
script_name = 'pyspark_transform.py'  # Name of script in CodeCommit
config_name = 'parameter.json'  # Name of configuration file in CodeCommit

# Create a directory to hold the downloaded files
newpath = os.getcwd() + "/input/"
if not os.path.exists(newpath):
    os.makedirs(newpath)
module_path = os.getcwd() + '/input/' + script_name

# Download the configuration file into memory
client = boto3.client('codecommit', region_name=region)
response = client.get_file(
    repositoryName=repo_name,
    filePath=config_name)
param_file = json.loads(response['fileContent'])

# Download the script into the directory
script = client.get_file(
    repositoryName=repo_name,
    filePath=script_name)
with open(module_path, 'w') as f:
    f.write(script['fileContent'].decode())

# Import the PySpark script as a module
sys.path.append(os.getcwd() + "/input/")
import pyspark_transform
# reload(pyspark_transform)

# Execute the custom function in the PySpark script
df = pyspark_transform.compute_vif(df, param_file)

The code uses the AWS SDK for Python (Boto3) to access CodeCommit API functions. We use the get_file API function to pull files from the CodeCommit repository into the Data Wrangler environment.

  8. Choose Preview.

In the Output pane, a table is displayed showing the different numerical features and their corresponding VIF value. For this exercise, the VIF threshold value is set to 1.2. However, you can modify this threshold value in the parameter.json file found in your CodeCommit repository. You will notice that two columns have been dropped (pdays and previous), bringing the total column count to 15.
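
The pyspark_transform.py script itself lives in CodeCommit and isn’t shown in the flow. The following is a simplified, hypothetical sketch of what a compute_vif function could look like; the parameter key name (vif_threshold), the use of statsmodels, and the iterative drop strategy are assumptions rather than the repository’s actual implementation.

# Hypothetical sketch of pyspark_transform.py stored in CodeCommit
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def compute_vif(df, params):
    """Drop numeric columns whose VIF exceeds the configured threshold."""
    threshold = float(params.get('vif_threshold', 1.2))  # assumed key in parameter.json
    numeric_cols = [f.name for f in df.schema.fields
                    if f.dataType.typeName() in ('integer', 'long', 'float', 'double')]
    pdf = df.select(numeric_cols).toPandas()

    # Iteratively drop the column with the highest VIF until all remaining
    # columns are at or below the threshold
    while len(pdf.columns) > 1:
        vifs = pd.Series(
            [variance_inflation_factor(pdf.values, i) for i in range(pdf.shape[1])],
            index=pdf.columns)
        worst = vifs.idxmax()
        if vifs[worst] <= threshold:
            break
        pdf = pdf.drop(columns=[worst])

    dropped = set(numeric_cols) - set(pdf.columns)
    return df.drop(*dropped)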

  9. Choose Add.

Encode categorical features

Some feature types are categorical variables that need to be transformed into numerical forms. Use the one-hot encode built-in transform to achieve this data transformation. Let’s create numerical features representing the unique value in each categorical feature in the dataset. Complete the following steps:

  1. Choose + Add step.
  2. Choose the Encode categorical transform.
  3. On the Transform drop-down menu, choose One-hot encode.
  4. For Input column, choose all categorical features, including poutcome, y, month, marital, contact, default, education, housing, job, and loan.
  5. For Output style, choose Columns.
  6. Choose Preview to preview the results.

One-hot encoding might take a while to generate results, given the number of features and unique values within each feature.

  7. Choose Add.

For each numerical feature created with one-hot encoding, the name is the categorical feature name followed by an underscore (_) and the unique categorical value within that feature.

Drop column

The y_yes feature is the target column for this exercise, so we drop the y_no feature.

  1. Choose + Add step.
  2. Choose Manage columns.
  3. Choose Drop column under Transform.
  4. Choose y_no under Columns to drop.
  5. Choose Preview, then choose Add.

Create a Data Wrangler job

Now that you have created all the transform steps, you can create a Data Wrangler job to process your input data and store the output in Amazon S3. Complete the following steps:

  1. Choose Data flow to go back to the Data Flow page.
  2. Choose the plus sign on the last tile of your flow visualization.
  3. Choose Add destination and choose Amazon S3.

  4. Enter the name of the output file for Dataset name.
  5. Choose Browse and choose the bucket destination for Amazon S3 location.
  6. Choose Add destination.
  7. Choose Create job.

  8. Change the Job name value as you see fit.
  9. Choose Next, 2. Configure job.
  10. Change Instance count to 1, because we work with a relatively small dataset, to reduce the cost incurred.
  11. Choose Create.

This will start an Amazon SageMaker Processing job to process your Data Wrangler flow file and store the output in the specified S3 bucket.

Automation

Now that you have created your Data Wrangler flow file, you can schedule your Data Wrangler jobs to automatically run at specific times and frequency. This is a feature that comes out of the box with Data Wrangler and simplifies the process of scheduling Data Wrangler jobs. Furthermore, CRON expressions are supported and provide additional customization and flexibility in scheduling your Data Wrangler jobs.

However, this post shows how you can automate the Data Wrangler job to run every time there is a modification to the files in the CodeCommit repository. This automation technique ensures that any changes to the custom code functions or changes to values in the configuration file in CodeCommit trigger a Data Wrangler job to reflect these changes immediately.

Therefore, you don’t have to manually start a Data Wrangler job to get the output data that reflects the changes you just made. With this automation, you can improve the agility and scale of your Data Wrangler workloads. To automate your Data Wrangler jobs, you configure the following:

  • Amazon SageMaker Pipelines – Pipelines helps you create machine learning (ML) workflows with an easy-to-use Python SDK, and you can visualize and manage your workflow using Studio
  • Amazon EventBridge – EventBridge facilitates connection to AWS services, software as a service (SaaS) applications, and custom applications as event producers to launch workflows.

Create a SageMaker pipeline

First, you need to create a SageMaker pipeline for your Data Wrangler job. Then complete the following steps to export your Data Wrangler flow to a SageMaker pipeline:

  1. Choose the plus sign on your last transform tile (the transform tile before the Destination tile).
  2. Choose Export to.
  3. Choose SageMaker Inference Pipeline (via Jupyter Notebook).

This creates a new Jupyter notebook prepopulated with code to create a SageMaker pipeline for your Data Wrangler job. Before running all the cells in the notebook, you may want to change certain variables.

  4. To add a training step to your pipeline, change the add_training_step variable to True.

Be aware that running a training job will incur additional costs on your account.

  5. Set the target_attribute_name variable to y_yes.

  6. To change the name of the pipeline, change the pipeline_name variable.

  7. Lastly, run the entire notebook by choosing Run and Run All Cells.

This creates a SageMaker pipeline and runs the Data Wrangler job.
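
Putting the preceding steps together, the values you review in the generated notebook might look like the following; the pipeline name is a placeholder.

# Values to review in the generated notebook before running all cells
add_training_step = True                      # append a training step (running it incurs training costs)
target_attribute_name = 'y_yes'               # target column produced by one-hot encoding
pipeline_name = 'datawrangler-vif-pipeline'   # placeholder; any unique pipeline name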

  8. To view your pipeline, choose the home icon on the navigation pane and choose Pipelines.

You can see the new SageMaker pipeline created.

  9. Choose the newly created pipeline to see the run list.
  10. Note the name of the SageMaker pipeline, as you will use it later.
  11. Choose the first run and choose Graph to see a Directed Acyclic Graph (DAG) flow of your SageMaker pipeline.

As shown in the following screenshot, we didn’t add a training step to our pipeline. If you added a training step to your pipeline, it will display in your pipeline run Graph tab under DataWranglerProcessingStep.

Create an EventBridge rule

After successfully creating your SageMaker pipeline for the Data Wrangler job, you can move on to setting up an EventBridge rule. This rule listens to activities in your CodeCommit repository and triggers the run of the pipeline in the event of a modification to any file in the CodeCommit repository. We use a CloudFormation template to automate creating this EventBridge rule. Complete the following steps:

  1. Choose Launch Stack:

  2. Select the Region you are working in.
  3. Enter a name for Stack name.
  4. Enter a name for your EventBridge rule for EventRuleName.
  5. Enter the name of the pipeline you created for PipelineName.
  6. Enter the name of the CodeCommit repository you are working with for RepoName.
  7. Select the acknowledgement box.

The IAM resources that this CloudFormation template uses provide the minimum permissions to successfully create the EventBridge rule.

  8. Choose Create stack.

It takes a few minutes for the CloudFormation template to run successfully. When the status changes to CREATE_COMPLETE, you can navigate to the EventBridge console to see the created rule.

Now that you have created this rule, any changes you make to the file in your CodeCommit repository will trigger the run of the SageMaker pipeline.
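
If you prefer to wire this up without CloudFormation, the following Boto3 sketch creates an equivalent rule and target; the account ID, repository ARN, pipeline ARN, and role ARN are placeholders, and the role must allow EventBridge to call sagemaker:StartPipelineExecution.

import boto3
import json

events = boto3.client('events', region_name='us-east-1')

# Fire whenever a branch in the repository is created or updated
events.put_rule(
    Name='datawrangler-codecommit-trigger',  # placeholder rule name
    EventPattern=json.dumps({
        'source': ['aws.codecommit'],
        'detail-type': ['CodeCommit Repository State Change'],
        'resources': ['arn:aws:codecommit:us-east-1:111122223333:<REPO-NAME>'],  # placeholder
        'detail': {'event': ['referenceCreated', 'referenceUpdated']},
    }))

# Start the SageMaker pipeline when the rule matches
events.put_targets(
    Rule='datawrangler-codecommit-trigger',
    Targets=[{
        'Id': 'start-datawrangler-pipeline',
        'Arn': 'arn:aws:sagemaker:us-east-1:111122223333:pipeline/<PIPELINE-NAME>',  # placeholder
        'RoleArn': 'arn:aws:iam::111122223333:role/<EVENTBRIDGE-ROLE>',  # placeholder
    }])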

To test the pipeline, edit a file in your CodeCommit repository: modify the VIF threshold in your parameter.json file to a different number, then go to the SageMaker pipeline details page to see a new run of your pipeline created.

In this new pipeline run, Data Wrangler drops numerical features that have a greater VIF value than the threshold you specified in your parameter.json file in CodeCommit.

You have successfully automated and decoupled your Data Wrangler job. Furthermore, you can add more steps to your SageMaker pipeline. You can also modify the custom scripts in CodeCommit to implement various functions in your Data Wrangler flow.

It’s also possible to store your scripts and files in Amazon S3 and download them into your Data Wrangler custom transform step as an alternative to CodeCommit. In addition, you ran your custom transform step using the Python (PySpark) framework. However, you can also use the Python (Pandas) framework for your custom transform step, allowing you to run custom Python scripts. You can test this out by changing your framework in the custom transform step to Python (Pandas) and modifying your custom transform step code to pull and implement the Python script version stored in your CodeCommit repository. However, the PySpark option for Data Wrangler provides better performance when working on a large dataset compared to the Python Pandas option.

Clean up

After you’re done experimenting with this use case, clean up the resources you created to avoid incurring additional charges to your account:

  1. Stop the underlying instance used to create your Data Wrangler flow.
  2. Delete the resources created by the various CloudFormation templates.
  3. If you see a DELETE_FAILED state when deleting a CloudFormation stack, delete the stack one more time to successfully delete it.

Summary

This post showed you how to decouple your Data Wrangler custom transform step by pulling scripts from CodeCommit. We also showed how to automate your Data Wrangler jobs using SageMaker Pipelines and EventBridge.

Now you can operationalize and scale your Data Wrangler jobs without modifying your Data Wrangler flow file. You can also scan your custom code in CodeCommit using CodeGuru or any third-party application for vulnerabilities before implementing it in Data Wrangler. To know more about end-to-end machine learning operations (MLOps) on AWS, visit Amazon SageMaker for MLOps.


About the Author

Uchenna Egbe is an Associate Solutions Architect at AWS. He spends his free time researching about herbs, teas, superfoods, and how to incorporate them into his daily diet.

Read More

NVIDIA to Bring AI to Every Industry, CEO Says

ChatGPT is just the start.

With computing now advancing at what he called “lightspeed,” NVIDIA founder and CEO Jensen Huang today announced a broad set of partnerships with Google, Microsoft, Oracle and a range of leading businesses that bring new AI, simulation and collaboration capabilities to every industry.

“The warp drive engine is accelerated computing, and the energy source is AI,” Huang said in his keynote at the company’s GTC conference. “The impressive capabilities of generative AI have created a sense of urgency for companies to reimagine their products and business models.”

In a sweeping 78-minute presentation anchoring the four-day event, Huang outlined how NVIDIA and its partners are offering everything from training to deployment for cutting-edge AI services. He announced new semiconductors and software libraries to enable fresh breakthroughs. And Huang revealed a complete set of systems and services for startups and enterprises racing to put these innovations to work on a global scale.

Huang punctuated his talk with vivid examples of this ecosystem at work. He announced NVIDIA and Microsoft will connect hundreds of millions of Microsoft 365 and Azure users to a platform for building and operating hyperrealistic virtual worlds. He offered a peek at how Amazon is using sophisticated simulation capabilities to train new autonomous warehouse robots. He touched on the rise of a new generation of wildly popular generative AI services such as ChatGPT.

And underscoring the foundational nature of NVIDIA’s innovations, Huang detailed how, together with ASML, TSMC and Synopsys, NVIDIA computational lithography breakthroughs will help make a new generation of efficient, powerful 2-nm semiconductors possible.

The arrival of accelerated computing and AI comes just in time, with Moore’s Law slowing and industries tackling powerful dynamics: sustainability, generative AI and digitalization, Huang said. “Industrial companies are racing to digitalize and reinvent into software-driven tech companies — to be the disruptor and not the disrupted,” Huang said.

Acceleration lets companies meet these challenges. “Acceleration is the best way to reclaim power and achieve sustainability and Net Zero,” Huang said.

GTC: The Premier AI Conference

GTC, now in its 14th year, has become one of the world’s most important AI gatherings. This week’s conference features 650 talks from leaders such as Demis Hassabis of DeepMind, Valeri Taylor of Argonne Labs, Scott Belsky of Adobe, Paul Debevec of Netflix, Thomas Schulthess of ETH Zurich and a special fireside chat between Huang and Ilya Sutskever, co-founder of OpenAI, the creator of ChatGPT.

More than 250,000 registered attendees will dig into sessions on everything from restoring the lost Roman mosaics of 2,000 years ago to building the factories of the future, from exploring the universe with a new generation of massive telescopes to rearranging molecules to accelerate drug discovery, to more than 70 talks on generative AI.

The iPhone Moment of AI

NVIDIA’s technologies are fundamental to AI, with Huang recounting how NVIDIA was there at the very beginning of the generative AI revolution. Back in 2016 he hand-delivered to OpenAI the first NVIDIA DGX AI supercomputer — the engine behind the large language model breakthrough powering ChatGPT.

Launched late last year, ChatGPT went mainstream almost instantaneously, attracting over 100 million users, making it the fastest-growing application in history. “We are at the iPhone moment of AI,” Huang said.

NVIDIA DGX supercomputers, originally used as an AI research instrument, are now running 24/7 at businesses across the world to refine data and process AI, Huang reported. Half of all Fortune 100 companies have installed DGX AI supercomputers.

“DGX supercomputers are modern AI factories,” Huang said.

NVIDIA H100, Grace Hopper, Grace, for Data Centers

Deploying LLMs like ChatGPT is a significant new inference workload, Huang said. For large-language-model inference, like ChatGPT, Huang announced a new GPU — the H100 NVL with dual-GPU NVLink.

Based on NVIDIA’s Hopper architecture, H100 features a Transformer Engine designed to process models such as the GPT model that powers ChatGPT. Compared to HGX A100 for GPT-3 processing, a standard server with four pairs of H100 with dual-GPU NVLink is up to 10x faster.

“H100 can reduce large language model processing costs by an order of magnitude,” Huang said.

Meanwhile, over the past decade, cloud computing has grown 20% annually into a $1 trillion industry, Huang said. NVIDIA designed the Grace CPU for an AI- and cloud-first world, where AI workloads are GPU accelerated. Grace is sampling now, Huang said.

NVIDIA’s new superchip, Grace Hopper, connects the Grace CPU and Hopper GPU over a high-speed 900GB/sec coherent chip-to-chip interface. Grace Hopper is ideal for processing giant datasets like AI databases for recommender systems and large language models, Huang explained.

“Customers want to build AI databases several orders of magnitude larger,” Huang said. “Grace Hopper is the ideal engine.”

DGX the Blueprint for AI Infrastructure

The latest version of DGX features eight NVIDIA H100 GPUs linked together to work as one giant GPU. “NVIDIA DGX H100 is the blueprint for customers building AI infrastructure worldwide,” Huang said, sharing that NVIDIA DGX H100 is now in full production.

H100 AI supercomputers are already coming online.

Oracle Cloud Infrastructure announced the limited availability of new OCI Compute bare-metal GPU instances featuring H100 GPUs.

Additionally, Amazon Web Services announced its forthcoming EC2 UltraClusters of P5 instances, which can scale in size up to 20,000 interconnected H100 GPUs.

This follows Microsoft Azure’s private preview announcement last week for its H100 virtual machine, ND H100 v5.

Meta has now deployed its H100-powered Grand Teton AI supercomputer internally for its AI production and research teams.

And OpenAI will be using H100s on its Azure supercomputer to power its continuing AI research.

Other partners making H100 available include Cirrascale and CoreWeave, both of which announced general availability today. Additionally, Google Cloud, Lambda, Paperspace and Vultr are planning to offer H100.

And servers and systems featuring NVIDIA H100 GPUs are available from leading server makers including Atos, Cisco, Dell Technologies,  GIGABYTE, Hewlett Packard Enterprise, Lenovo and Supermicro.

DGX Cloud: Bringing AI to Every Company, Instantly

And to speed DGX capabilities to startups and enterprises racing to build new products and develop AI strategies, Huang announced NVIDIA DGX Cloud, through partnerships with Microsoft Azure, Google Cloud and Oracle Cloud Infrastructure to bring NVIDIA DGX AI supercomputers “to every company, from a browser.”

DGX Cloud is optimized to run NVIDIA AI Enterprise, the world’s leading acceleration software suite for end-to-end development and deployment of AI. “DGX Cloud offers customers the best of NVIDIA AI and the best of the world’s leading cloud service providers,” Huang said.

NVIDIA is partnering with leading cloud service providers to host DGX Cloud infrastructure, starting with Oracle Cloud Infrastructure. Microsoft Azure is expected to begin hosting DGX Cloud next quarter, and the service will soon expand to Google Cloud and more.

This partnership brings NVIDIA’s ecosystem to cloud service providers while amplifying NVIDIA’s scale and reach, Huang said. Enterprises will be able to rent DGX Cloud clusters on a monthly basis, ensuring they can quickly and easily scale the development of large, multi-node training workloads.

Supercharging Generative AI

To accelerate the work of those seeking to harness generative AI, Huang announced NVIDIA AI Foundations, a family of cloud services for customers needing to build, refine and operate custom LLMs and generative AI trained with their proprietary data and for domain-specific tasks.

AI Foundations services include NVIDIA NeMo for building custom language text-to-text generative models; Picasso, a visual language model-making service for customers who want to build custom models trained with licensed or proprietary content; and BioNeMo, to help researchers in the $2 trillion drug discovery industry.

Adobe is partnering with NVIDIA to build a set of next-generation AI capabilities for the future of creativity.

Getty Images is collaborating with NVIDIA to train responsible generative text-to-image and text-to-video foundation models.

Shutterstock is working with NVIDIA to train a generative text-to-3D foundation model to simplify the creation of detailed 3D assets.

Accelerating Medical Advances

And NVIDIA announced Amgen is accelerating drug discovery services with BioNeMo. In addition, Alchemab Therapeutics, AstraZeneca, Evozyne, Innophore and Insilico are all early-access users of BioNeMo.

BioNeMo helps researchers create, fine-tune and serve custom models with their proprietary data, Huang explained.

Huang also announced that NVIDIA and Medtronic, the world’s largest healthcare technology provider, are partnering to build an AI platform for software-defined medical devices. The partnership will create a common platform for Medtronic systems, ranging from surgical navigation to robotic-assisted surgery.

And today Medtronic announced that its GI Genius system, with AI for early detection of colon cancer, is built on NVIDIA Holoscan, a software library for real-time sensor processing systems, and will ship around the end of this year.

“The world’s $250 billion medical instruments market is being transformed,” Huang said.

Speeding Deployment of Generative AI Applications

To help companies deploy rapidly emerging generative AI models, Huang announced inference platforms for AI video, image generation, LLM deployment and recommender inference. They combine NVIDIA’s full stack of inference software with the latest NVIDIA Ada, Hopper and Grace Hopper processors — including the NVIDIA L4 Tensor Core GPU and the NVIDIA H100 NVL GPU, both launched today.

• NVIDIA L4 for AI Video can deliver 120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency.

• NVIDIA L40 for Image Generation is optimized for graphics and AI-enabled 2D, video and 3D image generation.

• NVIDIA H100 NVL for Large Language Model Deployment is ideal for deploying massive LLMs like ChatGPT at scale.

• And NVIDIA Grace Hopper for Recommendation Models is ideal for graph recommendation models, vector databases and graph neural networks.

Google Cloud is the first cloud service provider to offer L4 to customers with the launch of its new G2 virtual machines, available in private preview today. Google is also integrating L4 into its Vertex AI model store.

Microsoft, NVIDIA to Bring Omniverse to ‘Hundreds of Millions’

Unveiling a second cloud service to speed unprecedented simulation and collaboration capabilities to enterprises, Huang announced NVIDIA is partnering with Microsoft to bring NVIDIA Omniverse Cloud, a fully managed cloud service, to the world’s industries.

“Microsoft and NVIDIA are bringing Omniverse to hundreds of millions of Microsoft 365 and Azure users,” Huang said, also unveiling new NVIDIA OVX servers and a new generation of workstations powered by NVIDIA RTX Ada Generation GPUs and Intel’s newest CPUs optimized for NVIDIA Omniverse.

To show the extraordinary capabilities of Omniverse, NVIDIA’s open platform built for 3D design collaboration and digital twin simulation, Huang shared a video showing how NVIDIA Isaac Sim, NVIDIA’s robotics simulation and synthetic data generation platform, built on Omniverse, is helping Amazon save time and money with full-fidelity digital twins.

It shows how Amazon is working to choreograph the movements of Proteus, Amazon’s first fully autonomous warehouse robot, as it moves bins of products from one place to another in Amazon’s cavernous warehouses alongside humans and other robots.

Digitizing the $3 Trillion Auto Industry

Illustrating the scale of Omniverse’s reach and capabilities, Huang dug into Omniverse’s role in digitalizing the $3 trillion auto industry. By 2030, auto manufacturers will build 300 factories to make 200 million electric vehicles, Huang said, and battery makers are building 100 more megafactories. “Digitalization will enhance the industry’s efficiency, productivity and speed,” Huang said.

Touching on Omniverse’s adoption across the industry, Huang said Lotus is using Omniverse to virtually assemble welding stations. Mercedes-Benz uses Omniverse to build, optimize and plan assembly lines for new models. Rimac and Lucid Motors use Omniverse to build digital stores from actual design data that faithfully represent their cars.

Working with Idealworks, BMW uses Isaac Sim in Omniverse to generate synthetic data and scenarios to train factory robots. And BMW is using Omniverse to plan operations across factories worldwide and is building a new electric-vehicle factory, completely in Omniverse, two years before the plant opens, Huang said.

Separately, NVIDIA today announced that BYD, the world’s leading manufacturer of new energy vehicles (NEVs), will extend its use of the NVIDIA DRIVE Orin centralized compute platform in a broader range of its NEVs.

Accelerating Semiconductor Breakthroughs

Enabling semiconductor leaders such as ASML, TSMC and Synopsys to accelerate the design and manufacture of a new generation of chips as current production processes near the limits of what physics makes possible, Huang announced NVIDIA cuLitho, a breakthrough that brings accelerated computing to the field of computational lithography.

The new NVIDIA cuLitho software library for computational lithography is being integrated by TSMC, the world’s leading foundry, as well as electronic design automation leader Synopsys into their software, manufacturing processes and systems for the latest-generation NVIDIA Hopper architecture GPUs.

Chip-making equipment provider ASML is working closely with NVIDIA on GPUs and cuLitho, and plans to integrate support for GPUs into all of their computational lithography software products. With lithography at the limits of physics, NVIDIA’s introduction of cuLitho enables the industry to go to 2nm and beyond, Huang said.

“The chip industry is the foundation of nearly every industry,” Huang said.

Accelerating the World’s Largest Companies

Companies around the world are on board with Huang’s vision.

Telecom giant AT&T uses NVIDIA AI to more efficiently process data and is testing Omniverse ACE and the Tokkio AI avatar workflow to build, customize and deploy virtual assistants for customer service and its employee help desk.

American Express, the U.S. Postal Service, Microsoft Office and Teams, and Amazon are among the 40,000 customers using the high-performance NVIDIA TensorRT inference optimizer and runtime, and NVIDIA Triton, a multi-framework data center inference serving software.

Uber uses Triton to serve hundreds of thousands of ETA predictions per second.

And with over 60 million daily users, Roblox uses Triton to serve models for game recommendations, build avatars, and moderate content and marketplace ads.

Microsoft, Tencent and Baidu are all adopting NVIDIA CV-CUDA for AI computer vision. The technology, in open beta, optimizes pre- and post-processing, delivering 4x savings in cost and energy.

Helping Do the Impossible

Wrapping up his talk, Huang thanked NVIDIA’s systems, cloud and software partners, as well as researchers, scientists and employees.

NVIDIA has updated 100 acceleration libraries, including cuQuantum and the newly open-sourced CUDA Quantum for quantum computing, cuOpt for combinatorial optimization, and cuLitho for computational lithography, Huang announced.

The global NVIDIA ecosystem, Huang reported, now spans 4 million developers, 40,000 companies and 14,000 startups in NVIDIA Inception.

“Together,” Huang said, “we are helping the world do the impossible.”

Read More

Fresh-Faced AI: NVIDIA Avatar Solutions Enhance Customer Service and Virtual Assistants

Companies across industries are looking to use interactive avatars to enhance digital experiences. But creating them is a complex, time-consuming process requiring state-of-the-art AI models that can see, hear, understand and communicate with end users.

To ease this process, NVIDIA is providing creators and developers with real-time AI solutions through Omniverse Avatar Cloud Engine (ACE), a suite of cloud-native microservices for end-to-end development of interactive avatars. In collaboration with early-access partners, NVIDIA is delivering improvements that will provide users with the tools they need to easily design and deploy various kinds of avatars, from interactive chatbots to intelligent digital humans.

AT&T and Quantiphi are among the first to experience how Omniverse ACE can help increase employee productivity and enhance customer service experiences.

Omniverse ACE users can now seamlessly integrate NVIDIA AI into their applications, including Riva for speech AI, NeMo service for natural language understanding, and Omniverse Audio2Face or Live Portrait for AI-powered 2D and 3D character animation.

With the latest improvements to Omniverse ACE, teams can also deploy advanced avatars across web conferencing and customer service use cases by integrating domain-specific NVIDIA AI workflows like Tokkio and Maxine.

Early Partners and Customers Develop AI-Driven Digital Humans

AT&T is planning to use Omniverse ACE and the Tokkio AI avatar workflow to build, customize and deploy virtual assistants for customer service and its employee help desk. Working with Quantiphi, one of NVIDIA’s service delivery partners, AT&T is developing interactive avatars that can provide 24/7 support in local languages across regions. This is helping the company reduce costs while providing a better experience for its employees worldwide.

In addition to customer service, AT&T is planning to build and develop digital humans for various use cases across the company.

“Quantiphi and NVIDIA have been collaborating to make customer experience more immersive by combining the power of large language models, graphics and recommender systems,” said Siddharth Kotwal, global head of NVIDIA Practice at Quantiphi. “NVIDIA’s Tokkio framework has made it easier to build, deploy and personalize AI-powered digital assistants or avatars for our enterprise customers. The process of seamlessly integrating automatic speech recognition, conversational agents and information retrieval systems with real-time animation has been simplified.”

Leading professional-services company Deloitte is also working with NVIDIA to help enterprises deploy transformative applications. Deloitte’s latest hybrid-cloud offerings — which consist of NVIDIA AI and Omniverse services and platforms, including Omniverse ACE — will be added to the Deloitte Center for AI Computing.

An Advanced, Streamlined Solution for Deploying Avatars

Omniverse ACE provides all the necessary tools so users can streamline the development process for realistic, intelligent avatars. Teams can also customize pre-built AI avatar workflows to suit their needs with applications like NVIDIA Tokkio. Additionally, Omniverse ACE is bringing new improvements to existing microservices.

Learn more about NVIDIA Omniverse ACE and register to join the early-access program, available now for developers.

Dive into the art of AI avatars at GTC, a global conference for the era of AI and the metaverse. Join sessions with NVIDIA and industry experts, and watch the GTC keynote below:

Read More

NVIDIA Metropolis Ecosystem Grows With Advanced Development Tools to Accelerate Vision AI

With AI at its tipping point, AI-enabled computer vision is being used to address the world’s most challenging problems in nearly every industry.

At GTC, a global conference for the era of AI and the metaverse running through Thursday, March 23, NVIDIA announced technology updates poised to drive the next wave of vision AI adoption. These include NVIDIA TAO Toolkit 5.0 for creating customized, production-ready AI models; expansions to the NVIDIA DeepStream software development kit for developing vision AI applications and services; and early access to Metropolis Microservices for powerful, cloud-native building blocks that accelerate vision AI.

Exploding Adoption and Ecosystem

More than 1,000 companies are using NVIDIA Metropolis developer tools to solve Internet of Things (IoT), sensor processing and operational challenges with vision AI — and the rate of adoption is quickening. The tools have now been downloaded over 1 million times by those looking to build vision AI applications.

PepsiCo is optimizing its operations with NVIDIA Metropolis to improve throughput, reduce downtime and minimize energy consumption.

The convenience-food and beverages giant is developing AI-powered digital twins of its distribution centers using the NVIDIA Omniverse platform to visualize how different setups in its facilities will impact operational efficiency before implementing them in the real world. PepsiCo is also using advanced machine vision technology, powered by the NVIDIA AI platform and GPUs, to improve efficiency and accuracy in its distribution process.

Siemens, a technology leader in industrial automation and digitalization, is adding next-level perception into its edge-based applications through NVIDIA Metropolis. With millions of sensors across factories, Siemens uses NVIDIA Metropolis — a key application framework for edge AI — to connect entire fleets of robots and IoT devices and bring AI into its industrial environments.

Automaker BMW Group is using computer vision technologies based on lidar and cameras — built by Seoul Robotics and powered by the NVIDIA Jetson edge AI platform — at its manufacturing facility in Munich to automate the movement of cars. This automation has resulted in significant time and cost savings, as well as employee safety improvements.

Making World-Class Vision AI Accessible to Any Developer on Any Device

As AI is made accessible to developers of any skill level, the next phase of AI adoption will arrive.

GTC is showcasing major expansions of Metropolis workflows, which put some of the latest AI capabilities and research into the hands of developers through NVIDIA TAO Toolkit, Metropolis Microservices and the DeepStream SDK, as well as the NVIDIA Isaac Sim synthetic data generation tool and robotics simulation applications.

NVIDIA TAO Toolkit is a low-code AI framework that supercharges vision AI model development for practically any developer, in any service, on any device. TAO 5.0 is filled with new features, including vision transformer pretrained AI models, the ability to deploy models on any platform with standard ONNX export, automatic hyperparameter tuning with AutoML, and AI-assisted data annotation.

STMicroelectronics, a global leader in embedded microcontrollers, integrates TAO into its STM32Cube AI developer workflow. TAO has enabled the company to run sophisticated AI in widespread IoT and edge use cases that STM32 microcontrollers power within their compute and memory budget.

The NVIDIA DeepStream SDK has emerged as a powerful tool for developers looking to create vision AI applications across a wide range of industries. With its latest update, a new graph execution runtime (GXF) allows developers to expand beyond the open-source GStreamer multimedia framework. DeepStream’s addition of GXF is a game-changer for users seeking to build applications that require tight execution control, advanced scheduling and critical thread management. This feature unlocks a host of new applications, including those in industrial quality control, robotics and autonomous machines.

Adding perception to physical spaces often requires applying vision AI to numerous cameras covering multiple regions.

Challenges in computer vision include monitoring the flow of packaged goods across a warehouse or analyzing individual customer flow across a large retail space. Metropolis Microservices make these sophisticated vision AI tasks easy to integrate and deploy into users’ applications.

Leading IT services company Infosys is using NVIDIA Metropolis to supercharge its vision AI application development and deployment. The NVIDIA TAO low-code training framework and pretrained models help Infosys reduce AI training efforts. Metropolis Microservices, along with the DeepStream SDK, optimize the company’s vision processing pipeline throughput and cut overall solution costs. Infosys can also generate troves of synthetic data with the NVIDIA Omniverse Replicator SDK to easily train AI models with new stock keeping units and packaging.

Latest Metropolis Features

Tap into the latest in NVIDIA vision AI technologies:

Register free to attend GTC, and watch these sessions to learn how to accelerate vision AI application development and understand its many use cases.

Watch NVIDIA founder and CEO Jensen Huang’s GTC keynote in replay:

Read More

NVIDIA Studio at GTC: New AI-Powered Artistic Tools, Feature Updates, NVIDIA RTX Systems for Creators

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows. We’re also deep diving on new GeForce RTX 40 Series GPU features, technologies and resources, and how they dramatically accelerate content creation.

Powerful AI technologies are revolutionizing 3D content creation — whether by enlivening realistic characters that show emotion or turning simple texts into imagery.

The brightest minds, artists and creators are gathering at NVIDIA GTC, a free, global conference on AI and the metaverse, taking place online through Thursday, March 23.

NVIDIA founder and CEO Jensen Huang’s GTC keynote announced a slew of advancements set to ease creators’ workflows, including using generative AI with the Omniverse Audio2Face app.

NVIDIA Omniverse, a platform for creating and operating metaverse applications, further expands with an updated Unreal Engine Connector, open-beta Unity Connector and new SimReady 3D assets.

New NVIDIA RTX GPUs, powered by the Ada Lovelace architecture, are fueling next-generation laptop and desktop workstations to meet the demands of AI, design and the industrial metaverse.

The March NVIDIA Studio Driver adds support for the popular RTX Video Super Resolution feature, now available for GeForce RTX 40 and 30 Series GPUs.

And this week In the NVIDIA Studio, the Adobe Substance 3D art and development team explores the process of collaborating to create the animated short End of Summer using Omniverse USD Composer (formerly known as Create). 

Omniverse Overdrive

Specialized generative AI tools can boost creator productivity, even for users who don’t have extensive technical skills. Generative AI brings creative ideas to life, producing high-quality, highly iterative experiences — all in a fraction of the time and cost of traditional asset development.

The Omniverse Audio2Face AI-powered app allows 3D artists to efficiently animate secondary characters,  generating realistic facial animations with just an audio file — replacing what is often a tedious, manual process.

The latest release delivers significant upgrades in quality, usability and performance including a new headless mode and a REST API — enabling game developers and other creators to run the app and process numerous audio files from multiple users in the data center.

A new Omniverse Connector developed by NVIDIA for Unity workflows is available in open beta. Unity scenes can be added directly onto Omniverse Nucleus servers with access to platform features: the DeepSearch tool, thumbnails, bookmarks and more. Unidirectional live-sync workflows enable real-time updates.

With the Unreal Engine Connector’s latest release, Omniverse users can now use Unreal Engine’s USD import utilities to add skeletal mesh blend shape importing, and Python USD bindings to access stages on Omniverse Nucleus. This release also delivers improvements in import, export and live workflows, as well as updated software development kits.

In addition, over 1,000 new SimReady assets are available for creators. SimReady assets are built to real-world scale with accurate mass, physical materials and center of gravity for use within Omniverse PhysX for the most photorealistic visuals and accurate movements.

March Studio Driver Brings Superfly Super Resolution

Over 90% of online videos consumed by NVIDIA RTX GPU owners are 1080p resolution or lower, often resulting in upscaling that further degrades the picture despite the hardware being able to handle more.

The solution: RTX Video Super Resolution. The new feature, available on GeForce RTX 30 and 40 Series GPUs, uses AI to improve the quality of any video streamed through Google Chrome and Microsoft Edge browsers.

Click the image to see the differences between bicubic upscaling (left) and RTX Video Super Resolution.

This improves video sharpness and clarity. Users can watch online content in its native resolution, even on high-resolution displays.

RTX Video Super Resolution is now available in the March Studio Driver, which can be downloaded today.

New NVIDIA RTX GPUs Power Professional Creators

Six new professional-grade NVIDIA RTX GPUs — based on the Ada Lovelace architecture — enable creators to meet the demands of their most complex workloads using laptops and desktops.

The NVIDIA RTX 5000, RTX 4000, RTX 3500, RTX 3000 and RTX 2000 Ada Generation laptop GPUs deliver up to 2x the performance compared with the previous generation. The NVIDIA RTX 4000 Small Form Factor (SFF) Ada Generation desktop GPU features new RT Cores, Tensor Cores and CUDA cores with up to 20GB of graphics memory.

These include the latest NVIDIA Max-Q and RTX technologies and are backed by the NVIDIA Studio platform with RTX optimizations in over 110 creative apps, NVIDIA RTX Enterprise Drivers for the highest levels of stability and performance, and exclusive AI-powered NVIDIA tools: Omniverse, Canvas and Broadcast.

Professionals using these laptop GPUs can run advanced technologies like DLSS 3 to increase frame rates by up to 4x compared to the previous generation, and Omniverse Enterprise for real-time collaboration and simulation.

Next-generation mobile workstations featuring NVIDIA RTX GPUs will be available starting this month.

Creative Boosts at GTC

  • Experience GTC for more inspiring content, expert-led sessions and a must-see keynote to accelerate your life’s creative work.
  • Catch these sessions on Omniverse, AI and 3D workflows — live or on demand:
  • Fireside Chat With OpenAI Founder Ilya Sutskever and Jensen Huang: AI Today and Vision of the Future [S52092]
  • How Generative AI Is Transforming the Creative Process: Fireside Chat With Adobe’s Scott Belsky and NVIDIA’s Bryan Catanzaro [S52090]
  • Generative AI Demystified [S52089]
  • 3D by AI: How Generative AI Will Make Building Virtual Worlds Easier [S52163]
  • Custom World Building With AI Avatars: The Little Martians Sci-Fi Project [S51360]
  • AI-Powered, Real-Time, Markerless: The New Era of Motion Capture [S51845]
  • 3D and Beyond: How 3D Artists Can Build a Side Hustle in the Metaverse [SE52117]
  • NVIDIA Omniverse User Group [SE52047]
  • Accelerate the Virtual Production Pipeline to Produce an Award-Winning Sci-Fi Short Film [S51496]

As part of the Watch ‘n Learn Giveaway with valued partner 80LV, GTC attendees who register for any Omniverse for creators session — or watch on-demand before March 30 — have a chance to win a powerful GeForce RTX 4080 GPU. Simply fill out this form and tag #GTC23 and @NVIDIAOmniverse with the name of the session.

Search the GTC session catalog and check out the “Media and Entertainment” and “Omniverse” topics for additional creator-focused sessions.

A Father-Daughter Journey Back Home

The short animation End of Summer, created by the Substance 3D art and development team at Adobe, may evoke a surprising amount of heart. That was the team’s intent.

“We loved the idea of allowing the artwork to invoke an emotion in the viewer, letting them develop their own version of a story they felt was unfolding before their eyes,” said team member Wes McDermott.

“End of Summer” design boards.

End of Summer, a nod to stop-motion animation studios such as Laika, began as an internal Adobe Substance 3D project aimed at accomplishing two goals.

First, to encourage a relatively new group of artists to work together as a team by leaning into a creative endeavor. And second, to test how well their pipeline and feature set could take advantage of the Universal Scene Description (USD) framework.

Early concept work for “End of Summer.”

The group divided the task of creating assets across the most popular 3D apps, including Adobe Substance 3D Modeler, Autodesk 3ds Max, Autodesk Maya, Blender and Maxon’s Cinema 4D. Their GeForce RTX GPUs unlocked AI denoising in the viewport for fast, interactive rendering and GPU-accelerated filters to speed up and simplify material creation.

“NVIDIA Omniverse is a great tool for laying out and setting up dressing scenes, as well as learning about USD workflows and collaboration. We used painting and NVIDIA PhysX collision tools to place assets.” — Wes McDermott

“We quickly started to see the power of using USD as not only an export format but also a way to build assets,” McDermott said. “USD enables artists on the team to use whatever 3D app they felt most comfortable with.”

The Adobe team relied heavily on the Substance 3D asset library of materials, models and lights to create their studio environment. All textures were applied in Substance 3D Painter, where RTX-accelerated light and ambient occlusion baking optimized assets in mere moments.

Then, they imported all models into Omniverse USD Composer, where the team simultaneously refined and assembled assets.
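To make the multi-app handoff concrete, here is a minimal sketch of assembling assets on a single USD stage through references, using Pixar's pxr Python API; the file names and prim paths are hypothetical placeholders, not files from the Adobe project.

```python
from pxr import Usd, UsdGeom, Gf

# Create a stage that acts as the shared scene assembly.
stage = Usd.Stage.CreateNew("set_assembly.usda")
world = UsdGeom.Xform.Define(stage, "/World")
stage.SetDefaultPrim(world.GetPrim())

# Each asset was exported to USD from a different 3D app.
# Referencing composes them here without flattening the source files.
assets = {
    "/World/Cabin": "cabin_from_3dsmax.usd",        # hypothetical file names
    "/World/Dock": "dock_from_blender.usd",
    "/World/Figures": "figures_from_substance.usd",
}

for prim_path, asset_file in assets.items():
    prim = UsdGeom.Xform.Define(stage, prim_path).GetPrim()
    prim.GetReferences().AddReference(asset_file)

# Placement tweaks live as overrides on this stage, not in the source assets.
UsdGeom.XformCommonAPI(stage.GetPrimAtPath("/World/Dock")).SetTranslate(
    Gf.Vec3d(4.0, 0.0, -2.0)
)

stage.GetRootLayer().Save()
```

Because each asset stays a live reference rather than a baked import, artists can keep iterating in whichever app they prefer and the assembled stage picks up their changes, which mirrors the collaborative workflow McDermott describes.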

“This was also during the pandemic, and we were all quarantined in our homes,” McDermott said. “Having a project we could work on together as a team helped us to communicate and be creative.”

Accelerate scene composition, and assemble, simulate and render 3D scenes in real time in Omniverse USD Composer.

Lastly, the artists imported the scene into Unreal Engine as a stage for lighting and rendering.

Final scene edits in Unreal Engine.

McDermott stressed the importance of hardware in his team’s workflows. “The bakers in Substance Painter are GPU accelerated and benefit greatly from NVIDIA RTX GPUs,” he said. “We were also heavily working on Unreal Engine and reliant on real-time rendering.”

For more on this workflow, check out the GTC session, 3D Art Goes Multiplayer: Behind the Scenes of Adobe Substance’s ‘End of Summer’ Project With Omniverse. Registration is free.

Adobe Substance 3D team lead and artist Wes McDermott.

Check out McDermott’s portfolio on Instagram.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. Learn more about Omniverse on Instagram, Medium, Twitter and YouTube for additional resources and inspiration. Check out the Omniverse forums, and join our Discord server and Twitch channel to chat with the community.

From Concept to Production to Sales, NVIDIA AI and Omniverse Enable Automakers to Transform Their Entire Workflow

The automotive industry is undergoing a digital revolution, driven by breakthroughs in accelerated computing, AI and the industrial metaverse.

Automakers are digitalizing every phase of the product lifecycle — including concept and styling, design and engineering, software and electronics, smart factories, autonomous driving and retail — using the NVIDIA Omniverse platform and AI.

Based on the Universal Scene Description (USD) framework, Omniverse transforms complex 3D workflows, allowing teams to connect and customize 3D pipelines and simulate large-scale, physically accurate virtual worlds. By taking the automotive product workflow into the virtual world, automakers can bypass traditional bottlenecks to save critical time and reduce cost.

Bringing Ideas to Life

Designing new vehicle models — and refreshing current ones — is a collaborative process that requires review and alignment of even the tiniest details.

By refining concepts in Omniverse, designers can visualize every facet of a car’s interior and exterior in the full context of the broader vehicle. Global teams can iterate quickly with real-time, physically based, photorealistic rendering. For example, they can collaborate to design the cockpit’s critical components, such as digital instrument clusters and infotainment systems, which must strike a balance of communicating information while minimizing distraction.

Omniverse enables designers to flexibly lay out the cabin and cockpit onscreen user experience along with the vehicle’s physical interior to ensure a harmonious look and feel.

With this next-generation design process, automakers can catch flaws early and make real-time improvements, reducing the number of physical prototypes to test and validate.

Virtual Validation

Once the design is complete, developers can use Omniverse to kick the tires on their new concepts.

Perfecting the interior is necessary for customer experience as well as safety.

Developers can take these in-cabin designs for a spin in the virtual world, collaborating and sharing designs for efficient refinement and validation.

Digitalization is also transforming the way automakers approach vehicle engineering. Teams can test different materials and components in a virtual environment to further reduce physical prototyping. For example, engineers can use computational fluid dynamics to refine aerodynamics and perform virtual crash simulations for safer vehicle designs.

Continuous Improvement

The coming generation of vehicles is made up of highly advanced computers on wheels, packed with complex, centralized electronic systems and software for enhanced safety, intelligence and security.

Typically, vehicle functions are controlled by dozens of electronic control units distributed throughout a vehicle. By centralizing computing into core domains, automakers can replace many components and simplify what has been an incredibly complex supply chain.

With a digital representation of this entire architecture, automakers can simulate and test the vehicle software, and then provide over-the-air updates for continuous improvement throughout the car’s lifespan — from remote diagnostics to autonomous-driving capabilities to subscriptions for entertainment and other services.

Digital-First Production

Vehicle production is a colossal undertaking that requires thousands of parts and workers moving in sync. Any supply chain or production issues can lead to costly delays.

With Omniverse, automakers can develop and operate complex, AI-enabled virtual environments for factory and warehouse design. These physically based, precision-timed digital twins are the key to unlocking operational efficiencies with predictive analysis and process automation.

Factory planners can access the digital twin of the factory to review and improve the plant as needed. Every change can be quickly evaluated and validated in the virtual world, then implemented in the real world to ensure maximum efficiency and optimal ergonomics for factory workers.

Additionally, automakers can synchronize plant locations anywhere in the world for scalable design and iteration.

Autonomous Vehicle Proving Grounds

On top of enhancing traditional product development and manufacturing, Omniverse offers a complete toolchain for developing and validating automated and autonomous-driving systems.

NVIDIA DRIVE Sim is a physically based simulation platform, built on NVIDIA Omniverse, designed for fast and efficient autonomous-vehicle testing and validation at scale. It is time-accurate and supports the complete development toolchain, so developers can run simulation at the component level or for the entire system.

With DRIVE Sim, developers can repeatedly simulate routine driving scenarios, as well as rare and hazardous conditions that are too risky to test in the real world. Additionally, real-world driving recordings can be turned into reactive simulation scenarios using the platform’s Neural Reconstruction Engine.

Automakers can also fine-tune their advanced driver-assistance and autonomous-vehicle systems for New Car Assessment Program (NCAP) regulations, which evaluate the safety performance of new cars based on several crash tests and safety features.

The DRIVE Sim NCAP tool provides high-fidelity NCAP test protocols in simulation, so automakers can efficiently perform dedicated development and validation at scale.

The ability to drive in physically based virtual environments can significantly accelerate the autonomous-vehicle development process, overcoming data collection and scenario diversity hurdles that occur in real-world testing.

Omniverse’s generative AI reconstructs previously driven routes into 3D so past experiences can be reenacted or modified.

Try Before You Buy

The end customer benefits from digitalization, too.

Immersive technologies in Omniverse — including 3D visualization, augmented reality (AR) and virtual reality (VR) streamed using NVIDIA CloudXR — deliver consumers a more engaging experience, allowing them to explore features before making a purchase.

Prospective buyers can customize their vehicle in a car configurator — choosing colors, interior materials, trim levels and more — without being limited by the physical inventory of a dealership. They can then check out the car from every angle using 3D visualization. And with AR and VR, they can view and virtually test drive a car from anywhere.

The benefits of digitalization extend beyond the automotive industry. With Omniverse, any enterprise can reimagine their workflows to increase efficiency, productivity and speed, revolutionizing the way they do business. Omniverse is the digital-to-physical operating system to realize industrial digitalization.

Learn more about the latest in AI and the metaverse by watching NVIDIA founder and CEO Jensen Huang’s GTC keynote address.

From Training AI in the Cloud to Running It on the Road, Transportation Leaders Trust NVIDIA DRIVE

Transportation industry trailblazers are propelling their next-generation vehicles by building on NVIDIA DRIVE end-to-end solutions, which span the cloud to the car.

The world’s best-selling new energy vehicle (NEV) brand BYD announced at NVIDIA GTC that it’s using the NVIDIA DRIVE Orin centralized compute platform to power an even wider range of vehicles within its mainstream Dynasty and Ocean series of NEVs.

This comes hot on the heels of BYD’s recent announcement that it’s working to bring the NVIDIA GeForce NOW cloud gaming service to its vehicles to further enhance the in-car experience.

DeepRoute.ai, a developer of production-ready autonomous driving solutions, has launched its Driver 3.0 HD Map-Free solution. Built on NVIDIA DRIVE Orin, this product is designed to offer a non-geo-fenced solution for mass-produced advanced driver-assistance system (ADAS) vehicles, and will be available at the end of the year.

By using the computational power of the automotive-grade DRIVE Orin system-on-a-chip, which delivers 254 trillion operations per second (TOPS) of compute performance, DeepRoute’s HD Map-Free solution promises to accelerate deployment of driver-assistance functions to consumer cars and robotaxis.

Plus, Pony.ai announced that its autonomous-driving domain controller (ADC), powered by NVIDIA DRIVE, will be deployed for large-scale commercial use in autonomous-delivery vehicles for Beijing-based companies Meituan and Neolix.

With NVIDIA DRIVE Orin as the AI brain of their driverless vehicles, Meituan and Neolix are well-positioned to fulfill growing consumer demand for safe, scalable autonomous delivery of goods.

Lenovo announced it is a tier-one manufacturer of a new ADC based on the next-generation NVIDIA DRIVE Thor centralized computer. Packed with up to 2,000 TOPS of performance, DRIVE Thor will power Lenovo’s ADC, which is set to become the company’s top-tier vehicle computing product line, with mass production expected in 2025.

Rimac Technology, the engineering arm of Croatia-based Rimac Group, is working on a new central vehicle computer, or R-CVC, that will power ADAS, in-vehicle cockpit systems, the vehicle dynamics logic and the body and comfort software stack.

NVIDIA DRIVE hardware and software will be used in this platform to accelerate Rimac Technology’s development efforts and enable its manufacturer customers to speed time to market, reduce development costs, streamline maintenance, and boost vehicle performance.

Rimac Technology’s central vehicle computer.

New premium intelligent all-electric auto brand smart is now developing next-generation intelligent mobility solutions with NVIDIA. The startup will build its future all-electric portfolio using the NVIDIA DRIVE Orin platform to create a “smarter” urban mobility experience for its global customers. The start of vehicle production is expected by the end of 2024.

In addition, smart will collaborate with NVIDIA to build a dedicated data center for the development of highly advanced assisted-driving and AI systems to explore cutting-edge mobility solutions.

Changing the Rules of the Road

The transportation industry is undergoing a revolution, and NVIDIA is leading the charge with its game-changing DRIVE end-to-end platform, which is transforming the way mobility leaders are building advanced driving systems.

NVIDIA’s dedication to safer, smarter and more enjoyable in-vehicle experiences is core to all aspects of its DRIVE platform, from the ability to train AI in the data center to delivering high-performance centralized compute in the car.

The NVIDIA DRIVE AV and DRIVE IX software stacks enable custom applications, and the DRIVE Sim platform powered by Omniverse provides a comprehensive testing and validation platform for autonomous vehicles.

Learn more about the latest technology breakthroughs in automotive and other industries by watching NVIDIA founder and CEO Jensen Huang’s GTC keynote.

Mitsui and NVIDIA Announce World’s First Generative AI Supercomputer for Pharmaceutical Industry

Mitsui & Co., Ltd., one of Japan’s largest business conglomerates, is collaborating with NVIDIA on Tokyo-1 — an initiative to supercharge the nation’s pharmaceutical leaders with technology, including high-resolution molecular dynamics simulations and generative AI models for drug discovery.

Announced today at the NVIDIA GTC global AI conference, the Tokyo-1 project features an NVIDIA DGX AI supercomputer that will be accessible to Japan’s pharma companies and startups. The effort is poised to accelerate Japan’s $100 billion pharma industry, the world’s third largest following the U.S. and China.

“Japanese pharma companies are experts in wet lab research, but they have not yet taken advantage of high performance computing and AI on a large scale,” said Yuhi Abe, general manager of the digital healthcare business department at Mitsui. “With Tokyo-1, we are creating an innovation hub that will enable the pharma industry to transform the landscape with state-of-the-art tools for AI-accelerated drug discovery.”

The project will provide customers with access to NVIDIA DGX H100 nodes supporting molecular dynamics simulations, large language model training, quantum chemistry, generative AI models that create novel molecular structures for potential drugs, and more. Tokyo-1 users can also harness large language models for chemistry, protein, DNA and RNA data formats through the NVIDIA BioNeMo drug discovery software and service.

Xeureka, a Mitsui subsidiary focused on AI-powered drug discovery, will be operating Tokyo-1, which is expected to go online later this year. The initiative will also include workshops and technical training on accelerated computing and AI for drug discovery.

Invigorating Drug Discovery Research With AI, HPC

According to Abe, Japan’s pharmaceutical environment has long faced drug lag: delays in both drug development and the approval of treatments that are already available elsewhere. The problem received renewed attention during the race to develop vaccines during the COVID-19 pandemic.

The nation’s pharmaceutical companies see AI adoption as part of the solution — a key tool to strengthen and accelerate the industry’s drug development pipeline. Training and fine-tuning AI models for drug discovery require enormous compute resources, such as the Tokyo-1 supercomputer, which in its first iteration will include 16 NVIDIA DGX H100 systems, each with eight NVIDIA H100 Tensor Core GPUs.
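For a rough sense of scale, assuming the figures above, the initial deployment works out to 16 DGX H100 systems × 8 H100 GPUs per system = 128 H100 Tensor Core GPUs.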

The DGX H100 is based on the powerful NVIDIA Hopper GPU architecture, which features a Transformer Engine designed to accelerate the training of transformer models, including generative AI models for biology and chemistry. Xeureka plans to add more nodes to the system as the project grows.

“Tokyo-1 is designed to address some of the barriers to implementing data-driven, AI-accelerated drug discovery in Japan,” said Hiroki Makiguchi, product engineering manager in the science and technology division at Xeureka. “This initiative will uplevel the Japanese pharmaceutical industry with high performance computing and unlock the potential of generative AI to discover new therapies.”

Customers will be able to access a dedicated server on the supercomputer, receive technical support from Xeureka and NVIDIA, and participate in workshops from the two companies. For larger training runs that require more computational resources, customers can request access to a server with more nodes. Users can also purchase Xeureka’s software solutions for molecular dynamics, docking, quantum chemistry and free-energy perturbation calculations.

By using NVIDIA BioNeMo software on the Tokyo-1 supercomputer, researchers will be able to scale state-of-the-art AI models to millions and billions of parameters for applications including protein structure prediction, small molecule generation and pose estimation.

Tokyo-1 Accelerates Japanese Companies in Pharma and Beyond 

Major Japanese pharma companies including Astellas Pharma, Daiichi Sankyo and Ono Pharmaceutical are already making plans to advance their drug discovery projects with Tokyo-1.

Tokyo-based Astellas Pharma is pursuing innovative digital solutions across its business — including in sales, manufacturing, and research and development — to maximize outcomes for patients and reduce the costs of healthcare. With Tokyo-1, the company will accelerate its research with molecular simulations and large language models for generative AI through NVIDIA BioNeMo software.

“AI and large-scale simulations can be used for applications including small molecule compounds, antibodies, gene therapy, cell therapy, targeted protein degradation, engineered phage therapy and mRNA medicine,” said Kazuhisa Tsunoyama, head of digital research solutions, advanced informatics and analytics at Astellas. “By enabling us to take full advantage of recent advances in AI and simulation technology, Tokyo-1 will be one of the foundations on which Astellas can achieve its VISION for the future of pharmaceutical research.”

Tokyo-based Daiichi Sankyo will use Tokyo-1 to establish a drug discovery process that fully integrates AI and machine learning.

“By adopting AI and the cutting-edge GPU resources of Tokyo-1, we will be able to perform large-scale computations to accelerate our drug discovery efforts,” said Takayuki Serizawa, senior researcher at Daiichi Sankyo. “These advancements will provide new value to patients by improving drug delivery and potentially enabling personalized medicine.”

Ono Pharmaceutical, based in Osaka, focuses on drug discovery in the fields of oncology, immunology and neurology.

“Training AI models requires significant computational power, and we believe that the massive GPU resources of Tokyo-1 will solve this problem,” said Hiromu Egashira, director of the Drug Discovery DX Office in the drug discovery technology department at Ono. “We envision our use of the DGX supercomputer to be very broad, including high-quality simulations, image analysis, video analysis and language models.”

Beyond the pharmaceutical industry, Mitsui plans to make the Tokyo-1 supercomputer accessible to Japanese medical-device companies and startups — and to connect Tokyo-1 customers to AI solutions developed by global healthcare startups in the NVIDIA Inception program. NVIDIA will also connect Tokyo-1 users with the hundreds of global life science customers in its developer network.

Discover the latest in AI and healthcare at GTC, running online through Thursday, March 23. Registration is free. 

Watch the GTC keynote address by NVIDIA founder and CEO Jensen Huang.

Omniverse at Scale: NVIDIA Announces Third-Generation OVX Computing Systems to Power Industrial Metaverse Applications

Digitalization that combines AI and simulation is redefining how industrial products are created and transforming how people interact with the digital world.

To help enterprises tackle complex new workloads, NVIDIA has unveiled the third generation of its NVIDIA OVX computing system.

OVX is designed to power large-scale digital twins built on NVIDIA Omniverse Enterprise, a platform for creating and operating metaverse applications. The latest OVX system provides the breakthrough graphics and AI required to accelerate massive digital twin simulations and other demanding applications by combining NVIDIA BlueField-3 DPUs with NVIDIA L40 GPUs, ConnectX-7 SmartNICs and the NVIDIA Spectrum Ethernet platform.

Some of the world’s largest systems makers will be bringing the latest OVX systems to customers worldwide later this year, providing enterprises with the technology to handle complex manufacturing, design and Omniverse-based workloads. Businesses can take advantage of the real-time, true-to-reality capabilities of OVX to collaborate on the most challenging visualization, virtual workstation and data center processing workflows.

Reimagining Digital Twin Simulation 

Customers using third-generation OVX systems can speed their workflows and optimize simulations through immersive digital twins used to model factories, cities, autonomous vehicles and more before deployment in the real world. This helps maximize operational efficiency and predictive planning capabilities.

For example, DB Netze’s Digitale Schiene Deutschland is leveraging the capabilities of OVX to power large-scale digital twins of dynamic physical systems, including rail networks. Others, like Jaguar Land Rover, are leveraging the graphics and simulation capabilities of OVX systems in conjunction with the NVIDIA DRIVE Sim platform to accelerate the testing and development of next-generation autonomous vehicles.

Next-Generation Platform Features 

The third generation of OVX features a new architecture, with a server design based on a dual-CPU platform with four NVIDIA L40 GPUs. Based on the Ada Lovelace architecture, the L40 GPU delivers revolutionary neural graphics, AI compute and the performance needed for the most demanding Omniverse workloads.

Each OVX server also includes two high-performance ConnectX-7 SmartNICs. These Ethernet adapters provide the precise time synchronization and low-latency, high-bandwidth networking that make OVX systems scalable across multiple nodes and keep globally dispersed teams connected.

New with this generation, the BlueField-3 data processing unit offloads, accelerates and isolates CPU-intensive infrastructure tasks. For deploying Omniverse at data center scale, BlueField-3 DPUs provide a secure foundation for running the data center control-plane, enabling higher performance, limitless scaling, zero-trust security and better economics.

Helping users keep up with networking performance, the accelerated NVIDIA Spectrum Ethernet platform provides high bandwidth and network synchronization to enhance real-time simulation capabilities.

Availability 

In addition to original NVIDIA OVX partners Lenovo and Supermicro, third-generation OVX systems will be available later this year through Dell Technologies, GIGABYTE and QCT. NVIDIA is also working on Digital Twin as a Service offerings based on OVX with HPE GreenLake.

To learn more about OVX, watch NVIDIA founder and CEO Jensen Huang’s GTC keynote.

Register free for NVIDIA GTC, a global AI conference, to attend sessions with NVIDIA and industry leaders.
