A Data Center on Wheels: NVIDIA Unveils DRIVE Atlan Autonomous Vehicle Platform

The next stop on the NVIDIA DRIVE roadmap is Atlan.

During today’s opening keynote of the GPU Technology Conference, NVIDIA founder and CEO Jensen Huang unveiled the upcoming generation of AI compute for autonomous vehicles, NVIDIA DRIVE Atlan. A veritable data center on wheels, Atlan centralizes the vehicle’s entire compute infrastructure into a single system-on-a-chip.

While vehicles are packing in more and more compute technology, they’re lacking the physical security that comes with data center-level processing. Atlan is a technical marvel for safe and secure AI computing, fusing all of NVIDIA’s technologies in AI, automotive, robotics, safety and BlueField data centers.

The next-generation platform will achieve an unprecedented 1,000 trillion operations per second (TOPS) of performance and an estimated SPECint score of more than 100 (SPECrate2017_int) — greater than the total compute in most robotaxis today. Atlan is also the first SoC to be equipped with an NVIDIA BlueField data processing unit (DPU) for trusted security, advanced networking and storage services.

While Atlan will not be available for a couple of years, software development is well underway. Like NVIDIA DRIVE Orin, the next-gen platform is software compatible with previous DRIVE compute platforms, allowing customers to leverage their existing investments across multiple product generations.

“To achieve higher levels of autonomy in more conditions, the number of sensors and their resolutions will continue to increase,” Huang said. “AI models will get more sophisticated. There will be more redundancy and safety functionality. We’re going to need all of the computing we can get.”

Advancing Performance at Light Speed

Autonomous vehicle technology is developing faster than ever, and the core AI compute must advance in lockstep to support this critical progress.

Cars and trucks of the future will require an optimized AI architecture not only for autonomous driving, but also for intelligent vehicle features like speech recognition and driver monitoring. Upcoming software-defined vehicles will be able to converse with occupants: answering questions, providing directions and warning of road conditions ahead.

Atlan is able to deliver more than 1,000 TOPS — a 4x gain over the previous generation — by leveraging NVIDIA’s latest GPU architecture, new Arm CPU cores, and deep learning and computer vision accelerators. The platform architecture provides ample compute horsepower for the redundant and diverse deep neural networks that will power future AI vehicles, leaving headroom for developers to continue adding features and improvements.

This high-performance platform will run autonomous vehicle, intelligent cockpit and traditional infotainment applications concurrently.

A Guaranteed Shield with BlueField

Like every generation of NVIDIA DRIVE, Atlan is designed with the highest level of safety and security.

As a data-center-infrastructure-on-a-chip, the NVIDIA BlueField DPU is architected to handle the complex compute and AI workloads required for autonomous vehicles. By combining the industry-leading ConnectX network adapter with an array of Arm cores, BlueField offers purpose-built hardware acceleration engines with full programmability to deliver “zero-trust” security to prevent data breaches and cyberattacks.

This secure architecture will extend the safety and reliability of the NVIDIA DRIVE platform for vehicle generations to come. NVIDIA DRIVE Orin vehicle production timelines start in 2022, and Atlan will follow, sampling in 2023 and slated for 2025 production vehicles.


NVIDIA Opens Up Hyperion 8 Autonomous Vehicle Platform for AV Ecosystem

The next generation of vehicles will be packed with more technology than any computing system today.

And with NVIDIA DRIVE Hyperion, companies can embrace this shift to more intelligent, software-defined vehicles. Announced at GTC, the eighth-generation Hyperion platform includes the sensors, high-performance compute and software necessary for autonomous vehicle development, all verified, calibrated and synchronized right out of the box.

Developing an AV — essentially a data center on wheels — requires an entirely new process. Both the hardware and software must be comprehensively tested and validated to ensure they can not only handle the real-time processing for autonomous driving, but also withstand the harsh conditions of daily driving.

Hyperion is a fully operational, production-ready and open autonomous vehicle platform that cuts down the massive amount of time and cost required to outfit vehicles with the technology required for AI features and autonomous driving.

What’s Included

Hyperion comes with all the hardware needed to validate an autonomous driving system at the highest levels of performance.

At its core, two NVIDIA DRIVE Orin systems-on-a-chip (SoCs) provide ample compute for level 4 self-driving and intelligent cockpit capabilities. These SoCs process data from a halo of 12 exterior cameras, three interior cameras, nine radars and two lidar sensors in real time for safe autonomous operation.
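As a rough illustration of that sensor load (a hypothetical summary data structure, not an NVIDIA configuration format or API), the reference setup above could be sketched as:

```python
# Hypothetical summary of the DRIVE Hyperion 8 reference sensor suite described
# above, for illustration only; not an NVIDIA configuration file or API.
from dataclasses import dataclass

@dataclass
class HyperionSensorSuite:
    exterior_cameras: int = 12  # surround camera halo
    interior_cameras: int = 3   # driver and occupant monitoring
    radars: int = 9
    lidars: int = 2
    orin_socs: int = 2          # two NVIDIA DRIVE Orin SoCs for compute

suite = HyperionSensorSuite()
total_feeds = (suite.exterior_cameras + suite.interior_cameras
               + suite.radars + suite.lidars)
print(f"Sensor feeds to calibrate and synchronize: {total_feeds}")  # 26
```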

Hyperion also includes all the tools necessary to evaluate the NVIDIA DRIVE AV and DRIVE IX software stack, as well as real-time record and capture capabilities for streamlined driving data processing.

And this entire toolset is synchronized and calibrated precisely for 3D data collection, giving developers valuable time back in setting up and running autonomous vehicle test drives.

Seamless Integration

With much of the industry leveraging NVIDIA DRIVE Orin for in-vehicle compute, DRIVE Hyperion is the next step for full autonomous vehicle development and validation.

By including a complete sensor setup on top of centralized compute, Hyperion provides everything needed to validate an intelligent vehicle’s hardware on the road. And with its compatibility with the NVIDIA DRIVE AV and DRIVE IX software stacks, Hyperion is also a critical platform for evaluating and validating self-driving software.

Plus, it’s already streamlining critical self-driving research and development. Institutions such as the Virginia Tech Transportation Institute and Stanford University are leveraging the current generation of Hyperion in autonomous vehicle research pilots.

Developers can begin leveraging the latest open platform soon — the eighth generation of Hyperion will be available to the NVIDIA DRIVE ecosystem later in 2021.


Brain Gain: NVIDIA DRIVE Orin Now Central Computer for Intelligent Vehicles

NVIDIA DRIVE Orin, our breakthrough autonomous vehicle system-on-a-chip, is the new mega brain of the software-defined vehicle.

Beyond self-driving features, NVIDIA CEO and founder Jensen Huang announced today during his GTC keynote that the SoC can power all the intelligent computing functions inside vehicles, including confidence view visualization of autonomous driving capabilities, digital clusters, infotainment and passenger interaction AI.

Slated for 2022 vehicle product lines, Orin processes more than 250 trillion operations per second while achieving systematic safety standards such as ISO 26262 ASIL-D.

Typically, vehicle functions are controlled by tens of electronic control units distributed throughout a vehicle. By centralizing control of these core domains, Orin can replace these components and simplify what has been an incredibly complex supply chain for automakers.

“The future is one central computer — four domains, virtualized and isolated, architected for functional safety and security, software-defined and upgradeable for the life of the car — in addition to super-smart AI and beautiful graphics,” Huang said.

Secure Computing for Every Need

Managing a system with multiple complex applications is incredibly difficult. And when it comes to automotive, safety is critical.

DRIVE Orin supports multiple operating systems, including Linux, QNX and Android, to enable this wide range of applications. And as a high-performance compute platform architected for the highest level of safety, it runs them in a way that is secure, virtualized and accelerated.

The digital cluster, driver monitoring system and AV confidence view are all crucial to ensuring the safety of a vehicle’s occupants. Each must be functionally secure, with the ability to update applications individually without requiring a system reboot.

DRIVE Orin is designed for software-defined operation, meaning it’s purpose-built to handle these continuous upgrades throughout the life of the vehicle.

The Highest Levels of Confidence

As vehicles become more and more autonomous, visualization within the cabin will be critical for building trust with occupants. And with the DRIVE Orin platform, manufacturers can integrate enhanced capability into their fleets over the life of their vehicles.

The confidence view is a rendering of the mind of the vehicle’s AI. It shows exactly what the sensor suite and perception system are detecting in real time, constructed into a 3D surround model.

By incorporating this view in the cabin interior, the vehicle can communicate the accuracy and reliability of the autonomous driving system at every step of the journey. And occupants can gain a better understanding of how the vehicle’s AI sees the world.

As a high-performance AI compute platform, DRIVE Orin enables this visualization alongside the digital cluster, infotainment, and driver and occupant monitoring, while maintaining enough compute headroom to add new features that delight customers through the life of their vehicles.

The ability to support this multi-functionality safely and securely is what makes NVIDIA DRIVE Orin truly central to the next-generation intelligent vehicle experience.


NVIDIA Triton Tames the Seas of AI Inference

You don’t need a hunky sea god with a three-pronged spear to make AI work, but a growing group of companies from car makers to cloud service providers say you’ll feel a sea change if you sail with Triton.

More than half a dozen companies share hands-on experiences this week in deep learning with the NVIDIA Triton Inference Server, open-source software that takes AI into production by simplifying how models run in any framework on any GPU or CPU for all forms of inference.

For instance, in a talk at GTC (free with registration), Fabian Bormann, an AI engineer at Volkswagen Group, conducts a virtual tour through the Computer Vision Model Zoo, a repository of solutions curated from the company’s internal teams and future partners.

The car maker integrates Triton into its Volkswagen Computer Vision Workbench so users can make contributions to the Model Zoo without needing to worry about whether they are based on ONNX, PyTorch or TensorFlow frameworks. Triton simplifies model management and deployment, and that’s key for VW’s work serving up AI models in new and interesting environments, Bormann says in a description of his talk (session E32736) at GTC.

Salesforce Sold on Triton Benchmarks

A leader in customer-relationship management software and services, Salesforce recently benchmarked Triton’s performance on some of the world’s largest AI models — the transformers used for natural-language processing.

“Triton not only has excellent serving performance, but also comes included with several critical functions like dynamic batching, model management and model prioritization. It is quick and easy to set up and works for many deep learning frameworks including TensorFlow and PyTorch,” said Nitish Shirish Keskar, a senior research manager at Salesforce who’s presenting his work at GTC (session S32713).

Keskar described in a recent blog his work validating that Triton can handle 500-600 queries per second (QPS) while processing 100 concurrent threads and staying under 200ms latency on the well-known BERT models used to understand speech and text. He tested Triton on the much larger CTRL and GPT2-XL models, finding that despite their billions of neural-network nodes, Triton still cranked out an amazing 32-35 QPS.
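To picture what a test like that involves, below is a rough, hypothetical sketch of a multi-threaded throughput measurement against a Triton HTTP endpoint. The server address, model name and input names are placeholders rather than details of Salesforce’s setup, and it assumes the tritonclient Python package is installed (pip install tritonclient[http]):

```python
# A minimal load-test sketch, not Salesforce's actual benchmark harness.
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import tritonclient.http as httpclient

MODEL, SEQ_LEN, THREADS, REQUESTS = "bert", 128, 100, 2000  # placeholder settings

def one_query(_):
    # One client per request keeps the sketch simple; a real harness would reuse connections.
    client = httpclient.InferenceServerClient(url="localhost:8000")
    ids = np.random.randint(0, 30000, (1, SEQ_LEN), dtype=np.int64)
    mask = np.ones((1, SEQ_LEN), dtype=np.int64)
    inputs = [
        httpclient.InferInput("input_ids", list(ids.shape), "INT64"),
        httpclient.InferInput("attention_mask", list(mask.shape), "INT64"),
    ]
    inputs[0].set_data_from_numpy(ids)
    inputs[1].set_data_from_numpy(mask)
    start = time.perf_counter()
    client.infer(MODEL, inputs)          # Triton's dynamic batcher groups concurrent requests
    return time.perf_counter() - start   # per-request latency in seconds

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    latencies = sorted(pool.map(one_query, range(REQUESTS)))
elapsed = time.perf_counter() - t0

print(f"throughput: {REQUESTS / elapsed:.0f} QPS")
print(f"median latency: {1000 * latencies[len(latencies) // 2]:.0f} ms")
```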

A Model Collaboration with Hugging Face

More than 5,000 organizations turn to Hugging Face for help summarizing, translating and analyzing text with its 7,000 AI models for natural-language processing. Jeff Boudier, its product director, will describe at GTC (session S32003) how his team drove 100x improvements in AI inference on its models, thanks to a flow that included Triton.

“We have a rich collaboration with NVIDIA, so our users can have the most optimized performance running models on a GPU,” said Boudier.

Hugging Face aims to combine Triton with TensorRT, NVIDIA’s software for optimizing AI models, to drive the time to process an inference with a BERT model down to less than a millisecond. “That would push the state of the art, opening up new use cases with benefits for a broad market,” he said.

Deployed at Scale for AI Inference

American Express uses Triton in an AI service that operates within a 2ms latency requirement to detect fraud in real time across $1 trillion in annual transactions.

As for throughput, Microsoft uses Triton on its Azure cloud service to power the AI behind GrammarLink, its online editor for Microsoft Word that’s expected to serve as many as half a trillion queries a year.

Less well known but well worth noting, LivePerson, based in New York, plans to run thousands of models on Triton in a cloud service that provides conversational AI capabilities to 18,000 customers including GM Financial, Home Depot and European cellular provider Orange.

Triton simplifies the job of executing multiple styles of inference with models based on various frameworks while maintaining the highest throughput and system utilization.

And the chief technology officer of London-based Intelligent Voice will describe at GTC (session S31452) the company’s LexIQal system, which uses Triton for AI inference to detect fraud in insurance and financial services.

They are among many companies using NVIDIA for AI inference today. In the past year alone, users downloaded the Triton software more than 50,000 times.

Triton’s Swiss Army Spear

Triton is getting traction in part because it can handle any kind of AI inference job, whether it’s one that runs in real time, batch mode, as a streaming service or even if it involves a chain or ensemble of models. That flexibility eliminates the need for users to adopt and manage custom inference servers for each type of task.

In addition, Triton assures high system utilization, distributing work evenly across GPUs whether inference is running in a cloud service, in a local data center or at the edge of the network. And its open, extensible code lets users customize Triton to their specific needs.

NVIDIA keeps improving Triton, too. A recently added model analyzer combs through all the options to show users the optimal batch size or instances-per-GPU for their job. A new tool automates the job of translating and validating a model trained in TensorFlow or PyTorch into a TensorRT format; in the future, it will support translating models to and from any neural-network format.

Meet Our Inference Partners

Triton’s attracted several partners who support the software in their cloud services, including Amazon, Google, Microsoft and Tencent. Others such as Allegro, Seldon and Red Hat support Triton in their software for enterprise data centers, for workflows including MLOps, the extension of DevOps for AI.

At GTC (session S33118), Arm will describe how it adapted Triton as part of its neural-network software that runs inference directly on edge gateways. Two engineers from Dell EMC will show how to boost performance in video analytics 6x using Triton (session S31437), and NetApp will talk about its work integrating Triton with its solid-state storage arrays (session S32187).

To learn more, register for GTC and check out one of two introductory sessions (S31114, SE2690) with NVIDIA experts on Triton for deep learning inference.


Like Magic: NVIDIA Merlin Gains Adoption for Training and Inference

Recommenders personalize the internet. They suggest videos, foods, sneakers and advertisements that seem magically clairvoyant in knowing your tastes and interests.

It’s an AI that makes online experiences more enjoyable and efficient, quickly taking you to the things you want to see. While delivering content you like, it also targets tempting ads for jeans, or recommends comfort dishes that fit those midnight cravings.

But not all recommender systems can handle the data requirements to make smarter suggestions. That leads to slower training and less intuitive internet user experiences.

NVIDIA Merlin is turbocharging recommenders, boosting training and inference. Leaders in media, entertainment and on-demand delivery use the open source recommender framework for running accelerated deep learning on GPUs. Improving recommendations increases clicks, purchases — and satisfaction.

Merlin-Accelerated Recommenders 

NVIDIA Merlin enables businesses of all types to build recommenders accelerated by NVIDIA GPUs.

Its collection of libraries includes tools for building deep learning-based systems that provide better predictions than traditional methods and increase clicks. Each stage of the pipeline is optimized to support hundreds of terabytes of data, all accessible through easy-to-use APIs.

Merlin is in testing with hundreds of companies worldwide. Social media and video services are evaluating it for suggestions on next views and ads. And major on-demand apps and retailers are looking at it for suggestions on new items to purchase.

Videos with Snap

With Merlin, Snap is improving the customer experience with better load times by ranking content and ads 60% faster, while also reducing its infrastructure costs. Using GPUs and Merlin provides Snap with additional compute capacity to explore more complex and accurate ranking models. These improvements allow Snap to deliver even more engaging experiences at a lower cost.

Tencent: Ads that Click

China’s leading online video media platform uses Merlin HugeCTR to help connect over 500 million monthly active users with ads that are relevant and engaging. With such a huge dataset, training speed matters and determines the performance of the recommender model. Tencent deployed its real-time training with Merlin and achieved more than a 7x speedup over the original TensorFlow solution on the same GPU platform. Tencent dives into this further at its GTC presentation.

Postmates Food Picks

Merlin was designed to streamline and support recommender workflows. Postmates uses recommenders to help people decide what’s for dinner, and relies on Merlin NVTabular to optimize training time, cutting it from 1 hour on CPUs to just 5 minutes on GPUs.

Using NVTabular for feature engineering, the company reduced training costs by 95 percent and is exploring more advanced deep learning models. Postmates delves more into this in its GTC presentation.

Merlin Streamlines Recommender Workflows at Scale

Because Merlin is interoperable, it offers the flexibility to accelerate existing recommender workflow pipelines.

The open beta release of the Merlin recommendation engine delivers leaps in data loading and training of deep learning systems.

NVTabular reduces data preparation time by GPU-accelerating feature transformations and preprocessing. NVTabular, which makes loading massive data lakes into training pipelines easier, gets multi-GPU support and improved interoperability with TensorFlow and PyTorch.
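For a sense of what that looks like in code, here is a minimal, hypothetical NVTabular sketch. The column names and file paths are invented, and the operator API can vary between NVTabular releases:

```python
# Hypothetical NVTabular preprocessing sketch: encode categorical columns and
# normalize continuous ones on the GPU, then write Parquet ready for training.
import nvtabular as nvt
from nvtabular import ops

cat_features = ["user_id", "item_id", "category"] >> ops.Categorify()
cont_features = ["price", "age"] >> ops.FillMissing() >> ops.Normalize()
workflow = nvt.Workflow(cat_features + cont_features + ["click"])

train = nvt.Dataset("data/train/*.parquet", part_size="256MB")
workflow.fit_transform(train).to_parquet("processed/train/")
workflow.save("processed/workflow")  # reuse the same transforms at serving time
```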

Merlin’s Magic for Training

Merlin HugeCTR is the main training component. It’s a deep neural network training framework purpose-built for recommender workflows, capable of distributed training across multiple GPUs and nodes for maximum performance. It comes with its own optimized data loader, which vastly outperforms generic deep learning frameworks, and provides a Parquet data reader to ingest the data preprocessed by NVTabular.

NVIDIA Triton Inference Server accelerates production inference on GPUs for feature transforms and neural network execution.

Learn more about the technology advances behind Merlin since its initial launch, including its support for NVTabular, HugeCTR and NVIDIA Triton Inference Server.

 


NVIDIA Maxine Hits the Scene to Create Real-Time Video Experiences

The next time you’re in a virtual meeting or streaming a game, live event or TV program, the star of the show may be NVIDIA Maxine, which took center stage at GTC today when NVIDIA CEO Jensen Huang announced the availability of the GPU-accelerated software development kit during his keynote address.

Developers from video conferencing, content creation and streaming providers are using the Maxine SDK to create real-time video-based experiences. And it’s easily deployed to PCs, data centers or in the cloud.

Shift Towards Remote Work

Virtual collaboration continues to grow with 70 million hours of web meetings daily, and more global organizations are looking at technologies to support an increasingly remote workforce.

Pexip, a scalable video conferencing platform that enables interoperability between different video conferencing systems, was looking to push the boundaries of its video communications offering to meet this growing demand.

“We’re already using NVIDIA Maxine for audio noise removal and working on integrating virtual backgrounds to support premium video conferencing experiences for enterprises of all sizes,” said Giles Chamberlin, CTO and co-founder of Pexip.

Working with NVIDIA, Pexip aims to provide AI-powered video communications that support virtual meetings that are better than meetings in person.

It joins other companies in the video collaboration space like Avaya, which incorporated Maxine audio noise reduction into its Spaces app last October and has now implemented virtual backgrounds, which allow presenters to overlay their video on presentations.

Headroom uses AI to take distractions out of video conferencing, so participants can focus on interactions during meetings instead. This includes flagging when people have questions, note taking, transcription and smart meeting summarization.

Seeing Face Value for Virtual Events

Research has shown that there are over 1 million virtual events yearly, with more event marketers planning to invest in them in the future. As a result, everyone from event organizers to visual effects artists is looking for faster, more efficient ways to create digital experiences.

Among them is Touchcast, which combines AI and mixed reality to reimagine virtual events. It’s using Maxine’s super-resolution features to upscale 1080p streams to 4K for delivery.

“NVIDIA Maxine is paving the future of video communications — a future where AI and neural networks enhance and enrich content in entirely new ways,” said Edo Segal, founder and CEO of Touchcast.

Another example is Notch, which creates tools that enable real-time visual effects and motion graphics for live events. Maxine provides it with real-time, AI-driven face and body tracking along with background removal.

Artists can track and mask performers in a live performance setting for a variety of creative use cases — all using a standard camera feed and eliminating the challenges of special hardware-tracking solutions.

“The integration of the Maxine SDK was very easy and took just a few days to complete,” said Matt Swoboda, founder and director of Notch.

Field of Streams

With nearly 10 million content creators on Twitch per month, becoming a live broadcaster has also never been easier. Live streamers are looking for powerful yet easy-to-use features to excite their audiences.

BeLive, which provides a platform for live streaming user-generated talk shows, is using Maxine to process its video streams in the cloud so customers don’t have to invest in expensive equipment. By running Maxine in the cloud, users can benefit from high-quality background replacement regardless of the hardware they’re running on the client side.

With BeLive, live interactive call-in talk shows can be produced easily and streamed to YouTube or Facebook Live, with participants calling in from around the world.

OBS, the leading platform for streaming and recording, is a free and open source software solution broadly used for game streaming and live production. Users with NVIDIA RTX GPUs can now take advantage of noise removal, improving the clarity of their audio during production.

Developers are using the Maxine SDK for building virtual collaboration and content creation applications.

A Look Into NVIDIA Maxine

NVIDIA Maxine includes three AI SDKs covering video effects, audio effects and augmented reality — each with pre-trained deep learning models, so developers can quickly build or enhance their real-time applications.

Starting with the NVIDIA Video Effects SDK, enterprises can now apply AI effects to improve video quality without special cameras or other hardware. Features include super-resolution, which generates 720p output from 360p input video, along with artifact reduction, which removes defects for crisper pictures.

Video noise removal eliminates low-light camera noise introduced in the video capture process while preserving all of the details. To hide messy rooms or other visual distractions, the Video Effects SDK removes the background of a webcam feed in real time, so only a user’s face and body show up in a livestream.

The NVIDIA Augmented Reality SDK enables real-time 3D face tracking using a standard web camera, delivering a more engaging virtual communication experience by automatically zooming into the face and keeping that face within view of the camera.

It’s now possible to detect human faces in images or video feeds, track the movement of facial expressions, create a 3D mesh representation of a person’s face, use video to track the movement of a human body in 3D space, simulate eye contact through gaze estimation and much more.

The NVIDIA Audio Effects SDK uses AI to remove distracting background noise from incoming and outgoing audio feeds, improving the clarity and quality of any conversation.

This includes the removal of unwanted background noises — like a dog barking or baby crying — to make conversations easier to understand. For meetings in large spaces, it’s also possible to remove room echoes from the background to make voices clearer.

Developers can add Maxine AI effects into their existing applications or develop new pipelines from scratch using NVIDIA DeepStream, an SDK for building intelligent video analytics, and NVIDIA Video Codec, an SDK for accelerated video encode and decode on Windows and Linux.

Maxine can also be used with NVIDIA Jarvis, a framework for building conversational AI applications, to offer world-class language-based capabilities such as transcription and translation.

Availability

Get started with NVIDIA Maxine.

And don’t let the curtain close on the opportunity to learn more about NVIDIA Maxine during GTC, running April 12-16. Registration is free.

A full list of Maxine-focused sessions can be found here. Be sure to watch Huang’s keynote address on-demand. And check out a demo (below) of Maxine.


Fast Track to Enterprise AI: New NVIDIA Workflow Lets Any User Choose, Adapt, Deploy Models Easily

AI is the most powerful new technology of our time, but it’s been a force that’s hard to harness for many enterprises — until now.

Many companies lack the specialized skills, access to large datasets or accelerated computing that deep learning requires. Others are realizing the benefits of AI and want to spread them quickly across more products and services.

For both, there’s a new roadmap to enterprise AI. It leverages technology that’s readily available, then simplifies the AI workflow with NVIDIA TAO and NVIDIA Fleet Command to make the trip shorter and less costly.

Grab and Go AI Models

The journey begins with pre-trained models. You don’t have to design and train a neural network from scratch in 2021. You can choose one of many available today in our NGC catalog.

We’ve curated models that deliver skills to advance your business.  They span the spectrum of AI jobs from computer vision and conversational AI to natural-language understanding and more.

Models Show Their AI Resumes

So users know what they’re getting, many models in the catalog come with credentials. They’re like the resume for a prospective hire.

Model credentials show you the domain the model was trained for, the dataset that trained it, how often the model was deployed and how it’s expected to perform. They provide transparency and confidence you’re picking the right model for your use case.

Leveraging a Massive Investment

NVIDIA invested hundreds of millions of GPU compute hours over more than five years refining these models. We did this work so you don’t have to.

Here are three quick examples of the R&D you can leverage:

For computer vision, we devoted 3,700 person-years to labeling 500 million objects from 45 million frames. We used voice recordings to train our speech models on GPUs for more than a million hours. A database of biomedical papers packing 6.1 billion words educated our models for natural-language processing.

Transfer Learning, Your AI Tailor

Once you choose a model, you can fine-tune it to fit your specific needs using NVIDIA TAO, the next stage of our expedited workflow for enterprise AI.

TAO enables transfer learning, a process that harvests features from an existing neural network and plants them in a new one using NVIDIA’s Transfer Learning Toolkit, an integrated part of TAO. It leverages small datasets users have on hand to give models a custom fit without the cost, time and massive datasets required to build and train a neural network from scratch.
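The Transfer Learning Toolkit wraps this in its own tools and pre-trained models; conceptually, the recipe is the familiar one sketched below in generic PyTorch (an illustration of transfer learning itself, not the TLT API): freeze a pretrained backbone and train only a small new head on the in-house dataset.

```python
# Generic transfer-learning sketch in PyTorch, shown only to illustrate the
# idea; it is not the TAO / Transfer Learning Toolkit workflow.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)  # backbone trained on a large dataset
for param in model.parameters():
    param.requires_grad = False                        # freeze the pretrained features

num_classes = 5                                        # e.g., five custom object categories
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)  # new head, trained from scratch

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# train_loader would yield (images, labels) from the small in-house dataset:
# for images, labels in train_loader:
#     loss = loss_fn(model(images), labels)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```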

Sometimes companies have an opportunity to further enhance models by training them across larger, more diverse datasets maintained by partners outside the walls of their data center.

TAO Lets Partners Collaborate with Privacy 

Federated learning, another part of TAO, lets different sites securely collaborate to refine a model for the highest accuracy. With this technique, users share components of models such as their partial weights. Datasets remain inside each company’s data center, so data privacy is preserved.

In one recent example, 20 research sites collaborated to raise the accuracy of the so-called EXAM model that predicts whether a patient has COVID-19. After applying federated learning, the model also could predict the severity of the infection and whether the patient would need supplemental oxygen. Patient data stayed safely behind the walls of each partner.
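The mechanics are easiest to picture as federated averaging: each site trains on its own data and shares only weights, which a coordinator averages into a new global model. The toy sketch below illustrates that concept only; it is not NVIDIA’s federated learning implementation.

```python
# Toy federated-averaging sketch: sites share only model weights (numpy arrays
# here), never their raw data. Conceptual illustration only.
import numpy as np

def local_training_step(weights, site_data):
    """Stand-in for a real local training pass at one site."""
    X, y = site_data
    grad = X.T @ (X @ weights - y) / len(y)   # gradient of a least-squares loss
    return weights - 0.1 * grad

def federated_round(global_weights, sites):
    local_updates = [local_training_step(global_weights.copy(), data) for data in sites]
    return np.mean(local_updates, axis=0)      # coordinator averages the partial weights

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
sites = []
for _ in range(3):                             # e.g., three partner sites with private data
    X = rng.normal(size=(100, 2))
    sites.append((X, X @ true_w + 0.01 * rng.normal(size=100)))

w = np.zeros(2)
for _ in range(50):                            # 50 federated rounds
    w = federated_round(w, sites)
print("learned weights:", w)                   # close to true_w, without pooling any raw data
```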

Taking Enterprise AI to Production

Once a model is fine-tuned, it needs to be optimized for deployment.

It’s a pruning process that makes models lean, yet robust, so they function efficiently on your target platform whether it’s an array of GPUs in a server or a Jetson-powered robot on the factory floor.

NVIDIA TensorRT, another part of TAO, dials a model’s mathematical coordinates to an optimal balance of the smallest size with the highest accuracy for the system it will run on. It’s a crucial step, especially for real-time services like speech recognition or fraud detection that won’t tolerate system latency.

Then, with the Triton Inference Server, users can select the optimal configuration to deploy, whatever the model’s architecture, the framework it uses or target CPU or GPU it will run on.

Once a model is optimized and ready for deployment, users can easily integrate it with whatever application framework fits their use case or industry. For example, it could be Jarvis for conversational AI, Clara for healthcare, Metropolis for video analytics or Isaac for robotics, to name just a few that NVIDIA provides.

Pre-trained models in NGC, along with TAO and Fleet Command, form a simple but powerful AI workflow.

With the chosen application framework, users can launch NVIDIA Fleet Command to deploy and manage the AI application across a variety of GPU-powered devices. It’s the last key step in the journey.

Zero to AI in Minutes

Fleet Command connects NVIDIA-Certified servers deployed at the network’s edge to the cloud. With it, users can work from a browser to securely pair, orchestrate and manage millions of servers, deploy AI to any remote location and update software as needed.

Administrators monitor health and update systems with one click to simplify AI operations at scale.

Fleet Command uses end-to-end security protocols to ensure application data and intellectual property remain safe.

Data is sent between the edge and the cloud, fully encrypted, ensuring it’s protected. And applications are scanned for malware and vulnerabilities before they are deployed.

An AI Workflow That’s on the Job

Fleet Command and elements of TAO are already in use in warehouses, in retail, in hospitals and on the factory floor. Users include companies such as Accenture, BMW and Siemens Digital Industries.

A demo (below) from the GTC keynote shows how the one-two-three combination of NGC models, TAO and Fleet Command can quickly tailor and deploy an application using multiple AI models.

You can sign up for Fleet Command today.

Core parts of TAO, such as the Transfer Learning Toolkit and federated learning, are available today. Apply now for early access to them all, fully integrated into TAO.


Dream State: Cybersecurity Vendors Detect Breaches in an Instant with NVIDIA Morpheus

In the geography of data center security, efforts have long focused on protecting north-south traffic — the data that passes between the data center and the rest of the network. But one of the greatest risks has become east-west traffic — network packets passing between servers within a data center.

That’s due to the growth of cloud-native applications built from microservices, whose connections across a data center are changing constantly. With a typical 1,000-server data center having over 1 billion network paths, it’s extremely difficult to write fixed rules that control the blast radius should a malicious actor get inside.

The new NVIDIA Morpheus AI application framework gives security teams complete visibility into security threats by bringing together unmatched AI processing and real-time monitoring on every packet through the data center. It lets them respond to anomalies and update policies immediately as threats are identified.

Combining the security superpowers of AI and NVIDIA BlueField data processing units (DPUs), Morpheus provides cybersecurity developers a highly optimized AI pipeline and pre-trained AI skills that, for the first time, allow them to instantaneously inspect all IP network communication through their data center fabric.

Bringing a new level of security to data centers, the framework provides the dynamic protection, monitoring, adaptive policies and cyber defenses required to detect and remediate security threats.

Continuous AI Analytics on Network Traffic

Morpheus combines event streaming from NVIDIA Cumulus NetQ with GPU-accelerated computing — RAPIDS data analytics pipelines, deep learning frameworks and Triton Inference Server — and runs on mainstream NVIDIA-Certified enterprise servers. It simplifies the analysis of computer logs and helps detect and mitigate security threats. Pre-trained AI models help find leaked credentials, keys, passwords, credit card numbers and bank account numbers, and identify security policies that need to be hardened.
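Morpheus does this with trained neural networks rather than fixed rules, but the task itself can be pictured with a deliberately naive, regex-based stand-in that scans log lines for strings shaped like keys or card numbers:

```python
# Deliberately naive, rule-based stand-in for the sensitive-data detection that
# Morpheus performs with deep learning models. Patterns are illustrative only.
import re

PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "credit_card":    re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "private_key":    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_log_line(line):
    """Return the names of any sensitive-data patterns found in one log line."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(line)]

sample = "POST /checkout card=4111 1111 1111 1111 token=AKIAABCDEFGHIJKLMNOP"
print(scan_log_line(sample))   # ['aws_access_key', 'credit_card']
```

A rules-based scanner like this misses anything it has no pattern for, which is exactly the gap the pre-trained models are meant to close.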

Integrating the framework into a third-party cybersecurity offering brings the world’s best AI computing to communication networks. Morpheus can receive rich telemetry feeds from every NVIDIA BlueField DPU-accelerated server in the data center without impacting server performance. BlueField-2 DPUs act both as a sensor to collect real-time packet flows and as a policy enforcement point to limit communication between any microservice container or virtual machine in a data center.

By placing BlueField-2 DPUs in servers across the data center, Morpheus can automatically write and change policies to immediately remediate security threats — from changing which logs are collected and adjusting ingestion volumes, to dynamically redirecting certain log events, blocking traffic newly identified as malicious, rewriting rules to enforce policy updates, and more.

Accelerate and Secure the Data Center with NVIDIA BlueField DPUs 

The NVIDIA BlueField-2 DPU, available today, enables true software-defined, hardware-accelerated data center infrastructure. By having software-defined networking policies and telemetry collection run on the BlueField DPU before traffic enters the server, the DPU offloads, accelerates and isolates critical data center functions without burdening the server’s CPU. The DPU also extends the simple, static security logging model, implementing sophisticated dynamic telemetry that evolves as new policies are determined and adjusted.

Learn more about NVIDIA Morpheus and apply for early access, currently available in the U.S. and Israel.


NVIDIA’s New CPU to ‘Grace’ World’s Most Powerful AI-Capable Supercomputer

NVIDIA’s new Grace CPU will power the world’s most powerful AI-capable supercomputer.

The Swiss National Supercomputing Centre’s (CSCS) new system, called Alps, will use Grace, a revolutionary Arm-based data center CPU introduced by NVIDIA today, to enable breakthrough research in a wide range of fields.

From climate and weather to materials sciences, astrophysics, computational fluid dynamics, life sciences, molecular dynamics, quantum chemistry and particle physics, as well as domains like economics and social sciences, Alps will play a key role in advancing science throughout Europe and worldwide when it comes online in 2023.

“We are thrilled to announce the Swiss National Supercomputing Center will build a supercomputer powered by Grace and our next-generation GPU,” NVIDIA CEO Jensen Huang said Monday during his keynote at NVIDIA’s GPU Technology Conference.

Alps will be built by Hewlett Packard Enterprise using the new HPE Cray EX supercomputer product line as well as the NVIDIA HGX supercomputing platform, including NVIDIA GPUs and the NVIDIA HPC SDK as well as the new Grace CPU.

The Alps system will replace CSCS’s existing Piz Daint supercomputer.

AI: A New Kind of Supercomputing

Alps is one of a new generation of machines that are expanding supercomputing beyond traditional modeling and simulation by taking advantage of GPU-accelerated deep learning.

“Deep learning is just an incredibly powerful set of tools that we add to the toolbox,” said CSCS Director Thomas Schulthess.

Taking advantage of the tight coupling between NVIDIA CPUs and GPUs, Alps is expected to be able to train GPT-3, the world’s largest natural language processing model, in only two days — 7x faster than NVIDIA’s 2.8-AI exaflops Selene supercomputer, currently recognized as the world’s leading supercomputer for AI by MLPerf.
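A back-of-the-envelope reading of that comparison, assuming the training speedup scales directly with AI throughput, puts Alps’ peak AI performance at roughly

\[ 7 \times 2.8\ \text{AI exaflops} \approx 20\ \text{AI exaflops}. \]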

CSCS users will be able to apply this incredible AI performance to a wide range of emerging scientific research that can benefit from natural language understanding.

This includes, for example, analyzing and understanding massive amounts of knowledge available in scientific papers and generating new molecules for drug discovery.

Soul of the New Machine

Based on the hyper-efficient Arm microarchitecture found in billions of smartphones and other edge computing devices, Grace will deliver 10x the performance of today’s fastest servers on the most complex AI and high-performance computing workloads.

Grace will support the next generation of NVIDIA’s coherent NVLink interconnect technology, allowing data to move more quickly between system memory, CPUs and GPUs.

And thanks to growing GPU support for data science acceleration at ever-larger scales, Alps will also be able to accelerate a bigger chunk of its users’ workflows, such as ingesting the vast quantities of data needed for modern supercomputing.

“The scientists will not only be able to carry out simulations, but also pre-process or post-process their data,” Schulthess said. “This makes the whole workflow more efficient for them.”

From Particle Physics to Weather Forecasts

CSCS has long supported scientists who are working at the cutting edge, particularly in materials science, weather forecasting and climate modeling, and understanding data streaming in from a new generation of scientific instruments.

CSCS designs and operates a dedicated system for numerical weather predictions (NWP) on behalf of MeteoSwiss, the Swiss meteorological service. This system has been running on GPUs since 2016.

That long-standing experience with operational NWP on GPUs will be key to future climate simulations as well — key not only to modeling long-term changes to climate, but to building models able to more accurately predict extreme weather events, saving lives.

One of that team’s goals is to run global climate models with a spatial resolution of 1 km that can map convective clouds such as thunderclouds.

The CSCS supercomputer is also used by Swiss scientists for the analysis of data from the Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research. It is the Swiss Tier-2 system in the Worldwide LHC Computing Grid.

Based in Geneva, the LHC — at $9 billion, one of the most expensive scientific instruments ever built — generates 90 petabytes of data a year.

Alps uses a new software-defined infrastructure that can support a wide range of projects.

As a result, in the future, different teams, such as those from MeteoSwiss, will be able to use one or more partitions on a single, unified infrastructure, rather than different machines.

These can be virtual ad-hoc clusters for individual users or predefined clusters that research teams can put together with CSCS and then operate themselves.

 

 

 

 Featured image source: Steve Evans, from Citizen of the World.

 

The post NVIDIA’s New CPU to ‘Grace’ World’s Most Powerful AI-Capable Supercomputer appeared first on The Official NVIDIA Blog.

Read More

What Is Quantum Computing?

Twenty-seven years before Steve Jobs unveiled a computer you could put in your pocket, physicist Paul Benioff published a paper showing it was theoretically possible to build a much more powerful system you could hide in a thimble — a quantum computer.

Named for the subatomic physics it aimed to harness, the concept Benioff described in 1980 still fuels research today, including efforts to build the next big thing in computing: a system that could make a PC look, in some ways, as quaint as an abacus.

Richard Feynman — a Nobel Prize winner whose wit-laced lectures brought physics to a broad audience — helped establish the field, sketching out how such systems could simulate quirky quantum phenomena more efficiently than traditional computers.

So, What Is Quantum Computing?

Quantum computing uses the physics that governs subatomic particles to perform sophisticated parallel calculations, replacing the simpler transistor-based logic in today’s computers.

Quantum computers calculate using qubits, computing units that can be on, off or any value between, instead of the bits in traditional computers that are either on or off, one or zero. The qubit’s ability to live in the in-between state — called superposition — adds a powerful capability to the computing equation, making quantum computers superior for some kinds of math.

What Does a Quantum Computer Do?

Using qubits, quantum computers could buzz through calculations that would take classical computers a loooong time — if they could even finish them.

For example, today’s computers use eight bits to represent any number between 0 and 255. Thanks to features like superposition, a quantum computer can use eight qubits to represent every number between 0 and 255, simultaneously.

It’s a feature like parallelism in computing: All possibilities are computed at once rather than sequentially, providing tremendous speedups.

So, while a classical computer steps through long division calculations one at a time to factor a humongous number, a quantum computer can get the answer in a single step. Boom!
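A tiny classical simulation makes the bookkeeping concrete. The state of n qubits is a vector of 2^n complex amplitudes, so eight qubits in an equal superposition carry weight on all 256 basis states at once; the numpy sketch below simulates such a register rather than running on quantum hardware.

```python
# Classical simulation sketch: an 8-qubit register in equal superposition.
# Applying a Hadamard gate to every qubit of |00000000> spreads the amplitude
# evenly over all 256 basis states, i.e. "every number between 0 and 255 at once".
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # single-qubit Hadamard gate

state = np.zeros(2**8, dtype=complex)
state[0] = 1.0                                   # the all-zeros state |00000000>

gate = np.array([[1.0]])
for _ in range(8):
    gate = np.kron(gate, H)                      # H applied to each of the 8 qubits
state = gate @ state

print(state.shape)                               # (256,)
print(np.allclose(np.abs(state) ** 2, 1 / 256))  # True: equal probability for 0..255
```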

That means quantum computers could reshape whole fields, like cryptography, that are based on factoring what are today impossibly large numbers.

A Big Role for Tiny Simulations

That could be just the start. Some experts believe quantum computers will bust through limits that now hinder simulations in chemistry, materials science and anything involving worlds built on the nano-sized bricks of quantum mechanics.

Quantum computers could even extend the life of semiconductors by helping engineers create more refined simulations of the quantum effects they’re starting to find in today’s smallest transistors.

Indeed, experts say quantum computers ultimately won’t replace classical computers, they’ll complement them. And some predict quantum computers will be used as accelerators much as GPUs accelerate today’s computers.

How Does Quantum Computing Work?

Don’t expect to build your own quantum computer like a DIY PC with parts scavenged from discount bins at the local electronics shop.

The handful of systems operating today typically require refrigeration that creates working environments just north of absolute zero. They need that computing arctic to handle the fragile quantum states that power these systems.

In a sign of how hard constructing a quantum computer can be, one prototype suspends an atom between two lasers to create a qubit. Try that in your home workshop!

Quantum computing takes nano-Herculean muscles to create something called entanglement. That’s when two or more qubits exist in a single quantum state, a condition sometimes measured by electromagnetic waves just a millimeter wide.

Crank up that wave with a hair too much energy and you lose entanglement or superposition, or both. The result is a noisy state called decoherence, the equivalent in quantum computing of the blue screen of death.

What’s the Status of Quantum Computers?

A handful of companies such as Alibaba, Google, Honeywell, IBM, IonQ and Xanadu operate early versions of quantum computers today.

Today they provide tens of qubits. But qubits can be noisy, making them sometimes unreliable. To tackle real-world problems reliably, systems need tens or hundreds of thousands of qubits.

Experts believe it could be a couple decades before we get to a high-fidelity era when quantum computers are truly useful.

Quantum computers are slowly moving toward commercial use. (Source: ISSCC 2017 talk by Lieven Vandersypen.)

Predictions of when we’ll reach so-called quantum computing supremacy — the time when quantum computers execute tasks classical ones can’t — are a matter of lively debate in the industry.

Accelerating Quantum Circuit Simulations Today

The good news is that the world of AI and machine learning has put a spotlight on accelerators like GPUs, which can perform many of the types of operations quantum computers would calculate with qubits.

So, classical computers are already finding ways to host quantum simulations with GPUs today. For example, NVIDIA ran a leading-edge quantum simulation on Selene, our in-house AI supercomputer.

NVIDIA announced in the GTC keynote the cuQuantum SDK to speed quantum circuit simulations running on GPUs. Early work suggests cuQuantum will be able to deliver orders of magnitude speedups.

The SDK takes an agnostic approach, providing a choice of tools users can pick to best fit their approach. For example, the state vector method provides high-fidelity results, but its memory requirements grow exponentially with the number of qubits.

That creates a practical limit of roughly 50 qubits on today’s largest classical supercomputers. Nevertheless, we’ve seen great results (below) using cuQuantum to accelerate quantum circuit simulations that use this method.

State vector benchmark: 1,000 circuits, 36 qubits, depth m=10, complex 64 | CPU: Qiskit on dual AMD EPYC 7742 | GPU: Qgate on DGX A100
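The numbers above also hint at why the wall sits near 50 qubits: a full state vector stores 2^n complex amplitudes, so memory doubles with every added qubit. A quick back-of-the-envelope check, assuming 8 bytes per single-precision complex amplitude:

```python
# Back-of-the-envelope memory needed to hold a full quantum state vector,
# assuming 8 bytes per amplitude (single-precision complex).
BYTES_PER_AMPLITUDE = 8

for n_qubits in (36, 40, 45, 50):
    gib = (2 ** n_qubits) * BYTES_PER_AMPLITUDE / 2 ** 30
    print(f"{n_qubits} qubits: {gib:,.0f} GiB")

# 36 qubits:       512 GiB  -> fits on a large multi-GPU node
# 50 qubits: 8,388,608 GiB  -> about 8 PiB, beyond today's largest machines
```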

Researchers from the Jülich Supercomputing Centre will provide a deep dive on their work with the state vector method in session E31941 at GTC (free with registration).

A newer approach, tensor network simulation, uses less memory and more computation to perform similar work.

Using this method, NVIDIA and Caltech accelerated a state-of-the-art quantum circuit simulator with cuQuantum running on NVIDIA A100 Tensor Core GPUs. It generated a sample from a full-circuit simulation of the Google Sycamore circuit in 9.3 minutes on Selene, a task that 18 months ago experts thought would take days using millions of CPU cores.

Tensor network benchmark: 53 qubits, depth m=20 | CPU: Quimb on dual AMD EPYC 7742 (estimated) | GPU: Quimb on DGX A100

“Using the Cotengra/Quimb packages, NVIDIA’s newly announced cuQuantum SDK, and the Selene supercomputer, we’ve generated a sample of the Sycamore quantum circuit at depth m=20 in record time — less than 10 minutes,” said Johnnie Gray, a research scientist at Caltech.

“This sets the benchmark for quantum circuit simulation performance and will help advance the field of quantum computing by improving our ability to verify the behavior of quantum circuits,” said Garnet Chan, a chemistry professor at Caltech whose lab hosted the work.

NVIDIA expects the performance gains and ease of use of cuQuantum will make it a foundational element in every quantum computing framework and simulator at the cutting edge of this research.

Sign up to show early interest in cuQuantum here.
