Carestream Health and Startups Develop AI-Enabled Medical Instruments with NVIDIA Clara AGX Developer Kit

Carestream Health, a leading maker of medical imaging systems, is investigating the use of NVIDIA Clara AGX, an embedded AI platform for medical devices, in the development of AI-powered features on single-frame and streaming X-ray applications.

Startups around the world, too, are adopting Clara AGX for AI solutions in medical imaging, surgery and electron microscopy. Among them is Boston-based Activ Surgical, which recently received FDA clearance for a hardware imaging module to deliver real-time AI insights to the operating room.

Now in general availability, the NVIDIA Clara AGX developer kit advances the development of software-defined instruments, such as microscopes, ultrasounds and endoscopes.

This emerging generation of medical devices is equipped with dozens of real-time AI applications providing support at every step of the clinical experience — from automating patient set-up for scans and improving image quality to analyzing data streams and delivering critical insights to care providers.

NVIDIA Clara AGX is accelerating the development of these new medical instruments by providing a universal platform that can deliver high-bandwidth signal processing, accelerated computing reconstruction, AI processing and advanced 3D visualization.

Helping Clinicians Sense in Real Time 

Medical instruments like endoscopes and surgical robots are mounted with cameras, sending a live video feed to the clinicians operating the devices. Capturing these streams and applying computer vision AI to the video content can give medical professionals tools to improve patient care and bolster the capabilities of hospitals that lack adequate medical imaging resources.

Architected with NVIDIA Jetson AGX Xavier, an NVIDIA RTX 6000 GPU and the NVIDIA Mellanox ConnectX-6 SmartNIC, the Clara AGX developer kit comes with an SDK that makes it easy for developers to get up and running with real-time system software, libraries for input/output and video pipelining, and reference applications to create AI models for ultrasound and endoscopy.

Built into the platform is the NVIDIA EGX stack for cloud-native containerized software and microservices, including NVIDIA Fleet Command to securely deploy fleets of devices in hospitals, which together transform everyday sensors into smart sensors.

These smart sensors will be software-defined, meaning they can be regularly updated with AI algorithms as they improve — an essential capability to continuously connect research breakthroughs with the day-to-day practice of medicine.

Enabling Intelligent Instruments

Carestream Health is creating smart X-ray rooms that will include AI-powered features for an enhanced imaging workflow and faster, more efficient exams. The devices include automated positioning and exposure settings for similar exam types, which helps improve the consistency of X-ray images, boosting diagnostic confidence.

And Activ Surgical, a member of the NVIDIA Inception startup accelerator program, is using NVIDIA GPU-accelerated AI to deliver real-time surgical guidance. The company’s newly FDA-cleared ActivSight module will power its ActivINSIGHT product, which will provide surgeons with previously unavailable visual overlays, including blood flow and perfusion without the need for the injection of dyes.

Carestream Health and Activ Surgical are just two of the pioneering companies worldwide using NVIDIA AGX systems to power intelligent medical devices. Others include:

  • AJA Video Systems, based in California’s Gold Country, develops professional video and audio PCIe cards for high-bandwidth streaming. When combined with the NVIDIA Clara AGX developer kit, which includes two PCIe slots and high-speed network ports, the company’s cards can be used for endoscopy and surgical visualization applications.
  • Kaliber Labs, an NVIDIA Inception member, is building real-time AI-powered software solutions to support surgeons performing arthroscopic and minimally invasive procedures. Kaliber uses NVIDIA Clara AGX to deploy its surgical software suite, which equips surgeons with a first-of-its-kind contextualized and personalized surgical toolkit to help them perform at the highest level and reduce surgical variability.
  • KAYA Instruments, an NVIDIA Inception member, develops computer vision products that can be used with imaging devices, including electron microscopes, ultrasound machines and MRI equipment. The Israel-based company’s video acquisition cards and cameras transfer medical imaging content to NVIDIA GPUs for real-time processing and AI-accelerated analysis.
  • Subtle Medical, an NVIDIA Inception member, has deployed FDA-cleared and CE-marked deep-learning powered image enhancement software solutions for PET and MRI protocols. The company will leverage NVIDIA Clara AGX for SubtleIR, an AI-powered software under development that improves the speed and quality of interventional imaging procedures.
  • Theator, an NVIDIA Inception member, will use NVIDIA Clara AGX to develop its surgical analytics platform. The Palo Alto-based startup is developing edge GPU-accelerated AI systems to annotate operating room footage, allowing surgeons to conduct post-surgery reviews where they can compare parts of a procedure with previous identical procedures.
  • us4us, a Poland-based maker of ultrasound research systems, is using NVIDIA AGX systems for a portable ultrasound platform that will support real-time digital beamforming — a compute-intensive technique essential to capturing quality ultrasound images. The software-defined system uses embedded GPU modules so medical researchers can develop and deploy custom AI models for image processing during ultrasound scans.
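The digital beamforming mentioned for us4us can be illustrated with a toy delay-and-sum example. This is a minimal sketch under stated assumptions, not the us4us or NVIDIA implementation: it models only one-way receive delays, treats each echo as a single-sample impulse, and every function name and parameter value below is hypothetical.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s, a commonly used soft-tissue value (assumption)


def simulate_point_source(element_x, source, fs, n_samples, c=SPEED_OF_SOUND):
    """Build one channel per element, each holding a single impulse at the
    sample index matching the travel time from the source to that element."""
    sx, sz = source
    channels = []
    for ex in element_x:
        samples = [0.0] * n_samples
        idx = round(math.hypot(sx - ex, sz) / c * fs)
        samples[idx] = 1.0
        channels.append(samples)
    return channels


def delay_and_sum(channels, element_x, focus, fs, c=SPEED_OF_SOUND):
    """For a chosen focal point, pick the sample on each channel that
    corresponds to the focus-to-element travel time and add them up."""
    fx, fz = focus
    total = 0.0
    for samples, ex in zip(channels, element_x):
        idx = round(math.hypot(fx - ex, fz) / c * fs)
        if 0 <= idx < len(samples):
            total += samples[idx]
    return total


# Hypothetical 8-element linear array with 0.3 mm pitch, sampled at 40 MHz.
element_x = [(i - 3.5) * 0.3e-3 for i in range(8)]
fs = 40e6
source = (0.0, 20e-3)  # point reflector 20 mm below the array center

channels = simulate_point_source(element_x, source, fs, n_samples=2048)
on_target = delay_and_sum(channels, element_x, source, fs)
off_target = delay_and_sum(channels, element_x, (5e-3, 20e-3), fs)
print(on_target, off_target)
```

Focusing on the true reflector position lines up the impulses from all channels, so the sum is large; focusing elsewhere misaligns them. A real imaging system repeats this for every pixel and every frame, which is why the technique is compute-intensive.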

Learn more about Clara AGX for AI-powered medical devices and instruments in the GTC talk, “Using Ethernet to Stream High-Throughput, Low-Latency Medical Sensor Data.” The NVIDIA GPU Technology Conference is free to register. The healthcare track includes 16 live webinars, 18 special events and over 100 recorded sessions.

Registration isn’t required to watch NVIDIA CEO Jensen Huang’s keynote address.

Subscribe to NVIDIA healthcare news, and follow NVIDIA Healthcare on Twitter.

The post Carestream Health and Startups Develop AI-Enabled Medical Instruments with NVIDIA Clara AGX Developer Kit appeared first on The Official NVIDIA Blog.


NVIDIA Gives Arm a Second Shot of Acceleration

The Arm ecosystem got a booster shot of advances from NVIDIA at GTC today.

NVIDIA discussed work with Arm-based silicon, software and service providers, showing the potential of energy-efficient, accelerated platforms and applications across client, cloud, HPC and edge computing.

NVIDIA also announced three new processors built around Arm IP, including “Grace,” its first data center CPU which takes AI, cloud and high performance computing to new heights.

Separately, the new BlueField-3 data processing unit (DPU) sports more Arm cores, opening doors to new, more powerful applications in data center networking.

And NVIDIA DRIVE Atlan becomes the company’s first processor for autonomous vehicles packing an Arm-enabled DPU, showing the potential for high performance networks in automakers’ 2025 models.

A Vision of What’s Possible

In his GTC keynote, NVIDIA CEO Jensen Huang shared his vision for AI, HPC, data science, graphics and more. He also reaffirmed his pledge to expand the Arm ecosystem as part of the Arm acquisition deal NVIDIA announced in September 2020.

On the road to making that vision a reality, NVIDIA described a set of efforts to accelerate CPUs from four key Arm partners with NVIDIA GPUs, DPUs and software, enhancing apps from Arm developers.

GPUs Boost AWS Graviton2 Instances

In the cloud, NVIDIA announced it will provide GPU acceleration for Amazon Web Services Graviton2, the cloud-service provider’s own Arm-based processor. The accelerated Graviton2 instances will provide rich game-streaming experiences and lower the cost of powerful AI inference capabilities.

For example, game developers will use the AWS instances to stream Android games and other services that combine the efficiency of Graviton2 with NVIDIA RTX graphics technologies like ray tracing and DLSS.

In high performance computing, the new NVIDIA Arm HPC Developer Kit provides a high-performance, energy-efficient platform for supercomputers that combine Ampere Computing’s Altra — a CPU packing 80 Arm cores running up to 3.3 GHz — with the latest NVIDIA GPUs and DPUs.

The devkit runs a suite of NVIDIA compilers, libraries and tools for AI and HPC so developers can accelerate Arm-based systems for science and technical computing. Leading research institutions, including Oak Ridge and Los Alamos National Labs in the U.S. as well as national labs in South Korea and Taiwan, will be among its first users.

Pumping Up Client, Edge Platforms

In PCs, NVIDIA is working with MediaTek, the world’s largest supplier of smartphone chips, to create a new class of notebooks powered by an Arm-based CPU alongside an NVIDIA RTX GPU.

The notebooks will use Arm cores and NVIDIA graphics to give consumers energy-efficient portables with no-compromise media capabilities based on a reference platform that supports Chromium, Linux and NVIDIA SDKs.

And in edge computing, NVIDIA is working with Marvell Semiconductor to team its OCTEON Arm-based processors with NVIDIA’s GPUs. Together they will speed up AI workloads for network optimization and security.

Top AI Systems Join Arm’s Family

Two powerful AI supercomputers will come online next year.

The Swiss National Supercomputing Centre is building a system with 20 exaflops of AI performance. And in the U.S., the Los Alamos National Laboratory will switch on a new AI supercomputer for its researchers.

Both will be powered by NVIDIA’s first data center CPU, “Grace,” an Arm-based processor that will deliver 10x the performance of today’s fastest servers on the most complex AI and HPC workloads.

Named after pioneering computer scientist Grace Hopper, this CPU has the plumbing needed for the data-driven AI era. It sports coherent connections running at 900 GB/s to NVIDIA GPUs, thanks to a fourth generation NVLink — that’s 14x the bandwidth of today’s servers.

More Arm Cores for Networking

NVIDIA Mellanox networking is more than doubling down on its investment in Arm. The BlueField-3 DPU announced today packs 400-Gbps links and 5x the Arm compute power of the current BlueField-2 DPU.

Simple math shows why bulking up on Arm makes sense: One BlueField-3 DPU delivers data center services that could otherwise consume up to 300 x86 CPU cores.

The advance gives Arm developers an expanding set of opportunities to build fast, efficient and smart data center networks.

Today DPUs offload communications, storage, security and systems-management tasks. That’s enabling whole new classes of systems such as the cloud-native supercomputer NVIDIA announced today.

NVIDIA and Arm Behind the Wheel

Arm cores will debut in next-generation AI-enabled autonomous vehicles powered by NVIDIA DRIVE Atlan, the next leap on NVIDIA’s roadmap.

DRIVE Atlan will pack quite a punch, kicking out more than 1,000 trillion operations per second. Atlan marks the first time the DRIVE platform integrates a DPU, carrying Arm cores that will help it pack the equivalent of data center networking into autonomous vehicles.

The DPU in Atlan provides a platform for Arm developers to create innovative applications in security, storage, networking and more.

The Best Is Yet to Come 

The expanding products and partnerships mark progress on our intention announced in October to bring the Arm ecosystem four acceleration suites:

  • NVIDIA AI – the industry standard for accelerating AI training and inference
  • RAPIDS – a suite of open-source software libraries maintained by NVIDIA to run data science and analytics on GPUs
  • NVIDIA HPC SDK – compilers, libraries and software tools for high performance computing
  • NVIDIA RTX – graphics drivers that deliver ray tracing and AI capabilities

And we’re just getting started. There’s much more to come and much more to say.

Learn about new opportunities combining NVIDIA and Arm at GTC21. Registration is free.

The post NVIDIA Gives Arm a Second Shot of Acceleration appeared first on The Official NVIDIA Blog.


NVIDIA DRIVE Sim Powered by Omniverse Available for Early Access This Summer

The path to autonomous vehicle deployment is accelerating through the Omniverse.

During his opening keynote at GTC, NVIDIA founder and CEO Jensen Huang announced the next generation of autonomous vehicle simulation, NVIDIA DRIVE Sim, now powered by NVIDIA Omniverse.

DRIVE Sim enables high-fidelity simulation by tapping into NVIDIA’s core technologies to deliver a powerful, cloud-based computing platform. It can generate datasets to train the vehicle’s perception system and provide a virtual proving ground to test the vehicle’s decision-making process while accounting for edge cases. The platform can be connected to the AV stack in software-in-the-loop or hardware-in-the-loop configurations to test the full driving experience.

DRIVE Sim on Omniverse is a major step forward as NVIDIA transitions the foundation for autonomous vehicle simulation from a game engine to a simulation engine.

This shift to simulation architected specifically for self-driving development has required significant effort, but brings an array of new capabilities and opportunities.

Enter the Omniverse

Creating a purpose-built autonomous vehicle simulation platform is not a simple undertaking. Game engines are powerful tools that provide incredible capabilities; however, they’re designed to build games, not scientific, physically accurate, repeatable simulations.

Designing the next generation of DRIVE Sim required a new approach. This new simulator had to be repeatable with precise timing, easily scale across GPUs and server nodes, simulate sensor feeds with physical accuracy and act as a modular and extensible platform.

NVIDIA Omniverse is the confluence of almost every core technology developed by NVIDIA. And DRIVE Sim takes advantage of the company’s expertise in graphics, high performance computing, AI and hardware design. Combining these capabilities provides a technology platform that is perfect for autonomous vehicle simulation.

Specifically, Omniverse provides a platform that was designed from the ground up to support multi-GPU computing. It incorporates a physically accurate, ray-tracing renderer based on NVIDIA RTX technology.

NVIDIA Omniverse also includes “Kit,” a scalable and extensible simulation framework for building interactive 3D applications and microservices. Using Kit over the last year, NVIDIA has implemented the DRIVE Sim core simulation engine in a way that supports repeatable simulation with precise control over all processes.

Timing and Repeatability

Autonomous vehicle simulation can only be an effective development tool if scenarios are repeatable and timing is accurate.

For instance, NVIDIA Omniverse schedules and manages all sensor and environment rendering functions to ensure repeatability without loss of accuracy. It does this across GPUs and across nodes, giving DRIVE Sim the ability to handle detailed environments and test vehicles with complex sensor suites. Additionally, it can manage such workloads at slower or faster than real time, while generating repeatable results.
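One common way to get repeatable results regardless of how fast the host runs is to advance the simulation in fixed steps and derive simulated time from the step count rather than from the wall clock. The sketch below illustrates that general idea only; it is not how Omniverse or DRIVE Sim is implemented, and the function name, the toy vehicle model and all parameters are assumptions made for illustration.

```python
import random


def run_simulation(seed, steps, dt=0.01):
    """Advance a toy vehicle model in fixed time steps.

    Simulated time comes from the step index, never from the wall clock,
    and all randomness comes from a privately seeded generator, so the
    trace is identical whether the host runs slower or faster than real time.
    """
    rng = random.Random(seed)  # private generator, unaffected by global state
    position, velocity = 0.0, 10.0
    trace = []
    for step in range(steps):
        sim_time = step * dt
        accel = rng.uniform(-1.0, 1.0)  # stand-in for scenario disturbances
        velocity += accel * dt
        position += velocity * dt
        trace.append((sim_time, position))
    return trace


first = run_simulation(seed=42, steps=1000)
second = run_simulation(seed=42, steps=1000)
print(first == second)
```

Two runs with the same seed and step count produce the same trace, which is the property a regression test for a driving scenario would rely on.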

Omniverse was designed to scale to many GPUs, providing DRIVE Sim real-time rendering capabilities with repeatable results for complex sensor sets.

Not only does the platform enable this flexibility and accuracy, it does so in a way that’s scalable, so developers can run fleets of vehicles with various sensor suites at large scale and at the highest levels of fidelity.

Physically Accurate Sensors

In addition to accurately recreating real-world driving conditions, the simulation environment must also render vehicle sensor data in the exact same way cameras, radars and lidars take in data from the physical world.

With NVIDIA RTX technology, DRIVE Sim is able to render physically accurate sensor data in real time. Ray tracing provides realistic lighting by simulating the physical properties of visible and non-visible waveforms. And the NVIDIA Omniverse RTX renderer coupled with NVIDIA RTX GPUs enables ray tracing at real-time frame rates.

This scene of vehicles in a tunnel uses indirect lighting, which is challenging to render accurately in real time but is enabled in DRIVE Sim by the Omniverse RTX renderer.

The capability to simulate light in real time has significant benefits for autonomous vehicle simulation. It makes it possible to recreate lighting environments that can be virtually impossible to capture using rasterization — from the reflections off a tanker truck to the shadows inside a dim tunnel.

Generating physically accurate sensor data is especially powerful for building datasets to train AI-based perception networks, outputting the ground-truth data with the virtual sensor data. DRIVE Sim includes tools for advanced dataset creation including a powerful Python scripting interface and domain randomization tools.
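Domain randomization can be pictured as sampling scene parameters from chosen ranges and storing the ground-truth label alongside each sample. The following is a minimal sketch of that idea, not the DRIVE Sim Python scripting interface: the class, field names, value ranges and color list are all hypothetical.

```python
import random
from dataclasses import dataclass

COLORS = ["white", "black", "red", "blue"]  # illustrative choices only


@dataclass(frozen=True)
class SceneSample:
    sun_elevation_deg: float   # randomized lighting condition
    rain_intensity: float      # randomized weather, 0 (dry) to 1 (heavy)
    vehicle_color: str         # randomized appearance
    vehicle_distance_m: float  # ground truth the perception network should predict


def randomize_scene(rng):
    """Draw one scene description from the configured parameter ranges."""
    return SceneSample(
        sun_elevation_deg=rng.uniform(5.0, 85.0),
        rain_intensity=rng.uniform(0.0, 1.0),
        vehicle_color=rng.choice(COLORS),
        vehicle_distance_m=rng.uniform(5.0, 120.0),
    )


def build_dataset(n, seed):
    """Generate n randomized scenes; a fixed seed makes the dataset reproducible."""
    rng = random.Random(seed)
    return [randomize_scene(rng) for _ in range(n)]


dataset = build_dataset(100, seed=7)
print(dataset[0])
```

In a real pipeline each sample would drive a render, and the stored label would be written out next to the resulting sensor frame.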

Using this synthetic data in the DNN training process saves the cost of collecting and labeling real-world data, and speeds up iteration for streamlined autonomous vehicle deployment.

DRIVE Sim provides tools to generate ground truth data with simulation data, enabling rapid generation of complex datasets to train Deep Neural Networks (DNNs) for autonomous vehicle perception.

Modular and Extensible

As a modular, open and extensible platform, DRIVE Sim provides developers the ultimate flexibility and efficiency in simulation testing.

DRIVE Sim on Omniverse allows different components of the simulator to be run to support different use cases. One group of engineers can run just the perception stack in simulation. Another can focus on the planning and control stack by simulating scenarios based on ground-truth object data (thus bypassing the perception stack).

This modularity significantly cuts down on development time by allowing developers to focus on the task at hand, while ensuring that the entire team is using the same tools, scenarios, models and assets in simulation for consistent results.

Using the NVIDIA Omniverse Kit SDK, DRIVE Sim allows developers to build custom models, 3D content and validation tools or to interface with other simulations. Users can create their own plugins or choose from a rich library of vehicle, sensor and traffic plugins provided by DRIVE Sim ecosystem partners. This flexibility enables users to customize DRIVE Sim for their unique use case and tailor the simulation experience to their development and validation needs.

DRIVE Sim on Omniverse will be available to developers via an early access program this summer. Learn more about DRIVE Sim and accelerate the development of safer, more efficient transportation today.

The post NVIDIA DRIVE Sim Powered by Omniverse Available for Early Access This Summer appeared first on The Official NVIDIA Blog.


A Data Center on Wheels: NVIDIA Unveils DRIVE Atlan Autonomous Vehicle Platform

The next stop on the NVIDIA DRIVE roadmap is Atlan.

During today’s opening keynote of the GPU Technology Conference, NVIDIA founder and CEO Jensen Huang unveiled the upcoming generation of AI compute for autonomous vehicles, NVIDIA DRIVE Atlan. A veritable data center on wheels, Atlan centralizes the vehicle’s entire compute infrastructure into a single system-on-a-chip.

While vehicles are packing in more and more compute technology, they’re lacking the physical security that comes with data center-level processing. Atlan is a technical marvel for safe and secure AI computing, fusing all of NVIDIA’s technologies in AI, automotive, robotics, safety and BlueField data centers.

The next-generation platform will achieve an unprecedented 1,000 trillion operations per second (TOPS) of performance and an estimated SPECint score of more than 100 (SPECrate2017_int) — greater than the total compute in most robotaxis today. Atlan is also the first SoC to be equipped with an NVIDIA BlueField data processing unit (DPU) for trusted security, advanced networking and storage services.

While Atlan will not be available for a couple of years, software development is well underway. Like NVIDIA DRIVE Orin, the next-gen platform is software compatible with previous DRIVE compute platforms, allowing customers to leverage their existing investments across multiple product generations.

“To achieve higher levels of autonomy in more conditions, the number of sensors and their resolutions will continue to increase,” Huang said. “AI models will get more sophisticated. There will be more redundancy and safety functionality. We’re going to need all of the computing we can get.”

Advancing Performance at Light Speed

Autonomous vehicle technology is developing faster than it has in previous years, and the core AI compute must advance in lockstep to support this critical progress.

Cars and trucks of the future will require an optimized AI architecture not only for autonomous driving, but also for intelligent vehicle features like speech recognition and driver monitoring. Upcoming software-defined vehicles will be able to converse with occupants: answering questions, providing directions and warning of road conditions ahead.

Atlan is able to deliver more than 1,000 TOPS, a 4x gain over the previous generation, by leveraging NVIDIA’s latest GPU architecture, new Arm CPU cores and deep learning and computer vision accelerators. The platform architecture provides ample compute horsepower for the redundant and diverse deep neural networks that will power future AI vehicles and leaves headroom for developers to continue adding features and improvements.

This high-performance platform will run autonomous vehicle, intelligent cockpit and traditional infotainment applications concurrently.

A Guaranteed Shield with BlueField

Like every generation of NVIDIA DRIVE, Atlan is designed with the highest level of safety and security.

As a data-center-infrastructure-on-a-chip, the NVIDIA BlueField DPU is architected to handle the complex compute and AI workloads required for autonomous vehicles. By combining the industry-leading ConnectX network adapter with an array of Arm cores, BlueField offers purpose-built hardware acceleration engines with full programmability to deliver “zero-trust” security to prevent data breaches and cyberattacks.

This secure architecture will extend the safety and reliability of the NVIDIA DRIVE platform for vehicle generations to come. NVIDIA DRIVE Orin vehicle production timelines start in 2022, and Atlan will follow, sampling in 2023 and slated for 2025 production vehicles.

The post A Data Center on Wheels: NVIDIA Unveils DRIVE Atlan Autonomous Vehicle Platform appeared first on The Official NVIDIA Blog.


NVIDIA Opens Up Hyperion 8 Autonomous Vehicle Platform for AV Ecosystem

The next generation of vehicles will be packed with more technology than any computing system today.

And with NVIDIA DRIVE Hyperion, companies can embrace this shift to more intelligent, software-defined vehicles. Announced at GTC, the eighth-generation Hyperion platform includes the sensors, high-performance compute and software necessary for autonomous vehicle development, all verified, calibrated and synchronized right out of the box.

Developing an AV — essentially a data center on wheels — requires an entirely new process. Both the hardware and software must be comprehensively tested and validated to ensure they can handle not only the real-time processing for autonomous driving, but also withstand the harsh conditions of daily driving.

Hyperion is a fully operational, production-ready and open autonomous vehicle platform that cuts down the massive amount of time and cost required to outfit vehicles with the technology required for AI features and autonomous driving.

What’s Included

Hyperion comes with all the hardware needed to validate an autonomous driving system at the highest levels of performance.

At its core, two NVIDIA DRIVE Orin systems-on-a-chip (SoCs) provide ample compute for level 4 self-driving and intelligent cockpit capabilities. These SoCs process data from a halo of 12 exterior cameras, three interior cameras, nine radars and two lidar sensors in real time for safe autonomous operation.

Hyperion also includes all the tools necessary to evaluate the NVIDIA DRIVE AV and DRIVE IX software stack, as well as real-time record and capture capabilities for streamlined driving data processing.

And this entire toolset is synchronized and calibrated precisely for 3D data collection, giving developers valuable time back in setting up and running autonomous vehicle test drives.

Seamless Integration

With much of the industry leveraging NVIDIA DRIVE Orin for in-vehicle compute, DRIVE Hyperion is the next step for full autonomous vehicle development and validation.

By including a complete sensor setup on top of centralized compute, Hyperion provides everything needed to validate an intelligent vehicle’s hardware on the road. And with its compatibility with the NVIDIA DRIVE AV and DRIVE IX software stacks, Hyperion is also a critical platform for evaluating and validating self-driving software.

Plus, it’s already streamlining critical self-driving research and development. Institutions such as the Virginia Tech Transportation Institute and Stanford University are leveraging the current generation of Hyperion in autonomous vehicle research pilots.

Developers can begin leveraging the latest open platform soon — the eighth generation of Hyperion will be available to the NVIDIA DRIVE ecosystem later in 2021.

The post NVIDIA Opens Up Hyperion 8 Autonomous Vehicle Platform for AV Ecosystem appeared first on The Official NVIDIA Blog.


Brain Gain: NVIDIA DRIVE Orin Now Central Computer for Intelligent Vehicles

NVIDIA DRIVE Orin, our breakthrough autonomous vehicle system-on-a-chip, is the new mega brain of the software-defined vehicle.

Beyond self-driving features, NVIDIA CEO and founder Jensen Huang announced today during his GTC keynote that the SoC can power all the intelligent computing functions inside vehicles, including confidence view visualization of autonomous driving capabilities, digital clusters, infotainment and passenger interaction AI.

Slated for 2022 vehicle product lines, Orin processes more than 250 trillion operations per second while achieving systematic safety standards such as ISO 26262 ASIL-D.

Typically, vehicle functions are controlled by tens of electronic control units distributed throughout a vehicle. By centralizing control of these core domains, Orin can replace these components and simplify what has been an incredibly complex supply chain for automakers.

“The future is one central computer — four domains, virtualized and isolated, architected for functional safety and security, software-defined and upgradeable for the life of the car — in addition to super-smart AI and beautiful graphics,” Huang said.

Secure Computing for Every Need

Managing a system with multiple complex applications is incredibly difficult. And when it comes to automotive, safety is critical.

DRIVE Orin supports multiple operating systems, including Linux, QNX and Android, to enable this wide range of applications. As a high-performance compute platform architected for the highest level of safety, it does so in a way that is secure, virtualized and accelerated.

The digital cluster, driver monitoring system and AV confidence view are all crucial to ensuring the safety of a vehicle’s occupants. Each must be functionally secure, with the ability to update each application individually without requiring a system reboot.

DRIVE Orin is designed for software-defined operation, meaning it’s purpose-built to handle these continuous upgrades throughout the life of the vehicle.

The Highest Levels of Confidence

As vehicles become more and more autonomous, visualization within the cabin will be critical for building trust with occupants. And with the DRIVE Orin platform, manufacturers can integrate enhanced capability into their fleets over the life of their vehicles.

The confidence view is a rendering of the mind of the vehicle’s AI. It shows exactly what the sensor suite and perception system are detecting in real time and constructs it into a 3D surround model.

By incorporating this view in the cabin interior, the vehicle can communicate the accuracy and reliability of the autonomous driving system at every step of the journey. And occupants can gain a better understanding of how the vehicle’s AI sees the world.

As a high-performance AI compute platform, DRIVE Orin enables this visualization alongside the digital cluster, infotainment, and driver and occupant monitoring, while maintaining enough compute headroom to add new features that delight customers through the life of their vehicles.

The ability to support this multi-functionality safely and securely is what makes NVIDIA DRIVE Orin truly central to the next-generation intelligent vehicle experience.

The post Brain Gain: NVIDIA DRIVE Orin Now Central Computer for Intelligent Vehicles appeared first on The Official NVIDIA Blog.


NVIDIA Triton Tames the Seas of AI Inference

You don’t need a hunky sea god with a three-pronged spear to make AI work, but a growing group of companies from car makers to cloud service providers say you’ll feel a sea change if you sail with Triton.

More than half a dozen companies share hands-on experiences this week in deep learning with the NVIDIA Triton Inference Server, open-source software that takes AI into production by simplifying how models run in any framework on any GPU or CPU for all forms of inference.

For instance, in a talk at GTC (free with registration) Fabian Bormann, an AI engineer at Volkswagen Group, conducts a virtual tour through the Computer Vision Model Zoo, a repository of solutions curated from the company’s internal teams and future partners.

The car maker integrates Triton into its Volkswagen Computer Vision Workbench so users can make contributions to the Model Zoo without needing to worry about whether they are based on ONNX, PyTorch or TensorFlow frameworks. Triton simplifies model management and deployment, and that’s key for VW’s work serving up AI models in new and interesting environments, Bormann says in a description of his talk (session E32736) at GTC.

Salesforce Sold on Triton Benchmarks

A leader in customer-relationship management software and services, Salesforce recently benchmarked Triton’s performance on some of the world’s largest AI models — the transformers used for natural-language processing.

“Triton not only has excellent serving performance, but also comes included with several critical functions like dynamic batching, model management and model prioritization. It is quick and easy to set up and works for many deep learning frameworks including TensorFlow and PyTorch,” said Nitish Shirish Keskar, a senior research manager at Salesforce who’s presenting his work at GTC (session S32713).
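Dynamic batching, one of the functions named in the quote, generally means holding incoming requests for a short window so several can be run through the model together. The toy function below illustrates the grouping idea only. It is not Triton's scheduler or configuration format, and the function name, parameters and arrival times are assumptions.

```python
def dynamic_batches(arrival_times_ms, max_batch_size, max_delay_ms):
    """Group request arrival times into batches.

    A batch is closed when it is full, or when the next request arrives
    more than max_delay_ms after the first request in the batch.
    """
    batches = []
    current = []
    for t in sorted(arrival_times_ms):
        batch_full = len(current) == max_batch_size
        waited_too_long = bool(current) and t - current[0] > max_delay_ms
        if batch_full or waited_too_long:
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


batches = dynamic_batches([0, 1, 2, 3, 10, 11, 30], max_batch_size=4, max_delay_ms=5)
print(batches)
```

The trade-off is between throughput (larger batches use the GPU more efficiently) and latency (each request may wait up to the delay window before it runs).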

Keskar described in a recent blog his work validating that Triton can handle 500-600 queries per second (QPS) while processing 100 concurrent threads and staying under 200ms latency on the well-known BERT models used to understand speech and text. He tested Triton on the much larger CTRL and GPT2-XL models, finding that despite their billions of neural-network nodes, Triton still cranked out an amazing 32-35 QPS.
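The BERT figures above are consistent with Little's law, which relates throughput, concurrency and latency in a steady-state system. The short calculation below assumes each of the 100 threads is a closed-loop client that keeps exactly one request in flight; the benchmark's actual load generator is not described here, so treat this as a sanity check rather than a reconstruction.

```python
def closed_loop_throughput(concurrency, latency_s):
    """Little's law: requests per second = requests in flight / time per request."""
    return concurrency / latency_s


def implied_latency_ms(concurrency, qps):
    """Average latency implied by a given throughput at a given concurrency."""
    return concurrency / qps * 1000.0


# 100 requests in flight at the 200 ms latency ceiling
throughput_at_ceiling = closed_loop_throughput(100, 0.200)

# Average latency implied by the upper end of the reported 500-600 QPS range
latency_at_600_qps = implied_latency_ms(100, 600)

print(throughput_at_ceiling, latency_at_600_qps)
```

Under that assumption, 100 in-flight requests at 200 ms each correspond to roughly 500 QPS, and reaching 600 QPS implies an average latency of about 167 ms, which fits under the stated 200 ms bound.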

A Model Collaboration with Hugging Face

More than 5,000 organizations turn to Hugging Face for help summarizing, translating and analyzing text with its 7,000 AI models for natural-language processing. Jeff Boudier, its product director, will describe at GTC (session S32003) how his team drove 100x improvements in AI inference on its models, thanks to a flow that included Triton.

“We have a rich collaboration with NVIDIA, so our users can have the most optimized performance running models on a GPU,” said Boudier.

Hugging Face aims to combine Triton with TensorRT, NVIDIA’s software for optimizing AI models, to drive the time to process an inference with a BERT model down to less than a millisecond. “That would push the state of the art, opening up new use cases with benefits for a broad market,” he said.

Deployed at Scale for AI Inference

American Express uses Triton in an AI service that operates within a 2ms latency requirement to detect fraud in real time across $1 trillion in annual transactions.

As for throughput, Microsoft uses Triton on its Azure cloud service to power the AI behind GrammarLink, its online editor for Microsoft Word that’s expected to serve as many as half a trillion queries a year.

Less well known but well worth noting, LivePerson, based in New York, plans to run thousands of models on Triton in a cloud service that provides conversational AI capabilities to 18,000 customers including GM Financial, Home Depot and European cellular provider Orange.

[Image: Triton Inference Server. Triton simplifies the job of executing multiple styles of inference with models based on various frameworks while maintaining the highest throughput and system utilization.]

And the chief technology officer of London-based Intelligent Voice will describe at GTC (session S31452) its LexIQal system, which uses Triton for AI inference to detect fraud in insurance and financial services.

They are among many companies using NVIDIA for AI inference today. In the past year alone, users downloaded the Triton software more than 50,000 times.

Triton’s Swiss Army Spear

Triton is getting traction in part because it can handle any kind of AI inference job, whether it’s one that runs in real time, batch mode, as a streaming service or even if it involves a chain or ensemble of models. That flexibility eliminates the need for users to adopt and manage custom inference servers for each type of task.

In addition, Triton assures high system utilization, distributing work evenly across GPUs whether inference is running in a cloud service, in a local data center or at the edge of the network. And its open, extensible code lets users customize Triton to their specific needs.

NVIDIA keeps improving Triton, too. A recently added model analyzer combs through all the options to show users the optimal batch size or instances-per-GPU for their job. A new tool automates the job of translating and validating a model trained in TensorFlow or PyTorch into a TensorRT format; in the future, it will support translating models to and from any neural-network format.
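
The kind of search a model analyzer performs can be pictured as a sweep over candidate settings that keeps the highest-throughput one fitting a latency budget. The sketch below is only an illustration of that idea, not Triton’s model analyzer or its API. The `sweep` and `toy_measure` names are hypothetical, and `toy_measure` is an invented cost model standing in for a real benchmark run, so its numbers are not measurements.

```python
from itertools import product
from typing import Callable, Iterable, NamedTuple, Optional, Tuple


class Result(NamedTuple):
    batch_size: int
    instances: int
    throughput_qps: float
    latency_ms: float


def sweep(
    batch_sizes: Iterable[int],
    instance_counts: Iterable[int],
    measure: Callable[[int, int], Tuple[float, float]],
    latency_budget_ms: float,
) -> Optional[Result]:
    """Try every (batch size, instances-per-GPU) pair and return the
    highest-throughput one whose latency fits the budget, or None."""
    best = None
    for batch, inst in product(batch_sizes, instance_counts):
        qps, latency = measure(batch, inst)
        if latency > latency_budget_ms:
            continue  # violates the latency requirement
        if best is None or qps > best.throughput_qps:
            best = Result(batch, inst, qps, latency)
    return best


def toy_measure(batch: int, instances: int) -> Tuple[float, float]:
    """Hypothetical cost model, NOT benchmark data: latency grows with
    batch size and with the number of instances sharing one GPU."""
    latency_ms = 2.0 + 0.5 * batch + 1.0 * (instances - 1)
    throughput_qps = 1000.0 * batch * instances / latency_ms
    return throughput_qps, latency_ms


best = sweep([1, 2, 4, 8, 16], [1, 2, 4], toy_measure, latency_budget_ms=10.0)
```

In practice `measure` would launch real load against the server for each setting; the exhaustive loop is used here only because the search space is tiny.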

Meet Our Inference Partners

Triton’s attracted several partners who support the software in their cloud services, including Amazon, Google, Microsoft and Tencent. Others such as Allegro, Seldon and Red Hat support Triton in their software for enterprise data centers for workflows including MLOps, the extension to DevOps for AI.

At GTC (session S33118), Arm will describe how it adapted Triton as part of its neural-network software that runs inference directly on edge gateways. Two engineers from Dell EMC will show how to boost performance in video analytics 6x using Triton (session S31437), and NetApp will talk about its work integrating Triton with its solid-state storage arrays (session S32187).

To learn more, register for GTC and check out one of two introductory sessions (S31114, SE2690) with NVIDIA experts on Triton for deep learning inference.

The post NVIDIA Triton Tames the Seas of AI Inference appeared first on The Official NVIDIA Blog.


Like Magic: NVIDIA Merlin Gains Adoption for Training and Inference

Recommenders personalize the internet. They suggest videos, foods, sneakers and advertisements that seem magically clairvoyant in knowing your tastes and interests.

It’s an AI that makes online experiences more enjoyable and efficient, quickly taking you to the things you want to see. While delivering content you like, it also targets tempting ads for jeans, or recommends comfort dishes that fit those midnight cravings.

But not all recommender systems can handle the data requirements to make smarter suggestions. That leads to slower training and less intuitive internet user experiences.

NVIDIA Merlin is turbocharging recommenders, boosting training and inference. Leaders in media, entertainment and on-demand delivery use the open source recommender framework for running accelerated deep learning on GPUs. Improving recommendations increases clicks, purchases — and satisfaction.

Merlin-Accelerated Recommenders 

NVIDIA Merlin enables businesses of all types to build recommenders accelerated by NVIDIA GPUs.

Its collection of libraries includes tools for building deep learning-based systems that provide better predictions than traditional methods and increase clicks. Each stage of the pipeline is optimized to support hundreds of terabytes of data, all accessible through easy-to-use APIs.

Merlin is in testing with hundreds of companies worldwide. Social media and video services are evaluating it for suggestions on next views and ads. And major on-demand apps and retailers are looking at it for suggestions on new items to purchase.

Videos with Snap

With Merlin, Snap is improving the customer experience with better load times by ranking content and ads 60% faster while also reducing its infrastructure costs. Using GPUs and Merlin provides Snap with additional compute capacity to explore more complex and accurate ranking models. These improvements allow Snap to deliver even more engaging experiences at a lower cost.

Tencent: Ads that Click

China’s leading online video media platform uses Merlin HugeCTR to help connect over 500 million monthly active users with ads that are relevant and engaging. With such a huge dataset, training speed matters and determines the performance of the recommender model. Tencent deployed its real-time training with Merlin and achieved more than a 7x speedup over the original TensorFlow solution on the same GPU platform. Tencent dives into this further in its GTC presentation.

Postmates Food Picks

Merlin was designed to streamline and support recommender workflows. Postmates uses recommenders to help people decide what’s for dinner. Postmates utilizes Merlin NVTabular to optimize training time, reducing it from 1 hour on CPUs to just 5 minutes on GPUs.

Using NVTabular for feature engineering, the company reduced training costs by 95 percent and is exploring more advanced deep learning models. Postmates delves more into this in its GTC presentation.

Merlin Streamlines Recommender Workflows at Scale

As Merlin is interoperable, it provides flexibility to accelerate recommender workflow pipelines.

The open beta release of the Merlin recommendation engine delivers leaps in data loading and training of deep learning systems.

NVTabular reduces data preparation time by GPU-accelerating feature transformations and preprocessing. NVTabular, which makes loading massive data lakes into training pipelines easier, gets multi-GPU support and improved interoperability with TensorFlow and PyTorch.
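
To make “feature transformations” concrete, here is a plain-Python sketch of two preprocessing steps common in recommender pipelines: mapping categorical values to integer IDs and standardizing a continuous column. It illustrates the kind of work NVTabular moves onto GPUs. It is not NVTabular’s API, and the function names and sample rows are made up.

```python
from statistics import mean, pstdev


def categorify(values):
    """Build a vocabulary mapping each distinct category to an integer ID,
    in order of first appearance. ID 0 is reserved for unseen values."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab) + 1
    return vocab


def encode(values, vocab):
    """Replace each category with its ID, or 0 if it was never seen."""
    return [vocab.get(v, 0) for v in values]


def normalize(values):
    """Standardize a numeric column to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column carries no signal
    return [(v - mu) / sigma for v in values]


# Made-up sample rows for illustration
cuisines = ["pizza", "sushi", "pizza", "tacos"]
prices = [10.0, 20.0, 10.0, 20.0]

vocab = categorify(cuisines)
cuisine_ids = encode(cuisines, vocab)
price_z = normalize(prices)
```

The integer IDs can feed an embedding table and the standardized prices can feed dense layers, which is why both steps sit ahead of training.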

Merlin’s Magic for Training

Merlin HugeCTR is the main training component. It’s a deep neural network training framework designed specifically for recommender workflows, capable of distributed training across multiple GPUs and nodes for maximum performance. It comes with its own optimized data loader, vastly outperforming generic deep learning frameworks, and provides a Parquet data reader to digest the data preprocessed by NVTabular.

NVIDIA Triton Inference Server accelerates production inference on GPUs for feature transforms and neural network execution.

Learn more about the technology advances behind Merlin since its initial launch, including its support for NVTabular, HugeCTR and NVIDIA Triton Inference Server.

The post Like Magic: NVIDIA Merlin Gains Adoption for Training and Inference appeared first on The Official NVIDIA Blog.


NVIDIA Maxine Hits the Scene to Create Real-Time Video Experiences

The next time you’re in a virtual meeting or streaming a game, live event or TV program, the star of the show may be NVIDIA Maxine, which took center stage at GTC today when NVIDIA CEO Jensen Huang announced the availability of the GPU-accelerated software development kit during his keynote address.

Developers from video conferencing, content creation and streaming providers are using the Maxine SDK to create real-time video-based experiences. And it’s easily deployed to PCs, data centers or in the cloud.

Shift Towards Remote Work

Virtual collaboration continues to grow with 70 million hours of web meetings daily, and more global organizations are looking at technologies to support an increasingly remote workforce.

Pexip, a scalable video conferencing platform that enables interoperability between different video conferencing systems, was looking to push the boundaries of its video communications offering to meet this growing demand.

“We’re already using NVIDIA Maxine for audio noise removal and working on integrating virtual backgrounds to support premium video conferencing experiences for enterprises of all sizes,” said Giles Chamberlin, CTO and co-founder of Pexip.

Working with NVIDIA, Pexip aims to provide AI-powered video communications that support virtual meetings that are better than meetings in person.

It joins other companies in the video collaboration space like Avaya, which incorporated Maxine audio noise reduction into its Spaces app last October and has now implemented virtual background, which allows presenters to overlay their video over presentations.

Headroom uses AI to take distractions out of video conferencing, so participants can focus on interactions during meetings instead. This includes flagging when people have questions, note taking, transcription and smart meeting summarization.

Seeing Face Value for Virtual Events

Research has shown that there are over 1 million virtual events yearly, with more event marketers planning to invest in them in the future. As a result, everyone from event organizers to visual effects artists is looking for faster, more efficient ways to create digital experiences.

Among them is Touchcast, which combines AI and mixed reality to reimagine virtual events. It’s using Maxine’s super-resolution features to convert 1080p streams into 4K for delivery.

“NVIDIA Maxine is paving the future of video communications — a future where AI and neural networks enhance and enrich content in entirely new ways,” said Edo Segal, founder and CEO of Touchcast.

Another example is Notch, which creates tools that enable real-time visual effects and motion graphics for live events. Maxine provides it with real-time, AI-driven face and body tracking along with background removal.

Artists can track and mask performers in a live performance setting for a variety of creative use cases — all using a standard camera feed and eliminating the challenges of special hardware-tracking solutions.

“The integration of the Maxine SDK was very easy and took just a few days to complete,” said Matt Swoboda, founder and director of Notch.

Field of Streams

With nearly 10 million content creators on Twitch per month, becoming a live broadcaster has never been easier. Live streamers are looking for powerful yet easy-to-use features to excite their audiences.

BeLive, which provides a platform for live streaming user-generated talk shows, is using Maxine to process its video streams in the cloud so customers don’t have to invest in expensive equipment. By running Maxine in the cloud, users can benefit from high-quality background replacement regardless of the hardware they’re running on the client.

With BeLive, live interactive call-in talk shows can be produced easily and streamed to YouTube or Facebook Live, with participants calling in from around the world.

OBS, the leading platform for streaming and recording, is a free and open source software solution broadly used for game streaming and live production. Users with NVIDIA RTX GPUs can now take advantage of noise removal, improving the clarity of their audio during production.

[Image: Maxine users. Developers are using the Maxine SDK to build virtual collaboration and content creation applications.]

A Look Into NVIDIA Maxine

NVIDIA Maxine includes three AI SDKs covering video effects, audio effects and augmented reality — each with pre-trained deep learning models, so developers can quickly build or enhance their real-time applications.

Starting with the NVIDIA Video Effects SDK, enterprises can now apply AI effects to improve video quality without special cameras or other hardware. Features include super-resolution, which generates 720p live video from 360p input, and artifact reduction, which removes defects for crisper pictures.

Video noise removal eliminates low-light camera noise introduced in the video capture process while preserving all of the details. To hide messy rooms or other visual distractions, the Video Effects SDK removes the background of a webcam feed in real time, so only a user’s face and body show up in a livestream.

The NVIDIA Augmented Reality SDK enables real-time 3D face tracking using a standard web camera, delivering a more engaging virtual communication experience by automatically zooming into the face and keeping that face within view of the camera.

It’s now possible to detect human faces in images or video feeds, track the movement of facial expressions, create a 3D mesh representation of a person’s face, use video to track the movement of a human body in 3D space, simulate eye contact through gaze estimation and much more.

The NVIDIA Audio Effects SDK uses AI to remove distracting background noise from incoming and outgoing audio feeds, improving the clarity and quality of any conversation.

This includes the removal of unwanted background noises — like a dog barking or baby crying — to make conversations easier to understand. For meetings in large spaces, it’s also possible to remove room echoes from the background to make voices clearer.

Developers can add Maxine AI effects into their existing applications or develop new pipelines from scratch using NVIDIA DeepStream, an SDK for building intelligent video analytics, and NVIDIA Video Codec, an SDK for accelerated video encode and decode on Windows and Linux.

Maxine can also be used with NVIDIA Jarvis, a framework for building conversational AI applications, to offer world-class language-based capabilities such as transcription and translation.

Availability

Get started with NVIDIA Maxine.

And don’t let the curtain close on the opportunity to learn more about NVIDIA Maxine during GTC, running April 12-16. Registration is free.

A full list of Maxine-focused sessions can be found here. Be sure to watch Huang’s keynote address on-demand. And check out a demo (below) of Maxine.

The post NVIDIA Maxine Hits the Scene to Create Real-Time Video Experiences appeared first on The Official NVIDIA Blog.


Fast Track to Enterprise AI: New NVIDIA Workflow Lets Any User Choose, Adapt, Deploy Models Easily

AI is the most powerful new technology of our time, but it’s been a force that’s hard to harness for many enterprises — until now.

Many companies lack the specialized skills, access to large datasets or accelerated computing that deep learning requires. Others are realizing the benefits of AI and want to spread them quickly across more products and services.

For both, there’s a new roadmap to enterprise AI. It leverages technology that’s readily available, then simplifies the AI workflow with NVIDIA TAO and NVIDIA Fleet Command to make the trip shorter and less costly.

Grab and Go AI Models

The journey begins with pre-trained models. You don’t have to design and train a neural network from scratch in 2021. You can choose one of many available today in our NGC catalog.

We’ve curated models that deliver skills to advance your business. They span the spectrum of AI jobs from computer vision and conversational AI to natural-language understanding and more.

Models Show Their AI Resumes

So users know what they’re getting, many models in the catalog come with credentials. They’re like the resume for a prospective hire.

Model credentials show you the domain the model was trained for, the dataset that trained it, how often the model was deployed and how it’s expected to perform. They provide transparency and confidence you’re picking the right model for your use case.

Leveraging a Massive Investment

NVIDIA invested hundreds of millions of GPU compute hours over more than five years refining these models. We did this work so you don’t have to.

Here are three quick examples of the R&D you can leverage:

For computer vision, we devoted 3,700 person-years to labeling 500 million objects from 45 million frames. We used voice recordings to train our speech models on GPUs for more than a million hours. A database of biomedical papers packing 6.1 billion words educated our models for natural-language processing.

Transfer Learning, Your AI Tailor

Once you choose a model, you can fine-tune it to fit your specific needs using NVIDIA TAO, the next stage of our expedited workflow for enterprise AI.

TAO enables transfer learning, a process that harvests features from an existing neural network and plants them in a new one using NVIDIA’s Transfer Learning Toolkit, an integrated part of TAO. It leverages small datasets users have on hand to give models a custom fit without the cost, time and massive datasets required to build and train a neural network from scratch.
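
One common form of transfer learning keeps the pre-trained network frozen and trains only a small new “head” on the user’s data. The sketch below shows that pattern in plain Python under heavy simplifications: `pretrained_features` is a hypothetical stand-in for a frozen backbone, the dataset is made up, and none of this is the Transfer Learning Toolkit’s API.

```python
import math


def pretrained_features(x):
    """Stand-in for a frozen, pre-trained network. Its parameters are
    never updated below. (Hypothetical; a real backbone is a deep net.)"""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(samples, labels, epochs=300, lr=0.1):
    """Train only a logistic-regression head on top of the frozen features,
    using plain stochastic gradient descent on the log loss."""
    feats = [pretrained_features(x) for x in samples]  # computed once, reused
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(w[0] * f[0] + w[1] * f[1] + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            b -= lr * err
    return w, b


def predict(x, w, b):
    f = pretrained_features(x)
    return 1 if sigmoid(w[0] * f[0] + w[1] * f[1] + b) >= 0.5 else 0


# Tiny made-up dataset: the label is 1 when the first input exceeds the second
samples = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
labels = [1, 0, 1, 0]
w, b = fine_tune_head(samples, labels)
```

Only three numbers are learned here, which is why four examples are enough; the reused features do the heavy lifting, and that is the economy transfer learning aims for.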

Sometimes companies have an opportunity to further enhance models by training them across larger, more diverse datasets maintained by partners outside the walls of their data center.

TAO Lets Partners Collaborate with Privacy 

Federated learning, another part of TAO, lets different sites securely collaborate to refine a model for the highest accuracy. With this technique, users share components of models such as their partial weights. Datasets remain inside each company’s data center so data privacy is preserved.
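
A widely used way to combine what the sites share is federated averaging, in which a coordinator averages each site’s weights in proportion to how much data that site trained on. The sketch below assumes that scheme for illustration; the source does not say which aggregation method TAO uses, and the weights, site sizes and function name are made up.

```python
def federated_average(site_weights, site_sizes):
    """Combine per-site model weights into one global model, weighting
    each site by its number of training samples (federated averaging)."""
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Made-up weights from three hypothetical sites. Only these numbers are
# shared with the coordinator; each site's raw data stays where it is.
site_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 100, 200]
global_weights = federated_average(site_weights, site_sizes)
```

In a full system this step repeats over many rounds: the averaged model goes back to every site, each site trains further on its own data, and the coordinator averages again.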

In one recent example, 20 research sites collaborated to raise the accuracy of the so-called EXAM model that predicts whether a patient has COVID-19. After applying federated learning, the model also could predict the severity of the infection and whether the patient would need supplemental oxygen. Patient data stayed safely behind the walls of each partner.

Taking Enterprise AI to Production

Once a model is fine-tuned, it needs to be optimized for deployment.

It’s a pruning process that makes models lean, yet robust, so they function efficiently on your target platform whether it’s an array of GPUs in a server or a Jetson-powered robot on the factory floor.
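
The source does not say which pruning method TAO applies. As one simple illustration of the idea, magnitude pruning zeroes out the weights with the smallest absolute values, on the assumption that they contribute least to the output. The function name and sample weights below are made up.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest
    absolute values and leave the rest untouched."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # how many weights to drop
    # Indices ordered by |weight|, smallest first; the first k get pruned
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Made-up layer weights; prune half of them
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02]
pruned = magnitude_prune(weights, 0.5)
```

Zeroed weights can be skipped or stored sparsely, which is where the size and speed savings come from. Real workflows usually retrain briefly after pruning to recover any lost accuracy.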

NVIDIA TensorRT, another part of TAO, dials a model’s mathematical coordinates to an optimal balance of the smallest size with the highest accuracy for the system it will run on. It’s a crucial step, especially for real-time services like speech recognition or fraud detection that won’t tolerate system latency.

Then, with the Triton Inference Server, users can select the optimal configuration to deploy, whatever the model’s architecture, the framework it uses or target CPU or GPU it will run on.

Once a model is optimized and ready for deployment, users can easily integrate it with whatever application framework fits their use case or industry. For example, it could be Jarvis for conversational AI, Clara for healthcare, Metropolis for video analytics or Isaac for robotics to name just a few that NVIDIA provides.

[Image: NGC, TAO and Fleet Command workflow. Pre-trained models in NGC, along with TAO and Fleet Command, form a simple but powerful AI workflow.]

With the chosen application framework, users can launch NVIDIA Fleet Command to deploy and manage the AI application across a variety of GPU-powered devices. It’s the last key step in the journey.

Zero to AI in Minutes

Fleet Command connects NVIDIA-Certified servers deployed at the network’s edge to the cloud. With it, users can work from a browser to securely pair, orchestrate and manage millions of servers, deploy AI to any remote location and update software as needed.

Administrators monitor health and update systems with one click to simplify AI operations at scale.

Fleet Command uses end-to-end security protocols to ensure application data and intellectual property remain safe.

Data is sent between the edge and the cloud, fully encrypted, ensuring it’s protected. And applications are scanned for malware and vulnerabilities before they are deployed.

An AI Workflow That’s on the Job

Fleet Command and elements of TAO are already in use in warehouses, in retail, in hospitals and on the factory floor. Users include companies such as Accenture, BMW and Siemens Digital Industries.

A demo (below) from the GTC keynote shows how the one-two-three combination of NGC models, TAO and Fleet Command can quickly tailor and deploy an application using multiple AI models.

You can sign up for Fleet Command today.

Core parts of TAO, such as the Transfer Learning Toolkit and federated learning, are available today. Apply now for early access to them all, fully integrated into TAO.

The post Fast Track to Enterprise AI: New NVIDIA Workflow Lets Any User Choose, Adapt, Deploy Models Easily appeared first on The Official NVIDIA Blog.
