How Virtual Factories Are Making Industrial Digitalization a Reality

To address the shift to electric vehicles, increased semiconductor demand, manufacturing onshoring, and ambitions for greater sustainability, manufacturers are investing in new factory developments and re-engineering their existing facilities.

These projects often run over budget and schedule, due to complex and manual planning processes, legacy technology infrastructure, and disconnected tools, data and teams.

To address these challenges, manufacturers are embracing digitalization and virtual factories, powered by technologies like digital twins, the Universal Scene Description (OpenUSD) ecosystem and generative AI, that enable new possibilities from planning to operations.

What Is a Virtual Factory?

A virtual factory is a physically accurate representation of a real factory. These digital twins of factories allow manufacturers to model, simulate, analyze and optimize their production processes, resources and operations without the need for a physical prototype or pilot plant.

Benefits of Virtual Factories

Virtual factories unlock many benefits and possibilities for manufacturers, including:

  • Streamlined Communication: Instead of teams relying on in-person meetings and static planning documents for project alignment, virtual factories streamline communication and ensure that critical design and operations decisions are informed by the most current data.
  • Contextualized Planning: During facility design, construction and commissioning, virtual factories allow project stakeholders to visualize designs in the context of the entire facility and production process. Planning and operations teams can compare and verify built structures with the virtual designs in real time and decrease costs by identifying errors and incorporating feedback early in the review process.
  • Optimized Facility Designs: Connecting virtual factories to simulations of processes and discrete events enables teams to optimize facility designs for production and material flow, ergonomic work design, safety and overall utilization.
  • Intelligent and Optimized Operations: Operations teams can integrate their virtual factories with valuable production data from Internet of Things technology at the edge, and tap AI to drive further optimizations.

Virtual Factories: A Testing Ground for AI and Robotics

Robotics developers are increasingly using virtual factories to train and test AI and autonomous systems that run in physical factories. For example, virtual factories can enable developers and manufacturing teams to simulate digital workers and autonomous mobile robots (AMRs), vision AI agents and sensors to create a centralized map of worker activity throughout a facility. By fusing data from simulated camera streams with multi-camera tracking, developers can generate occupancy maps that inform optimal AMR routes.
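As a rough illustration of how an occupancy map can inform AMR routing, the sketch below bins tracked worker positions into a grid and runs a breadth-first search over the free cells. The grid size, the detection coordinates and the function names are all hypothetical; a production system would fuse real multi-camera tracks and use a far richer planner.

```python
from collections import deque

def build_occupancy(width, height, detections, threshold=1):
    """Count tracked positions per grid cell; cells with at least
    `threshold` detections are treated as occupied."""
    counts = [[0] * width for _ in range(height)]
    for x, y in detections:
        if 0 <= x < width and 0 <= y < height:
            counts[y][x] += 1
    return [[c >= threshold for c in row] for row in counts]

def shortest_route(occupied, start, goal):
    """Breadth-first search over free, 4-connected cells. Returns the list
    of (x, y) cells from start to goal, or None if no route exists."""
    height, width = len(occupied), len(occupied[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and not occupied[ny][nx] and (nx, ny) not in parents):
                parents[(nx, ny)] = cell
                queue.append((nx, ny))
    return None

# Hypothetical fused detections: grid cells where workers were tracked.
detections = [(2, 0), (2, 1), (2, 2), (2, 3)]
grid = build_occupancy(5, 5, detections)
route = shortest_route(grid, (0, 0), (4, 0))
```

Here the busy column at x = 2 forces the route around through the only free cell in that column, (2, 4), rather than straight across the top row.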

Developers can also use these physically accurate virtual factories to train and test AI agents capable of managing their robot fleets, to ensure AI-enabled robots can adapt to real-world unpredictability and to identify streamlined configurations for human-robot collaboration.

What Are the Foundations of a Virtual Factory?

Building large-scale, physically accurate virtual factories that unlock these transformational possibilities requires bringing together many tools, data formats and technologies to harmonize the representation of real-world aspects in the digital world.

Originally invented by Pixar Animation Studios, OpenUSD encompasses a collection of tools and capabilities that enable the data interoperability developers and manufacturers require to achieve their digitalization goals.

OpenUSD’s core superpower is flexible data modeling. 3D input can be accepted from source applications and combined with a variety of data, including from computer-aided design software, live sensors, documentation and maintenance records, through a unified data pipeline. OpenUSD enables developers to share these data types across different simulation tools and AI models, providing insights for all stakeholders. Data can be synced from the factory floor to the digital twin, surfacing real-time insights for factory managers and teams.

By developing virtual factory solutions on OpenUSD, developers can enhance collaboration for factory teams, allowing them to review plans, discuss optimization opportunities and make decisions in real time.

To support and accelerate the development of the OpenUSD ecosystem, Pixar, Adobe, Apple, Autodesk and NVIDIA formed the Alliance for OpenUSD, which is building open standards for USD in core specification, materials, geometry and more.

Industrial Use Cases for Virtual Factories

To unlock the potential of virtual factories, industry leaders including Autodesk, Continental, Pegatron, Rockwell Automation, Siemens and Wistron are developing virtual-factory solutions on OpenUSD and NVIDIA Omniverse, a platform of application programming interfaces (APIs) and software development kits that enable developers to build applications for complex 3D and industrial digitalization workflows based on OpenUSD.

FlexSim, an Autodesk company, uses OpenUSD to enable factory teams to analyze, visualize and optimize real-world processes with its simulation modeling for complex systems and operations. The discrete-event simulation software provides an intuitive drag-and-drop interface to create 3D simulation models, account for real-world variability, run “what-if” scenarios and perform in-depth analyses.
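To give a feel for the kind of "what-if" question such a simulation answers, here is a toy event-driven queueing sketch. It is not FlexSim's API or model; the function name, the exponential cycle-time assumption and all parameter values are illustrative only.

```python
import heapq
import random

def simulate_line(num_machines, num_parts, arrival_gap, mean_service, seed=0):
    """Parts arrive every `arrival_gap` minutes and are served first-come,
    first-served by whichever machine frees up earliest.
    Returns (makespan, average wait), both in minutes."""
    rng = random.Random(seed)
    # Min-heap of the times at which each machine next becomes free.
    free_at = [0.0] * num_machines
    heapq.heapify(free_at)
    total_wait = 0.0
    makespan = 0.0
    for i in range(num_parts):
        arrival = i * arrival_gap
        # Assumed variability: exponentially distributed cycle times.
        service = rng.expovariate(1.0 / mean_service)
        machine_free = heapq.heappop(free_at)
        start = max(arrival, machine_free)
        finish = start + service
        heapq.heappush(free_at, finish)
        total_wait += start - arrival
        makespan = max(makespan, finish)
    return makespan, total_wait / num_parts

# What-if: does adding a second machine relieve the bottleneck?
baseline = simulate_line(num_machines=1, num_parts=200, arrival_gap=1.0, mean_service=1.5)
what_if = simulate_line(num_machines=2, num_parts=200, arrival_gap=1.0, mean_service=1.5)
```

Because both runs use the same seed, they see the same sequence of cycle times, so any difference in makespan or waiting time comes from the layout change alone.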

Developers at Continental, a leading German automotive technology company, developed ContiVerse, a factory planning and manufacturing operations application on OpenUSD and NVIDIA Omniverse. The application helps Continental optimize factory layouts and plan production processes collaboratively, leading to an expected 13% reduction in time to market. 

Partnering with software company SoftServe, Continental also developed Industrial Co-Pilot, which combines AI-driven insights with immersive visualization to deliver real-time guidance and predictive analytics to engineers. This is expected to reduce maintenance effort and downtime by 10%.

Pegatron, one of the world’s largest manufacturers of smartphones and consumer electronics, is developing virtual-factory solutions on OpenUSD to accelerate the development of new factories — as well as to minimize change orders, optimize operations and maximize production-line throughput in existing facilities.

Rockwell Automation is integrating NVIDIA Omniverse Cloud APIs and OpenUSD with its Emulate3D digital twin software to bring manufacturing teams data interoperability, live collaboration and physically based visualization for designing, building and operating industrial-scale digital twins of production systems.

Siemens, a leading technology company for automation, digitalization and sustainability and a member of the Alliance for OpenUSD, is adopting Omniverse Cloud APIs within its Siemens Xcelerator Platform, starting with Teamcenter X, the industry-leading cloud-based product lifecycle management software. This will help teams design, build and test next-generation products, manufacturing processes and factories virtually, before they’re built in the physical world.

Wistron, a leading global technology service provider and electronics manufacturer, is digitalizing new and existing factories with OpenUSD. By developing virtual-factory solutions on NVIDIA Omniverse, Wistron enables its factory teams to collaborate remotely to refine layout configurations, optimize surface mount technology and in-circuit testing lines, and transform product-on-dock testing. 

With these solutions, Wistron has achieved a 51% boost in worker efficiency and 50% reduction in production process times. Layout optimization and real-time monitoring have decreased defect rates by 40%. And construction time on Wistron’s new NVIDIA DGX factory was cut in half, from about five months to just two and a half months.

Learn more at the Virtual Factory Use Case page, where a reference architecture provides an overview of components and capabilities developers should consider when developing virtual-factory solutions.

Get started with NVIDIA Omniverse by downloading the standard license for free, accessing OpenUSD resources and learning how Omniverse Enterprise can connect your team. Stay up to date on Instagram, Medium and X. For more, join the Omniverse community on the forums, Discord server, Twitch and YouTube channels.

NVIDIA to Acquire GPU Orchestration Software Provider Run:ai

To help customers make more efficient use of their AI computing resources, NVIDIA today announced it has entered into a definitive agreement to acquire Run:ai, a Kubernetes-based workload management and orchestration software provider.

Customer AI deployments are becoming increasingly complex, with workloads distributed across cloud, edge and on-premises data center infrastructure.

Managing and orchestrating generative AI, recommender systems, search engines and other workloads requires sophisticated scheduling to optimize performance at the system level and on the underlying infrastructure.

Run:ai enables enterprise customers to manage and optimize their compute infrastructure, whether on premises, in the cloud or in hybrid environments.

The company has built an open platform on Kubernetes, the orchestration layer for modern AI and cloud infrastructure. It supports all popular Kubernetes variants and integrates with third-party AI tools and frameworks.

Run:ai customers include some of the world’s largest enterprises across multiple industries, which use the Run:ai platform to manage data-center-scale GPU clusters.

“Run:ai has been a close collaborator with NVIDIA since 2020 and we share a passion for helping our customers make the most of their infrastructure,” said Omri Geller, Run:ai cofounder and CEO. “We’re thrilled to join NVIDIA and look forward to continuing our journey together.”

The Run:ai platform provides AI developers and their teams:

  • A centralized interface to manage shared compute infrastructure, enabling easier and faster access for complex AI workloads.
  • Functionality to add users, curate them under teams, provide access to cluster resources, control quotas, priorities and pools, and monitor and report on resource use.
  • The ability to pool GPUs and share computing power — from fractions of GPUs to multiple GPUs or multiple nodes of GPUs running on different clusters — for separate tasks.
  • Efficient GPU cluster resource utilization, enabling customers to gain more from their compute investments.
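The idea of sharing fractions of GPUs across tasks can be sketched with a simple first-fit placement. This is not Run:ai's scheduler or API; the function, the job names and the fractions are hypothetical, and the sketch ignores quotas, priorities and multi-GPU jobs.

```python
def allocate(requests, num_gpus):
    """First-fit placement of fractional GPU requests onto a shared pool.
    `requests` maps a job name to the GPU fraction it needs (0 < f <= 1).
    Returns {job: gpu_index}; jobs that do not fit are left unplaced."""
    free = [1.0] * num_gpus
    placement = {}
    for job, fraction in requests.items():
        for gpu, capacity in enumerate(free):
            # Small tolerance guards against floating-point round-off.
            if fraction <= capacity + 1e-9:
                free[gpu] = capacity - fraction
                placement[job] = gpu
                break
    return placement

# Hypothetical workload on a two-GPU pool.
requests = {
    "notebook-a": 0.25,
    "notebook-b": 0.25,
    "inference": 0.5,
    "training": 1.0,
    "eval": 0.5,
}
placement = allocate(requests, num_gpus=2)
```

In this example the two notebooks and the inference job share one GPU, training takes the other, and the evaluation job stays queued until capacity frees up.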

NVIDIA will continue to offer Run:ai’s products under the same business model for the immediate future. And NVIDIA will continue to invest in the Run:ai product roadmap as part of NVIDIA DGX Cloud, an AI platform co-engineered with leading clouds for enterprise developers, offering an integrated, full-stack service optimized for generative AI.

NVIDIA DGX and DGX Cloud customers will gain access to Run:ai’s capabilities for their AI workloads, particularly for large language model deployments. Run:ai’s solutions are already integrated with NVIDIA DGX, NVIDIA DGX SuperPOD, NVIDIA Base Command, NGC containers, and NVIDIA AI Enterprise software, among other products.

NVIDIA’s accelerated computing platform and Run:ai’s platform will continue to support a broad ecosystem of third-party solutions, giving customers choice and flexibility.

Together with Run:ai, NVIDIA will enable customers to have a single fabric that accesses GPU solutions anywhere. Customers can expect to benefit from better GPU utilization, improved management of GPU infrastructure and greater flexibility from the open architecture.

Forecasting the Future: AI2’s Christopher Bretherton Discusses Using Machine Learning for Climate Modeling

Can machine learning help predict extreme weather events and climate change? Christopher Bretherton, senior director of climate modeling at the Allen Institute for Artificial Intelligence, or AI2, explores the technology’s potential to enhance climate modeling with AI Podcast host Noah Kravitz in an episode recorded live at the NVIDIA GTC global AI conference. Bretherton explains how machine learning helps overcome the limitations of traditional climate models and underscores the role of localized predictions in empowering communities to prepare for climate-related risks. Through ongoing research and collaboration, Bretherton and his team aim to improve climate modeling and enable society to better mitigate and adapt to the impacts of climate change.

Stay tuned for more episodes recorded live from GTC, and watch the replay of Bretherton’s GTC session on using machine learning for climate modeling.

Time Stamps

2:03: What is climate modeling and how can it prepare us for climate change?

5:28: How can machine learning help enhance climate modeling?

7:21: What were the limitations of traditional climate models?

10:24: How does a climate model work?

12:11: What information can you get from a climate model?

13:26: What are the current climate models telling us about the future?

15:56: How does machine learning help enable localized climate modeling?

18:39: What, if anything, can individuals or small communities do to prepare for what climate change has in store for us?

25:59: How do you measure the accuracy or performance of an emulator that’s doing something like climate modeling out into the future?

You Might Also Like…

ITIF’s Daniel Castro on Energy-Efficient AI and Climate Change – Ep. 215

AI-driven change is in the air, as are concerns about the technology’s environmental impact. In this episode of NVIDIA’s AI Podcast, Daniel Castro, vice president of the Information Technology and Innovation Foundation and director of its Center for Data Innovation, speaks with host Noah Kravitz about the motivation behind his AI energy use report, which addresses misconceptions about the technology’s energy consumption.

DigitalPath’s Ethan Higgins on Using AI to Fight Wildfires – Ep. 211

DigitalPath is igniting change in the Golden State, using computer vision, generative adversarial networks and a network of thousands of cameras to detect signs of fire in real time. In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with DigitalPath system architect Ethan Higgins about the company’s role in the ALERTCalifornia initiative, a collaboration between California’s wildfire fighting agency CAL FIRE and the University of California, San Diego.

Anima Anandkumar on Using Generative AI to Tackle Global Challenges – Ep. 203

Generative AI-based models can not only learn and understand natural languages — they can learn the very language of nature itself, presenting new possibilities for scientific research. On the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Anandkumar on generative AI’s potential to make splashes in the scientific community.

How Alex Fielding and Privateer Space Are Taking on Space Debris – Ep. 196

In this episode of the NVIDIA AI Podcast, host Noah Kravitz dives into an illuminating conversation with Alex Fielding, co-founder and CEO of Privateer Space. Privateer Space, Fielding’s latest venture, aims to address one of the most daunting challenges facing our world today: space debris.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Podcasts, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

Rays Up: Decoding AI-Powered DLSS 3.5 Ray Reconstruction

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and which showcases new hardware, software, tools and accelerations for RTX PC users.

AI continues to raise the bar for PC gaming.

DLSS 3.5 with Ray Reconstruction creates higher quality ray-traced images for intensive ray-traced games and apps. This advanced AI-powered neural renderer is a groundbreaking feature that elevates ray-traced image quality for all GeForce RTX GPUs, outclassing traditional hand-tuned denoisers by using an AI network trained by an NVIDIA supercomputer. The result improves lighting effects like reflections, global illumination, and shadows to create a more immersive, realistic gaming experience.

A Ray of Light

Ray tracing is a rendering technique that can realistically simulate the lighting of a scene and its objects by rendering physically accurate reflections, refractions, shadows and indirect lighting. Ray tracing generates computer graphics images by tracing the path of light from the view camera — which determines the view into the scene — through the 2D viewing plane, out into the 3D scene, and back to the light sources. For instance, if rays strike a mirror, reflections are generated.

A visualization of how ray tracing works.

It’s the digital equivalent to real-world objects illuminated by beams of light and the path of the light being followed from the eye of the viewer to the objects that light interacts with. That’s ray tracing.
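A minimal sketch of that idea: build a ray from the camera through a pixel on the viewing plane, test it against a sphere, and compute a mirror reflection at the hit point. The camera placement, the single-sphere scene and the function names are assumptions made for illustration; real renderers trace many bounces against complex geometry.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def primary_ray(px, py, width, height):
    """Direction of the ray from a camera at the origin through pixel
    (px, py) on a viewing plane at z = -1 spanning [-1, 1] in x and y."""
    x = (px + 0.5) / width * 2.0 - 1.0
    y = 1.0 - (py + 0.5) / height * 2.0
    return normalize((x, y, -1.0))

def hit_sphere(origin, direction, center, radius):
    """Distance along the ray to the nearest intersection in front of the
    origin, or None on a miss. `direction` must be unit length."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0 else None

def reflect(direction, normal):
    """Mirror reflection of `direction` about a unit surface normal."""
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))

# Hypothetical scene: one mirror sphere in front of the camera.
origin, center, radius = (0.0, 0.0, 0.0), (0.0, 0.0, -3.0), 1.0
direction = primary_ray(1, 1, 3, 3)            # center pixel of a 3x3 image
t = hit_sphere(origin, direction, center, radius)
hit_point = tuple(o + t * d for o, d in zip(origin, direction))
normal = tuple((h - c) / radius for h, c in zip(hit_point, center))
bounce = reflect(direction, normal)            # ray sent back toward the camera
corner_miss = hit_sphere(origin, primary_ray(0, 0, 3, 3), center, radius)
```

The center ray strikes the sphere head-on and reflects straight back, while the corner ray misses the sphere entirely.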

Simulating light in this manner, shooting rays for every pixel on the screen, is computationally intensive, even for offline renderers that calculate scenes over the course of several minutes or hours. Instead, ray sampling fires a handful of rays at various points across the scene for a representative sample of the scene’s lighting, reflectivity and shadowing.

However, there are limitations. The output is a noisy, speckled image with gaps, good enough only to ascertain how the scene should look when ray traced. To fill in the missing pixels that weren’t ray traced, hand-tuned denoisers use two different methods: temporally accumulating pixels across multiple frames, and spatially interpolating them to blend neighboring pixels together. Through this process, the noisy raw output is converted into a ray-traced image.
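The two classic denoising steps can be sketched in a few lines. This is a generic illustration, not the denoiser of any particular game or of DLSS; the blend factor, the 3x3 window and the function names are assumptions.

```python
def temporal_accumulate(history, frame, alpha=0.2):
    """Blend the new noisy frame into the running history (exponential
    moving average). Pixels with no ray sample (None) keep their history."""
    out = []
    for h_row, f_row in zip(history, frame):
        out.append([h if f is None else (1 - alpha) * h + alpha * f
                    for h, f in zip(h_row, f_row)])
    return out

def spatial_filter(image):
    """3x3 box filter: each pixel becomes the mean of itself and its
    neighbors, with the window cropped at the image borders."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbors = [image[ny][nx]
                         for ny in range(max(0, y - 1), min(height, y + 2))
                         for nx in range(max(0, x - 1), min(width, x + 2))]
            out[y][x] = sum(neighbors) / len(neighbors)
    return out

# Tiny 2x2 example: only two pixels received a ray sample this frame.
history = [[0.0, 0.0], [0.0, 0.0]]
frame = [[1.0, None], [None, 1.0]]
accumulated = temporal_accumulate(history, frame, alpha=0.5)
smoothed = spatial_filter(accumulated)
```

The temporal step fills in information over time, and the spatial step spreads it to unsampled neighbors. Both blur detail when tuned aggressively, which is the trade-off hand-tuned denoisers have to manage.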

This adds complexity and cost to the development process, and reduces the frame rate in highly ray-traced games where multiple denoisers operate simultaneously for different lighting effects.

DLSS 3.5 Ray Reconstruction introduces an NVIDIA supercomputer-trained, AI-powered neural network that generates higher-quality pixels in between the sampled rays. It recognizes different ray-traced effects to make smarter decisions about using temporal and spatial data, and retains high-frequency information for superior-quality upscaling. It also recognizes lighting patterns from its training data, such as those of global illumination or ambient occlusion, and recreates them in-game.

Portal with RTX is a great example of Ray Reconstruction in action. With DLSS OFF, the denoiser struggles to reconstruct the dynamic shadowing alongside the moving fan.

With DLSS 3.5 and Ray Reconstruction enabled, the AI-trained denoiser recognizes certain patterns associated with shadows and keeps the image stable, accumulating accurate pixels while blending neighboring pixels to generate high-quality reflections.

Deep Learning, Deep Gaming

Ray Reconstruction is just one of the AI graphics breakthroughs that multiply performance in DLSS. Super Resolution, the cornerstone of DLSS, samples multiple lower resolution images and uses motion data and feedback from prior frames to reconstruct native-quality images. The result is high image quality without sacrificing game performance.

DLSS 3 introduced Frame Generation, which boosts performance by using AI to analyze data from surrounding frames to predict what the next generated frame should look like. These generated frames are then inserted in between rendered frames. Combining the DLSS-generated frames with DLSS Super Resolution enables DLSS 3 to reconstruct seven-eighths of the displayed pixels with AI, boosting frame rates by up to 4x compared to without DLSS.
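The seven-eighths figure can be reproduced with simple arithmetic under two assumptions that the text does not spell out: Super Resolution renders at half the output resolution on each axis, and Frame Generation inserts one generated frame per rendered frame.

```python
from fractions import Fraction

# Assumption: Super Resolution renders at half resolution per axis,
# so a rendered frame contains 1/4 of the displayed pixels.
rendered_pixel_share = Fraction(1, 2) * Fraction(1, 2)

# Assumption: Frame Generation inserts one AI-generated frame for every
# rendered frame, so half of the displayed frames are rendered.
rendered_frame_share = Fraction(1, 2)

traditionally_rendered = rendered_pixel_share * rendered_frame_share
ai_reconstructed = 1 - traditionally_rendered
```

Under those assumptions only one-eighth of the displayed pixels are rendered conventionally, leaving seven-eighths to be reconstructed by AI.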

Because DLSS Frame Generation is post-processed (applied after the main render) on the GPU, it can boost frame rates even when the game is bottlenecked by the CPU.

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

Small and Mighty: NVIDIA Accelerates Microsoft’s Open Phi-3 Mini Language Models

NVIDIA announced today its acceleration of Microsoft’s new Phi-3 Mini open language model with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference when running on NVIDIA GPUs from PC to cloud.

Phi-3 Mini packs the capability of 10x larger models and is licensed for both research and broad commercial usage, advancing Phi-2 from its research-only roots. Workstations with NVIDIA RTX GPUs or PCs with GeForce RTX GPUs have the performance to run the model locally using Windows DirectML or TensorRT-LLM.

The model has 3.8 billion parameters and was trained on 3.3 trillion tokens in only seven days on 512 NVIDIA H100 Tensor Core GPUs.

Phi-3 Mini has two variants, one supporting 4K tokens and the other supporting 128K tokens, making it the first model in its class to handle very long contexts. This allows developers to use 128,000 tokens (the atomic parts of language that the model processes) when asking the model a question, which results in more relevant responses from the model.

Developers can try Phi-3 Mini with the 128K context window at ai.nvidia.com, where it is packaged as an NVIDIA NIM, a microservice with a standard application programming interface that can be deployed anywhere.

Creating Efficiency for the Edge

Developers working on autonomous robotics and embedded devices can learn to create and deploy generative AI through community-driven tutorials, like on Jetson AI Lab, and deploy Phi-3 on NVIDIA Jetson.

With only 3.8 billion parameters, the Phi-3 Mini model is compact enough to run efficiently on edge devices. Parameters are like knobs, in memory, that have been precisely tuned during the model training process so that the model can respond with high accuracy to input prompts.
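A back-of-the-envelope estimate shows why the parameter count matters on memory-constrained devices. The sketch below covers weight storage only; activations, the attention cache and runtime overhead are excluded, and the list of storage formats is illustrative rather than a statement of which formats any given runtime supports for this model.

```python
PARAMETERS = 3.8e9  # parameter count quoted in the text

# Bytes per parameter for a few common storage formats (illustrative).
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "FP8": 1, "INT4": 0.5}

def weight_memory_gb(parameters, bytes_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return parameters * bytes_per_param / 1e9

footprint = {name: weight_memory_gb(PARAMETERS, size)
             for name, size in BYTES_PER_PARAM.items()}
```

At 16-bit precision the weights come to roughly 7.6 GB, and halving the bytes per parameter halves that figure.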

Phi-3 can assist in cost- and resource-constrained use cases, especially for simpler tasks. The model can outperform some larger models on key language benchmarks while delivering results within latency requirements.

TensorRT-LLM will support Phi-3 Mini’s long context window and uses many optimizations and kernels such as LongRoPE, FP8 and inflight batching, which improve inference throughput and latency. The TensorRT-LLM implementations will soon be available in the examples folder on GitHub. There, developers can convert to the TensorRT-LLM checkpoint format, which is optimized for inference and can be easily deployed with NVIDIA Triton Inference Server.

Developing Open Systems

NVIDIA is an active contributor to the open-source ecosystem and has released over 500 projects under open-source licenses.

Contributing to many external projects such as JAX, Kubernetes, OpenUSD, PyTorch and the Linux kernel, NVIDIA supports a wide variety of open-source foundations and standards bodies as well.

Today’s news expands on long-standing NVIDIA collaborations with Microsoft, which have paved the way for innovations including accelerating DirectML, Azure cloud, generative AI research, and healthcare and life sciences.

Learn more about our recent collaboration.

Climate Tech Startups Integrate NVIDIA AI for Sustainability Applications

Whether they’re monitoring minuscule insects or delivering insights from satellites in space, NVIDIA-accelerated startups are making every day Earth Day.

Sustainable Futures, an initiative within the NVIDIA Inception program for cutting-edge startups, is supporting 750+ companies globally focused on agriculture, carbon capture, clean energy, climate and weather, environmental analysis, green computing, sustainable infrastructure and waste management.

This Earth Day, discover how five of these sustainability-focused startups are advancing their work with accelerated computing and the NVIDIA Earth-2 platform for climate tech.

Earth-2 features a suite of AI models that help simulate, visualize and deliver actionable insights about weather and climate.

Insect Farming Catches the AI Bug

Image courtesy of Bug Mars

Amid a changing climate, a key component of environmental resilience is food security: the ability to produce and provide enough food to meet the nutrition needs of all people. Edible insects, such as crickets and black soldier flies, are one solution that could reduce humans’ reliance on resource-intensive livestock farming for protein.

Bug Mars, a startup based in Ontario, Canada, supports insect protein production with AI tools that monitor variables including temperature, pests and number of insects — and predict issues and recommend actions based on that data. It can help insect farmers increase yield by 30%.

The company uses NVIDIA Jetson Orin Nano modules to accelerate its work, and recently announced it’s using synthetic data and digital twin technology to further advance its AI solutions for insect agriculture.

Seeing the Forest for the Trees

Based in Truckee, Calif., Vibrant Planet is modeling trillions of trees and other flammable vegetation such as shrublands and grasslands to help land managers, counties and fire districts across North America build wildfire and climate resilience.

NVIDIA hardware and software has helped Vibrant Planet develop transformer models for forest and ecosystem management and AI-enhanced operational planning.

Visualization courtesy of Vibrant Planet

The startup collects and analyzes data from lidar sensors, satellites and aircraft to train AI models that can map vegetation with high precision, estimate canopy height and detect characteristics of forest and vegetation areas such as carbon, water, biodiversity and built infrastructure. Customers can use this data to understand fire and drought hazards, and, with these insights, conduct scenario planning to forecast the effects of potential forest thinning, prescribed fire or other actions.

Delivering Tomorrow’s Forecast

Tomorrow.io, based in Boston, is a leading resilience platform that helps organizations adapt to increasing weather and climate volatility. Powered by next-generation space technology, advanced AI models and proprietary modeling capabilities, the startup enables businesses and governments to proactively mitigate risk, ensure operational resilience and drive critical decision-making.

Image courtesy of Tomorrow.io

The startup is developing weather forecasting AI and is launching its own satellites to collect environmental data to further train its models. It’s also conducting experiments using Earth-2 AI forecast models to determine the optimal configurations of satellites to improve weather-forecasting conditions.

One of Tomorrow.io’s projects is an initiative in Kenya with the Bill and Melinda Gates Foundation that provides daily alerts to 6 million farmers with insights around when to water their crops, when to spray pesticides, when to harvest or when to change crops altogether due to changes in the local climate. The team hopes to scale up their user base to 100 million farmers in Africa by 2030.

Winds of Change

Palo Alto, Calif.-based WindBorne Systems is developing weather sensing balloons equipped with WeatherMesh, a state-of-the-art AI model for real-time global weather forecasts.

Image courtesy of WindBorne Systems

WeatherMesh predicts factors including surface temperature, pressure, winds, precipitation and radiation. The model has set world records for accuracy and is lightweight enough to run on a gaming laptop, unlike traditional models that run on supercomputers.

WindBorne uses NVIDIA GPUs to develop its AI and is an early-access user of Earth-2. The company’s weather balloon development is funded in part by the National Oceanic and Atmospheric Administration’s Weather Program Office.

Taking the Temperature of Global Cities

FortyGuard, a startup founded in Abu Dhabi with headquarters in Miami, is developing a system to measure urban heat with AI models that present insights for public health officials, city planners, landscape architects and environmental engineers.

FortyGuard presented in the Expo Hall Theater at NVIDIA GTC.

The company — an early-access user of the Earth-2 platform — aims for its temperature AI models to provide a more granular view into urban heat dynamics, providing data that can help industries and governments shape cooler and more livable cities.

FortyGuard’s technology, offered via application programming interfaces, could integrate with existing enterprise platforms to enable use cases including temperature-based route navigation, predictive enhanced EV performance and property insights.

To learn more about the Sustainable Futures program, watch the “AI Nations and Sustainable Futures Day” session from NVIDIA GTC.

NVIDIA is a member of the U.S. Department of State’s Coalition for Climate Entrepreneurship, which aims to address the United Nations’ Sustainable Development Goals using emerging technologies. Learn more in the GTC session, “Global Strategies: Startups, Venture Capital, and Climate Change Solutions.”

Video at top courtesy of Vibrant Planet.

Wide Open: NVIDIA Accelerates Inference on Meta Llama 3   

NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).

The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.

Trained on NVIDIA AI

Meta engineers trained Llama 3 on a computer cluster packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with an NVIDIA Quantum-2 InfiniBand network. With support from NVIDIA, Meta tuned its network, software and model architectures for its flagship LLM.

To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.

Putting Llama 3 to Work

Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.

From a browser, developers can try Llama 3 at ai.nvidia.com. It’s packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.

Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that’s part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.

Taking Llama 3 to Devices and PCs

Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.

What’s more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.

Get Optimal Performance with Llama 3

Best practices in deploying an LLM for a chatbot involve a balance of low latency, good reading speed and optimal GPU use to reduce costs.

Such a service needs to deliver tokens, the rough equivalent of words to an LLM, at about twice a user’s reading speed, which works out to about 10 tokens/second per user.

Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second — enough to serve about 300 simultaneous users — in an initial test using the version of Llama 3 with 70 billion parameters.

That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
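The arithmetic behind these figures can be sketched in a few lines. This is only an illustration of the numbers quoted above, not a benchmark: the function name is hypothetical, and the eight-GPU total assumes throughput scales linearly across GPUs, as the article's figures imply.

```python
# Per-user delivery target quoted in the article (about 10 tokens/second).
TOKENS_PER_USER = 10


def concurrent_users(tokens_per_second: int, per_user_rate: int = TOKENS_PER_USER) -> int:
    """Return how many users an aggregate token throughput can serve at once.

    Hypothetical helper for illustration: it divides total throughput by the
    per-user target and rounds down.
    """
    return tokens_per_second // per_user_rate


# Single H200 running the 70-billion-parameter Llama 3 (figure from the article).
h200_tps = 3_000

# Eight-GPU HGX server, assuming linear scaling across GPUs.
hgx_tps = 8 * h200_tps

print(concurrent_users(h200_tps))  # 300
print(hgx_tps)                     # 24000
print(concurrent_users(hgx_tps))   # 2400
```

Real deployments would also need to account for latency targets and batching behavior, which this back-of-the-envelope calculation ignores.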

For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.

Advancing Community Models

An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.

Learn more about NVIDIA’s AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.

Up to No Good: ‘No Rest for the Wicked’ Early Access Launches on GeForce NOW

It’s time to get a little wicked. Members can now stream No Rest for the Wicked from the cloud.

It leads six new games joining the GeForce NOW library of more than 1,500 games.

Holy Moly

No Rest for the Wicked on GeForce NOW
There’s always another fight to be won.

No Rest for the Wicked is the highly anticipated action role-playing game from Moon Studios, developer of the Ori series, and publisher Private Division. Amid a plague-ridden world, step into the boots of a Cerim, a holy warrior on a desperate mission. The Great Pestilence has ravaged the land of Sacra, and a new king reigns. As a colonialist inquisition unfolds, engage in visceral combat, battle plague-infested creatures and uncover the secrets of the continent. Make the character you want with the game’s flexible soft-class system, explore a rich storyline, and prepare for intense boss battles as you build up the town of Sacrament.

Embark on a dark and perilous journey, where no rest awaits the wicked. Rise to the challenge and stream from GeForce RTX 4080 servers with a GeForce NOW Ultimate membership for the smoothest gameplay from the cloud. Be among the first to experience early access of the game, without having to wait for downloads.

Shiny New Games

Evil West on GeForce NOW
“Yippie ki-yay, evil doers!”

Become a Wild West superhero in Evil West, streaming on GeForce NOW this week and part of PC Game Pass. It’s one of six newly supported games this week:

  • Kill It With Fire 2 (New release on Steam, April 16)
  • The Crew Motorfest (New release on Steam, April 18)
  • No Rest for the Wicked (New release on Steam, April 18)
  • Evil West (Xbox, available on PC Game Pass)
  • Lightyear Frontier (Steam)
  • Tomb Raider I-III Remastered (Steam)

Riot Games shared in its 14.8 patch notes that it will soon add its Vanguard security software to League of Legends as part of the publisher’s commitment to remove scripters, bots and bot-leveled accounts from the game and make it more challenging for them to continue. Since Vanguard won’t support virtual machines when it’s added to League of Legends, the game will be put under maintenance and will no longer be playable on GeForce NOW once the 14.9 update goes live globally — currently planned for May 1, 2024. Members can continue to enjoy the game on GeForce NOW until then.

What are you planning to play this weekend? Let us know on X or in the comments below.

NVIDIA Honors Partners of the Year in Europe, Middle East, Africa

NVIDIA today recognized 18 partners in Europe, the Middle East and Africa for their achievements and commitment to driving AI adoption.

The recipients were honored at the annual EMEA Partner Day hosted by the NVIDIA Partner Network (NPN). The awards span seven categories that highlight the various ways partners work with NVIDIA to transform the region’s industries with AI.

“This year marks another milestone for NVIDIA and our partners across EMEA as we pioneer technological breakthroughs and unlock new business opportunities using NVIDIA’s full-stack platform,” said Dirk Barfuss, director of EMEA channel at NVIDIA. “These awards celebrate our partners’ dedication and expertise in delivering groundbreaking solutions that drive cost efficiencies, enhance productivity and inspire innovation.”

The 2024 NPN award winners for EMEA are:

Rising Star Awards

  • Vesper Technologies received the Rising Star Northern Europe award for its exceptional revenue growth and broad customer base deploying NVIDIA AI solutions in data centers. The company has demonstrated outstanding growth in recent years, augmenting the success of its existing business.
  • AMBER AI & Data Science Solutions GmbH received the Rising Star Central Europe award for its revenue growth of more than 100% across the complete portfolio of NVIDIA technologies. Through extensive collaboration with NVIDIA, the company has become a cornerstone of the NVIDIA partner landscape in Germany.
  • HIPER Global Enterprise Ltd. received the Rising Star Southern Europe & Middle East award for its excellence in serving its broad customer base with NVIDIA compute technologies. Last year, it supported one of the largest customer projects in the region, further accelerating its growth rate.

Star Performer Awards

  • Boston Limited received the Star Performer Northern Europe award for its consistent success in delivering full-stack implementations of NVIDIA technologies for customers across industries. Over the last year, the company achieved record revenue growth across its business areas.
  • DELTA Computer Products GmbH received the Star Performer Central Europe award for its outstanding sales achievements and strong customer relationships. With a massive technical knowledge base, the company has served as a trusted advisor for customers deploying NVIDIA technologies across industry, higher education and research.
  • COMMit DMCC received the Star Performer Southern Europe & Middle East award for its exceptional execution of strategic and complex solutions built on NVIDIA technologies, which led to record revenues for the United Arab Emirates-based company.

Distributor of the Year

  • PNY received the Distributor of the Year award for the third consecutive year, underscoring its consistent investment in technology training and commitment to providing NVIDIA accelerated computing platforms and software across markets.
  • TD Synnex received the Networking Distributor of the Year award for the second year in a row, highlighting its massive investments in NVIDIA’s portfolio of technologies, especially networking, and its dedication to delivering technical expertise to customers.

Go-to-Market Excellence 

  • Bynet Data Communications Ltd. received the Go-to-Market Excellence award for its collaboration with NVIDIA regional leads to devise and execute effective go-to-market strategies for the Israeli market. This included identifying key opportunities and creating localized marketing campaigns. Its efforts led to great success with the installation of NVIDIA DGX SuperPODs into several new industries in the region.
  • Vesper Technologies was Highly Commended in the Go-to-Market Excellence category for its fully integrated go-to-market strategy around the launch of the NVIDIA GH200 Grace Hopper Superchip. The company successfully deployed a results-driven marketing campaign, demonstrated a commitment to technical training and developed a pre-sales trial and evaluation platform.
  • M Computers s.r.o. was Highly Commended in the Go-to-Market Excellence category for its success and leadership in engaging AI customers in eastern Europe with NVIDIA technologies. The company’s marketing efforts, including speeches at AI events and social media campaigns, helped lead to the first NVIDIA DGX H100 and NVIDIA Grace CPU Superchip projects in the region.

Industry Innovation 

  • WPP received the Industry Innovation award for its innovative applications of AI and NVIDIA technology in the marketing and advertising sector. The company worked with NVIDIA to build a groundbreaking generative AI-powered content engine, built on the NVIDIA Omniverse platform, that enables the creation of brand-consistent content at scale.
  • Ascon Systems was Highly Commended in the Industry Innovation category for its cutting-edge Industrial Metaverse Portal, powered by NVIDIA Omniverse, that helped transform BMW Group’s manufacturing processes with real-time product control and enhanced visualization and interaction.
  • Gcore was Highly Commended in the Industry Innovation category for its creation of the first speech-to-text technology for Luxembourgish, using its fine-tuned LuxemBERT AI model. The technology integrates seamlessly into corporate systems and Luxembourgish messaging platforms, fostering the preservation of the traditionally spoken language, which lacked adequate tools for written communication.

Pioneer

  • Arrow Electronics – Intelligent Business Solutions received the Pioneer Award for its work promoting the NVIDIA IGX Orin platform for healthcare applications and building strategies to drive adoption of the technology. The company’s innovative approach and support led to the first integration of the NVIDIA IGX Orin Developer Kit with an NVIDIA RTX 6000 Ada Generation GPU for a robotic surgery platform.

Consulting Partner of the Year

  • SoftServe received the Consulting Partner of the Year award for its excellence in working with partners to drive the adoption of NVIDIA’s full-stack technologies, helping transform customers’ business with generative AI and NVIDIA Omniverse. Through its SoftServe University corporate learning hub, SoftServe trained its employees, customers and partners to expertly use NVIDIA technology.
  • Deloitte was Highly Commended in the Consulting Partner of the Year category for its focus on building sales and technical skills, efforts to deliver meaningful impact through projects and go-to-market strategy that helped drive enterprise-level AI transformation in the region.
  • Data Monsters was Highly Commended in the Consulting Partner of the Year category for its development of a virtual assistant with lifelike hearing, speech and animation capabilities using NVIDIA Avatar Cloud Engine and large language models.

Learn how to join NPN, or find a local NPN partner.

Seeing Beyond: Living Optics CEO Robin Wang on Democratizing Hyperspectral Imaging

Step into the realm of the unseen with Robin Wang, CEO of Living Optics. The startup cofounder discusses the power of hyperspectral imaging with AI Podcast host Noah Kravitz in an episode recorded live at the NVIDIA GTC global AI conference. Living Optics’ hyperspectral imaging camera, which can capture visual data across 96 colors, reveals details invisible to the human eye. Potential applications range from monitoring plant health to detecting cracks in bridges. The startup aims to empower users across industries to gain new insights from richer, more informative datasets fueled by hyperspectral imaging technology.

Living Optics is a member of the NVIDIA Inception program for cutting-edge startups.

Stay tuned for more episodes recorded live from GTC.

Time Stamps

1:05: What is hyperspectral imaging?

1:45: The Living Optics camera’s ability to capture 96 colors

3:36: Where is hyperspectral imaging being used, and why is it so important?

7:19: How are hyperspectral images represented and accessed by the user?

9:34: Other use cases of hyperspectral imaging

13:07: What’s unique about Living Optics’ hyperspectral imaging camera?

18:36: Breakthroughs, challenges during the technology’s development

23:27: What’s next for Living Optics and hyperspectral imaging?

You Might Also Like…

Dotlumen CEO Cornel Amariei on Assistive Technology for the Visually Impaired – Ep. 217

Dotlumen is illuminating a new technology to help people with visual impairments navigate the world. In this episode of NVIDIA’s AI Podcast, recorded live at the NVIDIA GTC global AI conference, host Noah Kravitz spoke with the Romanian startup’s founder and CEO, Cornel Amariei, about developing its flagship Dotlumen Glasses.

DigitalPath’s Ethan Higgins on Using AI to Fight Wildfires – Ep. 211

DigitalPath is igniting change in the Golden State, using computer vision, generative adversarial networks and a network of thousands of cameras to detect signs of fire in real time. In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with DigitalPath system architect Ethan Higgins about the company’s role in the ALERTCalifornia initiative, a collaboration between California’s wildfire fighting agency CAL FIRE and the University of California, San Diego.

MosaicML’s Naveen Rao on Making Custom LLMs More Accessible – Ep. 199

Startup MosaicML is on a mission to help the AI community enhance prediction accuracy, decrease costs, and save time by providing tools for easy training and deployment of large AI models. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with MosaicML CEO and co-founder Naveen Rao about how the company aims to democratize access to large language models.

Peter Ma on Using AI to Find Promising Signals for Alien Life – Ep. 191

In this episode of the NVIDIA AI Podcast, host Noah Kravitz interviews Peter Ma, an undergraduate student at the University of Toronto, about how he developed an AI algorithm that outperformed traditional methods in the search for extraterrestrial intelligence.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Podcasts, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.
