AI is set to transform the workforce — and the Georgia Institute of Technology’s new AI Makerspace is helping tens of thousands of students get ahead of the curve. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Arijit Raychowdhury, a professor and the Steve W. Chaddick School Chair of Electrical and Computer Engineering in Georgia Tech’s College of Engineering, about the supercomputer hub, which provides students with the computing resources to reinforce their coursework and gain hands-on experience with AI. Built in collaboration with NVIDIA, the AI Makerspace underscores Georgia Tech’s commitment to preparing students for an AI-driven future, while fostering collaboration with local schools and universities.
5:57: What computing resources are included in the AI Makerspace?
7:23: What is the aim of the AI Makerspace?
14:47: Georgia Tech’s AI-focused minor and coursework
19:25: Raychowdhury’s insight on the intersection of AI and higher education
23:33: How have industries and jobs already changed as a result of AI?
27:44: What can younger students do to prepare to get a spot in Georgia Tech’s engineering program?
You Might Also Like…
How Two Students Are Building Robots for Handling Household Chores – Ep. 224 Imagine having a robot that could help you clean up after a party — or fold heaps of laundry. Chengshu Eric Li and Josiah David Wong, two Stanford University Ph.D. students advised by renowned computer science professor Fei-Fei Li, are making that dream come true with BEHAVIOR-1K, a project that aims to enable robots to perform 1,000 household chores, including picking up fallen objects or cooking.
Artificial intelligence is now a household term. Responsible AI is hot on its heels. Julia Stoyanovich, associate professor of computer science and engineering at NYU and director of the university’s Center for Responsible AI, wants to make the terms “AI” and “responsible AI” synonymous, sharing her advocacy efforts and how people can help.
Replit aims to empower the next billion software creators. Replit CEO Amjad Masad wants to bridge the gap between ideas and software, a task simplified by advances in generative AI. The company’s suite of technologies helps make software creation accessible to all, even those with no coding experience.
Anant Agarwal, founder of edX and chief platform officer at 2U, shares his vision for the future of online education and the impact of AI in revolutionizing the learning experience, emphasizing the importance of accessibility and quality in education.
Businesses seeking to harness the power of AI need customized models tailored to their specific industry needs.
NVIDIA AI Foundry is a service that enables enterprises to use data, accelerated computing and software tools to create and deploy custom models that can supercharge their generative AI initiatives.
Just as TSMC manufactures chips designed by other companies, NVIDIA AI Foundry provides the infrastructure and tools for other companies to develop and customize AI models — using DGX Cloud, foundation models, NVIDIA NeMo software, NVIDIA expertise, as well as ecosystem tools and support.
The key difference is the product: TSMC produces physical semiconductor chips, while NVIDIA AI Foundry helps create custom models. Both enable innovation and connect to a vast ecosystem of tools and partners.
Enterprises can use AI Foundry to customize NVIDIA and open community models, including the new Llama 3.1 collection, as well as NVIDIA Nemotron, CodeGemma by Google DeepMind, CodeLlama, Gemma by Google DeepMind, Mistral, Mixtral, Phi-3, StarCoder2 and others.
Industry Pioneers Drive AI Innovation
Industry leaders Amdocs, Capital One, Getty Images, KT, Hyundai Motor Company, SAP, ServiceNow and Snowflake are among the first using NVIDIA AI Foundry. These pioneers are setting the stage for a new era of AI-driven innovation in enterprise software, technology, communications and media.
“Organizations deploying AI can gain a competitive edge with custom models that incorporate industry and business knowledge,” said Jeremy Barnes, vice president of AI Product at ServiceNow. “ServiceNow is using NVIDIA AI Foundry to fine-tune and deploy models that can integrate easily within customers’ existing workflows.”
The Pillars of NVIDIA AI Foundry
NVIDIA AI Foundry is supported by the key pillars of foundation models, enterprise software, accelerated computing, expert support and a broad partner ecosystem.
Its software includes AI foundation models from NVIDIA and the AI community as well as the complete NVIDIA NeMo software platform for fast-tracking model development.
The computing muscle of NVIDIA AI Foundry is NVIDIA DGX Cloud, a network of accelerated compute resources co-engineered with the world’s leading public clouds — Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure. With DGX Cloud, AI Foundry customers can develop and fine-tune custom generative AI applications with unprecedented ease and efficiency, and scale their AI initiatives as needed without significant upfront investments in hardware. This flexibility is crucial for businesses looking to stay agile in a rapidly changing market.
If an NVIDIA AI Foundry customer needs assistance, NVIDIA AI Enterprise experts are on hand to help. NVIDIA experts can walk customers through each of the steps required to build, fine-tune and deploy their models with proprietary data, ensuring the models tightly align with their business requirements.
NVIDIA AI Foundry customers have access to a global ecosystem of partners that can provide a full range of support. Accenture, Deloitte, Infosys and Wipro are among the NVIDIA partners that offer AI Foundry consulting services that encompass design, implementation and management of AI-driven digital transformation projects. Accenture is first to offer its own AI Foundry-based offering for custom model development, the Accenture AI Refinery framework.
Additionally, service delivery partners such as Data Monsters, Quantiphi, Slalom and SoftServe help enterprises navigate the complexities of integrating AI into their existing IT landscapes, ensuring that AI applications are scalable, secure and aligned with business objectives.
Customers can develop NVIDIA AI Foundry models for production using AIOps and MLOps platforms from NVIDIA partners, including Cleanlab, DataDog, Dataiku, Dataloop, DataRobot, Domino Data Lab, Fiddler AI, New Relic, Scale and Weights & Biases.
Customers can output their AI Foundry models as NVIDIA NIM inference microservices — which include the custom model, optimized engines and a standard API — to run on their preferred accelerated infrastructure.
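Because a NIM microservice exposes a standard, OpenAI-compatible API, a deployed custom model can be queried with familiar client code. Below is a minimal sketch, assuming a NIM container running locally; the base URL and model name are illustrative placeholders, not details from this announcement.

```python
# Minimal sketch: querying a locally deployed NIM microservice through its
# OpenAI-compatible API. Base URL and model name are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

response = client.chat.completions.create(
    model="my-custom-llama-3.1",  # hypothetical name of the customized model
    messages=[{"role": "user", "content": "Summarize our Q2 support tickets."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```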
Inferencing solutions like NVIDIA TensorRT-LLM deliver improved efficiency for Llama 3.1 models to minimize latency and maximize throughput. This enables enterprises to generate tokens faster while reducing the total cost of running the models in production. Enterprise-grade support and security are provided by the NVIDIA AI Enterprise software suite.
The broad range of deployment options includes NVIDIA-Certified Systems from global server manufacturing partners including Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro, as well as cloud instances from Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure.
Additionally, Together AI, a leading AI acceleration cloud, today announced it will enable its ecosystem of over 100,000 developers and enterprises to use its NVIDIA GPU-accelerated inference stack to deploy Llama 3.1 endpoints and other open models on DGX Cloud.
“Every enterprise running generative AI applications wants a faster user experience, with greater efficiency and lower cost,” said Vipul Ved Prakash, founder and CEO of Together AI. “Now, developers and enterprises using the Together Inference Engine can maximize performance, scalability and security on NVIDIA DGX Cloud.”
NVIDIA NeMo Speeds and Simplifies Custom Model Development
With NVIDIA NeMo integrated into AI Foundry, developers have at their fingertips the tools needed to curate data, customize foundation models and evaluate performance. NeMo technologies include:
NeMo Curator is a GPU-accelerated data-curation library that improves generative AI model performance by preparing large-scale, high-quality datasets for pretraining and fine-tuning.
NeMo Customizer is a high-performance, scalable microservice that simplifies fine-tuning and alignment of LLMs for domain-specific use cases.
NeMo Evaluator provides automatic assessment of generative AI models across academic and custom benchmarks on any accelerated cloud or data center.
NeMo Guardrails orchestrates dialog management and adds safeguards that keep applications built with large language models accurate, appropriate and secure (a minimal usage sketch follows this list).
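As one illustration of how these pieces are used in code, here is a minimal sketch of applying NeMo Guardrails from Python. It assumes a local configuration directory containing the usual config.yml and rail definitions; the directory path and user message are illustrative, not from this announcement.

```python
# Minimal NeMo Guardrails sketch: load a rails configuration and generate a guarded reply.
# The config directory and user message are hypothetical examples.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./guardrails_config")  # contains config.yml and rail definitions
rails = LLMRails(config)

reply = rails.generate(messages=[{"role": "user", "content": "How do I reset my password?"}])
print(reply["content"])
```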
Using the NeMo platform in NVIDIA AI Foundry, businesses can create custom AI models that are precisely tailored to their needs. This customization allows for better alignment with strategic objectives, improved accuracy in decision-making and enhanced operational efficiency. For instance, companies can develop models that understand industry-specific jargon, comply with regulatory requirements and integrate seamlessly with existing workflows.
“As a next step of our partnership, SAP plans to use NVIDIA’s NeMo platform to help businesses to accelerate AI-driven productivity powered by SAP Business AI,” said Philipp Herzig, chief AI officer at SAP.
Enterprises can deploy their custom AI models in production with NVIDIA NeMo Retriever NIM inference microservices. These help developers fetch proprietary data to generate knowledgeable responses for their AI applications with retrieval-augmented generation (RAG).
“Safe, trustworthy AI is a non-negotiable for enterprises harnessing generative AI, with retrieval accuracy directly impacting the relevance and quality of generated responses in RAG systems,” said Baris Gultekin, Head of AI, Snowflake. “Snowflake Cortex AI leverages NeMo Retriever, a component of NVIDIA AI Foundry, to further provide enterprises with easy, efficient, and trusted answers using their custom data.”
Custom Models Drive Competitive Advantage
One of the key advantages of NVIDIA AI Foundry is its ability to address the unique challenges faced by enterprises in adopting AI. Generic AI models can fall short of meeting specific business needs and data security requirements. Custom AI models, on the other hand, offer superior flexibility, adaptability and performance, making them ideal for enterprises seeking to gain a competitive edge.
Learn more about how NVIDIA AI Foundry allows enterprises to boost productivity and innovation.
Generative AI applications have little, or sometimes negative, value without accuracy — and accuracy is rooted in data.
To help developers efficiently fetch the best proprietary data to generate knowledgeable responses for their AI applications, NVIDIA today announced four new NVIDIA NeMo Retriever NIM inference microservices.
NeMo Retriever allows organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses for AI applications using RAG. In essence, these production-ready microservices deliver the highly accurate information retrieval that trustworthy AI applications are built on.
For example, NeMo Retriever can boost model accuracy and throughput for developers creating AI agents and customer service chatbots, analyzing security vulnerabilities or extracting insights from complex supply chain information.
NIM inference microservices enable high-performance, easy-to-use, enterprise-grade inferencing. And with NeMo Retriever NIM microservices, developers can benefit from all of this — superpowered by their data.
These new NeMo Retriever embedding and reranking NIM microservices are now generally available:
NV-EmbedQA-E5-v5, a popular community base embedding model optimized for text question-answering retrieval
NV-EmbedQA-Mistral7B-v2, a popular multilingual community base model fine-tuned for text embedding for high-accuracy question answering
Snowflake-Arctic-Embed-L, an optimized community model, and
NV-RerankQA-Mistral4B-v3, a popular community base model fine-tuned for text reranking for high-accuracy question answering.
They join the collection of NIM microservices easily accessible through the NVIDIA API catalog.
Embedding and Reranking Models
NeMo Retriever NIM microservices comprise two model types — embedding and reranking — with open and commercial offerings that ensure transparency and reliability.
An embedding model transforms diverse data — such as text, images, charts and video — into numerical vectors, stored in a vector database, while capturing their meaning and nuance. Embedding models are fast and computationally less expensive than traditional large language models, or LLMs.
A reranking model ingests data and a query, then scores the data according to its relevance to the query. Such models offer significant accuracy improvements but are more computationally intensive and slower than embedding models.
NeMo Retriever provides the best of both worlds. By casting a wide net of data to be retrieved with an embedding NIM, then using a reranking NIM to trim the results for relevancy, developers tapping NeMo Retriever can build a pipeline that ensures the most helpful, accurate results for their enterprise.
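To make that two-stage flow concrete, here is a minimal sketch of an embed-then-rerank step against locally hosted embedding and reranking NIM endpoints. The endpoint URLs, routes, payload fields and model names are assumptions for illustration and may differ from an actual deployment.

```python
# Illustrative two-stage retrieval: embed the query, gather candidate passages,
# then rerank the candidates. Routes, fields and model names are assumptions.
import requests

EMBED_URL = "http://localhost:8001/v1/embeddings"   # hypothetical embedding NIM endpoint
RERANK_URL = "http://localhost:8002/v1/ranking"     # hypothetical reranking NIM endpoint

query = "What is our data-retention policy?"
candidates = [
    "Customer data is retained for 24 months unless deletion is requested.",
    "Invoices are archived for seven years for tax purposes.",
    "The cafeteria menu rotates weekly.",
]

# Stage 1: embed the query (candidate passages would normally be embedded ahead of
# time and fetched from a vector database by similarity to this vector).
emb = requests.post(EMBED_URL, json={
    "model": "nv-embedqa-e5-v5", "input": [query], "input_type": "query",
}).json()
query_vector = emb["data"][0]["embedding"]

# Stage 2: rerank the retrieved candidates against the query and keep the most relevant.
ranked = requests.post(RERANK_URL, json={
    "model": "nv-rerankqa-mistral-4b-v3",
    "query": {"text": query},
    "passages": [{"text": p} for p in candidates],
}).json()
best = max(ranked["rankings"], key=lambda r: r["logit"])
print("Most relevant passage:", candidates[best["index"]])
```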
With NeMo Retriever, developers get access to state-of-the-art open, commercial models for building text Q&A retrieval pipelines that provide the highest accuracy. When compared with alternate models, NeMo Retriever NIM microservices provided 30% fewer inaccurate answers for enterprise question answering.
Top Use Cases
From RAG and AI agent solutions to data-driven analytics and more, NeMo Retriever powers a wide range of AI applications.
NVIDIA AI workflows for these use cases provide an easy, supported starting point for developing generative AI-powered technologies.
Dozens of NVIDIA data platform partners are working with NeMo Retriever NIM microservices to boost their AI models’ accuracy and throughput.
DataStax has integrated NeMo Retriever embedding NIM microservices in its Astra DB and Hyper-Converged platforms, enabling the company to bring accurate, generative AI-enhanced RAG capabilities to customers with faster time to market.
Cohesity will integrate NVIDIA NeMo Retriever microservices with its AI product, Cohesity Gaia, to help customers put their data to work to power insightful, transformative generative AI applications through RAG.
Kinetica will use NVIDIA NeMo Retriever to develop LLM agents that can interact with complex networks in natural language to respond more quickly to outages or breaches — turning insights into immediate action.
NetApp is collaborating with NVIDIA to connect NeMo Retriever microservices to exabytes of data on its intelligent data infrastructure. Every NetApp ONTAP customer will be able to seamlessly “talk to their data” to access proprietary business insights without having to compromise the security or privacy of their data.
NVIDIA global system integrator partners including Accenture, Deloitte, Infosys, LTTS, Tata Consultancy Services, Tech Mahindra and Wipro, as well as service delivery partners Data Monsters, EXLService (Ireland) Limited, Latentview, Quantiphi, Slalom, SoftServe and Tredence, are developing services to help enterprises add NeMo Retriever NIM microservices into their AI pipelines.
Use With Other NIM Microservices
NeMo Retriever NIM microservices can be used with NVIDIA Riva NIM microservices, which supercharge speech AI applications across industries — enhancing customer service and enlivening digital humans.
New models that will soon be available as Riva NIM microservices include: FastPitch and HiFi-GAN for text-to-speech applications; Megatron for multilingual neural machine translation; and the record-breaking NVIDIA Parakeet family of models for automatic speech recognition.
NVIDIA NIM microservices can be used all together or separately, offering developers a modular approach to building AI applications. In addition, the microservices can be integrated with community models, NVIDIA models or users’ custom models — in the cloud, on premises or in hybrid environments — providing developers with further flexibility.
NVIDIA NIM microservices are available at ai.nvidia.com. Enterprises can deploy AI applications in production with NIM through the NVIDIA AI Enterprise software platform.
NIM microservices can run on customers’ preferred accelerated infrastructure, including cloud instances from Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure, as well as NVIDIA-Certified Systems from global server manufacturing partners including Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro.
NVIDIA Developer Program members will soon be able to access NIM for free for research, development and testing on their preferred infrastructure.
Learn more about the latest in generative AI and accelerated computing by joining NVIDIA at SIGGRAPH, the premier computer graphics conference, running July 28-Aug. 1 in Denver.
See notice regarding software product information.
“The new trend in LLM competitions is that they don’t give you training data,” said Deotte, a senior data scientist at NVIDIA. “They give you 96 example questions — not enough to train a model — so we came up with 500,000 questions on our own.”
Deotte explained that the NVIDIA team generated a variety of questions by writing some themselves, using a large language model to create others, and transforming existing e-commerce datasets.
“Once we had our questions, it was straightforward to use existing frameworks to fine-tune a language model,” he said.
The competition organizers hid the test questions to ensure participants couldn’t exploit previously known answers. This approach encourages models that generalize well to any question about e-commerce, proving the model’s capability to handle real-world scenarios effectively.
Despite these constraints, Team NVIDIA’s innovative approach outperformed all competitors by using Qwen2-72B, a just-released LLM with 72 billion parameters, fine-tuned on eight NVIDIA A100 Tensor Core GPUs with QLoRA, a memory-efficient fine-tuning technique.
About the KDD Cup 2024
The KDD Cup, organized by the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining, or ACM SIGKDD, is a prestigious annual competition that promotes research and development in the field.
This year’s challenge, hosted by Amazon, focused on mimicking the complexities of online shopping, with the goal of using large language models to make it a more intuitive and satisfying experience. Organizers evaluated participants’ models with the ShopBench test dataset — a benchmark that replicates the breadth of real-world online shopping challenges with 57 tasks and about 20,000 questions derived from actual Amazon shopping data.
The ShopBench benchmark focused on four key shopping skills, along with a fifth “all-in-one” challenge:
Shopping Concept Understanding: Decoding complex shopping concepts and terminologies.
Shopping Knowledge Reasoning: Making informed decisions with shopping knowledge.
User Behavior Alignment: Understanding dynamic customer behavior.
Multilingual Abilities: Shopping across languages.
All-Around: Solving all tasks from the previous tracks in a unified solution.
NVIDIA’s Winning Solution
NVIDIA’s winning solution involved creating a single model for each track.
The team fine-tuned the just-released Qwen2-72B model using eight NVIDIA A100 Tensor Core GPUs for approximately 24 hours. The GPUs provided fast and efficient processing, significantly reducing the time required for fine-tuning.
First, the team generated training datasets based on the provided examples and synthesized additional data using Llama 3 70B hosted on build.nvidia.com.
Next, they employed QLoRA (Quantized Low-Rank Adaptation) to train on the data created in step one. QLoRA quantizes the base model and updates only a small set of low-rank adapter weights, enabling efficient training and fine-tuning.
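As a rough illustration of that step, here is a minimal QLoRA-style sketch using the Hugging Face transformers and peft libraries; the model checkpoint, hyperparameters and dataset handling are illustrative assumptions, not the team’s actual configuration.

```python
# Illustrative QLoRA setup: load the base model in 4-bit precision and attach low-rank
# adapters, so only a small fraction of weights are trained. Values are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2-72B-Instruct"  # illustrative; the exact checkpoint may differ

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
# ...fine-tune with a standard Trainer/SFT loop on the synthesized shopping Q&A dataset.
```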
The model was then quantized to 4-bit precision with AWQ — making it smaller and able to run on a system with less storage and memory — and the team used the vLLM inference library to run predictions on the test datasets on four NVIDIA T4 Tensor Core GPUs within the time constraints.
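A sketch of that inference step might look like the following, using vLLM’s Python API with an AWQ-quantized checkpoint sharded across four GPUs; the checkpoint path, prompt and sampling settings are assumptions for illustration.

```python
# Illustrative vLLM inference with an AWQ 4-bit quantized model split over four GPUs.
# The checkpoint path, prompt and sampling parameters are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="./qwen2-72b-shopbench-awq",  # hypothetical quantized, fine-tuned checkpoint
    quantization="awq",
    tensor_parallel_size=4,             # shard the model across four GPUs
)
params = SamplingParams(temperature=0.0, max_tokens=256)

prompts = ["Which of these two products is the better gift for a runner? ..."]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```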
This approach secured the top spot in each individual track and the overall first place in the competition, a clean sweep for NVIDIA for the second year in a row.
The team plans to submit a detailed paper on its solution next month and to present its findings at KDD 2024 in Barcelona.
It’s progress the wider community is starting to acknowledge.
“Even if the predictions that data centers will soon account for 4% of global energy consumption become a reality, AI is having a major impact on reducing the remaining 96% of energy consumption,” said a report from Lisbon Council Research, a nonprofit formed in 2003 that studies economic and social issues.
The article from the Brussels-based research group is among a handful of big-picture AI policy studies starting to emerge. It uses Italy’s Leonardo supercomputer, accelerated with nearly 14,000 NVIDIA GPUs, as an example of a system advancing work in fields from automobile design and drug discovery to weather forecasting.
Why Accelerated Computing Is Sustainable Computing
Accelerated computing uses the parallel processing of NVIDIA GPUs to do more work in less time. As a result, it consumes less energy than general-purpose servers that employ CPUs built to handle one task at a time.
The gains are even greater when accelerated systems apply AI, an inherently parallel form of computing that’s the most transformative technology of our time.
“When it comes to frontier applications like machine learning or deep learning, the performance of GPUs is an order of magnitude better than that of CPUs,” the report said.
User Experiences With Accelerated AI
Users worldwide are documenting energy-efficiency gains with AI and accelerated computing.
In financial services, Murex — a Paris-based company with a trading and risk-management platform used daily by more than 60,000 people — tested the NVIDIA Grace Hopper Superchip. On its workloads, the CPU-GPU combo delivered a 4x reduction in energy consumption and a 7x reduction in time to completion compared with CPU-only systems (see chart below).
“On risk calculations, Grace is not only the fastest processor, but also far more power-efficient, making green IT a reality in the trading world,” said Pierre Spatz, head of quantitative research at Murex.
In manufacturing, Taiwan-based Wistron built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests to improve operations at the site. It used NVIDIA Omniverse, a platform for industrial digitization, with a surrogate model, a version of AI that emulates simulations.
The digital twin, linked to thousands of networked sensors, enabled Wistron to increase the facility’s overall energy efficiency by up to 10%. That amounts to reducing electricity consumption by 120,000 kWh per year and carbon emissions by a whopping 60,000 kilograms.
Up to 80% Fewer Carbon Emissions
The RAPIDS Accelerator for Apache Spark can reduce the carbon footprint for data analytics, a widely used form of machine learning, by as much as 80% while delivering 5x average speedups and 4x reductions in computing costs, according to a recent benchmark.
Thousands of companies — about 80% of the Fortune 500 — use Apache Spark to analyze their growing mountains of data. Companies using NVIDIA’s Spark accelerator include Adobe, AT&T and the U.S. Internal Revenue Service.
In healthcare, Insilico Medicine discovered and put into phase 2 clinical trials a drug candidate for a relatively rare respiratory disease, thanks to its NVIDIA-powered AI platform.
Using traditional methods, the work would have cost more than $400 million and taken up to six years. But with generative AI, Insilico hit the milestone for one-tenth of the cost in one-third of the time.
“This is a significant milestone not only for us, but for everyone in the field of AI-accelerated drug discovery,” said Alex Zhavoronkov, CEO of Insilico Medicine.
This is just a sampling of the results that users of accelerated computing and AI are achieving at companies such as Amgen, BMW, Foxconn, PayPal and many more.
Speeding Science With Accelerated AI
In basic research, the National Energy Research Scientific Computing Center (NERSC), the U.S. Department of Energy’s lead facility for open science, measured results on a server with four NVIDIA A100 Tensor Core GPUs compared with dual-socket x86 CPU servers across four of its key high-performance computing and AI applications.
Researchers found that the apps, when accelerated with the NVIDIA A100 GPUs, saw energy efficiency rise 5x on average (see below). One application, for weather forecasting, logged gains of nearly 10x.
Scientists and researchers worldwide depend on AI and accelerated computing to achieve high performance and efficiency.
In a recent ranking of the world’s most energy-efficient supercomputers, known as the Green500, NVIDIA-powered systems swept the top six spots, and 40 of the top 50.
Underestimated Energy Savings
The many gains across industries and science are sometimes overlooked in forecasts that extrapolate only the energy consumption of training the largest AI models. That misses the benefits of most of an AI model’s life, when it consumes relatively little energy while delivering the kinds of efficiencies users described above.
In an analysis citing dozens of sources, a recent study debunked projections based solely on model training as misleading and inflated.
“Just as the early predictions about the energy footprints of e-commerce and video streaming ultimately proved to be exaggerated, so too will those estimates about AI likely be wrong,” said the report from the Information Technology and Innovation Foundation (ITIF), a Washington-based think tank.
The report notes that as much as 90% of the cost — and all the efficiency gains — of running an AI model come from deploying it in applications after it’s trained.
“Given the enormous opportunities to use AI to benefit the economy and society — including transitioning to a low-carbon future — it is imperative that policymakers and the media do a better job of vetting the claims they entertain about AI’s environmental impact,” said the report’s author, who described his findings in a recent podcast.
Others Cite AI’s Energy Benefits
Policy analysts from the R Street Institute, also in Washington, D.C., agreed.
“Rather than a pause, policymakers need to help realize the potential for gains from AI,” the group wrote in a 1,200-word article.
“Accelerated computing and the rise of AI hold great promise for the future, with significant societal benefits in terms of economic growth and social welfare,” it said, citing demonstrated benefits of AI in drug discovery, banking, stock trading and insurance.
AI can make the electric grid, manufacturing and transportation sectors more efficient, it added.
AI Supports Sustainability Efforts
The reports also cited the potential of accelerated AI to fight climate change and promote sustainability.
“AI can enhance the accuracy of weather modeling to improve public safety as well as generate more accurate predictions of crop yields. The power of AI can also contribute to … developing more precise climate models,” R Street said.
The Lisbon report added that AI plays “a crucial role in the innovation needed to address climate change” for work such as discovering more efficient battery materials.
How AI Can Help the Environment
ITIF called on governments to adopt AI as a tool in efforts to decarbonize their operations.
For its part, NVIDIA is working with hundreds of startups employing AI to address climate issues. NVIDIA also announced plans for Earth-2, expected to be the world’s most powerful AI supercomputer dedicated to climate science.
Enhancing Energy Efficiency Across the Stack
Since its founding in 1993, NVIDIA has worked on energy efficiency across all its products — GPUs, CPUs, DPUs, networks, systems and software, as well as platforms such as Omniverse.
Most of an AI model’s life is spent in inference, delivering insights that help users achieve new efficiencies. The NVIDIA GB200 Grace Blackwell Superchip has demonstrated 25x greater energy efficiency than the prior NVIDIA Hopper generation in AI inference.
Over the last eight years, NVIDIA GPUs have advanced a whopping 45,000x in their energy efficiency running large language models (see chart below).
Recent software innovations include TensorRT-LLM, which can cut the energy consumption of LLM inference on GPUs by 3x.
Here’s an eye-popping stat: If the efficiency of cars improved as much as NVIDIA has advanced the efficiency of AI on its accelerated computing platform, cars would get 280,000 miles per gallon. That means you could drive to the moon on less than a gallon of gas.
The analysis applies NVIDIA’s whopping 10,000x efficiency gain in AI training and inference from 2016 to 2025 to the fuel efficiency of cars: roughly 28 miles per gallon for a typical car, multiplied by 10,000 (see chart below).
Driving Data Center Efficiency
NVIDIA delivers many optimizations through system-level innovations. For example, NVIDIA BlueField-3 DPUs can reduce power consumption up to 30% by offloading essential data center networking and infrastructure functions from less efficient CPUs.
Last year, NVIDIA received a $5 million grant from the U.S. Department of Energy — the largest of 15 grants from a pool of more than 100 applications — to design a new liquid-cooling technology for data centers. It will run 20% more efficiently than today’s air-cooled approaches and has a smaller carbon footprint.
These are just some of the ways NVIDIA contributes to the energy efficiency of data centers.
Data centers are among the most efficient users of energy and among the largest consumers of renewable energy.
The ITIF report notes that between 2010 and 2018, global data centers experienced a 550% increase in compute instances and a 2,400% increase in storage capacity, but only a 6% increase in energy use, thanks to improvements across hardware and software.
NVIDIA continues to drive energy efficiency for accelerated AI, helping users in science, government and industry accelerate their journeys toward sustainable computing.
Over 1,800 attendees gained insights on how to kick-start their careers and use NVIDIA’s technologies and resources to accelerate their professional development.
Opportunities in AI
AI’s impact is touching nearly every industry, presenting new career opportunities for professionals of all backgrounds.
Lauren Silveira, a university recruiting program manager at NVIDIA, challenged attendees to take their unique education and experience and apply it in the AI field.
“You don’t have to work directly in AI to impact the industry,” said Silveira. “I knew I wouldn’t be a doctor or engineer — that wasn’t in my career path — but I could create opportunities for those that wanted to pursue those dreams.”
Kevin McFall, a principal instructor for the NVIDIA Deep Learning Institute, offered some advice for those looking to navigate a career in AI and advanced technologies but finding themselves overwhelmed or unsure of where to start.
“Don’t try to do it all by yourself,” he said. “Don’t get focused on building everything from scratch — the best skill that you can have is being able to take pieces of code or inspiration from different resources and plug them together to make a whole.”
A main takeaway from the panelists was that students and industry professionals can significantly enhance their capabilities by leveraging tools and resources in addition to their networks.
Staying up to date on the rapidly expanding technology industry involves more than just keeping up with the latest education and certifications.
Sabrina Koumoin, a senior software engineer at NVIDIA, spoke on the importance of networking. She believes people can find like-minded peers and mentors to gain inspiration from by sharing their personal learning journeys or projects on social platforms like LinkedIn.
A self-taught coder, Koumoin also advocates for active engagement and education accessibility. Outside of work, she hosted multiple coding bootcamps for people looking to break into tech.
“It’s a way to show that learning technical skills can be engaging, not intimidating,” she said.
David Ajoku, founder and CEO at Demystifyd and Aware.ai, also emphasized the importance of using LinkedIn to build connections, demonstrate key accomplishments and show passion.
He outlined a three-step strategy to enhance your LinkedIn presence, designed to help you stand out, gain deeper insights into your preferred companies and boldly share your aspirations and interests:
Think about a company you’d like to work for and what draws you to it.
Research thoroughly, focusing on its main activities, mission and goals.
Be bold — create a series of posts informing your network about your career journey and what advancements interest you in the chosen company.
One attendee asked about how AI might evolve over the next decade and what skills professionals should focus on to stay relevant. Louis Stewart, head of strategic initiatives at NVIDIA, replied that crafting a personal narrative and growth journey is just as important as ensuring certifications and skills are up to date.
“Be intentional and purposeful — have an end in mind,” he said. “That’s how you connect with future potential companies and people — it’s a skill you have to develop to stay ahead.”
Deep Dive Into Learning
NVIDIA offers a variety of programs and resources to equip the next generation of AI professionals with the skills and training needed to excel in a career in AI.
NVIDIA’s AI Learning Essentials is designed to give individuals the knowledge, skills and certifications they need to be prepared for the workforce and the fast-moving field of AI. It includes free access to self-paced introductory courses and webinars on topics such as generative AI, retrieval-augmented generation (RAG) and CUDA.
The NVIDIA Deep Learning Institute (DLI) provides a diverse range of resources, including learning materials, self-paced and live trainings, and educator programs spanning AI, accelerated computing, data science, graphics, simulation and more. It also offers technical workshops for students currently enrolled in universities.
Research published earlier this month in the science journal Nature used NVIDIA-powered supercomputers to validate a pathway toward the commercialization of quantum computing.
The research, led by Nobel laureate Giorgio Parisi, focuses on quantum annealing, a method that may one day tackle complex optimization problems that are extraordinarily challenging to conventional computers.
To conduct their research, the team utilized 2 million GPU computing hours at the Leonardo facility (Cineca, in Bologna, Italy), nearly 160,000 GPU computing hours on the Meluxina-GPU cluster, in Luxembourg, and 10,000 GPU hours from the Spanish Supercomputing Network. Additionally, they accessed the Dariah cluster, in Lecce, Italy.
They used these state-of-the-art resources to simulate the behavior of a certain kind of quantum computing system known as a quantum annealer.
Quantum computers fundamentally rethink how information is computed to enable entirely new solutions.
Unlike classical computers, which process information in binary — 0s and 1s — quantum computers use quantum bits or qubits that can allow information to be processed in entirely new ways.
Quantum annealers are a special type of quantum computer that, though not universally useful, may have advantages for solving certain types of optimization problems.
The paper, “The Quantum Transition of the Two-Dimensional Ising Spin Glass,” represents a significant step in understanding the phase transition — a change in the properties of a quantum system — of the Ising spin glass, a disordered magnetic material in a two-dimensional plane. Understanding this transition is a critical problem in computational physics.
The paper addresses the problem of how the properties of magnetic particles arranged in a two-dimensional plane can abruptly change their behavior.
The study also shows how GPU-powered systems play a key role in developing approaches to quantum computing.
GPU-accelerated simulations allow researchers to understand the complex systems’ behavior in developing quantum computers, illuminating the most promising paths forward.
Quantum annealers, like the systems developed by the pioneering quantum computing company D-Wave, operate by methodically decreasing a magnetic field that is applied to a set of magnetically susceptible particles.
When strong enough, the applied field will act to align the magnetic orientation of the particles — similar to how iron filings will uniformly stand to attention near a bar magnet.
If the strength of the field is varied slowly enough, the magnetic particles will arrange themselves to minimize the energy of the final arrangement.
Finding this stable, minimum-energy state is crucial in a particularly complex and disordered magnetic system known as a spin glass since quantum annealers can encode certain kinds of problems into the spin glass’s minimum-energy configuration.
Finding the stable arrangement of the spin glass then solves the problem.
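To make the idea concrete, here is a minimal classical sketch: simulated annealing of a small two-dimensional Ising spin glass with random couplings, which slowly lowers a control parameter (temperature here, rather than a quantum field) until the system settles into a low-energy arrangement. It illustrates the concept only; it is not quantum annealing hardware or the paper’s simulation method.

```python
# Classical simulated annealing of a small 2D Edwards-Anderson spin glass (illustrative only).
# Couplings are random +/-1; temperature is lowered slowly, mimicking the idea of gradually
# varying a control parameter to reach a low-energy configuration.
import numpy as np

rng = np.random.default_rng(0)
L = 16
spins = rng.choice([-1, 1], size=(L, L))    # spin configuration
J_right = rng.choice([-1, 1], size=(L, L))  # coupling between (i, j) and (i, j+1)
J_down = rng.choice([-1, 1], size=(L, L))   # coupling between (i, j) and (i+1, j)

def local_field(s, i, j):
    """Sum of coupling * neighboring spin around site (i, j), periodic boundaries."""
    return (J_right[i, j] * s[i, (j + 1) % L]
            + J_right[i, (j - 1) % L] * s[i, (j - 1) % L]
            + J_down[i, j] * s[(i + 1) % L, j]
            + J_down[(i - 1) % L, j] * s[(i - 1) % L, j])

for T in np.linspace(3.0, 0.05, 60):        # annealing schedule: slowly lower temperature
    for _ in range(10 * L * L):
        i, j = rng.integers(L), rng.integers(L)
        dE = 2 * spins[i, j] * local_field(spins, i, j)   # energy cost of flipping (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / T):     # Metropolis acceptance rule
            spins[i, j] *= -1

energy = -0.5 * sum(spins[i, j] * local_field(spins, i, j)
                    for i in range(L) for j in range(L))
print("energy per spin:", energy / (L * L))
```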
Understanding these systems helps scientists develop better algorithms for solving difficult problems by mimicking how nature deals with complexity and disorder.
That’s crucial for advancing quantum annealing and its applications in solving extremely difficult computational problems that currently have no known efficient solution — problems that are pervasive in fields ranging from logistics to cryptography.
Unlike gate-model quantum computers, which operate by applying a sequence of quantum gates, quantum annealers allow a quantum system to evolve freely in time.
This is not a universal computer — a device capable of performing any computation given sufficient time and resources — but may have advantages for solving particular sets of optimization problems in application areas such as vehicle routing, portfolio optimization and protein folding.
Through extensive simulations performed on NVIDIA GPUs, the researchers learned how key parameters of the spin glasses making up quantum annealers change during their operation, allowing a better understanding of how to use these systems to achieve a quantum speedup on important problems.
Mistral AI and NVIDIA today released a new state-of-the-art language model, Mistral NeMo 12B, that developers can easily customize and deploy for enterprise applications supporting chatbots, multilingual tasks, coding and summarization.
By combining Mistral AI’s expertise in training data with NVIDIA’s optimized hardware and software ecosystem, the Mistral NeMo model offers high performance for diverse applications.
“We are fortunate to collaborate with the NVIDIA team, leveraging their top-tier hardware and software,” said Guillaume Lample, cofounder and chief scientist of Mistral AI. “Together, we have developed a model with unprecedented accuracy, flexibility, high-efficiency and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment.”
Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform, which offers dedicated, scalable access to the latest NVIDIA architecture.
NVIDIA TensorRT-LLM for accelerated inference performance on large language models and the NVIDIA NeMo development platform for building custom generative AI models were also used to advance and optimize the process.
This collaboration underscores NVIDIA’s commitment to supporting the model-builder ecosystem.
Delivering Unprecedented Accuracy, Flexibility and Efficiency
Excelling in multi-turn conversations, math, common sense reasoning, world knowledge and coding, this enterprise-grade AI model delivers precise, reliable performance across diverse tasks.
With a 128K context length, Mistral NeMo processes extensive and complex information more coherently and accurately, ensuring contextually relevant outputs.
Released under the Apache 2.0 license, which fosters innovation and supports the broader AI community, Mistral NeMo is a 12-billion-parameter model. Additionally, the model uses the FP8 data format for model inference, which reduces memory size and speeds deployment without any degradation to accuracy.
Together, these qualities help the model handle diverse tasks and scenarios more effectively, making it ideal for enterprise use cases.
Mistral NeMo comes packaged as an NVIDIA NIM inference microservice, offering performance-optimized inference with NVIDIA TensorRT-LLM engines.
This containerized format allows for easy deployment anywhere, providing enhanced flexibility for various applications.
As a result, models can be deployed anywhere in minutes, rather than several days.
NIM features enterprise-grade software that’s part of NVIDIA AI Enterprise, with dedicated feature branches, rigorous validation processes, and enterprise-grade security and support.
It includes comprehensive support, direct access to an NVIDIA AI expert and defined service-level agreements, delivering reliable and consistent performance.
The open model license allows enterprises to integrate Mistral NeMo into commercial applications seamlessly.
Designed to fit on the memory of a single NVIDIA L40S, NVIDIA GeForce RTX 4090 or NVIDIA RTX 4500 GPU, the Mistral NeMo NIM offers high efficiency, low compute cost, and enhanced security and privacy.
Advanced Model Development and Customization
The combined expertise of Mistral AI and NVIDIA engineers has optimized training and inference for Mistral NeMo.
Trained with Mistral AI’s expertise, especially on multilinguality, code and multi-turn content, the model benefits from accelerated training on NVIDIA’s full stack.
It’s designed for optimal performance, utilizing efficient model parallelism techniques, scalability and mixed precision with Megatron-LM.
The model was trained using Megatron-LM, part of NVIDIA NeMo, with 3,072 H100 80GB Tensor Core GPUs on DGX Cloud, composed of NVIDIA AI architecture, including accelerated computing, network fabric and software to increase training efficiency.
Availability and Deployment
With the flexibility to run anywhere — cloud, data center or RTX workstation — Mistral NeMo is ready to revolutionize AI applications across various platforms.
Experience Mistral NeMo as an NVIDIA NIM today via ai.nvidia.com, with a downloadable NIM coming soon.
See notice regarding software product information.
It’s time for a sweet treat — the GeForce NOW Summer Sale offers high-performance cloud gaming at half off for a limited time.
And starting today, gamers can directly access supported PC games on GeForce NOW via Xbox.com game pages, enabling them to get into their favorite Xbox PC games even faster.
It all comes with nine new games joining the cloud this week.
We Halve a Deal
Take advantage of a special new discount — one-month and six-month GeForce NOW Priority or Ultimate memberships are now 50% off until Aug. 18. It’s perfect for members wanting to level up their gaming experience or those looking to try GeForce NOW for the first time to access and stream an ever-growing library of over 1,900 games with top-notch performance.
Priority members enjoy more benefits than free users, including faster access to gaming servers and gaming sessions of up to six hours. They can also stream beautifully ray-traced graphics across multiple devices with RTX ON for the most immersive experience in supported games.
For those looking for top-notch performance, the Ultimate tier provides members with exclusive access to servers and the ability to stream at up to 4K resolution and 120 frames per second, or up to 240 fps — even without upgraded hardware. Ultimate members get all the same benefits as GeForce RTX 40 series GPU owners, including NVIDIA DLSS 3 for the smoothest frame rates and NVIDIA Reflex for the lowest-latency streaming from the cloud.
Strike while it’s hot — this scorching summer sale ends soon.
Path of the Goddess
Capcom’s latest release, Kunitsu-Gami: Path of the Goddess, is a unique Japanese-inspired, single-player Kagura Action Strategy game.
The game takes place on a mountain covered in defilement. During the day, purify the villages and prepare for sundown. During the night, protect the Maiden against the hordes of the Seethe. Repeat the day-and-night cycle until the mountain has been cleansed of defilement and peace has returned to the land.
Walk the path of the goddess in the cloud with extended gaming sessions for Ultimate and Priority members. Ultimate members can also enjoy seeing supernatural and human worlds collide in ultrawide resolutions for an even more immersive experience.
Slay New Games
In Dungeons of Hinterberg from Microbird Games, play as Luisa, a burnt-out law trainee taking a break from her fast-paced corporate life. Explore the beautiful alpine village of Hinterberg armed with just a sword and a tourist guide, and uncover the magic hidden within its dungeons. Master magic, solve puzzles and slay monsters — all from the cloud.
Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC and workstation users.
Video is everywhere — nearly 80% of internet bandwidth today is used to stream video from content providers and social networks. While screens have become bigger and support higher resolutions, nearly all video is only 1080p quality or lower.
Upscalers can help sharpen streamed video and, powered by AI on the NVIDIA RTX platform, significantly enhance image quality and detail.
What Is an Upscaler?
Video’s large file size makes it harder to compress and transmit than images or text. Platforms like Netflix, Vimeo and YouTube work around this limitation by encoding video — the process of compressing the raw source of a video into a smaller container format.
The encoder first analyzes the video to decide what information it can remove to make it fit a target resolution and frame rate. If the target bitrate is insufficient, the video quality decreases, resulting in a loss of detail and sharpness and the presence of encoding artifacts. The smaller the file, the easier it is to share on the internet — but the worse it looks.
Typically, software on the viewer’s device will upscale the video file to fit the display’s native resolution. However, these upscalers are fairly simplistic, merely multiplying pixels to meet the desired resolution. They can help sharpen the outlines of objects and scenes, but the final video typically carries encoding artifacts and sometimes looks over-sharpened and unnatural.
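For a sense of what “merely multiplying pixels” means, here is a minimal sketch of nearest-neighbor upscaling with NumPy; it simply duplicates each pixel to fill the larger grid, which is why edges stay blocky and artifacts survive. The array values are illustrative.

```python
# Naive nearest-neighbor upscaling: each pixel is duplicated to fill the larger grid.
# No new detail is created, so compression artifacts are enlarged along with the image.
import numpy as np

frame = np.array([[10, 200],
                  [60, 120]], dtype=np.uint8)  # tiny 2x2 grayscale "frame" for illustration

scale = 2
upscaled = frame.repeat(scale, axis=0).repeat(scale, axis=1)  # 4x4 result
print(upscaled)
```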
AI Know a Better Way
The NVIDIA RTX platform uses AI to easily de-artifact and upscale videos.
The process of AI upscaling involves analyzing images and motion vectors to generate new details not present in the original video. Instead of merely multiplying pixels, it recognizes the patterns of the image and enhances them to provide greater detail and video quality.
Frames must first be de-artifacted before upscaling begins. Artifacts — unwanted distortions and anomalies that appear in video and image files — occur due to overcompression or data loss during transmission and storage.
NVIDIA AI networks can de-artifact images, helping remove blocky areas sometimes seen in streamed video. Without this first step, AI upscalers might end up enhancing the artifacted image itself instead of the desired content.
Super-Sized Video
Just like putting on a pair of prescription glasses can instantly snap the world into focus, RTX Video Super Resolution, one of NVIDIA’s latest innovations in AI-enhanced video technology, gives users a clearer picture into the world of streamed video.
Available on GeForce RTX 40 and 30 Series GPUs and RTX professional GPUs, it uses AI running on dedicated Tensor Cores to remove block compression artifacts and upscale lower-resolution content up to 4K, matching the user’s native display resolution.
RTX Video Super Resolution can be used to enhance all video watched on browsers. By combining de-artifacting with AI upscaling techniques, it can make even low-bitrate Twitch streams look stunningly clear. RTX Video Super Resolution is also supported in popular video apps like VLC so users can apply the same upscaling process to their offline videos.
Creators can soon use RTX Video Super Resolution in editing apps like Blackmagic Design’s DaVinci Resolve, making it easier than ever to upscale lower-quality video files to 4K resolution, as well as convert standard-dynamic-range source files into high dynamic range (HDR).
Say Hi to High-Dynamic Range
RTX Video now also supports AI HDR. HDR video supports a wider range of colors, lending greater detail especially to the darker and lighter areas of images. The problem is that there isn’t that much HDR content online yet.
Enter RTX Video HDR — by simply turning on the feature, the AI network will turn any standard or low-dynamic-range content into HDR, performing the correct tone mapping so the image still looks natural and retains its original colors.
AI Across the Board
RTX Video is just the latest implementation of AI upscaling powered by NVIDIA RTX.
Members of the GeForce NOW cloud streaming service can play their favorite PC games on nearly any device. GeForce RTX servers located all over the world first render the game video content, encode it and then stream it to the player’s local device — just like streaming video from other content providers.
Members on older NVIDIA GPU-powered devices can still use AI-enhanced upscaling to improve gameplay quality. This means they can enjoy the best of both worlds — gameplay rendered on servers powered by RTX 4080-class GPUs in the cloud and AI-enhanced streaming quality. Get more information on enabling AI-enhanced upscaling on GeForce NOW.
The NVIDIA SHIELD TV takes this one step further, processing AI neural networks directly on its NVIDIA Tegra system-on-a-chip to upscale 1080p-quality or lower content from nearly any streaming platform to a display’s native resolution. That means users can improve the video quality of content streamed from Netflix, Prime Video, Max, Disney+ and more at the push of a remote button.
SHIELD TV is currently available for up to $30 off in North America and £30 or 35€ off in Europe as part of Amazon’s Prime Day event running July 16-17. For Prime members in Europe, eligible SHIELD TV purchases also include one month of the GeForce NOW Ultimate membership for free, enabling GeForce RTX 4080-class PC gameplay streamed directly to the living room.
AI has enabled unprecedented improvements in video quality, helping set a new standard in streaming experiences.
Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.