2025 Predictions: Enterprises, Researchers and Startups Home In on Humanoids, AI Agents as Generative AI Crosses the Chasm

2025 Predictions: Enterprises, Researchers and Startups Home In on Humanoids, AI Agents as Generative AI Crosses the Chasm

From boardroom to break room, generative AI took this year by storm, stirring discussion across industries about how to best harness the technology to enhance innovation and creativity, improve customer service, transform product development and even boost communication.

The adoption of generative AI and large language models is rippling through nearly every industry, as incumbents and new entrants reimagine products and services to generate an estimated $1.3 trillion in revenue by 2032, according to a report by Bloomberg Intelligence.

Yet, some companies and startups are still slow to adopt AI, sticking to experimentation and siloed projects even as the technology advances at a dizzying pace. That’s partly because AI benefits vary by company, use case and level of investment.

Cautious approaches are giving way to optimism. Two-thirds of the respondents to Forrester Research’s 2024 State of AI Survey believe their organizations would require less than 50% return on investments to consider their AI initiatives successful.

The next big thing on the horizon is agentic AI, a form of autonomous or “reasoning” AI that requires using diverse language models, sophisticated retrieval-augmented generation stacks and advanced data architectures.

NVIDIA experts in industry verticals already shared their expectations for the year ahead. Now, hear from company experts driving innovation in AI across enterprises, research and the startup ecosystem:

IAN BUCK
Vice President of Hyperscale and HPC

Inference drives the AI charge: As AI models grow in size and complexity, the demand for efficient inference solutions will increase.

The rise of generative AI has transformed inference from simple recognition of the query and response to complex information generation — including summarizing from multiple sources and large language models such as OpenAI o1 and Llama 450B — which dramatically increases computational demands. Through new hardware innovations, coupled with continuous software improvements, performance will increase and total cost of ownership is expected to shrink by 5x or more.

Accelerate everything: With GPUs becoming more widely adopted, industries will look to accelerate everything, from planning to production. New architectures will add to that virtuous cycle, delivering cost efficiencies and an order of magnitude higher compute performance with each generation.

As nations and businesses race to build AI factories to accelerate even more workloads, expect many to look for platform solutions and reference data center architectures or blueprints that can get a data center up and running in weeks versus months. This will help them solve some of the world’s toughest challenges, including quantum computing and drug discovery.

Quantum computing — all trials, no errors: Quantum computing will make significant strides as researchers focus on supercomputing and simulation to solve the greatest challenges to the nascent field: errors.

Qubits, the basic unit of information in quantum computing, are susceptible to noise, becoming unstable after performing only thousands of operations. This prevents today’s quantum hardware from solving useful problems. In 2025, expect to see the quantum computing community move toward challenging, but crucial, quantum error correction techniques. Error correction requires quick, low-latency calculations. Also expect to see quantum hardware that’s physically colocated within supercomputers, supported by specialized infrastructure.

AI will also play a crucial role in managing these complex quantum systems, optimizing error correction and enhancing overall quantum hardware performance. This convergence of quantum computing, supercomputing and AI into accelerated quantum supercomputers will drive progress in realizing quantum applications for solving complex problems across various fields, including drug discovery, materials development and logistics.

BRYAN CATANZARO
Vice President of Applied Deep Learning Research

Putting a face to AI: AI will become more familiar to use, emotionally responsive and marked by greater creativity and diversity. The first generative AI models that drew pictures struggled with simple tasks like drawing teeth. Rapid advances in AI are making image and video outputs much more photorealistic, while AI-generated voices are losing that robotic feel.

These advancements will be driven by the refinement of algorithms and datasets and enterprises’ acknowledgment that AI needs a face and a voice to matter to 8 billion people. This will also cause a shift from turn-based AI interactions to more fluid and natural conversations. Interactions with AI will no longer feel like a series of exchanges but instead offer a more engaging and humanlike conversational experience.

Rethinking industry infrastructure and urban planning: Nations and industries will begin examining how AI automates various aspects of the economy to maintain the current standard of living, even as the global population shrinks.

These efforts could help with sustainability and climate change. For instance, the agriculture industry will begin investing in autonomous robots that can clean fields and remove pests and weeds mechanically. This will reduce the need for pesticides and herbicides, keeping the planet healthier and freeing up human capital for other meaningful contributions. Expect to see new thinking in urban planning offices to account for autonomous vehicles and improve traffic management.

Longer term, AI can help find solutions for reducing carbon emissions and storing carbon, an urgent global challenge.

KARI BRISKI
Vice President of Generative AI Software

A symphony of agents — AI orchestrators: Enterprises are set to have a slew of AI agents, which are semiautonomous, trained models that work across internal networks to help with customer service, human resources, data security and more. To maximize these efficiencies, expect to see a rise in AI orchestrators that work across numerous agents to seamlessly route human inquiries and interpret collective results to recommend and take actions for users.

These orchestrators will have access to deeper content understanding, multilingual capabilities and fluency with multiple data types, ranging from PDFs to video streams. Powered by self-learning data flywheels, AI orchestrators will continuously refine business-specific insights. For instance, in manufacturing, an AI orchestrator could optimize supply chains by analyzing real-time data and making recommendations on production schedules and supplier negotiations.

This evolution in enterprise AI will significantly boost productivity and innovation across industries while becoming more accessible. Knowledge workers will be more productive because they can tap into a personalized team of AI-powered experts. Developers will be able to build these advanced agents using customizable AI blueprints.

Multistep reasoning amplifies AI insights: AI for years has been good at giving answers to specific questions without having to delve into the context of a given query. With advances in accelerated computing and new model architectures, AI models will tackle increasingly complex problems and respond with greater accuracy and deeper analysis.

Using a capability called multistep reasoning, AI systems increase the amount of “thinking time” by breaking down large, complex questions into smaller tasks — sometimes even running multiple simulations — to problem-solve from various angles. These models dynamically evaluate each step, ensuring contextually relevant and transparent responses. Multistep reasoning also involves integrating knowledge from various sources to enable AI to make logical connections and synthesize information across different domains.

This will likely impact fields ranging from finance and healthcare to scientific research and entertainment. For example, a healthcare model with multistep reasoning could make a number of recommendations for a doctor to consider, depending on the patient’s diagnosis, medications and response to other treatments.

Start your AI query engine: With enterprises and research organizations sitting on petabytes of data, the challenge is gaining quick access to the data to deliver actionable insights.

AI query engines will change how businesses mine that data, and company-specific search engines will be able to sift through structured and unstructured data, including text, images and videos, using natural language processing and machine learning to interpret a user’s intent and provide more relevant and comprehensive results.

This will lead to more intelligent decision-making processes, improved customer experiences and enhanced productivity across industries. The continuous learning capabilities of AI query engines will create self-improving data flywheels that help  applications become increasingly effective.

CHARLIE BOYLE
Vice President of DGX Platforms

Agentic AI makes high-performance inference essential for enterprises: The dawn of agentic AI will drive demand for near-instant responses from complex systems of multiple models. This will make high-performance inference just as important as high-performance training infrastructure. IT leaders will need scalable, purpose-built and optimized accelerated computing infrastructure that can keep pace with the demands of agentic AI to deliver the performance required for real-time decision-making.

Enterprises expand AI factories to process data into intelligence: Enterprise AI factories transform raw data into business intelligence. Next year, enterprises will expand these factories to leverage massive amounts of historical and synthetic data, then generate forecasts and simulations for everything from consumer behavior and supply chain optimization to financial market movements and digital twins of factories and warehouses. AI factories will become a key competitive advantage that helps early adopters anticipate and shape future scenarios, rather than just react to them.

Chill factor — liquid-cooled AI data centers: As AI workloads continue to drive growth, pioneering organizations will transition to liquid cooling to maximize performance and energy efficiency. Hyperscale cloud providers and large enterprises will lead the way, using liquid cooling in new AI data centers that house hundreds of thousands of AI accelerators, networking and software.

Enterprises will increasingly choose to deploy AI infrastructure in colocation facilities rather than build their own — in part to ease the financial burden of designing, deploying and operating intelligence manufacturing at scale. Or, they will rent capacity as needed. These deployments will help enterprises harness the latest infrastructure without needing to install and operate it themselves. This shift will accelerate broader industry adoption of liquid cooling as a mainstream solution for AI data centers.

GILAD SHAINER
Senior Vice President of Networking 

Goodbye network, hello computing fabric:  The term “networking” in the data center will seem dated as data center architecture transforms into an integrated compute fabric that enables thousands of accelerators to efficiently communicate with one another via scale-up and scale-out communications, spanning miles of cabling and multiple data center facilities.

This integrated compute fabric will include NVIDIA NVLink, which enables scale-up communications, as well as scale-out capabilities enabled by intelligent switches, SuperNICs and DPUs. This will help securely move data to and from accelerators and perform calculations on the fly that drastically minimize data movement. Scale-out communication across networks will be crucial to large-scale AI data center deployments — and key to getting them up and running in weeks versus months or years.

As agentic AI workloads grow — requiring communication across multiple interconnected AI models working together rather than monolithic and localized AI models — compute fabrics will be essential to delivering real-time generative AI.

Distributed AI: All data centers will become accelerated as new approaches to Ethernet design emerge that enable hundreds of thousands of GPUs to support a single workload. This will help democratize AI factory rollouts for multi-tenant generative AI clouds and enterprise AI data centers.

This breakthrough technology will also enable AI to expand quickly into enterprise platforms and simplify the buildup and management of AI clouds.

Companies will build data center resources that are more geographically dispersed — located hundreds or even thousands of miles apart — because of power limitations and the need to build closer to renewable energy sources. Scale-out communications will ensure reliable data movement over these long distances.

LINXI (JIM) FAN
Senior Research Scientist, AI Agents

Robotics will evolve more into humanoids: Robots will begin to understand arbitrary language commands. Right now, industry robots must be programmed by hand, and they don’t respond intelligently to unpredictable inputs or languages other than those programmed. Multimodal robot foundation models that incorporate vision, language and arbitrary actions will evolve this “AI brain,” as will agentic AI that allows for greater AI reasoning.

To be sure, don’t expect to immediately see intelligent robots in homes, restaurants, service areas and factories. But these use cases may be closer than you think, as governments look for solutions to aging societies and shrinking labor pools. Physical automation is going to happen gradually, in 10 years being as ubiquitous as the iPhone.

AI agents are all about inferencing: In September, OpenAI announced a new large language model trained with reinforcement learning to perform complex reasoning. OpenAI o1, dubbed Strawberry, thinks before it answers: It can produce a long internal chain of thought, correcting mistakes and breaking down tricky steps into simple ones, before responding to the user.

2025 will be the year a lot of computation begins to shift to inference at the edge. Applications will need hundreds of thousands of tokens for a single query, as small language models make one query after another in microseconds before churning out an answer.

Small models will be more energy efficient and will become increasingly important for robotics, creating humanoids and robots that can assist humans in everyday jobs and promoting mobile intelligence applications..

BOB PETTE
Vice President of Enterprise Platforms

Seeking sustainable scalability: As enterprises prepare to embrace a new generation of semiautonomous AI agents to enhance various business processes, they’ll focus on creating robust infrastructure, governance and human-like capabilities for effective large-scale deployment. At the same time, AI applications will increasingly use local processing power to enable more sophisticated AI features to run directly on workstations, including thin, lightweight laptops and compact form factors, and improve performance while reducing latency for AI-driven tasks.

Validated reference architectures, which provide guidance on appropriate hardware and software platforms, will become crucial to optimize performance and accelerate AI deployments. These architectures will serve as essential tools for organizations navigating the complex terrain of AI implementation by helping ensure that their investments align with current needs and future technological advancements.

Revolutionizing construction, engineering and design with AI: Expect to see a rise in generative AI models tailored to the construction, engineering and design industries that will boost efficiency and accelerate innovation.

In construction, agentic AI will extract meaning from massive volumes of construction data collected from onsite sensors and cameras, offering insights that lead to more efficient project timelines and budget management.

AI will evaluate reality capture data (lidar, photogrammetry and radiance fields) 24/7 and derive mission-critical insights on quality, safety and compliance — resulting in reduced errors and worksite injuries.

For engineers, predictive physics based on physics-informed neural networks will accelerate flood prediction, structural engineering and computational fluid dynamics for airflow solutions tailored to individual rooms or floors of a building — allowing for faster design iteration.

In design, retrieval-augmented generation will enable compliance early in the design phase by ensuring that information modeling for designing and constructing buildings complies with local building codes. Diffusion AI models will accelerate conceptual design and site planning by enabling architects and designers to combine keyword prompts and rough sketches to generate richly detailed conceptual images for client presentations. That will free up time to focus on research and design.

SANJA FIDLER
Vice President of AI Research

Predicting unpredictability: Expect to see more models that can learn in the everyday world, helping digital humans, robots and even autonomous cars understand chaotic and sometimes unpredictable situations, using very complex skills with little human intervention.

From the research lab to Wall Street, we’re entering a hype cycle similar to the optimism about autonomous driving 5-7 years ago. It took many years for companies like Waymo and Cruise to deliver a system that works — and it’s still not scalable because the troves of data these companies and others, including Tesla, have collected may be applicable in one region but not another.

With models introduced this year, we can now move more quickly — and with much less capital expense — to use internet-scale data to understand natural language and emulate movements by observing human and other actions. Edge applications like robots, cars and warehouse machinery will quickly learn coordination, dexterity and other skills in order to navigate, adapt and interact with the real world.

Will a robot be able to make coffee and eggs in your kitchen, and then clean up after? Not yet. But it may come sooner than you think.

Getting real: Fidelity and realism is coming to generative AI across the graphics and simulation pipeline, leading to hyperrealistic games, AI-generated movies and digital humans.

Unlike with traditional graphics, the vast majority of images will come from generated pixels instead of renderings, resulting in more natural motions and appearances. Tools that develop and iterate on contextual behaviors will result in more sophisticated games for a fraction of the cost of today’s AAA titles.

Industries adopt generative AI: Nearly every industry is poised to use AI to enhance and improve the way people live and play.

Agriculture will use AI to optimize the food chain, improving the delivery of food. For example, AI can be used to predict the greenhouse gas emissions from different crops on individual farms. These analyses can help inform design strategies that help reduce greenhouse gas in supply chains. Meanwhile, AI agents in education will personalize learning experiences, speaking in a person’s native language and asking or answering questions based on level of education in a particular subject.

As next-generation accelerators enter the marketplace, you’ll also see a lot more efficiency in delivering these generative AI applications. By improving the training and efficiency of the models in testing, businesses and startups will see better and faster returns on investment across those applications.

ANDREW FENG
Vice President of GPU Software 

Accelerated data analytics offers insights with no code change: In 2025, accelerated data analytics will become mainstream for organizations grappling with ever-increasing volumes of data.

Businesses generate hundreds of petabytes of data annually, and every company is seeking ways to put it to work. To do so, many will adopt accelerated computing for data analytics.

The future lies in accelerated data analytics solutions that support “no code change” and “no configuration change,” enabling organizations to combine their existing data analytics applications with accelerated computing with minimum effort. Generative AI-empowered analytics technology will further widen the adoption of accelerated data analytics by empowering users — even those who don’t have traditional programming knowledge — to create new data analytics applications.

The seamless integration of accelerated computing, facilitated by a simplified developer experience, will help eliminate adoption barriers and allow organizations to harness their unique data for new AI applications and richer business intelligence.

NADER KHALIL
Director of Developer Technology

The startup workforce: If you haven’t heard much about prompt engineers or AI personality designers, you will in 2025. As businesses embrace AI to increase productivity, expect to see new categories of essential workers for both startups and enterprises that blend new and existing skills.

A prompt engineer designs and refines precise text strings that optimize AI training and produce desired outcomes based on the creation, testing and iteration of prompt designs for chatbots and agentic AI. The demand for prompt engineers will extend beyond tech companies to sectors like legal, customer support and publishing. As AI agents proliferate, businesses and startups will increasingly lean in to AI personality designers to enhance agents with unique personalities.

Just as the rise of computers spawned job titles like computer scientists, data scientists and machine learning engineers, AI will create different types of work, expanding opportunities for people with strong analytical skills and natural language processing abilities.

Understanding employee efficiency: Startups incorporating AI into their practices increasingly will add revenue per employee (RPE) to their lexicon when talking to investors and business partners.

Instead of a “growth at all costs” mentality, AI supplementation of the workforce will allow startup owners to home in on how hiring each new employee helps everyone else in the business generate more revenue. In the world of startups, RPE fits into discussions about the return on investment in AI and the challenges of filling roles in competition against big enterprises and tech companies.

Read More

Stream ‘Indiana Jones and the Great Circle’ at Launch With RTX Power in the Cloud at up to 50% Off

Stream ‘Indiana Jones and the Great Circle’ at Launch With RTX Power in the Cloud at up to 50% Off

GeForce NOW is wrapping a sleigh-full of gaming gifts this month, stuffing members’ cloud gaming stockings with new titles and fresh offers to keep holiday gaming spirits merry and bright.

Adventure calls and whip-cracking action awaits in the highly anticipated Indiana Jones and the Great Circle, streaming in the cloud today during the Advanced Access period for those who have preordered the Premium Edition from Steam or the Microsoft Store.

The title can only be played with RTX ON — GeForce NOW is offering gamers without high-performance hardware the ability to play it with 25% off Ultimate and Performance Day Passes. It’s like finding that extra-special gift hidden behind the tree.

This GFN Thursday also brings a new limited-time offer: 50% off the first month of new Ultimate or Performance memberships — a gift that can keep on giving.

Whether looking to try out GeForce NOW or buckle in for long-term cloud gaming, new members can choose between the Day Pass sale or the new membership offer. There’s a perfect gaming gift for everyone this holiday season.

GFN Thursday also brings 13 new titles in December, with four available this week to get the festivities going.

Plus, the latest update to GeForce NOW — version 2.0.69 — includes expanded support for 10-bit color precision. This feature enhances image quality when streaming on Windows, macOS and NVIDIA SHIELD TVs — and now to Edge and Chrome browsers on Windows devices, as well as to the Chrome browser on Chromebooks, Samsung TVs and LG TVs.

An Epic, Globetrotting Adventure

Uncover one of history’s greatest mysteries, streaming Indiana Jones and the Great Circle from the cloud. Members can access it today through Steam’s Advance Access period and will also be able to enjoy it via PC Game Pass on GeForce NOW next week.

The year is 1937, sinister forces are scouring the globe for the secret to an ancient power connected to the Great Circle, and only Indiana Jones can stop them. Experience a thrilling story full of exploration, immersive action and intriguing puzzles. Travel the world, from the pyramids of Egypt to the sunken temples of Sukhothai and beyond. Combine stealth infiltration, melee combat and gunplay to overcome enemies. Use guile and wits to unearth fascinating secrets, solve ancient riddles and survive deadly traps.

Members can indulge their inner explorer by streaming Indiana Jones and the Great Circle on GeForce NOW at release. Enhanced with NVIDIA’s ray-tracing technology, every game scene is bathed in rich, natural light that bounces realistically off surfaces for enhanced immersion.

Ultimate and Performance members can max out their settings for a globe-trotting journey at the highest resolution and lowest latency, even on low-powered devices, thanks to enhancements like NVIDIA DLSS 3 Frame Generation and NVIDIA Reflex. Ultimate members can experience additional perks, like 4K resolution and longer gaming sessions.

This game requires RTX ON, so free members can upgrade today to join in on the action. Take advantage of a limited-time Day Pass sale, with 25% off through Thursday, Dec. 12. Experience all the premium features of GeForce NOW’s Ultimate and Performance tiers with a 24-hour trial before upgrading to a one- or six-month membership.

Making the Cloud Merry and Bright

Holiday Sale on GeForce NOW
Deals for days.

For gamers looking to take their cloud gaming journey even further, unlock the power of GeForce RTX-powered cloud gaming with a monthly GeForce NOW membership. It’s the perfect time to do so, with new members gettings 50% off their first month, now through Monday, Dec. 30.

Experience gaming at its finest with an Ultimate membership by streaming at up to 4K resolution and 120 frames per second, or 1080p at 240 fps. The Performance membership offers an enhanced streaming experience at up to 1440p resolution with ultrawide resolutions for even more immersive gameplay. Both premium tiers provide extended session lengths, priority access to servers and the ability to play the latest and greatest titles with RTX ON.

Whether looking to conquer new worlds, compete at the highest level or unwind with a long-time favorite game, now is an ideal time to join the cloud gaming community. Sign up to transform any device into a powerful gaming rig — just in time for the holiday gaming marathons.

Dashing December

Path of Exile 2 early access on GeForce NOW
The cloud is the path of least resistance.

Path of Exile 2 is the highly anticipated sequel to the popular free-to-play action role-playing game from Grinding Gear Games. The game will be available for members to stream Friday, Dec. 6, in early access with a wealth of content to experience.

Explore the three acts of the campaign, six playable character classes and a robust endgame system in the dark world of Wraeclast, a continent populated by unique cultures, ancient secrets and monstrous dangers. A sinister threat, long thought destroyed, has begun creeping back on the edge of civilisation, driving people mad and sickening the land with Corruption. Play Path of Exile 2 solo or grab the squad for online co-op with up to six players.

Look for these games available to stream in the cloud this week:

  • Indiana Jones and the Great Circle (Advanced Access on Steam and Xbox, available on the Microsoft Store)
  • Path of Exile 2 (New release on Steam and Grinding Gears, Dec. 6)
  • JR EAST Train Simulator (Steam)
  • JR EAST Train Simulator Demo (Steam)

Here’s what members can expect in December:

  • Fast Food Simulator (New release on Steam, Dec. 10)
  • Legacy of Kain Soul Reaver 1&2 Remastered (New release on Steam, Dec. 10)
  • The Spirit of the Samurai (New release on Steam, Dec. 12)
  • The Lord of the Rings: Return to Moria (Steam)
  • NieR:Automata (Steam)
  • NieR Replicant ver.1.22474487139… (Steam)
  • Replikant Chat (Steam)
  • Supermarket Together (Steam)
  • Ys X: Nordics (Steam)

New to November

In addition to the 17 games announced last month, 13 more joined the GeForce NOW library:

  • Ara: History Untold (Steam and Xbox, available on PC Game Pass)
  • Call of Duty: Black Ops Cold War (Steam and Battle.net)
  • Call of Duty: Vanguard (Steam and Battle.net)
  • Crash Bandicoot N. Sane Trilogy (Steam and Xbox, available on PC Game Pass)
  • The Elder Scrolls IV: Oblivion Game of the Year Edition (Epic Games Store, Steam and Xbox, available on PC Game Pass)
  • Fallout 3: Game of the Year Edition (Epic Games Store, Steam and Xbox, available on PC Game Pass)
  • Magicraft (Steam)
  • MEGA MAN X DiVE Offline Demo (Steam)
  • New Arc Line (New release on Steam, Nov. 26)
  • Resident Evil 7 Teaser: Beginning Hour Demo (Steam)
  • Spyro Reignited Trilogy (Steam and Xbox, available on PC Game Pass)
  • StarCraft II (Xbox, available on PC Game Pass, Nov. 5. Members need to enable access.)
  • StarCraft Remastered (Xbox, available on PC Game Pass, Nov. 5. Members need to enable access.)

Metal Slug Tactics, Dungeons & Degenerate Gamblers and Headquarters: World War II didn’t make it last month. Stay tuned to future GFN Thursday for updates.

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

NVIDIA NIM on AWS Supercharges AI Inference

NVIDIA NIM on AWS Supercharges AI Inference

Generative AI is rapidly transforming industries, driving demand for secure, high-performance inference solutions to scale increasingly complex models efficiently and cost-effectively.

Expanding its collaboration with NVIDIA, Amazon Web Services (AWS) revealed today at its annual AWS re:Invent conference that it has extended NVIDIA NIM microservices across key AWS AI services to support faster AI inference and lower latency for generative AI applications.

NVIDIA NIM microservices are now available directly from the AWS Marketplace, as well as Amazon Bedrock Marketplace and Amazon SageMaker JumpStart, making it even easier for developers to deploy NVIDIA-optimized inference for commonly used models at scale.

NVIDIA NIM, part of the NVIDIA AI Enterprise software platform available in the AWS Marketplace, provides developers with a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers and workstations.

These prebuilt containers are built on robust inference engines, such as NVIDIA Triton Inference Server, NVIDIA TensorRT, NVIDIA TensorRT-LLM and PyTorch, and support a broad spectrum of AI models — from open-source community ones to NVIDIA AI Foundation models and custom ones.

NIM microservices can be deployed across various AWS services, including Amazon Elastic Compute Cloud (EC2), Amazon Elastic Kubernetes Service (EKS) and Amazon SageMaker.

Developers can preview over 100 NIM microservices built from commonly used models and model families, including Meta’s Llama 3, Mistral AI’s Mistral and Mixtral, NVIDIA’s Nemotron, Stability AI’s SDXL and many more on the NVIDIA API catalog. The most commonly used ones are available for self-hosting to deploy on AWS services and are optimized to run on NVIDIA accelerated computing instances on AWS.

NIM microservices now available directly from AWS include:

  • NVIDIA Nemotron-4, available in Amazon Bedrock Marketplace, Amazon SageMaker Jumpstart and AWS Marketplace. This is a cutting-edge LLM designed to generate diverse synthetic data that closely mimics real-world data, enhancing the performance and robustness of custom LLMs across various domains.
  • Llama 3.1 8B-Instruct, available on AWS Marketplace. This 8-billion-parameter multilingual large language model is pretrained and instruction-tuned for language understanding, reasoning and text-generation use cases.
  • Llama 3.1 70B-Instruct, available on AWS Marketplace. This 70-billion-parameter pretrained, instruction-tuned model is optimized for multilingual dialogue.
  • Mixtral 8x7B Instruct v0.1, available on AWS Marketplace. This high-quality sparse mixture of experts model with open weights can follow instructions, complete requests and generate creative text formats.

NIM on AWS for Everyone

Customers and partners across industries are tapping NIM on AWS to get to market faster, maintain security and control of their generative AI applications and data, and lower costs.

SoftServe, an IT consulting and digital services provider, has developed six generative AI solutions fully deployed on AWS and accelerated by NVIDIA NIM and AWS services. The solutions, available on AWS Marketplace, include SoftServe Gen AI Drug Discovery, SoftServe Gen AI Industrial Assistant, Digital Concierge, Multimodal RAG System, Content Creator and Speech Recognition Platform.

They’re all based on NVIDIA AI Blueprints, comprehensive reference workflows that accelerate AI application development and deployment and feature NVIDIA acceleration libraries, software development kits and NIM microservices for AI agents, digital twins and more.

Start Now With NIM on AWS

Developers can deploy NVIDIA NIM microservices on AWS according to their unique needs and requirements. By doing so, developers and enterprises can achieve high-performance AI with NVIDIA-optimized inference containers across various AWS services.

Visit the NVIDIA API catalog to try out over 100 different NIM-optimized models, and request either a developer license or 90-day NVIDIA AI Enterprise trial license to get started deploying the microservices on AWS services. Developers can also explore NIM microservices in the AWS Marketplace, Amazon Bedrock Marketplace or Amazon SageMaker JumpStart.

See notice regarding software product information.

Read More

Latest NVIDIA AI, Robotics and Quantum Computing Software Comes to AWS

Latest NVIDIA AI, Robotics and Quantum Computing Software Comes to AWS

Expanding what’s possible for developers and enterprises in the cloud, NVIDIA and Amazon Web Services are converging at AWS re:Invent in Las Vegas this week to showcase new solutions designed to accelerate AI and robotics breakthroughs and simplify research in quantum computing development.

AWS re:Invent is a conference for the global cloud-computing community packed with keynotes and more than 2,000 technical sessions.

Announcement highlights include the availability of NVIDIA DGX Cloud on AWS and enhanced AI, quantum computing and robotics tools.

NVIDIA DGX Cloud on AWS for AI at Scale

The NVIDIA DGX Cloud AI computing platform is now available through AWS Marketplace Private Offers, offering a high-performance, fully managed solution for enterprises to train and customize AI models.

DGX Cloud offers flexible terms, a fully managed and optimized platform, and direct access to NVIDIA experts to help businesses scale their AI capabilities quickly.

Early adopter Leonardo.ai, part of the Canva family, is already using DGX Cloud on AWS to develop advanced design tools.

AWS Liquid-Cooled Data Centers With NVIDIA Blackwell

Newer AI servers benefit from liquid cooling to cool high-density compute chips more efficiently for better performance and energy efficiency. AWS has developed solutions that provide configurable liquid-to-chip cooling across its data centers.

The cooling solution announced today will seamlessly integrate air- and liquid-cooling capabilities for the most powerful rack-scale AI supercomputing systems like NVIDIA GB200 NVL72, as well as AWS’ network switches and storage servers.

This flexible, multimodal cooling design provides maximum performance and efficiency for running AI models and will be used for the next-generation NVIDIA Blackwell platform.

Blackwell will be the foundation of Amazon EC2 P6 instances, DGX Cloud on AWS and Project Ceiba.

NVIDIA Advances Physical AI With Accelerated Robotics Simulation on AWS

NVIDIA is also expanding the reach of NVIDIA Omniverse on AWS with NVIDIA Isaac Sim, now running on high-performance Amazon EC2 G6e instances accelerated by NVIDIA L40S GPUs.

Available now, this reference application built on NVIDIA Omniverse enables developers to simulate and test AI-driven robots in physically based virtual environments.

One of the many workflows enabled by Isaac Sim is synthetic data generation. This pipeline is now further accelerated with the infusion of OpenUSD NIM microservices, from scene creation to data augmentation.

Robotics companies such as Aescape, Cohesive Robotics, Cobot, Field AI, Standard Bots, Swiss Mile and Vention are using Isaac Sim to simulate and validate the performance of their robots prior to deployment.

In addition, Rendered.ai, SoftServe and Tata Consultancy Services are using the synthetic data generation capabilities of Omniverse Replicator and Isaac Sim to bootstrap perception AI models that power various robotics applications.

NVIDIA BioNeMo on AWS for Advanced AI-Based Drug Discovery

NVIDIA BioNeMo NIM microservices and AI Blueprints, developed to advance drug discovery, are now integrated into AWS HealthOmics, a fully managed biological data compute and storage service designed to accelerate scientific breakthroughs in clinical diagnostics and drug discovery.

This collaboration gives researchers access to AI models and scalable cloud infrastructure tailored to drug discovery workflows. Several biotech companies already use NVIDIA BioNeMo on AWS to drive their research and development pipelines.

For example, A-Alpha Bio, a biotechnology company based in Seattle, recently published a study in biorxiv describing a collaborative effort with NVIDIA and AWS to develop and deploy an antibody AI model called AlphaBind.

Using AlphaBind via the BioNeMo framework on Amazon EC2 P5 instances equipped with NVIDIA H100 Tensor Core GPUs, A-Alpha Bio achieved a 12x increase in inference speed and processed over 108 million inference calls in two months.

Additionally, SoftServe today launched Drug Discovery, its generative AI solution built with NVIDIA Blueprints, to enable computer-aided drug discovery and efficient drug development. This solution is set to deliver faster workflows and will soon be available in AWS Marketplace.

Real-Time AI Blueprints: Ready-to-Deploy Options for Video, Cybersecurity and More

NVIDIA’s latest AI Blueprints are available for instant deployment on AWS, making real-time applications like vulnerability analysis for container security, and video search and summarization agents readily accessible.

Developers can easily integrate these blueprints into existing workflows to speed deployments.

Developers and enterprises can use the NVIDIA AI Blueprint for video search and summarization to build visual AI agents that can analyze real-time or archived videos to answer user questions, generate summaries and enable alerts for specific scenarios.

AWS collaborated with NVIDIA to provide a reference architecture applying the NVIDIA AI Blueprint for vulnerability analysis to augment early security patching in continuous integration pipelines on AWS cloud-native services.

NVIDIA CUDA-Q on Amazon Braket: Quantum Computing Made Practical

NVIDIA CUDA-Q is now integrated with Amazon Braket to streamline quantum computing development. CUDA-Q users can  use Amazon Braket’s quantum processors, while Braket users can tap CUDA-Q’s GPU-accelerated workflows for development and simulation.

The CUDA-Q platform allows developers to build hybrid quantum-classical applications and run them on many different types of quantum processors, simulated and physical.

Now preinstalled on Amazon Braket, CUDA-Q provides a seamless development platform for hybrid quantum-classical applications, unlocking new potential in quantum research.

Enterprise Platform Providers and Consulting Leaders Advance AI With NVIDIA on AWS

Leading software platforms and global system integrators are helping enterprises rapidly scale generative AI applications built with NVIDIA AI on AWS to drive innovation across industries.

Cloudera is using NVIDIA AI on AWS to enhance its new AI inference solution, helping Mercy Corps improve the precision and effectiveness of its aid distribution technology.

Cohesity has integrated NVIDIA NeMo Retriever microservices in its generative AI-powered conversational search assistant, Cohesity Gaia, to improve the recall performance of retrieval-augmented generation. Cohesity customers running on AWS can take advantage of the NeMo Retriever integration within Gaia.

DataStax announced that Wikimedia Deutschland is applying the DataStax AI Platform to make Wikidata available to developers as an embedded vectorized database. The Datastax AI Platform is built with NVIDIA NeMo Retriever and NIM microservices, and available on AWS.

Deloitte’s C-Suite AI now supports NVIDIA AI Enterprise software, including NVIDIA NIM microservices and NVIDIA NeMo for CFO-specific use cases, including financial statement analysis, scenario modeling and market analysis.

RAPIDS Quick Start Notebooks Now Available on Amazon EMR

NVIDIA and AWS are also speeding data science and data analytics workloads with the RAPIDS Accelerator for Apache Spark, which accelerates analytics and machine learning workloads with no code change and reduces data processing costs by up to 80%.

Quick Start notebooks for RAPIDS Accelerator for Apache Spark are now available on Amazon EMR, Amazon EC2 and Amazon EMR on EKS. These offer a simple way to qualify Spark jobs tuned to maximize the performance of RAPIDS on GPUs, all within AWS EMR.

NVIDIA and AWS Power the Next Generation of Industrial Edge Systems

The NVIDIA IGX Orin and Jetson Orin platforms now integrate seamlessly with AWS IoT Greengrass to streamline  the deployment and running of AI models at the edge and to efficiently manage fleets of connected devices at scale. This combination enhances scalability and simplifies the deployment process for industrial and robotics applications.

Developers can now tap into NVIDIA’s advanced edge computing power with AWS’ purpose-built IoT services, creating a secure, scalable environment for autonomous machines and smart sensors. A guide for getting started, authored by AWS, is now available to support developers putting these capabilities to work.

The integration underscores NVIDIA’s work in advancing enterprise-ready industrial edge systems to enable rapid, intelligent operations in real-world applications.

Catch more of NVIDIA’s work at AWS: re:Invent 2024 through live demos, technical sessions and hands-on labs. 

See notice regarding software product information.

Read More

NVIDIA Advances Physical AI With Accelerated Robotics Simulation on AWS

NVIDIA Advances Physical AI With Accelerated Robotics Simulation on AWS

Field AI is building robot brains that enable robots to autonomously manage a wide range of industrial processes. Vention creates pretrained skills to ease development of robotic tasks. And Cobot offers Proxie, an AI-powered cobot designed to handle material movement and adapt to dynamic environments, working seamlessly alongside humans.

These leading robotics startups are all making advances using NVIDIA Isaac Sim on Amazon Web Services. Isaac Sim is a reference application built on NVIDIA Omniverse for developers to simulate and test AI-driven robots in physically based virtual environments.

NVIDIA announced at AWS re:Invent today that Isaac Sim now runs on Amazon Elastic Cloud Computing (EC2) G6e instances accelerated by NVIDIA L40S GPUs. And with NVIDIA OSMO, a cloud-native orchestration platform, developers can easily manage their complex robotics workflows across their AWS computing infrastructure.

This combination of NVIDIA-accelerated hardware and software — available on the cloud — allows teams of any size to scale their physical AI workflows.

Physical AI describes AI models that can understand and interact with the physical world. It embodies the next wave of autonomous machines and robots, such as self-driving cars, industrial manipulators, mobile robots, humanoids and even robot-run infrastructure like factories and warehouses.

With physical AI, developers are embracing a three computer solution for training, simulation and inference to make breakthroughs.

Yet physical AI for robotics systems requires robust training datasets to achieve precision inference in deployment. Developing such datasets, however, and testing them in real situations can be impractical and costly.

Simulation offers an answer, as it can significantly accelerate the training, testing and deployment of AI-driven robots.

Harnessing L40S GPUs in the Cloud to Scale Robotics Simulation and Training

Simulation is used to verify, validate and optimize robot designs as well as the systems and their algorithms before deployment. Simulation can also optimize facility and system designs before construction or remodeling starts for maximum efficiencies, reducing costly manufacturing change orders.

Amazon EC2 G6e instances accelerated by NVIDIA L40S GPUs provide a 2x performance gain over the prior architecture, while allowing the flexibility to scale as scene and simulation complexity grows. The instances are used to train many computer vision models that power AI-driven robots. This means the same instances can be extended for various tasks, from data generation to simulation to model training.

Using NVIDIA OSMO in the cloud allows teams to orchestrate and scale complex ‌robotics development workflows across distributed computing resources, whether on premises or in the AWS cloud.

Isaac Sim provides access to the latest robotics simulation capabilities and the cloud, fostering collaboration. One of the critical workflows is generating synthetic data for perception model training.

Using a reference workflow that combines NVIDIA Omniverse Replicator, a framework for building custom synthetic data generation (SDG) pipelines and a core extension of Isaac Sim, with NVIDIA NIM microservices, developers can build generative AI-enabled SDG pipelines.

These include the USD Code NIM microservice for generating Python USD code and answering OpenUSD queries, and the USD Search NIM microservice for exploring OpenUSD assets using natural language or image inputs. The Edify 360 HDRi NIM microservice generates 360-degree environment maps, while the Edify 3D NIM microservice creates ready-to-edit 3D assets from text or image prompts. This eases the synthetic data generation process by reducing many tedious and manual steps, from asset creation to image augmentation, using the power of generative AI.

Rendered.ai’s synthetic data engineering platform integrated with Omniverse Replicator enables companies to generate synthetic data for computer vision models used in industries from security and intelligence to manufacturing and agriculture.

SoftServe, an IT consulting and digital services provider, uses Isaac Sim to generate synthetic data and validate robots used in vertical farming with Pfeifer & Langen, a leading European food producer.

Tata Consultancy Services is building custom synthetic data generation pipelines to power its Mobility AI suite to address automotive and autonomous use cases by simulating real-world scenarios. Its applications include defect detection, end-of-line quality inspection and hazard avoidance.

Learning to Be Robots in Simulation

While Isaac Sim enables developers to test and validate robots in physically accurate simulation, Isaac Lab, an open-source robot learning framework built on Isaac Sim, provides a virtual playground for building robot policies that can run on AWS Batch.

Because these simulations are repeatable, developers can easily troubleshoot and reduce the number of cycles required for validation and testing.

Several robotics developers are embracing NVIDIA Isaac on AWS to develop physical AI, such as:

  • Aescape’s robots are able to provide precision-tailored massages by accurately modeling and tuning onboard sensors in Isaac Sim.
  • Cobot has used Isaac Sim with its AI-powered cobot, Proxie, to optimize logistics in warehouses, hospitals, manufacturing sites, and more.
  • Cohesive Robotics has integrated Isaac Sim into its software framework called Argus OS for developing and deploying robotic workcells used in high-mix manufacturing environments.
  • Field AI, a builder of robot foundation models, uses Isaac Sim and Isaac Lab to evaluate the performance of its models in complex, unstructured environments across industries such as construction, manufacturing, oil and gas, mining and more.
  • Standard Bots is simulating and validating the performance of its R01 robot used in manufacturing and machining setup.
  • Swiss Mile is using Isaac Sim and Isaac Lab for robot learning so that wheeled quadruped robots can perform tasks autonomously with new levels of efficiency in factories and warehouses.
  • Vention, which offers a full-stack cloud-based automation platform, is harnessing Isaac Sim for developing and testing new capabilities for robot cells used by small to medium-size manufacturers.

Learn more about Isaac Sim 4.2, now available on Amazon EC2 G6e instances powered by NVIDIA L40S GPUs on AWS Marketplace.

Read More

New NVIDIA Certifications Expand Professionals’ Credentials in AI Infrastructure and Operations

New NVIDIA Certifications Expand Professionals’ Credentials in AI Infrastructure and Operations

As generative AI continues to grow, implementing and managing the right infrastructure becomes even more critical to ensure the secure and efficient development and deployment of AI-based solutions.

To meet these needs, NVIDIA has introduced two professional-level certifications that offer structured paths for infrastructure and operations practitioners to enhance and validate the skills needed to work effectively with advanced AI technologies.

The certification exams and recommended training to prepare for them are designed for network and system administrators, DevOps and MLOps engineers, and others who need to understand AI infrastructure and operations.

NVIDIA’s certification program equips professionals with skills in areas such as AI infrastructure, deep learning and accelerated computing to enhance their career prospects and give them a competitive edge in these high-demand fields.

Developed in collaboration with industry experts, the program features relevant content that emphasizes practical application alongside theoretical knowledge.

The NVIDIA-Certified Professional: AI Infrastructure certification is designed for practitioners seeking to showcase advanced skills in deploying AI infrastructure. Candidates must demonstrate expertise in GPU and DPU installation, hardware validation and system optimization for both AI and HPC workloads. The exam also tests proficiency in managing physical layers, configuring MIG, deploying the NVIDIA BlueField operating system, and integrating NVIDIA’s cloud-native stack with Docker and NVIDIA NGC.

To prepare for this professional-level certification, candidates are encouraged to attend the AI Infrastructure Professional Workshop. This hands-on training covers critical AI data center technologies, including compute platforms, GPU operations, networking, storage solutions and BlueField DPUs. The workshop is recommended for professionals aiming to elevate their AI infrastructure expertise.

The NVIDIA-Certified Professional: AI Operations certification is tailored for individuals seeking proficiency in managing and optimizing AI operations. The exam tests expertise in managing AI data centers, including the use of Kubernetes, Slurm, MIG, BCM, NGC containers, storage configuration and DPU services.

To prepare for this professional-level certification, candidates are encouraged to attend the AI Operations Professional Workshop, where they will gain hands-on experience in managing AI data centers, including compute platforms, networking, storage and GPU virtualization. The workshop also provides practical experience with NVIDIA AI software and solutions, including NGC containers and the NVIDIA AI Enterprise software suite, making it ideal for professionals looking to deepen their AI operations expertise.

Both of these professional-level certifications build upon the foundational knowledge covered in the NVIDIA-Certified Associate: AI Infrastructure and Operations certification.

Additional NVIDIA certifications include:

Saleh Hassan, an embedded software engineer at Two Six Technologies, successfully completed three NVIDIA certification exams at the NVIDIA AI Summit in Washington, D.C., earlier this year.

“The knowledge I gained has definitely made me a better developer when it comes to integrating AI,” said Hassan, who encourages others to pursue certifications as a key milestone for advancing their AI careers.

Saleh Hassan showing off one of his NVIDIA certifications.

All NVIDIA certifications are part of a comprehensive learning path that offers foundational courses, advanced training and hands-on labs to thoroughly prepare candidates for real-world applications.

The certifications support individual career growth and organizations can use them to enhance workforce capabilities.

Explore the options on the NVIDIA Certification portal and sign up for NVIDIA’s monthly newsletter to stay updated on the latest offerings.

Read More

How AI Can Enhance Disability Inclusion, Special Education

How AI Can Enhance Disability Inclusion, Special Education

A recent survey from the Special Olympics Global Center for Inclusion in Education shows that while a majority of students with an intellectual and developmental disability (IDD) and their parents view AI as a potentially transformative technology, only 35% of educators believe that AI developers currently account for the needs and priorities of students with IDD.

In this episode of the NVIDIA AI Podcast, U.S. Special Advisor on International Disability Rights at the U.S. Department of State Sara Minkara and Timothy Shriver, chairman of the board of Special Olympics, discuss AI’s potential to enhance special education and disability inclusion.

U.S. Special Advisor on International Disability Rights at the U.S. Department of State Sara Minkara at the G7 Summit. Image courtesy of the Government of Italy.

They highlight the critical need to include the voices from disability communities in AI development and policy conversations. Minkara and Shriver also explain the cultural, financial and social importance of building an inclusive future.

Time Stamps

2:12: Minkara and Shriver’s work on disability inclusion

9:47: Benefits of AI for people with disabilities

20:46: Notes from the recent G7 ministerial meeting on inclusion and disability

24:51: Challenges and future directions of AI in disability inclusion

Image courtesy of Special Olympics.

You Might Also Like…

Taking AI to School: A Conversation With MIT’s Anant Agarwal – Ep. 197

Educators and technologists alike have long been excited about AI’s potential to transform teaching and learning. Anant Agarwal, founder of edX and chief platform officer at 2U, talked about the future of online education and how AI is revolutionizing the learning experience.

NVIDIA’s Louis Stewart on How AI Is Shaping Workforce Development – Ep. 237

Workforce development is central to ensuring the changes brought by AI benefit all of us. Louis Stewart, head of strategic initiatives for NVIDIA’s global developer ecosystem, explains what workforce development looks like in the age of AI, and why it all starts with education.

Dotlumen CEO Cornel Amariei on Assistive Technology for the Visually Impaired – Ep. 217

Equipped with sensors and powered by AI, Dotlumen Glasses compute a safely walkable path for persons who are blind or have low vision, and offer haptic — or tactile — feedback on how to proceed via corresponding vibrations. Dotlumen founder and CEO Cornel Amariei discusses the challenges and breakthroughs of developing assistive technology.

How the Ohio Supercomputer Center Drives the Future of Computing – Ep. 213

Alan Chalker, director of strategic programs at the Ohio Supercomputing Center, dives into the history and evolution of the OSC, how it’s working with client companies like NASCAR, and how the center’s Open OnDemand program empowers Ohio higher education institutions and industries with computational services and training and educational programs.

Read More

Siemens Healthineers Adopts MONAI Deploy for Medical Imaging AI

Siemens Healthineers Adopts MONAI Deploy for Medical Imaging AI

3.6 billion. That’s about how many medical imaging tests are performed annually worldwide to diagnose, monitor and treat various conditions.

Speeding up the processing and evaluation of all these X-rays, CT scans, MRIs and ultrasounds is essential to helping doctors manage their workloads and to improving health outcomes.

That’s why NVIDIA introduced MONAI, which serves as an open-source research and development platform for AI applications used in medical imaging and beyond. MONAI unites doctors with data scientists to unlock the power of medical data to build deep learning models and deployable applications for medical AI workflows.

This week at the annual meeting of RSNA, the Radiological Society of North America, NVIDIA announced that Siemens Healthineers has adopted MONAI Deploy, a module within MONAI that bridges the gap from research to clinical production, to boost the speed and efficiency of integrating AI workflows for medical imaging into clinical deployments.

With over 15,000 installations in medical devices around the world, the Siemens Healthineers Syngo Carbon and syngo.via enterprise imaging platforms help clinicians better read and extract insights from medical images of many sources.

Developers typically use a variety of frameworks when building AI applications. This makes it a challenge to deploy their applications into clinical environments.

With a few lines of code, MONAI Deploy builds AI applications that can run anywhere. It is a tool for developing, packaging, testing, deploying and running medical AI applications in clinical production. Using it streamlines the process of developing and integrating medical imaging AI applications into clinical workflows.

.MONAI Deploy on the Siemens Healthineers platform has significantly accelerated the AI integration process, letting users port trained AI models into real-world clinical settings with just a few clicks, compared with what used to take months. This helps researchers, entrepreneurs and startups get their applications into the hands of radiologists more quickly.

“By accelerating AI model deployment, we empower healthcare institutions to harness and benefit from the latest advancements in AI-based medical imaging faster than ever,” said Axel Heitland, head of digital technologies and research at Siemens Healthineers. “With MONAI Deploy, researchers can quickly tailor AI models and transition innovations from the lab to clinical practice, providing thousands of clinical researchers worldwide access to AI-driven advancements directly on their syngo.via and Syngo Carbon imaging platforms.”

Enhanced with MONAI-developed apps, these platforms can significantly streamline AI integration. These apps can be easily provided and used on the Siemens Healthineers Digital Marketplace, where users can browse, select and seamlessly integrate them into their clinical workflows.

MONAI Ecosystem Boosts Innovation and Adoption

Now marking its five-year anniversary, MONAI has seen over 3.5 million downloads, 220 contributors from around the world, acknowledgements in over 3,000 publications, 17 MICCAI challenge wins and use in numerous clinical products.

The latest release of MONAI — v1.4 — includes updates that give researchers and clinicians even more opportunities to take advantage of the innovations of MONAI and contribute to Siemens Healthineers Syngo Carbon, syngo.via and the Siemens Healthineers Digital Marketplace.

The updates in MONAI v1.4 and related NVIDIA products include new foundation models for medical imaging, which can be customized in MONAI and deployed as NVIDIA NIM microservices. The following models are now generally available as NIM microservices:

  • MAISI (Medical AI for Synthetic Imaging) is a latent diffusion generative AI foundation model that can simulate high-resolution, full-format 3D CT images and their anatomic segmentations.
  • VISTA-3D is a foundation model for CT image segmentation that offers accurate out-of-the-box performance covering over 120 major organ classes. It also offers effective adaptation and zero-shot capabilities to learn to segment novel structures.

Alongside MONAI 1.4’s major features, the new MONAI Multi-Modal Model, or M3, is now accessible through MONAI’s VLM GitHub repo. M3 is a framework that extends any multimodal LLM with medical AI experts such as trained AI models from MONAI’s Model Zoo. The power of this new framework is demonstrated by the VILA-M3 foundation model that’s now available on Hugging Face, offering state-of-the-art radiological image copilot performance.

MONAI Bridges Hospitals, Healthcare Startups and Research Institutions

Leading healthcare institutions, academic medical centers, startups and software providers around the world are adopting and advancing MONAI, including:

  • German Cancer Research Center leads MONAI’s benchmark and metrics working group, which provides metrics for measuring AI performance and guidelines for how and when to use those metrics.
  • Nadeem Lab from Memorial Sloan Kettering Cancer Center (MSK) pioneered the cloud-based deployment of multiple AI-assisted annotation pipelines and inference modules for pathology data using MONAI.
  • University of Colorado School of Medicine faculty developed MONAI-based ophthalmology tools for detecting retinal diseases using a variety of imaging modalities. The university also leads some of the original federated learning developments and clinical demonstrations using MONAI.
  • MathWorks has integrated MONAI Label with its Medical Imaging Toolbox, bringing medical imaging AI and AI-assisted annotation capabilities to thousands of MATLAB users engaged in medical and biomedical applications throughout academia and industry.
  • GSK is exploring MONAI foundation models such as VISTA-3D and VISTA-2D for image segmentation.
  • Flywheel offers a platform, which includes MONAI for streamlining imaging data management, automating research workflows, and enabling AI development and analysis, that scales for the needs of research institutions and life sciences organizations.
  • Alara Imaging published its work on integrating MONAI foundation models such as VISTA-3D with LLMs such as Llama 3 at the 2024 Society for Imaging Informatics in Medicine conference.
  • RadImageNet is exploring the use of MONAI’s M3 framework to develop cutting-edge vision language models that utilize expert image AI models from MONAI to generate high-quality radiological reports.
  • Kitware is providing professional software development services surrounding MONAI, helping integrate MONAI into custom workflows for device manufacturers as well as regulatory-approved products.

Researchers and companies are also using MONAI on cloud service providers to run and deploy scalable AI applications. Cloud platforms providing access to MONAI include AWS HealthImaging, Google Cloud, Precision Imaging Network, part of Microsoft Cloud for Healthcare, and Oracle Cloud Infrastructure.

See disclosure statements about syngo.via, Syngo Carbon and products in the Digital Marketplace.

Read More

Get the Power of GeForce-Powered Gaming in the Cloud Half Off With Black Friday Deal

Get the Power of GeForce-Powered Gaming in the Cloud Half Off With Black Friday Deal

Turn Black Friday into Green Thursday with a new deal on GeForce NOW Ultimate and Performance memberships this week. For a limited time, get 50% off new Ultimate or Performance memberships for the first three months to experience the power of GeForce RTX-powered gaming at a fraction of the cost.

The giving continues for GeForce NOW members: SteelSeries is offering a 30% discount exclusively to all GeForce NOW members on Stratus+ or Nimbus+ controllers, perfect for gaming anytime, anywhere when paired with GeForce NOW on Android and iOS devices. To redeem the discount, opt in to GeForce NOW rewards and look out for an email with details. Enjoy this exclusive offer on its own — it can’t be combined with other SteelSeries promotions.

It’s not a GFN Thursday without new games — this week, six are joining the over 2,000 titles in the GeForce NOW library.

Plus, the Steam Autumn Sale is happening now, featuring stellar discounts on GeForce NOW-supported games. Snag beloved publishers’ top titles, including Nightingale from Inflexion Games, Remnant and Remnant II from Arc Games, and Cult of the Lamb and The Plucky Squire from Devolver — and even more from publishers Frost Giant Studios, Metric Empire, tinyBuild, Torn Banner Studios and Tripwire. The sale runs through Wednesday, Dec. 4.

Stuff Your Stockings

This holiday season, GeForce NOW is spreading cheer to gamers everywhere with an irresistible Black Friday offer. Those looking to try out the cloud gaming service can now level up their gaming with 50% off new Ultimate and Performance memberships for the first three months. It’s the perfect time for gamers to treat themselves or a buddy to GeForce RTX-powered gaming without having to upgrade any hardware.

Black Friday Deal on GeForce NOW
Thankful for cloud gaming discounts.

Lock in all the perks of the newly enhanced Performance membership, now featuring crisp 1440p streaming, at half off for the next three months. Or go all out with the Ultimate tier — delivering the same premium experience GeForce RTX 4080 GPU owners enjoy — now available at the regular monthly cost of a Performance membership.

With a GeForce NOW membership, gamers can stream over 2,000 PC games from popular digital gaming stores with longer gaming sessions and real-time ray tracing for supported titlgames across nearly all devices. Performance members can stream at up to 1440p at 60 frames per second, and Ultimate members can stream up to 4K at 120 fps or 1080p at 240 fps.

Don’t let this festive deal slip away — give the gift of gaming this holiday season with GeForce NOW’s Black Friday sale. Whether battling winter bosses or exploring snowy landscapes, do it with exceptional performance at an exceptional price.

Elevating New Games

In addition, members can look for the following:

  • New Arc Line (New release on Steam, Nov. 26)
  • MEGA MAN X DiVE Offline Demo (Steam)
  • PANICORE (Steam)
  • Resident Evil 7 Teaser: Beginning Hour Demo (Steam)
  • Slime Rancher (Steam)
  • Sumerian Six (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

How RTX AI PCs Unlock AI Agents That Solve Complex Problems Autonomously With Generative AI

How RTX AI PCs Unlock AI Agents That Solve Complex Problems Autonomously With Generative AI

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.

Generative AI has transformed the way people bring ideas to life. Agentic AI takes this one step further — using sophisticated, autonomous reasoning and iterative planning to help solve complex, multi-step problems.

AnythingLLM is a customizable open-source desktop application that lets users seamlessly integrate large language model (LLM) capabilities into various applications locally on their PCs. It enables users to harness AI for tasks such as content generation, summarization and more, tailoring tools to meet specific needs.

Accelerated on NVIDIA RTX AI PCs, AnythingLLM has launched a new Community Hub where users can share prompts, slash commands and AI agent skills while experimenting with building and running AI agents locally.

Autonomously Solve Complex, Multi-Step Problems With Agentic AI

AI agents can take chatbot capabilities further. They typically understand the context of the tasks and can analyze challenges and develop strategies — and some can even fully execute assigned tasks.

For example, while a chatbot could answer a prompt asking for a restaurant recommendation, an AI agent could even surface the restaurant’s phone number for a reservation and add reminders to the user’s calendar.

Agents help achieve big-picture goals and don’t get bogged down at the task level. There are many agentic apps in development to tackle to-do lists, manage schedules, help organize tasks, automate email replies, recommend personalized workout plans or plan trips.

Once prompted, an AI agent can gather and process data from various sources, including databases. It can use an LLM for reasoning — for example, to understand the task — then generate solutions and specific functions. If integrated with external tools and software, an AI agent can next execute the task.

Some sophisticated agents can even be improved through a feedback loop. When the data it generates is fed back into the system, the AI agent becomes smarter and faster.

A step-by-step look at the process behind agentic AI systems. AI agents process user input, retrieve information from databases and other sources, and refine tasks in real time to deliver actionable results.

Accelerated by NVIDIA RTX AI PCs, these agents can perform inferencing and execute tasks faster than any other PC. Users can operate the agent locally to help ensure data privacy, even without an internet connection.

AnythingLLM: A Community Effort, Accelerated by RTX

The AI community is already diving into the possibilities of agentic AI, experimenting with ways to create smarter, more capable systems.

Applications like AnythingLLM let developers easily build, customize and unlock agentic AI with their favorite models — like Llama and Mistral — as well as with other tools, such as Ollama and LMStudio. AnythingLLM is accelerated on RTX-powered AI PCs and workstations with high-performance Tensor Cores, dedicated hardware that provides the compute performance needed to run the latest and most demanding AI models.

AnythingLLM is designed to make working with AI seamless, productive and accessible to everyone. It allows users to chat with their documents using intuitive interfaces, use AI agents to handle complex and custom tasks, and run cutting-edge LLMs locally on RTX-powered PCs and workstations. This means unlocked access to local resources, tools and applications that typically can’t be integrated with cloud- or browser-based applications, or those that require extensive setup and knowledge to build. By tapping into the power of NVIDIA RTX GPUs, AnythingLLM delivers faster, smarter and more responsive AI for a variety of workflows — all within a single desktop application.

AnythingLLM’s Community Hub lets AI enthusiasts easily access system prompts that can help steer LLM behavior, discover productivity-boosting slash commands, build specialized AI agent skills for unique workflows and custom tools, and access on-device resources.

Example of a user invoking the agent to complete a Web Search query.

Some example agent skills that are available in the Community Hub include Microsoft Outlook email assistants, calendar agents, web searches and home assistant controllers, as well as agents for populating and even integrating custom application programming interface endpoints and services for a specific use case.

By enabling AI enthusiasts to download, customize and use agentic AI workflows on their own systems with full privacy, AnythingLLM is fueling innovation and making it easier to experiment with the latest technologies — whether building a spreadsheet assistant or tackling more advanced workflows.

Experience AnythingLLM now.

Powered by People, Driven by Innovation

AnythingLLM showcases how AI can go beyond answering questions to actively enhancing productivity and creativity. Such applications illustrate AI’s move toward becoming an essential collaborator across workflows.

Agentic AI’s potential applications are vast and require creativity, expertise and computing capabilities. NVIDIA RTX AI PCs deliver peak performance for running agents locally,  whether accomplishing simple tasks like generating and distributing content, or managing more complex use cases such as orchestrating enterprise software.

Learn more and get started with agentic AI.

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

Read More