SK Telecom improves telco-specific Q&A by fine-tuning Anthropic’s Claude models in Amazon Bedrock

SK Telecom (SKT), South Korea’s leading telecommunications company serving 30 million customers, is at the forefront of AI innovation. In line with its AI Pyramid Strategy, which aims to unlock AI’s potential for anyone, anywhere, anytime, SKT has collaborated with the AWS Generative AI Innovation Center (GenAIIC) Custom Model Program to explore domain-trained models using Amazon Bedrock for telco-specific use cases.

This collaboration aligns with SKT’s vision of using AI expertise and strategic partnerships to develop innovative AI-based products and services. One such initiative focused on developing a custom solution for grounded question answering (Q&A) based on reference documents.

Retrieval Augmented Generation (RAG) is a popular technique for Q&A tasks, offering improved factual accuracy and knowledge grounding. However, RAG faces challenges for telco use cases: generated responses may not match the preferred tone, style, and manner, and irrelevant retrieved documents can lead to inaccurate answers. To address this, SKT and AWS GenAIIC aimed to use model customization to improve Anthropic Claude models on Amazon Bedrock in three key areas:

  • Providing concise and informative answers
  • Correctly referencing links from retrieved documents
  • Answering in a tone and style consistent with SKT and similar to ground truth answers

Additionally, the team explored boosting smaller model performance using synthetic data generated by bigger large language models (LLMs) for knowledge distillation and scenarios with limited labeled training data.

Amazon Bedrock is a fully managed service that offers a variety of LLMs and foundation models (FMs), along with capabilities such as Amazon Bedrock Knowledge Bases, Amazon Bedrock Agents, and Amazon Bedrock Guardrails, that can expedite many generative AI use cases. Amazon Bedrock is the only fully managed service that lets you fine-tune Claude models, and it offers an intuitive and secure way to do so. A fine-tuned Claude model can be deployed using Amazon Bedrock and can seamlessly use other Amazon Bedrock capabilities, for example, Amazon Bedrock Knowledge Bases for telco domain-specific RAG or Amazon Bedrock Agents for agentic usage.

In this post, we share how SKT customized Anthropic’s Claude models in Amazon Bedrock for telco-specific Q&A over SKT’s technical telecommunication documents.

Solution overview

The team explored combinations of prompt optimization, customization (fine-tuning), and data augmentation with synthetic data. This multifaceted approach aimed to maximize the benefits of each technique for the grounded Q&A generation task.

In the following sections, we explore these methods in more detail.

Anthropic’s Claude customization with prompt optimization

Fine-tuning, which is available through Amazon Bedrock for various FMs, including Anthropic’s Claude, allows adaptation of pre-trained language models for specific use cases. It’s particularly effective for tailoring response style and format adherence.
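
As a minimal sketch of starting such a customization job through the Amazon Bedrock API with boto3 (the bucket names, job name, role ARN, base model identifier, and hyperparameter values below are placeholder assumptions, and the available hyperparameters depend on the chosen base model):

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Start a fine-tuning (customization) job on a Claude base model.
# All names, ARNs, and S3 locations here are placeholders.
response = bedrock.create_model_customization_job(
    jobName="telco-qa-finetune-job",
    customModelName="telco-qa-claude",
    roleArn="arn:aws:iam::123456789012:role/BedrockFineTuneRole",
    baseModelIdentifier="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={"epochCount": "2", "learningRateMultiplier": "1.0"},  # placeholder values
)
print(response["jobArn"])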

The team first optimized the system prompt, implementing standardized guidelines for answer formatting and document citation based on Anthropic model prompting best practices. Key focus areas included:

  • Clear presentation of system commands
  • Consistent use of code block formatting
  • Context-based tailored responses

This prompt engineering, combined with fine-tuning, yielded substantial improvements:

  • Over 50% increase in ROUGE-3 score
  • Over 25% improvement in ROUGE-L score
  • Over 4% increase in embedding similarity score
  • Significant progress in accurate reference citation

The iterative enhancement process demonstrated cumulative benefits, with prompt updates alone showing 35–40 percent improvements in key metrics, and the final customized model achieving 50–60 percent gains in some metrics.

This progression clearly illustrates the cumulative benefits of model customization through RAG, prompt engineering, and fine-tuning, resulting in a model that significantly outperformed both the baseline and the prompt-updated versions in terms of ROUGE scores and citation accuracy. The ROUGE score measures the similarity between ground truth answers and generated results by computing n-gram word overlaps. The following table summarizes these improvements; a short ROUGE computation sketch follows the table.

LLM | Prompt update | Fine-tuning | ROUGE-3 | ROUGE-L | Citation accuracy
Anthropic’s Claude 3 Sonnet | | | baseline | baseline | baseline
Anthropic’s Claude 3 Sonnet | ✅ | | +38.30% | +13.4% | +52.94%
Anthropic’s Claude 3 Sonnet | ✅ | ✅ | +58.1% | +26.8% | +70.59%

(The last three columns show relative improvement over the baseline.)
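
For reference, ROUGE-N can be computed in a few lines of Python. The sketch below is a simplified illustration, not the exact evaluation code used in this work; it counts overlapping n-grams between a generated answer and a ground truth answer and reports an F1-style score:

from collections import Counter

def rouge_n(reference: str, candidate: str, n: int = 3) -> float:
    """Simplified ROUGE-N F1: n-gram overlap between reference and candidate."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())
    if not ref or not cand or overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    return 2 * precision * recall / (precision + recall)

print(rouge_n("the plan includes unlimited data", "the plan includes unlimited 5G data", n=3))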

Synthetic data for fine-tuning

To address the challenge of limited high-quality labeled training data, the team explored synthetic data generation techniques. This approach also facilitates knowledge distillation from larger LLMs to smaller, more targeted models, offering benefits such as lower latency and cost.

The team conducted controlled experiments using:

  • A baseline set of 500 ground truth samples
  • An augmented set with 500 original plus 1,500 synthetic samples
  • A larger original set of 2,000 samples

Synthetic data was generated using Anthropic’s Claude 3 Sonnet, creating new question-answer pairs over the same retrieved documents used in the ground truth examples.
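
The following is a minimal sketch of this kind of synthetic data generation with the Amazon Bedrock Converse API; the model ID, prompt wording, and example document are illustrative assumptions, not SKT’s actual pipeline:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def generate_synthetic_qa(document: str) -> str:
    """Ask Claude to write a new question-answer pair grounded in a retrieved document."""
    prompt = (
        "You are preparing training data for a telco Q&A assistant.\n"
        "Based only on the document below, write one new question a customer might ask "
        "and a concise answer that cites the document.\n\n"
        f"<document>\n{document}\n</document>"
    )
    response = bedrock_runtime.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512, "temperature": 0.7},
    )
    return response["output"]["message"]["content"][0]["text"]

print(generate_synthetic_qa("5G home internet plans include a Wi-Fi 6 router at no extra cost..."))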

The results were evaluated using both LLM-based comparison and human preference evaluation. Human evaluators blindly ranked model outputs, with scores assigned based on preference (Best: 4, Second: 3, Third: 2, Worst: 1). The following table shows the results of the human preference evaluation scores.

Rank | Model | Cumulative score (best possible: 160)
1 | Fine-tuned with 2,000 original samples | 114
2 | Fine-tuned with 500 original and 1,500 synthetic samples | 112
3 | Fine-tuned with 500 original samples | 85
4 | No fine-tuning (baseline) | 84

Some key findings include:

  • Small training sets (500 samples) showed minimal improvement over baseline
  • Larger training sets (2,000 samples) scored considerably higher
  • Synthetically augmented data performed similarly to equivalent-sized original data

Although having a large volume of domain-specific training data is always ideal, many businesses have limited available datasets. In such scenarios, synthetic data can play a crucial role in place of original data. This demonstrates the potential of synthetic data for model customization.

Conclusion

SK Telecom’s collaboration with AWS GenAIIC showcases the company’s commitment to developing innovative AI solutions for telco challenges. By using Amazon Bedrock to customize Anthropic’s Claude models, SKT has achieved significant performance improvements for telco-specific, Korean language use cases without the need to build models from scratch. The proof of concept demonstrated significant improvements:

  • ~58% increase in ROUGE-3 score
  • ~27% increase in ROUGE-L score
  • Substantial improvement in returning correct reference links

This approach, combined with synthetic data generation techniques, aligns with SKT’s AI Pyramid Strategy, enabling faster testing and development of new approaches. As SKT continues to focus on key areas such as personal AI assistants, AI healthcare, and AI data centers, this collaboration with AWS represents a significant step in their AI evolution and long-term competitiveness in the global AI landscape.

For those interested in working with AWS on similar projects, visit Generative AI Innovation Center.


About the Authors

Sungmin Hong is a Senior Applied Scientist at the AWS Generative AI Innovation Center, where he helps expedite a variety of use cases for AWS customers. Before joining Amazon, Sungmin was a postdoctoral research fellow at Harvard Medical School. He holds a Ph.D. in Computer Science from New York University. Outside of work, Sungmin enjoys hiking, reading, and cooking.

Sujeong Cha is a Deep Learning Architect at the AWS Generative AI Innovation Center, where she specializes in model customization and optimization. She has extensive hands-on experience in solving customers’ business use cases by utilizing generative AI as well as traditional AI/ML solutions. Sujeong holds a M.S. degree in Data Science from New York University.

Arijit Ghosh Chowdhury is a Scientist with the AWS Generative AI Innovation Center, where he works on model customization and optimization. In his role, he works on applied research in fine-tuning and model evaluations to enable GenAI for various industries. He has a Master’s degree in Computer Science from the University of Illinois at Urbana Champaign, where his research focused on question answering, search and domain adaptation.

Yiyue Qian is an Applied Scientist II at the AWS Generative AI Innovation Center, where she supports providing generative AI solutions to AWS customers. In this role, she collaborates with a team of experts to develop innovative AI-driven models for AWS customers across various industries. Yiyue holds a Ph.D. in Computer Science from the University of Notre Dame, where her research focused on advanced machine learning and deep learning techniques.

Wei-Chih Chen is a Machine Learning Engineer at the AWS Generative AI Innovation Center, where he works on model customization and optimization for LLMs. He also builds tools to help his team tackle various aspects of the LLM development life cycle, including fine-tuning, benchmarking, and load testing, accelerating the adoption of diverse use cases for AWS customers. He holds an M.S. degree in Computer Science from UC Davis.

Hannah Marlowe is a Senior Manager of Model Customization at the AWS Generative AI Innovation Center. Her team specializes in helping customers develop differentiating generative AI solutions using their unique and proprietary data to achieve key business outcomes. She holds a Ph.D. in Physics from the University of Iowa, with a focus on astronomical X-ray analysis and instrumentation development. Outside of work, she can be found hiking, mountain biking, and skiing around the mountains in Colorado.

Seunghyeon Jeong (Steve) is a team leader of the Platform Application team at SKT. He is responsible for commercializing the Global Intelligence Platform (GIP), which provides AI models and tools. His team is expanding the delivery of models and features to make it easier for internal teams to apply AI, contributing to SKT’s AI Transformation. Before entering the AI space, he was a Product Manager, developing and operating various mobile services such as mobile wallet, fashion streaming, and unified login services for the US and Korea.

Sunwoo Lee (Lois) is the team leader of the Data Construction and Evaluation Team within SK Telecom’s Global AI Tech division. She oversees the design and construction of training data for language models, the model performance evaluation process, and its application to services. Her career has focused on NLP within IT, which is a great fit with her background in Linguistics and Korean language education. Alongside her world-class team, she continues to explore and solve fascinating problems such as how to optimize the design of data for language model training, which tasks and methods to implement for validating language model performance, and the best design of AI-human conversations.

Eric Davis is the vice president of the AI Tech Collaboration Group at SKT. Eric oversees tech collaborations with worldwide tech partners to customize large language models (LLMs) for the telecommunications domain. His teams are responsible for designing and building the datasets to tune LLMs, as well as benchmarking LLMs in general and for the telecommunications domain. Eric holds a Master of Science degree in Computer Science from Carnegie Mellon from the Language Technologies Institute and a Bachelor of Arts in Linguistics and Psychology from the University of California, Los Angeles.

Scaling Rufus, the Amazon generative AI-powered conversational shopping assistant with over 80,000 AWS Inferentia and AWS Trainium chips, for Prime Day

Amazon Rufus is a shopping assistant experience powered by generative AI. It generates answers using relevant information from across Amazon and the web to help Amazon customers make better, more informed shopping decisions. With Rufus, customers can shop alongside a generative AI-powered expert that knows Amazon’s selection inside and out and brings it all together with information from across the web.

To meet the needs of Amazon customers at scale, Rufus required a low-cost, performant, and highly available infrastructure for inference. The solution needed the capability to serve multi-billion parameter large language models (LLMs) with low latency across the world to service its expansive customer base. Low latency makes sure users have a positive experience chatting with Rufus and can start getting responses in less than a second. To achieve this, the Rufus team is using multiple AWS services and AWS AI chips, AWS Trainium and AWS Inferentia.

Inferentia and Trainium are purpose-built chips developed by AWS that accelerate deep learning workloads with high performance and lower overall costs. With these chips, Rufus reduced its costs by 4.5 times compared with other evaluated solutions while maintaining low latency for its customers. In this post, we dive into the Rufus inference deployment using AWS chips and how this enabled one of the most demanding events of the year—Amazon Prime Day.

Solution overview

At its core, Rufus is powered by an LLM trained on Amazon’s product catalog and information from across the web. LLM deployment can be challenging, requiring you to balance factors such as model size, model accuracy, and inference performance. Larger models generally have better knowledge and reasoning capabilities but come at a higher cost due to more demanding compute requirements and increasing latency. Rufus would need to be deployed and scaled to meet the tremendous demand of peak events like Amazon Prime Day. Considerations for this scale include how well it needs to perform, its environmental impact, and the cost of hosting the solution. To meet these challenges, Rufus used a combination of AWS solutions: Inferentia2 and Trainium, Amazon Elastic Container Service (Amazon ECS), and Application Load Balancer (ALB). In addition, the Rufus team partnered with NVIDIA to power the solution using NVIDIA’s Triton Inference Server, providing capabilities to host the model using AWS chips.

Rufus inference is a Retrieval Augmented Generation (RAG) system with responses enhanced by retrieving additional information such as product information from Amazon search results. These results are based on the customer query, making sure the LLM generates reliable, high-quality, and precise responses.

To make sure Rufus was best positioned for Prime Day, the Rufus team built a heterogeneous inference system using multiple AWS Regions powered by Inferentia2 and Trainium. Building a system across multiple Regions allowed Rufus to benefit in two key areas. First, it provided additional capacity that could be used during times of high demand, and second, it improved the overall resiliency of the system.

The Rufus team was also able to use both Inf2 and Trn1 instance types. Because Inf2 and Trn1 instance types use the same AWS Neuron SDK, the Rufus team was able to use both instances to serve the same Rufus model. The only configuration setting to adjust was the tensor parallelism degree (24 for Inf2, 32 for Trn1). Using Trn1 instances also led to an additional 20% latency reduction and throughput improvement compared to Inf2.
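
As an illustrative sketch (the model name and engine arguments are assumptions, not Rufus’s production configuration, and device="neuron" requires a vLLM build with Neuron support), serving the same model on both instance families only requires changing the tensor parallelism degree:

from vllm import LLM, SamplingParams

# Same model artifact on both instance families; only the tensor parallelism
# degree changes (24 NeuronCores on Inf2, 32 on Trn1).
TP_DEGREE = {"inf2": 24, "trn1": 32}

llm = LLM(
    model="example-org/example-multi-billion-parameter-llm",  # placeholder model
    tensor_parallel_size=TP_DEGREE["trn1"],
    max_num_seqs=8,           # concurrent sequences handled by continuous batching
    max_model_len=4096,
    device="neuron",          # route execution through the AWS Neuron SDK backend
)

outputs = llm.generate(
    ["What accessories are compatible with this laptop?"],
    SamplingParams(temperature=0.7, max_tokens=256),
)
print(outputs[0].outputs[0].text)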

The following diagram illustrates the solution architecture.

To support real-time traffic routing across multiple Regions, Rufus built a novel traffic orchestrator. Amazon CloudWatch supported the underlying monitoring, helping the team adjust the traffic ratio across the different Regions in less than 15 minutes based on the traffic pattern changes. By using this type of orchestration, the Rufus team had the ability to direct requests to other Regions when needed, with a small trade-off of latency to the first token. Due to Rufus’s streaming architecture and the performant AWS network between Regions, the perceived latency was minimal for end-users.

These choices allowed Rufus to scale up to over 80,000 Trainium and Inferentia chips across three Regions, serving an average of 3 million tokens a minute while maintaining a P99 latency of less than 1 second to the first response for Prime Day customers. In addition, by using these purpose-built chips, Rufus achieved 54% better performance per watt than other evaluated solutions, which helped the Rufus team meet energy efficiency goals.

Optimizing inference performance and host utilization

Within each Region, the Rufus inference system used Amazon ECS, which managed the underlying Inferentia and Trainium powered instances. Because Amazon ECS manages the underlying infrastructure, the Rufus team only needed to bring their container and configuration by defining an ECS task. Within each container, an NVIDIA Triton Inference Server with a Python backend runs vLLM with the Neuron SDK. vLLM is a memory-efficient inference and serving engine that is optimized for high throughput. The Neuron SDK makes it straightforward for teams to adopt AWS chips and supports many different libraries and frameworks such as PyTorch Lightning.

The Neuron SDK provides a straightforward LLM inference solution on Trainium and Inferentia hardware with optimized performance supporting a wide range of transformer-based LLM architectures. To reduce latency, Rufus collaborated with the AWS Annapurna team to develop various optimizations, such as INT8 (weight-only) quantization, continuous batching with vLLM, and resource, compute, and memory bandwidth improvements in the Neuron compiler and runtime. These optimizations are currently deployed in Rufus production and are available to use in the Neuron SDK 2.18 and onward.

To reduce overall waiting time for customers to start seeing a response from Rufus, the team also developed an inference streaming architecture. With the high compute and memory load needed for LLM inference, the total time it takes to finish generating the full response for a customer query can take multiple seconds. With a streaming architecture, Rufus is able to return the tokens right after they’re generated. This optimization allows the customer to start consuming the response in less than 1 second. In addition, multiple services work together using gRPC connections to intelligently aggregate and enhance the streaming response in real time for customers.
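
Conceptually, the difference is between returning the whole answer at once and yielding tokens as they are produced. The sketch below is generic Python illustrating the streaming pattern, not Rufus’s production gRPC code; the token list and delays are stand-ins:

import time

def generate_tokens(prompt: str):
    """Stand-in for an LLM that produces one token at a time."""
    for token in ["Noise", "-cancelling", " headphones", " are", " popular", "."]:
        time.sleep(0.2)   # simulated per-token generation latency
        yield token

def stream_response(prompt: str):
    # Flush each token to the client as soon as it is generated, so the
    # customer starts reading well before the full answer is finished.
    for token in generate_tokens(prompt):
        print(token, end="", flush=True)
    print()

stream_response("Recommend headphones for flights")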

As shown in the following figure, images and links are embedded in the response, which allow customers to engage and continue exploring with Rufus.

Scaling up

Although we have to maintain low latency for the best customer experience, it’s also crucial to scale the service throughput by achieving high hardware resource utilization. High hardware utilization makes sure accelerators don’t sit idle and needlessly increase costs. To optimize the inference system throughput, the team improved both single-host throughput as well as load balancing efficiency.

Load balancing for LLM inference is tricky due to the following challenges. First, a single host can only handle a limited number of concurrent requests. Second, the end-to-end latency to complete one request can vary, spanning many seconds depending on the LLM response length.

To address the challenges, the team optimized throughput by considering both single-host throughput and throughput across many hosts using load balancing.

The team used the least outstanding requests (LOR) routing algorithm from ALB, increasing throughput by five times in comparison to an earlier baseline measurement. This allows each host to have enough time to process in-flight requests and stream back responses using a gRPC connection, without getting overwhelmed by multiple requests received at the same time. Rufus also collaborated with the AWS and vLLM teams to improve single-host concurrency using vLLM integration with the Neuron SDK and NVIDIA Triton Inference Server.

Figure 1. ECS tasks scale horizontally hosting the Triton Inference Server and dependencies

With this integration, Rufus was able to benefit from a critical optimization: continuous batching. Continuous batching allows a single host to greatly increase throughput. In addition, continuous batching provides unique capabilities in comparison to other batch techniques, such as static batching. For example, when using static batching, the time to first token (TTFT) increases linearly with the number of requests in one batch. Continuous batching prioritizes the prefill stage for LLM inference, keeping TTFT under control even with more requests running at the same time. This helped Rufus provide a pleasant experience with low latency when generating the first response, and improve the single-host throughput to keep serving costs under control.

Conclusion

In this post, we discussed how Rufus is able to reliably deploy and serve its multi-billion-parameter LLM using the Neuron SDK with Inferentia2 and Trainium chips and AWS services. Rufus continues to evolve with advancements in generative AI and customer feedback and we encourage you to use Inferentia and Trainium.

Learn more about how we are innovating with generative AI across Amazon.


About the author

James Park is a Solutions Architect at Amazon Web Services. He works with Amazon.com to design, build, and deploy technology solutions on AWS, and has a particular interest in AI and machine learning. In his spare time, he enjoys seeking out new cultures, new experiences, and staying up to date with the latest technology trends.

RJ is an Engineer within Amazon. He builds and optimizes distributed systems for training and works on optimizing systems to reduce latency for ML inference. Outside work, he is exploring using generative AI for building food recipes.

Yang Zhou is a software engineer working on building and optimizing machine learning systems. His recent focus is enhancing the performance and cost efficiency of generative AI inference. Beyond work, he enjoys traveling and has recently discovered a passion for running long distances.

Adam (Hongshen) Zhao is a Software Development Manager at Amazon Stores Foundational AI. In his current role, Adam is leading Rufus Inference team to build GenAI inference optimization solutions and inference system at scale for fast inference at low cost. Outside work, he enjoys traveling with his wife and art creations.

Faqin Zhong is a software engineer at Amazon Stores Foundational AI, working on Large Language Model (LLM) inference infrastructure and optimizations. Passionate about Generative AI technology, Faqin collaborates with leading teams to drive innovations, making LLMs more accessible and impactful, ultimately enhancing customer experiences across diverse applications. Outside of work she enjoys cardio exercise and baking with her son.

Nicolas Trown is an engineer in Amazon Stores Foundational AI. His recent focus is lending his systems expertise across Rufus to aid the Rufus Inference team and improve efficient utilization across the Rufus experience. Outside of work he enjoys spending time with his wife and taking day trips to the nearby coast, Napa, and Sonoma areas.

Bing Yin is a director of science at Amazon Stores Foundational AI. He leads the effort to build LLMs that are specialized for shopping use cases and optimized for inference at Amazon scale. Outside of work, he enjoys running marathon races.

Exploring alternatives and seamlessly migrating data from Amazon Lookout for Vision

Amazon Lookout for Vision, the AWS service designed to create customized artificial intelligence and machine learning (AI/ML) computer vision models for automated quality inspection, will be discontinued on October 31, 2025. New customers will not be able to access the service effective October 10, 2024, but existing customers will be able to use the service as normal until October 31, 2025. AWS will continue to support the service with security updates, bug fixes, and availability enhancements, but we do not plan to introduce new features for this service.

This post discusses some alternatives to Lookout for Vision and how you can export your data from Lookout for Vision to migrate to an alternate solution.

Alternatives to Lookout for Vision

If you’re interested in an alternative to Lookout for Vision, AWS has options for both buyers and builders.

For an out-of-the-box solution, the AWS Partner Network offers solutions from multiple partners. You can browse solutions on the Computer Vision for Quality Insights page in the AWS Solutions Library. These partner solutions include options for software, software as a service (SaaS) applications, managed solutions or custom implementations based on your needs. This approach provides a solution that addresses your use case without requiring you to have expertise in imaging, computer vision, AI, or application development. This typically provides the fastest time to value by taking advantage of the specialized expertise of the AWS Partners. The Solutions Library also has additional guidance to help you build solutions faster.

If you prefer to build your own solution, AWS offers AI tools and services to help you develop an AI-based computer vision inspection solution. Amazon SageMaker provides a set of tools to build, train, and deploy ML models for your use case with fully managed infrastructure, tools, and workflows. In addition to SageMaker enabling you to build your own models, Amazon SageMaker JumpStart offers built-in computer vision algorithms and pre-trained defect detection models that can be fine-tuned to your specific use case. This approach provides you the tools to accelerate your AI development while providing complete flexibility to build a solution that meets your exact requirements and integrates with your existing hardware and software infrastructure. This typically provides the lowest operating costs for a solution.

AWS also offers Amazon Bedrock, a fully managed service that offers a choice of high-performing generative AI foundation models (FMs), including models that can help build a defect detection model running in the cloud. This approach enables you to build a custom solution while using the power of generative AI to handle the custom computer vision model creation and some of the code generation to speed development, eliminating the need for full AI computer vision expertise. Amazon Bedrock provides the ability to analyze images for defects, compare performance of different models, and generate code for custom applications. This alternative is useful for use cases that don’t require low latency processing, providing faster time to value and lower development costs.

Migrating data from Lookout for Vision

To move existing data from Lookout for Vision to use in an alternative implementation, the Lookout for Vision SDK provides the capability to export a dataset from the service to an Amazon Simple Storage Service (Amazon S3) bucket. This procedure exports the training dataset, including manifest and dataset images, for a project to a destination Amazon S3 location that you specify. With the exported dataset and manifest file, you can use the same data that you used to create a Lookout for Vision model to create a model using SageMaker or Amazon Bedrock, or provide it to a partner to incorporate into their customizations for your use case.
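
A simplified sketch of such an export using boto3 is shown below; it lists the training dataset entries for a project and writes the manifest to a destination S3 location. The project, bucket, and prefix names are placeholders, the dataset images referenced by the manifest would still need to be copied separately, and AWS publishes a more complete export script in the Lookout for Vision documentation:

import boto3

lookoutvision = boto3.client("lookoutvision")
s3 = boto3.client("s3")

PROJECT, DATASET_TYPE = "my-defect-project", "train"            # placeholders
DEST_BUCKET, DEST_PREFIX = "my-export-bucket", "lookoutvision-export/"

# Collect every JSON line of the training dataset manifest, page by page.
entries, token = [], None
while True:
    kwargs = {"ProjectName": PROJECT, "DatasetType": DATASET_TYPE}
    if token:
        kwargs["NextToken"] = token
    page = lookoutvision.list_dataset_entries(**kwargs)
    entries.extend(page["DatasetEntries"])
    token = page.get("NextToken")
    if not token:
        break

# Write the manifest to the destination S3 location for reuse with
# SageMaker, Amazon Bedrock, or a partner solution.
s3.put_object(
    Bucket=DEST_BUCKET,
    Key=f"{DEST_PREFIX}{DATASET_TYPE}.manifest",
    Body="\n".join(entries).encode("utf-8"),
)
print(f"Exported {len(entries)} dataset entries")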

Summary

Although Lookout for Vision is planned to shut down on October 31, 2025, AWS offers a powerful set of AI/ML services and solutions in the form of SageMaker tools to build custom models and generative AI with Amazon Bedrock to do customized inspection and generate code, in addition to a range of offerings from partners in the AWS Partner Network. Export tools enable you to effortlessly move your data from Lookout for Vision to an alternate solution if you so choose. You should explore these options to determine what works best for your specific needs.

For more details, refer to the following resources:


About the Author

Tim Westman is the Product Manager and Go-to-Market Lead for Edge Machine Learning at AWS, where he leads product management and business development for the Edge Machine Learning business. In this role, he works with customers to help build computer vision solutions at the edge to solve complex operational challenges. Tim has more than 30 years of experience in sales, business development, and product management roles for leading hardware and software companies, with the last 8 years specializing in AI and computer vision for IoT applications.

AI’ll Be by Your Side: Mental Health Startup Enhances Therapist-Client Connections

Half of the world’s population will experience a mental health disorder — but the median number of mental health workers per 100,000 people is just 13, according to the World Health Organization.

To help tackle this disparity — which can vary by over 40x between high-income and low-income countries — a Madrid-based startup is offering therapists AI tools to improve the delivery of mental health services.

Therapyside, a member of the NVIDIA Inception program for cutting-edge startups, is bolstering its online therapy platform using NVIDIA NIM inference microservices. These AI microservices serve as virtual assistants and notetakers, letting therapists focus on connecting with their clients.

“In a therapy setting, having a strong alliance between counselor and client is everything,” said Alessandro De Sario, founder and CEO of Therapyside. “When a therapist can focus on the session without worrying about note-taking, they can reach that level of trust and connection much quicker.”

For the therapists and clients who have opted in to test these AI tools, a speech recognition model transcribes their conversations. A large language model summarizes the session into clinical notes, saving time for therapists so they can speak with more clients and work more efficiently. Another model powers a virtual assistant, dubbed Maia, that can answer therapists’ questions using retrieval-augmented generation, aka RAG.

Therapyside aims to add features over time, such as support for additional languages and an offline version that can transcribe and summarize in-person therapy sessions.

“We’ve just opened the door,” said De Sario. “We want to make the tool much more powerful so it can handle administrative tasks like calendar management and patient follow-up, or remind therapists of topics they should cover in a given session.”

AI’s in Session: Enhancing Therapist-Client Relationships

Therapyside, founded in 2017, works with around 1,000 licensed therapists in Europe offering counseling in English, Italian and Spanish. More than 500,000 therapy sessions have been completed through its virtual platform to date.

The company’s AI tools are currently available through a beta program. Therapists who choose to participate can invite their clients to opt in to the AI features.

“It’s incredibly helpful to have a personalized summary with a transcription that highlights the most important points from each session I have with my patients,” said Alejandro A., one of the therapists participating in the beta program. “I’ve been pleasantly surprised by its ability to identify the most significant areas to focus on with each patient.”

Screen capture of Therapyside session with transcription running live
A speech recognition AI model can capture live transcriptions of sessions.

The therapists testing the tool rated the transcriptions and summaries as highly accurate, helping them focus on listening without worrying about note-taking.

“The recaps allow me to be fully present with the clients in my sessions,” said Maaria A., another therapist participating in the beta program.

During sessions, clients share details about their life experiences that are captured in the AI-powered transcriptions and summaries. Therapyside’s RAG-based Maia connects to these resources to help therapists quickly recall minutiae like the name of a client’s sibling, or track how a client’s main challenges have evolved over time. This information can help therapists pose more personalized questions and provide better support.

“Maia is a valuable tool to have when you’re feeling a little stuck,” said Maaria A. “I have clients all over the world, so Maia helps remind me where they live. And if I ask Maia to suggest exercises clients could do to boost their self-esteem, it helps me find resources I can send to them, which helps save time.”

Screen capture of a therapist Q&A with the Maia virtual assistant
Maia can answer therapists’ questions based on session transcripts and summaries.

Take Note: AI Microservices Enable Easy Deployment

Therapyside’s AI pipeline runs on NVIDIA GPUs in a secure cloud environment and is built with NVIDIA NIM, a set of easy-to-use microservices designed to speed up AI deployment.

For transcription, the pipeline uses NVIDIA Riva NIM microservices, which include NVIDIA Parakeet, a record-setting family of models, to deliver highly accurate automatic speech recognition.

Flowchart illustrating Therapyside’s AI pipeline.

Once the transcript is complete, the text is processed by a NIM microservice for Meta’s Llama 3.1 family of open-source AI models to generate a summary that’s added to the client’s clinical history.
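
NIM microservices expose an OpenAI-compatible API, so a summarization step like this can be sketched roughly as follows. The endpoint URL, model name, and prompt are illustrative assumptions, not Therapyside’s production code:

from openai import OpenAI

# A Llama 3.1 NIM microservice running in a private cloud environment;
# the base_url and model name below are placeholders.
client = OpenAI(base_url="http://llama-nim.internal:8000/v1", api_key="not-used")

def summarize_session(transcript: str) -> str:
    """Turn a session transcript into concise clinical-style notes."""
    completion = client.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",
        messages=[
            {"role": "system", "content": "Summarize this therapy session transcript "
                                          "into concise clinical notes."},
            {"role": "user", "content": transcript},
        ],
        temperature=0.2,
    )
    return completion.choices[0].message.content

notes = summarize_session("Therapist: How has your week been? Client: ...")
print(notes)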

The Maia virtual assistant, which also uses a Llama 3.1 NIM microservice, accesses these clinical records using a RAG pipeline powered by NVIDIA NeMo Retriever NIM microservices. RAG techniques enable organizations to connect AI models to their private datasets to deliver contextually accurate responses.

Therapyside plans to further customize Maia with capabilities that support specific therapeutic methods, such as cognitive behavioral therapy and psychodynamic therapy. The team is also integrating NVIDIA NeMo Guardrails to further enhance the tools’ safety and security.

Kimberly Powell, vice president of healthcare at NVIDIA, will discuss Therapyside and other healthcare innovators in a keynote address at HLTH, a conference taking place October 20-23 in Las Vegas.

Learn more about NVIDIA Inception and get started with NVIDIA NIM microservices at ai.nvidia.com.

The Next Chapter Awaits: Dive Into ‘Diablo IV’s’ Latest Adventure ‘Vessel of Hatred’ on GeForce NOW

Prepare for a devilishly good time this GFN Thursday as the critically acclaimed Diablo IV: Vessel of Hatred downloadable content (DLC) joins the cloud, one of six new games available this week.

GeForce NOW also extends its game-library sync feature to Battle.net accounts, so members can seamlessly bring their favorite Blizzard games into their cloud-streaming libraries.

Hell’s Bells and Whistles

Get ready to rage. New DLC for the hit title Diablo IV: Vessel of Hatred is available to stream at launch this week, with thrilling content and gameplay for GeForce NOW members to experience.

Diablo IV Vessel of Hatred DLC on GeForce NOW
Hate is in the air.

Diablo IV: Vessel of Hatred DLC is the highly anticipated expansion of the latest installment in Blizzard’s iconic action role-playing game series. It introduces players to the lush and dangerous jungles of Nahantu. Teeming with both beauty and dangers, this new environment offers a fresh backdrop for action-packed battles against the demonic forces of Hell. A new playable class, the Spiritborn, offers unique gameplay mechanics tied to four guardian spirits: the eagle, gorilla, jaguar and centipede.

The DLC extends the main Diablo IV story and includes new features such as recruitable Mercenaries, a Player vs. Everyone co-op endgame activity, Party Finder to help members team up and take down challenges together, and more. Vessel of Hatred arrives alongside major updates including revamped leveling, a new difficulty system and Paragon adjustments that will continue to enhance the world of Diablo IV.

Ultimate members can experience the wrath at up to 4K resolution and 120 frames per second with support for NVIDIA DLSS and ray-tracing technologies. And members can jump right into the latest DLC without having to wait around for updates. Hell never looked so good, even on low-powered devices.

Let That Sync In

Battle.net game sync on GeForce NOW
Connection junction.

With game syncing for Blizzard’s Battle.net game library coming to GeForce NOW this week, members can connect their digital game store accounts so that all of their supported games are part of their streaming libraries.

Members can now easily find and stream popular titles such as StarCraft II, Overwatch 2, Call of Duty HQ and Hearthstone from their cloud gaming libraries, enhancing the games’ accessibility across a variety of devices.

Battle.net joins other digital storefronts that already have game sync support, including Steam, Epic Games Store, Xbox and Ubisoft Connect. This allows members to consolidate their gaming experiences in one place.

Plus, GeForce NOW members can play high-quality titles without the need for high-end hardware, streaming from GeForce RTX-powered servers in the cloud. Whether battling demons in Sanctuary or engaging in epic firefights, GeForce NOW members get a seamless gaming experience anytime, anywhere.

Hot and New

Europa on GeForce NOW
Soar through serenity and uncover destiny, all from the cloud.

Europa is a peaceful game of adventure, exploration and meditation from Future Friends Games, ready for members to stream at launch this week. On the moon Europa, a lush terraformed paradise in Jupiter’s shadow, an android named Zee sets out in search of answers. Run, glide and fly across the landscape, solve mysteries in the ruins of a fallen utopia, and discover the story of the last human alive.

Members can look for the following games available to stream in the cloud this week:

  • Empyrion – Galactic Survival (New release on Epic Games Store, Oct. 10)
  • Europa (New release on Steam, Oct. 11)
  • Dwarven Realms (Steam)
  • Star Trek Timelines (Steam)
  • Star Trucker (Steam)
  • Starcom: Unknown Space (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.

Unlock the knowledge in your Slack workspace with Slack connector for Amazon Q Business

Amazon Q Business is a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. Amazon Q Business offers over 40 built-in connectors to popular enterprise applications and document repositories, including Amazon Simple Storage Service (Amazon S3), Salesforce, Google Drive, Microsoft 365, ServiceNow, Gmail, Slack, Atlassian, and Zendesk, and can help you create your generative AI solution with minimal configuration.

Nearly 100,000 organizations use Slack to bring the right people together to collaborate securely with each other. A Slack workspace captures invaluable organizational knowledge in the form of the information that flows through it as users communicate on it. Hence, it is valuable to make this knowledge quickly and securely available to users.

In this post, we will demonstrate how to set up the Slack connector for Amazon Q Business to sync communications from both public and private channels, reflecting user permissions. We will also guide you through the configurations needed in your Slack workspace. Additionally, you will learn how to configure the Amazon Q Business application and enable user authentication through AWS IAM Identity Center, the recommended service for managing a workforce’s access to AWS applications.

Data source overview

Amazon Q Business uses large language models (LLMs) to build a unified solution that connects multiple data sources. Typically, you’d need to use a natural language processing (NLP) technique called Retrieval Augmented Generation (RAG) for this. With RAG, generative AI enhances its responses by incorporating relevant information retrieved from a curated dataset. Amazon Q Business has a built-in managed RAG capability designed to reduce the undifferentiated heavy lifting involved in creating these systems. Typical of a RAG model, Amazon Q Business has two components: A retrieval component that retrieves relevant documents for the user query and a generation component that takes the query and the retrieved documents and then generates an answer to the query using an LLM.

A Slack workspace has multiple elements. It has public channels where workspace users can participate and private channels where only channel members can communicate with each other. Individuals can also directly communicate with each other in one-on-one conversations and in user groups. This communication is in the form of messages and threads of replies, with optional document attachments. Slack workspaces of active organizations are highly dynamic, with the content and collaboration evolving and growing in volume continuously.

The preceding figure shows the process flow of the solution. When you connect Amazon Q Business to a data source (in this case, Slack), what Amazon Q considers and crawls as a document varies by connector. For the Amazon Q Business Slack connector, each message, message attachment, and channel post is considered a single document. Slack conversation threads, which help you create organized discussions around specific messages, are also ingested as a single document each, regardless of the number of participants or messages they contain.

Amazon Q Business crawls access control list (ACL) information attached to a document (user and group information) from your Slack instance. This information can be used to filter chat responses to the user’s document access level. The Slack connector supports token-based authentication. This could be a Slack bot user OAuth token or Slack user OAuth token. See the Slack connector overview to get the list of entities that are extracted, supported filters, sync modes, and file types.

User IDs (_user_id) exist in Slack on messages and channels that have access permissions set. They are mapped from user emails to the IDs in Slack.

To connect your data source connector to Amazon Q Business, you must give Amazon Q Business an IAM role that has the following permissions (a sample role-creation sketch follows the list):

  • Permission to access the BatchPutDocument and BatchDeleteDocument operations to ingest documents.
  • Permission to access the User Store API operations to ingest user and group access control information from documents.
  • Permission to access your AWS Secrets Manager secret to authenticate your data source connector instance.
  • (Optional) If you’re using Amazon Virtual Private Cloud (Amazon VPC), permission to access your Amazon VPC.
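
As a rough sketch of creating such a role with boto3 (the role name, account ID, Region, application ID, and secret ARN are placeholders, and the User Store API permissions for ACL ingestion are omitted for brevity; consult the Amazon Q Business documentation for the authoritative policy):

import json
import boto3

iam = boto3.client("iam")
ACCOUNT_ID, APP_ID = "123456789012", "example-qbusiness-app-id"  # placeholders

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "qbusiness.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {  # Ingest and delete documents in the index
            "Effect": "Allow",
            "Action": ["qbusiness:BatchPutDocument", "qbusiness:BatchDeleteDocument"],
            "Resource": f"arn:aws:qbusiness:us-east-1:{ACCOUNT_ID}:application/{APP_ID}/*",
        },
        {  # Read the Slack OAuth token stored in Secrets Manager
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": f"arn:aws:secretsmanager:us-east-1:{ACCOUNT_ID}:secret:slack-token-*",
        },
        # Additional User Store API permissions for ACL ingestion are needed as well;
        # see the Amazon Q Business documentation.
    ],
}

role = iam.create_role(RoleName="QBusinessSlackDataSourceRole",
                       AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.put_role_policy(RoleName="QBusinessSlackDataSourceRole",
                    PolicyName="slack-connector-permissions",
                    PolicyDocument=json.dumps(permissions_policy))
print(role["Role"]["Arn"])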

Solution overview

In this solution, we will show you how to create a Slack workspace with users who perform various roles within the organization. We will then show you how to configure this workspace to define the set of scopes required by the Amazon Q Business Slack connector to index the user communication. This is followed by the configuration of the Amazon Q Business application and a Slack data source. Based on the configuration, when the data source is synchronized, the connector crawls and indexes the content from the workspace that was created on or before a specific date. The connector also collects and ingests ACL information for each indexed message and document. Thus, the search results of a query made by a user include results only from those documents that the user is authorized to read.

Prerequisites

To build the Amazon Q Business connector for Slack, you need the following:

In Slack:

  • Create a Slack bot user OAuth token or Slack user OAuth token. You can choose either token to connect Amazon Q Business to your Slack data source. See the Slack documentation on access tokens for more information.
  • Note your Slack workspace team ID from your Slack workspace main page URL. For example, https://app.slack.com/client/T0123456789/... where T0123456789 is the team ID.
  • Add the OAuth scopes and read permissions.

In your AWS account:

  • Create an AWS Identity and Access Management (IAM) role for your data source and, if using the Amazon Q Business API, note the ARN of the IAM role.
  • Store your Slack authentication credentials in an AWS Secrets Manager secret and, if using the Amazon Q Business API, note the ARN of the secret.
  • Enable and configure an IAM Identity Center instance. Amazon Q Business integrates with IAM Identity Center as a gateway to manage user access to your Amazon Q Business application. We recommend enabling and pre-configuring an Identity Center instance before you begin to create your Amazon Q Business application. Identity Center is the recommended AWS service for managing human user access to AWS resources. Amazon Q Business supports both organization and account level Identity Center instances. See Setting up for Amazon Q Business for more information.

Configure your Slack workspace

You will create one user for each of the following roles: Administrator, Data scientist, Database administrator, Solutions architect and Generic.

User name | Role
arnav_desai | Admin
jane_doe | Data Scientist
pat_candella | DB Admin
mary_major | Solutions Architect
john_stiles | Generic User

To showcase the ACL propagation, you will create three public channels, #general, #customerwork, and #random, that any member can access, including the Generic user. You will also create one private channel, #anydepartment-project-private, that can be accessed only by the users arnav_desai, john_stiles, mary_major, and pat_candella.

To create a Slack app:

  1. Navigate to the Slack API Your Apps page and choose Create New App.
  2. Select From scratch.
  3. Give the Slack app a name, select the workspace to develop your app in, and then choose Create App.
  4. After you’ve created your app, select it, navigate to Features, and choose OAuth & Permissions.
  5. Scroll down to Scopes > User Token Scopes and set the OAuth scope based on the user token scopes in Prerequisites for connecting Amazon Q Business to Slack.

Note: You can configure two types of scopes in a Slack workspace:

  1. Bot token scope: The bot token crawls only the messages to which the bot has been explicitly added. It is employed to grant restricted access to specific messages only.
  2. User token scope: The user token acts on behalf of a Slack user and can access only the data shared with that user.

For this example, you will use the user token scope so you can search the conversations between users.

  1. After the OAuth scope for the user token has been set up as described in the Slack prerequisites, scroll up to the OAuth Tokens for your Workspace section, choose Install to Workspace, and then choose Allow.
  2. This will generate a user OAuth token. Copy this token to use when configuring the Amazon Q Business Slack connector.

Configure the data source using the Amazon Q Business Slack connector

In this section, you will create an Amazon Q Business application using the console.

To create an Amazon Q Business application

  1. In the AWS Management Console for Amazon Q Business, choose Create Application.
  2. Enter an Application Name, such as my-slack-workspace. Leave the Service access as the default value and select AWS IAM Identity Center for Access Management. Enter a new Tag value as required and choose Create to create the Amazon Q Business application.
  3. Leave the default option of Use Native retriever selected for Retrievers, leave Enterprise as the Index provisioning, and leave the default value of 1 as the Number of units. Each unit in an Amazon Q Business index is 20,000 documents or 200 MB of extracted text (whichever comes first). Choose Next.
  4. Scroll down the list of available connectors and select Slack and then choose Next.

    1. Enter a Data source name and a Description to identify your data source and then enter the Slack workspace team ID to connect with Amazon Q Business.
    2. In the Authentication section, select Create and add a new secret.
    3. On the dialog box that appears, enter a Secret name followed by the User OAuth Slack token that was copied from the Slack workspace.
    4. For the IAM role, select Create a new service role (Recommended).
    5. In Sync scope, choose the following:
      • For select type of content to crawl, select All channels.
      • Select an appropriate date for Select crawl start date.
      • Leave the default value selected for Maximum file size as 50.
      • You can include specific Messages, such as bot messages or archived messages to sync.
      • Additionally, you can include up to 100 patterns to include or exclude filenames, types, or file paths to sync.

    6. For Sync mode, leave Full sync selected and for the Sync run schedule, select Run on demand.
    7. Leave the field mapping as is and choose Add data source.
    8. On the next page, choose Next.
  5. Add the five users you created earlier, who are a part of IAM Identity Center and the Slack workspace, to the Amazon Q Business application. To add users to Identity Center, follow the instructions in Add users to your Identity Center directory. When done, choose Add groups and users and choose Assign.
  6. When a user is added, each user is assigned the default Amazon Q Business Pro subscription. For more information on different pricing tiers, see the Amazon Q Business pricing page.
  7. Choose Create application to finish creating the Amazon Q Business application.
  8. After the application and the data source are created, select the data source and then choose Sync now to start syncing documents from your data source.
  9. The sync process ingests the documents from your Slack workspace to your selections in the Slack connector configuration in Amazon Q Business. The following screenshot shows the results of a successful sync, indicated by the status of Completed.

Search with Amazon Q Business

Now, you’re ready to make a few queries in Amazon Q Business.

To search using Amazon Q Business:

  1. Navigate to the Web experience settings tab and click on the Deployed URL.
  2. For this demonstration, sign in as pat_candella who has the role of DB Admin.
  3. Enter the password for pat_candella and choose Sign in.
  4. Upon successful sign-in, the Amazon Q Business web experience opens.
  5. In the Slack workspace, there is a public channel, #customerwork, that all users are members of. The #customerwork Slack channel is being used to communicate about an upcoming customer engagement, as shown in the following figure.
  6. Post the first question to Amazon Q Business.
I am currently using Apache Kafka. Can you list high level steps involved in migration to Amazon MSK?

Note that the response includes citations that refer to the conversation as well as the content of the PDF that was attached to the conversation.

Security and privacy options with Slack data connector

Next, you will create a private channel called #anydepartment-project-private with four out of the five users—arnav_desai, john_stiles, mary_major and pat_candella—and verify that the messages exchanged in a private channel are not available to non-members like jane_doe. Note that after you create a new private channel, you need to manually re-run the sync on the data source.

The following screenshot shows the private Slack channel with four of the five users and the Slack conversation.

Testing security and privacy options with Slack data connector

  1. While signed in as pat_candella, who is part of the private #anydepartment-project-private channel, execute the following query:
    What is Amazon Kendra and which API do I use to query a Kendra index?

  2. Now, sign in as jane_doe, who is not a member of the #anydepartment-project-private channel and execute the same query.
  3. Amazon Q Business prevents jane_doe from getting insights from information within the private channels that they aren’t part of, based on the synced ACL information.

Indexing aggregated Slack threads

Slack organizes conversations into threads, which can involve multiple users and messages. The Amazon Q Business Slack connector treats each thread as a single document, regardless of the number of participants or messages it contains. This approach allows Amazon Q Business to ingest entire conversation threads as individual units, maximizing the amount of data that can be processed within a single index unit. As a result, you can efficiently incorporate more comprehensive conversational context into your Amazon Q Business system.
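
To illustrate the idea, the sketch below uses the slack_sdk library to fetch a whole reply thread and flatten it into one document; it is a conceptual example rather than the connector’s internal implementation, and the environment variable, channel ID, and thread timestamp are placeholders:

import os
from slack_sdk import WebClient

client = WebClient(token=os.environ["SLACK_USER_TOKEN"])

def thread_as_document(channel_id: str, thread_ts: str) -> dict:
    """Fetch every message in a thread and flatten it into a single document."""
    replies = client.conversations_replies(channel=channel_id, ts=thread_ts)
    body = "\n".join(
        f"{msg.get('user', 'unknown')}: {msg.get('text', '')}"
        for msg in replies["messages"]
    )
    return {"id": f"{channel_id}-{thread_ts}", "title": "Slack thread", "body": body}

doc = thread_as_document("C12AB34578", "1728577200.000100")  # placeholder IDs
print(doc["body"])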

The figure that follows shows a conversation between pat_candella and jane_doe that includes six messages in a thread. The Slack connector aggregates this message thread as a single document, thus maximizing the use of an index unit.

Because the conversation thread is aggregated as a single document within the Amazon Q Business index, you can ask questions that pertain to a single conversation thread as shown in the following figure.

Troubleshooting the sync process

  • Why isn’t Amazon Q Business answering any of my questions?

If you aren’t getting answers to your questions from Amazon Q Business, verify the following:

  • Permissions – Document ACLs indexed by Amazon Q Business may not allow you to query certain data entities as demonstrated in our example. If this is the case, please reach out to your Slack workspace administrator to make sure that your user has access to required documents and repeat the sync process.
  • Data connector sync – A failed data source sync may prevent the documents from being indexed, meaning that Amazon Q Business would be unable to answer questions about the documents that failed to sync. Please refer to the official documentation to troubleshoot data source connectors.
  • I’m receiving access errors on Amazon Q Business application. What causes this?

See Troubleshooting Amazon Q Business identity and access to diagnose and fix common issues that you might encounter when working with Amazon Q and IAM.

  • How can I sync documents without ACLs?

Amazon Q Business supports crawling ACLs for document security by default. Turning off ACLs and identity crawling is no longer supported. If you want to index documents without ACLs, ensure that the documents are marked as public in your data source. Please refer to the official documentation, How the Amazon Q Business connector for Slack crawls ACLs.

  • My connector is unable to sync. How can I monitor data source sync progress?

Amazon Q Business provides visibility into the data sync operations. Learn more about this feature in the AWS Machine Learning blog.

Additionally, as the sync process runs, you can monitor progress or debug failures by monitoring the Amazon CloudWatch logs that can be accessed from the Details section of the Sync run history.

A sample query to determine which documents or messages were indexed from a specific Slack channel, C12AB34578, with a logStream of SYNC_RUN_HISTORY_REPORT/xxxxxxxxxxxxxxxxxxxxxxxx would look like the following:

fields LogLevel, DocumentId, DocumentTitle, CrawlAction, ConnectorDocumentStatus.Status as ConnectorDocumentStatus, ErrorMsg, CrawlStatus.Status as CrawlStatus, SyncStatus.Status as SyncStatus, IndexStatus.Status as IndexStatus, SourceUri, Acl, Metadata, HashedDocumentId, @timestamp
| filter @logStream like 'SYNC_RUN_HISTORY_REPORT/xxxxxxxxxxxxxxxxxxxxxxxx' and Metadata like /"stringValue":"C12AB34578"/
| sort @timestamp desc
| limit 10000

Choosing Run query displays the list of messages as the Amazon Q Business Index sync runs, as shown in the following figure.
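If you prefer to run this query programmatically rather than from the CloudWatch console, the following Python sketch uses the CloudWatch Logs Insights API through boto3. The log group name and time window are placeholders (copy the actual log group from the Details section of the Sync run history), and the channel ID is taken from the example above.

import time
import boto3

logs = boto3.client("logs")

QUERY = r"""
fields LogLevel, DocumentId, DocumentTitle, CrawlAction,
       ConnectorDocumentStatus.Status as ConnectorDocumentStatus, ErrorMsg,
       CrawlStatus.Status as CrawlStatus, SyncStatus.Status as SyncStatus,
       IndexStatus.Status as IndexStatus, SourceUri, Acl, Metadata,
       HashedDocumentId, @timestamp
| filter @logStream like 'SYNC_RUN_HISTORY_REPORT/' and Metadata like /"stringValue":"C12AB34578"/
| sort @timestamp desc
| limit 10000
"""

# Placeholder log group; use the log group shown in your sync run details.
response = logs.start_query(
    logGroupName="/aws/qbusiness/your-application-id",
    startTime=int(time.time()) - 24 * 3600,  # last 24 hours
    endTime=int(time.time()),
    queryString=QUERY,
)

# Poll until the query finishes, then print each indexed document.
while True:
    results = logs.get_query_results(queryId=response["queryId"])
    if results["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(2)

for row in results.get("results", []):
    print({field["field"]: field["value"] for field in row})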

Cleanup

To delete an Amazon Q Business application, you can use the console or the DeleteApplication API operation.

To delete an Amazon Q Business application using the console

  1. Sign in to the Amazon Q Business console.
  2. Select the Amazon Q Business application that you want to delete, and then choose Actions.
  3. Choose Delete.
  4. In the dialog box that opens, enter Delete to confirm deletion, and then choose Delete.
  5. You are returned to the service console while your application is deleted. When the deletion process is complete, the console displays a message confirming successful deletion.
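If you prefer the DeleteApplication API operation mentioned earlier, a minimal boto3 sketch follows; the application ID is a placeholder.

import boto3

qbusiness = boto3.client("qbusiness")

# Placeholder application ID; list_applications() returns the IDs in your account.
qbusiness.delete_application(applicationId="your-application-id")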

To delete the IAM Identity Center instance, see Delete your IAM Identity Center instance.

Conclusion

This blog post provides a step-by-step guide on setting up the Slack connector for Amazon Q Business, enabling you to seamlessly integrate data from your Slack workspace. Moreover, we highlighted the importance of data privacy and security, demonstrating how the connector adheres to the ACLs within your Slack workspace. This feature helps ensure that private channel conversations remain confidential and inaccessible to individuals who aren’t members of those channels. By following these steps and understanding the built-in security measures, you can use the power of Amazon Q Business while maintaining the integrity and privacy of your Slack workspace.

To learn more about the Amazon Q Business connector for Slack, see Connecting Slack to Amazon Q Business. You can automate all of the showcased console operations through the Amazon Q Business APIs, the AWS CLI, and the applicable AWS SDKs.

If you choose to converse with Amazon Q Business through Slack direct messages (DMs), for example to ask questions and get answers based on company data, get help creating new content such as email drafts, summarize attached files, or perform tasks, see Deploy a Slack gateway for Amazon Q, your business expert, for information about how to bring Amazon Q Business to users in Slack.


About the Authors

Akshara Shah is a Senior Solutions Architect at Amazon Web Services. She provides strategic technical guidance to help customers design and build cloud solutions. She is currently focused on machine learning and AI technologies.

Roshan Thomas is a Senior Solutions Architect at Amazon Web Services. He is based in Melbourne, Australia and works closely with enterprise customers to accelerate their journey in the cloud. He is passionate about technology and helping customers architect and build solutions on AWS.

Read More

AI Summit: US Energy Secretary Highlights AI’s Role in Science, Energy and Security

AI Summit: US Energy Secretary Highlights AI’s Role in Science, Energy and Security

AI can help solve some of the world’s biggest challenges — whether climate change, cancer or national security — U.S. Secretary of Energy Jennifer Granholm emphasized today during her remarks at the AI for Science, Energy and Security session at the NVIDIA AI Summit, in Washington, D.C.

Granholm went on to highlight the pivotal role AI is playing in tackling major national challenges, from energy innovation to bolstering national security.

“We need to use AI for both offense and defense — offense to solve these big problems and defense to make sure the bad guys are not using AI for nefarious purposes,” she said.

Granholm, who calls the Department of Energy “America’s Solutions Department,” highlighted the agency’s focus on solving the world’s biggest problems.

“Yes, climate change, obviously, but a whole slew of other problems, too … quantum computing and all sorts of next-generation technologies,” she said, pointing out that AI is a driving force behind many of these advances.

“AI can really help to solve some of those huge problems — whether climate change, cancer or national security,” she said. “The possibilities of AI for good are awesome, awesome.”

Following Granholm’s 15-minute address, a panel of experts from government, academia and industry took the stage to further discuss how AI accelerates advancements in scientific discovery, national security and energy innovation.

“AI is going to be transformative to our mission space.… We’re going to see these big step changes in capabilities,” said Helena Fu, director of the Office of Critical and Emerging Technologies at the Department of Energy, underscoring AI’s potential in safeguarding critical infrastructure and addressing cyber threats.

During her remarks, Granholm also stressed that AI’s increasing energy demands must be met responsibly.

“We are going to see about a 15% increase in power demand on our electric grid as a result of the data centers that we want to be located in the United States,” she explained.

However, the DOE is taking steps to meet this demand with clean energy.

“This year, in 2024, the United States will have added 30 Hoover Dams’ worth of clean power to our electric grid,” Granholm announced, emphasizing that the clean energy revolution is well underway.

AI’s Impact on Scientific Discovery and National Security

The discussion then shifted to how AI is revolutionizing scientific research and national security.

Tanya Das, director of the Energy Program at the Bipartisan Policy Center, pointed out that “AI can accelerate every stage of the innovation pipeline in the energy sector … starting from scientific discovery at the very beginning … going through to deployment and permitting.”

Das also highlighted the growing interest in Congress to support AI innovations, adding, “Congress is paying attention to this issue, and, I think, very motivated to take action on updating what the national vision is for artificial intelligence.”

Fu reiterated the department’s comprehensive approach, stating, “We cross from open science through national security, and we do this at scale.… Whether they be around energy security, resilience, climate change or the national security challenges that we’re seeing every day emerging.”

She also touched on the DOE’s future goals: “Our scientific systems will need access to AI systems,” Fu said, emphasizing the need to bridge scientific reasoning with the new kinds of models that will need to be developed for AI.

Collaboration Across Sectors: Government, Academia and Industry

Karthik Duraisamy, director of the Michigan Institute for Computational Discovery and Engineering at the University of Michigan, highlighted the power of collaboration in advancing scientific research through AI.

“Think about the scientific endeavor as 5% creativity and innovation and 95% intense labor. AI amplifies that 5% by a bit, and then significantly accelerates the 95% part,” Duraisamy explained. “That is going to completely transform science.”

Duraisamy further elaborated on the role AI could play as a persistent collaborator, envisioning a future where AI can work alongside scientists over weeks, months and years, generating new ideas and following through on complex projects.

“Instead of replacing graduate students, I think graduate students can be smarter than the professors on day one,” he said, emphasizing the potential for AI to support long-term research and innovation.

Learn more about how this week’s AI Summit highlights the ways AI is shaping the future across industries and how NVIDIA’s solutions are laying the groundwork for continued innovation.


Read More

Transitioning off Amazon Lookout for Metrics 

Transitioning off Amazon Lookout for Metrics 

Amazon Lookout for Metrics is a fully managed service that uses machine learning (ML) to detect anomalies in virtually any time-series business or operational metrics—such as revenue performance, purchase transactions, and customer acquisition and retention rates—with no ML experience required. The service, which was launched in March 2021, predates the anomaly detection capabilities now available in several popular AWS offerings, such as Amazon OpenSearch Service, Amazon CloudWatch, AWS Glue Data Quality, Amazon Redshift ML, and Amazon QuickSight.

After careful consideration, we have made the decision to end support for Amazon Lookout for Metrics, effective October 10, 2025. In addition, as of today, new customer sign-ups are no longer available. Existing customers will be able to use the service as usual until October 10, 2025, when we will end support for Amazon Lookout for Metrics.

In this post, we provide an overview of the alternate AWS services that offer anomaly detection capabilities for customers to consider transitioning their workloads to.

AWS services with anomaly detection capabilities

We recommend that customers use Amazon OpenSearch Service, Amazon CloudWatch, Amazon Redshift ML, Amazon QuickSight, or AWS Glue Data Quality for their anomaly detection use cases as an alternative to Amazon Lookout for Metrics. These AWS services offer generally available, ML-powered anomaly detection capabilities that can be used out of the box without requiring any ML expertise. Following is a brief overview of each service.

Using Amazon OpenSearch for anomaly detection

Amazon OpenSearch Service features a highly performant, integrated anomaly detection engine that enables the real-time identification of anomalies in streaming data as well as in historical data. You can pair anomaly detection with built-in alerting in OpenSearch to send notifications when there is an anomaly. To start using OpenSearch for anomaly detection, you first must index your data into OpenSearch; from there, you can enable anomaly detection in OpenSearch Dashboards. To learn more, see the documentation.
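As an illustration, the following Python sketch creates a detector through the OpenSearch anomaly detection plugin’s REST API. The domain endpoint, credentials, index name, field names, and feature definition are all placeholders; adjust them for your own data.

import requests
from requests.auth import HTTPBasicAuth

# Placeholder domain endpoint and credentials.
HOST = "https://your-opensearch-domain:9200"
AUTH = HTTPBasicAuth("admin", "your-password")

detector = {
    "name": "revenue-anomaly-detector",
    "description": "Detects anomalies in hourly revenue",
    "time_field": "timestamp",
    "indices": ["revenue-metrics"],  # placeholder index
    "feature_attributes": [
        {
            "feature_name": "total_revenue",
            "feature_enabled": True,
            "aggregation_query": {"total_revenue": {"sum": {"field": "revenue"}}},
        }
    ],
    "detection_interval": {"period": {"interval": 60, "unit": "Minutes"}},
}

# Create the detector via the anomaly detection plugin API.
resp = requests.post(
    f"{HOST}/_plugins/_anomaly_detection/detectors",
    json=detector,
    auth=AUTH,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())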

Using Amazon CloudWatch for anomaly detection

Amazon CloudWatch supports creating anomaly detectors on specific Amazon CloudWatch log groups and metrics by applying statistical and ML algorithms. Anomaly detection alarms can be created based on a metric’s expected value. These types of alarms don’t have a static threshold for determining alarm state. Instead, they compare the metric’s value to the expected value based on the anomaly detection model. To start using CloudWatch anomaly detection, you first must ingest data into CloudWatch and then enable anomaly detection on the log group or metric.
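For metric-based anomaly detection, the following boto3 sketch trains an anomaly detection model on a metric and creates an alarm on the resulting expected-value band. The namespace, metric, and alarm names are examples only.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Train an anomaly detection model on an example metric (placeholder namespace/metric).
cloudwatch.put_anomaly_detector(
    Namespace="AWS/Lambda",
    MetricName="Invocations",
    Stat="Sum",
)

# Alarm when the metric leaves the expected band (band width of 2 standard deviations).
cloudwatch.put_metric_alarm(
    AlarmName="invocations-anomaly-alarm",
    ComparisonOperator="LessThanLowerOrGreaterThanUpperThreshold",
    EvaluationPeriods=3,
    ThresholdMetricId="ad1",
    Metrics=[
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {"Namespace": "AWS/Lambda", "MetricName": "Invocations"},
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": True,
        },
        {
            "Id": "ad1",
            "Expression": "ANOMALY_DETECTION_BAND(m1, 2)",
            "Label": "Expected range",
            "ReturnData": True,
        },
    ],
)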

Using Amazon Redshift ML for anomaly detection

Amazon Redshift ML makes it easy to create, train, and apply machine learning models using familiar SQL commands in Amazon Redshift data warehouses. Anomaly detection can be done on your analytics data through Redshift ML by using the included XGBoost model type, local models, or remote models with Amazon SageMaker. With Redshift ML, you don’t have to be a machine learning expert, and you pay only for the training cost of the SageMaker models. There are no additional costs for using Redshift ML for anomaly detection. To learn more, see the documentation.
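As a hedged illustration of this SQL-driven workflow, the sketch below uses the Redshift Data API to create an XGBoost model on a labeled example table and then score new rows with the generated function. The cluster, database, schema, table, column, and bucket names are placeholders, and a labeled anomaly column is assumed to exist in the training data.

import boto3

redshift_data = boto3.client("redshift-data")

# Placeholder training statement: an XGBoost classifier over a labeled metrics table.
create_model_sql = """
CREATE MODEL demo.metric_anomaly_model
FROM (SELECT hour_of_day, order_count, revenue, is_anomaly FROM demo.labeled_metrics)
TARGET is_anomaly
FUNCTION predict_metric_anomaly
IAM_ROLE default
AUTO OFF
MODEL_TYPE XGBOOST
OBJECTIVE 'binary:logistic'
PREPROCESSORS 'none'
HYPERPARAMETERS DEFAULT EXCEPT (NUM_ROUND '100')
SETTINGS (S3_BUCKET 'your-redshift-ml-bucket');
"""

redshift_data.execute_statement(
    ClusterIdentifier="your-cluster",  # placeholder cluster, database, and user
    Database="dev",
    DbUser="awsuser",
    Sql=create_model_sql,
)

# After training completes, the generated function scores new rows in plain SQL.
score_sql = """
SELECT hour_of_day, order_count, revenue,
       predict_metric_anomaly(hour_of_day, order_count, revenue) AS is_anomaly
FROM demo.incoming_metrics;
"""
redshift_data.execute_statement(
    ClusterIdentifier="your-cluster", Database="dev", DbUser="awsuser", Sql=score_sql
)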

Using Amazon QuickSight for anomaly detection

Amazon QuickSight is a fast, cloud-powered business intelligence service that delivers insights to everyone in the organization. As a fully managed service, QuickSight lets customers create and publish interactive dashboards that include ML insights. QuickSight supports a highly performant, integrated anomaly detection engine that uses proven Amazon technology to continuously run ML-powered anomaly detection across millions of metrics to discover hidden trends and outliers in customers’ data. This capability surfaces deep insights that are often buried in aggregates and that aren’t practical to find through manual analysis. With ML-powered anomaly detection, customers can find outliers in their data without the need for manual analysis, custom development, or ML domain expertise. To learn more, see the documentation.

Using AWS Glue Data Quality for anomaly detection

Data engineers and analysts can use AWS Glue Data Quality to measure and monitor their data. AWS Glue Data Quality uses a rule-based approach that works well for known data patterns and offers ML-based recommendations to help you get started. You can review the recommendations and augment them with rules chosen from more than 25 included data quality rule types. To capture unanticipated, less obvious data patterns, you can enable anomaly detection. To use this feature, you can write rules or analyzers and then turn on anomaly detection in AWS Glue ETL. AWS Glue Data Quality collects statistics for columns specified in rules and analyzers, applies ML algorithms to detect anomalies, and generates visual observations explaining the detected issues. Customers can use recommended rules to capture the anomalous patterns and provide feedback to tune the ML model for more accurate detection. To learn more, see the blog post, watch the introductory video, or see the documentation.

Using Amazon SageMaker Canvas for anomaly detection (a beta feature)

The Amazon SageMaker Canvas team plans to provide support for anomaly detection use cases in Amazon SageMaker Canvas. We’ve created an AWS CloudFormation template-based solution to give customers early access to the underlying anomaly detection feature. Customers can use the CloudFormation template to bring up an application stack that receives time-series data from an Amazon Managed Streaming for Apache Kafka (Amazon MSK) streaming source and performs near-real-time anomaly detection in the streaming data. To learn more about the beta offering, see Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink.

Frequently asked questions

  1. What is the cutoff point for current customers?

We created an allow list of account IDs that have used Amazon Lookout for Metrics in the last 30 days and have active Amazon Lookout for Metrics resources, including detectors, within the service. If you are an existing customer and are having difficulties using the service, please reach out to us via AWS Customer Support for help.

  2. How will access change before the sunset date?

Current customers can continue to do everything they could previously. The only change is that customers who haven’t used the service before cannot create any new resources in Amazon Lookout for Metrics.

  3. What happens to my Amazon Lookout for Metrics resources after the sunset date?

After October 10, 2025, all references to Amazon Lookout for Metrics models and resources will be deleted from Amazon Lookout for Metrics. You will not be able to discover or access Amazon Lookout for Metrics from your AWS Management Console, and applications that call the Amazon Lookout for Metrics API will no longer work.

  4. Will I be billed for Amazon Lookout for Metrics resources remaining in my account after October 10, 2025?

Resources created by Amazon Lookout for Metrics internally will be deleted after October 10, 2025. Customers will be responsible for deleting the input data sources created by them, such as Amazon Simple Storage Service (Amazon S3) buckets, Amazon Redshift clusters, and so on.

  5. How do I delete my Amazon Lookout for Metrics resources?

You can delete detectors and their associated alerts from the Amazon Lookout for Metrics console or by using the DeleteAnomalyDetector and DeleteAlert API operations.

  6. How can I export anomalies data before deleting the resources?

Anomalies data for each measure can be downloaded for a particular detector by using the Amazon Lookout for Metrics APIs. Exporting Anomalies explains how to connect to a detector, query for anomalies, and download them in a format for later use.
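The following boto3 sketch shows one way to pull anomaly group summaries from a detector before the end-of-support date; the detector ARN is a placeholder and the sensitivity threshold is an example value.

import json
import boto3

lookout = boto3.client("lookoutmetrics")

# Placeholder detector ARN; ListAnomalyDetectors returns the ARNs in your account.
DETECTOR_ARN = "arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:example"

anomaly_groups = []
next_token = None
while True:
    kwargs = {"AnomalyDetectorArn": DETECTOR_ARN, "SensitivityThreshold": 50, "MaxResults": 100}
    if next_token:
        kwargs["NextToken"] = next_token
    page = lookout.list_anomaly_group_summaries(**kwargs)
    anomaly_groups.extend(page.get("AnomalyGroupSummaryList", []))
    next_token = page.get("NextToken")
    if not next_token:
        break

# Save the summaries locally so they remain available after the service is retired.
with open("anomaly_groups.json", "w") as f:
    json.dump(anomaly_groups, f, default=str, indent=2)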

Conclusion

In this blog post, we outlined methods to create anomaly detectors using alternatives such as Amazon OpenSearch Service, Amazon CloudWatch, and a CloudFormation template-based solution.


About the Author

Nirmal Kumar is Sr. Product Manager for the Amazon SageMaker service. Committed to broadening access to AI/ML, he steers the development of no-code and low-code ML solutions. Outside work, he enjoys travelling and reading non-fiction.

Read More