Lab Confidential: Japan Research Keeps Healthcare Data Secure

Established 77 years ago, Mitsui & Co stays vibrant by building businesses and ecosystems with new technologies like generative AI and confidential computing.

Digital transformation takes many forms at the Tokyo-based conglomerate with 16 divisions. In one case, it’s an autonomous trucking service, in another it’s a geospatial analysis platform. Mitsui even collaborates with a partner at the leading edge of quantum computing.

One new subsidiary, Xeureka, aims to accelerate R&D in healthcare, where it can take more than a billion dollars spent over a decade to bring to market a new drug.

“We create businesses using new digital technology like AI and confidential computing,” said Katsuya Ito, a project manager in Mitsui’s digital transformation group. “Most of our work is done in collaboration with tech companies — in this case NVIDIA and Fortanix,” a San Francisco-based security software company.

In Pursuit of Big Data

Though only three years old, Xeureka has already completed a proof of concept addressing one of drug discovery’s biggest problems — getting enough data.

Speeding drug discovery requires powerful AI models built with datasets larger than most pharmaceutical companies have on hand. Until recently, sharing across companies has been unthinkable because data often contains private patient information as well as chemical formulas proprietary to the drug company.

Enter confidential computing, a way of processing data in a protected part of a GPU or CPU that acts like a black box for an organization’s most important secrets.

To ensure their data is kept confidential at all times, banks, government agencies and even advertisers are adopting the technology, which is backed by a consortium of some of the world’s largest companies.

A Proof of Concept for Privacy

To validate that confidential computing would allow its customers to safely share data, Xeureka created two imaginary companies, each with a thousand drug candidates. Each company’s dataset was used separately to train an AI model to predict the chemicals’ toxicity levels. Then the data was combined to train a similar, but larger AI model.
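The shape of that experiment can be sketched in a few lines: train the same kind of model on one company’s candidates alone, then on the pooled data, and compare held-out accuracy. Everything below is an illustrative toy — synthetic descriptors and a simple nearest-centroid classifier — not Xeureka’s actual models or data.

```python
# Toy sketch of the proof-of-concept design: a toxicity classifier trained
# on one company's data alone vs. on the pooled datasets. Synthetic data
# and a nearest-centroid model only -- not Xeureka's actual pipeline.
import random

random.seed(0)

def make_company_data(n=1000, dim=8):
    """Synthetic molecular descriptors with a binary 'toxic' label."""
    data = []
    for _ in range(n):
        x = [random.gauss(0, 1) for _ in range(dim)]
        label = 1 if sum(x[:3]) + random.gauss(0, 1.5) > 0 else 0
        data.append((x, label))
    return data

def centroid(rows, dim):
    if not rows:
        return [0.0] * dim
    return [sum(x[i] for x in rows) / len(rows) for i in range(dim)]

def train(data, dim=8):
    """Toy nearest-centroid 'model': one centroid per class."""
    toxic = [x for x, y in data if y == 1]
    safe = [x for x, y in data if y == 0]
    return centroid(toxic, dim), centroid(safe, dim)

def accuracy(model, data):
    c_toxic, c_safe = model
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    hits = sum(int(dist2(x, c_toxic) < dist2(x, c_safe)) == y for x, y in data)
    return hits / len(data)

company_a = make_company_data()   # "company A": 1,000 drug candidates
company_b = make_company_data()   # "company B": 1,000 drug candidates
held_out = make_company_data(n=500)

acc_single = accuracy(train(company_a), held_out)              # one company's data
acc_pooled = accuracy(train(company_a + company_b), held_out)  # combined data
print(f"single-company accuracy: {acc_single:.2f}, pooled: {acc_pooled:.2f}")
```

In the confidential-computing setting, the pooled training step would run inside the GPU’s trusted execution environment, so neither company ever sees the other’s raw records.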

Xeureka ran its test on NVIDIA H100 Tensor Core GPUs using security management software from Fortanix, one of the first startups to support confidential computing.

The H100 GPUs support a trusted execution environment with hardware-based engines that ensure and validate that confidential workloads are protected while in use on the GPU, without compromising performance. The Fortanix software manages data sharing, encryption keys and the overall workflow.

Up to 74% Higher Accuracy

The results were impressive. The larger model’s predictions were 65–74% more accurate, thanks to the use of the combined datasets.

The models created by a single company’s data showed instability and bias issues that were not present with the larger model, Ito said.

“Confidential computing from NVIDIA and Fortanix essentially alleviates the privacy and security concerns while also improving model accuracy, which will prove to be a win-win situation for the entire industry,” said Xeureka’s CTO, Hiroki Makiguchi, in a Fortanix press release.

An AI Supercomputing Ecosystem

Now, Xeureka is exploring broad applications of this technology in drug discovery research, in collaboration with the community behind Tokyo-1, its GPU-accelerated AI supercomputer. Announced in February, Tokyo-1 aims to enhance the efficiency of pharmaceutical companies in Japan and beyond.

Initial projects may include collaborations to predict protein structures, screen ligand-base pairs and accelerate molecular dynamics simulations with trusted services. Tokyo-1 users can harness large language models for chemistry, protein, DNA and RNA data formats through the NVIDIA BioNeMo drug discovery microservices and framework.

It’s part of Mitsui’s broader strategic growth plan to develop software and services for healthcare, such as powering Japan’s $100 billion pharma industry, the world’s third largest following the U.S. and China.

Xeureka’s services will include using AI to quickly screen billions of drug candidates, predict how molecules will bind with proteins and simulate detailed chemical behaviors.

To learn more, read about NVIDIA Confidential Computing and NVIDIA BioNeMo, an AI platform for drug discovery.

Read More

NVIDIA and Global Consulting Leaders Speed AI Adoption Across Japan’s Industries

Consulting giants including Accenture, Deloitte, EY Strategy and Consulting Co., Ltd. (or EY Japan), FPT, Kyndryl and Tata Consultancy Services Japan (TCS Japan) are working with NVIDIA to establish innovation centers in Japan to accelerate the nation’s goal of embracing enterprise AI and physical AI across its industrial landscape.

The centers will use NVIDIA AI Enterprise software, local language models and NVIDIA NIM microservices to help clients in Japan advance the development and deployment of AI agents tailored to their industries’ respective needs, boosting productivity with a digital workforce.

Using the NVIDIA Omniverse platform, Japanese firms can develop digital twins and simulate complex physical AI systems, driving innovation in manufacturing, robotics and other sectors.

Like many nations, Japan is navigating complex social and demographic challenges that are leading to a smaller workforce as older generations retire. Leaning into its manufacturing and robotics leadership, the country is seeking opportunities to solve these challenges using AI.

The Japanese government in April published a paper on its aims to become “the world’s most AI-friendly country.” AI adoption is strong and growing, as IDC reports that the Japanese AI systems market reached approximately $5.9 billion this year, with a year-on-year growth rate of 31.2%.¹

The consulting giants’ initiatives and activities include:

  • Accenture has established the Accenture NVIDIA Business Group and will provide solutions and services incorporating a Japanese large language model (LLM), which uses NVIDIA NIM and NVIDIA NeMo, as a Japan-specific offering. In addition, Accenture will deploy agentic AI solutions based on Accenture AI Refinery to all industries in Japan, accelerating total enterprise reinvention for its clients. In the future, Accenture plans to build new services using NVIDIA AI Enterprise and Omniverse at Accenture Innovation Hub Tokyo.
  • Deloitte is establishing its AI Experience Center in Tokyo, which will serve as an executive briefing center to showcase generative AI solutions built on NVIDIA technology. This facility builds on the Deloitte Japan NVIDIA Practice announced in June and will allow clients to experience firsthand how AI can revolutionize their operations. The center will also offer NVIDIA AI and Omniverse Blueprints to help enterprises in Japan adopt agentic AI effectively.
  • EY Strategy and Consulting Co., Ltd. (EY Japan) is developing a multitude of digital transformation (DX) solutions in Japan across diverse industries including finance, retail, media and manufacturing. The new EY Japan DX offerings will be built with NVIDIA AI Enterprise to serve the country’s growing demand for digital twins, 3D applications, multimodal AI and generative AI.
  • FPT is launching FPT AI Factory in Japan with NVIDIA Hopper GPUs and NVIDIA AI Enterprise software to support the country’s AI transformation by using business data in a secure, sovereign environment. FPT is integrating the NVIDIA NeMo framework with FPT AI Studio for building, pretraining and fine-tuning generative AI models, including FPT’s multi-language LLM, named Saola. In addition, to provide end-to-end AI integration services, FPT plans to train over 1,000 software engineers and consultants domestically in Japan, and over 7,000 globally by 2026.
  • IT infrastructure services provider Kyndryl has launched a dedicated AI private cloud in Japan. Built in collaboration with Dell Technologies using the Dell AI Factory with NVIDIA, this new AI private cloud will provide a controlled, secure and sovereign location for customers to develop, test and plan implementation of AI on the end-to-end NVIDIA AI platform, including NVIDIA accelerated computing and networking, as well as the NVIDIA AI Enterprise software.
  • TCS Japan will begin offering its TCS global AI offerings built on the full NVIDIA AI stack in the automotive and manufacturing industries. These solutions will be hosted in its showcase centers at TCS Japan’s Azabudai office in Tokyo.

Located in the Tokyo and Kansai metropolitan areas, these new consulting centers offer hands-on experience with NVIDIA’s latest technologies and expert guidance — helping accelerate AI transformation, solve complex social challenges and support the nation’s economic growth.

To learn more, watch the NVIDIA AI Summit Japan fireside chat with NVIDIA founder and CEO Jensen Huang.

Editor’s note: IDC figures are sourced to IDC, 2024 Domestic AI System Market Forecast Announced, April 2024. The IDC forecast amount was converted to USD by NVIDIA, while the CAGR (31.2%) was calculated based on JPY.

Read More

Japan’s Startups Drive AI Innovation With NVIDIA Accelerated Computing

Lifelike digital humans engage with audiences in real time. Autonomous systems streamline complex logistics. And AI-driven language tools break down communication barriers on the fly.

This isn’t sci-fi. This is Tokyo’s startup scene.

Supercharged by AI — and world-class academic and industrial might — the region has become a global innovation hub. And the NVIDIA Inception program is right in the middle of it.

With over 370 AI-driven startups in the program and a 250,000-strong NVIDIA developer community, Japan’s AI startup ecosystem is as bold as it is fast-moving.

This week’s NVIDIA AI Summit Japan puts these achievements in the spotlight, capturing the region’s relentless innovation momentum.

NVIDIA founder and CEO Jensen Huang and SoftBank Group Chairman and CEO Masayoshi Son opened the summit with a fireside chat to discuss AI’s transformative role, with Jensen diving into Japan’s growing AI ecosystem and its push toward sovereign AI.

Sessions followed with leaders from METI (Japan’s Ministry of Economy, Trade and Industry), the University of Tokyo and other key players. Their success is no accident.

Tokyo’s academic powerhouses, global technology and industrial giants, and technology-savvy population of 14 million provide the underpinnings of a global AI hub that stretches from the bustling startup scene in Shibuya to new hotbeds of tech development in Chiyoda and beyond.

Supercharging Japan’s Creative Class 

Iconic works from anime to manga have not only redefined entertainment in Japan — they’ve etched themselves into global culture, inspiring fans across continents, languages and generations.

Now, Japan’s vibrant visual pop culture is spilling into AI, finding fresh ways to surprise and connect with audiences.

Take startup AiHUB’s digital celebrity Sali.

Sali isn’t just a character in the traditional sense. She’s a digital being with presence — responsive and lifelike. She blinks, she smiles, she reacts.

Here, AI is doing something quietly revolutionary, slipping under the radar to redefine how people interact with media.

At AI Summit Japan, AiHUB revealed that it will adopt the NVIDIA Avatar Cloud Engine, or ACE, in the lip-sync module of its digital human framework, giving Sali nuanced expressions and human-like emotional depth.

ACE doesn’t just make Sali relatable — it puts her in a league of characters who transcend screens and pages.

This integration reduced development and future management costs by approximately 50% while improving the expressiveness of the avatars, according to AiHUB.

SDK Adoption: From Hesitation to High Velocity

In the global tech race, success doesn’t always hinge on the heroes you’d expect.

The unsung stars here are software development kits — those bundles of tools, libraries and documentation that cut the guesswork out of innovation. And in Japan’s fast-evolving AI ecosystem, these once-overlooked SDKs are driving an improbable revolution.

For years, Japan’s tech companies treated SDKs with caution. Now, however, with AI advancing at lightspeed and NVIDIA GPUs powering the engine, SDKs have moved from a quiet corner to center stage.

Take NVIDIA NeMo, a platform for building large language models, or LLMs. It’s swiftly becoming the backbone of Japan’s latest wave of real-time, AI-driven communication technologies.

One company at the forefront is Kotoba Technologies, which has cracked the code on real-time speech recognition thanks to NeMo’s powerful tools.

Under a key Japanese government grant, Kotoba’s language tools don’t just capture sound — they translate it live. It’s a blend of computational heft and human ingenuity, redefining how multilingual communication happens in non-English-speaking countries like Japan.

Kotoba’s tools are used in customer call centers and for automatic meeting minutes creation across various industries. It was also used to perform live transcription during the AI Summit Japan fireside chat between Huang and Son.

And if LLMs are the engines driving Japan’s AI, then companies like APTO supply the fuel. Using NVIDIA NeMo Curator, APTO is changing the game in data annotation, handling the intensive prep work that makes LLMs effective.

By refining data quality for big clients like RIKEN, Ricoh and ORIX, APTO has mastered the fine art of sifting valuable signals from noise. Through tools like WordCountFilter — an ingenious mechanism that prunes short or unnatural sentences — it’s supercharging performance.
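The idea behind a word-count filter is simple enough to sketch in plain Python: drop documents too short (or absurdly long) to be useful training data. This is a toy illustration of the concept only, not NeMo Curator’s actual API, and the thresholds are made up.

```python
# Toy illustration of word-count filtering for LLM training data:
# keep only documents whose length falls inside a plausible range.
# A conceptual sketch, not NeMo Curator's WordCountFilter implementation.
def word_count_filter(docs, min_words=5, max_words=10000):
    """Keep documents whose word count is within [min_words, max_words]."""
    kept = []
    for doc in docs:
        n = len(doc.split())
        if min_words <= n <= max_words:
            kept.append(doc)
    return kept

corpus = [
    "ok",  # too short to carry signal: dropped
    "High-quality training data improves both model accuracy and training time.",
]
clean = word_count_filter(corpus)
```

Real pipelines layer many such filters (language, deduplication, quality scores) before a single token reaches the model.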

APTO’s data quality control boosted model accuracy scores and slashed training time.

Across Japan, developers are looking to move on AI fast, and they’re embracing SDKs to go further, faster.

The Power of Cross-Sector Synergy

The gears of Japan’s AI ecosystem increasingly turn in sync thanks to NVIDIA-powered infrastructure that enables startups to build on each other’s breakthroughs.

As Japan’s population ages, solutions like these address security needs as well as an intensifying labor shortage. Here, ugo and Asilla have taken on the challenge, using autonomous security systems to manage facilities across the country.

Asilla’s cutting-edge anomaly detection was developed with security in mind but is now finding applications in healthcare and retail. Built on the NVIDIA DeepStream and Triton Inference Server SDKs, Asilla’s tech doesn’t just identify risks — it responds to them.

In high-stakes environments, ugo and Asilla’s systems, powered by the NVIDIA Jetson platform, are already in action, identifying potential security threats and triggering real-time responses.

NVIDIA’s infrastructure is also at the heart of Kotoba Technologies’ language tools, as well as AiHUB’s lifelike digital avatars. Running on an AI backbone, these various tools seamlessly bridge media, communication and human interaction.

The Story Behind the Story: UTokyo IPC and Osaka Innovation Hub

All of these startups are part of a larger ecosystem that’s accelerating Japan’s rise as an AI powerhouse.

Leading the charge is UTokyo IPC, the wholly owned venture capital arm of the University of Tokyo, operating through its flagship accelerator program, 1stRound.

Cohosted by 18 universities and four national research institutions, this program serves as the nexus where academia and industry converge, providing hands-on guidance, resources and strategic support.

By championing the real-world deployment of seed-stage deep-tech innovations, UTokyo IPC is igniting Japan’s academic innovation landscape and setting the standard for others to follow.

Meanwhile, Osaka’s own Innovation Hub, OIH, expands this momentum beyond Tokyo, providing startups with coworking spaces and networking events. Its Startup Acceleration Program brings early-stage projects to market faster.

Fast-moving hubs like these are core to Japan’s AI ecosystem, giving startups the mentorship, funding and resources they need to go from prototype to fully commercialized product.

And through NVIDIA’s accelerated computing technologies and the Inception program, Japan’s fast-moving startups are united with AI innovators across the globe.

Image credit: ugo.

Read More

Japan Tech Leaders Supercharge Sovereign AI With NVIDIA AI Enterprise and Omniverse

From call centers to factories to hospitals, AI is sweeping Japan.

Undergirding it all: the exceptional resources of the island nation’s world-class universities and global technology leaders such as Fujitsu, The Institute of Science Tokyo, NEC and NTT.

NVIDIA software — NVIDIA AI Enterprise for building and deploying AI agents and NVIDIA Omniverse for bringing AI into the physical world — is playing a crucial role in supporting Japan’s transformation into a global hub for AI development.

The bigger picture: Japan’s journey to AI sovereignty is well underway to support the nation in building, developing and sharing AI innovations at home and across the world.

Japanese AI Pioneers to Power Homegrown Innovation

Putting Japan in a position to become a global AI leader begins with AI-driven language models. Japanese tech leaders are developing advanced AI models that can better interpret Japanese cultural and linguistic nuances.

These models enable developers to build AI applications for industries requiring high-precision outcomes, such as healthcare, finance and manufacturing.

As Japan’s tech giants support AI adoption across the country, they’re using NVIDIA AI Enterprise software.

Fujitsu’s Takane model is specifically built for high-stakes sectors like finance and security.

The model is designed to prioritize security and accuracy with Japanese data, which is crucial for sensitive fields. It excels in both domestic and international Japanese LLM benchmarks for natural Japanese expression and accuracy.

The companies plan to use NVIDIA NeMo for additional fine-tuning, and Fujitsu has tapped NVIDIA to support making Takane available as an NVIDIA NIM to broaden accessibility for the developer community.

NEC’s cotomi model uses NeMo’s parallel processing techniques for efficient model training. It’s already integrated with NEC’s solutions in finance, manufacturing, healthcare and local governments.

NTT Group is moving forward with NTT Communications’ launch of NTT’s large language model “tsuzumi,” which is accelerated with NVIDIA TensorRT-LLM for AI agent customer experiences and use cases such as document summarization.

Meanwhile, startups such as Kotoba Technologies, a Tokyo-based software developer, will unveil its Kotoba-Whisper model, built using NVIDIA NeMo for AI model building.

The transcription application built on the Kotoba-Whisper model performed live transcription during this week’s conversation between SoftBank Chairman and CEO Masayoshi Son and NVIDIA founder and CEO Jensen Huang at NVIDIA AI Summit Japan.

Kotoba Technologies reports that using NeMo’s automatic speech recognition for data preprocessing delivers superior transcription performance.

Kotoba-Whisper is already used in healthcare to create medical records from patient conversations, in customer call centers and for automatic meeting minutes creation across various industries.

These models are used by developers and researchers, especially those focusing on Japanese language AI applications.

Academic Contributions to Japan’s Sovereign AI Vision

Japanese universities, meanwhile, are powering the ongoing transformation with a wave of AI innovations.

Nagoya University’s Ruri-Large, built using NVIDIA’s Nemotron-4 340B — which is also available as a NIM microservice — is a Japanese embedding model. It achieves high document retrieval performance with high-quality synthetic data generated by Nemotron-4 340B, and it enables the enhancement of language model capabilities through retrieval-augmented generation using external, authoritative knowledge bases.
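The retrieval-augmented generation pattern the embedding model enables can be sketched minimally: embed the documents, retrieve the one nearest the query, and prepend it to the prompt. The toy bag-of-words “embedding” below merely stands in for a real model like Ruri-Large; the documents and query are illustrative.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve the
# most similar document by embedding similarity and add it as context.
# The bag-of-words "embedding" is a stand-in for a real embedding model.
import math
from collections import Counter

def embed(text):
    """Toy embedding; a real system would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Tokyo-1 is a GPU-accelerated AI supercomputer for pharma research.",
    "Soba noodles are made from buckwheat flour.",
]
query = "Which supercomputer supports pharma research?"

# Retrieve the closest document and build an augmented prompt.
best = max(documents, key=lambda d: cosine(embed(query), embed(d)))
augmented_prompt = f"Context: {best}\nQuestion: {query}"
```

The retrieved passage grounds the language model’s answer in an external, authoritative source rather than in its parameters alone.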

The National Institute of Informatics will introduce LLM.jp-3-13B-Instruct, a sovereign AI model developed from scratch. Supported by several Japanese government-backed programs, this model underscores the nation’s commitment to self-sufficiency in AI. It’s expected to be available as a NIM microservice soon.

The Institute of Science Tokyo and Japan’s National Institute of Advanced Industrial Science and Technology, better known as AIST, will present the Llama 3.1 Swallow model. Optimized for Japanese tasks, it’s now a NIM microservice that can integrate into generative AI workflows for uses ranging from cultural research to business applications.

The University of Tokyo’s Human Genome Center uses NVIDIA AI Enterprise and NVIDIA Parabricks software for rapid genomic analysis, advancing life sciences and precision medicine.

Japan’s Tech Providers Helping Organizations Adopt AI

In addition, technology providers are working to bring NVIDIA AI technologies of all kinds to organizations across Japan.

Accenture will deploy AI agent solutions based on the Accenture AI Refinery across all industries in Japan, customizing with NVIDIA NeMo and deploying with NVIDIA NIM for a Japanese-specific solution.

Dell Technologies is deploying the Dell AI Factory with NVIDIA globally — with a key focus on the Japanese market — and will support NVIDIA NIM microservices for Japanese enterprises across various industries.

Deloitte will integrate NIM microservices that support the leading Japanese language models including LLM.jp, Kotoba, Ruri-large, Swallow and more, into its multi-agent solution.

HPE has launched the HPE Private Cloud AI platform, supporting NVIDIA AI Enterprise in a private environment. This solution can be tailored for organizations looking to tap into Japan’s sovereign AI NIM microservices, meeting the needs of companies that prioritize data sovereignty while using advanced AI capabilities.

Bringing Physical AI to Industries With NVIDIA Omniverse

The proliferation of language models across academia, startups and enterprises, however, is just the start of Japan’s AI revolution.

A leading maker of industrial robots, a top automaker and a retail giant are all embracing NVIDIA Omniverse and AI, as physics-based simulation drives the next wave of automation.

Industrial automation provider Yaskawa, which has shipped 600,000 robots, is developing adaptive robots for increased autonomy. Yaskawa is now adopting NVIDIA Isaac libraries and AI models to create adaptive robot applications for factory automation and other industries such as food, logistics, medical, agriculture and more.

It’s using NVIDIA Isaac Manipulator, a reference workflow of NVIDIA-accelerated libraries and AI models, to help its developers build AI-enabled manipulators, or robot arms.

It’s also using NVIDIA FoundationPose for precise 6D pose estimation and tracking.

More broadly, NVIDIA and Yaskawa teams use AI-powered simulations and digital twin technology — powered by Omniverse — to accelerate the development and deployment of Yaskawa’s robotic solutions, saving time and resources.

Meanwhile, Toyota is looking into how to build robotic factory lines in Omniverse to improve tasks in robot motion in metal-forging processes.

And another iconic Japanese company, Seven & i Holdings, is using Omniverse to gather insights from video cameras in research to optimize retail and enhance safety.

To learn more, check out our blog on these use cases.

See notice regarding software product information.

Read More

Generative AI for agriculture: How Agmatix is improving agriculture with Amazon Bedrock

This post is co-written with Etzik Bega from Agmatix. Agmatix is an Agtech company pioneering data-driven solutions for the agriculture industry. It harnesses advanced AI technologies, including generative AI, to expedite R&D processes, enhance crop yields, and advance sustainable agriculture. Focused on addressing the challenge of agricultural data standardization, Agmatix has developed proprietary patented technology to harmonize and standardize data, facilitating informed decision-making in agriculture. Its suite of data-driven tools enables the management of agronomic field trials, the creation of digital crop nutrient prescriptions, and the promotion of sustainable agricultural practices. Widely embraced by agronomists, scientists, and R&D teams in crop input manufacturing and contract-based research organizations, Agmatix’s field trial and analysis solutions are at the forefront of agricultural innovation.

This post describes how Agmatix uses Amazon Bedrock and AWS fully featured services to enhance the research process and development of higher-yielding seeds and sustainable molecules for global agriculture.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. With Amazon Bedrock, you can experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that run tasks using your enterprise systems and data sources.

Through this innovative approach, Agmatix streamlines operations, accelerates the introduction of higher-yielding seeds, and fosters the development of new and sustainable molecules used in crop protection, including pesticides, herbicides, fungicides, and biologicals.

Innovation in field trial R&D is complex

Innovation continues to be a major driver for increasing yields and the security of our global food supply. Discoveries and improvements across seed genetics, site-specific fertilizers, and molecule development for crop protection products have coincided with innovations in generative AI, Internet of Things (IoT) and integrated research and development trial data, and high-performance computing analytical services.

Holistically, these systems have enabled dramatic reductions in time to market for new genetics and molecules, enabling growers with new and more effective products. Historical and current R&D on crop varieties and agricultural chemicals is essential to improving agricultural yields, but the process of bringing a new crop input to farms is expensive and complex. A key stage in this process is field trials. After new inputs are developed in labs, field trials are conducted to test the effectiveness of new crop varieties and agricultural chemicals in real-world conditions.

There are various technologies that help operationalize and optimize the process of field trials, including data management and analytics, IoT, remote sensing, robotics, machine learning (ML), and now generative AI.

Led by agricultural technology innovators, generative AI is the latest AI technology that helps agronomists and researchers have open-ended human-like interactions with computing applications to assist with a variety of tasks and automate historically manual processes. Applications of generative AI in agriculture include yield prediction, improving precision agriculture recommendations, educating and training agronomy staff, and enabling users to query vast datasets using natural language.

Current challenges in analyzing field trial data

Agronomic field trials are complex and create vast amounts of data. Most companies are unable to use their field trial data because of manual processes and disparate systems. Agmatix’s trial management and agronomic data analysis infrastructure can collect, manage, and analyze agricultural field trial data. Agronomists use this service to accelerate innovation and turn research and experimentation data into meaningful, actionable intelligence.

Agronomists upload or enter field trial data, create and manage tasks for monitoring field trials, and analyze and visualize trial data to generate insights. The time-consuming, undifferentiated task of cleaning, standardizing, harmonizing, and processing the data is automated and handled by Agmatix’s intelligent service.

Without the use of generative AI, the ability to build an analytical dashboard to analyze trial data and gain meaningful insights from field trials is complex and time-consuming. The following are two common challenges:

  • Each trial may contain hundreds of different parameters, and it’s challenging for an agronomist to understand which parameters and data points are meaningful to the specific problems they want to investigate.
  • There is a wide range of analytical visualization tools and charts (such as ANOVA One-Way, Regression, Boxplots, and Maps) available to choose from. However, selecting the most appropriate visualization technique that facilitates understanding of patterns and identification of anomalies within the data can be a challenging task.

Moreover, after the analytical dashboard is created, it can be complex to draw conclusions and establish connections between the different data points. For example, do the results of the trial support the hypothesis of the trial? Is there a connection between the fertilizer applied and the weight of the grain produced? Which external factors have the biggest impact on the efficacy of the product trial?

AWS generative AI services provide a solution

In addition to other AWS services, Agmatix uses Amazon Bedrock to solve these challenges. Amazon Bedrock is a fully managed, serverless generative AI offering from AWS that provides a range of high-performance FMs to support generative AI use cases.

Through the integration of Agmatix’s landscape with Amazon Bedrock, Agmatix has developed a specialized generative AI assistant called Leafy, which gives agronomists and R&D staff a significantly improved user experience.

Instead of spending hours evaluating data points for investigation, selecting the right visualization tools, and creating multiple dashboards for analyzing R&D and trial information, agronomists can write their questions in natural language and get Leafy to provide the relevant dashboards and insights immediately (see the following screenshot for an example of Leafy in action). This helps improve productivity and user experience.

The first step in developing and deploying generative AI use cases is having a well-defined data strategy. Agmatix’s technology architecture is built on AWS. Their data pipeline (as shown in the following architecture diagram) consists of ingestion, storage, ETL (extract, transform, and load), and a data governance layer. Multi-source data is initially received and stored in an Amazon Simple Storage Service (Amazon S3) data lake. AWS Glue accesses data from Amazon S3 to perform data quality checks and important transformations. AWS Lambda is then used to further enrich the data. The transformed data acts as the input to AI/ML services. The generated insights are accessed by users through Agmatix’s interface.
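The enrichment step in a pipeline like this can be sketched as a small Lambda handler: records arrive after the Glue transformations, and the function derives extra fields before the data feeds the AI/ML services. The field names and rules below are hypothetical, not Agmatix’s actual schema.

```python
# Hypothetical AWS Lambda handler sketching the enrichment step: add
# derived fields to field-trial records. Field names are illustrative,
# not Agmatix's actual schema.
import json

def lambda_handler(event, context):
    """Enrich field-trial records received from upstream transformations."""
    enriched = []
    for record in event.get("records", []):
        row = dict(record)
        # Derive a yield-per-area metric when both inputs are present.
        if row.get("yield_kg") and row.get("plot_area_ha"):
            row["yield_kg_per_ha"] = row["yield_kg"] / row["plot_area_ha"]
        # Flag records missing required metadata for downstream quality checks.
        row["complete"] = all(row.get(k) for k in ("trial_id", "crop", "yield_kg"))
        enriched.append(row)
    return {"statusCode": 200, "body": json.dumps(enriched)}

# Local usage with a sample event:
sample_event = {"records": [{"trial_id": "T1", "crop": "wheat",
                             "yield_kg": 4200, "plot_area_ha": 1.5}]}
result = lambda_handler(sample_event, None)
```

In production, the event would come from the pipeline (for example, an S3 or Glue trigger) rather than a hand-built dictionary.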

Architecture diagram

Focusing on generative AI, let’s first understand the fundamentals of the generative AI chatbot application:

  • Prompt – The input question or task including contextual information provided by the user
  • Data – The data required to answer the question in the prompt
  • Agent – The agent that performs the orchestration of tasks

In the case of Agmatix, when the agronomist asks Leafy a question, Agmatix’s Insights solution sends a request to Anthropic Claude on Amazon Bedrock through an API:

  • Prompt – The prompt sent to Anthropic Claude consists of tasks and data. The task is the question submitted by the user.
  • Data – The data in the prompt includes two types of data:
    • Context data: instructions to the model; for example, a list of the types of widgets available for visualization.
    • The data from the specific field trial.

The following diagram illustrates the generative AI workflow.

Generative AI workflow

The workflow consists of the following steps:

  1. The user submits the question to Agmatix’s AI assistant, Leafy.
  2. The application reads the field trial data, business rules, and other required data from the data lake.
  3. The agent inside the Insights application collects questions and tasks and the relevant data, and sends it as a prompt to the FM through Amazon Bedrock.
  4. The generative AI model’s response is sent back to the Insights application.
  5. The response is displayed to the user through the widgets visualizing the trial data and the answer to the user’s specific question, as shown in the following screenshot.

The response in Agmatix's Leafy AI

The data used in the prompt engineering (trial result and rules) is stored in plain text and sent to the model as is. Prompt engineering plays a central part in this generative AI solution. For more information, refer to the Anthropic Claude prompt engineering guide.
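The assemble-and-invoke step (collecting the task, context rules, and trial data into one prompt and sending it to Anthropic Claude through Amazon Bedrock) could be sketched as follows. The function names, prompt layout, and model ID are illustrative assumptions rather than Agmatix's implementation:

```python
import json

def build_prompt(question: str, context_rules: str, trial_data: str) -> str:
    """Step 3: combine the user's task with context instructions and the
    relevant field trial data into a single prompt (layout is illustrative)."""
    return (
        f"{context_rules}\n\n"
        f"Field trial data:\n{trial_data}\n\n"
        f"Question: {question}"
    )

def ask_claude(prompt: str, model_id: str = "anthropic.claude-3-sonnet-20240229-v1:0") -> str:
    """Send the assembled prompt to Claude via the Amazon Bedrock runtime API."""
    import boto3  # imported here so prompt assembly runs without AWS credentials
    client = boto3.client("bedrock-runtime")
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    })
    response = client.invoke_model(modelId=model_id, body=body)
    return json.loads(response["body"].read())["content"][0]["text"]
```

Separating `build_prompt` from the Bedrock call keeps the prompt engineering testable on its own.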

Overall, by using Amazon Bedrock on AWS, Agmatix’s data-driven field trials service observed over 20% improvement in efficiency, more than 25% improvement in data integrity, and a three-fold increase in potential analysis throughput.

This is how generative AI technology is helping improve the overall experience and productivity of agronomists so they can focus on solving complex challenges and tasks that require human knowledge and intervention.

A real-life example of this solution can be seen within the largest open nutrient database for crop nutrition, powered by the Agmatix infrastructure, where researchers can tap into insights gleaned from thousands of field trials. In this practical scenario, users benefit from guided question prompts and responses facilitated by generative AI. This advanced data processing enhances users’ grasp of evolving trends in crop nutrient uptake and removal, simplifying the creation of decision support systems.

Conclusion

Seed, chemical, and fertilizer manufacturers need innovative, smart agricultural solutions to advance the next generation of genetics and molecules. Ron Baruchi, President and CEO of Agmatix, highlights the beneficial synergy between humans and technology:

“AI complements, rather than replaces, human expertise. By integrating Amazon Bedrock’s generative AI into our infrastructure, we provide our customers with self-service analytical tools that simplify complex and time-consuming tasks.”

This integration equips agronomists and researchers with advanced AI capabilities for data processing and analysis, enabling them to concentrate on strategic decision-making and creative problem-solving.

Field trial management has long needed an infusion of fresh technology. With Agmatix’s AI-enabled agriculture service, powered by AWS, input manufacturers can reduce the time and cost associated with field trials, while improving the overall productivity and experience of agronomists and growers. When growers receive the most successful seeds, crop protection products, and fertilizers, their farming operations can thrive. This approach not only maximizes the efficiency of these essential crop inputs but also minimizes natural resource usage, resulting in a more sustainable and healthier planet for all.

Contact us to learn more about Agmatix.

Resources

Check out the following resources to learn more about AWS and Amazon Bedrock:


About the Authors

Etzik Bega is the Chief Architect of Agmatix, where he has revolutionized the company’s data lake architecture using cutting-edge GenAI technology. With over 25 years of experience in cybersecurity, system architecture, and communications, Etzik has recently focused on helping organizations move to the public cloud securely and efficiently.

Menachem Melamed is a Senior Solutions Architect at AWS, specializing in Big Data analytics and AI. With a deep background in software development and cloud architecture, he empowers organizations to build innovative solutions using modern cloud technologies.

Prerana Sharma is Manager of Solutions Architects at AWS, specializing in Manufacturing. With wide experience of working in the Digital Farming space, Prerana helps customers solve business problems by experimenting and innovating with emerging technologies on AWS.

Read More

Generate financial industry-specific insights using generative AI and in-context fine-tuning


In this blog post, we demonstrate prompt engineering techniques to generate accurate and relevant analysis of tabular data using industry-specific language. This is done by providing large language models (LLMs) with in-context sample data, with features and labels, in the prompt. The results are similar to fine-tuning LLMs without the complexities of fine-tuning models. We used a method called Generative Tabular Learning (GTL), based on the whitepaper From Supervised to Generative: A Novel Paradigm for Tabular Deep Learning with Large Language Models, and demonstrate the advantages of GTL using fully managed JupyterLab notebooks in Amazon SageMaker to interact with Meta Llama models hosted in Amazon SageMaker or Amazon Bedrock. You can check out additional reference notebooks on aws-samples for how to use Meta’s Llama models hosted on Amazon Bedrock.

Prerequisites

The following list describes the prerequisites for this demonstration. You can implement these steps either from the AWS Management Console or using the latest version of the AWS Command Line Interface (AWS CLI).

  • Access to LLMs such as Meta’s Llama models hosted on Amazon SageMaker or Amazon Bedrock
  • Amazon SageMaker Domain configuration configured with JupyterLab notebooks and the necessary python libraries and packages to interact with the LLMs
  • Sample tabular datasets from the financial industry formatted as structured data (we are using exchange-traded funds data from Kaggle) available for querying using a SQL engine like Amazon Athena
  • Knowledge of generative AI prompt engineering techniques to provide LLMs with relevant context and sample data
  • Ability to evaluate and compare LLM-generated outputs for accuracy and relevance to the analysis task
  • Understanding of financial industry data and knowledge of staging and querying this data in a structured tabular format consumable by LLMs
  • Knowledge of the industry domain that the data belongs to in order to determine appropriate features and labels for sample data prompts

Financial industry data

In the financial industry, data can take the form of a table in PDF files or structured data in a database. The following is an example of a financial information dataset for exchange-traded funds (ETFs) from Kaggle, in a structured tabular format, that we used to test our solution.

A user can ask a business- or industry-related question for ETFs.

NOTE: Because we used a SQL query engine to query the dataset for this demonstration, the prompts and generated outputs below mention SQL.

# Business question 
question = "Please provide a list of about 100 ETFs or ETNs names with exposure to US markets" 

# Generate a prompt to get the LLM to provide an SQL query 
SQL_SYS_PROMPT = PromptTemplate.from_template(tmp_sql_sys_prompt).format(
     question=question, 
     table_schema=table_schema_etf 
) 

results = get_llm_sql_analysis( 
     question=question, 
     sql_sys_prompt=SQL_SYS_PROMPT, 
     qna_sys_prompt=QNA_SYS_PROMPT 
)

After the data is retrieved from the dataset, it’s sent to the LLM hosted in Amazon Bedrock (refer to the list of supported models in Amazon Bedrock), which analyzes it and generates a response to the user’s question or query in natural language.
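The `get_llm_sql_analysis` helper isn't shown in the post. One piece such a helper typically needs is extracting the SQL statement from the model's free-form reply before running it against Athena; the following is a hedged sketch of that step (the function name and patterns are assumptions, not the post's actual code):

```python
import re

def extract_sql(model_output: str) -> str:
    """Pull the SQL statement out of a model reply that may wrap it in a
    markdown code fence or surround it with prose."""
    match = re.search(r"```(?:sql)?\s*(.*?)```", model_output, re.DOTALL)
    if match:
        return match.group(1).strip()
    # Fall back to the first SELECT ... ; statement found in plain text
    match = re.search(r"(SELECT\b.*?;)", model_output, re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else model_output).strip()
```

A post-processing step like this matters in text-to-SQL flows because LLM output formatting is non-deterministic: the same prompt may return a bare statement one run and a fenced, annotated one the next.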

The question in the preceding example doesn’t require a lot of complex analysis of the data returned from the ETF dataset. We get a satisfactory response from the LLM, based on its analysis of the data, in industry- and business-relevant language:

LLM SQL Analysis: 
After analyzing the provided SQL query results, I can conclude that the list of ETFs/ETNs does not primarily focus on US markets. Instead, it appears to be a comprehensive list of bond ETFs/ETNs with a global scope, covering various regions, currencies, and bond types.

Here are some key observations:

1. **Global coverage**: The list includes ETFs/ETNs tracking bond markets in Europe (e.g., Eurozone, UK), the US, and globally diversified indices.
2. **Bond types**: The list covers a range of bond types, including corporate bonds, government bonds, high-yield bonds, and green bonds.
3. **Currency exposure**: ETFs/ETNs are denominated in various currencies, such as EUR, USD, and GBP, with some offering hedged exposure to mitigate currency risks.
4. **ESG and SRI focus**: A significant portion of the list consists of ETFs/ETNs with an Environmental, Social, and Governance (ESG) or Socially Responsible Investing (SRI) focus, which suggests a emphasis on sustainable investing.

To answer the original question, I can identify a subset of ETFs/ETNs from the list that have exposure to US markets:

**US-focused ETFs/ETNs:**

1. xxxx USD Corporate Bond 0-3yr ESG UCITS ETF USD (Dist)
2. xxxx USD Corporate Bond ESG 0-3yr UCITS ETF EUR Hedged (Acc)
3. xxxx ESG USD High Yield (DR) UCITS ETF - Dist
4. xxxx USD High Yield Corporate Bond ESG UCITS ETF USD (Acc)
5. xxxx USD High Yield Corporate Bond ESG UCITS ETF USD (Dist)
6. xxxx Index US Corporate SRI UCITS ETF DR (C)
7. xxxx Index US Corporate SRI UCITS ETF DR Hedged EUR (D)
8. xxxx USD Corporate Bond ESG UCITS ETF (Acc)
9. xxxx USD Corporate Bond ESG UCITS ETF (Dist)
10. xxxx ESG USD High Yield Corporate Bond UCITS ETF 1C
11. xxxx ETF (LU) xxxx xxxx US Liquid Corporates Sustainable UCITS ETF (USD) A-dis
12. xxxx USD Corporate Green Bond UCITS ETF 2C Acc USD

Please note that this subset is not exhaustive, and there may be other ETFs/ETNs in the original list that have some exposure to US markets. Additionally, investors should carefully evaluate the investment objectives, risks, and characteristics of each ETF/ETN before making any investment decisions.

NOTE: Output ETF names do not represent the actual data in the dataset used in this demonstration.

NOTE: Outputs generated by LLMs are non-deterministic and may vary in your testing.

What would the LLM’s response or data analysis be when the user’s questions, asked in industry-specific natural language, get more complex? To answer questions that require more complex analysis of the data with industry-specific context, the model needs more information than its pre-trained knowledge alone provides.

Solution overview

We encourage you to think about this question before starting: can enhancing the context provided to the LLM in the prompt, along with the user’s natural language question, generate better outputs before resorting to fine-tuning the LLMs, which requires setting up MLOps processes and environments, collecting and preparing relevant and accurate labeled datasets, and more?

We propose an intermediate GTL framework using the Meta Llama model on Amazon Bedrock. The proposed framework is not meant to replace the fine-tuning option. The following diagram illustrates this framework of GTL for LLMs.

GTL is a type of few-shot prompting technique where we provide the following information about the data retrieved from the structured dataset as part of the prompt to the LLM:

  • A personality for the LLM to use when generating the data analysis (which provides hints to the model to use industry-specific data it has already been pre-trained with)
  • Data features and descriptions
  • Data labels and descriptions
  • A small sample dataset containing features
  • A sample analysis as an example

The following is an example GTL prompt:

instructions = [
    {
        "role": "user",
        "content": """Given the following SQL query results: {query_results}

And the original question: {question}

You are an expert in Exchange-Traded Funds or ETFs and Exchange-Traded Notes or ETNs .
Based on the features of the funds or notes, please predict how expensive the funds are for investors.
I will supply multiple instances with features and the corresponding label for reference.
Please refer to the table below for detailed descriptions of the features and label:
— feature description —
Features:
isin: International Securities Identification Number
wkn: Wertpapierkennnummer or German securities identification number
name: ETF Name
fundprovider: Financial Company providing the ETF
legalstructure: Exchange Traded Fund (ETF) or Exchange Traded Notes (ETN)
totalexpenseratio: An expense ratio is the cost of owning an ETF or ETN, the management fee paid to the fund company for the benefit of owning the fund, 
paid annually and measured as a percent of your investment in the fund. 0.30 percent means you’ll pay $30 per year for every $10,000 you have invested in the fund.
— label description —
Expensive: Whether the fund is expensive for investors or not. 0 means not expensive, 1 means expensive.
— data —
|isin|wkn|name|fundprovider|legalstructure|totalexpenseratio|Expensive|
|GB00BNRRxxxx |A3xxxx|xxxx Physical Staked Cardano|xxxx|ETN|0.0|0|
|BGPLWIG0xxxx|A2xxxx|xxxx Poland WIGxxx UCITS ETF|xxxx|ETF|0.0138|0|
|CH044568xxxx|A2Txxxx|xxxx Crypto Basket Index ETP|xxxx|ETN|0.025|1|
|CH1114873xxxx|A3Gxxxx|xxxx Solana ETP|xxxx|ETN|0.025|1|
|GB00BNRRxxxx|A3xxxx|xxxx Physical Staked Algorand|xxxx|ETN|0.0|<MASK>|
Please use the supplied data to predict the <MASK>. Fund is expensive[1] or not[0]?
Answer: 0
Please provide an analysis and interpretation of the results to answer the original {question}.
"""
    }
]
QNA_SYS_PROMPT = format_instructions(instructions)

In our GTL prompt, we highlight only the subset of the ETF dataset’s 129 columns that is relevant to producing accurate, industry-relevant analysis of the data, along with examples of how to interpret that data.

Use case examples

Let’s look at a few sample prompts with generated analysis. The following question requires complex, industry knowledge-based analysis of data from multiple columns in the ETF database. In this example, the user is trying to find ETFs (funds) that provide higher dividends and lower volatility in value, which are desired characteristics that investors look for in funds they want to invest in.

# Business question 
question = "Name the least risk funds that yields higher dividends and isn't volatile" 

# Generate a prompt to get the LLM to provide an SQL query 
SQL_SYS_PROMPT = PromptTemplate.from_template(tmp_sql_sys_prompt).format(
     question=question, 
     table_schema=table_schema_etf 
) 

results = get_llm_sql_analysis( 
      question=question, 
      sql_sys_prompt=SQL_SYS_PROMPT, 
      qna_sys_prompt=QNA_SYS_PROMPT 
)

The following is our response without the GTL prompt:

LLM SQL Query: 

SELECT 
    name, 
    yeardividendyield, 
    yearvolatilitycur, 
    fiveyearreturncur 
FROM 
    etftable 
WHERE 
    yeardividendyield > 3 
    AND yearvolatilitycur < 10 
    AND fiveyearreturncur > 5 
ORDER BY 
    yearvolatilitycur ASC 
LIMIT 10;

LLM SQL Analysis: 
Based on the provided data, I will analyze the features and predict the GoodInvestment label for each fund. Then, I will identify 
the least risk funds that yield higher dividends and aren't volatile.

**Analysis**

From the data, I observe the following:

1. **Low total expense ratio**: Funds with lower totalexpenseratio are more desirable, as they charge lower management fees. 
A lower expense ratio indicates that the fund is cheaper to own.
2. **High five-year return**: Funds with higher fiveyearreturncur indicate better performance over the past 5 years.
3. **Non-zero yearly dividend yield**: Funds with non-zero yeardividendyield provide a regular income stream to investors.

**Prediction of GoodInvestment label**

Based on the analysis, I predict the GoodInvestment label for each fund as follows:

|isin|GoodInvestment|
| --- | --- |
|LU17815xxxx|1|
|LU17815xxxx7|1|
|IE00B5Bxxxx|0|
|LU14078xxxx|1|
|IE00B4xxxx|1|
|LU17815xxxx|1|

**Least risk funds that yield higher dividends and aren't volatile**

To identify the least risk funds, I will consider the following criteria:

1. **Low volatility**: Funds with lower fiveyearreturncur volatility are less risky.
2. **Higher dividend yield**: Funds with higher yeardividendyield provide a regular income stream to investors.
3. **Low expense ratio**: Funds with lower totalexpenseratio are cheaper to own.

Based on these criteria, I recommend the following funds:

1. **xxxxUK NT (DR) UCITS ETF - Dist (LU17815xxxx)**: This fund has a low expense ratio (0.04%), a relatively high dividend yield (0.0379), 
and a moderate five-year return (0.2296).
2. **xxxxUS Equity (DR) UCITS ETF - Dist (LU17815xxxx)**: This fund has a low expense ratio (0.04%), a relatively high dividend yield (0.0196), 
and a high five-year return (0.7337).
3. **xxxxEURO STOXX 50 UCITS ETF EUR (IE00B4Kxxxx)**: This fund has a moderate expense ratio (0.05%), a relatively high dividend yield (0.0297), 
and a moderate five-year return (0.3927).

These funds offer a balance between dividend yield, low expense ratio, and moderate volatility, making them suitable for investors seeking a 
relatively stable investment with a regular income stream.

We see that the data retrieval uses only the yearly volatility, not the 3-year and 5-year volatility information. Also, some of the funds don’t have volatility data in the dataset (no values for 1-year, 3-year, or 5-year volatility).

The following is a modified question requesting additional column considerations for 3-year and 5-year data.

# Business question 
question = "Name the least risk funds that yields higher dividends and isn't volatile based on five year, three year and one year volatiliy data" 

# Generate a prompt to get the LLM to provide an SQL query 

SQL_SYS_PROMPT = PromptTemplate.from_template(tmp_sql_sys_prompt).format( 
     question=question, 
     table_schema=table_schema_etf 
) 

results = get_llm_sql_analysis( 
     question=question, 
     sql_sys_prompt=SQL_SYS_PROMPT, 
     qna_sys_prompt=QNA_SYS_PROMPT 
)

We use the following GTL prompt with labels to interpret 1-year, 3-year, and 5-year data or lack of data:

instructions = [
    {
        "role": "user",
        "content": """Given the following SQL query results: {query_results}

And the original question: {question}

You are an expert in Exchange-Traded Funds or ETFs and Exchange-Traded Notes or ETNs .
Based on the features of the funds or notes, please predict best funds for investors to invest in.
I will supply multiple instances with features and the corresponding label for reference.
Please refer to the table below for detailed descriptions of the features and label:
— feature description —
Features:
isin: International Securities Identification Number
wkn: Wertpapierkennnummer or German securities identification number
name: ETF Name
fundprovider: Financial Company providing the ETF
legalstructure: Exchange Traded Fund (ETF) or Exchange Traded Notes (ETN)
yeardividendyield: Yearly Dividend yield as a percentage of total investment
fiveyearreturncur: Returns over past 5 year period as a percentage of investment
totalexpenseratio: An expense ratio is the cost of owning an ETF or ETN, the management fee paid to the fund company for the benefit of owning the fund, 
paid annually and measured as a percent of your investment in the fund. 0.30 percent means you’ll pay $30 per year for every $10,000 you have invested in the fund.
— label description —
volatile: The fund has low fiveyearvolatilitycur, threeyearvolatilitycur, yearvolatilitycur. 0 means not volatile, 1 means volatile, 2 means cannot be determined.
— data —
|isin|name|fiveyearvolatilitycur|threeyearvolatilitycur|yearvolatilitycur|Risk|
|LU033504xxxx|xxxx II EUR Overnight Rate Swap UCITS ETF 1D|8.0E-4|9.0E-4|0.0011|0|
|FR001051xxxx|Lyxor Euro Overnight Return UCITS ETF - Acc|8.0E-4|9.0E-4|0.0011|0|
|LU0290358xxxx|xxxxII EUR Overnight Rate Swap UCITS ETF 1C|9.0E-4|0.001|0.0013|0|
|BGCROEX03xxxx|xxxx UCITS ETF|0.675|0.8606|1.4787|1|
|IE000RN0xxxx|xxxxDisruptive Technology Real Estate UCITS ETF Acc||||2|
|GB00BNRRxxxx|xxxxPhysical Staked xxxx||||2|
|FR001075xxxx|xxxx ETF xxxx0-6 Months Euro Investment Grade UCITS ETF EUR (C)|0.0014|0.0016|0.0021|<MASK>|
Please use the supplied data to predict the <MASK>. Fund is volatile[1] or not[0] or cannot-be-determined[2]?
Answer: 1
Please provide an analysis and interpretation of the results to answer the original {question}.
"""
    }
]

We see that with additional prompting the model uses all of the volatility columns in the dataset (1-year, 3-year, and 5-year) and provides output suggestions for when data is present or missing in the volatility columns.

The following is our response with GTL prompts:

LLM SQL Query: 

SELECT 
  name, 
  fundprovider, 
  yeardividendyield, 
  fiveyearvolatilitycur, 
  threeyearvolatilitycur, 
  yearvolatilitycur
FROM 
  etftable
WHERE 
  yeardividendyield > 0 
  AND fiveyearvolatilitycur < (SELECT AVG(fiveyearvolatilitycur) FROM etftable) 
  AND threeyearvolatilitycur < (SELECT AVG(threeyearvolatilitycur) FROM etftable) 
  AND yearvolatilitycur < (SELECT AVG(yearvolatilitycur) FROM etftable)
ORDER BY 
  yeardividendyield DESC, 
  fiveyearvolatilitycur, 
  threeyearvolatilitycur, 
  yearvolatilitycur
LIMIT 10;

LLM SQL Analysis: 
Based on the provided data, I will analyze and interpret the results to answer the original question.

**Least Risk Funds with Higher Dividend Yields and Low Volatility**

To identify the least risk funds, I will consider the following criteria:

1. **Low Volatility**: Funds with low five-year, three-year, and one-year volatility (less than 0.1).
2. **Higher Dividend Yields**: Funds with a higher yearly dividend yield (greater than 0.07).
3. **Low Expense Ratio**: Funds with a low total expense ratio (less than 0.1).

After analyzing the data, I have identified the following funds that meet the above criteria:

1. **xxxx Global Select Dividend 100 Swap UCITS ETF 1D**:
	* Yearly Dividend Yield: 0.0938
	* Five-year Volatility: 0.1096
	* Three-year Volatility: 0.1125
	* One-year Volatility: 0.1236
	* Expense Ratio: Not available in the data
2. **xxxx Emerging Markets Equity Income UCITS ETF**:
	* Yearly Dividend Yield: 0.0797
	* Five-year Volatility: 0.1532
	* Three-year Volatility: 0.1471
	* One-year Volatility: 0.1335
	* Expense Ratio: Not available in the data
3. **xxxx FTSE EM High Dividend Low Volatility UCITS ETF**:
	* Yearly Dividend Yield: 0.0787
	* Five-year Volatility: 0.1566
	* Three-year Volatility: 0.1621
	* One-year Volatility: 0.1439
	* Expense Ratio: Not available in the data

These funds have relatively low volatility and higher dividend yields compared to the other funds in the dataset. However, please note that the 
expense ratio is not available in the data, which is an important factor to consider when evaluating the overall cost of investing in an ETF.

**Additional Considerations**

When evaluating these funds, investors should also consider other factors such as:

1. **Fund Provider**: The reputation and experience of the fund provider can impact the fund's performance and risk profile.
2. **Legal Structure**: The legal structure of the fund (ETF or ETN) can affect its tax efficiency, liquidity, and regulatory environment.
3. **Investment Strategy**: The fund's investment strategy and asset allocation can influence its risk profile and potential returns.

In conclusion, the three funds mentioned above appear to be the least risk funds with higher dividend yields and low volatility based 
on the provided data. However, investors should conduct further research and consider additional factors before making an investment decision.

As we can see, the data retrieval is more accurate. Additionally, the generated analysis considered all of the volatility information in the dataset (1-year, 3-year, and 5-year) and accounted for present or missing volatility data.

Based on this outcome, we recommend building a curated set of GTL prompts for the most common user questions about the datasets. The prompts need to be created by dataset specialists who deeply understand the dataset from an industry perspective and can provide the right context to the LLMs. Organizations can use such a prompt library to build interactive applications that let regular business users, who may not have deep knowledge of the underlying datasets, interact with and gain insights from those datasets using natural language questions.
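Such a prompt library could be sketched, under assumptions, as a simple keyword-based lookup; the intents, keywords, and placeholder prompt texts below are hypothetical:

```python
import re
from typing import Optional

# Hypothetical curated prompt library: dataset specialists register one GTL
# few-shot prompt per question intent, keyed by the vocabulary it covers.
PROMPT_LIBRARY = {
    "volatility": "You are an expert in ETFs ... <few-shot volatility examples>",
    "expense": "You are an expert in ETFs ... <few-shot expense-ratio examples>",
}

INTENT_KEYWORDS = {
    "volatility": {"volatile", "volatility", "risk", "risky"},
    "expense": {"expensive", "expense", "fee", "fees", "cost"},
}

def select_gtl_prompt(question: str) -> Optional[str]:
    """Pick the curated GTL prompt whose keyword set best matches the question;
    return None when nothing matches, so the caller can fall back to a plain prompt."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    best = max(INTENT_KEYWORDS, key=lambda intent: len(INTENT_KEYWORDS[intent] & words))
    return PROMPT_LIBRARY[best] if INTENT_KEYWORDS[best] & words else None
```

A production system would likely replace keyword matching with embedding-based retrieval, but the structure (specialist-authored prompts selected per question) stays the same.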

Conclusion

As newer and larger LLMs are released, they get better at generating analysis of structured datasets using industry-specific language. However, there is room for improvement in the analysis of data from structured datasets. One option is to fine-tune the LLM to improve the relevance and language of the generated data analysis using specific business language. Fine-tuning requires additional effort and cost: collecting relevant data, labeling the data, and procuring, provisioning, and maintaining the fine-tuning compute environment.

In this post, we showcased a few-shot prompting method, using Meta Llama models available through Amazon Bedrock, that can improve industry- or business-specific analysis of the data with prompt engineering alone. (For certain use cases, fine-tuning may still be required; refer to Amazon Bedrock pricing for estimated costs with or without fine-tuned models.)

Try this solution with your own industry-specific use cases and datasets, and let us know your feedback and questions in the comments.

NOTE: Blog authors are not providing any financial or investment advice in this blog post, nor are they recommending this dataset or ETFs mentioned in this dataset.


About the Authors

Randy DeFauw is a Senior Principal Solutions Architect at AWS. He holds an MSEE from the University of Michigan, where he worked on computer vision for autonomous vehicles. He also holds an MBA from Colorado State University. Randy has held a variety of positions in the technology space, ranging from software engineering to product management. He entered the Big Data space in 2013 and continues to explore that area. He is actively working on projects in the ML space and has presented at numerous conferences including Strata and GlueCon.

Arghya Banerjee is a Sr. Solutions Architect at AWS in the San Francisco Bay Area, helping customers adopt and use the AWS Cloud. His areas of focus include Big Data, data lakes, streaming and batch analytics services, and generative AI technologies.

Ravi Ganesh is a Sr. Solutions Architect at AWS in the Austin, Texas area, focused on helping customers address their business problems through cloud adoption. He focuses on analytics, resiliency, security, and generative AI technologies.

Varun Mehta is a Sr. Solutions Architect at AWS. He is passionate about helping customers build enterprise-scale Well-Architected solutions on the AWS Cloud. He works with strategic customers who are using AI/ML to solve complex business problems. Outside of work, he loves to spend time with his wife and kids.

Read More

Preventing side-channels in the cloud



Cloud computing delivers scalable and cost-effective compute resources to a wide range of customers. The ability for cloud providers to share components of the hardware stack across customers, or tenants, is essential for running efficient cloud systems. For example, modern central processing units (CPUs) pack hundreds of physical hardware threads sharing terabytes of dynamic random-access memory (DRAM), which can be flexibly assigned to many independent virtual machines (VMs).

Preventing tenants from snooping on others who share the same hardware requires security mechanisms. Microsoft Azure provides strong protection via comprehensive architectural isolation through access control mechanisms implemented across the cloud platform, including the hardware and the hypervisor. Confidential computing powered by trusted execution environments further hardens architectural isolation via hardware memory encryption to protect tenants even against privileged attackers.

A changing threat landscape

Even with perfect architectural isolation, sharing microarchitectural resources, such as CPU caches and DRAM row buffers, can leak small amounts of information, because interference (due to sharing) leads to variations in the latency of memory accesses. This gives rise to so-called microarchitectural side-channel attacks where a malicious tenant can learn information about another tenant, in the worst case: their cryptographic keys.

Microsoft Azure protects tenants and critical infrastructure against currently practical side-channel attacks. For example, side-channels in on-core resources (e.g., buffers, predictors, private caches) are comprehensively mitigated by Hyper-V HyperClear via core scheduling, microarchitectural flushing and scrubbing, and virtual-processor address space isolation; and our cryptographic libraries are carefully hardened to prevent any secrets from being leaked via microarchitectural side-channels.

However, the threat landscape is changing. First, side-channel attacks are becoming increasingly sophisticated: For example, recent academic research has shown that even cache-coherence directories can be exploited to leak information across cores. Second, future CPUs are likely to employ increasingly sophisticated microarchitectural optimizations, which are prone to new kinds of attacks: For example, the recently introduced data-dependent prefetchers have already been found to leak information.

In Azure Research’s Project Venice, we are investigating principled defenses, to be prepared in case such emerging attacks start posing a risk to Azure customers.

Preventing microarchitectural side-channels with resource-exclusive domains

In a research paper, which has received a distinguished paper award at the ACM Conference on Computer and Communications Security (ACM CCS’24), we present a system design that can prevent cross-VM microarchitectural side-channels in the cloud. Our design provides what we call resource-exclusive domains, which extend the architectural abstraction of private physical threads and private memory to the microarchitectural level. That is, resource-exclusive domains guarantee isolation even against powerful attackers that try to mount side-channel attacks on shared microarchitectural resources.

Our approach builds on isolation schemes, a novel abstraction of the way a CPU shares microarchitectural structures between its physical threads.  Isolation schemes can be used by the hypervisor and host operating system to assign physical threads and physical memory pages, eliminating the risk of information leakage across resource-exclusive domains. Technically, for a given assignment of physical threads to resource-exclusive domains, the isolation scheme partitions each microarchitectural resource that is shared between domains (as this would leak information), but without partitioning resources that are private to a domain (as this would affect performance). We achieve this using hardware mechanisms, if available, and multi-resource memory coloring, if not.
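To give a flavor of memory coloring, here is a deliberately simplified toy sketch in Python. The bit layout, color widths, and partitioning rule are invented for illustration and do not reflect the paper's isolation schemes or any real CPU's address mapping:

```python
def page_color(pfn: int, llc_bits: int = 2, bank_bits: int = 2) -> tuple:
    """Toy color of a physical page frame: the tuple of index bits the page
    occupies in each shared resource (here, a slice of last-level-cache set
    bits and DRAM bank bits). Real address mappings are CPU-specific."""
    llc_color = pfn & ((1 << llc_bits) - 1)
    bank_color = (pfn >> llc_bits) & ((1 << bank_bits) - 1)
    return (llc_color, bank_color)

def partition_pages(pfns, num_domains: int):
    """Give a page to domain d only when *every* color component falls in d's
    partition; pages whose components disagree stay unassigned, which is the
    capacity cost that good multi-resource coloring schemes try to minimize."""
    domains = {d: [] for d in range(num_domains)}
    unassigned = []
    for pfn in pfns:
        llc_color, bank_color = page_color(pfn)
        if llc_color % num_domains == bank_color % num_domains:
            domains[llc_color % num_domains].append(pfn)
        else:
            unassigned.append(pfn)
    return domains, unassigned
```

The point of the sketch is the constraint, not the arithmetic: two domains never receive pages that collide in any shared structure, so interference (and hence the timing side-channel) between them is eliminated by construction.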

In a complementary research paper (also appearing at ACM CCS’24), we provide the theoretical foundations and practical algorithms for computing such multi-resource memory coloring schemes for existing microarchitectures, as well as design patterns for future microarchitectures to support a large number of resource-exclusive domains.

We have implemented our approach in a research prototype based on Microsoft Hyper-V for a modern chiplet-based cloud CPU, the AMD EPYC 7543P, which supports VM-level trusted execution environments. Using a collection of microbenchmarks and cloud benchmarks, we demonstrate that our approach eliminates all identified side-channels and incurs only small performance overheads. For example, when allocating resources at chiplet and channel granularity (i.e., coupling a chiplet with one of the local DRAM channels), we observe an overhead of less than 2%, and only up to 4% when allocating resources at chiplet granularity and coloring with 2 MB pages.

Co-designing cloud platforms for future microarchitectural isolation

To validate the effectiveness and practicality of our approach, we inferred isolation schemes for a single CPU by reverse-engineering its microarchitecture. This approach is incomplete and does not scale to the diverse hardware fleet available in the cloud. We are working with CPU vendors to develop isolation schemes for future CPUs, which will then be exposed via the hardware interface for consumption by the hypervisor’s hardware abstraction layer. In this way, we will be able to reap the benefits of microarchitectural performance optimizations while continuing to provide strong security guarantees to cloud tenants. 

Additional Contributors

Cédric Fournet, Senior Principal Researcher
Jana Hofmann, Researcher
Oleksii Oleksenko, Senior Researcher

The post Preventing side-channels in the cloud appeared first on Microsoft Research.


Deliver personalized marketing with Amazon Bedrock Agents


Creative content plays a crucial role in marketing, and personalized creative content in particular significantly boosts marketing performance. Generating personalized content can be a major challenge for marketers because it requires considerable time and resources: campaigns need multiple versions of creative content across various channels, such as paid media (ads) and owned media, including electronic direct mail (EDM), social media posts, app notifications, and SMS. Scaling this process is especially hard for small and medium-sized businesses.

Generative AI now empowers marketers to efficiently create personalized content, even with limited resources. By using machine learning (ML) models, you can pinpoint customer preferences for specific merchandise and tailor your marketing campaigns accordingly. This enables the crafting of compelling promotional text and striking visuals that effectively resonate with each customer segment, thereby driving engagement and increasing sales. Using Amazon Bedrock Agents to create your own marketing agent allows you to seamlessly accomplish list targeting and personalized material generation for specific marketing purposes.

In this post, we demonstrate a solution using Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Developer Experience, and Amazon Personalize that allows marketers to save time and deliver efficient personalized advertising using a generative AI enhanced solution. Our solution is a marketing agent that shows how Amazon Personalize can effectively segment target customers based on relevant characteristics and behaviors. Additionally, by using Amazon Bedrock Agents and foundation models (FMs), our tool generates personalized creative content specifically tailored to each purpose. It customizes the tone, creative style, and individual preferences according to each customer’s specific prompt, providing highly customized and effective marketing communications.

Marketing agent overview

In the following diagram, we show the components that power our marketing agent.

The difference between an agent and a large language model (LLM) is that an agent comprises not only LLMs but also planning skills, tool usage, and memory. This means that when you provide a natural language prompt, you receive user segment results along with creative content tailored to your specifications. For example, if you want to promote an oven through EDM, social media posts, or SMS, the marketing agent will use its tools to generate a customer list using a segmentation model trained on your data. Furthermore, it will generate creative content that uses your historical creative content as examples and incorporates detailed merchandise data from your database.

The marketing agent solution includes three tools:

  • Merchandise tool – Retrieve merchandise details from Amazon DynamoDB (item database) and deliver them to the creative content tool according to the customer’s prompt.
  • User segment tool – Retrieve a list from Amazon Simple Storage Service (Amazon S3) created by Amazon Personalize that is tailored to the merchandise plan for promotion. This process uses comprehensive user, merchandise (item), and interaction data.
  • Creative content tool – Generate the personalized creative content using an LLM based on the augmented prompt. The augmented prompt is formed by retrieving creative assets data from Amazon Bedrock Knowledge Bases (historical creative content), the merchandise database from DynamoDB, and the user database from DynamoDB, based on the customer’s input prompt.
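A Lambda function backing such an action group might dispatch among these tools roughly as follows. This is a simplified sketch: the event fields, function names, and stub data here are assumptions for illustration, not the exact Bedrock Agents Lambda event schema, and the lookups are in-memory stand-ins for the DynamoDB and S3 calls:

```python
# Hypothetical in-memory stand-in for the DynamoDB merchandise (item) table.
MERCHANDISE_DB = {
    "e1669081-8ffc-4dec-97a6-e9176d7f6651": {"name": "Sans pareil scarf", "price": 124.99},
}

def get_merchandise(item_id):
    # Real solution: query DynamoDB for the item's details.
    return MERCHANDISE_DB.get(item_id, {})

def get_user_segment(item_id):
    # Real solution: read the Amazon Personalize batch segment output from S3.
    return ["3184", "4223", "4301"]

def lambda_handler(event, context):
    """Route a Bedrock Agents action-group call to the matching tool."""
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    function = event.get("function")
    if function == "getMerchandise":
        body = get_merchandise(params["itemId"])
    elif function == "getUserSegment":
        body = {"usersList": get_user_segment(params["itemId"])}
    else:
        body = {"error": f"unknown function {function}"}
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "function": function,
            "functionResponse": {"responseBody": {"TEXT": {"body": str(body)}}},
        },
    }
```

The agent invokes this handler when its orchestration decides a tool is needed, and the returned body is fed back into the model as context for the next step.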

This agent operates based on natural language prompts and your organization’s data. These managed agents serve as intelligent orchestrators, managing interactions between FMs, API integrations, user questions and instructions, and knowledge sources filled with your proprietary data. The agent skillfully coordinates and processes user inputs through various dynamic steps during its runtime.

Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. The single API access, regardless of the models you choose, gives you the flexibility to use different FMs and upgrade to the latest model versions with minimal code changes.

Amazon Bedrock agents plan and run multistep tasks using company systems and data sources—from answering customer questions about your product availability to taking their orders. With Amazon Bedrock, you can create an agent in just a few clicks by selecting an FM and providing it access to your enterprise systems, knowledge bases, and AWS Lambda functions to securely run your APIs. An agent analyzes the user request and automatically calls the necessary APIs and data sources to fulfill the request. Amazon Bedrock Agents enables you to do this securely and privately—you don’t have to engineer prompts, manage session context, or manually orchestrate tasks.

Amazon Bedrock Knowledge Bases is a fully managed capability that helps you implement the entire retrieval augmented generation (RAG) workflow, from ingestion to retrieval and prompt augmentation, without having to build custom integrations to data sources or manage data flows. Session context management is built in, so your app can readily support multi-turn conversations. You can use the Retrieve API to fetch relevant results for a user query from knowledge bases. You can also add knowledge bases to Amazon Bedrock Agents to provide contextual information to agents. The information retrieved from the knowledge base is provided with citations to improve transparency and minimize hallucinations.

Amazon Personalize is a fully managed ML service that uses your data to generate recommendations for your users and enables developers to quickly implement a customized personalization engine, without requiring ML expertise. It accelerates your digital transformation with ML, making it effortless to integrate personalized recommendations into existing websites, applications, email marketing systems, and more.

Solution overview

Amazon Bedrock Agents is our key component for developing our marketing agent. It enables you to build and configure autonomous agents in your application. Agents orchestrate interactions between FMs, data sources, software applications, and user conversations, and automatically call APIs to perform actions and invoke knowledge bases to supplement information for these actions. You can add actions for it to carry out and define how to handle them by writing Lambda functions in a programming language of your choice. For more details, refer to Automate tasks in your application using conversational agents.

We implement the marketing agent through Amazon Bedrock Agents, and use the following key features:

  • Foundation model – The agent invokes an FM to interpret user input, generate subsequent prompts in its orchestration process, and generate creative content based on the customer’s requirement.
  • Instructions – Instructions tell the agent what it’s designed to do and how to do it.
  • Action groups – Action groups are interfaces that an agent uses to interact with the different underlying components such as APIs (such as Amazon Personalize batch inference result on Amazon S3) and databases (such as user or merchandise databases). An agent uses action groups to carry out actions, such as making an API call to another tool.
  • Knowledge base – The knowledge base is a link to an existing data source, consisting of the customer’s historical creative content, which allows the agent to query for extra context for the prompts.

For details about supported models, refer to Supported foundation models in Amazon Bedrock, Supported regions and models for Amazon Bedrock Agents, and Supported regions and models for Amazon Bedrock Knowledge Bases.

The following diagram illustrates the solution workflow.

There are two associated action groups:

  • Segment targeted customer list – Useful for segmenting a customer list for specific merchandise that you aim to promote
  • Generate personalized creative content – Useful for generating creative content tailored to specific purposes, such as diverse customer preferences, varying customer types, and different marketing channels

We use two types of datasets in this solution:

  • Structured customer data – We use customer data, merchandise (item) data, and interaction data to train the segmentation model using Amazon Personalize
  • Unstructured data – We use historical creative content and merchandise (item) data as augmented prompts to make sure that the creative content generated by the LLM aligns with your brand’s style and marketing guidelines

When the marketing agent receives a prompt from a business user, it follows a number of steps as part of its orchestration:

  1. Outline the steps for the task by using an LLM within Amazon Bedrock according to the specifications provided in the prompt.
  2. Follow chain-of-thought reasoning and instructions, and complete the steps using appropriate action groups. As part of the process, depending on the prompt, the agent will search and identify relevant context for RAG.
  3. Pass the results with the prompt to an LLM within Amazon Bedrock.
  4. Augment the prompt with the results of the tool execution or knowledge base search and send it to the LLM.
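Conceptually, and greatly simplified, the orchestration above resembles the following loop. This is a toy sketch of the pattern, not Amazon Bedrock's actual internal logic; the `llm` and `tools` callables are placeholders you would supply:

```python
def run_agent(prompt, llm, tools):
    """Toy agent loop: plan steps, run matching tools, then augment the prompt."""
    # Step 1: ask the model to outline the steps for the task.
    plan = llm(f"Outline steps for: {prompt}")
    context = []
    # Step 2: complete each step with the appropriate action group (tool).
    for step in plan:
        if step["tool"] in tools:
            context.append(tools[step["tool"]](step["input"]))
    # Steps 3-4: augment the prompt with tool results and send it back to the LLM.
    return llm(f"{prompt}\nContext: {context}")
```

In the real service, planning, tool invocation, and session context are all managed by Amazon Bedrock Agents; the loop is shown only to make the four steps concrete.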

The following diagram illustrates the technical architecture and key steps.

Amazon Bedrock Agents allows you to set up the entire process, including getting the user segmentation list from Amazon Personalize and generating the personalized promotional content with Anthropic’s Claude 3 on Amazon Bedrock. There are three steps: data preparation, agent development, and agent testing. You can find the sample code and the AWS Cloud Development Kit (AWS CDK) stack in the GitHub repo.

Prepare the data

Complete the following steps to prepare your data:

  1. Store your creative content on Amazon S3. Ingest your data by generating embeddings with an FM and storing them in a supported vector store like Amazon OpenSearch Service.
  2. Use Amazon Bedrock Knowledge Bases by specifying an S3 bucket that contains your exported creative content data. For instructions, refer to Retrieve data and generate AI responses with knowledge bases.
    1. Use OpenSearch Service as the vector database.
    2. Complete the knowledge base configuration and synchronize data from Amazon S3 to OpenSearch Service so the vector database data remains up to date.
  3. Initiate an Amazon Personalize job with the USER_SEGMENTATION recipe to create user segmentations and export the results to Amazon S3. For more information, see Getting user segments.
    1. Upload your user dataset, interactions dataset, and item dataset into Amazon S3 for model training and create a batch segment job to get your user segment list. This allows you to map item IDs to a list of users interested in these items.
    2. The batch segment job output will be a JSON file stored on Amazon S3 that looks like the following example:
{"input": {"itemId": "e1669081-8ffc-4dec-97a6-e9176d7f6651"}, "output": {"usersList": ["3184","4223","4301",...]}, "error": null}
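Each line of that output is a standalone JSON record, so downstream code can map a promoted item to its target users with a simple parse, for example:

```python
import json

# One line of the Personalize batch segment output, as shown above
# (user list truncated for brevity).
line = ('{"input": {"itemId": "e1669081-8ffc-4dec-97a6-e9176d7f6651"}, '
        '"output": {"usersList": ["3184", "4223", "4301"]}, "error": null}')

record = json.loads(line)
if record["error"] is None:
    item_id = record["input"]["itemId"]       # the merchandise being promoted
    users = record["output"]["usersList"]     # customers to target
```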

Build the agent

In this solution, you need a marketing agent, a creative content knowledge base, and three tools (the merchandise tool to get detailed merchandise information, the user segment tool to get the target audience list, and the creative content tool to generate creative content with the LLM), tailored to automate the various tasks associated with delivering personalized creative content efficiently. Complete the following steps to build your agent:

  1. Clone the repository to your local machine or AWS environment, set up a virtual environment and activate it, download the related data, and install the required Python packages using the following code:
git clone https://github.com/aws-samples/amazon-bedrock-samples.git

cd ./amazon-bedrock-samples/agents-for-bedrock/use-case-examples/marketing-agent

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
wget https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/ML-16145/data.zip
unzip data.zip
wget https://code.retaildemostore.retail.aws.dev/images.tar.gz
mkdir -p data/image

tar xf images.tar.gz -C data/image
  2. Deploy using the following code:
cdk deploy
  3. Locate the agent name in the output of the preceding command. In the following screenshot, for example, the Amazon Bedrock agent name is bedrock-agent-markegent-2e1ece3c.

To dive deeper into the solution and code for the complete steps, see the GitHub repo.

Test the agent

You can test the agent on the Amazon Bedrock console. For instructions, see Test and troubleshoot agent behavior.

The following screenshot shows an example of testing the agent.

We used the following input message:

I would like to write a social media post to promote the Sans pareil scarf (e1669081-8ffc-4dec-97a6-e9176d7f6651). Please assist me in crafting the marketing content.

The following is the agent output:

Elevate Your Style with Timeless Elegance 💎 Introducing the Sans Pareil Scarf - a true masterpiece of fashion designed for the sophisticated, style-conscious woman. Crafted with impeccable artistry from the finest materials, this $124.99 luxury accessory exudes an air of refinement that instantly elevates any outfit. Its rich colors and intricate patterns complement both casual and formal ensembles, adding an undeniable touch of elegance. Whether dressing for the office, a night out, or a special occasion, the Sans Pareil Scarf is the ultimate statement piece for the modern woman seeking timeless style. From the fashion-forward trendsetter to the classically chic professional, this exquisite scarf transcends age while remaining a treasured investment that will never go out of style. Indulge in enduring quality and sophistication with the Sans Pareil Scarf. Elevate your style and turn heads wherever you go. #LuxuryFashion #SansPareilScarf #TimelessElegance

Clean up

In this post, we demonstrated interactions using Lambda, Amazon Bedrock Agents, and Amazon Bedrock Knowledge Bases. To avoid incurring additional costs, delete these resources in the following order:

  1. Delete the Amazon Bedrock agent.
  2. Delete the Amazon Bedrock knowledge base and its associated resources.
  3. Delete the Lambda function and its related resources.

Summary

In this post, we discussed the use case of targeted marketing as an example to demonstrate the efficient delivery of personalized marketing creative content and target audience lists through a generative AI agent. The next step might involve developing a reinforcement learning-based agent to iterate on the performance of the agent.

Our customer Chunghwa Telecom (CHT), a leading telecom company in Taiwan, followed this solution to implement a generative AI enhanced marketing technology tool on Amazon Bedrock. The marketing agent enabled CHT to launch tailored campaigns promptly, realizing personalized marketing strategies and a 24-fold increase in clickthrough rate.

To use our marketing agent to enhance your marketing tasks, refer to the GitHub repo.


About the Authors

Ray Wang is a Senior Solutions Architect at AWS. With 10 years of experience in the IT industry, Ray is dedicated to building modern solutions on the cloud, especially in NoSQL, big data, machine learning, and generative AI. As a hungry go-getter, he earned all 12 AWS certifications to make his technical expertise not only deep but broad. He loves to read and watch sci-fi movies in his spare time.

Paul Lu is a Senior Solutions Architect at Amazon Web Services (AWS). He specializes in serverless and modern application development, helping customers design high-performing, scalable cloud solutions. With extensive experience, he is passionate about driving innovation and delivering exceptional results.


GPU’s Companion: NVIDIA App Supercharges RTX GPUs With AI-Powered Tools and Features


The NVIDIA app — officially releasing today — is a companion platform for content creators, GeForce gamers and AI enthusiasts using GeForce RTX GPUs.

Featuring a GPU control center, the NVIDIA app allows users to access all their GPU settings in one place. From the app, users can do everything from updating to the latest drivers and configuring NVIDIA G-SYNC monitor settings, to tapping AI video enhancements through RTX Video and discovering exclusive AI-powered NVIDIA apps.

In addition, NVIDIA RTX Remix has a new update that improves performance and streamlines workflows.

For a deeper dive on gaming-exclusive benefits, check out the GeForce article.

The GPU’s PC Companion

The NVIDIA app turbocharges GeForce RTX GPUs with a bevy of applications, features and tools.

Keep NVIDIA Studio Drivers up to date — The NVIDIA app automatically notifies users when the latest Studio Driver is available. These graphics drivers, fine-tuned in collaboration with developers, enhance performance in top creative applications and are tested extensively to deliver maximum stability. They’re released once a month.

Discover AI creator apps — Millions have used the NVIDIA Broadcast app to turn offices and dorm rooms into home studios using AI-powered features that improve audio and video quality — without the need for expensive, specialized equipment. It’s user-friendly, works in virtually any app and includes AI features like Noise and Acoustic Echo Removal, Virtual Backgrounds, Eye Contact, Auto Frame, Vignettes and Video Noise Removal.

NVIDIA RTX Remix is a modding platform built on NVIDIA Omniverse that allows users to capture game assets, automatically enhance materials with generative AI tools and create stunning RTX remasters with full ray tracing, including DLSS 3.5 support featuring Ray Reconstruction.

NVIDIA Canvas uses AI to turn simple brushstrokes into realistic landscape images. Artists can create backgrounds quickly or speed up concept exploration, enabling them to visualize more ideas.

Enhance video streams with AI — The NVIDIA app includes a System tab as a one-stop destination for display, video and GPU options. It also includes an AI feature called RTX Video that enhances all videos streamed on browsers.

RTX Video Super Resolution uses AI to enhance video streaming on GeForce RTX GPUs by removing compression artifacts and sharpening edges when upscaling.

RTX Video HDR converts any standard dynamic range video into vibrant high dynamic range (HDR) when played in Google Chrome, Microsoft Edge, Mozilla Firefox or the VLC media player. HDR enables more vivid, dynamic colors to enhance gaming and content creation. A compatible HDR10 monitor is required.

Give game streams or video on demand a unique look with AI filters — Content creators looking to elevate their streamed or recorded gaming sessions can access the NVIDIA app’s redesigned Overlay feature with AI-powered game filters.

Freestyle RTX filters allow livestreamers and content creators to apply fun post-processing filters, changing the look and mood of content with tweaks to color and saturation.

Joining these Freestyle RTX game filters is RTX Dynamic Vibrance, which enhances visual clarity on a per-app basis. Colors pop more on screen, and color crushing is minimized to preserve image quality and immersion. The filter is accelerated by Tensor Cores on GeForce RTX GPUs, making it easier for viewers to enjoy all the action.

Enhanced visual clarity with RTX Dynamic Vibrance.

Freestyle RTX filters empower gamers to personalize the visual aesthetics of their favorite games through real-time post-processing filters. This feature boasts compatibility with a vast library of more than 1,200 games.

Download the NVIDIA app today.

RTX Remix 0.6 Release

The new RTX Remix update offers modders significantly improved mod performance, as well as quality of life improvements that help streamline the mod-making process.

RTX Remix now supports the ability to test experimental features under active development. It includes a new Stage Manager that makes it easier to see and change every mesh, texture, light or element in scenes in real time.

To learn more about the RTX Remix 0.6 release, check out the release notes.

With RTX Remix in the NVIDIA app launcher, modders have direct access to Remix’s powerful features. Through the NVIDIA app, RTX Remix modders can benefit from faster start-up times, lower CPU usage and direct control over updates with an optimized user interface.

To the 3D Victor Go the Spoils

NVIDIA Studio in June kicked off a 3D character contest for artists in collaboration with Reallusion, a company that develops 2D and 3D character creation and animation software. Today, we’re celebrating the winners from that contest.

In the category of Best Realistic Character Animation, Robert Lundqvist won for the piece Lisa and Fia.

In the category of Best Stylized Character Animation, Loic Bramoulle won for the piece HellGal.

Both winners will receive an NVIDIA Studio-validated laptop to help further their creative efforts.

View over 250 imaginative and impressive entries here.

Follow NVIDIA Studio on Instagram, X and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. 

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.
