We investigate the out-of-domain generalization of random feature (RF) models and Transformers. We first prove that in the ‘generalization on the unseen (GOTU)’ setting, where training data is fully seen in some part of the domain but testing is made on another part, and for RF models in the small feature regime, the convergence takes place to interpolators of minimal degree as in the Boolean case (Abbe et al., 2023). We then consider the sparse target regime and explain how this regime relates to the small feature regime, but with a different regularization term that can alter the picture in… (Apple Machine Learning Research)
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition. This deceptively implies that the weights are utilizing the entire space available to them. We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact. This preserves the functional mapping… (Apple Machine Learning Research)
Mile-High AI: NVIDIA Research to Present Advancements in Simulation and Gen AI at SIGGRAPH
NVIDIA is taking an array of advancements in rendering, simulation and generative AI to SIGGRAPH 2024, the premier computer graphics conference, which will take place July 28 – Aug. 1 in Denver.
More than 20 papers from NVIDIA Research introduce innovations advancing synthetic data generators and inverse rendering tools that can help train next-generation models. NVIDIA’s AI research is making simulation better by boosting image quality and unlocking new ways to create 3D representations of real or imagined worlds.
The papers focus on diffusion models for visual generative AI, physics-based simulation and increasingly realistic AI-powered rendering. They include two technical Best Paper Award winners and collaborations with universities across the U.S., Canada, China, Israel and Japan as well as researchers at companies including Adobe and Roblox.
These initiatives will help create tools that developers and businesses can use to generate complex virtual objects, characters and environments. Synthetic data generation can then be harnessed to tell powerful visual stories, aid scientists’ understanding of natural phenomena or assist in simulation-based training of robots and autonomous vehicles.
Diffusion Models Improve Texture Painting, Text-to-Image Generation
Diffusion models, a popular tool for transforming text prompts into images, can help artists, designers and other creators rapidly generate visuals for storyboards or production, reducing the time it takes to bring ideas to life.
Two NVIDIA-authored papers are advancing the capabilities of these generative AI models.
ConsiStory, a collaboration between researchers at NVIDIA and Tel Aviv University, makes it easier to generate multiple images with a consistent main character — an essential capability for storytelling use cases such as illustrating a comic strip or developing a storyboard. The researchers’ approach introduces a technique called subject-driven shared attention, which reduces the time it takes to generate consistent imagery from 13 minutes to around 30 seconds.
NVIDIA researchers last year won the Best in Show award at SIGGRAPH’s Real-Time Live event for AI models that turn text or image prompts into custom textured materials. This year, they’re presenting a paper that applies 2D generative diffusion models to interactive texture painting on 3D meshes, enabling artists to paint in real time with complex textures based on any reference image.
Kick-Starting Developments in Physics-Based Simulation
Graphics researchers are narrowing the gap between physical objects and their virtual representations with physics-based simulation — a range of techniques to make digital objects and characters move the same way they would in the real world.
Several NVIDIA Research papers feature breakthroughs in the field, including SuperPADL, a project that tackles the challenge of simulating complex human motions based on text prompts.
Using a combination of reinforcement learning and supervised learning, the researchers demonstrated how the SuperPADL framework can be trained to reproduce the motion of more than 5,000 skills — and can run in real time on a consumer-grade NVIDIA GPU.
Another NVIDIA paper features a neural physics method that applies AI to learn how objects — whether represented as a 3D mesh, a NeRF or a solid object generated by a text-to-3D model — would behave as they are moved in an environment.
A paper written in collaboration with Carnegie Mellon University researchers develops a new kind of renderer — one that, instead of modeling physical light, can perform thermal analysis, electrostatics and fluid mechanics. Named one of five best papers at SIGGRAPH, the method is easy to parallelize and doesn’t require cumbersome model cleanup, offering new opportunities for speeding up engineering design cycles.
In one example, the renderer performs a thermal analysis of the Mars Curiosity rover, where keeping temperatures within a specific range is critical to mission success.
Additional simulation papers introduce a more efficient technique for modeling hair strands and a pipeline that accelerates fluid simulation by 10x.
Raising the Bar for Rendering Realism, Diffraction Simulation
Another set of NVIDIA-authored papers presents new techniques to model visible light up to 25x faster and simulate diffraction effects — such as those used in radar simulation for training self-driving cars — up to 1,000x faster.
A paper by NVIDIA and University of Waterloo researchers tackles free-space diffraction, an optical phenomenon where light spreads out or bends around the edges of objects. The team’s method can integrate with path-tracing workflows to increase the efficiency of simulating diffraction in complex scenes, offering up to 1,000x acceleration. Beyond rendering visible light, the model could also be used to simulate the longer wavelengths of radar, sound or radio waves.
Path tracing samples numerous paths — multi-bounce light rays traveling through a scene — to create a photorealistic picture. Two SIGGRAPH papers improve sampling quality for ReSTIR, a path-tracing algorithm first introduced by NVIDIA and Dartmouth College researchers at SIGGRAPH 2020 that has been key to bringing path tracing to games and other real-time rendering products.
One of these papers, a collaboration with the University of Utah, shares a new way to reuse calculated paths that increases effective sample count by up to 25x, significantly boosting image quality. The other improves sample quality by randomly mutating a subset of the light’s path. This helps denoising algorithms perform better, producing fewer visual artifacts in the final render.
Teaching AI to Think in 3D
NVIDIA researchers are also showcasing multipurpose AI tools for 3D representations and design at SIGGRAPH.
One paper introduces fVDB, a GPU-optimized framework for 3D deep learning that matches the scale of the real world. The fVDB framework provides AI infrastructure for the large spatial scale and high resolution of city-scale 3D models and NeRFs, and segmentation and reconstruction of large-scale point clouds.
A Best Technical Paper award winner written in collaboration with Dartmouth College researchers introduces a theory for representing how 3D objects interact with light. The theory unifies a diverse spectrum of appearances into a single model.
And a collaboration with University of Tokyo, University of Toronto and Adobe Research introduces an algorithm that generates smooth, space-filling curves on 3D meshes in real time. While previous methods took hours, this framework runs in seconds and offers users a high degree of control over the output to enable interactive design.
NVIDIA at SIGGRAPH
Learn more about NVIDIA at SIGGRAPH, with special events including a fireside chat between NVIDIA founder and CEO Jensen Huang and Lauren Goode, senior writer at WIRED, on the impact of robotics and AI in industrial digitalization.
NVIDIA researchers will also present OpenUSD Day by NVIDIA, a full-day event showcasing how developers and industry leaders are adopting and evolving OpenUSD to build AI-enabled 3D pipelines.
NVIDIA Research has hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics. See more of their latest work.
Enhancing CTC-based Speech Recognition with Diverse Modeling Units
In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures like the Transformer. On top of E2E systems, researchers have achieved substantial accuracy improvements by rescoring an E2E model’s N-best hypotheses with a phoneme-based model. This raises an interesting question about where the improvements come from, other than the system combination effect. We examine the underlying mechanisms driving these gains and propose an efficient joint training approach, where E2E models are trained jointly… (Apple Machine Learning Research)
On Computationally Efficient Multi-Class Calibration
Consider a multi-class labelling problem, where the labels can take values in [k], and a predictor predicts a distribution over the labels. In this work, we study the following foundational question: Are there notions of multi-class calibration that give strong guarantees of meaningful predictions and can be achieved in time and sample complexities polynomial in k? Prior notions of calibration exhibit a tradeoff between computational efficiency and expressivity: they either suffer from having sample complexity exponential in k, or needing to solve computationally intractable problems, or give… (Apple Machine Learning Research)
Omnipredictors for Regression and the Approximate Rank of Convex Functions
Consider the supervised learning setting where the goal is to learn to predict labels y given points x from a distribution. An omnipredictor for a class L of loss functions and a class C of hypotheses is a predictor whose predictions incur less expected loss than the best hypothesis in C for every loss in L. Since the work of [GKR+21] that introduced the notion, there has been a large body of work in the setting of binary labels where y∈{0,1}, but much less is known about the regression setting where y∈[0,1] can be continuous. Our main conceptual contribution is the notion of sufficient… (Apple Machine Learning Research)
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Despite the successes of large language models (LLMs), they exhibit significant drawbacks, particularly when processing long contexts. Their inference cost scales quadratically with respect to sequence length, making it expensive for deployment in some real-world text processing applications, such as retrieval-augmented generation (RAG). Additionally, LLMs exhibit the “distraction phenomenon,” where irrelevant context in the prompt degrades output quality. To address these drawbacks, we propose a novel RAG prompting methodology, superposition prompting, which can be directly applied to… (Apple Machine Learning Research)
How Smooth Is Attention?
Self-attention and masked self-attention are at the heart of Transformers’ outstanding success. Still, our mathematical understanding of attention, in particular of its Lipschitz properties — which are key when it comes to analyzing robustness and expressive power — is incomplete. We provide a detailed study of the Lipschitz constant of self-attention in several practical scenarios, discussing the impact of the sequence length and layer normalization on the local Lipschitz constant of both unmasked and masked self-attention. In particular, we show that for inputs of length n in any compact… (Apple Machine Learning Research)
Careful With That Scalpel: Improving Gradient Surgery With an EMA
Beyond minimizing a single training loss, many deep learning estimation pipelines rely on an auxiliary objective to quantify and encourage desirable properties of the model (e.g. performance on another dataset, robustness, agreement with a prior). Although the simplest approach to incorporating an auxiliary loss is to sum it with the training loss as a regularizer, recent works have shown that one can improve performance by blending the gradients beyond a simple sum; this is known as gradient surgery. We cast the problem as a constrained minimization problem where the auxiliary objective is… (Apple Machine Learning Research)
Using Agents for Amazon Bedrock to interactively generate infrastructure as code
In the diverse toolkit available for deploying cloud infrastructure, Agents for Amazon Bedrock offers a practical and innovative option for teams looking to enhance their infrastructure as code (IaC) processes. Agents for Amazon Bedrock automates the prompt engineering and orchestration of user-requested tasks. After being configured, an agent builds the prompt and augments it with your company-specific information to provide responses back to the user in natural language.
This solution shows how Amazon Bedrock agents can be configured to accept cloud architecture diagrams, automatically analyze them, and generate Terraform or AWS CloudFormation templates. It uses Retrieval Augmented Generation (RAG) to ensure the generated scripts adhere to organizational needs and industry standards. A key feature is the agent’s ability to dynamically interact with users. During the IaC generation process, Amazon Bedrock agents actively probe for additional information by analyzing the provided diagrams and querying the user to fill any gaps. This interaction allows for a more tailored and precise IaC configuration.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading artificial intelligence (AI) companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
In this blog post, we explore how Agents for Amazon Bedrock can be used to generate customized IaC scripts that comply with organizational standards, directly from uploaded architecture diagrams. This helps accelerate deployments, reduce errors, and ensure adherence to security guidelines.
Solution overview
Before we explore the deployment process, let’s walk through the key steps of the architecture.
- Initial Input through the Amazon Bedrock chat console: The user begins by entering the name of their Amazon Simple Storage Service (Amazon S3) bucket and the object (key) name where the architecture diagram is stored into the Amazon Bedrock chat console. For instance, if an architecture diagram is saved as s3://testbucket/architecturediagram.png, the user will enter testbucket as the S3 bucket name and architecturediagram.png as the object name.
- Diagram analysis and query generation: The Amazon Bedrock agent forwards the architecture diagram location to an action group that invokes an AWS Lambda function (a minimal sketch follows this list). This function retrieves the architecture diagram from the specified S3 bucket, analyzes it using the Amazon Bedrock model, and produces a summary of the diagram. It also generates questions regarding any missing components, dependencies, or parameter values that are needed to create IaC for AWS services. This detailed response is then sent back to the agent.
- Interaction and user confirmation: The agent displays the generated questions to the user and records their responses. Next, the agent provides a comprehensive summary of the architecture diagram along with additional inputs provided by the user. Users then have the opportunity to approve this configuration or suggest any necessary adjustments. On receiving confirmation from the user, the agent passes this information to the second action group to generate IaC.
- IaC generation and deployment: The second action group invokes a Lambda function that processes the user’s input data along with organization-specific coding guidelines from Knowledge Bases for Amazon Bedrock to create the IaC. After being generated, the IaC is automatically pushed to a designated GitHub repository.
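For readers who want a concrete starting point, here is a minimal sketch of the diagram-analysis Lambda function from the second step above. The handler structure, parameter names (bucket, key), and prompt text are illustrative assumptions rather than the code from the post’s repository; the event and return shapes follow the Lambda contract for Agents for Amazon Bedrock action groups.

```python
import base64
import json

import boto3

s3 = boto3.client("s3")
bedrock = boto3.client("bedrock-runtime")

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


def lambda_handler(event, context):
    # The agent passes the S3 location collected from the user as action-group parameters.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    image = s3.get_object(Bucket=params["bucket"], Key=params["key"])["Body"].read()

    # Ask the multimodal model to summarize the diagram and list missing details.
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64", "media_type": "image/png",
                            "data": base64.b64encode(image).decode()}},
                {"type": "text",
                 "text": "Summarize this AWS architecture diagram and list any "
                         "missing components, dependencies, or parameter values "
                         "needed to generate Terraform for it."},
            ],
        }],
    }
    result = json.loads(
        bedrock.invoke_model(modelId=MODEL_ID, body=json.dumps(body))["body"].read()
    )
    analysis = result["content"][0]["text"]

    # Return the analysis in the response shape Bedrock agents expect from an action group.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": event["apiPath"],
            "httpMethod": event["httpMethod"],
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps({"analysis": analysis})}},
        },
    }
```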
Prerequisites
You should have the following:
- Understanding of Agents for Amazon Bedrock, prompt engineering, Knowledge Bases for Amazon Bedrock, Lambda functions, and AWS Identity and Access Management (IAM).
- An AWS account with the appropriate IAM permissions to create Amazon Bedrock agents and knowledge bases, Lambda functions, and IAM roles.
- A service role for Agents for Amazon Bedrock.
- A GitHub account with a repository to store the generated Terraform scripts.
Deployment steps
The solution can be used to create IaC (using Terraform or CloudFormation) by inputting the architecture diagram. For the purpose of this blog post, we focus on creating Terraform IaC. There are four steps to deploy the solution.
Step 1: Configure an Amazon Bedrock knowledge base: Configuring a knowledge base (KB) enables you to access information about your organization’s standard Terraform modules. Follow these steps to set up your KB:
- Sign in and go to the AWS Management Console for Amazon Bedrock. Go directly to the Knowledge Base section. This is your starting point for creating a new KB.
- Enter a clear and descriptive name that reflects the purpose of your KB, such as Terraform KB.
- Assign a pre-configured IAM role with the necessary permissions. It’s typically best to let Amazon Bedrock create this role for you to ensure it has the correct permissions.
- Define the data sources by uploading a JSON file to an S3 bucket with encryption enabled for security. This file should contain a structured list of AWS services and Terraform modules. For the JSON structure, use the example provided in the repository.
- Choose the default embeddings model. For most use cases, the Amazon Titan Embeddings G1 – Text model will suffice. It’s pre-configured and ready to use, simplifying the process.
- Use the managed vector store to allow Amazon Bedrock to create and manage the vector store for you in Amazon OpenSearch Service.
- Select the KB and in the Data source section, choose Sync to begin data ingestion. When data ingestion completes successfully, a green success banner appears (a scripted alternative is sketched after this list).
- Double-check all entered information for accuracy. Pay special attention to the S3 bucket URI and IAM role details.
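If you prefer to script the sync rather than choose Sync in the console, a minimal boto3 sketch follows; the knowledge base and data source IDs are placeholders you would copy from the console.

```python
import time

import boto3

agent_client = boto3.client("bedrock-agent")

KB_ID = "KB12345678"           # placeholder: from the knowledge base details page
DATA_SOURCE_ID = "DS12345678"  # placeholder: from the Data source section

# Kick off ingestion of the JSON file uploaded to the S3 data source.
job = agent_client.start_ingestion_job(
    knowledgeBaseId=KB_ID, dataSourceId=DATA_SOURCE_ID
)["ingestionJob"]

# Poll until the job finishes; the console's green banner corresponds to COMPLETE.
while job["status"] not in ("COMPLETE", "FAILED"):
    time.sleep(10)
    job = agent_client.get_ingestion_job(
        knowledgeBaseId=KB_ID,
        dataSourceId=DATA_SOURCE_ID,
        ingestionJobId=job["ingestionJobId"],
    )["ingestionJob"]

print("Ingestion status:", job["status"])
```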
Step 2: Configure the Amazon Bedrock agent:
- Open the Amazon Bedrock console, select Agents in the left navigation panel, then choose Create Agent.
- Enter agent details including agent name and description (optional).
- Next, grant the agent permissions to AWS services through the IAM service role. This gives your agent access to required services, such as Lambda.
- Select a foundation model from Amazon Bedrock (for example, Anthropic Claude 3 Sonnet).
- To create Terraform code using Agents for Amazon Bedrock, attach the following instruction to the agent:
“Assist users in creating IaC for provided architecture diagram. Ask user for S3 bucket name and object name where the diagram is stored. Upon receiving the information, run analysis-query action group. Give structured summary and ask user only the questions that are received from action group response. Take the answers from the user and give detailed summary to the user. Take approval from user. When approved, give all that information to final draft along with S3 bucket name, object name as input for the iac-deployment action group and run the action group.”
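The agent can also be created programmatically. The following boto3 sketch passes the instruction quoted above verbatim; the agent name, model ID, and role ARN are placeholder assumptions, not values from the original post.

```python
import boto3

agent_client = boto3.client("bedrock-agent")

# The agent instruction quoted above, passed as-is.
INSTRUCTION = (
    "Assist users in creating IaC for provided architecture diagram. Ask user for "
    "S3 bucket name and object name where the diagram is stored. Upon receiving the "
    "information, run analysis-query action group. Give structured summary and ask "
    "user only the questions that are received from action group response. Take the "
    "answers from the user and give detailed summary to the user. Take approval from "
    "user. When approved, give all that information to final draft along with S3 "
    "bucket name, object name as input for the iac-deployment action group and run "
    "the action group."
)

response = agent_client.create_agent(
    agentName="iac-generator",  # placeholder name
    foundationModel="anthropic.claude-3-sonnet-20240229-v1:0",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentServiceRole",  # placeholder
    instruction=INSTRUCTION,
)
print("Agent ID:", response["agent"]["agentId"])
```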
Step 3: Configure agent action groups: After the initial agent configuration and adding the above instruction to the agent, two action groups need to be added to the agent to create Terraform IaC from an architecture diagram.
- Create an action group linked to a Lambda function (for creating a Lambda function, see Getting started with Lambda) that analyzes the architecture diagram and generates questions about any missing components, dependencies, or parameter values necessary for IaC creation of AWS services. This group is invoked by the agent after the user enters the S3 bucket and object details. The responses are then relayed back to the agent, which conducts an interactive session to collect any missing information from the user. See the Lambda code and OpenAPI schema in the repository.
- Establish a second action group tied to a different Lambda function responsible for creating the Terraform code and uploading it to a GitHub repository. This group is invoked only after the user has reviewed and approved the infrastructure configuration. See the Lambda code and OpenAPI schema in the repository.
Step 4: Add the action groups to the agent:
- Assign a descriptive name to each action group and detail their functions in the description fields. This helps clarify the purpose of each group within the workflow.
- For each action group, select the appropriate Lambda functions that you set up previously. These functions run the business logic required when an action is invoked. Make sure to choose the correct version of each Lambda function. For additional details, see the section on Action Group Lambda Functions.
- Provide the Amazon S3 URI that links to the API schema for each action group. This schema should include the API’s description, structure, and parameters. The API is crucial for managing the workflow, such as receiving user inputs, invoking Lambda functions to run the process, validating inputs, initiating Terraform module creation, and monitoring the provisioning status. For further guidance, see the section on Action Group OpenAPI Schemas. Both the Lambda function and the schema URI can also be attached programmatically, as sketched after this list.
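As a sketch of what these two settings look like in code, the following boto3 call attaches one action group, pointing at its Lambda function and its OpenAPI schema in S3. The agent ID, function ARN, bucket, and object key are placeholders.

```python
import boto3

agent_client = boto3.client("bedrock-agent")

AGENT_ID = "AGENT123456"  # placeholder: from create_agent or the console

agent_client.create_agent_action_group(
    agentId=AGENT_ID,
    agentVersion="DRAFT",  # action groups are edited on the working draft
    actionGroupName="analysis-query",
    description="Analyzes the architecture diagram and asks follow-up questions.",
    actionGroupExecutor={
        "lambda": "arn:aws:lambda:us-east-1:123456789012:function:diagram-analysis"
    },
    apiSchema={
        "s3": {"s3BucketName": "my-schema-bucket",
               "s3ObjectKey": "analysis-query-schema.json"}
    },
)

# Prepare the draft so the new action group is picked up at test time.
agent_client.prepare_agent(agentId=AGENT_ID)
```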
The following screenshot shows an example of the user interaction with Agents for Amazon Bedrock.
The following screenshot shows an example Terraform output.
Clean up
The services used in this demonstration can incur costs. Complete the following steps to clean up your resources (a scripted version is sketched after the list):
- Delete the Lambda functions if they’re no longer required.
- Delete the action groups and the Amazon Bedrock agent that were created.
- Empty and delete the S3 bucket used for storing the architecture diagram.
- Remove the generated Terraform scripts from the GitHub repo.
- Delete the Amazon Bedrock knowledge base if it’s no longer needed.
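Assuming the placeholder names and IDs used in the sketches above, the cleanup can also be scripted; a minimal example:

```python
import boto3

# Placeholder identifiers; replace with the values from your deployment.
AGENT_ID = "AGENT123456"
KB_ID = "KB12345678"

bedrock_agent = boto3.client("bedrock-agent")
lambda_client = boto3.client("lambda")
s3 = boto3.resource("s3")

# Delete the agent (its action groups are removed with it) and the knowledge base.
bedrock_agent.delete_agent(agentId=AGENT_ID, skipResourceInUseCheck=True)
bedrock_agent.delete_knowledge_base(knowledgeBaseId=KB_ID)

# Delete the Lambda functions behind the two action groups.
for name in ("diagram-analysis", "iac-deployment"):
    lambda_client.delete_function(FunctionName=name)

# Empty and delete the S3 bucket that held the architecture diagram.
bucket = s3.Bucket("testbucket")
bucket.objects.all().delete()
bucket.delete()
```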
Conclusion
Agents for Amazon Bedrock uses generative AI to transform architecture diagrams into compliant infrastructure as code (IaC) scripts, such as Terraform and AWS CloudFormation templates, for AWS deployments. This capability is a crucial tool for engineers transitioning to the cloud, speeding up the cloud adoption process while ensuring that deployments adhere to established best practices from the start.
Through the interactive features of Agents for Amazon Bedrock, the automation of IaC generation not only streamlines the initial setup but also significantly improves ongoing operations like infrastructure management. Although this post concentrates on IaC creation, the interactive capabilities of Agents for Amazon Bedrock can be used across various AWS services, providing a dynamic and comprehensive solution for managing and optimizing cloud infrastructure.
Are you ready to streamline your cloud deployment process with the generative AI of Amazon Bedrock? Start by delving into the Amazon Bedrock User Guide to see how it can facilitate your organization’s transition to the cloud. For specialized assistance, consider engaging with AWS Professional Services to maximize the efficiency and benefits of using Amazon Bedrock. Embrace the potential for a swift, secure, and efficient cloud transformation with Amazon Bedrock. Take the first step today and discover how using generative AI can revolutionize your approach to cloud infrastructure.
About the Author
Akhil Raj Yallamelli is a Cloud Infrastructure Architect at AWS, specializing in optimizing cloud infrastructures for enhanced data security and cost efficiency. He skillfully integrates technical solutions with business strategies to create scalable, reliable, and secure cloud environments. Akhil builds technical solutions focusing on customer business outcomes, incorporating generative AI (Gen AI) technologies to drive innovation. With deep expertise in AWS and a strong background in DevOps methodologies throughout the software development life cycle (SDLC), Akhil leads critical implementation and migration projects. He holds an MS degree in Computer Science. Outside of his professional work, Akhil enjoys watching and playing sports.