AI Decoded: Demystifying AI and the Hardware, Software and Tools That Power It

With the 2018 launch of RTX technologies and the first consumer GPU built for AI — GeForce RTX — NVIDIA accelerated the shift to AI computing. Since then, AI on RTX PCs and workstations has grown into a thriving ecosystem with more than 100 million users and 500 AI applications.

Generative AI is now ushering in a new wave of capabilities from PC to cloud. And NVIDIA’s rich history and expertise in AI are helping ensure all users have the performance to handle a wide range of AI features.

Users at home and in the office are already taking advantage of AI on RTX with productivity- and entertainment-enhancing software. Gamers feel the benefits of AI on GeForce RTX GPUs with higher frame rates at stunning resolutions in their favorite titles. Creators can focus on creativity, instead of watching spinning wheels or repeating mundane tasks. And developers can streamline workflows using generative AI for prototyping and to automate debugging.

The field of AI is moving fast. As research advances, AI will tackle more complex tasks. And the demanding performance needs will be handled by RTX.

What Is AI?

In its most fundamental form, artificial intelligence is a smarter type of computing. It’s the capability of a computer program or a machine to think, learn and take actions without being explicitly coded with commands to do so, or a user having to control each command.

AI can be thought of as the ability for a device to perform tasks autonomously, by ingesting and analyzing enormous amounts of data, then recognizing patterns in that data — often referred to as being “trained.”

AI development centers on building systems that perform tasks which would otherwise require human intelligence, and often significant effort, to complete — at speeds beyond any individual’s or group’s capabilities. For this reason, AI is broadly seen as both disruptive and highly transformational.

A key benefit of AI systems is the ability to learn from experiences or patterns inside data, adjusting conclusions on their own when fed new inputs or data. This self-learning allows AI systems to accomplish a stunning variety of tasks, including image recognition, speech recognition, language translation, medical diagnostics, car navigation, image and video enhancement, and hundreds of other use cases.

The next step in the evolution of AI is content generation — referred to as generative AI. It enables users to quickly create new content, and iterate on it, based on a variety of inputs, which can include text, images, sounds, animation, 3D models or other types of data. It then generates new content in the same or a new form.

Popular language applications, like the cloud-based ChatGPT, allow users to generate long-form copy based on a short text request. Image generators like Stable Diffusion turn descriptive text inputs into the desired image. New applications are turning text into video and 2D images into 3D renderings.

GeForce RTX AI PCs and NVIDIA RTX Workstations

AI PCs are computers with dedicated hardware designed to help AI run faster. It’s the difference between sitting around waiting for a 3D image to load, and seeing it update instantaneously with an AI denoiser.

On RTX GPUs, these specialized AI accelerators are called Tensor Cores. And they dramatically speed up AI performance across the most demanding applications for work and play.

One way that AI performance is measured is in teraops, or trillion operations per second (TOPS). Similar to an engine’s horsepower rating, TOPS can give users a sense of a PC’s AI performance with a single metric. The current generation of GeForce RTX GPUs offers performance options that range from roughly 200 AI TOPS all the way to over 1,300 TOPS, with many options across laptops and desktops in between. Professionals get even higher AI performance with the NVIDIA RTX 6000 Ada Generation GPU.

To put this in perspective, the current generation of AI PCs without GPUs range from 10 to 45 TOPS.
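
To make those ratings a little more tangible, here is a back-of-the-envelope calculation converting a fixed operation budget into wall-clock time at different TOPS ratings. The workload size and device labels below are illustrative assumptions, not benchmarks, and real delivered throughput is usually well below a device’s peak TOPS.

    # Back-of-the-envelope: wall-clock time for a fixed AI workload at
    # different TOPS ratings. Workload size and labels are illustrative
    # assumptions, not benchmarks of any specific device.
    def seconds_for_workload(total_ops: float, tops: float) -> float:
        """total_ops: operations required; tops: trillions of operations/second."""
        return total_ops / (tops * 1e12)

    WORKLOAD_OPS = 5e14  # hypothetical job needing 500 trillion operations

    for label, tops in [("AI PC without GPU", 45),
                        ("entry GeForce RTX", 200),
                        ("high-end GeForce RTX", 1300)]:
        print(f"{label:22s} {tops:5.0f} TOPS -> "
              f"{seconds_for_workload(WORKLOAD_OPS, tops):5.2f} s")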

More and more types of AI applications will require the benefits of having a PC capable of performing certain AI tasks locally — meaning on the device rather than running in the cloud. Benefits of running on an AI PC include computing that’s always available, even without an internet connection; low latency for high responsiveness; and increased privacy, so users don’t have to upload sensitive materials to an online database before they become usable by an AI.

AI for Everyone

RTX GPUs bring more than just performance. They introduce capabilities only possible with RTX technology. Many of these AI features are accessible — and impactful — to millions, regardless of the individual’s skill level.

From AI upscaling to improved video conferencing to intelligent, personalizable chatbots, there are tools to benefit all types of users.

RTX Video uses AI to upscale streaming video and display it in HDR, transforming lower-resolution, standard-dynamic-range video into vivid, high-dynamic-range video at up to 4K resolution. RTX users can enjoy the feature with one-time, one-click enablement on nearly any video streamed in a Chrome or Edge browser.

NVIDIA Broadcast, a free app for RTX users with a straightforward user interface, has a host of AI features that improve video conferencing and livestreaming. It removes unwanted background sounds like clicky keyboards, vacuum cleaners and screaming children with Noise and Echo Removal. It can replace or blur backgrounds with better edge detection using Virtual Background. It smooths low-quality camera images with Video Noise Removal. And it can stay centered on the screen with eyes looking at the camera no matter where the user moves, using Auto Frame and Eye Contact.

Chat with RTX is a local, personalized AI chatbot demo that’s easy to use and free to download.

The tech demo, originally released in January, will get an update with Google’s Gemma soon.

Users can easily connect local files on a PC to a supported large language model simply by dropping files into a single folder and pointing the demo to the location. It enables queries for quick, contextually relevant answers.

Since Chat with RTX runs locally on Windows with GeForce RTX PCs and NVIDIA RTX workstations, results are fast — and the user’s data stays on the device. Rather than relying on cloud-based services, Chat with RTX lets users process sensitive data on a local PC without the need to share it with a third party or have an internet connection.
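
To make the drop-files-in-a-folder pattern concrete, here is a minimal sketch of local file question answering: index text files from one folder, retrieve the most relevant one for a query, and pass it to a local model as context. This is not NVIDIA’s implementation — the TF-IDF retrieval and the run_local_llm stub are stand-ins for whatever local stack is actually used.

    # Minimal local file Q&A sketch in the spirit of Chat with RTX: index
    # .txt files from one folder, pick the most relevant one for a question,
    # and feed it to a local model as context. Not NVIDIA's implementation.
    from pathlib import Path
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def run_local_llm(prompt: str) -> str:
        # Stand-in for a local inference call (e.g., a TensorRT-LLM binding).
        return f"<local model answer for prompt of {len(prompt)} chars>"

    def answer(question: str, folder: str) -> str:
        docs = [p.read_text(encoding="utf-8") for p in Path(folder).glob("*.txt")]
        vectorizer = TfidfVectorizer().fit(docs + [question])
        scores = cosine_similarity(vectorizer.transform([question]),
                                   vectorizer.transform(docs))
        context = docs[scores.argmax()]  # most relevant file wins
        return run_local_llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")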

AI for Gamers

Over the past six years, game performance has seen the greatest leaps with AI acceleration. Gamers have been turning NVIDIA DLSS on since 2019, boosting frame rates and improving image quality. It’s a technique that uses AI to generate pixels in video games automatically. With ongoing improvements, it now increases frame rates by up to 4x.

And with the introduction of Ray Reconstruction in the latest version, DLSS 3.5, visual quality is further enhanced in some of the world’s top titles, setting a new standard for visually richer and more immersive gameplay.

More than 500 games and applications now use ray tracing, DLSS and AI-powered technologies, revolutionizing the ways people play and create.

Beyond frames, AI is set to improve the way gamers interact with characters and remaster classic games.

NVIDIA ACE microservices — including generative AI-powered speech and animation models — are enabling developers to add intelligent, dynamic digital avatars to games. Demonstrated at CES, ACE won multiple awards for its ability to bring game characters to life, offering a glimpse into the future of PC gaming.

NVIDIA RTX Remix, a platform for modders to create stunning RTX remasters of classic games, delivers generative AI tools that can transform basic textures from classic games into modern, 4K-resolution, physically based rendering materials. Several projects have already been released or are in the works, including Half-Life 2 RTX and Portal with RTX.

AI for Creators

AI is unlocking creative potential by reducing or automating tedious tasks, freeing up time for pure creativity. These features run fastest or solely on PCs with NVIDIA RTX or GeForce RTX GPUs.

Adobe Premiere Pro’s Enhance Speech tool is accelerated by RTX, using AI to remove unwanted noise and improve the quality of dialogue clips so they sound professionally recorded. It’s up to 4.5x faster on RTX vs. Mac. Another Premiere feature, Auto Reframe, uses GPU acceleration to identify and track the most relevant elements in a video and intelligently reframes video content for different aspect ratios.

Another time-saving AI feature for video editors is DaVinci Resolve’s Magic Mask. Previously, if editors needed to adjust the color/brightness of a subject in one shot or remove an unwanted object, they’d have to use a combination of rotoscoping techniques or basic power windows and masks to isolate the subject from the background.

Magic Mask has completely changed that workflow. With it, simply draw a line over the subject and the AI will process for a moment before revealing the selection. And GeForce RTX laptops can run the feature 2.5x faster than the fastest non-RTX laptops.

This is just a sample of the ways that AI is increasing the speed of creativity. There are now more than 125 AI applications accelerated by RTX.

AI for Developers

AI is enhancing the way developers build software applications through scalable environments, hardware and software optimizations, and new APIs.

NVIDIA AI Workbench helps developers quickly create, test and customize pretrained generative AI models and LLMs using PC-class performance and memory footprint. It’s a unified, easy-to-use toolkit that can scale from running locally on RTX PCs to virtually any data center, public cloud or NVIDIA DGX Cloud.

After building AI models for PC use cases, developers can optimize them using NVIDIA TensorRT — the software that helps developers take full advantage of the Tensor Cores in RTX GPUs.

TensorRT acceleration is now available in text-based applications with TensorRT-LLM for Windows. The open-source library increases LLM performance and includes pre-optimized checkpoints for popular models, including Google’s Gemma, Meta Llama 2, Mistral and Microsoft Phi-2.

Developers also have access to a TensorRT-LLM wrapper for the OpenAI Chat API. With just a one-line code change, continue.dev — an open-source autopilot for VS Code and JetBrains that taps into an LLM — can use TensorRT-LLM locally on an RTX PC for fast, local LLM inference.
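
For OpenAI-compatible local servers, the typical one-line change is repointing the client’s base URL from api.openai.com to the local endpoint. The sketch below assumes the TensorRT-LLM wrapper exposes such an endpoint; the port, path and model name are placeholders to check against the wrapper’s documentation.

    # Typical "one-line change" for an OpenAI-compatible local server: swap
    # the client's base_url. Port, path and model name are assumptions;
    # check the TensorRT-LLM wrapper's docs for the address it serves.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # assumed local wrapper endpoint
        api_key="not-needed-locally",         # local servers often ignore this
    )

    response = client.chat.completions.create(
        model="local-model",  # whatever name the local server registers
        messages=[{"role": "user",
                   "content": "Explain TensorRT-LLM in one sentence."}],
    )
    print(response.choices[0].message.content)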

Every week, we’ll demystify AI by making the technology more accessible, and we’ll showcase new hardware, software, tools and accelerations for RTX AI PC users.

The iPhone moment of AI is here, and it’s just the beginning. Welcome to AI Decoded.

Get weekly updates directly in your inbox by subscribing to the AI Decoded newsletter.

Read More

The Magic Behind the Screen: Celebrating the 96th Academy Awards Nominees for Best Visual Effects

The 96th Academy Awards nominees for Best Visual Effects are a testament to the incredible technological advancements pushing the boundaries of what’s possible in film.

Whether showcasing colossal destruction scenes, heart-pumping action sequences or interstellar adventures, each nominee demonstrates unique contributions in visual effects, or VFX — and they all used cutting-edge NVIDIA technologies in their workflows to bring their magic to the screen.

This year’s nominees include:

  • The Creator (20th Century Studios) — Jay Cooper, Ian Comley, Andrew Roberts and Neil Corbould
  • Godzilla: Minus One (Toho) — Takashi Yamazaki, Kiyoko Shibuya, Masaki Takahashi and Tatsuji Nojima
  • Guardians of the Galaxy Vol. 3 (Marvel Studios) — Stephane Ceretti, Alexis Wajsbrot, Guy Williams and Theo Bialek
  • Napoleon (Apple Original Films/Sony Pictures) — Charley Henley, Luc-Ewen Martin-Fenouillet, Simone Coco and Neil Corbould
  • Mission: Impossible – Dead Reckoning Part One (Paramount Pictures) — Alex Wuttke, Simone Coco, Jeff Sutherland and Neil Corbould

Reinventing the Monster Movie

Godzilla: Minus One presented a unique challenge: making a well-known giant monster, or kaijū, feel terrifying anew.

With a budget under $15 million, small by today’s standards, the film’s VFX team relied on rapid iterations with the director to eliminate long review cycles, along with a heavily detailed computer-generated imagery (CGI) model to bring Godzilla to life.

Godzilla was ready for its closeup, the monster’s head alone containing over 200 million polygons. The animators injected nuanced, lifelike behaviors into the creature to round out its performance.

In addition, the film’s destruction scenes used a sophisticated, memory-intensive physics engine, allowing for realistic simulations of crumbling buildings and landscapes under destruction to further immerse audiences in the chaos.

A Cosmic Spectacle

Guardians of the Galaxy Vol. 3 continued the series’ tradition of blending humor with breathtaking cosmic visuals. This installment pushed the envelope with its use of real-time rendering, enabling its artists to visualize complex space environments and characters on set.

The film brought together Wētā FX, Framestore and Sony Pictures Imageworks, among others, to create a whopping 3,000+ VFX shots. The dense, immersive 3D environments allowed for a seamless integration of live-action and CGI elements and characters, resulting in a visually stunning space opera that maintained the series’ signature style while exploring new visual territories.

One of Guardians’ greatest achievements is the hallway fight scene, filmed at 120 frames per second and delivered as a single continuous shot with variable speed ramps and nonstop action.

Epic Storytelling Through Detailed VFX

The historical epic Napoleon was brought to life with meticulous attention to detail and scale. The film used various set extensions and practical effects to recreate the vast battlefields and period-specific architecture of early 19th-century Europe.

Advanced crowd simulation was used to depict the massive armies of Napoleon’s time, each soldier animated with individual behaviors to enhance the battle scenes’ realism. These touches, combined with high-resolution textures and dynamic lighting, created a visually compelling narrative grounded in reality.

Exploring AI’s Boundaries

The Creator explored the themes of AI and virtual reality, requiring VFX that could realistically depict advanced technology and digital worlds.

The film made significant use of CG animation and visual effects to create environments both futuristic and plausible. Director Gareth Edwards, also known for Rogue One and Godzilla (2014), has been widely applauded for delivering a film with the look of an expensive summer blockbuster using a fraction of the typical budget.

The portrayal of AI entities involved a combination of motion capture and procedural animation to create characters that moved and interacted with human-level complexity and fluidity. The VFX team developed custom software to simulate the intricate patterns of digital consciousness, blurring the lines between the virtual and the real.

High-Octane Action Meets Precision VFX

For Mission: Impossible – Dead Reckoning Part One, the visual effects team faced the challenge of enhancing the film’s signature action sequences without detracting from the series’ reputation for practical stunts. To achieve this, they took a hybrid approach, using CGI to seamlessly augment practical effects.

High-speed drone footage integrated with CG elements created breathtaking chase scenes, while advanced compositing techniques added layers of detail and depth to explosions and hand-to-hand combat scenes, elevating the film’s action to new heights.

NVIDIANs at the SciTech Awards

NVIDIA’s Christopher Jon Horvath, joined by Steve LaVietes and Joe Ardent, on stage to accept their award.

The Academy Awards for Scientific and Technical Achievements highlight technical contributions that have significantly affected the way movies are made, as well as the brilliant inventors behind them.

OpenUSD was honored in the science and engineering subcategory for its importance as the first open-source scene description framework that streamlines the entire production workflow. Its innovative layering system and efficient crate file format have established it as the de facto standard for 3D scene interchange, facilitating unparalleled collaboration across the industry. 

The science and engineering subcategory also celebrated other remarkable technologies, including the OpenVDB open-source library, for sparse 3D volumes, which has become an industry standard for visual-effects simulations and renderings of water, fire, smoke and clouds.

Initially created in 2009 by Ken Museth, senior director of physics research at NVIDIA, OpenVDB has been further developed by Museth, Peter Cucka and Mihai Aldén. Learn more about the latest advancements in OpenVDB, including NanoVDB and NeuralVDB.

In addition, the Alembic Caching and Interchange system, developed by Lucas Miller, NVIDIA’s Christopher Jon Horvath, Steve LaVietes and Joe Ardent, received recognition for its efficient algorithms in storing and retrieving baked, time-sampled data, facilitating high-efficiency caching and scene sharing across the digital production pipeline.

OpenVDB and Alembic are both interoperable with OpenUSD, enhancing their utility and integration within the industry’s production workflows.

See How Oscar-Nominated VFX Are Created at GTC

Learn more about visual effects, AI, virtual production and animation at NVIDIA GTC, a global AI conference taking place March 18-21 at the San Jose Convention Center and online. Register to hear from industry luminaries creating stunning visuals in film and TV.

Academy Award-winner Ken Museth will present a session, Open-Source Software for Visual Effects: OpenUSD and OpenVDB, on Monday, March 18, at 9 a.m. PT.

And join us for OpenUSD Day to learn how to build generative AI-enabled 3D pipelines and tools using Universal Scene Description. Browse the full list of media and entertainment sessions at GTC.

Featured image courtesy of Toho Co., Ltd.

Read More

Orca-Math: Demonstrating the potential of SLMs with model specialization

Our work on Orca and Orca 2 demonstrated the power of improved training signals and methods to enhance the reasoning abilities of smaller language models, getting closer to the levels found in much larger language models. Orca-Math is another step in this direction, where we explore the capabilities of small language models (SLMs) when specialized in a certain area, in this case solving grade school math problems, which has long been recognized as a complex task for SLMs.

Orca-Math is a 7-billion-parameter model created by fine-tuning the Mistral 7B model. Orca-Math achieves 86.81% on GSM8K pass@1, exceeding the performance of much bigger models, including general models (e.g., LLAMA-2-70, Gemini Pro and GPT-3.5) and math-specific models (e.g., MetaMath-70B and WizardMath-70B). Note that the base model (Mistral-7B) achieves 37.83% on GSM8K.

Figure: Bar graph comparing the GSM8K scores of LLAMA-2-70, GPT-3.5, Gemini Pro, WizardMath-70B, MetaMath-70B and Orca-Math-7B, with an upward trend in quality showing Orca-Math-7B outperforming the larger models.

The state-of-the-art (SOTA) performance of the Orca-Math model can be attributed to two key insights:

  • Training on high-quality synthetic data with 200,000 math problems, created using multi-agent flows (with AutoGen). This is smaller than other math datasets, which can have millions of problems. The smaller model and smaller dataset mean faster and cheaper training.
  • In addition to traditional supervised fine-tuning, the model was trained using an iterative learning process, where the model is allowed to practice solving problems and continues to improve based on feedback from a teacher.

Our findings show that smaller models are valuable in specialized settings, where they can match the performance of much larger models while also highlighting the potential of continual learning and using feedback to improve language models. We are making the dataset publicly available, along with a report describing the training procedure, to encourage research on the improvement and specialization of smaller language models.

Teaching SLMs math

Solving mathematical word problems has long been recognized as a complex task for SLMs. Models that achieve over 80% accuracy on the GSM8K benchmark (GSM8K, which stands for Grade School Math 8K, is a dataset of 8,500 high-quality grade school mathematical word problems that require multi-step reasoning) typically exceed 30 billion parameters.

To reach higher levels of performance with smaller models, researchers often train SLMs to generate code, or use calculators to help avoid calculation errors. Additionally, they employ a technique called ensembling, in which the model is called up to 100 times, with each call reattempting to solve the problem. Ensembling provides a substantial boost in accuracy, but at a significant increase in compute cost due to the multiple calls to the model.
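
As a sketch of the idea rather than any specific paper’s method, ensembling by repeated sampling can be as simple as calling the model k times and taking a majority vote over the final answers; solve_once below is a hypothetical stand-in for one stochastic model call.

    # Ensembling by repeated sampling: call the solver k times on the same
    # problem and majority-vote the final answers. `solve_once` is a
    # hypothetical stand-in for one stochastic model call; here it is
    # right about 60% of the time.
    from collections import Counter
    import random

    def solve_once(problem: str) -> str:
        return "42" if random.random() < 0.6 else str(random.randint(0, 99))

    def solve_with_ensembling(problem: str, k: int = 100) -> str:
        answers = [solve_once(problem) for _ in range(k)]
        return Counter(answers).most_common(1)[0][0]  # majority vote

    print(solve_with_ensembling("What is 6 * 7?"))  # almost always "42"
    # Note the cost: accuracy improves, but each problem costs k model calls.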

This research aims to explore how far we can push the native ability of smaller language models when they are specialized to solve math problems, without the use of external tools, verifiers or ensembling. More specifically, we focus on two directions:

AgentInstruct

Previous work on synthetic data creation often uses frontier models to generate similar problems based on a seed problem. Providing paraphrases of the seed with different numbers and attributes can be useful for creating training data for the smaller model. We propose employing multi-agent flows, using AutoGen, to create new problems and solutions, which can not only create more demonstrations of the problem but also increase the diversity and range of difficulty of the problems. 

To generate more challenging problems, we create a setup with a team of agents working collaboratively to create a dataset geared toward a predefined objective. For example, we can use two agents, namely Suggester and Editor. The Suggester examines a problem and proposes several methods for increasing its complexity, while the Editor takes the original word problem and the Suggester’s recommendations to generate an updated, more challenging problem. This iterative process can occur over multiple rounds, with each round further increasing the complexity of the previously generated problem. A third agent can then verify that the problem is solvable and create the solution.
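
A rough sketch of that Suggester/Editor loop with AutoGen (pyautogen) might look like the following. The llm_config contents, model name and the max_turns argument are assumptions to verify against the installed AutoGen version, and the prompts are illustrative.

    # Rough Suggester/Editor sketch with AutoGen (pyautogen). The llm_config
    # values, model name and max_turns usage are assumptions to verify
    # against your AutoGen version; prompts are illustrative.
    import autogen

    llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

    suggester = autogen.AssistantAgent(
        name="Suggester",
        system_message="Given a math word problem, list ways to make it harder.",
        llm_config=llm_config,
    )
    editor = autogen.AssistantAgent(
        name="Editor",
        system_message="Rewrite the problem using the Suggester's ideas. "
                       "Reply with only the updated problem.",
        llm_config=llm_config,
    )

    seed = "A baker sells 12 cupcakes each day. How many does she sell in a week?"
    # Each round of the conversation yields a harder version of the problem;
    # a third, verifier agent could then check solvability and write a solution.
    editor.initiate_chat(suggester, message=seed, max_turns=4)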

Iterative learning

Using high-quality training data that elicits richer learning signals (e.g., explanations) has been shown to significantly improve SLMs’ ability to acquire skills that had previously emerged only at much larger scale.

This paradigm fits under a teacher-student approach, where the large model (the teacher) creates demonstrations for the SLM (the student) to learn from. In this work, we extend the teacher-student paradigm to iterative learning settings as follows:

  • Teaching by demonstration: In this stage, we train the SLM by using AgentInstruct to demonstrate problems and their solutions.
  • Practice and feedback: We let the SLM practice solving problems on its own. For every problem, we allow the SLM to create multiple solutions. We then use the teacher model to provide feedback on the SLM solutions. If the SLM is unable to correctly solve the problem, even after multiple attempts, we use a solution provided by the teacher.
  • Iterative improvement: We use the teacher feedback to create preference data showing the SLM both good and bad solutions to the same problem, and then retrain the SLM.

The practice, feedback, and iterative improvement steps can be repeated multiple times.
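
Schematically, one round of that loop could look like the sketch below. Every object and method here is a hypothetical placeholder rather than Orca-Math’s actual training code, and the final retraining call stands in for a preference-tuning update.

    # Schematic of one practice-feedback-improvement round. All names are
    # hypothetical placeholders, not Orca-Math's actual training code.
    def iterative_round(problems, student, teacher, attempts=4):
        preference_data = []
        for problem in problems:
            solutions = [student.solve(problem) for _ in range(attempts)]  # practice
            graded = [(s, teacher.is_correct(problem, s)) for s in solutions]  # feedback
            good = [s for s, ok in graded if ok]
            bad = [s for s, ok in graded if not ok]
            if not good:  # student failed every attempt: use the teacher's solution
                good = [teacher.solve(problem)]
            # Pair good and bad solutions to the same problem as preference data.
            preference_data.extend((problem, g, b) for g in good for b in bad)
        student.retrain_on(preference_data)  # stand-in for a preference-tuning update
        return student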

Conclusion

Our findings show that smaller models are valuable in specialized settings where they can match the performance of much larger models but with a limited scope. By training Orca-Math on a small dataset of 200,000 math problems, we have achieved performance levels that rival or surpass those of much larger models.

The relatively small size of the dataset also shows the potential of using multi-agent flows to simulate the process of data and feedback generation. The small dataset size has implications for the cost of training and highlights that training data with richer learning signals can improve the efficiency of the learning process. Our findings also highlight the potential of continual learning and the improvement of language models, where the model iteratively improves as it receives more feedback from a person or another model.

The post Orca-Math: Demonstrating the potential of SLMs with model specialization appeared first on Microsoft Research.

Read More

VeCLIP: Improving CLIP Training via Visual-enriched Captions

Paper abstract: Large-scale web-crawled datasets are fundamental for the success of pre-training vision-language models, such as CLIP. However, the inherent noise and potential irrelevance of web-crawled AltTexts pose challenges in achieving precise image-text alignment. Existing methods utilizing large language models (LLMs) for caption rewriting have shown promise on small, curated datasets like CC3M and CC12M. This study introduces a scalable pipeline for noisy caption rewriting. Unlike recent LLM rewriting techniques, we emphasize the incorporation of visual concepts into captions, termed…

Apple Machine Learning Research

Alida gains deeper understanding of customer feedback with Amazon Bedrock

This post is co-written with Sherwin Chu from Alida.

Alida helps the world’s biggest brands create highly engaged research communities to gather feedback that fuels better customer experiences and product innovation.

Alida’s customers receive tens of thousands of engaged responses for a single survey, so the Alida team opted to use machine learning (ML) to serve their customers at scale. However, when employing traditional natural language processing (NLP) models, they found that these solutions struggled to fully understand the nuanced feedback found in open-ended survey responses. The models often captured only surface-level topics and sentiment, and missed crucial context that would allow for more accurate and meaningful insights.

In this post, we learn about how Anthropic’s Claude Instant model on Amazon Bedrock enabled the Alida team to quickly build a scalable service that more accurately determines the topic and sentiment within complex survey responses. The new service achieved a 4-6 times improvement in topic assertion by tightly clustering on several dozen key topics vs. hundreds of noisy NLP keywords.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

Using Amazon Bedrock allowed Alida to bring their service to market faster than if they had used other machine learning (ML) providers or vendors.

The challenge

Surveys with a combination of multiple-choice and open-ended questions allow market researchers to get a more holistic view by capturing both quantitative and qualitative data points.

Multiple-choice questions are easy to analyze at scale, but lack nuance and depth. Set response options may also lead to biasing or priming participant responses.

Open-ended survey questions allow responders to provide context and unanticipated feedback. These qualitative data points deepen researchers’ understanding beyond what multiple-choice questions can capture alone. The challenge with the free-form text is that it can lead to complex and nuanced answers that are difficult for traditional NLP to fully understand. For example:

“I recently experienced some of life’s hardships and was really down and disappointed. When I went in, the staff were always very kind to me. It’s helped me get through some tough times!”

Traditional NLP methods will identify topics as “hardships,” “disappointed,” “kind staff,” and “get through tough times,” but they can’t distinguish between the responder’s overall negative life experiences and the specific positive store experience.

Alida’s existing solution automatically processes large volumes of open-ended responses, but the company wanted its customers to gain better contextual comprehension and high-level topic inference.

Amazon Bedrock

Prior to the introduction of LLMs, the way forward for Alida to improve upon their existing single-model solution was to work closely with industry experts and develop, train, and refine new models specifically for each of the industry verticals that Alida’s customers operated in. This was both a time- and cost-intensive endeavor.

One of the breakthroughs that make LLMs so powerful is the use of attention mechanisms. LLMs use self-attention mechanisms that analyze the relationships between words in a given prompt. This allows LLMs to better handle the topic and sentiment in the earlier example and presents an exciting new technology that can be used to address the challenge.
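
To show concretely what “analyzing the relationships between words” means, here is a minimal single-head scaled dot-product self-attention in NumPy with toy data; real LLMs stack many such heads and layers with learned weights.

    # Minimal single-head scaled dot-product self-attention in NumPy. Each
    # token's output is a weighted mix of all tokens' values, with weights
    # from query-key similarity. Toy data; real LLMs use learned weights.
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])          # token-pair similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over tokens
        return weights @ V                               # attention-weighted values

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                  # 5 tokens, 8-dim embeddings
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # -> (5, 8)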

With Amazon Bedrock, teams and individuals can immediately start using foundation models without having to worry about provisioning infrastructure or setting up and configuring ML frameworks. You can get started with the following steps:

  1. Verify that your user or role has permission to create or modify Amazon Bedrock resources. For details, see Identity-based policy examples for Amazon Bedrock.
  2. Log in to the Amazon Bedrock console.
  3. On the Model access page, review the EULA and enable the FMs you’d like in your account.
  4. Start interacting with the FMs via the console or the API, as in the sketch below.
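
A minimal invocation through the API looks like the following; it uses the Claude Instant v1 completion format, and assumes AWS credentials, region and model access are already set up.

    # Minimal Claude Instant call through Amazon Bedrock with boto3. Assumes
    # AWS credentials are configured and model access is enabled.
    import json
    import boto3

    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    body = json.dumps({
        "prompt": "\n\nHuman: Classify the sentiment of: "
                  "'The staff were always very kind to me.'\n\nAssistant:",
        "max_tokens_to_sample": 200,
    })

    response = bedrock.invoke_model(modelId="anthropic.claude-instant-v1",
                                    body=body)
    print(json.loads(response["body"].read())["completion"])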

Alida’s executive leadership team was eager to be an early adopter of Amazon Bedrock because they recognized its ability to help their teams bring new generative AI-powered solutions to market faster.

Vincy William, the Senior Director of Engineering at Alida who leads the team responsible for building the topic and sentiment analysis service, says,

“LLMs provide a big leap in qualitative analysis and do things (at a scale that is) humanly not possible to do. Amazon Bedrock is a game changer, it allows us to leverage LLMs without the complexity.”

The engineering team experienced the immediate ease of getting started with Amazon Bedrock. They could select from various foundation models and start focusing on prompt engineering instead of spending time on right-sizing, provisioning, deploying, and configuring resources to run the models.

Solution overview

Sherwin Chu, Alida’s Chief Architect, shared Alida’s microservices architecture approach. Alida built the topic and sentiment classification as a service with survey response analysis as its first application. With this approach, common LLM implementation challenges such as the complexity of managing prompts, token limits, request constraints, and retries are abstracted away, and the solution allows for consuming applications to have a simple and stable API to work with. This abstraction layer approach also enables the service owners to continually improve internal implementation details and minimize API-breaking changes. Finally, the service approach allows for a single point to implement any data governance and security policies that evolve as AI governance matures in the organization.

The following diagram illustrates the solution architecture and flow.

Alida microservice architecture

Alida evaluated LLMs from various providers, and found Anthropic’s Claude Instant to be the right balance between cost and performance. Working closely with the prompt engineering team, Chu advocated implementing a prompt chaining strategy as opposed to a single monolithic prompt approach.

Prompt chaining enables you to do the following:

  • Break down your objective into smaller, logical steps
  • Build a prompt for each step
  • Provide the prompts sequentially to the LLM

This creates additional points of inspection, which has the following benefits:

  • It’s straightforward to systematically evaluate changes you make to the input prompt
  • You can implement more detailed tracking and monitoring of the accuracy and performance at each step

Key considerations with this strategy include the increase in the number of requests made to the LLM, and the resulting increase in the overall time it takes to complete the objective. To offset these effects, Alida chose to batch a collection of open-ended responses into a single prompt to the LLM, as in the sketch below.
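
The sketch below illustrates the chained, batched pattern. Here, call_llm is a hypothetical wrapper around a single model invocation (for example, the Bedrock call shown earlier), and the prompts are illustrative rather than Alida’s actual prompts.

    # Sketch of prompt chaining over a batch of survey responses. `call_llm`
    # is a hypothetical wrapper around one model invocation; prompts are
    # illustrative, not Alida's actual prompts.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in your model call, e.g. Bedrock invoke_model")

    def analyze_batch(responses: list[str]) -> str:
        batch = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
        # Step 1: one prompt assigns each response to a predefined topic.
        topics = call_llm(
            "Assign each survey response below to one topic from our fixed "
            f"topic list:\n{batch}"
        )
        # Step 2: a second prompt labels sentiment, given the assignments.
        sentiments = call_llm(
            "Label each response positive, negative or neutral, given these "
            f"topic assignments:\n{batch}\n{topics}"
        )
        return sentiments  # each step is separately inspectable and monitorable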

NLP vs. LLM

Alida’s existing NLP solution relies on clustering algorithms and statistical classification to analyze open-ended survey responses. When applied to sample feedback for a coffee shop’s mobile app, it extracted topics based on word patterns but lacked true comprehension. The following table includes some examples comparing NLP responses vs. LLM responses.

Example 1
  Survey response: “I almost exclusively order my drinks through the app bc of convenience and it’s less embarrassing to order super customized drinks lol. And I love earning rewards!”
  Existing traditional NLP topics: [‘app bc convenience’, ‘drink’, ‘reward’]
  Amazon Bedrock with Claude Instant: topic “Mobile Ordering Convenience,” sentiment positive

Example 2
  Survey response: “The app works pretty good the only complaint I have is that I can’t add Any number of money that I want to my gift card. Why does it specifically have to be $10 to refill?!”
  Existing traditional NLP topics: [‘complaint’, ‘app’, ‘gift card’, ‘number money’]
  Amazon Bedrock with Claude Instant: topic “Mobile Order Fulfillment Speed,” sentiment negative

The example results show how the existing solution was able to extract relevant keywords but wasn’t able to achieve a more generalized topic group assignment.

In contrast, using Amazon Bedrock and Anthropic Claude Instant, the LLM with in-context training is able to assign the responses to pre-defined topics and assign sentiment.

In addition to delivering better answers for Alida’s customers, for this particular use case, pursuing a solution using an LLM over traditional NLP methods saved a vast amount of time and effort in training and maintaining a suitable model. The following table compares training a traditional NLP model vs. in-context training of an LLM.

Training a traditional NLP model
  Data requirement: Thousands of human-labeled examples
  Training process: Combination of automated and manual feature engineering; iterative train-and-evaluate cycles
  Model adaptability: Slower turnaround due to the need to retrain the model

In-context training of an LLM
  Data requirement: Several examples
  Training process: Trained on the fly within the prompt; limited by context window size
  Model adaptability: Faster iterations by modifying the prompt; limited retention due to context window size

Conclusion

Alida’s use of Anthropic’s Claude Instant model on Amazon Bedrock demonstrates the powerful capabilities of LLMs for analyzing open-ended survey responses. Alida was able to build a superior service that was 4-6 times more precise at topic analysis when compared to their NLP-powered service. Additionally, using in-context prompt engineering for LLMs significantly reduced development time, because they didn’t need to curate thousands of human-labeled data points to train a traditional NLP model. This ultimately allows Alida to give their customers richer insights sooner!

If you’re ready to start building your own foundation model innovation with Amazon Bedrock, check out Set up Amazon Bedrock. If you’re interested in reading about other intriguing Amazon Bedrock applications, see the Amazon Bedrock section of the AWS Machine Learning Blog.


About the authors

Kinman Lam is an ISV/DNB Solution Architect for AWS. He has 17 years of experience in building and growing technology companies in the smartphone, geolocation, IoT, and open source software space. At AWS, he uses his experience to help companies build robust infrastructure to meet the increasing demands of growing businesses, launch new products and services, enter new markets, and delight their customers.

Sherwin Chu is the Chief Architect at Alida, helping product teams with architectural direction, technology choice, and complex problem-solving. He is an experienced software engineer, architect, and leader with over 20 years in the SaaS space for various industries. He has built and managed numerous B2B and B2C systems on AWS and GCP.

Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build AI/ML and generative AI solutions. His focus since early 2023 has been leading solution architecture efforts for the launch of Amazon Bedrock, AWS’ flagship generative AI offering for builders. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. He has helped companies in insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services. Mark holds six AWS certifications, including the ML Specialty Certification.

Read More

Robo Rendezvous: Robotics Innovators and AI Leaders to Converge at NVIDIA GTC

Bringing together pioneers in robotics and AI, NVIDIA GTC will be a state-of-the-art showcase of applied AI for autonomous machines.

The conference, running March 18-21 at the San Jose Convention Center and online, boasts a star-studded lineup. This includes a fireside chat with Marc Raibert, executive director of The AI Institute, and Dieter Fox, senior director of robotics research at NVIDIA, as well as panels featuring heavyweights like Disney, Google DeepMind and Amazon, alongside insights from NVIDIA stalwarts like Senior Research Scientist Jim Fan.

With over 77 ecosystem partners and more than 25 partner robots, from industrial giants to entertainment bots, GTC is where the future of robotics unfolds.

Attendees will be able to explore the convergence of AI and robotics through dynamic displays in the AI at the Edge pavilion, the Metropolis pavilion and demo areas, featuring the latest robot arms, robotic vision systems and high-accuracy 3D scanning systems.

These demonstrations provide compelling examples of how AI seamlessly enhances human capabilities across diverse industries. Groundbreaking demos using large language models for real-world applications will push the boundaries of human-machine interaction.

Here are a few of the conference’s must-see robotics events:

Plus, a special session with Deepu Talla, vice president of robotics and edge computing, about “AI Robotics: Driving Innovation for the Future of Automation” was just added to the GTC catalog.

This year’s GTC also offers 40 hands-on labs, providing attendees with an immersive experience of the practical applications of these technologies.

A Jetson and Robotics Developer Day will be held on Thursday, March 21, featuring a full day of sessions and panels that dive deep into building next-gen AI-powered robotics and edge applications on the NVIDIA Jetson, Isaac and Metropolis platforms.

Over the past decade, GTC has been where advances in computer graphics, deep learning and generative AI were launched. As industries from agriculture to manufacturing are transformed by these technologies, this year’s event will offer a glimpse into the innovations that will soon define our daily lives.

Register for GTC to secure your spot at the forefront of technology’s next leap. 

Read More

Google at APS 2024

Today the 2024 March Meeting of the American Physical Society (APS) kicks off in Minneapolis, MN. A premier conference on topics ranging across physics and related fields, APS 2024 brings together researchers, students, and industry professionals to share their discoveries and build partnerships with the goal of realizing fundamental advances in physics-related sciences and technology.

This year, Google has a strong presence at APS with a booth hosted by the Google Quantum AI team, 50+ talks throughout the conference, and participation in conference organizing activities, special sessions and events. Attending APS 2024 in person? Come visit Google’s Quantum AI booth to learn more about the exciting work we’re doing to solve some of the field’s most interesting challenges.

You can learn more about the latest cutting edge work we are presenting at the conference along with our schedule of booth events below (Googlers listed in bold).

Organizing Committee

Session Chairs include: Aaron Szasz

Booth Activities

This schedule is subject to change. Please visit the Google Quantum AI booth for more information.

Crumble

Presenter: Matt McEwen

Tue, Mar 5 | 11:00 AM CST

Qualtran

Presenter: Tanuj Khattar

Tue, Mar 5 | 2:30 PM CST

Qualtran

Presenter: Tanuj Khattar

Thu, Mar 7 | 11:00 AM CST

$5M XPRIZE / Google Quantum AI competition to accelerate quantum applications Q&A

Presenter: Ryan Babbush

Thu, Mar 7 | 11:00 AM CST

Talks

Monday

Certifying highly-entangled states from few single-qubit measurements

Presenter: Hsin-Yuan Huang

Author: Hsin-Yuan Huang

Session A45: New Frontiers in Machine Learning Quantum Physics

Toward high-fidelity analog quantum simulation with superconducting qubits

Presenter: Trond Andersen

Authors: Trond I Andersen, Xiao Mi, Amir H Karamlou, Nikita Astrakhantsev, Andrey Klots, Julia Berndtsson, Andre Petukhov, Dmitry Abanin, Lev B Ioffe, Yu Chen, Vadim Smelyanskiy, Pedram Roushan

Session A51: Applications on Noisy Quantum Hardware I

Measuring circuit errors in context for surface code circuits

Presenter: Dripto M Debroy

Authors: Dripto M Debroy, Jonathan A Gross, Élie Genois, Zhang Jiang

Session B50: Characterizing Noise with QCVV Techniques

Quantum computation of stopping power for inertial fusion target design I: Physics overview and the limits of classical algorithms

Presenter: Andrew D. Baczewski

Authors: Nicholas C. Rubin, Dominic W. Berry, Alina Kononov, Fionn D. Malone, Tanuj Khattar, Alec White, Joonho Lee, Hartmut Neven, Ryan Babbush, Andrew D. Baczewski

Session B51: Heterogeneous Design for Quantum Applications

Link to Paper

Quantum computation of stopping power for inertial fusion target design II: Physics overview and the limits of classical algorithms

Presenter: Nicholas C. Rubin

Authors: Nicholas C. Rubin, Dominic W. Berry, Alina Kononov, Fionn D. Malone, Tanuj Khattar, Alec White, Joonho Lee, Hartmut Neven, Ryan Babbush, Andrew D. Baczewski

Session B51: Heterogeneous Design for Quantum Applications

Link to Paper

Calibrating Superconducting Qubits: From NISQ to Fault Tolerance

Presenter: Sabrina S Hong

Author: Sabrina S Hong

Session B56: From NISQ to Fault Tolerance

Measurement and feedforward induced entanglement negativity transition

Presenter: Ramis Movassagh

Authors: Alireza Seif, Yu-Xin Wang, Ramis Movassagh, Aashish A. Clerk

Session B31: Measurement Induced Criticality in Many-Body Systems

Link to Paper

Effective quantum volume, fidelity and computational cost of noisy quantum processing experiments

Presenter: Salvatore Mandra

Authors: Kostyantyn Kechedzhi, Sergei V Isakov, Salvatore Mandra, Benjamin Villalonga, X. Mi, Sergio Boixo, Vadim Smelyanskiy

Session B52: Quantum Algorithms and Complexity

Link to Paper

Accurate thermodynamic tables for solids using Machine Learning Interaction Potentials and Covariance of Atomic Positions

Presenter: Mgcini K Phuthi

Authors: Mgcini K Phuthi, Yang Huang, Michael Widom, Ekin D Cubuk, Venkat Viswanathan

Session D60: Machine Learning of Molecules and Materials: Chemical Space and Dynamics

Tuesday

IN-Situ Pulse Envelope Characterization Technique (INSPECT)

Presenter: Zhang Jiang

Authors: Zhang Jiang, Jonathan A Gross, Élie Genois

Session F50: Advanced Randomized Benchmarking and Gate Calibration

Characterizing two-qubit gates with dynamical decoupling

Presenter: Jonathan A Gross

Authors: Jonathan A Gross, Zhang Jiang, Élie Genois, Dripto M Debroy, Ze-Pei Cian*, Wojciech Mruczkiewicz

Session F50: Advanced Randomized Benchmarking and Gate Calibration

Statistical physics of regression with quadratic models

Presenter: Blake Bordelon

Authors: Blake Bordelon, Cengiz Pehlevan, Yasaman Bahri

Session EE01: V: Statistical and Nonlinear Physics II

Improved state preparation for first-quantized simulation of electronic structure

Presenter: William J Huggins

Authors: William J Huggins, Oskar Leimkuhler, Torin F Stetina, Birgitta Whaley

Session G51: Hamiltonian Simulation

Controlling large superconducting quantum processors

Presenter: Paul V. Klimov

Authors: Paul V. Klimov, Andreas Bengtsson, Chris Quintana, Alexandre Bourassa, Sabrina Hong, Andrew Dunsworth, Kevin J. Satzinger, William P. Livingston, Volodymyr Sivak, Murphy Y. Niu, Trond I. Andersen, Yaxing Zhang, Desmond Chik, Zijun Chen, Charles Neill, Catherine Erickson, Alejandro Grajales Dau, Anthony Megrant, Pedram Roushan, Alexander N. Korotkov, Julian Kelly, Vadim Smelyanskiy, Yu Chen, Hartmut Neven

Session G30: Commercial Applications of Quantum Computing

Link to Paper

Gaussian boson sampling: Determining quantum advantage

Presenter: Peter D Drummond

Authors: Peter D Drummond, Alex Dellios, Ned Goodman, Margaret D Reid, Ben Villalonga

Session G50: Quantum Characterization, Verification, and Validation II

Attention to complexity III: learning the complexity of random quantum circuit states

Presenter: Hyejin Kim

Authors: Hyejin Kim, Yiqing Zhou, Yichen Xu, Chao Wan, Jin Zhou, Yuri D Lensky, Jesse Hoke, Pedram Roushan, Kilian Q Weinberger, Eun-Ah Kim

Session G50: Quantum Characterization, Verification, and Validation II

Balanced coupling in superconducting circuits

Presenter: Daniel T Sank

Authors: Daniel T Sank, Sergei V Isakov, Mostafa Khezri, Juan Atalaya

Session K48: Strongly Driven Superconducting Systems

Resource estimation of Fault Tolerant algorithms using Qᴜᴀʟᴛʀᴀɴ

Presenter: Tanuj Khattar

Author: Tanuj Khattar

Session K49: Algorithms and Implementations on Near-Term Quantum Computers

Wednesday

Discovering novel quantum dynamics with superconducting qubits

Presenter: Pedram Roushan

Author: Pedram Roushan

Session M24: Analog Quantum Simulations Across Platforms

Deciphering Tumor Heterogeneity in Triple-Negative Breast Cancer: The Crucial Role of Dynamic Cell-Cell and Cell-Matrix Interactions

Presenter: Susan Leggett

Authors: Susan Leggett, Ian Wong, Celeste Nelson, Molly Brennan, Mohak Patel, Christian Franck, Sophia Martinez, Joe Tien, Lena Gamboa, Thomas Valentin, Amanda Khoo, Evelyn K Williams

Session M27: Mechanics of Cells and Tissues II

Toward implementation of protected charge-parity qubits

Presenter: Abigail Shearrow

Authors: Abigail Shearrow, Matthew Snyder, Bradley G Cole, Kenneth R Dodge, Yebin Liu, Andrey Klots, Lev B Ioffe, Britton L Plourde, Robert McDermott

Session N48: Unconventional Superconducting Qubits

Electronic capacitance in tunnel junctions for protected charge-parity qubits

Presenter: Bradley G Cole

Authors: Bradley G Cole, Kenneth R Dodge, Yebin Liu, Abigail Shearrow, Matthew Snyder, Andrey Klots, Lev B Ioffe, Robert McDermott, B.L.T. Plourde

Session N48: Unconventional Superconducting Qubits

Overcoming leakage in quantum error correction

Presenter: Kevin C. Miao

Authors: Kevin C. Miao, Matt McEwen, Juan Atalaya, Dvir Kafri, Leonid P. Pryadko, Andreas Bengtsson, Alex Opremcak, Kevin J. Satzinger, Zijun Chen, Paul V. Klimov, Chris Quintana, Rajeev Acharya, Kyle Anderson, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Joseph C. Bardin, Alexandre Bourassa, Jenna Bovaird, Leon Brill, Bob B. Buckley, David A. Buell, Tim Burger, Brian Burkett, Nicholas Bushnell, Juan Campero, Ben Chiaro, Roberto Collins, Paul Conner, Alexander L. Crook, Ben Curtin, Dripto M. Debroy, Sean Demura, Andrew Dunsworth, Catherine Erickson, Reza Fatemi, Vinicius S. Ferreira, Leslie Flores Burgos, Ebrahim Forati, Austin G. Fowler, Brooks Foxen, Gonzalo Garcia, William Giang, Craig Gidney, Marissa Giustina, Raja Gosula, Alejandro Grajales Dau, Jonathan A. Gross, Michael C. Hamilton, Sean D. Harrington, Paula Heu, Jeremy Hilton, Markus R. Hoffmann, Sabrina Hong, Trent Huang, Ashley Huff, Justin Iveland, Evan Jeffrey, Zhang Jiang, Cody Jones, Julian Kelly, Seon Kim, Fedor Kostritsa, John Mark Kreikebaum, David Landhuis, Pavel Laptev, Lily Laws, Kenny Lee, Brian J. Lester, Alexander T. Lill, Wayne Liu, Aditya Locharla, Erik Lucero, Steven Martin, Anthony Megrant, Xiao Mi, Shirin Montazeri, Alexis Morvan, Ofer Naaman, Matthew Neeley, Charles Neill, Ani Nersisyan, Michael Newman, Jiun How Ng, Anthony Nguyen, Murray Nguyen, Rebecca Potter, Charles Rocque, Pedram Roushan, Kannan Sankaragomathi, Christopher Schuster, Michael J. Shearn, Aaron Shorter, Noah Shutty, Vladimir Shvarts, Jindra Skruzny, W. Clarke Smith, George Sterling, Marco Szalay, Douglas Thor, Alfredo Torres, Theodore White, Bryan W. K. Woo, Z. Jamie Yao, Ping Yeh, Juhwan Yoo, Grayson Young, Adam Zalcman, Ningfeng Zhu, Nicholas Zobrist, Hartmut Neven, Vadim Smelyanskiy, Andre Petukhov, Alexander N. Korotkov, Daniel Sank, Yu Chen

Session N51: Quantum Error Correction Code Performance and Implementation I

Link to Paper

Modeling the performance of the surface code with non-uniform error distribution: Part 1

Presenter: Yuri D Lensky

Authors: Yuri D Lensky, Volodymyr Sivak, Kostyantyn Kechedzhi, Igor Aleiner

Session N51: Quantum Error Correction Code Performance and Implementation I

Modeling the performance of the surface code with non-uniform error distribution: Part 2

Presenter: Volodymyr Sivak

Authors: Volodymyr Sivak, Michael Newman, Cody Jones, Henry Schurkus, Dvir Kafri, Yuri D Lensky, Paul Klimov, Kostyantyn Kechedzhi, Vadim Smelyanskiy

Session N51: Quantum Error Correction Code Performance and Implementation I

Highly optimized tensor network contractions for the simulation of classically challenging quantum computations

Presenter: Benjamin Villalonga

Author: Benjamin Villalonga

Session Q51: Co-evolution of Quantum Classical Algorithms

Teaching modern quantum computing concepts using hands-on open-source software at all levels

Presenter: Abraham Asfaw

Author: Abraham Asfaw

Session Q61: Teaching Quantum Information at All Levels II

Thursday

New circuits and an open source decoder for the color code

Presenter: Craig Gidney

Authors: Craig Gidney, Cody Jones

Session S51: Quantum Error Correction Code Performance and Implementation II

Link to Paper

Performing Hartree-Fock many-body physics calculations with large language models

Presenter: Eun-Ah Kim

Authors: Eun-Ah Kim, Haining Pan, Nayantara Mudur, William Taranto, Subhashini Venugopalan, Yasaman Bahri, Michael P Brenner

Session S18: Data Science, AI and Machine Learning in Physics I

New methods for reducing resource overhead in the surface code

Presenter: Michael Newman

Authors: Craig M Gidney, Michael Newman, Peter Brooks, Cody Jones

Session S51: Quantum Error Correction Code Performance and Implementation II

Link to Paper

Challenges and opportunities for applying quantum computers to drug design

Presenter: Raffaele Santagati

Authors: Raffaele Santagati, Alan Aspuru-Guzik, Ryan Babbush, Matthias Degroote, Leticia Gonzalez, Elica Kyoseva, Nikolaj Moll, Markus Oppel, Robert M. Parrish, Nicholas C. Rubin, Michael Streif, Christofer S. Tautermann, Horst Weiss, Nathan Wiebe, Clemens Utschig-Utschig

Session S49: Advances in Quantum Algorithms for Near-Term Applications

Link to Paper

Dispatches from Google’s hunt for super-quadratic quantum advantage in new applications

Presenter: Ryan Babbush

Author: Ryan Babbush

Session T45: Recent Advances in Quantum Algorithms

Qubit as a reflectometer

Presenter: Yaxing Zhang

Authors: Yaxing Zhang, Benjamin Chiaro

Session T48: Superconducting Fabrication, Packaging, & Validation

Random-matrix theory of measurement-induced phase transitions in nonlocal Floquet quantum circuits

Presenter: Aleksei Khindanov

Authors: Aleksei Khindanov, Lara Faoro, Lev Ioffe, Igor Aleiner

Session W14: Measurement-Induced Phase Transitions

Continuum limit of finite density many-body ground states with MERA

Presenter: Subhayan Sahu

Authors: Subhayan Sahu, Guifré Vidal

Session W58: Extreme-Scale Computational Science Discovery in Fluid Dynamics and Related Disciplines II

Dynamics of magnetization at infinite temperature in a Heisenberg spin chain

Presenter: Eliott Rosenberg

Authors: Eliott Rosenberg, Trond Andersen, Rhine Samajdar, Andre Petukhov, Jesse Hoke*, Dmitry Abanin, Andreas Bengtsson, Ilya Drozdov, Catherine Erickson, Paul Klimov, Xiao Mi, Alexis Morvan, Matthew Neeley, Charles Neill, Rajeev Acharya, Richard Allen, Kyle Anderson, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Juan Atalaya, Joseph Bardin, A. Bilmes, Gina Bortoli, Alexandre Bourassa, Jenna Bovaird, Leon Brill, Michael Broughton, Bob B. Buckley, David Buell, Tim Burger, Brian Burkett, Nicholas Bushnell, Juan Campero, Hung-Shen Chang, Zijun Chen, Benjamin Chiaro, Desmond Chik, Josh Cogan, Roberto Collins, Paul Conner, William Courtney, Alexander Crook, Ben Curtin, Dripto Debroy, Alexander Del Toro Barba, Sean Demura, Agustin Di Paolo, Andrew Dunsworth, Clint Earle, E. Farhi, Reza Fatemi, Vinicius Ferreira, Leslie Flores, Ebrahim Forati, Austin Fowler, Brooks Foxen, Gonzalo Garcia, Élie Genois, William Giang, Craig Gidney, Dar Gilboa, Marissa Giustina, Raja Gosula, Alejandro Grajales Dau, Jonathan Gross, Steve Habegger, Michael Hamilton, Monica Hansen, Matthew Harrigan, Sean Harrington, Paula Heu, Gordon Hill, Markus Hoffmann, Sabrina Hong, Trent Huang, Ashley Huff, William Huggins, Lev Ioffe, Sergei Isakov, Justin Iveland, Evan Jeffrey, Zhang Jiang, Cody Jones, Pavol Juhas, D. Kafri, Tanuj Khattar, Mostafa Khezri, Mária Kieferová, Seon Kim, Alexei Kitaev, Andrey Klots, Alexander Korotkov, Fedor Kostritsa, John Mark Kreikebaum, David Landhuis, Pavel Laptev, Kim Ming Lau, Lily Laws, Joonho Lee, Kenneth Lee, Yuri Lensky, Brian Lester, Alexander Lill, Wayne Liu, William P. Livingston, A. Locharla, Salvatore Mandrà, Orion Martin, Steven Martin, Jarrod McClean, Matthew McEwen, Seneca Meeks, Kevin Miao, Amanda Mieszala, Shirin Montazeri, Ramis Movassagh, Wojciech Mruczkiewicz, Ani Nersisyan, Michael Newman, Jiun How Ng, Anthony Nguyen, Murray Nguyen, M. Niu, Thomas O’Brien, Seun Omonije, Alex Opremcak, Rebecca Potter, Leonid Pryadko, Chris Quintana, David Rhodes, Charles Rocque, N. Rubin, Negar Saei, Daniel Sank, Kannan Sankaragomathi, Kevin Satzinger, Henry Schurkus, Christopher Schuster, Michael Shearn, Aaron Shorter, Noah Shutty, Vladimir Shvarts, Volodymyr Sivak, Jindra Skruzny, Clarke Smith, Rolando Somma, George Sterling, Doug Strain, Marco Szalay, Douglas Thor, Alfredo Torres, Guifre Vidal, Benjamin Villalonga, Catherine Vollgraff Heidweiller, Theodore White, Bryan Woo, Cheng Xing, Jamie Yao, Ping Yeh, Juhwan Yoo, Grayson Young, Adam Zalcman, Yaxing Zhang, Ningfeng Zhu, Nicholas Zobrist, Hartmut Neven, Ryan Babbush, Dave Bacon, Sergio Boixo, Jeremy Hilton, Erik Lucero, Anthony Megrant, Julian Kelly, Yu Chen, Vadim Smelyanskiy, Vedika Khemani, Sarang Gopalakrishnan, Tomaž Prosen, Pedram Roushan

Session W50: Quantum Simulation of Many-Body Physics

Link to Paper

The fast multipole method on a quantum computer

Presenter: Kianna Wan

Authors: Kianna Wan, Dominic W Berry, Ryan Babbush

Session W50: Quantum Simulation of Many-Body Physics

Friday

The quantum computing industry and protecting national security: what tools will work?

Presenter: Kate Weber

Author: Kate Weber

Session Y43: Industry, Innovation, and National Security: Finding the Right Balance

Novel charging effects in the fluxonium qubit

Presenter: Agustin Di Paolo

Authors: Agustin Di Paolo, Kyle Serniak, Andrew J Kerman, William D Oliver

Session Y46: Fluxonium-Based Superconducting Qubits

Microwave Engineering of Parametric Interactions in Superconducting Circuits

Presenter: Ofer Naaman

Author: Ofer Naaman

Session Z46: Broadband Parametric Amplifiers and Circulators

Linear spin wave theory of large magnetic unit cells using the Kernel Polynomial Method

Presenter: Harry Lane

Authors: Harry Lane, Hao Zhang, David A Dahlbom, Sam Quinn, Rolando D Somma, Martin P Mourigal, Cristian D Batista, Kipton Barros

Session Z62: Cooperative Phenomena, Theory


*Work done while at Google


Unlocking Innovation: AWS and Anthropic push the boundaries of generative AI together

Amazon Bedrock is the best place to build and scale generative AI applications with large language models (LLMs) and other foundation models (FMs). It enables customers to leverage a variety of high-performing FMs, such as the Claude family of models by Anthropic, to build custom generative AI applications. Looking back to 2021, when Anthropic first started building on AWS, no one could have envisioned how transformative the Claude family of models would be. We have been making state-of-the-art generative AI models accessible and usable for businesses of all sizes through Amazon Bedrock. In just a few short months since Amazon Bedrock became generally available on September 28, 2023, more than 10K customers have been using it to deliver generative AI applications, and many of them are using Claude. Customers such as ADP, Broadridge, Cloudera, Dana-Farber Cancer Institute, Genesys, Genomics England, GoDaddy, Intuit, M1 Finance, Perplexity AI, Proto Hologram, Rocket Companies and more are using Anthropic’s Claude models on Amazon Bedrock to drive innovation in generative AI and to build transformative customer experiences. And today, we are announcing an exciting milestone with the next generation of Claude coming to Amazon Bedrock: Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku.
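For developers, access goes through the standard Bedrock runtime API. Below is a minimal sketch of invoking Claude on Amazon Bedrock with boto3; it assumes an AWS account with Bedrock model access already enabled, and the region and prompt are illustrative. The model ID shown is the Claude 3 Sonnet identifier.

```python
import json

import boto3

# Bedrock runtime client; the region is illustrative -- use one where
# Claude model access has been enabled for your account.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Claude 3 models on Bedrock use the Anthropic Messages API format.
body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {"role": "user", "content": "Draft a short product description for a travel app."}
    ],
}

# The model ID shown is the Claude 3 Sonnet identifier; swap it to
# target Opus or Haiku once they are available in your region.
response = client.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps(body),
)

result = json.loads(response["body"].read())
print(result["content"][0]["text"])
```

Because the whole Claude 3 family shares this request format, swapping the model ID is the only change needed to trade off speed, cost, and capability.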

Introducing Anthropic’s Claude 3 models

Anthropic is unveiling its next generation of Claude with three advanced models optimized for different use cases. Haiku is the fastest and most cost-effective model on the market, a compact model built for near-instant responsiveness. Sonnet is 2x faster than Claude 2 and Claude 2.1 for the vast majority of workloads, with higher levels of intelligence. It excels at intelligent tasks demanding rapid responses, like knowledge retrieval or sales automation, and strikes the ideal balance between intelligence and speed, qualities especially critical for enterprise use cases. Opus is the most advanced, capable, state-of-the-art FM, with deep reasoning, advanced math, and coding abilities, and top-level performance on highly complex tasks. It can navigate open-ended prompts and novel scenarios with remarkable fluency, including task automation, hypothesis generation, and analysis of charts, graphs, and forecasts. Sonnet is the first of the three available on Amazon Bedrock, starting today. Current evaluations from Anthropic suggest that the Claude 3 model family outperforms comparable models on math word problem solving (MATH) and multilingual math (MGSM) benchmarks, critical benchmarks used today for LLMs.

  1. Vision capabilities – Claude 3 models have been trained to understand structured and unstructured data across different formats: not just language, but also images, charts, diagrams, and more. This lets businesses build generative AI applications that integrate diverse multimedia sources and solve truly cross-domain problems. For instance, pharmaceutical companies can query drug research papers alongside protein structure diagrams to accelerate discovery, and media organizations can generate image captions or video scripts automatically. (A minimal multimodal invocation sketch follows this list.)
  2. Best-in-class benchmarks – Claude 3 exceeds existing models on standardized evaluations such as math problems, programming exercises, and scientific reasoning. Customers can optimize domain-specific experimental procedures in manufacturing, or audit financial reports based on contextual data, in an automated way and with high accuracy using AI-driven responses.

    Specifically, Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.

  3. Reduced hallucination – Businesses require predictable, controllable outputs from AI systems directing automated processes or customer interactions. Claude 3 models mitigate hallucination through constitutional AI techniques that provide transparency into the model’s reasoning and improve accuracy. Claude 3 Opus shows an estimated 2x gain in accuracy over Claude 2.1 on difficult open-ended questions, reducing the likelihood of faulty responses. As enterprise customers rely on Claude across industries like healthcare, finance, and legal research, reducing hallucinations is essential for safety and performance. The Claude 3 family sets a new standard for reliable generative AI output.
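As a concrete illustration of the vision capabilities above, here is a minimal sketch of a multimodal request to Claude 3 on Amazon Bedrock with boto3. The file name, prompt, and region are hypothetical placeholders; the image is passed as a base64-encoded content block alongside the text, per the Anthropic Messages API format.

```python
import base64
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# The file name is a hypothetical placeholder for any chart or diagram.
with open("clinical_trial_chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {
            "role": "user",
            # Image and text are sent together as content blocks.
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "Summarize the trends shown in this chart."},
            ],
        }
    ],
}

response = client.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])
```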

Benefits of Anthropic Claude 3 FMs on Amazon Bedrock

Through Amazon Bedrock, customers will get easy access to build with Anthropic’s newest models. This includes not only natural language models but also their expanded range of multimodal AI models capable of advanced reasoning across text, images, charts, and more. Our collaboration has already helped customers accelerate generative AI adoption and delivered business value to them. Here are a few ways customers have been using Anthropic’s Claude models on Amazon Bedrock:

“We are developing a generative AI solution on AWS to help customers plan epic trips and create life-changing experiences with personalized travel itineraries. By building with Claude on Amazon Bedrock, we reduced itinerary generation costs by nearly 80% when we quickly created a scalable, secure AI platform that can organize our book content in minutes to deliver cohesive, highly accurate travel recommendations. Now we can repackage and personalize our content in various ways on our digital platforms, based on customer preference, all while highlighting trusted local voices–just like Lonely Planet has done for 50 years.”

— Chris Whyde, Senior VP of Engineering and Data Science, Lonely Planet

“We are working with AWS and Anthropic to host our custom, fine-tuned Anthropic Claude model on Amazon Bedrock to support our strategy of rapidly delivering generative AI solutions at scale and with cutting-edge encryption, data privacy, and safe AI technology embedded in everything we do. Our new Lexis+ AI platform technology features conversational search, insightful summarization, and intelligent legal drafting capabilities, which enable lawyers to increase their efficiency, effectiveness, and productivity.”

— Jeff Reihl, Executive VP and CTO, LexisNexis Legal & Professional

“At Broadridge, we have been working to automate the understanding of regulatory reporting requirements to create greater transparency and increase efficiency for our customers operating in domestic and global financial markets. With use of Claude on Amazon Bedrock, we’re thrilled to get even higher accuracy in our experiments with processing and summarizing capabilities. With Amazon Bedrock, we have choice in our use of LLMs, and we value the performance and integration capabilities it offers.”

— Saumin Patel, VP of Engineering, Generative AI, Broadridge

The Claude 3 model family caters to various needs, allowing customers to choose the model best suited for their specific use case. That choice is key to developing a successful prototype, and later production systems, that can deliver real impact, whether for a new product, feature, or process that boosts the bottom line. Keeping customer needs top of mind, Anthropic and AWS are delivering where it matters most to organizations of all sizes:

  1. Improved performance – Claude 3 models are significantly faster for real-time interactions thanks to optimizations across hardware and software.
  2. Increased accuracy and reliability – Through massive scaling as well as new self-supervision techniques, Claude 3 delivers an expected 2x gain in accuracy for complex questions over long contexts, which means AI that’s even more helpful, safe, and honest.
  3. Simpler and secure customization – Customization capabilities, like retrieval-augmented generation (RAG), simplify training models on proprietary data and building applications backed by diverse data sources, so customers get AI tuned for their unique needs (see the sketch after this list). In addition, proprietary data is never exposed to the public internet, never leaves the AWS network, is securely transferred through VPC, and is encrypted in transit and at rest.
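To make the RAG path concrete, here is a minimal sketch using Bedrock’s managed knowledge-base API through boto3. It assumes a knowledge base has already been created and synced in the account; the knowledge base ID, model ARN, region, and query are hypothetical placeholders.

```python
import boto3

# The agent runtime client exposes Bedrock's managed RAG entry point.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What do our policy documents say about data retention?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            # Both identifiers are hypothetical placeholders.
            "knowledgeBaseId": "EXAMPLEKB01",
            "modelArn": (
                "arn:aws:bedrock:us-east-1::foundation-model/"
                "anthropic.claude-3-sonnet-20240229-v1:0"
            ),
        },
    },
)

# The answer is grounded in passages retrieved from the knowledge base.
print(response["output"]["text"])
```

Retrieval and grounding happen server-side within AWS, in line with the data-privacy guarantees described above.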

And AWS and Anthropic are continuously reaffirming our commitment to advancing generative AI in a responsible manner. By constantly improving model capabilities and committing to frameworks like Constitutional AI and the White House voluntary commitments on AI, we can accelerate the safe, ethical development and deployment of this transformative technology.

The future of generative AI

Looking ahead, customers will build entirely new categories of generative AI-powered applications and experiences with the latest generation of models. We’ve only begun to tap generative AI’s potential to automate complex processes, augment human expertise, and reshape digital experiences. We expect to see unprecedented levels of innovation as customers choose Anthropic’s models, augmented with multimodal skills, and leverage all the tools they need to build and scale generative AI applications on Amazon Bedrock. Imagine sophisticated conversational assistants providing fast, highly contextual responses. Picture personalized recommendation engines that seamlessly blend in relevant images, diagrams, and associated knowledge to intuitively guide decisions. Envision scientific research turbocharged by generative AI that can read experiments, synthesize hypotheses, and even propose novel areas for exploration. There are so many possibilities that will be realized by taking full advantage of all generative AI has to offer through Amazon Bedrock. Our collaboration ensures enterprises and innovators worldwide will have the tools to reach the next frontier of generative AI-powered innovation responsibly, and for the benefit of all.

Conclusion

It’s still early days for generative AI, but strong collaboration and a focus on innovation are ushering in a new era of generative AI on AWS. We can’t wait to see what customers build next.

About the author

Swami Sivasubramanian is Vice President of Data and Machine Learning at AWS. In this role, Swami oversees all AWS Database, Analytics, and AI & Machine Learning services. His team’s mission is to help organizations put their data to work with a complete, end-to-end data solution to store, access, analyze, visualize, and predict.
