NVIDIA Unveils Its Most Affordable Generative AI Supercomputer

NVIDIA is taking the wraps off a new compact generative AI supercomputer that delivers increased performance via a software upgrade, at a lower price.

The new NVIDIA Jetson Orin Nano Super Developer Kit, which fits in the palm of a hand, gives everyone from commercial AI developers to hobbyists and students a leap in generative AI capabilities and performance. And the price is now $249, down from $499.

Available today, it delivers as much as a 1.7x leap in generative AI inference performance, a 70% increase in performance to 67 INT8 TOPS, and a 50% increase in memory bandwidth to 102GB/s compared with its predecessor.

Whether creating LLM chatbots based on retrieval-augmented generation, building a visual AI agent, or deploying AI-based robots, the Jetson Orin Nano Super is an ideal solution.

The Gift That Keeps on Giving

The software updates available to the new Jetson Orin Nano Super will also boost generative AI performance for those who already own the Jetson Orin Nano Developer Kit.

The Jetson Orin Nano Super is suited to anyone interested in developing skills in generative AI, robotics or computer vision. As the AI world moves from task-specific models to foundation models, it also provides an accessible platform to transform ideas into reality.

Powerful "Super" Performance for Generative AI

The enhanced performance of the Jetson Orin Nano Super delivers gains for all popular generative AI models and transformer-based computer vision.

The developer kit consists of a Jetson Orin Nano 8GB system-on-module (SoM) and a reference carrier board, providing an ideal platform for prototyping edge AI applications.

The SoM features an NVIDIA Ampere architecture GPU with tensor cores and a 6-core Arm CPU, facilitating multiple concurrent AI application pipelines and high-performance inference. It can support up to four cameras, offering higher resolution and frame rates than previous versions.

Extensive Generative AI Software Ecosystem and Community

Generative AI is evolving quickly. The NVIDIA Jetson AI Lab offers immediate support for cutting-edge models from the open-source community and provides easy-to-use tutorials. Developers can also get extensive support from the broader Jetson community and draw inspiration from projects created by fellow developers.

Jetson runs NVIDIA AI software including NVIDIA Isaac for robotics, NVIDIA Metropolis for vision AI and NVIDIA Holoscan for sensor processing. Development time can be reduced with NVIDIA Omniverse Replicator for synthetic data generation and NVIDIA TAO Toolkit for fine-tuning pretrained AI models from the NGC catalog.

Jetson ecosystem partners offer additional AI and system software, developer tools and custom software development. They can also help with cameras and other sensors, as well as carrier boards and design services for product solutions.

Boosting Jetson Orin Performance for All With Super Mode

The software updates that enable the up to 1.7x gain in generative AI performance will also be available for the Jetson Orin NX and Orin Nano series of systems on modules.

Existing Jetson Orin Nano Developer Kit owners can upgrade the JetPack SDK to unlock boosted performance today.

Learn more about Jetson Orin Nano Super Developer Kit.

Llama 3.3 70B now available in Amazon SageMaker JumpStart

Today, we are excited to announce that Llama 3.3 70B from Meta is available in Amazon SageMaker JumpStart. Llama 3.3 70B marks an exciting advancement in large language model (LLM) development, offering performance comparable to larger Llama versions with fewer computational resources.

In this post, we explore how to deploy this model efficiently on Amazon SageMaker AI, using advanced SageMaker AI features for optimal performance and cost management.

Overview of the Llama 3.3 70B model

Llama 3.3 70B represents a significant breakthrough in model efficiency and performance optimization. This new model delivers output quality comparable to Llama 3.1 405B while requiring only a fraction of the computational resources. According to Meta, this efficiency gain translates to nearly five times more cost-effective inference operations, making it an attractive option for production deployments.

The model’s sophisticated architecture builds upon Meta’s optimized version of the transformer design, featuring an enhanced attention mechanism that can help substantially reduce inference costs. During its development, Meta’s engineering team trained the model on an extensive dataset comprising approximately 15 trillion tokens, incorporating both web-sourced content and over 25 million synthetic examples specifically created for LLM development. This comprehensive training approach results in the model’s robust understanding and generation capabilities across diverse tasks.

What sets Llama 3.3 70B apart is its refined training methodology. The model underwent an extensive supervised fine-tuning process, complemented by Reinforcement Learning from Human Feedback (RLHF). This dual-approach training strategy helps align the model’s outputs more closely with human preferences while maintaining high performance standards. In benchmark evaluations against its larger counterpart, Llama 3.3 70B demonstrated remarkable consistency, trailing Llama 3.1 405B by less than 2% in 6 out of 10 standard AI benchmarks and actually outperforming it in three categories. This performance profile makes it an ideal candidate for organizations seeking to balance model capabilities with operational efficiency.

A figure in the original post summarizes these benchmark results.

Getting started with SageMaker JumpStart

SageMaker JumpStart is a machine learning (ML) hub that can help accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select pre-trained foundation models (FMs), including Llama 3 models. These models are fully customizable for your use case with your data, and you can deploy them into production using either the UI or SDK.
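
If you prefer to discover models from code rather than the UI, the SDK also exposes a listing utility. The following is a minimal sketch; the client-side substring filter is an assumption based on the model ID naming shown later in this post:

from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# List all JumpStart model IDs and keep the Llama 3.3 variants.
# IDs follow the pattern used later in this post, for example
# "meta-textgeneration-llama-3-3-70b-instruct".
llama_33_models = [m for m in list_jumpstart_models() if "llama-3-3" in m]
print(llama_33_models)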

Deploying Llama 3.3 70B through SageMaker JumpStart offers two convenient approaches: using the intuitive SageMaker JumpStart UI or implementing programmatically through the SageMaker Python SDK. Let’s explore both methods to help you choose the approach that best suits your needs.

Deploy Llama 3.3 70B through the SageMaker JumpStart UI

You can access the SageMaker JumpStart UI through either Amazon SageMaker Unified Studio or Amazon SageMaker Studio. To deploy Llama 3.3 70B using the SageMaker JumpStart UI, complete the following steps:

  1. In SageMaker Unified Studio, on the Build menu, choose JumpStart models. Alternatively, on the SageMaker Studio console, choose JumpStart in the navigation pane.
  2. Search for Meta Llama 3.3 70B.
  3. Choose the Meta Llama 3.3 70B model.
  4. Choose Deploy.
  5. Accept the end-user license agreement (EULA).
  6. For Instance type, choose an instance (ml.g5.48xlarge or ml.p4d.24xlarge).
  7. Choose Deploy.

Wait until the endpoint status shows as InService. You can now run inference using the model.
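
After a UI-based deployment, the endpoint can be invoked from any AWS SDK. Here is a minimal sketch using boto3; the endpoint name is a placeholder, so substitute the name SageMaker assigned to your deployment:

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

payload = {
    "inputs": "Hello, I'm a language model,",
    "parameters": {"max_new_tokens": 128, "top_p": 0.9, "temperature": 0.6},
}

response = runtime.invoke_endpoint(
    EndpointName="llama-3-3-70b-endpoint",  # placeholder: use your endpoint's name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))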

Deploy Llama 3.3 70B using the SageMaker Python SDK

For teams looking to automate deployment or integrate with existing MLOps pipelines, you can use the following code to deploy the model using the SageMaker Python SDK:

from sagemaker.serve.builder.model_builder import ModelBuilder
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.jumpstart.model import ModelAccessConfig
from sagemaker.session import Session
import logging

sagemaker_session = Session()

# Default S3 bucket and execution role for this session
artifacts_bucket_name = sagemaker_session.default_bucket()
execution_role_arn = sagemaker_session.get_caller_identity_arn()

# JumpStart model ID for Llama 3.3 70B Instruct
js_model_id = "meta-textgeneration-llama-3-3-70b-instruct"

gpu_instance_type = "ml.p4d.24xlarge"

# Sample request/response pair; SchemaBuilder uses it to infer the
# endpoint's input and output serialization schema
response = "Hello, I'm a language model, and I'm here to help you with your English."

sample_input = {
    "inputs": "Hello, I'm a language model,",
    "parameters": {"max_new_tokens": 128, "top_p": 0.9, "temperature": 0.6},
}

sample_output = [{"generated_text": response}]

schema_builder = SchemaBuilder(sample_input, sample_output)

model_builder = ModelBuilder(
    model=js_model_id,
    schema_builder=schema_builder,
    sagemaker_session=sagemaker_session,
    role_arn=execution_role_arn,
    log_level=logging.ERROR
)

model = model_builder.build()

# Deployment requires accepting the Llama EULA
predictor = model.deploy(
    model_access_configs={js_model_id: ModelAccessConfig(accept_eula=True)},
    accept_eula=True,
)
predictor.predict(sample_input)
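
When you're done experimenting, delete the endpoint and model through the predictor so you don't continue to incur charges:

# Clean up to stop incurring charges for the endpoint instance
predictor.delete_model()
predictor.delete_endpoint()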

Set up auto scaling and scale down to zero

You can optionally set up auto scaling to scale down to zero after deployment. For more information, refer to Unlock cost savings with the new scale down to zero feature in SageMaker Inference.
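
For illustration, a minimal sketch of registering a scalable target with a minimum capacity of zero follows. It assumes an endpoint deployed with inference components (which the scale-down-to-zero feature requires) and a placeholder component name; refer to the linked post for the full configuration:

import boto3

autoscaling = boto3.client("application-autoscaling")

# "my-inference-component" is a placeholder; scale down to zero applies to
# endpoints deployed with inference components.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId="inference-component/my-inference-component",
    ScalableDimension="sagemaker:inference-component:DesiredCopyCount",
    MinCapacity=0,  # allow scaling down to zero copies when idle
    MaxCapacity=1,
)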

Optimize deployment with SageMaker AI

SageMaker AI simplifies the deployment of sophisticated models like Llama 3.3 70B, offering a range of features designed to optimize both performance and cost efficiency. With the advanced capabilities of SageMaker AI, organizations can deploy and manage LLMs in production environments, taking full advantage of Llama 3.3 70B’s efficiency while benefiting from the streamlined deployment process and optimization tools of SageMaker AI. Default deployment through SageMaker JumpStart uses accelerated deployment, which uses speculative decoding to improve throughput. For more information on how speculative decoding works with SageMaker AI, see Amazon SageMaker launches the updated inference optimization toolkit for generative AI.

Fast Model Loader revolutionizes the model initialization process by implementing an innovative weight streaming mechanism. This feature fundamentally changes how model weights are loaded onto accelerators, dramatically reducing the time required to get the model ready for inference. Instead of the traditional approach of loading the entire model into memory before beginning operations, Fast Model Loader streams weights directly from Amazon Simple Storage Service (Amazon S3) to the accelerator, enabling faster startup and scaling times.
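
As a hedged sketch of how this might be prepared with the inference optimization toolkit (the optimize() call and its sharding_config argument are assumptions; see the toolkit post linked above for the exact interface), pre-sharding the model weights to Amazon S3 could look like the following, reusing model_builder and artifacts_bucket_name from the deployment code earlier:

# Hedged sketch: pre-shard the model weights to S3 so they can be
# streamed to the accelerator at deployment time. The sharding_config
# keys are assumptions; consult the toolkit documentation.
optimized_model = model_builder.optimize(
    instance_type="ml.p4d.24xlarge",
    accept_eula=True,
    output_path=f"s3://{artifacts_bucket_name}/llama-3-3-70b-sharded/",
    sharding_config={"OverrideEnvironment": {"OPTION_TENSOR_PARALLEL_DEGREE": "8"}},
)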

One SageMaker inference capability is Container Caching, which transforms how model containers are managed during scaling operations. This feature eliminates one of the major bottlenecks in deployment scaling by pre-caching container images, removing the need for time-consuming downloads when adding new instances. For large models like Llama 3.3 70B, where container images can be substantial in size, this optimization significantly reduces scaling latency and improves overall system responsiveness.

Another key capability is Scale to Zero. It introduces intelligent resource management that automatically adjusts compute capacity based on actual usage patterns. This feature represents a paradigm shift in cost optimization for model deployments, allowing endpoints to scale down completely during periods of inactivity while maintaining the ability to scale up quickly when demand returns. This capability is particularly valuable for organizations running multiple models or dealing with variable workload patterns.

Together, these features create a powerful deployment environment that maximizes the benefits of Llama 3.3 70B’s efficient architecture while providing robust tools for managing operational costs and performance.

Conclusion

The combination of Llama 3.3 70B with the advanced inference features of SageMaker AI provides an optimal solution for production deployments. By using Fast Model Loader, Container Caching, and Scale to Zero capabilities, organizations can achieve both high performance and cost-efficiency in their LLM deployments.

We encourage you to try this implementation and share your experiences.


About the authors

Marc Karp is an ML Architect with the Amazon SageMaker Service team. He focuses on helping customers design, deploy, and manage ML workloads at scale. In his spare time, he enjoys traveling and exploring new places.

Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of Generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Melanie Li, PhD, is a Senior Generative AI Specialist Solutions Architect at AWS based in Sydney, Australia, where her focus is on working with customers to build solutions leveraging state-of-the-art AI and machine learning tools. She has been actively involved in multiple Generative AI initiatives across APJ, harnessing the power of Large Language Models (LLMs). Prior to joining AWS, Dr. Li held data science roles in the financial and retail industries.

Adriana Simmons is a Senior Product Marketing Manager at AWS.

Lokeshwaran Ravi is a Senior Deep Learning Compiler Engineer at AWS, specializing in ML optimization, model acceleration, and AI security. He focuses on enhancing efficiency, reducing costs, and building secure ecosystems to democratize AI technologies, making cutting-edge ML accessible and impactful across industries.

Yotam Moss is a Software Development Manager for Inference at AWS AI.

ARMADA: Augmented Reality for Robot Manipulation and Robot-Free Data Acquisition

Teleoperation for robot imitation learning is bottlenecked by hardware availability. Can high-quality robot data be collected without a physical robot? We present a system for augmenting Apple Vision Pro with real-time virtual robot feedback. By providing users with an intuitive understanding of how their actions translate to robot motions, we enable the collection of natural barehanded human data that is compatible with the limitations of physical robot hardware. We conducted a user study with 15 participants demonstrating 3 different tasks each under 3 different feedback conditions and… (Apple Machine Learning Research)

Tech Leader, AI Visionary, Endlessly Curious Jensen Huang to Keynote CES 2025

On Jan. 6 at 6:30 p.m. PT, NVIDIA founder and CEO Jensen Huang — with his trademark leather jacket and an unwavering vision — will step onto the CES 2025 stage.

From humble beginnings as a busboy at a Denny’s to founding NVIDIA, Huang’s story embodies innovation and perseverance.

Huang has been named the world’s best CEO by Fortune and The Economist, as well as one of TIME magazine’s 100 most influential people in the world.

Today, NVIDIA is a driving force behind breakthroughs in AI and accelerated computing, technologies transforming industries ranging from healthcare to automotive and entertainment.

Across the globe, NVIDIA’s innovations enable advanced chatbots, robots, software-defined vehicles, sprawling virtual worlds, hypersynchronized factory floors and much more.

NVIDIA’s accelerated computing and AI platforms power hundreds of millions of computers, available from major cloud providers and server manufacturers.

They fuel 76% of the world’s fastest supercomputers on the TOP500 list and are supported by a thriving community of more than 5 million developers.

For decades, Huang has led NVIDIA through revolutions that ripple across industries.

GPUs redefined gaming as an art form, and NVIDIA’s AI tools empower labs, factory floors and Hollywood sets. From self-driving cars to automated industrial processes, these tools are foundational to the next generation of technological breakthroughs.

CES has long been the stage for the unveiling of technological advancements, and Huang’s keynote is no exception.

Since its inception in 1967, CES has unveiled iconic innovations, including transistor radios, VCRs and HDTVs.

Over the decades, CES has launched numerous NVIDIA flagship innovations, from a first look at NVIDIA SHIELD to NVIDIA DRIVE for autonomous vehicles.

NVIDIA at CES 2025

The keynote is just the beginning.

From Jan. 7-10, NVIDIA will host press, analysts, customers and partners at the Fontainebleau Resort Las Vegas.

The space will feature hands-on demos showcasing innovations in AI, robotics and accelerated computing across NVIDIA’s automotive, consumer, enterprise, Omniverse and robotics portfolios.

Meanwhile, NVIDIA’s technologies will take center stage on the CES show floor at the Las Vegas Convention Center, where partners will highlight AI-powered technologies, immersive gaming experiences and groundbreaking automotive advancements.

Attendees can also participate in NVIDIA’s “Explore to Win” program, an interactive scavenger hunt featuring missions, points and prizes.

Curious about the future? Tune in live on NVIDIA’s website or the company’s YouTube channels to witness how NVIDIA is shaping the future of technology.

AWS re:Invent 2024 Highlights: Top takeaways from Swami Sivasubramanian to help customers manage generative AI at scale

We spoke with Dr. Swami Sivasubramanian, Vice President of Data and AI, shortly after AWS re:Invent 2024 to hear his impressions—and to get insights on how the latest AWS innovations help meet the real-world needs of customers as they build and scale transformative generative AI applications.

Q: What made this re:Invent different?

Swami Sivasubramanian: The theme I spoke about in my re:Invent keynote was simple but powerful—convergence. I believe that we’re at an inflection point unlike any other in the evolution of AI. We’re seeing a remarkable convergence of data, analytics, and generative AI. It’s a combination that enables next-level generative AI applications that are far more capable. And it lets our customers move faster in a really significant way, getting more value, more quickly. Companies like Rocket Mortgage are building on an AI-driven platform powered by Amazon Bedrock to create AI agents and automate tasks—working to give their employees access to generative AI with no-code tools. Canva uses AWS to power 1.2 million requests a day and sees 450 new designs created every second. There’s also a human side to convergence, as people across organizations are working together in new ways, requiring a deeper level of collaboration between groups, like science and engineering teams. And this isn’t just a one-time collaboration. It’s an ongoing process.

People’s expectations for applications and customer experiences are changing again with generative AI. Increasingly, I think generative AI inference is going to be a core building block for every application. To realize this future, organizations need more than just a chatbot or a single powerful large language model (LLM). At re:Invent, we made some exciting announcements about the future of generative AI, of course. But we also launched a remarkable portfolio of new products, capabilities, and features that will help our customers manage generative AI at scale—making it easier to control costs, build trust, increase productivity, and deliver ROI.

Q: Are there key innovations that build on the experience and lessons learned at Amazon in adopting generative AI? How are you bringing those capabilities to your customers?

Swami Sivasubramanian: Yes. We announced Amazon Nova, a new generation of foundation models (FMs) with state-of-the-art intelligence across a wide range of tasks and industry-leading price performance. Amazon Nova models expand the growing selection of the broadest and most capable FMs in Amazon Bedrock for enterprise customers. Amazon Nova Micro, Lite, and Pro demonstrate exceptional intelligence and speed—and perform quite competitively against the best models in their respective categories. Amazon Nova Canvas, our state-of-the-art image generation model, creates professional-grade images from text and image inputs, democratizing access to production-grade visual content for advertising, training, social media, and more. Finally, Amazon Nova Reel offers state-of-the-art video generation that allows customers to create high-quality video from text or images. With about 1,000 generative AI applications in motion inside Amazon, groups like Amazon Ads are using Amazon Nova to remove barriers for sellers and advertisers, enabling new levels of creativity and innovation. New capabilities like image and video generation are helping Amazon Ads customers promote more products in their catalogs and experiment with new strategies like keyword-level creative to increase engagement and drive sales.

But there’s more ahead, and here’s where an important shift is happening. We’re working on an even more capable any-to-any model where you can provide text, images, audio, and video as input and the model can generate outputs in any of these modalities. We think this multimodal approach is how models are going to evolve, with one model able to accept any kind of input and generate any kind of output. Over time, I think this is what state-of-the-art models will look like.

Q: Speaking of announcements like Amazon Nova, you’ve been a key innovator in AI for many years. What continues to inspire you?

Swami Sivasubramanian: It’s fascinating to think about what LLMs are capable of. What inspires me most, though, is how we can help our customers unblock the challenges they are facing and realize that potential. Consider hallucinations. As highly capable as today’s models are, they still have a tendency to get things wrong occasionally. It’s a challenge that many of our customers struggle with when integrating generative AI into their businesses and moving to production. We explored the problem and asked ourselves if we could do more to help. We looked inward and leveraged Automated Reasoning, an innovation that Amazon has been using behind the scenes in many of our services, like identity and access management.

I like to think of this situation as yin and yang. Automated Reasoning is all about certainty and being able to mathematically prove that something is correct. Generative AI is all about creativity and open-ended responses. Though they might seem like opposites, they’re actually complementary—with Automated Reasoning completing and strengthening generative AI. We’ve found that Automated Reasoning works really well when you have a huge surface area of a problem, a corpus of knowledge about that problem area, and when it’s critical that you get the correct answer—which makes Automated Reasoning a good fit for addressing hallucinations.

At re:Invent, we announced Amazon Bedrock Guardrails Automated Reasoning checks—the first and only generative AI safeguard that helps prevent factual errors due to hallucinations, using logically accurate and verifiable reasoning that explains why generative AI responses are correct. I think it’s an innovation that will have significant impact across organizations and industries, helping build trust and accelerate generative AI adoption.

Q: Controlling costs is important to all organizations, large and small, particularly as they take generative AI applications into production. How do the announcements at re:Invent answer this need?

Swami Sivasubramanian: Like our customers, here at Amazon we’re increasing our investment in generative AI development, with multiple projects in process—all requiring timely access to accelerated compute resources. But allocating optimal compute capacity to each project can create a supply/demand challenge. To address this challenge, we created an internal service that helped Amazon drive utilization of compute resources to more than 90% across all our projects. This service enabled us to smooth out demand across projects and achieve higher capacity utilization, speeding development.

As with Automated Reasoning, we realized that our customers would also benefit from these capabilities. So, at re:Invent, I announced the new task governance capability in Amazon SageMaker HyperPod, which helps our customers optimize compute resource utilization and reduce time to market by up to 40%. With this capability, users can dynamically run tasks across the end-to-end FM workflow, accelerating time to market for AI innovations while avoiding cost overruns due to underutilized compute resources.

Our customers also tell me that the trade-off between cost and accuracy for models is real. We’re answering this need by making it super-easy to evaluate models on Amazon Bedrock, so they don’t have to spend months researching and making comparisons. We’re also lowering costs with game-changing capabilities such as Amazon Bedrock Model Distillation, which transfers knowledge from larger models to smaller, more cost-efficient ones; Amazon Bedrock Intelligent Prompt Routing, which manages prompts more efficiently, at scale; and prompt caching, which reduces repeated processing without compromising on accuracy.

Q: Higher productivity is one of the core promises of generative AI. How is AWS helping employees at all levels be more productive?

Swami Sivasubramanian: I like to point out that using generative AI becomes irresistible when it makes employees 10 times more productive. In short, not an incremental increase, but a major leap in productivity. And we’re helping employees get there. For example, Amazon Q Developer is transforming code development by taking care of the time-consuming chores that developers don’t want to deal with, like software upgrades. And it also helps them move much faster by automating code reviews and dealing with mainframe modernization. Consider Novacomp, a leading IT company in Latin America, which leveraged Amazon Q Developer to upgrade a project with over 10,000 lines of Java code in just 50 minutes, a task that would have typically taken an estimated 3 weeks. The company also simplified everyday tasks for developers, reducing its technical debt by 60% on average.

On the business side, Amazon Q Business is bridging the gap between unstructured and structured data, recognizing that most businesses need to draw from a mix of data. With Amazon Q in QuickSight, non-technical users can leverage natural language to build, discover, and share meaningful insights in seconds. Now they can access databases and data warehouses, as well as unstructured business data, like emails, reports, charts, graphs, and images.

And looking ahead, we announced advanced agentic capabilities for Amazon Q Business, coming in 2025, which will use agents to automate complex tasks that stretch across multiple teams and applications. Agents give generative AI applications next-level capabilities, and we’re bringing them to our customers via Amazon Q Business, as well as Amazon Bedrock multi-agent collaboration, which improves successful task completion by 40% over popular solutions. This major improvement translates to more accurate and human-like outcomes in use cases like automating customer support, analyzing financial data for risk management, or optimizing supply-chain logistics.

It’s all part of how we’re enabling greater productivity today, with even more on the horizon.

Q: To get employees and customers adopting generative AI and benefiting from that increased productivity, it has to be trusted. What steps is AWS taking to help build that trust?

Swami Sivasubramanian: I think that lack of trust is a big obstacle to moving from proof of concept to production. Business leaders are about to hit go, and they hesitate because they don’t want to lose the trust of their customers. As generative AI continues to drive innovation across industries and our daily lives, the need for responsible AI has become increasingly acute. And we’re helping meet that need with innovations like Amazon Bedrock Automated Reasoning, which I mentioned earlier, that works to prevent hallucinations—and increases trust. We also announced new LLM-as-a-judge capabilities with Amazon Bedrock Model Evaluation so you can now perform tests and evaluate other models with humanlike quality at a fraction of the cost and time of running human evaluations. These evaluations assess multiple quality dimensions, including correctness, helpfulness, and responsible AI criteria such as answer refusal and harmfulness.

I should also mention that AWS recently became the first major cloud provider to announce ISO/IEC 42001 accredited certification for AI services, covering Amazon Bedrock, Amazon Q Business, Amazon Textract, and Amazon Transcribe. This international management system standard outlines requirements and controls for organizations to promote the responsible development and use of AI systems. Technical standards like ISO/IEC 42001 are significant because they provide a much-needed common framework for responsible AI development and deployment.

Q: Data remains central to building more personalized experiences applicable to your business. How do the re:Invent launches help AWS customers get their data ready for generative AI?

Swami Sivasubramanian: Generative AI isn’t going to be useful for organizations unless it can seamlessly access and deeply understand the organization’s data. With these insights, our customers can create customized experiences, such as highly personalized customer service agents that can help service representatives resolve issues faster. For AWS customers, getting data ready for generative AI isn’t just a technical challenge—it’s a strategic imperative. Proprietary, high-quality data is the key differentiator in transforming generic AI into powerful, business-specific applications. To prepare for this AI-driven future, we’re helping our customers build a robust, cloud-based data foundation, with built-in security and privacy. That’s the backbone of AI readiness.

With the next generation of Amazon SageMaker announced at re:Invent, we’re introducing an integrated experience to access, govern, and act on all your data by bringing together widely adopted AWS data, analytics, and AI capabilities. Collaborate and build faster from a unified studio using familiar AWS tools for model development, generative AI, data processing, and SQL analytics—with Amazon Q Developer assisting you along the way. Access all your data whether it’s stored in data lakes, data warehouses, third-party or federated data sources. And move with confidence and trust, thanks to built-in governance to address enterprise security needs.

At re:Invent, we also launched key Amazon Bedrock capabilities that help our customers maximize the value of their data. Amazon Bedrock Knowledge Bases now offers the only managed, out-of-the-box Retrieval Augmented Generation (RAG) solution, which enables our customers to natively query their structured data where it resides, accelerating development. Support for GraphRAG generates more relevant responses by modeling and storing relationships between data. And Amazon Bedrock Data Automation transforms unstructured, multimodal data into structured data for generative AI—automatically extracting, transforming, and generating usable data from multimodal content, at scale. These capabilities and more help our customers leverage their data to create powerful, insightful generative AI applications.

Q: What did you take away from your customer conversations at re:Invent?

Swami Sivasubramanian: I continue to be amazed and inspired by our customers and the important work they’re doing. We continue to offer our customers the choice and specialization they need to power their unique use cases. With Amazon Bedrock Marketplace, customers now have access to more than 100 popular, emerging, and specialized models.

At re:Invent, I heard a lot about the new efficiency and transformative experiences customers are creating. I also heard about innovations that are changing people’s lives. Like Exact Sciences, a molecular diagnostic company, which developed an AI-powered solution using Amazon Bedrock to accelerate genetic testing and analysis by 50%. Behind that metric there’s a real human value—enabling earlier cancer detection and personalized treatment planning. And that’s just one story among thousands, as our customers reach higher and build faster, achieving impressive results that change industries and improve lives.

I get excited when I think about how we can help educate the next wave of innovators building these experiences. With the launch of the new Education Equity Initiative, Amazon is committing up to $100 million in cloud technology and technical resources to help existing, dedicated learning organizations reach more learners by creating new and innovative digital learning solutions. That’s truly inspiring to me.

In fact, the pace of change, the remarkable innovations we introduced at re:Invent, and the enthusiasm of our customers all reminded me of the early days of AWS, when anything seemed possible. And now, it still is.


About the author

Swami Sivasubramanian is VP, AWS AI & Data. In this role, Swami oversees all AWS Database, Analytics, and AI & Machine Learning services. His team’s mission is to help organizations put their data to work with a complete, end-to-end data solution to store, access, analyze, visualize, and predict.
