Unlock cost-effective AI inference using Amazon Bedrock serverless capabilities with an Amazon SageMaker trained model

In this post, I show you how to use the fully managed, on-demand Amazon Bedrock API with your Amazon SageMaker trained or fine-tuned model.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI.

Previously, if you wanted to use your own custom fine-tuned models in Amazon Bedrock, you either had to self-manage your inference infrastructure in SageMaker or train the models directly within Amazon Bedrock, which requires costly provisioned throughput.

With Amazon Bedrock Custom Model Import, you can use new or existing models that have been trained or fine-tuned within SageMaker using Amazon SageMaker JumpStart. You can import supported architectures into Amazon Bedrock and access them on demand through the fully managed Amazon Bedrock InvokeModel API.

Solution overview

At the time of writing, Amazon Bedrock supports importing custom models from the following architectures:

  • Mistral
  • Flan
  • Meta Llama 2 and Llama 3

For this post, we use a Hugging Face Flan-T5 Base model.

In the following sections, I walk you through the steps to train a model in SageMaker JumpStart and import it into Amazon Bedrock. Then you can interact with your custom model through the Amazon Bedrock playgrounds.

Prerequisites

Before you begin, verify that you have an AWS account with Amazon SageMaker Studio and Amazon Bedrock access.

If you don’t already have an instance of SageMaker Studio, see Launch Amazon SageMaker Studio for instructions to create one.

Train a model in SageMaker JumpStart

Complete the following steps to train a Flan model in SageMaker JumpStart:

  1. Open the AWS Management Console and go to SageMaker Studio.

Amazon SageMaker Console

  2. In SageMaker Studio, choose JumpStart in the navigation pane.

With SageMaker JumpStart, machine learning (ML) practitioners can choose from a broad selection of publicly available FMs and prebuilt ML solutions that can be deployed in a few clicks.

  3. Search for and choose the Hugging Face Flan-T5 Base model.

Amazon SageMaker JumpStart Page

On the model details page, you can review a short description of the model, how to deploy it, how to fine-tune it, and what format your training data needs to be in to customize the model.

  4. Choose Train to begin fine-tuning the model on your training data.

Flan-T5 Base Model Card

Create the training job using the default settings, which populate the job with recommended values.

  5. The example in this post uses a prepopulated example dataset. When using your own data, enter its location in the Data section, making sure it meets the format requirements.

Fine-tune model page

  6. Configure the security settings such as AWS Identity and Access Management (IAM) role, virtual private cloud (VPC), and encryption.
  7. Note the value for Output artifact location (S3 URI) to use later.
  8. Submit the job to start training.

You can monitor the job by choosing Training on the Jobs dropdown menu; the job has finished when its status shows as Completed. With the default settings, training takes about 10 minutes. If you prefer to run this step from a SageMaker notebook instead of the console, see the sketch that follows the screenshot.

Training Jobs
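If you would rather fine-tune from a SageMaker notebook, the following is a minimal sketch using the SageMaker Python SDK. The JumpStart model ID and the S3 training path shown here are assumptions for illustration; confirm the exact model ID on the model card in SageMaker Studio and substitute your own bucket.

    import sagemaker
    from sagemaker.jumpstart.estimator import JumpStartEstimator

    # JumpStart model ID for the Hugging Face Flan-T5 Base model
    # (assumed here; confirm the exact ID on the model card in SageMaker Studio).
    estimator = JumpStartEstimator(
        model_id="huggingface-text2text-flan-t5-base",
        role=sagemaker.get_execution_role(),  # works when run inside SageMaker Studio
    )

    # Fine-tune on your dataset in Amazon S3 (placeholder bucket and prefix);
    # the data must follow the format described on the model card.
    estimator.fit({"training": "s3://amzn-s3-demo-bucket/flan-t5/training-data/"})

    # S3 URI of the trained model artifacts; note it for the Bedrock import step.
    print(estimator.model_data)

Unless you pass hyperparameters explicitly, the estimator uses the defaults recommended on the model card, which matches the console behavior described above.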

Import the model into Amazon Bedrock

After the model has completed training, you can import it into Amazon Bedrock. Complete the following steps:

  1. On the Amazon Bedrock console, choose Imported models under Foundation models in the navigation pane.
  2. Choose Import model.

Amazon Bedrock - Custom Model Import

  3. For Model name, enter a recognizable name for your model.
  4. Under Model import settings, select Amazon SageMaker model and select the radio button next to your model.

Importing a model from Amazon SageMaker

  5. Under Service access, select Create and use a new service role and enter a name for the role.
  6. Choose Import model.

Creating a new service role

  7. Wait for the model import to complete, which takes about 15 minutes.

Successful model import
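If you prefer to script the import rather than use the console, the following is a minimal sketch using the Amazon Bedrock control-plane API with boto3. The console steps above import directly from a SageMaker model; this sketch uses the S3 data source variant instead, pointing at model artifacts in a supported format. The job name, role ARN, and S3 URI are placeholders.

    import time
    import boto3

    bedrock = boto3.client("bedrock")

    # Start the import job. The role must allow Amazon Bedrock to read the
    # model artifacts from your S3 bucket (placeholder ARN and URI below).
    job = bedrock.create_model_import_job(
        jobName="flan-t5-import-job",
        importedModelName="flan-t5-fine-tuned",
        roleArn="arn:aws:iam::111122223333:role/BedrockModelImportRole",
        modelDataSource={
            "s3DataSource": {"s3Uri": "s3://amzn-s3-demo-bucket/flan-t5/model-artifacts/"}
        },
    )

    # Poll until the job finishes (about 15 minutes for this model).
    while True:
        status = bedrock.get_model_import_job(jobIdentifier=job["jobArn"])["status"]
        if status in ("Completed", "Failed"):
            break
        time.sleep(60)

    print("Import status:", status)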

To test your imported model, complete the following steps:

  1. Under Playgrounds in the navigation pane, choose Text.
  2. Choose Select model.

Using the model in Amazon Bedrock text playground

  3. For Category, choose Imported models.
  4. For Model, choose flan-t5-fine-tuned.
  5. For Throughput, choose On-demand.
  6. Choose Apply.

Selecting the fine-tuned model for use

You can now interact with your custom model. In the following screenshot, we use our example custom model to summarize a description about Amazon Bedrock.

Using the fine-tuned model
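Outside the playground, you can call the same imported model through the InvokeModel API. The following is a minimal boto3 sketch; the model ARN is a placeholder (copy yours from the Imported models page), and the request body fields shown (prompt, max_tokens, temperature) follow the commonly documented shape for imported models, so confirm the exact schema for your model in the Amazon Bedrock documentation.

    import json
    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime")

    # ARN of your imported model (placeholder; copy it from the Imported models page).
    model_arn = "arn:aws:bedrock:us-east-1:111122223333:imported-model/abcd1234example"

    body = {
        "prompt": "Summarize: Amazon Bedrock is a fully managed service that offers "
                  "a choice of high-performing foundation models through a single API.",
        "max_tokens": 256,
        "temperature": 0.5,
    }

    response = bedrock_runtime.invoke_model(
        modelId=model_arn,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json",
    )

    print(json.loads(response["body"].read()))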

Clean up

Complete the following steps to clean up your resources:

  1. If you’re not going to continue using SageMaker, delete your SageMaker domain.
  2. If you no longer want to maintain your model artifacts, delete the Amazon Simple Storage Service (Amazon S3) bucket where your model artifacts are stored.
  3. To delete your imported model from Amazon Bedrock, open the Imported models page on the Amazon Bedrock console, select your model, choose the options menu (three dots), and then choose Delete.

Clean-Up
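If you prefer to clean up programmatically, you can also delete the imported model through the Amazon Bedrock API. The following is a minimal sketch; the model name is the one used earlier in this post, so substitute your own model name or ARN.

    import boto3

    bedrock = boto3.client("bedrock")

    # Delete the imported model by name or ARN
    # (placeholder name matching the example in this post).
    bedrock.delete_imported_model(modelIdentifier="flan-t5-fine-tuned")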

Conclusion

In this post, we explored how the Custom Model Import feature in Amazon Bedrock enables you to use your own custom trained or fine-tuned models for on-demand, cost-efficient inference. By integrating SageMaker model training capabilities with the fully managed, scalable infrastructure of Amazon Bedrock, you now have a seamless way to deploy your specialized models and make them accessible through a simple API.

Whether you prefer the user-friendly SageMaker Studio console or the flexibility of SageMaker notebooks, you can train and import your models into Amazon Bedrock. This allows you to focus on developing innovative applications and solutions, without the burden of managing complex ML infrastructure.

As the capabilities of large language models continue to evolve, the ability to integrate custom models into your applications becomes increasingly valuable. With the Amazon Bedrock Custom Model Import feature, you can now unlock the full potential of your specialized models and deliver tailored experiences to your customers, all while benefiting from the scalability and cost-efficiency of a fully managed service.

To dive deeper into fine-tuning on SageMaker, see Instruction fine-tuning for FLAN T5 XL with Amazon SageMaker Jumpstart. To get more hands-on experience with Amazon Bedrock, check out our Building with Amazon Bedrock workshop.


About the Author

Joseph Sadler is a Senior Solutions Architect on the Worldwide Public Sector team at AWS, specializing in cybersecurity and machine learning. With public and private sector experience, he has expertise in cloud security, artificial intelligence, threat detection, and incident response. His diverse background helps him architect robust, secure solutions that use cutting-edge technologies to safeguard mission-critical systems.
