Streamline custom model creation and deployment for Amazon Bedrock with Provisioned Throughput using Terraform

As customers seek to incorporate their corpus of knowledge into their generative artificial intelligence (AI) applications, or to build domain-specific models, their data science teams often want to conduct A/B testing and have repeatable experiments. In this post, we discuss a solution that uses infrastructure as code (IaC) to define the process of retrieving and formatting data for model customization and initiating the model customization. This enables you to version and iterate as needed.

With Amazon Bedrock, you can privately and securely customize foundation models (FMs) with your own data to build applications that are specific to your domain, organization, and use case. With custom models, you can create unique user experiences that reflect your company’s style, voice, and services.

Amazon Bedrock supports two methods of model customization:

Fine-tuning allows you to increase model accuracy by providing your own task-specific labeled training dataset and further specialize your FMs.
Continued pre-training allows you to train models using your own unlabeled data in a secure and managed environment and supports customer-managed keys. Continued pre-training helps models become more domain-specific by accumulating more robust knowledge and adaptability—beyond their original training.

In this post, we provide guidance on how to create an Amazon Bedrock custom model using HashiCorp Terraform that allows you to automate the process, including preparing datasets used for customization.

Terraform is an IaC tool that allows you to manage AWS resources, software as a service (SaaS) resources, datasets, and more, using declarative configuration. Terraform provides the benefits of automation, versioning, and repeatability.

Solution overview

We use Terraform to download a public dataset from the Hugging Face Hub, convert it to JSONL format, and upload it to an Amazon Simple Storage Service (Amazon S3) bucket with a versioned prefix. We then create an Amazon Bedrock custom model using fine-tuning, and create a second model using continued pre-training. Lastly, we configure Provisioned Throughput for our new models so we can test and deploy the custom models for wider usage.

The following diagram illustrates the solution architecture.

The workflow includes the following steps:

The user runs the terraform apply The Terraform local-exec provisioner is used to run a Python script that downloads the public dataset DialogSum from the Hugging Face Hub. This is then used to create a fine-tuning training JSONL file.
An S3 bucket stores training, validation, and output data. The generated JSONL file is uploaded to the S3 bucket.
The FM defined in the Terraform configuration is used as the source for the custom model training job.
The custom model training job uses the fine-tuning training data stored in the S3 bucket to enrich the FM. Amazon Bedrock is able to access the data in the S3 bucket (including output data) due to the AWS Identity and Access Management (IAM) role defined in the Terraform configuration, which grants access to the S3 bucket.
When the custom model training job is complete, the new custom model is available for use.

The high-level steps to implement this solution are as follows:

Create and initialize a Terraform project.
Create data sources for context lookup.
Create an S3 bucket to store training, validation, and output data.
Create an IAM service role that allows Amazon Bedrock to run a model customization job, access your training and validation data, and write your output data to your S3 bucket.
Configure your local Python virtual environment.
Download the DialogSum public dataset and convert it to JSONL.
Upload the converted dataset to Amazon S3.
Create an Amazon Bedrock custom model using fine-tuning.
Configure custom model Provisioned Throughput for your models.

Prerequisites

This solution requires the following prerequisites:

An AWS account. If you don’t have an account, you can sign up for one.
Access to an IAM user or role that has the required permissions for creating a custom service role, creating an S3 bucket, uploading objects to that S3 bucket, and creating Amazon Bedrock model customization jobs.
Configured model access for Cohere Command-Light (if you plan to use that model).
Terraform v1.0.0 or newer installed.
python v3.8 or newer installed.

Create and initialize a Terraform project

Complete the following steps to create a new Terraform project and initialize it. You can work in a local folder of your choosing.

In your preferred terminal, create a new folder named bedrockcm and change to that folder:
1. If on Windows, use the following code:
```
md bedrockcm
cd bedrockcm
```
2. If on Mac or Linux, use the following code:
```
mkdir bedrockcm
cd bedrockcm
```

Now you can work in a text editor and enter in code.

In your preferred text editor, add a new file with the following Terraform code:

terraform {
  required_version = ">= 1.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.35.0"
    }
  }
}

Save the file in the root of the bedrockcm folder and name it main.tf.
In your terminal, run the following command to initialize the Terraform working directory:

terraform init

The output will contain a successful message like the following:

“Terraform has been successfully initialized”

In your terminal, validate the syntax for your Terraform files:

terraform validate

Create data sources for context lookup

The next step is to add configurations that define data sources that look up information about the context Terraform is currently operating in. These data sources are used when defining the IAM role and policies and when creating the S3 bucket. More information can be found in the Terraform documentation for aws_caller_identity, aws_partition, and aws_region.

In your text editor, add the following Terraform code to your main.tf file:

# Data sources to query the current context Terraform is operating in
data "aws_caller_identity" "current" {}
data "aws_partition" "current" {}
data "aws_region" "current" {}

Save the file.

Create an S3 bucket

In this step, you use Terraform to create an S3 bucket to use during model customization and associated outputs. S3 bucket names are globally unique, so you use the Terraform data source aws_caller_identity, which allows you to look up the current AWS account ID, and use string interpolation to include the account ID in the bucket name. Complete the following steps:

Add the following Terraform code to your main.tf file:

# Create a S3 bucket
resource "aws_s3_bucket" "model_training" {
  bucket = "model-training-${data.aws_caller_identity.current.account_id}"
}

Save the file.

Create an IAM service role for Amazon Bedrock

Now you create the service role that Amazon Bedrock will assume to operate the model customization jobs.

You first create a policy document, assume_role_policy, which defines the trust relationship for the IAM role. The policy allows the bedrock.amazonaws.com service to assume this role. You use global condition context keys for cross-service confused deputy prevention. There are also two conditions you specify: the source account must match the current account, and the source ARN must be an Amazon Bedrock model customization job operating from the current partition, AWS Region, and current account.

Complete the following steps:

Add the following Terraform code to your main.tf file:

# Create a policy document to allow Bedrock to assume the role
data "aws_iam_policy_document" "assume_role_policy" {
  statement {
    actions = ["sts:AssumeRole"]
    effect  = "Allow"
    principals {
      type        = "Service"
      identifiers = ["bedrock.amazonaws.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [data.aws_caller_identity.current.account_id]
    }
    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = ["arn:${data.aws_partition.current.partition}:bedrock:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:model-customization-job/*"]
    }
  }
}

The second policy document, bedrock_custom_policy, defines permissions for accessing the S3 bucket you created for model training, validation, and output. The policy allows the actions GetObject, PutObject, and ListBucket on the resources specified, which are the ARN of the model_training S3 bucket and all of the buckets contents. You will then create an aws_iam_policy resource, which creates the policy in AWS.

Add the following Terraform code to your main.tf file:

# Create a policy document to allow Bedrock to access the S3 bucket
data "aws_iam_policy_document" "bedrock_custom_policy" {
  statement {
    sid       = "AllowS3Access"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
    resources = [aws_s3_bucket.model_training.arn, "${aws_s3_bucket.model_training.arn}/*"]
  }
}

resource "aws_iam_policy" "bedrock_custom_policy" {
  name_prefix = "BedrockCM-"
  description = "Policy for Bedrock Custom Models customization jobs"
  policy      = data.aws_iam_policy_document.bedrock_custom_policy.json
}

Finally, the aws_iam_role resource, bedrock_custom_role, creates an IAM role with a name prefix of BedrockCM- and a description. The role uses assume_role_policy as its trust policy and bedrock_custom_policy as a managed policy to allow the actions specified.

Add the following Terraform code to your main.tf file:

# Create a role for Bedrock to assume
resource "aws_iam_role" "bedrock_custom_role" {
  name_prefix = "BedrockCM-"
  description = "Role for Bedrock Custom Models customization jobs"

  assume_role_policy  = data.aws_iam_policy_document.assume_role_policy.json
  managed_policy_arns = [aws_iam_policy.bedrock_custom_policy.arn]
}

Save the file.

Configure your local Python virtual environment

Python supports creating lightweight virtual environments, each with their own independent set of Python packages installed. You create and activate a virtual environment, and then install the datasets package.

In your terminal, in the root of the bedrockcm folder, run the following command to create a virtual environment:

python3 -m venv venv

Activate the virtual environment:
1. If on Windows, use the following command:
```
venvScriptsactivate
```
2. If on Mac or Linux, use the following command:
```
source venv/bin/activate
```

Now you install the datasets package via pip.

In your terminal, run the following command to install the datasets package:

pip3 install datasets

Download the public dataset

You now use Terraform’s local-exec provisioner to invoke a local Python script that will download the public dataset DialogSum from the Hugging Face Hub. The dataset is already divided into training, validation, and testing splits. This example uses just the training split.

You prepare the data for training by removing the id and topic columns, renaming the dialogue and summary columns, and truncating the dataset to 10,000 records. You then save the dataset in JSONL format. You could also use your own internal private datasets; we use a public dataset for example purposes.

You first create the local Python script named dialogsum-dataset-finetune.py, which is used to download the dataset and save it to disk.

In your text editor, add a new file with the following Python code:

import pandas as pd
from datasets import load_dataset

# Load the dataset from the huggingface hub
dataset = load_dataset("knkarthick/dialogsum")

# Convert the dataset to a pandas DataFrame
dft = dataset['train'].to_pandas()

# Drop the columns that are not required for fine-tuning
dft = dft.drop(columns=['id', 'topic'])

# Rename the columns to prompt and completion as required for fine-tuning.
# Ref: https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prereq.html#model-customization-prepare
dft = dft.rename(columns={"dialogue": "prompt", "summary": "completion"})

# Limit the number of rows to 10,000 for fine-tuning
dft = dft.sample(10000,
    random_state=42)

# Save DataFrame as a JSONL file, with each line as a JSON object
dft.to_json('dialogsum-train-finetune.jsonl', orient='records', lines=True)

Save the file in the root of the bedrockcm folder and name it dialogsum-dataset-finetune.py.

Next, you edit the main.tf file you have been working in and add the terraform_data resource type, uses a local provisioner to invoke your Python script.

In your text editor, edit the main.tf file and add the following Terraform code:

resource "terraform_data" "training_data_fine_tune_v1" {
  input = "dialogsum-train-finetune.jsonl"

  provisioner "local-exec" {
    command = "python dialogsum-dataset-finetune.py"
  }
}

Upload the converted dataset to Amazon S3

Terraform provides the aws_s3_object resource type, which allows you to create and manage objects in S3 buckets. In this step, you reference the S3 bucket you created earlier and the terraform_data resource’s output attribute. This output attribute is how you instruct the Terraform resource graph that these resources need to be created with a dependency order.

In your text editor, edit the main.tf file and add the following Terraform code:

resource "aws_s3_object" "v1_training_fine_tune" {
  bucket = aws_s3_bucket.model_training.id
  key    = "training_data_v1/${terraform_data.training_data_fine_tune_v1.output}"
  source = terraform_data.training_data_fine_tune_v1.output
}

Create an Amazon Bedrock custom model using fine-tuning

Amazon Bedrock has multiple FMs that support customization with fine-tuning. To see a list of the models available, use the following AWS Command Line Interface (AWS CLI) command:

In your terminal, run the following command to list the FMs that support customization by fine-tuning:

aws bedrock list-foundation-models --by-customization-type FINE_TUNING

You use the Cohere Command-Light FM for this model customization. You add a Terraform data source to query the foundation model ARN using the model name. You then create the Terraform resource definition for aws_bedrock_custom_model, which creates a model customization job, and immediately returns.

The time it takes for model customization is non-deterministic, and is based on the input parameters, model used, and other factors.

In your text editor, edit the main.tf file and add the following Terraform code:

data "aws_bedrock_foundation_model" "cohere_command_light_text_v14" {
  model_id = "cohere.command-light-text-v14:7:4k"
}

resource "aws_bedrock_custom_model" "cm_cohere_v1" {
  custom_model_name     = "cm_cohere_v001"
  job_name              = "cm.command-light-text-v14.v001"
  base_model_identifier = data.aws_bedrock_foundation_model.cohere_command_light_text_v14.model_arn
  role_arn              = aws_iam_role.bedrock_custom_role.arn
  customization_type    = "FINE_TUNING"

  hyperparameters = {
    "epochCount"             = "1"
    "batchSize"              = "8"
    "learningRate"           = "0.00001"
    "earlyStoppingPatience"  = "6"
    "earlyStoppingThreshold" = "0.01"
    "evalPercentage"         = "20.0"
  }

  output_data_config {
    s3_uri = "s3://${aws_s3_bucket.model_training.id}/output_data_v1/"
  }

  training_data_config {
    s3_uri = "s3://${aws_s3_bucket.model_training.id}/training_data_v1/${terraform_data.training_data_fine_tune_v1.output}"
  }
}

Save the file.

Now you use Terraform to create the data sources and resources defined in your main.tf file, which will start a model customization job.

In your terminal, run the following command to validate the syntax for your Terraform files:

terraform validate

Run the following command to apply the configuration you created. Before creating the resources, Terraform will describe all the resources that will be created so you can verify your configuration:

terraform apply

Terraform will generate a plan and ask you to approve the actions, which will look similar to the following code:

...

Plan: 6 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value:

Enter yes to approve the changes.

Terraform will now apply your configuration. This process runs for a few minutes. At this time, your custom model is not yet ready for use; it will be in a Training state. Wait for training to finish before continuing. You can review the status on the Amazon Bedrock console on the Custom models page.

When the process is complete, you receive a message like the following:

Apply complete! Resources: 6 added, 0 changed, 0 destroyed.

You can also view the status on the Amazon Bedrock console.

You have now created an Amazon Bedrock custom model using fine-tuning.

Configure custom model Provisioned Throughput

Amazon Bedrock allows you to run inference on custom models by purchasing Provisioned Throughput. This guarantees a consistent level of throughput in exchange for a term commitment. You specify the number of model units needed to meet your application’s performance needs. For evaluating custom models initially, you can purchase Provisioned Throughput hourly (on-demand) with no long-term commitment. With no commitment, a quota of one model unit is available per Provisioned Throughput.

You create a new resource for Provisioned Throughput, associate one of your custom models, and provide a name. You omit the commitment_duration attribute to use on-demand.

In your text editor, edit the main.tf file and add the following Terraform code:

resource "aws_bedrock_provisioned_model_throughput" "cm_cohere_provisioned_v1" {
  provisioned_model_name = "${aws_bedrock_custom_model.cm_cohere_v1.custom_model_name}-provisioned"
  model_arn              = aws_bedrock_custom_model.cm_cohere_v1.custom_model_arn
  model_units            = 1 
}

Save the file.

Now you use Terraform to create the resources defined in your main.tf file.

In your terminal, run the following command to re-initialize the Terraform working directory:

terraform init

The output will contain a successful message like the following:

“Terraform has been successfully initialized”

Validate the syntax for your Terraform files:

terraform validate

Run the following command to apply the configuration you created:

terraform apply

Best practices and considerations

Note the following best practices when using this solution:

Data and model versioning – You can version your datasets and models by using version identifiers in your S3 bucket prefixes. This allows you to compare model efficacy and outputs. You could even operate a new model in a shadow deployment so that your team can evaluate the output relative to your models being used in production.
Data privacy and network security – With Amazon Bedrock, you are in control of your data, and all your inputs and customizations remain private to your AWS account. Your data, such as prompts, completions, custom models, and data used for fine-tuning or continued pre-training, is not used for service improvement and is never shared with third-party model providers. Your data remains in the Region where the API call is processed. All data is encrypted in transit and at rest. You can use AWS PrivateLink to create a private connection between your VPC and Amazon Bedrock.
Billing – Amazon Bedrock charges for model customization, storage, and inference. Model customization is charged per tokens processed. This is the number of tokens in the training dataset multiplied by the number of training epochs. An epoch is one full pass through the training data during customization. Model storage is charged per month, per model. Inference is charged hourly per model unit using Provisioned Throughput. For detailed pricing information, see Amazon Bedrock Pricing.
Custom models and Provisioned Throughput – Amazon Bedrock allows you to run inference on custom models by purchasing Provisioned Throughput. This guarantees a consistent level of throughput in exchange for a term commitment. You specify the number of model units needed to meet your application’s performance needs. For evaluating custom models initially, you can purchase Provisioned Throughput hourly with no long-term commitment. With no commitment, a quota of one model unit is available per Provisioned Throughput. You can create up to two Provisioned Throughputs per account.
Availability – Fine-tuning support on Meta Llama 2, Cohere Command Light, and Amazon Titan Text FMs is available today in Regions US East (N. Virginia) and US West (Oregon). Continued pre-training is available today in public preview in Regions US East (N. Virginia) and US West (Oregon). To learn more, visit the Amazon Bedrock Developer Experience and check out Custom models.

Clean up

When you no longer need the resources created as part of this post, clean up those resources to save associated costs. You can clean up the AWS resources created in this post using Terraform with the terraform destroy command.

First, you need to modify the configuration of the S3 bucket in the main.tf file to enable force destroy so the contents of the bucket will be deleted, so the bucket itself can be deleted. This will remove all of the sample data contained in the S3 bucket as well as the bucket itself. Make sure there is no data you want to retain in the bucket before proceeding.

Modify the declaration of your S3 bucket to set the force_destroy attribute of the S3 bucket:

# Create a S3 bucket
resource "aws_s3_bucket" "model_training" {
  bucket = "model-training-${data.aws_caller_identity.current.account_id}"
  force_destroy = true
}

Run the terraform apply command to update the S3 bucket with this new configuration:

terraform apply

Run the terraform destroy command to delete all resources created as part of this post:

terraform destroy

Conclusion

In this post, we demonstrated how to create Amazon Bedrock custom models using Terraform. We introduced GitOps to manage model configuration and data associated with your custom models.

We recommend testing the code and examples in your development environment, and making appropriate changes as required to use them in production. Consider your model consumption requirements when defining your Provisioned Throughput.

We welcome your feedback! If you have questions or suggestions, leave them in the comments section.

About the Authors

Josh Famestad is a Solutions Architect at AWS helping public sector customers accelerate growth, add agility, and reduce risk with cloud-based solutions.

Kevon Mayers is a Solutions Architect at AWS. Kevon is a Core Contributor for Terraform and has led multiple Terraform initiatives within AWS. Prior to joining AWS, he was working as a DevOps engineer and developer, and before that was working with the GRAMMYs/The Recording Academy as a studio manager, music producer, and audio engineer.

Tyler Lynch is a Principal Solution Architect at AWS. Tyler leads Terraform provider engineering at AWS and is a Core Contributor for Terraform.

Vedere AI