Code generation using Code Llama 70B and Mixtral 8x7B on Amazon SageMaker

In the ever-evolving landscape of machine learning and artificial intelligence (AI), large language models (LLMs) have emerged as powerful tools for a wide range of natural language processing (NLP) tasks, including code generation. Among these cutting-edge models, Code Llama 70B stands out as a true heavyweight, boasting an impressive 70 billion parameters. Developed by Meta and now available on Amazon SageMaker, this state-of-the-art LLM promises to revolutionize the way developers and data scientists approach coding tasks.

What is Code Llama 70B and Mixtral 8x7B?

Code Llama 70B is a variant of the Code Llama foundation model (FM), a fine-tuned version of Meta’s renowned Llama 2 model. This massive language model is specifically designed for code generation and understanding, capable of generating code from natural language prompts or existing code snippets. With its 70 billion parameters, Code Llama 70B offers unparalleled performance and versatility, making it a game-changer in the world of AI-assisted coding.

Mixtral 8x7B is a state-of-the-art sparse mixture of experts (MoE) foundation model released by Mistral AI. It supports multiple use cases such as text summarization, classification, text generation, and code generation. It is an 8x model, which means it contains eight distinct groups of parameters. The model has about 45 billion total parameters and supports a context length of 32,000 tokens. MoE is a type of neural network architecture that consists of multiple experts” where each expert is a neural network. In the context of transformer models, MoE replaces some feed-forward layers with sparse MoE layers. These layers have a certain number of experts, and a router network selects which experts process each token at each layer. MoE models enable more compute-efficient and faster inference compared to dense models.

Key features and capabilities of Code Llama 70B and Mixtral 8x7B include:

Code generation: These LLMs excel at generating high-quality code across a wide range of programming languages, including Python, Java, C++, and more. They can translate natural language instructions into functional code, streamlining the development process and accelerating project timelines.
Code infilling: In addition to generating new code, they can seamlessly infill missing sections of existing code by providing the prefix and suffix. This feature is particularly valuable for enhancing productivity and reducing the time spent on repetitive coding tasks.
Natural language interaction: The instruct variants of Code Llama 70B and Mixtral 8x7B support natural language interaction, allowing developers to engage in conversational exchanges to develop code-based solutions. This intuitive interface fosters collaboration and enhances the overall coding experience.
Long context support: With the ability to handle context lengths of up to 48 thousand tokens, Code Llama 70B can maintain coherence and consistency over extended code segments or conversations, ensuring relevant and accurate responses. Mixtral 8x7B has a context window of 32 thousand tokens.
Multi-language support: While both of these models excel at generating code, their capabilities extend beyond programming languages. They can also assist with natural language tasks, such as text generation, summarization, and question answering, making them versatile tools for various applications.

Harnessing the power of Code Llama 70B and Mistral models on SageMaker

Amazon SageMaker, a fully managed machine learning service, provides a seamless integration with Code Llama 70B, enabling developers and data scientists to use its capabilities with just a few clicks. Here’s how you can get started:

One-click deployment: Code Llama 70B and Mixtral 8x7B are available in Amazon SageMaker JumpStart, a hub that provides access to pre-trained models and solutions. With a few clicks, you can deploy them and create a private inference endpoint for your coding tasks.
Scalable infrastructure: The SageMaker scalable infrastructure ensures that foundation models can handle even the most demanding workloads, allowing you to generate code efficiently and without delays.
Integrated development environment: SageMaker provides a seamless integrated development environment (IDE) that you can use to interact with these models directly from your coding environment. This integration streamlines the workflow and enhances productivity.
Customization and fine-tuning: While Code Llama 70B and Mixtral 8x7B are powerful out-of-the-box models, you can use SageMaker to fine-tune and customize a model to suit your specific needs, further enhancing its performance and accuracy.
Security and compliance: SageMaker JumpStart employs multiple layers of security, including data encryption, network isolation, VPC deployment, and customizable inference, to ensure the privacy and confidentiality of your data when working with LLMs

Solution overview

The following figure showcases how code generation can be done using the Llama and Mistral AI Models on SageMaker presented in this blog post.

You first deploy a SageMaker endpoint using an LLM from SageMaker JumpStart. For the examples presented in this article, you either deploy a Code Llama 70 B or a Mixtral 8x7B endpoint. After the endpoint has been deployed, you can use it to generate code with the prompts provided in this article and the associated notebook, or with your own prompts. After the code has been generated with the endpoint, you can use a notebook to test the code and its functionality.

Prerequisites

In this section, you sign up for an AWS account and create an AWS Identity and Access Management (IAM) admin user.

If you’re new to SageMaker, we recommend that you read What is Amazon SageMaker?.

Use the following hyperlinks to finish setting up the prerequisites for an AWS account and Sagemaker:

Create an AWS Account: This walks you through setting up an AWS account
When you create an AWS account, you get a single sign-in identity that has complete access to all of the AWS services and resources in the account. This identity is called the AWS account root user.
Signing in to the AWS Management Console using the email address and password that you used to create the account gives you complete access to all of the AWS resources in your account. We strongly recommend that you not use the root user for everyday tasks, even the administrative ones.
Adhere to the security best practices in IAM, and Create an Administrative User and Group. Then securely lock away the root user credentials and use them to perform only a few account and service management tasks.
In the console, go to the SageMaker console andopen the left navigation pane.
1. Under Admin configurations, choose Domains.
2. Choose Create domain.
3. Choose Set up for single user (Quick setup). Your domain and user profile are created automatically.
Follow the steps in Custom setup to Amazon SageMaker to set up SageMaker for your organization.

With the prerequisites complete, you’re ready to continue.

Code generation scenarios

The Mixtral 8x7B and Code Llama 70B models requires an ml.g5.48xlarge instance. SageMaker JumpStart provides a simplified way to access and deploy over 100 different open source and third-party foundation models. In order to deploy an endpoint using SageMaker JumpStart, you might need to request a service quota increase to access an ml.g5.48xlarge instance for endpoint use. You can request service quota increases through the AWS console, AWS Command Line Interface (AWS CLI), or API to allow access to those additional resources.

Code Llama use cases with SageMaker

While Code Llama excels at generating simple functions and scripts, its capabilities extend far beyond that. The models can generate complex code for advanced applications, such as building neural networks for machine learning tasks. Let’s explore an example of using Code Llama to create a neural network on SageMaker. Let us start with deploying the Code Llama Model through SageMaker JumpStart.

Launch SageMaker JumpStart
Sign in to the console, navigate to SageMaker, and launch the SageMaker domain to open SageMaker Studio. Within SageMaker Studio, select JumpStart in the left-hand navigation menu.
Search for Code Llama 70B
In the JumpStart model hub, search for Code Llama 70B in the search bar. You should see the Code Llama 70B model listed under the Models category.
Deploy the Model
Select the Code Llama 70B model, and then choose Deploy. Enter an endpoint name (or keep the default value) and select the target instance type (for example, ml.g5.48xlarge). Choose Deploy to start the deployment process. You can leave the rest of the options as default.

Additional details on deployment can be found in Code Llama 70B is now available in Amazon SageMaker JumpStart

Create an inference endpoint
After the deployment is complete, SageMaker will provide you with an inference endpoint URL. Copy this URL to use later.
Set set up your development environment
You can interact with the deployed Code Llama 70B model using Python and the AWS SDK for Python (Boto3). First, make sure you have the required dependencies installed: pip install boto3

Note: This blog post section contains code that was generated with the assistance of Code Llama70B powered by Amazon Sagemaker.

Generating a transformer model for natural language processing

Let us walk through a code generation example with Code Llama 70B where you will generate a transformer model in python using Amazon SageMaker SDK.

Prompt:

<s>[INST]
<<SYS>>You are an expert code assistant that can teach a junior developer how to code. Your language of choice is Python. Don't explain the code, just generate the code block itself. Always use Amazon SageMaker SDK for python code generation. Add test case to test the code<</SYS>>

Generate a Python code that defines and trains a Transformer model for text classification on movie dataset. The python code should use Amazon SageMaker's TensorFlow estimator and be ready for deployment on SageMaker.
[/INST]

Response:

Code Llama generates a Python script for training a Transformer model on the sample dataset using TensorFlow and Amazon SageMaker.

Code example:
Create a new Python script (for example, code_llama_inference.py) and add the following code. Replace <YOUR_ENDPOINT_NAME> with the actual inference endpoint name provided by SageMaker JumpStart:

import boto3
import json

# Set up the SageMaker client
session = boto3.Session()
sagemaker_client = session.client("sagemaker-runtime")

# Set the inference endpoint URL
endpoint_name = "<YOUR_ENDPOINT_NAME>"

def query_endpoint(payload):
    client = boto3.client('runtime.sagemaker')
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=json.dumps(payload).encode('utf-8'),
    )
    response = response["Body"].read().decode("utf8")
    response = json.loads(response)
    return response

def print_completion(prompt: str, response: str) -> None:
    bold, unbold = '33[1m', '33[0m'
    print(f"{bold}> Input{unbold}n{prompt}{bold}n> Output{unbold}n{response[0]['generated_text']}n")

# Define the input prompt 
prompt = """ 
<s>[INST]
<<SYS>>You are an expert code assistant that can teach a junior developer how to code. Your language of choice is Python. Don't explain the code, just generate the code block itself. Always use Amazon SageMaker SDK for python code generation. Add test case to test the code<</SYS>>
 
Generate a Python code that defines and trains a Transformer model for text classification on movie dataset. The python code should use Amazon SageMaker's TensorFlow estimator and be ready for deployment on SageMaker.
[/INST] """  

# Send the request to the endpoint and decode the response 
payload = {"inputs": prompt, "parameters": {"max_new_tokens": 2000, "temperature": 0.2, "top_p": 0.9}, }  
response = query_endpoint(payload) 
print_completion(prompt, response)

Save the script and run it:

python code_llama_inference.py

The script will send the provided prompt to the Code Llama 70B model deployed on SageMaker, and the model’s response will be printed to the output.

Example output:

Input

<s>[INST]
<<SYS>>
You are an expert AI assistant specializing in generating Python code for natural language processing (NLP) tasks on Amazon SageMaker. Your code should follow best practices, be well-documented, and include examples of usage.
<</SYS>>

Generate a Python script that defines and trains a Transformer model for text classification on movie dataset. The script should use Amazon SageMaker's TensorFlow estimator and be ready for deployment on SageMaker.
[/INST]

> Output

[RESP]
import os
import json
import sagemaker
import boto3
import numpy as np
import pandas as pd
from sagemaker.tensorflow import TensorFlow
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_20newsgroups

# Set up SageMaker session
sagemaker_session = sagemaker.Session()
region = boto3.Session().region_name
bucket = sagemaker_session.default_bucket()
prefix = "sagemaker/DEMO-xSum"

# Download dataset
train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

# Prepare dataset
train_data = pd.DataFrame(
{"text": train.data, "label": train.target.astype(np.int32)}
)
test_data = pd.DataFrame({"text": test.data, "label": test.target.astype(np.int32)})

# Split dataset
train_data, validation_data = train_test_split(train_data, test_size=0.2)

# Upload dataset to S3
train_input = sagemaker_session.upload_data(
path=train_data.to_csv(header=False, index=False),
bucket=bucket,
key_prefix=os.path.join(prefix, "train"),
)
validation_input = sagemaker_session.upload_data(
path=validation_data.to_csv(header=False, index=False),
bucket=bucket,
key_prefix=os.path.join(prefix, "validation"),
)
test_input = sagemaker_session.upload_data(
path=test_data.to_csv(header=False, index=False),
bucket=bucket,
key_prefix=os.path.join(prefix, "test"),
)

# Define hyperparameters
hyperparameters = {
"epochs": 1,
"batch_size": 64,
"max_seq_length": 256,
"model_dir": "/opt/ml/model",
"train_dir": "/opt/ml/input/data/train",
"validation_dir": "/opt/ml/input/data/validation",
"test_dir": "/opt/ml/input/data/test",
"train_filename": "train.csv",
"validation_filename": "validation.csv",
"test_filename": "test.csv",
"freeze_bert_layer": True,
"learning_rate": 0.0001,
"epsilon": 1e-08,
"bert_trainable_layers": None,
"bert_layer": -2,
"lambda_regularizer": 0.01,
"gradient_accumulation_steps": 2,
"steps_per_epoch": 100,
"validation_steps": 10,
"test_steps": 10,
"output_dir": "/opt/ml/output",
}

# Define estimator
estimator = TensorFlow(
entry_point="train.py",
source_dir="code",
role=sagemaker.get_execution_role(),
instance_count=1,
instance_type="ml.p3.2xlarge",
framework_version="2.4.1",
py_version="py37",
hyperparameters=hyperparameters,
)

# Train model
estimator.fit(
{
"train": train_input,
"validation": validation_input,
"test": test_input,
},
wait=True,
logs=True,
)

# Deploy model
predictor = estimator.deploy(
initial_instance_count=1, instance_type="ml.p3.2xlarge", wait=True
)

# Predict
predictor.predict(json.dumps({"instances": [test_data.iloc[0]["text"]]}))
<</RESP>>

You can modify the prompt variable to request different code generation tasks or engage in natural language interactions with the model.

This example demonstrates how to deploy and interact with the Code Llama 70B model on SageMaker JumpStart using Python and the AWS SDK. Because the model might be prone to minor errors in generating the response output, make sure you run the code. Further, you can instruct the model to fact-check the output and refine the model response in order to fix any other unnecessary errors in the code. With this setup, you can leverage the powerful code generation capabilities of Code Llama 70B within your development workflows, streamlining the coding process and unlocking new levels of productivity. Lets take a look at some additional examples.

Additional examples and use cases

Let’s walk through some other complex code generation scenarios. In the following sample, we’re running the script to generate a Deep Q reinforcement learning (RL) agent for playing the CartPole-v0 environment.

Generating a reinforcement learning agent

The following prompt was tested on Code Llama 70B to generate a Deep Q RL agent adept in playing CartPole-v0 environment.

Prompt:

<s>[INST]
<<SYS>>
You are a skilled AI assistant capable of generating Python code for reinforcement learning tasks on Amazon SageMaker. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>

Could you please generate a Python script that implements a Deep Q-Network (DQN) agent for playing the CartPole-v1 environment? The script should use Amazon SageMaker's TensorFlow estimator and be ready for deployment on SageMaker.
[/INST]

Response: Code Llama generates a Python script for training a DQN agent on the CartPole-v1 environment using TensorFlow and Amazon SageMaker as showcased in our GitHub repository.

Generating a distributed training script

In this scenario, you will generate a sample python code for distributed machine learning training on Amazon SageMaker using Code Llama 70B.

Prompt:

<s>[INST]
<<SYS>>
You are an expert AI assistant skilled in generating Python code for distributed machine learning training on Amazon SageMaker. Your code should be optimized for performance, follow best practices, and include examples of usage.
<</SYS>>

Could you please generate a Python script that performs distributed training of a deep neural network for image classification on the ImageNet dataset? The script should use Amazon SageMaker's PyTorch estimator with distributed data parallelism and be ready for deployment on SageMaker.
[/INST]

Response: Code Llama generates a Python script for distributed training of a deep neural network on the ImageNet dataset using PyTorch and Amazon SageMaker. Additional details are available in our GitHub repository.

Mixtral 8x7B use cases with SageMaker

Compared to traditional LLMs, Mixtral 8x7B offers the advantage of faster decoding at the speed of a smaller, parameter-dense model despite containing more parameters. It also outperforms other open-access models on certain benchmarks and supports a longer context length.

Launch SageMaker JumpStart
Sign in to the console, navigate to SageMaker, and launch the SageMaker domain to open SageMaker Studio. Within SageMaker Studio, select JumpStart in the left-hand navigation menu.
Search for Mixtral 8x7B Instruct
In the JumpStart model hub, search for Mixtral 8x7B Instruct in the search bar. You should see the Mixtral 8x7B Instruct model listed under the Models category.
Deploy the Model
Select the Code Llama 70B model, and then choose Deploy. Enter an endpoint name (or keep the default value) and choose the target instance type (for example, ml.g5.48xlarge). Choose Deploy to start the deployment process. You can leave the rest of the options as default.

Additional details on deployment can be found in Mixtral-8x7B is now available in Amazon SageMaker JumpStart.

Create an inference endpoint
After the deployment is complete, SageMaker will provide you with an inference endpoint URL. Copy this URL to use later.

Generating a hyperparameter tuning script for SageMaker

Hyperparameters are external configuration variables that data scientists use to manage machine learning model training. Sometimes called model hyperparameters, the hyperparameters are manually set before training a model. They’re different from parameters, which are internal parameters automatically derived during the learning process and not set by data scientists. Hyperparameters directly control model structure, function, and performance.

When you build complex machine learning systems like deep learning neural networks, exploring all the possible combinations is impractical. Hyperparameter tuning can accelerate your productivity by trying many variations of a model. It looks for the best model automatically by focusing on the most promising combinations of hyperparameter values within the ranges that you specify. To get good results, you must choose the right ranges to explore.

SageMaker automatic model tuning (AMT) finds the best version of a model by running many training jobs on your dataset. To do this, AMT uses the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that creates a model that performs the best, as measured by a metric that you choose.

Note: This blog post section contains code that was generated with the assistance of Mixtral 8X7B model, powered by Amazon Sagemaker.

Prompt:

<s>[INST]
<<SYS>>
You are an AI assistant specializing in generating Python code for hyperparameter tuning of machine learning models on Amazon SageMaker. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>

Generate a Python script that performs hyperparameter tuning of an XGBoost model for regression on the Boston Housing dataset. The script should use Amazon SageMaker's XGBoost estimator and be ready for deployment on SageMaker. It should optimize for validation:accuracy, and optimize the hyperparameters alpha and eta.

[/INST]

Response:

Here's a Python script that performs hyperparameter tuning of an XGBoost model for regression on the Boston Housing dataset using Amazon SageMaker. This script uses the XGBoost estimator and optimizes for validation:accuracy while tuning the hyperparameters `alpha` and `eta`.

```python
import boto3
import numpy as np
import pandas as pd
from sagemaker.xgboost.estimator import XGBoost
from sagemaker.tuner import IntegerParameter, ContinuousParameter, HyperparameterTuner

# Set up SageMaker session and specify the IAM role
sagemaker_session = boto3.Session(region_name='us-west-2')
role = sagemaker_session.get_execution_role()

# Specify the SageMaker XGBoost image version
xgboost_image = 'xgboost:1.3-1'

# Define the hyperparameter tuning configuration
hyperparameter_ranges = {
    'alpha': ContinuousParameter(0.1, 1.0),
    'eta': ContinuousParameter(0.01, 0.3)
}

objective_metric_name = 'validation:accuracy'
metric_definitions = [
    {
        'Name': 'validation:accuracy',
        'Regex': 'validation-accuracy: ([0-9\.]+)'
    }
]

tuner = HyperparameterTuner(
    XGBoost(
        role=role,
        train_instance_count=1,
        train_instance_type='ml.m5.large',
        sagemaker_session=sagemaker_session,
        hyperparameters={'objective': 'reg:squarederror', 'num_boost_round': 100},
        base_job_name='xgboost-tuner-boston'
    ),
    objective_metric_name,
    hyperparameter_ranges,
    metric_definitions,
    max_jobs=10,
    max_parallel_jobs=3
)

# Load the Boston Housing dataset
boston_data = pd.
read_csv('boston_housing.csv')
train_data = boston_data.sample(frac=0.8, random_state=42)
test_data = boston_data.drop(train_data.index)

train_input_path = sagemaker_session.upload_data(path=train_data, key_prefix='xgboost-tuner/input/train')
test_input_path = sagemaker_session.upload_data(path=test_data, key_prefix='xgboost-tuner/input/test')

# Set up the training and validation datasets
train_channel = sagemaker_session.input_channel(name='train', path=train_input_path)
test_channel = sagemaker_session.input_channel(name='test', path=test_input_path)

# Define the SageMaker XGBoost estimator for training
xgboost_estimator = XGBoost(
    role=role,
    train_instance_count=1,
    train_instance_type='ml.m5.large',
    sagemaker_session=sagemaker_session,
    hyperparameters={'objective': 'reg:squarederror', 'num_boost_round': 100},
    base_job_name='xgboost-tuner-boston'
)

# Perform hyperparameter tuning
tuner.fit({'train': train_channel, 'test': test_channel}, include_cls_metadata=False)

# Get the best hyperparameters and create a new XGBoost estimator
best_training_job = tuner.best_training_job()
best_hyperparameters = tuner.best_hyperparameters
print(f'Best hyperparameters: {best_hyperparameters}')

new_xgboost_estimator = XGBoost(
    role=role,
    train_instance_count=1,
    train_instance_type='ml.m5.large',
    sagemaker_session=sagemaker_session,
    hyperparameters=best_hyperparameters,
    base_job_name='xgboost-tuner-boston'
)

# Train the model with the best hyperparameters
new_xgboost_estimator.fit({'train': train_channel, 'test': test_channel}, include_cls_metadata=False)

# Save the trained model for deployment
model_data = new_xgboost_estimator.create_model()
model_name = 'xgboost-tuner-boston-model'
sagemaker_session.create_model(model_name=model_name, model_data=model_data)

# Deploy the model
predictor = sagemaker_session.create_predictor(model_name,
                                                base_job_name='xgboost-tuner-boston',
                                                deploy_prebuilt_image=True,
                                                entry_point='inference.py',
                                                source_dir='xgboost-tuner',
                                                instance_type='ml.m5.large',
                                                initial_instance_count=1)
```

Replace `'boston_housing.csv'` with the path to your Boston Housing dataset. This script assumes that the dataset has columns named `'CRIM'`, `'ZN'`, `'INDUS'`, `'CHAS'`, `'NOX'`, `'RM'`, `'AGE'`, `'DIS'`, `'RAD'`, `'TAX'`, `'PTRATIO'`, `'B'`, `'LSTAT'`, and `'MEDV'`.

To run this script, you need to have the AWS CLI and Boto3 installed. Make sure to configure your AWS credentials using `aws configure`.

The script also requires a file named `inference.py` in the `xgboost-tuner` directory for deployment. Here's a simple example of `inference.py`:

```python
import json
import xgboost as xgb
import numpy as np

def model_fn(model_dir):
    model = xgb.Booster(model_file=os.path.join(model_dir, 'model.joblib'))
    return model

def input_fn(request_body, content_type):
    if content_type == 'text/csv':
        features = np.loadtxt(pd.compat.StringIO(request_body), delimiter=',')
        return {'features': features.reshape(1, -1)}
    raise ValueError("Request body must be csv-formatted.")

def predict_fn(input_data, model):
    features = input_data['features']
    prediction = model.predict(features)
    return {'prediction': float(prediction)}
```

Make sure to install the required libraries (`boto3`, `numpy`, `pandas`, `xgboost`, and `sagemaker`) before running the script.

Code Transformation: Java to Python

There are instances where users need to convert code written in one programing language to another. This is known as a cross-language transformation task, and foundation models can help automate the process.

Prompt:

<s>[INST]
<<SYS>>
You are a programmer specialized in Java and Python languages. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>
Convert the following Java code to its equivalent in Python:

<java>
import java.util.ArrayList;

public class Main {
public static void main(String[] args) {
ArrayList<String> cars = new ArrayList<String>();
cars.add("Volvo");
cars.add("BMW");
cars.add("Ford");
cars.add("Mazda");
for (int i = 0; i < cars.size(); i++) {
System.out.println(cars.get(i));
}
}
}
</java>
[/INST]

Response:

<python>
import ArrayList from java.util import ArrayList

def main():
# Create an ArrayList of strings
cars = ArrayList[str]()

# Add elements to the ArrayList
cars.add("Volvo")
cars.add("BMW")
cars.add("Ford")
cars.add("Mazda")

# Iterate over the ArrayList and print each element
for i in range(cars.size()):
print(cars.get(i))

if __name__ == "__main__":
main()
</python>

Note: This Python code uses the Java ArrayList class, which is part of the JPype library that allows Python programs to call Java code. 
The code above will only work in an environment that has JPype installed and configured.

For pure Python solution, here is the equivalent code:

<python>
def main():
# Create a list of strings
cars = ["Volvo", "BMW", "Ford", "Mazda"]

# Iterate over the list and print each element
for car in cars:
print(car)

if __name__ == "__main__":
main()
</python>

This Python code uses a built-in list data structure instead of the Java ArrayList class. The code above is more idiomatic and efficient in Python.

AWS CDK code for a three-tier web application

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework for defining cloud infrastructure as code with modern programming languages and deploying it through AWS CloudFormation.

The three-tier architecture pattern provides a general framework to ensure decoupled and independently scalable application components can be separately developed, managed, and maintained (often by distinct teams). A three-tier architecture is the most popular implementation of a multi-tier architecture and consists of a single presentation tier, logic tier, and data tier:

Presentation tier: Component that the user directly interacts with (for example, webpages and mobile app UIs).
Logic tier: Code required to translate user actions to application functionality (for example, CRUD database operations and data processing).
Data tier: Storage media (for example, databases, object stores, caches, and file systems) that hold the data relevant to the application.

Prompt:

<s><<SYS>>
You are a programmer specialized in AWS, and programming in Python and the AWS Cloud Development Kit. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>[INST]
Create a CDK script in Python for configuring and deploying a three tier web application in AWS. The architecture should consist of these three tiers:

- Presentation tier: The front-end application is all static content (HTML, CSS, JavaScript and images) which are generated by React utilities like create-react-app. Amazon CloudFront hosts all these objects. The web application, when used, downloads all the resources to the browser and starts to run from there. The web application connects to the backend calling the APIs.

- Logic tier: Logic layer is built using Lambda functions fronted by API Gateway REST APIs. This architecture shows multiple exposed services. There are multiple different Lambda functions each handling a different aspect of the application. The Lambda functions are behind API Gateway and accessible using API URL paths.

- Data tier: DynamoDB is used for the data storage.
[/INST]

Response:

Sure, here's an example CDK script in Python that configures and deploys a three-tier web application in AWS:
```python
from aws_cdk import (
aws_apigateway as apigw,
aws_lambda as _lambda,
aws_lambda_event_sources as lambda_event_sources,
aws_logs as logs,
aws_dynamodb as ddb,
core
)

class ThreeTierWebAppStack(core.Stack):

def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
super().__init__(scope, id, **kwargs)

# Data tier
self.table = ddb.Table(
self, "DataTable",
partition_key=ddb.Attribute(name="id", type=ddb.AttributeType.STRING),
billing_mode=ddb.BillingMode.PAY_PER_REQUEST
)

# Logic tier
self.lambda_function = _lambda.Function(
self, "LambdaFunction",
runtime=_lambda.Runtime.PYTHON_3_8,
code=_lambda.Code.from_asset("lambda"),
handler="lambda_function.lambda_handler",
log_retention=logs.RetentionDays.ONE_DAY,
environment={
"TABLE_NAME": self.table.table_name
}
)

self.lambda_function.add_to_role_policy(
core.PolicyStatement(
actions=[
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:UpdateItem",
"dynamodb:DeleteItem"
],
resources=[self.table.table_arn]
)
)

self.api = apigw.LambdaRestApi(
self, "Api",
handler=self.lambda_function,
proxy=False
)

# Presentation tier
self.distribution = aws_cloudfront.Distribution(
self, "Distribution",
default_root_object="index.html",
default_behavior_options=aws_cloudfront.BehaviorOptions(
origin=aws_cloudfront.Origin(
domain_name=self.api.url,
origin_path="/integration/v1"
),
viewer_protocol_policy=aws_cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS
)
)
```
Here's a brief explanation of the code:

* The `ThreeTierWebAppStack` class extends the `core.Stack` class and is the main entry point for the CDK script.
* The `table` attribute is a DynamoDB table used for data storage.
* The `lambda_function` attribute is a Lambda function that handles the business logic of the application. It has an IAM role that allows it to read and write to the DynamoDB table.
* The `api` attribute is an API Gateway REST API that exposes the Lambda function as a web service.
* The `distribution` attribute is a CloudFront distribution that serves the static content of the application. It uses the API Gateway as the origin and redirects all HTTP traffic to HTTPS.

To use this script, you need to create a `lambda` directory in the same directory as the script and put your Lambda function code in a file named `lambda_function.py` in that directory. You also need to install the AWS CDK and its dependencies using `pip install aws-cdk.core aws-cdk.aws-lambda aws-cdk.aws-lambda_event_sources aws-cdk.aws-logs aws-cdk.aws-dynamodb aws-cdk.aws-apigateway aws-cdk.aws-cloudfront`.

Once you have done that, you can deploy the stack using `cdk deploy`. This will create the DynamoDB table, the Lambda function, the API Gateway, and the CloudFront distribution. The URL of the CloudFront distribution will be printed to the console after the deployment is complete. You can use this URL to access the web application.

Additional considerations

The following are some additional considerations when implementing these models:

Different models will produce different results, so you should conduct experiments with different foundation models and different prompts for your use case to achieve the desired results.
The analyses provided are not meant to replace human judgement. You should be mindful of potential hallucinations when working with generative AI, and use the analysis only as a tool to assist and speed up code generation.

Clean up

Delete the model endpoints deployed using Amazon SageMaker for Code Llama and Mistral to avoid incurring any additional costs in your account.

Shut down any SageMaker Notebook instances that were created for deploying or running the examples showcased in this blog post to avoid any notebook instance costs associated with the account.

Conclusion

The combination of exceptional capabilities from foundation models like Code Llama 70B and Mixtral 8x7B and the powerful machine learning platform of Sagemaker, presents a unique opportunity for developers and data scientists to revolutionize their coding workflows. The cutting-edge capabilities of FMs empower customers to generate high-quality code, infill missing sections, and engage in natural language interactions, all while using the scalability, security, and compliance of AWS.

The examples highlighted in this blog post demonstrate these models’ advanced capabilities in generating complex code for various machine learning tasks, such as natural language processing, reinforcement learning, distributed training, and hyperparameter tuning, all tailored for deployment on SageMaker. Developers and data scientists can now streamline their workflows, accelerate development cycles, and unlock new levels of productivity in the AWS Cloud.

Embrace the future of AI-assisted coding and unlock new levels of productivity with Code Llama 70B and Mixtral 8x7B on Amazon SageMaker. Start your journey today and experience the transformative power of this groundbreaking language model.

References

About the Authors

Shikhar Kwatra is an AI/ML Solutions Architect at Amazon Web Services based in California. He has earned the title of one of the Youngest Indian Master Inventors with over 500 patents in the AI/ML and IoT domains. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partners in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.

Jose Navarro is an AI/ML Solutions Architect at AWS based in Spain. Jose helps AWS customers—from small startups to large enterprises—architect and take their end-to-end machine learning use cases to production. In his spare time, he loves to exercise, spend quality time with friends and family, and catch up on AI news and papers.

Farooq Sabir is a Senior Artificial Intelligence and Machine Learning Specialist Solutions Architect at AWS. He holds PhD and MS degrees in Electrical Engineering from the University of Texas at Austin and an MS in Computer Science from Georgia Institute of Technology. He has over 15 years of work experience and also likes to teach and mentor college students. At AWS, he helps customers formulate and solve their business problems in data science, machine learning, computer vision, artificial intelligence, numerical optimization, and related domains. Based in Dallas, Texas, he and his family love to travel and go on long road trips.

Vedere AI

Code generation using Code Llama 70B and Mixtral 8x7B on Amazon SageMaker

What is Code Llama 70B and Mixtral 8x7B?

Harnessing the power of Code Llama 70B and Mistral models on SageMaker

Solution overview

Prerequisites

Code generation scenarios

Code Llama use cases with SageMaker

Generating a transformer model for natural language processing

Additional examples and use cases

Generating a reinforcement learning agent

Generating a distributed training script

Mixtral 8x7B use cases with SageMaker

Generating a hyperparameter tuning script for SageMaker

Code Transformation: Java to Python

AWS CDK code for a three-tier web application

Additional considerations

Clean up

Conclusion

References

About the Authors

Navigation

GenAI Vision Endless Possibilities

"I'm interested in things that change the world or that affect the future and wondrous, new technology where you see it, and you're like, 'Wow, how did that even happen? How is that possible?'" -- Elon Musk

Copyright © 2019-2025 Vedere AI. All Rights Reserved.