Set up Amazon SageMaker Studio with JupyterLab 3 using the AWS CDK

Amazon SageMaker Studio is a fully integrated development environment (IDE) for machine learning (ML) partly based on JupyterLab 3. Studio provides a web-based interface to interactively perform ML development tasks required to prepare data and build, train, and deploy ML models. In Studio, you can load data, adjust ML models, move in between steps to adjust experiments, compare results, and deploy ML models for inference.

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to create AWS CloudFormation stacks through automatic CloudFormation template generation. A stack is a collection of AWS resources that can be programmatically updated, moved, or deleted. AWS CDK constructs are the building blocks of AWS CDK applications, representing the blueprint to define cloud architectures.

Setting up Studio with the AWS CDK has become a streamlined process. The AWS CDK allows you to use native constructs to define and deploy Studio using infrastructure as code (IaC), including AWS Identity and Access Management (IAM) permissions and desired cloud resource configurations, all in one place. This development approach can be used in combination with other common software engineering best practices such as automated code deployments, tests, and CI/CD pipelines. The AWS CDK reduces the time required to perform typical infrastructure deployment tasks while shrinking the surface area for human error through automation.

This post guides you through the steps to set up and deploy Studio to standardize ML model development and collaboration with fellow ML engineers and ML scientists. All examples in the post are written in Python. However, the AWS CDK offers built-in support for multiple other programming languages such as JavaScript, Java, and C#.

Prerequisites

To get started, the following prerequisites apply:

Clone the GitHub repository

First, let’s clone the GitHub repository.

When the repository is successfully pulled, you may inspect the cdk directory containing the following resources:

  • cdk – Contains the main cdk resources
  • app.py – Where the AWS CDK stack is defined
  • cdk.json – Contains metadata and feature flags

AWS CDK scripts

The two main files we want to look at in the cdk subdirectory are sagemaker_studio_construct.py and sagemaker_studio_stack.py. Let’s look at each file in more detail.

Studio construct file

The Studio construct is defined in the sagemaker_studio_construct.py file.

The Studio construct takes in the virtual private cloud (VPC), listed users, AWS Region, and underlying default instance type as parameters. This AWS CDK construct serves the following functions:

  • Creates the Studio domain (SageMakerStudioDomain)
  • Sets the IAM role sagemaker_studio_execution_role with AmazonSageMakerFullAccess permissions required to create resources. Permissions need to be scoped down further to follow the least privilege principle for improved security.
  • Sets Jupyter server app settings – takes in JUPYTER_SERVER_APP_IMAGE_NAME, defining the jupyter-server-3 container image to be used.
  • Sets kernel gateway app settings – takes in KERNEL_GATEWAY_APP_IMAGE_NAME, defining the datascience-2.0 container image to be used.
  • Creates a user profile for each listed user
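The image names in the list above are turned into image ARNs by a helper called get_sagemaker_image_arn, whose body is not shown in this post. The following is a minimal sketch of what such a helper could look like, not the repository's actual implementation. The SAGEMAKER_IMAGE_ACCOUNTS mapping is hypothetical and its account IDs are placeholders; the real per-Region account IDs for built-in SageMaker images need to be taken from the SageMaker documentation. The sketch also assumes the standard aws partition.

```python
# Hypothetical mapping from AWS Region to the AWS-owned account that hosts
# the built-in SageMaker images. The IDs below are PLACEHOLDERS, not real
# account IDs; replace them with the values from the SageMaker documentation.
SAGEMAKER_IMAGE_ACCOUNTS = {
    "us-east-1": "111111111111",
    "eu-west-1": "222222222222",
}

# Image names as stated in this post.
JUPYTER_SERVER_APP_IMAGE_NAME = "jupyter-server-3"
KERNEL_GATEWAY_APP_IMAGE_NAME = "datascience-2.0"


def get_sagemaker_image_arn(image_name: str, aws_region: str) -> str:
    """Build the ARN of a built-in SageMaker image for the given Region.

    Assumes the standard "aws" partition.
    """
    try:
        account_id = SAGEMAKER_IMAGE_ACCOUNTS[aws_region]
    except KeyError:
        raise ValueError(
            f"No image account configured for Region {aws_region!r}"
        ) from None
    return f"arn:aws:sagemaker:{aws_region}:{account_id}:image/{image_name}"


if __name__ == "__main__":
    print(get_sagemaker_image_arn(JUPYTER_SERVER_APP_IMAGE_NAME, "us-east-1"))
    print(get_sagemaker_image_arn(KERNEL_GATEWAY_APP_IMAGE_NAME, "us-east-1"))
```

Failing loudly on an unknown Region keeps a misconfigured deployment from reaching AWS CloudFormation with a malformed ARN.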

The following code snippet shows the relevant Studio domain AWS CloudFormation resources defined in AWS CDK:

sagemaker_studio_domain = sagemaker.CfnDomain(
    self,
    "SageMakerStudioDomain",
    auth_mode="IAM",
    default_user_settings=sagemaker.CfnDomain.UserSettingsProperty(
        execution_role=self.sagemaker_studio_execution_role.role_arn,
        jupyter_server_app_settings=sagemaker.CfnDomain.JupyterServerAppSettingsProperty(
            default_resource_spec=sagemaker.CfnDomain.ResourceSpecProperty(
                instance_type="system",
                sage_maker_image_arn=get_sagemaker_image_arn(
                    JUPYTER_SERVER_APP_IMAGE_NAME, aws_region
                ),
            )
        ),
        kernel_gateway_app_settings=sagemaker.CfnDomain.KernelGatewayAppSettingsProperty(
            default_resource_spec=sagemaker.CfnDomain.ResourceSpecProperty(
                instance_type=default_instance_type,
                sage_maker_image_arn=get_sagemaker_image_arn(
                    KERNEL_GATEWAY_APP_IMAGE_NAME, aws_region
                ),
            ),
        ),
        security_groups=[vpc.vpc_default_security_group],
        sharing_settings=sagemaker.CfnDomain.SharingSettingsProperty(
            notebook_output_option="Disabled"
        ),
    ),
    domain_name="SageMakerStudioDomain",
    subnet_ids=private_subnets,
    vpc_id=vpc.vpc_id,
    app_network_access_type="VpcOnly",
)
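The keyword arguments passed to CfnDomain correspond to properties of the AWS::SageMaker::Domain CloudFormation resource, with the Python names written in snake_case and the CloudFormation names in PascalCase. The following short sketch illustrates that correspondence for names that appear in the snippet above. The to_cloudformation_name function is a hypothetical helper written for this illustration; it is not part of the AWS CDK and is not claimed to cover every naming edge case.

```python
def to_cloudformation_name(python_name: str) -> str:
    """Convert a snake_case keyword such as 'vpc_id' to PascalCase ('VpcId')."""
    return "".join(part.capitalize() for part in python_name.split("_"))


# Keyword arguments taken from the Studio domain snippet.
domain_keywords = [
    "auth_mode",
    "domain_name",
    "subnet_ids",
    "vpc_id",
    "app_network_access_type",
    "sage_maker_image_arn",
]

if __name__ == "__main__":
    for keyword in domain_keywords:
        print(f"{keyword:26} -> {to_cloudformation_name(keyword)}")
```

This also explains the slightly unusual spelling sage_maker_image_arn: it mirrors the CloudFormation property SageMakerImageArn.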

The following code snippet shows how a user profile resource is created for each listed user:

for user_name in user_names:
    sagemaker.CfnUserProfile(
        self,
        "SageMakerStudioUserProfile_" + user_name,
        domain_id=sagemaker_studio_domain.attr_domain_id,
        user_profile_name=user_name,
    )
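Because each user name becomes part of a construct ID, two identical names in user_names would produce two constructs with the same ID in the same scope, which the AWS CDK rejects at synthesis time. The following sketch shows one way to check the list before the loop runs. The validate_user_names function and the USER_PROFILE_NAME_PATTERN constant are hypothetical names introduced here, and the naming rule encoded in the pattern is an assumption that should be checked against the SageMaker API reference.

```python
import re

# Assumed naming rule: alphanumeric characters and hyphens, starting and
# ending with an alphanumeric character, at most 63 characters in total.
# Verify against the SageMaker API reference before relying on it.
USER_PROFILE_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def validate_user_names(user_names):
    """Return the names as a list, or raise ValueError on the first problem."""
    seen = set()
    for name in user_names:
        if not USER_PROFILE_NAME_PATTERN.fullmatch(name):
            raise ValueError(f"Invalid user profile name: {name!r}")
        if name in seen:
            # A repeated name would yield a duplicate construct ID.
            raise ValueError(f"Duplicate user profile name: {name!r}")
        seen.add(name)
    return list(user_names)


if __name__ == "__main__":
    print(validate_user_names(["data-scientist-1", "ml-engineer-1"]))
```

Calling such a check at the top of the construct surfaces naming problems before anything is sent to AWS CloudFormation.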

Studio stack file

After the construct has been defined, you can add it to the stack by creating an instance of the class and passing the required arguments. The stack creates the AWS CloudFormation resources as part of one coherent deployment. This means that if at least one cloud resource fails to be created, the CloudFormation stack rolls back any changes performed. The following code snippet shows the Studio construct being instantiated inside the Studio stack:

# Imports assume AWS CDK v2. SageMakerStudio is the construct defined in
# sagemaker_studio_construct.py.
from aws_cdk import Stack, aws_ec2 as ec2
from constructs import Construct

class SagemakerStudioStack(Stack):
    def __init__(
        self,
        scope: Construct,
        construct_id: str,
        **kwargs,
    ) -> None:
        super().__init__(scope, construct_id, **kwargs)
        vpc = ec2.Vpc(self, "SageMakerStudioVpc")
        SageMakerStudio(self, "SageMakerStudio", vpc=vpc, aws_region=self.region)

Deploy the AWS CDK stack

To deploy your AWS CDK stack, run the following commands from the project’s root directory within your terminal window:

aws configure
pip3 install -r requirements.txt
cdk bootstrap --app "python3 -m cdk.app"
cdk deploy --app "python3 -m cdk.app"

Review the resources the AWS CDK creates in your AWS account and select yes when prompted to deploy the stack. Wait for your stack deployment to finish. This typically takes less than 5 minutes; however, adding more resources will prolong deployment time. You can also check the deployment status on the AWS CloudFormation console.

Stack creation in CloudFormation

When the stack has been successfully deployed, check its information by going to the Studio Control Panel. You should see the SageMaker Studio user profile you created.

Default user profile listed

If you redeploy the stack, the AWS CDK checks for changes and performs only the necessary cloud resource updates. For example, you can add users or change their permissions without having to recreate all of the defined cloud resources.
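To make the idea of an incremental update concrete, the following toy sketch compares the user names that are currently deployed with the names in an edited list and reports which user profiles a redeployment would add, remove, or leave alone. The plan_user_profile_changes function is hypothetical and only models the outcome; it is not how AWS CloudFormation computes changes internally. For the real differences between your code and the deployed stack, you can run cdk diff.

```python
def plan_user_profile_changes(deployed, desired):
    """Classify user profiles by comparing deployed and desired user names."""
    deployed_set, desired_set = set(deployed), set(desired)
    return {
        "create": sorted(desired_set - deployed_set),
        "delete": sorted(deployed_set - desired_set),
        "unchanged": sorted(deployed_set & desired_set),
    }


if __name__ == "__main__":
    # Example names are illustrative only.
    plan = plan_user_profile_changes(
        deployed=["alice", "bob"],
        desired=["alice", "bob", "carol"],
    )
    for action, names in plan.items():
        print(f"{action}: {names}")
```

In this example only one user profile would be created; the Studio domain and the existing profiles stay in place.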

Cleanup

To delete a stack, complete the following steps:

  1. On the AWS CloudFormation console, choose Stacks in the navigation pane.
  2. Open the stack you want to delete.
  3. In the stack details pane, choose Delete.
  4. Choose Delete stack when prompted.

AWS CloudFormation will delete the resources created when the stack was deployed. This may take some time depending on the number of resources created.

If you encounter any issues going through these cleanup steps, you may need to manually delete the Studio domain first before repeating the steps in this section.

Conclusion

In this post, we showed how to use AWS cloud-native IaC resources to build an easily reusable template for Studio deployments. SageMaker Studio is a fully integrated web-based IDE that provides a visual interface for ML development tasks based on JupyterLab 3. With AWS CDK stacks, we were able to define constructs for building out cloud components that can be easily modified, edited, or deleted by making changes to the underlying CloudFormation stack.

For more information about Studio, see Amazon SageMaker Studio.


About the Authors

Cory Hairston is a Software Engineer at the Amazon ML Solutions Lab. He is ardent about learning new technologies and leveraging that information to build reusable software solutions. He is an avid power-lifter and spends his free time making digital art.

Marcelo Aberle is an ML Engineer in the AWS AI organization. He is leading MLOps efforts at the Amazon ML Solutions Lab, helping customers design and implement scalable ML systems. His mission is to guide customers on their enterprise ML journey and accelerate their ML path to production.

Yash Shah is a Science Manager in the Amazon ML Solutions Lab. He and his team of applied scientists and machine learning engineers work on a range of machine learning use cases across healthcare, sports, automotive, and manufacturing.
