Provision and manage ML environments with Amazon SageMaker Canvas using AWS CDK and AWS Service Catalog

Machine learning (ML) is spreading across a wide range of use cases in every industry. However, this growth outpaces the increase in the number of ML practitioners, who have traditionally been responsible for implementing these technical solutions to realize business outcomes.

In today’s enterprise, ML needs to be usable by non-ML practitioners who are proficient with data, which is the foundation of ML. No-code ML platforms make this possible. They enable different personas, for example business analysts, to use ML without writing a single line of code and to deliver solutions to business problems in a quick, simple, and intuitive manner. Amazon SageMaker Canvas is a visual point-and-click service that enables business analysts to solve business problems by generating accurate predictions on their own, without requiring any ML experience or writing any code. With its simple, intuitive interface, Canvas has expanded the use of ML in the enterprise and helps businesses implement solutions quickly.

Although Canvas has helped democratize ML, the challenge of provisioning and deploying ML environments in a secure manner remains. In most large enterprises, this is typically the responsibility of central IT teams. In this post, we discuss how IT teams can administer, provision, and manage secure ML environments using Amazon SageMaker Canvas, AWS Cloud Development Kit (AWS CDK), and AWS Service Catalog. The post presents a step-by-step guide for IT administrators to achieve this quickly and at scale.

Overview of the AWS CDK and AWS Service Catalog

The AWS CDK is an open-source software development framework to define your cloud application resources. It uses the familiarity and expressive power of programming languages for modeling your applications, while provisioning resources in a safe and repeatable manner.

AWS Service Catalog lets you centrally manage deployed IT services, applications, resources, and metadata. With AWS Service Catalog, you can create, share, organize, and govern cloud resources with infrastructure as code (IaC) templates and enable fast and straightforward provisioning.

Solution overview

We enable provisioning of ML environments using Canvas in three steps:

  1. First, we share how you can manage a portfolio of resources necessary for the approved usage of Canvas using AWS Service Catalog.
  2. Then, we deploy an example AWS Service Catalog portfolio for Canvas using the AWS CDK.
  3. Finally, we demonstrate how you can provision Canvas environments on demand within minutes.

Prerequisites

To provision ML environments with Canvas, the AWS CDK, and AWS Service Catalog, you need to do the following:

  1. Have access to the AWS account where the Service Catalog portfolio will be deployed. Make sure you have the credentials and permissions to deploy the AWS CDK stack into your account. The AWS CDK Workshop is a helpful resource you can refer to if you need support.
  2. We recommend following certain best practices that are highlighted through the concepts detailed in the following resources:
  3. Clone this GitHub repository into your environment.

Provision approved ML environments with Amazon SageMaker Canvas using AWS Service Catalog

In regulated industries and most large enterprises, you need to adhere to the requirements mandated by IT teams to provision and manage ML environments. These may include a secure, private network, data encryption, controls such as AWS Identity and Access Management (IAM) that allow only authorized and authenticated users to access solutions such as Canvas, and strict logging and monitoring for audit purposes.

As an IT administrator, you can use AWS Service Catalog to create and organize secure, reproducible ML environments with SageMaker Canvas into a product portfolio. The portfolio is managed with embedded IaC controls that meet the requirements described earlier, and its products can be provisioned on demand within minutes. You can also control who can access this portfolio to launch products.

The following diagram illustrates this architecture.

Example flow

In this section, we demonstrate an example of an AWS Service Catalog portfolio with SageMaker Canvas. The portfolio consists of the following products, each covering a different aspect of the Canvas environment:

  • Studio domain – Canvas is an application that runs within Studio domains. The domain consists of an Amazon Elastic File System (Amazon EFS) volume, a list of authorized users, and a range of security, application, policy, and Amazon Virtual Private Cloud (VPC) configurations. An AWS account is linked to one domain per Region.
  • Amazon S3 bucket – After the Studio domain is created, an Amazon Simple Storage Service (Amazon S3) bucket is provisioned for Canvas to allow importing datasets from local files, also known as local file upload. This bucket is in the customer’s account and is provisioned once.
  • Canvas user – Each Canvas user gets a user profile within the Studio domain. With that profile, the user can import datasets, build and train ML models without writing code, and run predictions on the model.
  • Scheduled shutdown of Canvas sessions – Canvas users can log out from the Canvas interface when they’re done with their tasks. Alternatively, administrators can shut down Canvas sessions from the AWS Management Console as part of managing the Canvas sessions. In this part of the AWS Service Catalog portfolio, an AWS Lambda function is created and provisioned to automatically shut down Canvas sessions at defined scheduled intervals. This helps manage open sessions and shut them down when not in use.

This example flow can be found in the GitHub repository for quick reference.
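The scheduled shutdown product is described only at a high level above. As a rough illustration, and not the repository's actual Lambda code, the core of such a function might look like the following sketch. It assumes the client passed in behaves like a boto3 SageMaker client (its list_apps and delete_app operations, with an AppType of Canvas). The function name and the in-memory fake client are hypothetical and exist only so the sketch can run without AWS access.

```python
from typing import Any, Dict, List


def shut_down_canvas_apps(sm_client: Any, domain_id: str) -> List[str]:
    """Delete every running Canvas app in a domain and return the affected user profiles.

    sm_client is assumed to behave like boto3.client("sagemaker").
    """
    stopped: List[str] = []
    kwargs: Dict[str, Any] = {"DomainIdEquals": domain_id}
    while True:
        page = sm_client.list_apps(**kwargs)
        for app in page.get("Apps", []):
            # Only touch Canvas apps that are currently running.
            if app.get("AppType") == "Canvas" and app.get("Status") == "InService":
                sm_client.delete_app(
                    DomainId=app["DomainId"],
                    UserProfileName=app["UserProfileName"],
                    AppType="Canvas",
                    AppName=app["AppName"],
                )
                stopped.append(app["UserProfileName"])
        token = page.get("NextToken")
        if not token:
            return stopped
        kwargs["NextToken"] = token  # fetch the next page of results


class FakeSageMakerClient:
    """Hypothetical in-memory stand-in for the SageMaker client."""

    def __init__(self, apps: List[Dict[str, str]]) -> None:
        self.apps = apps
        self.deleted: List[Dict[str, str]] = []

    def list_apps(self, **kwargs: Any) -> Dict[str, Any]:
        domain = kwargs["DomainIdEquals"]
        return {"Apps": [a for a in self.apps if a["DomainId"] == domain]}

    def delete_app(self, **kwargs: str) -> None:
        self.deleted.append(kwargs)
```

In a Lambda handler triggered on a schedule, you would pass a real SageMaker client and the domain ID instead of the fake. Injecting the client keeps the shutdown logic testable in isolation.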

Deploy the flow with the AWS CDK

In this section, we deploy the flow described earlier using the AWS CDK. After it’s deployed, you can also track versions and manage the portfolio.

The portfolio stack can be found in app.py and the product stacks under the products/ folder. You can iterate on the IAM roles, AWS Key Management Service (AWS KMS) keys, and VPC setup in the studio_constructs/ folder. Before deploying the stack into your account, you can edit the following lines in app.py and grant portfolio access to an IAM role of your choice.

You can manage access to the portfolio for the relevant IAM users, groups, and roles. See Granting Access to Users for more details.
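Outside of app.py, the same kind of access grant can also be expressed against the Service Catalog API. The sketch below is not the repository's code. It only builds the arguments for the AssociatePrincipalWithPortfolio operation, as exposed by the boto3 servicecatalog client. The helper name is hypothetical, and the ARN check is deliberately simple and assumes the standard aws partition.

```python
import re
from typing import Dict

# Matches IAM user, group, and role ARNs in the standard "aws" partition only.
_IAM_PRINCIPAL = re.compile(r"^arn:aws:iam::\d{12}:(user|group|role)/.+$")


def portfolio_access_request(portfolio_id: str, principal_arn: str) -> Dict[str, str]:
    """Build keyword arguments for associate_principal_with_portfolio."""
    if not _IAM_PRINCIPAL.match(principal_arn):
        raise ValueError(f"not an IAM user, group, or role ARN: {principal_arn}")
    return {
        "PortfolioId": portfolio_id,
        "PrincipalARN": principal_arn,
        "PrincipalType": "IAM",
    }
```

With AWS credentials available, the result could be passed as keyword arguments to associate_principal_with_portfolio on a servicecatalog client. Validating the ARN first surfaces typos before any API call is made.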

Deploy the portfolio into your account

You can now run the following commands to install the AWS CDK and make sure you have the right dependencies to deploy the portfolio:

npm install -g aws-cdk@2.27.0
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt

Run the following commands to deploy the portfolio into your account:

ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
AWS_REGION=$(aws configure get region)
cdk bootstrap aws://${ACCOUNT_ID}/${AWS_REGION}
cdk deploy --require-approval never

The first two commands get your account ID and current Region using the AWS Command Line Interface (AWS CLI) on your computer. Following this, cdk bootstrap provisions the resources the AWS CDK needs to deploy into that account and Region, and cdk deploy builds assets locally and deploys the stack in a few minutes.
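If you script this step rather than typing it, it can help to validate the two values before bootstrapping. The following is a minimal sketch, assuming you have captured the JSON output of aws sts get-caller-identity as a string. The function name is hypothetical. It builds the aws://ACCOUNT_ID/REGION string that cdk bootstrap expects and rejects an empty Region, which can happen when no default Region is configured.

```python
import json
import re


def bootstrap_target(caller_identity_json: str, region: str) -> str:
    """Build the aws://ACCOUNT_ID/REGION environment string passed to cdk bootstrap."""
    account_id = json.loads(caller_identity_json)["Account"]
    # AWS account IDs are 12-digit numbers.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"unexpected account ID: {account_id!r}")
    if not region:
        raise ValueError("no Region provided; configure a default Region first")
    return f"aws://{account_id}/{region}"
```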

The portfolio can now be found in AWS Service Catalog, as shown in the following screenshot.

On-demand provisioning

The products within the portfolio can be launched quickly and easily on demand from the Provisioning menu on the AWS Service Catalog console. A typical flow is to launch the Studio domain and the Canvas auto shutdown first because this is usually a one-time action. You can then add Canvas users to the domain. The domain ID and user IAM role ARN are saved in AWS Systems Manager and are automatically populated with the user parameters as shown in the following screenshot.

You can also use cost allocation tags that are attached to each user. For example, UserCostCenter is a sample tag where you can add the name of each user.
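To make the parameter flow concrete, here is a hedged sketch of how a Canvas user profile request could be assembled from those stored values. It assumes the values live in Systems Manager Parameter Store and that the clients behave like their boto3 counterparts (get_parameter on ssm, create_user_profile on sagemaker). The parameter names, the helper, and the fake client are hypothetical, and the repository may store these values under different keys.

```python
from typing import Any, Dict

# Hypothetical parameter names, chosen only for this illustration.
DOMAIN_ID_PARAM = "/canvas/domain-id"
USER_ROLE_PARAM = "/canvas/user-role-arn"


def user_profile_request(ssm_client: Any, user_name: str) -> Dict[str, Any]:
    """Assemble create_user_profile arguments from values kept in Parameter Store."""

    def read(name: str) -> str:
        return ssm_client.get_parameter(Name=name)["Parameter"]["Value"]

    return {
        "DomainId": read(DOMAIN_ID_PARAM),
        "UserProfileName": user_name,
        "UserSettings": {"ExecutionRole": read(USER_ROLE_PARAM)},
        # Cost allocation tag attached to the profile, as described above.
        "Tags": [{"Key": "UserCostCenter", "Value": user_name}],
    }


class FakeSsmClient:
    """Hypothetical in-memory stand-in for the Systems Manager client."""

    def __init__(self, values: Dict[str, str]) -> None:
        self.values = values

    def get_parameter(self, Name: str) -> Dict[str, Any]:
        return {"Parameter": {"Value": self.values[Name]}}
```

The returned dictionary matches the keyword arguments of create_user_profile on a SageMaker client. Tags only appear in cost reports after they are activated as cost allocation tags in the billing console.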

Key considerations for governing ML environments using Canvas

Now that we have provisioned and deployed an AWS Service Catalog portfolio focused on Canvas, we’d like to highlight a few considerations to govern the Canvas-based ML environments focused on the domain and the user profile.

The following are considerations regarding the Studio domain:

  • Networking for Canvas is managed at the Studio domain level, where the domain is deployed on a private VPC subnet for secure connectivity. See Securing Amazon SageMaker Studio connectivity using a private VPC to learn more.
  • A default IAM execution role is defined at the domain level. This default role is assigned to all Canvas users in the domain.
  • Encryption is done using AWS KMS by encrypting the EFS volume in the domain. For additional controls, you can specify your own managed key, also known as a customer managed key (CMK). See Protect Data at Rest Using Encryption to learn more.
  • Local file upload is enabled by attaching a cross-origin resource sharing (CORS) policy to the S3 bucket used by Canvas. See Give Your Users Permissions to Upload Local Files to learn more.
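As an illustration of the CORS consideration, the sketch below builds a CORS configuration in the shape accepted by the boto3 S3 client's put_bucket_cors operation. The rule values (POST from any origin, any header) are an assumption based on what browser uploads typically need, so check them against the linked guide before use. The function name is hypothetical.

```python
from typing import Any, Dict


def canvas_cors_configuration() -> Dict[str, Any]:
    """Return a CORS configuration that permits browser POST uploads to the bucket."""
    return {
        "CORSRules": [
            {
                "AllowedHeaders": ["*"],
                "AllowedMethods": ["POST"],  # assumed: uploads are sent as POST requests
                "AllowedOrigins": ["*"],     # assumed: tighten if you know the origin
            }
        ]
    }
```

With credentials available, this could be applied by passing the bucket name as Bucket and the returned dictionary as CORSConfiguration to put_bucket_cors.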

The following are considerations regarding the user profile:

  • Authentication in Studio can be done both through single sign-on (SSO) and IAM. If you have an existing identity provider to federate users to access the console, you can assign a Studio user profile to each federated identity using IAM. See the section Assigning the policy to Studio users in Configuring Amazon SageMaker Studio for teams and groups with complete resource isolation to learn more.
  • You can assign IAM execution roles to each user profile. While using Studio, a user assumes the role mapped to their user profile, which overrides the default execution role. You can use this for fine-grained access controls within a team.
  • You can achieve isolation using attribute-based access controls (ABAC) to ensure users can only access the resources for their team. See Configuring Amazon SageMaker Studio for teams and groups with complete resource isolation to learn more.
  • You can perform fine-grained cost tracking by applying cost allocation tags to user profiles.
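Two of these considerations can be sketched in a few lines. The first helper mirrors the role precedence described above: a role set on the user profile wins over the domain default. The second builds an example ABAC policy that allows opening Studio only for user profiles whose tag matches the caller's principal tag. The tag key team and both function names are hypothetical, and the linked post may use different actions or conditions.

```python
from typing import Any, Dict, Optional


def effective_execution_role(domain_default_role: str, profile_role: Optional[str]) -> str:
    """A role set on the user profile takes precedence over the domain default."""
    return profile_role or domain_default_role


def team_isolation_policy(tag_key: str = "team") -> Dict[str, Any]:
    """IAM policy allowing sign-in only to user profiles tagged with the caller's team."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:CreatePresignedDomainUrl",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # The resource tag must equal the caller's principal tag.
                        f"sagemaker:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            }
        ],
    }
```

Serializing the returned dictionary with json.dumps gives a policy document you can review before attaching it to a federated role.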

Clean up

To clean up the resources created by the AWS CDK stack, navigate to the AWS CloudFormation stacks page and delete the Canvas stacks. Alternatively, run cdk destroy from within the repository folder.

Conclusion

In this post, we shared how you can quickly and easily provision ML environments with Canvas using AWS Service Catalog and the AWS CDK. We discussed how you can create a portfolio on AWS Service Catalog, provision the portfolio, and deploy it in your account. IT administrators can use this method to deploy and manage users, sessions, and associated costs while provisioning Canvas.

Learn more about Canvas on the product page and the Developer Guide. For further reading, you can learn how to enable business analysts to access SageMaker Canvas using AWS SSO without the console. You can also learn how business analysts and data scientists can collaborate faster using Canvas and Studio.


About the Authors

Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, starting to code at the age of 7. He started learning AI/ML at university, and has fallen in love with it since then.

Sofian Hamiti is an AI/ML specialist Solutions Architect at AWS. He helps customers across industries accelerate their AI/ML journey by helping them build and operationalize end-to-end machine learning solutions.

Shyam Srinivasan is a Principal Product Manager on the AWS AI/ML team, leading product management for Amazon SageMaker Canvas. Shyam cares about making the world a better place through technology and is passionate about how AI and ML can be a catalyst in this journey.

Avi Patel works as a software engineer on the Amazon SageMaker Canvas team. His background consists of working full stack with a frontend focus. In his spare time, he likes to contribute to open source projects in the crypto space and learn about new DeFi protocols.

Jared Heywood is a Senior Business Development Manager at AWS. He is a global AI/ML specialist helping customers with no-code machine learning. He has worked in the AutoML space for the past 5 years and launched products at Amazon like Amazon SageMaker JumpStart and Amazon SageMaker Canvas.
