Amazon SageMaker Studio and SageMaker Notebook Instance now come with JupyterLab 3 notebooks to boost developer productivity

Amazon SageMaker comes with two options to spin up fully managed notebooks for exploring data and building machine learning (ML) models. The first option is fast-start, collaborative notebooks accessible within Amazon SageMaker Studio, a fully integrated development environment (IDE) for machine learning. You can quickly launch notebooks in Studio, easily dial up or down the underlying compute resources without interrupting your work, and even share your notebook as a link in a few simple clicks. In addition to creating notebooks, you can perform all the ML development steps to build, train, debug, track, deploy, and monitor your models in a single pane of glass in Studio. The second option is Amazon SageMaker Notebook Instance, a single, fully managed ML compute instance running notebooks in the cloud, offering customers more control over their notebook configurations.

Today, we’re excited to announce that SageMaker Studio and SageMaker Notebook Instance now come with JupyterLab 3 notebooks. The new notebooks provide data scientists and developers with a modern IDE complete with developer productivity tools for code authoring, refactoring, and debugging, and support for the latest open-source Jupyter extensions. AWS is a major contributor to the Jupyter open-source community and we’re happy to bring the latest Jupyter capabilities to our customers.

In this post, we showcase some of the exciting new features built into SageMaker notebooks and call attention to some of our favorite open-source extensions that improve the developer experience when using SageMaker to build, train, and deploy your ML models.

What’s new with notebooks on SageMaker

The new notebooks come with several features out of the box that improve the SageMaker developer experience, including the following:

  • An integrated debugger with support for breakpoints and variable inspection
  • A table of contents panel to more easily navigate notebooks
  • A filter bar for the file browser
  • Support for multiple display languages
  • The ability to install extensions through pip, Conda, and Mamba

With the integrated debugger, you can inspect variables and step through breakpoints while you interactively build your data science and ML code. You can access the debugger by simply choosing the debugger icon on the notebook toolbar.

As of this writing, the debugger is available for our newly launched Base Python 2.0 and Data Science 2.0 images in SageMaker Studio and for the amazonei_pytorch_latest_p37, pytorch_p38, and tensorflow2_p38 kernels in SageMaker Notebook Instance, with plans to support more in the near future.

The table of contents panel provides an excellent utility to navigate notebooks and more easily share your findings with colleagues.

JupyterLab extensions

With the upgraded notebooks in SageMaker, you can take advantage of the ever-growing community of open-source JupyterLab extensions. In this section, we highlight a few that fit naturally into the SageMaker developer workflow, but we encourage you to browse the available extensions or even create your own.

The first extension we highlight is the Language Server Protocol extension. This open-source extension enables modern IDE functionality such as tab completion, syntax highlighting, jump to reference, variable renaming across notebooks and modules, diagnostics, and much more. This extension is very useful for those developers who want to author Python modules as well as notebooks.

Another useful extension for the SageMaker developer workflow is the jupyterlab-s3-browser. This extension picks up your SageMaker execution role’s credentials and allows you to browse, load, and write files directly to Amazon Simple Storage Service (Amazon S3).

Install extensions

JupyterLab 3 makes the process of packaging and installing extensions significantly easier. You can install the aforementioned extensions through bash scripts. For example, in SageMaker Studio, open the system terminal from the Studio launcher and run the commands listed later in this section. Note that the upgraded Studio has a separate, isolated Conda environment for managing the Jupyter Server runtime, so you need to install extensions into the studio Conda environment. To install extensions in SageMaker Notebook Instance, there is no need to switch Conda environments.

In addition, you can automate the installation of these extensions using lifecycle configurations so they’re persisted between Studio restarts. You can configure this for all the users in the domain or at an individual user level.

For Python Language Server, use the following code to install the extensions:

conda init
conda activate studio
pip install jupyterlab-lsp
pip install 'python-lsp-server[all]'
conda deactivate
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver

For the Amazon S3 file browser, use the following:

conda init
conda activate studio
pip install jupyterlab_s3_browser
jupyter serverextension enable --py jupyterlab_s3_browser
conda deactivate
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver

Be sure to refresh your browser after installation.
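The lifecycle configuration approach mentioned earlier in this section could be sketched in Python roughly as follows. This is a minimal sketch, not the definitive setup: the configuration name you pass in is your own choice, and the `eval "$(conda shell.bash hook)"` line is an assumption swapped in for `conda init` so that `conda activate` works in a non-interactive script. The remaining shell commands are the ones shown above, and boto3 is assumed to be installed wherever you run the registration.

```python
import base64

# Script to run when the JupyterServer app starts. The pip and supervisorctl
# commands come from this post; the `conda shell.bash hook` line is an
# assumption, used in place of `conda init` for non-interactive shells.
LSP_INSTALL_SCRIPT = """#!/bin/bash
eval "$(conda shell.bash hook)"
conda activate studio
pip install jupyterlab-lsp
pip install 'python-lsp-server[all]'
conda deactivate
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver
"""


def encode_script(script: str) -> str:
    """Base64-encode a script, as the lifecycle configuration API expects."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def register_lifecycle_config(name: str, script: str) -> str:
    """Create a Studio lifecycle configuration and return its ARN."""
    import boto3  # imported lazily so encode_script works without boto3

    client = boto3.client("sagemaker")
    response = client.create_studio_lifecycle_config(
        StudioLifecycleConfigName=name,
        StudioLifecycleConfigContent=encode_script(script),
        StudioLifecycleConfigAppType="JupyterServer",
    )
    return response["StudioLifecycleConfigArn"]
```

Registering the script only creates the lifecycle configuration. It still needs to be attached to the domain or to an individual user profile before it runs on Studio restarts.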

For more information about writing similar lifecycle scripts for SageMaker Notebook Instance, refer to Customize a Notebook Instance Using a Lifecycle Configuration Script and Customize your Amazon SageMaker notebook instances with lifecycle configurations and the option to disable internet access. Additionally, for more information on extension management, including how to write lifecycle configurations that work for both versions 1 and 3 of JupyterLab notebooks for backward compatibility, see Installing JupyterLab and Jupyter Server extensions.

Get started with JupyterLab 3 notebooks in Studio

If you’re creating a new Studio domain, you can specify the default notebook version directly from the AWS Management Console or using the API.

On the SageMaker Control Panel, change your notebook version when editing your domain settings, in the Jupyter Lab version section.

To use the API, configure the JupyterServerAppSettings parameter as follows:

aws --region <REGION> \
sagemaker create-domain \
--domain-name <NEW_DOMAIN_NAME> \
--auth-mode <AUTHENTICATION_MODE> \
--subnet-ids <SUBNET-IDS> \
--vpc-id <VPC-ID> \
--default-user-settings '{
  "JupyterServerAppSettings": {
    "DefaultResourceSpec": {
      "SageMakerImageArn": "arn:aws:sagemaker:<REGION>:<ACCOUNT_ID>:image/jupyter-server-3",
      "InstanceType": "system"
    }
  }
}'
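If you prefer the AWS SDK for Python over the CLI, the same request might look like the sketch below. It mirrors the CLI call above and nothing more: the helper names are hypothetical, the Region and account ID in the usage example are placeholders, and boto3 is assumed to be installed where the domain is created.

```python
import json


def jupyter_server_settings(region: str, account_id: str) -> dict:
    """Build user settings that select the JupyterLab 3 server image."""
    image_arn = f"arn:aws:sagemaker:{region}:{account_id}:image/jupyter-server-3"
    return {
        "JupyterServerAppSettings": {
            "DefaultResourceSpec": {
                "SageMakerImageArn": image_arn,
                "InstanceType": "system",
            }
        }
    }


def create_domain(name, auth_mode, subnet_ids, vpc_id, region, account_id):
    """Create a Studio domain that defaults to JupyterLab 3; return its ARN."""
    import boto3  # imported lazily so the settings helper works without boto3

    client = boto3.client("sagemaker", region_name=region)
    response = client.create_domain(
        DomainName=name,
        AuthMode=auth_mode,
        SubnetIds=subnet_ids,  # list of subnet ID strings
        VpcId=vpc_id,
        DefaultUserSettings=jupyter_server_settings(region, account_id),
    )
    return response["DomainArn"]


if __name__ == "__main__":
    # Placeholder Region and account ID, for illustration only.
    print(json.dumps(jupyter_server_settings("us-east-1", "123456789012"), indent=2))
```

Keeping the settings in a separate function makes it possible to print and inspect the JSON before sending any request.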

If you’re an existing Studio user, you can modify your notebook version by choosing your user profile on the SageMaker Control Panel and choosing Edit.

Then choose your preferred version in the Jupyter Lab version section.

For more information, see JupyterLab Versioning.

Get started with JupyterLab 3 on SageMaker Notebook Instance

SageMaker Notebook Instance users can also specify the default notebook version both from the console and using our API. If using the console, note that the option to choose JupyterLab 3 notebooks is only available for the latest generation of SageMaker Notebook Instance that comes with Amazon Linux 2.

On the SageMaker console, choose your version while creating your notebook instance, under Platform identifier.

If using the API, use the following code:

aws sagemaker create-notebook-instance \
--notebook-instance-name <NEW_NOTEBOOK_NAME> \
--instance-type <INSTANCE_TYPE> \
--role-arn <YOUR_ROLE_ARN> \
--platform-identifier notebook-al2-v2
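The same request can be sketched with the AWS SDK for Python. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, the instance name, instance type, and role ARN are values you supply, and boto3 is assumed to be installed. The platform identifier is the one used in the CLI call above.

```python
def notebook_instance_request(name, instance_type, role_arn,
                              platform_identifier="notebook-al2-v2"):
    """Build the request parameters for a JupyterLab 3 notebook instance."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "PlatformIdentifier": platform_identifier,
    }


def create_notebook_instance(name, instance_type, role_arn):
    """Create the notebook instance and return its ARN."""
    import boto3  # imported lazily so the request helper works without boto3

    client = boto3.client("sagemaker")
    response = client.create_notebook_instance(
        **notebook_instance_request(name, instance_type, role_arn)
    )
    return response["NotebookInstanceArn"]
```

For example, `notebook_instance_request("my-notebook", "ml.t3.medium", "<YOUR_ROLE_ARN>")` returns the parameter dictionary without calling AWS, which is handy for reviewing the request first.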

For more information, see Creating a notebook with your JupyterLab version.

Conclusion

SageMaker Studio and SageMaker Notebook Instance now offer an upgraded notebook experience to users. We encourage you to try out the new capabilities and further boost developer productivity with these enhancements!


About the Authors

Sean Morgan is an AI/ML Solutions Architect at AWS. He has experience in the semiconductor and academic research fields, and uses his experience to help customers reach their goals on AWS. In his free time, Sean is an active open-source contributor/maintainer and is the special interest group lead for TensorFlow Add-ons.

Arkaprava De is a Senior Software Engineer at AWS. He has been at Amazon for over 7 years and is currently working on improving the Amazon SageMaker Studio IDE experience.

Kunal Jha is a Senior Product Manager at AWS. He is focused on building Amazon SageMaker Studio as the IDE of choice for all ML development steps. In his spare time, Kunal enjoys skiing and exploring the Pacific Northwest. You can find him on LinkedIn.
