Introducing AWS AI Service Cards: A new resource to enhance transparency and advance responsible AI

Artificial intelligence (AI) and machine learning (ML) are some of the most transformative technologies we will encounter in our generation—to tackle business and societal problems, improve customer experiences, and spur innovation. Along with the widespread use and growing scale of AI comes the recognition that we must all build responsibly. At AWS, we think responsible AI encompasses a number of core dimensions including:

  • Fairness and bias – How a system impacts different subpopulations of users (e.g., by gender, ethnicity)
  • Explainability – Mechanisms to understand and evaluate the outputs of an AI system
  • Privacy and security – Data protected from theft and exposure
  • Robustness – Mechanisms to ensure an AI system operates reliably
  • Governance – Processes to define, implement, and enforce responsible AI practices within an organization
  • Transparency – Communicating information about an AI system so stakeholders can make informed choices about their use of the system

Our commitment to developing AI and ML in a responsible way is integral to how we build our services, engage with customers, and drive innovation. We are also committed to providing customers with tools and resources to develop and use AI/ML responsibly, from enabling ML builders with a fully managed development environment to helping customers embed AI services into common business use cases.

Providing customers with more transparency

Our customers want to know that the technology they are using was developed in a responsible way. They want resources and guidance to implement that technology responsibly at their own organization. And most importantly, they want to ensure that the technology they roll out is for everyone’s benefit, especially their end-users’. At AWS, we want to help them bring this vision to life.

To deliver the transparency that customers are asking for, we are excited to launch AWS AI Service Cards, a new resource to help customers better understand our AWS AI services. AI Service Cards are a form of responsible AI documentation that provide customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for our AI services. They are part of a comprehensive development process we undertake to build our services in a responsible way that addresses fairness and bias, explainability, robustness, governance, transparency, privacy, and security. At AWS re:Invent 2022 we’re making the first three AI Service Cards available: Amazon Rekognition – Face Matching, Amazon Textract – AnalyzeID, and Amazon Transcribe – Batch (English-US).

Components of the AI Service Cards

Each AI Service Card contains four sections covering:

  • Basic concepts to help customers better understand the service or service features
  • Intended use cases and limitations
  • Responsible AI design considerations
  • Guidance on deployment and performance optimization

The content of the AI Service Cards addresses a broad audience of customers, technologists, researchers, and other stakeholders who seek to better understand key considerations in the responsible design and use of an AI service.

Our customers use AI in an increasingly diverse set of applications. The intended use cases and limitations section provides information about common uses for a service, and helps customers assess whether a service is a good fit for their application. For example, in the Amazon Transcribe – Batch (English-US) Card we describe the service use case of transcribing general-purpose vocabulary spoken in US English from an audio file. If a company wants a solution that automatically transcribes a domain-specific event, such as an international neuroscience conference, they can add custom vocabularies and language models to include scientific vocabulary in order to increase the accuracy of the transcription.
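For example, a minimal Boto3 sketch of such a batch transcription job is shown below. The bucket, audio file, custom vocabulary, and custom language model names are placeholder assumptions; the vocabulary and language model would need to be created beforehand.

import boto3

transcribe = boto3.client("transcribe")

# Batch transcription job that applies a custom vocabulary and a custom
# language model (all names below are hypothetical placeholders).
transcribe.start_transcription_job(
    TranscriptionJobName="neuroscience-conference-keynote",
    LanguageCode="en-US",
    Media={"MediaFileUri": "s3://amzn-s3-demo-bucket/talks/keynote.wav"},
    OutputBucketName="amzn-s3-demo-bucket",
    Settings={"VocabularyName": "neuroscience-terms"},
    ModelSettings={"LanguageModelName": "neuroscience-clm"},
)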

In the design section of each AI Service Card, we explain key responsible AI design considerations across important areas, such as our test-driven methodology, fairness and bias, explainability, and performance expectations. We provide example performance results on an evaluation dataset that is representative of a common use case. This example is just a starting point though, as we encourage customers to test on their own datasets to better understand how the service will perform on their own content and use cases in order to deliver the best experience for their end customers. And this is not a one-time evaluation. To build in a responsible way, we recommend an iterative approach where customers periodically test and evaluate their applications for accuracy or potential bias.

In the best practices for deployment and performance optimization section, we lay out key levers that customers should consider to optimize the performance of their application for real-world deployment. It’s important to explain how customers can optimize the performance of an AI system that acts as a component of their overall application or workflow to get the maximum benefit. For example, in the Amazon Rekognition Face Matching Card that covers adding face recognition capabilities to identity verification applications, we share steps customers can take to increase the quality of the face matching predictions incorporated into their workflow.

Delivering responsible AI resources and capabilities

Offering our customers the resources and tools they need to transform responsible AI from theory to practice is an ongoing priority for AWS. Earlier this year we launched our Responsible Use of Machine Learning guide that provides considerations and recommendations for responsibly using ML across all phases of the ML lifecycle. AI Service Cards complement our existing developer guides and blog posts, which provide builders with descriptions of service features and detailed instructions for using our service APIs. And with Amazon SageMaker Clarify and Amazon SageMaker Model Monitor, we offer capabilities to help detect bias in datasets and models and better monitor and review model predictions through automation and human oversight.

At the same time, we continue to advance responsible AI across other key dimensions, such as governance. At re:Invent today we launched a new set of purpose-built tools to help customers improve governance of their ML projects with Amazon SageMaker Role Manager, Amazon SageMaker Model Cards, and Amazon SageMaker Model Dashboard. Learn more on the AWS News blog and website about how these tools help to streamline ML governance processes.

Education is another key resource that helps advance responsible AI. At AWS, we are committed to building the next generation of developers and data scientists in AI with the AI and ML Scholarship Program and AWS Machine Learning University (MLU). This week at re:Invent, we launched a new, public MLU course on fairness considerations and bias mitigation across the ML lifecycle. Taught by the same Amazon data scientists who train AWS employees on ML, this free course features 9 hours of lectures and hands-on exercises, making it easy to get started.

AI Service Cards: A new resource—and an ongoing commitment

We are excited to bring a new transparency resource to our customers and the broader community and provide additional information on the intended uses, limitations, design, and optimization of our AI services, informed by our rigorous approach to building AWS AI services in a responsible way. Our hope is that AI Service Cards will act as a useful transparency resource and an important step in the evolving landscape of responsible AI. AI Service Cards will continue to evolve and expand as we engage with our customers and the broader community to gather feedback and continually iterate on our approach.

Contact our group of responsible AI experts to start a conversation.


About the authors

Vasi Philomin is currently a Vice President in the AWS AI team for services in the language and speech technologies areas, such as Amazon Lex, Amazon Polly, Amazon Translate, Amazon Transcribe/Transcribe Medical, Amazon Comprehend, Amazon Kendra, Amazon CodeWhisperer, Amazon Monitron, Amazon Lookout for Equipment, and Contact Lens/Voice ID for Amazon Connect, as well as the Machine Learning Solutions Lab and Responsible AI.

Peter Hallinan leads initiatives in the science and practice of Responsible AI at AWS AI, alongside a team of responsible AI experts. He has deep expertise in AI (PhD, Harvard) and entrepreneurship (Blindsight, sold to Amazon). His volunteer activities have included serving as a consulting professor at the Stanford University School of Medicine, and as the president of the American Chamber of Commerce in Madagascar. When possible, he’s off in the mountains with his children: skiing, climbing, hiking, and rafting.

AWS Unveils New AI Service Features and Enhancements at re:Invent 2022

Over the last 5 years, artificial intelligence (AI) and machine learning (ML) have evolved from a niche activity to a rapidly growing mainstream endeavor. Today, more than 100,000 customers across numerous industries rely on AWS for ML and AI initiatives that infuse AI into a broad range of business use cases to automate repetitive and mundane tasks—from intelligent demand planning to document processing and content moderation. AWS AI services help customers create smoother, faster, and more efficient engagements with customers, driving greater efficiencies and lowering operational costs.

At AWS re:Invent, Amazon Web Services, Inc. has announced a series of features and enhancements across its portfolio of AI services, including purpose-built solutions to solve industry-specific challenges, representing a deeper integration of AI into everyday experiences. The new capabilities include Amazon Textract Analyze Lending to improve loan-document processing efficiency, Amazon Transcribe Call Analytics to analyze in-progress contact center calls, Amazon Kendra support for tabular search in HTML and seven new languages, Amazon HealthLake Imaging for medical image storage, Amazon HealthLake Analytics with multi-modal data querying capabilities, and broader programming language support and easier administration in Amazon CodeWhisperer. These AI service innovations provide vertical markets and horizontal functions with deeper, real-time insights and cost-saving efficiencies to drive transformation across industries.

These new capabilities enhance AWS’s AI offerings at the top of its three-layer ML stack. The bottom layer includes foundational components (ML hardware and ML software libraries) to help customers build their own ML infrastructure, and the middle layer—Amazon SageMaker—is a fully managed ML development environment. The top layer of AI services brings ML to business use cases such as transcribing contact center calls, processing documents, and improving healthcare outcomes. Customers can use AWS AI services with no ML expertise required.

Customers from different industries rely on AWS AI services to improve efficiency and reduce operational costs. For example, WaFd Bank, a full-service US bank, improved its customer experience with Talkdesk (a global cloud contact center company) and AWS Contact Center Intelligence (CCI) solutions, reducing call times by up to 90%. And State Auto, a property and casualty insurance holding company, automated the property inspection process using Amazon Rekognition (a computer vision service), increasing the number of claims it reviews for potential fraud by 83%.

Amazon Textract Analyze Lending makes it easy to classify and extract mortgage loan data

Today, mortgage companies process large volumes of documents to extract business-critical data and make decisions on loan applications. For example, a typical US mortgage application can encompass 500 or more pages of diverse document types, including W2 forms, paystubs, bank statements, Forms 1040 and 1003, and many more. The lender’s loan processing application has to first understand and classify each document type to ensure that it is processed the right way. After that, the loan processing application has to extract all the data on each page of the document. The data in these documents exists in different formats and structures, and the same data element can have different names on different documents (for example, “SSN” or “Social Security Number”), which can lead to inaccurate data extraction. So far, the classification and extraction of data from mortgage application packages have been primarily manual tasks. Furthermore, mortgage companies have to manage demand for mortgages that can fluctuate substantially during a year, so lenders are unable to plan effectively and must often allocate resources to process documents on an ad hoc basis. Overall, mortgage loan processing is still manual, slow, error-prone, and expensive.

Amazon Textract (AWS’s AI service to automatically extract text, handwriting, and data from scanned documents) now offers Amazon Textract Analyze Lending to make loan document processing more automated, faster, and cost-effective at scale. Amazon Textract Analyze Lending pulls together multiple ML models to classify various documents that commonly occur in mortgage packages, and then extracts critical information from these documents with high accuracy to improve loan document processing workflows. For example, it can now perform signature detection to identify whether documents have required signatures. It also provides a summary of the documents in a mortgage application package and identifies any missing documents. For instance, PennyMac, a financial services firm specializing in the production and servicing of US mortgage loans, uses Amazon Textract Analyze Lending to process a 3,000-page mortgage application in less than 5 minutes. Previously, PennyMac’s mortgage document processing required several hours of reviewing and preparing a loan package for approval.
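As an illustration only, the following Boto3 sketch shows how a lending analysis might be started and its summary retrieved; the bucket and document name are placeholders, and production code would add error handling and result pagination.

import time
import boto3

textract = boto3.client("textract")

# Start asynchronous analysis of a multi-page mortgage package stored in S3
# (bucket and document name are hypothetical placeholders).
job = textract.start_lending_analysis(
    DocumentLocation={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "loans/application.pdf"}}
)

# Wait for the job to finish, then read the package-level summary, which
# reports how pages were grouped and classified by document type.
while True:
    summary = textract.get_lending_analysis_summary(JobId=job["JobId"])
    if summary["JobStatus"] in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

print(summary.get("Summary", {}).get("DocumentGroups", []))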

Amazon Transcribe Call Analytics for improved end-user experiences

In most customer-facing industries such as telecom, finance, healthcare, and retail, customer experiences with call centers can profoundly impact perceptions of the company. Lengthy call-resolution times or the inability to deal with issues during live interactions can lead to poor customer experiences or customer churn. Contact centers need real-time insights into customer-experience issues (e.g., a product defect) while calls are in progress. Typically, developers use multiple AI services to generate live call transcriptions, extract relevant real-time insights, and manage sensitive customer information (e.g. identify and redact sensitive customer details) during live calls. However, this process adds unnecessary complexity, time, and cost.

Amazon Transcribe, an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capabilities to their applications, now supports call analytics to provide real-time conversation insights. Amazon Transcribe Call Analytics helps analyze thousands of in-progress calls, identify call sentiment (e.g., calls that ended with a negative customer sentiment score), detect the potential reason for the call, and spot issues such as repeated requests to speak to a manager. Amazon Transcribe Call Analytics combines powerful ASR and natural language processing (NLP) models that are trained specifically to improve the overall customer experience. With Amazon Transcribe Call Analytics, developers can build a real-time system that provides contact center agents with relevant information to solve customer issues or alerts supervisors about potential issues. Amazon Transcribe Call Analytics also generates call summaries automatically, eliminating the need for agents to take notes and allowing them to focus on customer needs. Furthermore, Amazon Transcribe Call Analytics protects sensitive customer data by identifying and redacting personal information during live calls.
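The real-time insights are delivered through the Amazon Transcribe streaming SDK; as a simpler illustration, here is a hedged Boto3 sketch of the post-call flavor of Call Analytics on a recorded two-channel call. The job name, bucket, and IAM role ARN are placeholder assumptions.

import boto3

transcribe = boto3.client("transcribe")

# Post-call analytics on a recorded, two-channel contact center call.
# The IAM role must allow Amazon Transcribe to read the recording in S3.
transcribe.start_call_analytics_job(
    CallAnalyticsJobName="support-call-2022-11-30",
    Media={"MediaFileUri": "s3://amzn-s3-demo-bucket/calls/support-call.wav"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/TranscribeCallAnalyticsRole",
    ChannelDefinitions=[
        {"ChannelId": 0, "ParticipantRole": "AGENT"},
        {"ChannelId": 1, "ParticipantRole": "CUSTOMER"},
    ],
)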

Amazon Kendra adds new search capabilities

Today, in the face of rapid growth in the volume and variety of data, enterprise search tools struggle to examine and uncover key insights stored across enterprise systems in heterogeneous data formats and in different languages. Conventional enterprise search solutions are unable to find knowledge stored in unstructured datasets like HTML tables because it requires extracting information from two-dimensional formats (rows and columns). Sometimes, the information a customer may be seeking could exist in different languages, making the search even more challenging. As a result, enterprise employees waste time searching for information or are unable to perform their duties.

Amazon Kendra (AWS’s intelligent search service powered by ML) offers a new capability that supports tabular search in HTML. Customers can find more precise answers faster in HTML documents, whether they’re in the narrative body or tabular form, by using natural language questions. Amazon Kendra can find and extract precise answers from HTML tables by performing deeper analyses of HTML pages and using new specialized deep learning models that intelligently interpret columns and rows to pinpoint relevant data. Amazon Kendra is also adding semantic support for seven new languages (in addition to English): French, Spanish, German, Portuguese, Japanese, Korean, and Chinese. Customers can now ask natural language questions and get exact answers in any of the supported languages. One of AWS’s biopharmaceutical customers, Gilead Sciences Inc., increased staff productivity by cutting internal search times by roughly 50% using Amazon Kendra.
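As a rough sketch of what a query in one of the newly supported languages could look like with Boto3, the example below filters on the _language_code document attribute; the index ID is a placeholder and the filter usage is an assumption based on Amazon Kendra’s multi-language support.

import boto3

kendra = boto3.client("kendra")

# Ask a natural language question in Spanish and restrict results to documents
# indexed with the Spanish language code (index ID is a placeholder).
response = kendra.query(
    IndexId="11111111-2222-3333-4444-555555555555",
    QueryText="¿Cuál es la política de vacaciones?",
    AttributeFilter={
        "EqualsTo": {
            "Key": "_language_code",
            "Value": {"StringValue": "es"},
        }
    },
)

for item in response["ResultItems"]:
    print(item["Type"], item.get("DocumentTitle", {}).get("Text", ""))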

Amazon HealthLake offers next-generation imaging solutions and precision health analytics

Healthcare providers face a myriad of challenges as the scale and complexity of medical imaging data continues to increase. Medical imaging is a critical tool to diagnose patients, and there are billions of medical images scanned globally each year. Imaging data accounts for about 90%[1] of all healthcare data, and analyzing these complex images has largely been a manual task performed by experts and specialists. It often takes data scientists and researchers weeks or months to derive important insights from medical images, slowing down decision-making processes for healthcare providers and impacting patient-care delivery. To address these challenges, Amazon HealthLake (a HIPAA-eligible service to store, transform, query, and analyze large-scale health data) is adding two new capabilities for medical imaging and analytics:

  • Amazon HealthLake Imaging is a new HIPAA-eligible capability that enables healthcare providers and their software partners to easily store, access, and analyze medical images at petabyte scale. The new capability is designed for fast, subsecond image retrieval in clinical workflows that healthcare providers can access securely from anywhere (e.g., web, desktop, or phone) and with high availability. Typically, health systems store multiple copies of the same imaging data in clinical and research systems, leading to increased storage costs and complexity. Amazon HealthLake Imaging extracts and stores just one copy of the same image to the cloud. Customers can now access existing medical records and run analysis applications from a single encrypted copy of the same data in the cloud with normalized metadata and advanced compression. As a result, Amazon HealthLake Imaging can help providers reduce the total cost of medical imaging storage by up to 40%.
  • Amazon HealthLake Analytics is a new HIPAA-eligible capability that makes it easy to query and derive insights from multi-modal health data (e.g., imaging, text, or genetics), at the individual or population levels, with the ability to share data securely across the enterprise. It removes the need for healthcare providers to execute complex data exports and data transformations. Amazon HealthLake Analytics automatically normalizes raw health data from disparate sources (e.g., medical records, health insurance claims, EHRs, or medical devices) into an analytics-ready, interoperable format in minutes. The new capability eliminates what would otherwise take months of engineering effort, allowing providers to focus on what they do best—delivering patient care.

Amazon CodeWhisperer offers broader support and easier administration

While the cloud has democratized application development through on-demand access to compute, storage, database, analytics, and ML, the traditional process of building software applications in any industry remains time-intensive. Developers must still spend significant time writing repetitive code not directly related to the core problems they want to solve. Even highly experienced developers find it difficult to keep up with multiple programming languages, frameworks, and software libraries, while ensuring they follow correct programming syntax and coding best practices.

Amazon CodeWhisperer (an ML-powered service that generates code recommendations) now supports AWS Builder ID so any developer can sign up securely with just an email address and enable Amazon CodeWhisperer for their IDE within the AWS Toolkit. In addition to Python, Java, and JavaScript, Amazon CodeWhisperer adds support for TypeScript and C# languages to accelerate code development. Also, Amazon CodeWhisperer now makes code recommendations for AWS application programming interfaces (APIs) across its most popular services, including Amazon Elastic Compute Cloud (Amazon EC2), AWS Lambda, and Amazon Simple Storage Service (Amazon S3). Finally, Amazon CodeWhisperer is now available on the AWS Management Console, so any authorized AWS administrator can enable Amazon CodeWhisperer for their organization.

Conclusion

With these new features and capabilities, AWS continues to expand the industry’s broadest and deepest portfolio of AI services. AWS also recognizes that as AI-powered use cases become pervasive, it is important that these capabilities are built in a responsible way. AWS is committed to building its services in a responsible manner and supporting customers to help them deploy AI responsibly. By enabling customers to more easily and responsibly add new and expanded AI capabilities to their applications and workflows, AWS is unleashing even greater innovation and helping businesses reimagine how they approach and solve some of their most pressing challenges. To learn more about AWS’s comprehensive approach to responsible AI, visit Responsible use of artificial intelligence and machine learning.

References

[1] S. K. Zhou et al., “A Review of Deep Learning in Medical Imaging: Imaging Traits, Technology Trends, Case Studies With Progress Highlights, and Future Promises,” in Proceedings of the IEEE, vol. 109, no. 5, pp. 820-838, May 2021, doi: 10.1109/JPROC.2021.3054390.


About the Author

Bratin Saha is the Vice President of Artificial Intelligence and Machine Learning at AWS.

Deploy an MLOps solution that hosts your model endpoints in AWS Lambda

In 2019, Amazon co-founded The Climate Pledge. The pledge’s goal is to achieve net-zero carbon by 2040, which is 10 years earlier than the Paris Agreement outlines. Companies that sign up are committed to regular reporting, carbon elimination, and credible offsets. At the time of this writing, 377 companies have signed the pledge, and the number is still growing.

Because AWS is committed to helping you achieve your net zero goal through cloud solutions and machine learning (ML), many projects have already been developed and deployed that reduce carbon emissions. Manufacturing is one of the industries that can benefit greatly from such projects. Through optimized energy management of machines in manufacturing factories, such as compressors or chillers, companies can reduce their carbon footprint with ML.

Effectively transitioning from an ML experimentation phase to production is challenging. Automating model training and retraining, having a model registry, and tracking experiments and deployment are some of the key challenges. For manufacturing companies, there is another layer of complexity, namely how these deployed models can run at the edge.

In this post, we address these challenges by providing a machine learning operations (MLOps) template that hosts a sustainable energy management solution. The solution is agnostic to use cases, which means you can adapt it for your use cases by changing the model and data. We show you how to integrate models in Amazon SageMaker Pipelines, a native workflow orchestration tool for building ML pipelines, which runs a training job and optionally a processing job with a Monte Carlo Simulation. Experiments are tracked in Amazon SageMaker Experiments. Models are tracked and registered in the Amazon SageMaker model registry. Finally, we provide code for deployment of your final model in an AWS Lambda function.

Lambda is a compute service that lets you run code without managing or provisioning servers. Lambda’s automatic scaling, pay-per-request billing, and ease of use make it a common deployment choice for data science teams. With this post, data scientists can turn their model into a cost-effective and scalable Lambda function. Furthermore, Lambda allows for integration with AWS IoT Greengrass, which helps you build software that enables your devices to act at the edge on the data that they generate, as would be the case for a sustainable energy management solution.

Solution overview

The architecture we deploy (see the following figure) is a fully CI/CD-driven approach to machine learning. Elements are decoupled to avoid having one monolithic solution.

Architecture Diagram

Let’s start with the top left of the diagram. The Processing – Image build component is a CI/CD-driven AWS CodeCommit repository that helps build and push a Docker container to Amazon Elastic Container Registry (Amazon ECR). This processing container serves as the first step in our ML pipeline, but it’s also reused for postprocessing steps. In our case, we apply a Monte Carlo Simulation as postprocessing. The Training – Image build repository outlined on the bottom left has the same mechanism as the Processing block above it. The main difference is that it builds the container for model training.

The main pipeline, Model building (Pipeline), is another CodeCommit repository that automates running your SageMaker pipelines. This pipeline automates and connects the data preprocessing, model training, model metrics tracking in SageMaker Experiments, data postprocessing, and model cataloging in the SageMaker model registry.

The final component is on the bottom right: Model deployment. If you follow the examples in Amazon SageMaker Projects, you get a template that hosts your model using a SageMaker endpoint. Our deployment repository instead hosts the model in a Lambda function. We show an approach for deploying the Lambda function that can run real-time predictions.

Prerequisites

To deploy our solution successfully, you need the following:

Download the GitHub repository

As a first step, clone the GitHub repository to your local machine. It contains the following folder structure:

  • deployment – Contains code relevant for deployment
  • mllib – Contains ML code for preprocessing, training, serving, and simulating
  • tests – Contains unit and integration tests

The key file for deployment is the shell script deployment/deploy.sh. You use this file to deploy the resources in your account. Before we can run the shell script, complete the following steps:

  1. Open deployment/app.py and change the bucket_name under SageMakerPipelineSourceCodeStack. The bucket_name needs to be globally unique (for example, add your full name).
  2. In deployment/pipeline/assets/modelbuild/pipelines/energy_management/pipeline.py, change the default_bucket under get_pipeline to the same name as specified in step 1.

Deploy solution with the AWS CDK

First, configure your AWS CLI with the account and Region that you want to deploy in. Then run the following commands to change to the deployment directory, create and activate a virtual environment, install the required pip packages listed in requirements.txt, install the pre-commit hooks, make the shell script executable, and run deploy.sh:

cd deployment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pre-commit install
chmod u+x deploy.sh
./deploy.sh

deploy.sh performs the following actions:

  1. Creates a virtual environment in Python.
  2. Sources the virtual environment activation script.
  3. Installs the AWS CDK and the requirements outlined in setup.py.
  4. Bootstraps the environment.
  5. Zips and copies the necessary files that you developed, such as your mllib files, into the corresponding folders where these assets are needed.
  6. Runs cdk deploy --require-approval never.
  7. Creates an AWS CloudFormation stack through the AWS CDK.

The initial stage of the deployment should take less than 5 minutes. You should now have four repositories in CodeCommit in the Region you specified through the AWS CLI, as outlined in the architecture diagram. The AWS CodePipeline pipelines are run simultaneously. The modelbuild and modeldeploy pipelines depend on a successful run of the processing and training image build. The modeldeploy pipeline depends on a successful model build. The model deployment should be complete in less than 1.5 hours.

Clone the model repositories in Studio

To customize the SageMaker pipelines created through the AWS CDK deployment in the Studio UI, you first need to clone the repositories into Studio. Launch the system terminal in Studio and run the following commands after providing the project name and ID:

git clone https://git-codecommit.REGION.amazonaws.com/v1/repos/sagemaker-PROJECT_NAME-PROJECT_ID-modelbuild
git clone https://git-codecommit.REGION.amazonaws.com/v1/repos/sagemaker-PROJECT_NAME-PROJECT_ID-modeldeploy
git clone https://git-codecommit.REGION.amazonaws.com/v1/repos/sagemaker-PROJECT_NAME-PROJECT_ID-processing-imagebuild
git clone https://git-codecommit.REGION.amazonaws.com/v1/repos/sagemaker-PROJECT_NAME-PROJECT_ID-training-imagebuild

After cloning the repositories, you can push a commit to the repositories. These commits trigger a CodePipeline run for the related pipelines.

You can also adapt the solution on your local machine and work on your preferred IDE.

Navigate the SageMaker Pipelines and SageMaker Experiments UI

A SageMaker pipeline is a series of interconnected steps that are defined using the Amazon SageMaker Python SDK. This pipeline definition encodes a pipeline using a Directed Acyclic Graph (DAG) that can be exported as a JSON definition. To learn more about the structure of such pipelines, refer to SageMaker Pipelines Overview.

Navigate to the SageMaker resources pane and choose the Pipelines resource to view. Under Name, you should see PROJECT_NAME-PROJECT_ID. In the run UI, there should be a successful run that is expected to take a little over 1 hour. The pipeline should look as shown in the following screenshot.

Amazon SageMaker Pipeline

The run was automatically triggered after the AWS CDK stack was deployed. You can manually invoke a run by choosing Create execution. From there you can choose your own pipeline parameters such as the instance type and number of instances for the processing and training steps. Furthermore, you can give the run a name and description. The pipeline is highly configurable through pipeline parameters that you can reference and define throughout your pipeline definition.

Feel free to start another pipeline run with your parameters as desired. Afterwards, navigate to the SageMaker resources pane again and choose Experiments and trials. There you should again see a line with a name such as PROJECT_NAME-PROJECT_ID. Navigate to the experiment and choose the only run with a random ID. From there, choose the SageMaker training job to explore the metrics related to the training job.

The goal of SageMaker Experiments is to make it as simple as possible to create experiments, populate them with trials, and run analytics across trials and experiments. SageMaker Pipelines is closely integrated with SageMaker Experiments and, by default, creates an experiment, a trial, and trial components for each run if they don’t already exist.
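If you prefer working programmatically, the tracked runs can also be pulled into a pandas DataFrame; the following is a minimal sketch, assuming an experiment name that follows the PROJECT_NAME-PROJECT_ID convention mentioned above.

from sagemaker.analytics import ExperimentAnalytics

# Load all trial components of the pipeline's experiment into a DataFrame
# (the experiment name below is a hypothetical placeholder).
analytics = ExperimentAnalytics(experiment_name="energy-management-p-abcde12345")
df = analytics.dataframe()

# Each row is a trial component with its parameters and recorded metrics.
print(df.head())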

Approve Lambda deployment in the model registry

As a next step, navigate to the model registry under SageMaker resources. Here you can again find a line with a name such as PROJECT_NAME-PROJECT_ID. Navigate to the only model that exists and approve it. This automatically deploys the model artifact in a container in Lambda.

After you approve your model in the model registry, an Amazon EventBridge event rule is triggered. This rule runs the CodePipeline pipeline with the ending *-modeldeploy. In this section, we discuss how this solution uses the approved model and hosts it in a Lambda function. CodePipeline takes the existing CodeCommit repository also ending with *-modeldeploy and uses that code to run in CodeBuild. The main entry for CodeBuild is the buildspec.yml file. Let’s look at this first:

version: 0.2

env:
  shell: bash

phases:
  install:
    runtime-versions:
      python: 3.8
    commands:
      - python3 -m ensurepip --upgrade
      - python3 -m pip install --upgrade pip
      - python3 -m pip install --upgrade virtualenv
      - python3 -m venv .venv
      - source .venv/bin/activate
      - npm install -g aws-cdk@2.26.0
      - pip install -r requirements.txt
      - cdk bootstrap
  build:
    commands:
      - python build.py --model-package-group-name "$SOURCE_MODEL_PACKAGE_GROUP_NAME"
      - tar -xf model.tar.gz
      - cp model.joblib lambda/digital_twin
      - rm model.tar.gz
      - rm model.joblib
      - cdk deploy --require-approval never

During the installation phase, we ensure that the Python libraries are up to date, create a virtual environment, install AWS CDK v2.26.0, and install the aws-cdk Python library along with others using the requirements file. We also bootstrap the AWS account. In the build phase, we run build.py, which we discuss next. That file downloads the latest approved SageMaker model artifact from Amazon Simple Storage Service (Amazon S3) to your local CodeBuild instance. This .tar.gz file is unzipped and its contents copied into the folder that also contains our main Lambda code. The Lambda function is deployed using the AWS CDK, and code runs out of a Docker container from Amazon ECR. This is done automatically by AWS CDK.

The build.py file is a Python file that mostly uses the AWS SDK for Python (Boto3) to list the model packages available.

The function get_approved_package returns the Amazon S3 URI of the artifact that is then downloaded, as described earlier.
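Purely as an illustration (the repository contains the authoritative implementation), a function with that responsibility could look roughly like the following Boto3 sketch.

import boto3

sm_client = boto3.client("sagemaker")

def get_approved_package(model_package_group_name: str) -> str:
    """Return the S3 URI of the latest approved model package in the group."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    if not response["ModelPackageSummaryList"]:
        raise ValueError(f"No approved model package found in {model_package_group_name}")

    package_arn = response["ModelPackageSummaryList"][0]["ModelPackageArn"]
    details = sm_client.describe_model_package(ModelPackageName=package_arn)
    # The model artifact (model.tar.gz) location sits in the inference specification.
    return details["InferenceSpecification"]["Containers"][0]["ModelDataUrl"]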

After successfully deploying your model, you can test it directly on the Lambda console in the Region you chose to deploy in. The name of the function should contain DigitalTwinStack-DigitalTwin*. Open the function and navigate to the Test tab. You can use the following event to run a test call:

{
  "flow": "[280, 300]",
  "pressure": "[69, 70]",
  "simulations": "10",
  "no_of_trials": "10",
  "train_error_weight": "1.0"
}

After running the test event, you get a response similar to that shown in the following screenshot.

Test AWS Lambda function
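If you prefer to call the function programmatically rather than from the console, a minimal Boto3 sketch looks like the following; the exact function name is generated by the CDK stack, so the value below is a placeholder.

import json
import boto3

lambda_client = boto3.client("lambda")

event = {
    "flow": "[280, 300]",
    "pressure": "[69, 70]",
    "simulations": "10",
    "no_of_trials": "10",
    "train_error_weight": "1.0",
}

# Look up the real function name on the Lambda console; it contains
# DigitalTwinStack-DigitalTwin followed by a generated suffix.
response = lambda_client.invoke(
    FunctionName="DigitalTwinStack-DigitalTwinXXXXXXXX",
    Payload=json.dumps(event),
)
print(json.loads(response["Payload"].read()))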

If you want to run more simulations or trials, you can increase the Lambda timeout limit and experiment with the code! Or you might want to pick up the generated data and visualize it in Amazon QuickSight, as in the following example. It’s your turn now!

Amazon QuickSight

Clean up

To avoid further charges, complete the following steps:

  • On the AWS CloudFormation console, delete the EnergyOptimization stack.
    This deletes the entire solution.
  • Delete the stack DigitalTwinStack, which deployed your Lambda function.

Conclusion

In this post, we showed you a CI/CD-driven MLOps pipeline of an energy management solution where we keep each step decoupled. You can track your ML pipelines and experiments in the Studio UI. We also demonstrated a different deployment approach: upon approval of a model in the model registry, a Lambda function hosting the approved model is built automatically through CodePipeline.

If you’re interested in exploring either the MLOps pipeline on AWS or the sustainable energy management solution, check out the GitHub repository and deploy the stack in your own AWS environment!


About the Authors

Laurens van der Maas is a Data Scientist at AWS Professional Services. He works closely with customers building their machine learning solutions on AWS, and is passionate about how machine learning is changing the world as we know it.

Kangkang Wang is an AI/ML consultant with AWS Professional Services. She has extensive experience deploying AI/ML solutions in the healthcare and life sciences vertical. She also enjoys helping enterprise customers build scalable AI/ML platforms to accelerate the cloud journey of their data scientists.

Selena Tabbara is a Data Scientist at AWS Professional Services. She works every day with her customers to achieve their business outcomes by innovating on AWS platforms. In her spare time, Selena enjoys playing the piano, hiking, and watching basketball.

Michael Wallner is a Senior Consultant with a focus on AI/ML at AWS Professional Services. Michael is passionate about enabling customers on their cloud journey to become AWSome. He is excited about manufacturing and enjoys helping transform the manufacturing space through data.

Introducing Amazon Kendra tabular search for HTML Documents

Amazon Kendra is an intelligent search service powered by machine learning (ML). Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization.

Amazon Kendra users can now quickly find the information they need from tables on a webpage (HTML tables) using Amazon Kendra tabular search. Tables contain useful information in a structured format that can be easily interpreted by making visual associations between row and column headers. With Amazon Kendra tabular search, you can now get specific information from a cell or from certain rows and columns relevant to your query, as well as a preview of the table.

In this post, we provide an example of how to use Amazon Kendra tabular search.

Tabular search in Amazon Kendra

Let’s say you have a webpage in HTML format that contains a table with inflation rates and annual changes in the US from 2012–2021, as shown in the following screenshot.

When you search for “Inflation rate in US”, Amazon Kendra presents the top three rows in the preview and up to five columns, as shown in the following screenshot. You can then see if this article has the relevant details that you’re looking for and decide to either use this information or open the link to get additional details. Amazon Kendra tabular search can also handle merged rows.

Let’s do another search and get specific information from the table by asking “What was the annual change of inflation rate in 2017?”. As shown in the following screenshot, Amazon Kendra tabular search highlights the specific cell that contains the answer to your question.

Now let’s search for “Which year had the top inflation rate?” Amazon Kendra searches the table, sorts the results, and gives you the year that had the highest inflation rate.

Amazon Kendra can also find the range of column information that you’re looking for. For example, let’s search for “Inflation rate from 2012 and 2014.” Amazon Kendra displays the rows and columns between 2012–2014 in the preview.

Get started with Amazon Kendra tabular search

Amazon Kendra tabular search is turned on by default, and no special configuration is required to enable it. For newer documents, Amazon Kendra tabular search works by default. For existing HTML pages that contain tables, you can either update the document and sync (if you only have a few documents), or reach out to AWS Support.

To test tabular search on your internal or external webpage, complete the following steps:

  1. Create an index.
  2. Add data sources by using the web crawler or downloading the HTML page and uploading it to an Amazon Simple Storage Service (Amazon S3) bucket.
  3. Go to the Search Indexed Content tab and test it out.

Limitations and considerations

Keep the following in mind when using this feature:

  • In this release, Amazon Kendra only supports HTML-formatted tables within the table tag. This doesn’t include nested tables or other forms of tables.
  • Amazon Kendra can search through tables with up to 30 columns, 60 rows, and 500 total table cells. If a table has a higher number of rows, columns, or table cells, Amazon Kendra will not search within that table.
  • Amazon Kendra doesn’t display tabular search results if the confidence score of the query result for the column and row is very low. You can check the confidence score within ScoreAttributes using the QueryResultItem API, as sketched after this list.
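The following is a minimal sketch of that check with Boto3; the index ID is a placeholder, and the fields shown are limited to those needed to read the confidence score.

import boto3

kendra = boto3.client("kendra")

response = kendra.query(
    IndexId="11111111-2222-3333-4444-555555555555",
    QueryText="What was the annual change of inflation rate in 2017?",
)

for item in response["ResultItems"]:
    confidence = item.get("ScoreAttributes", {}).get("ScoreConfidence")
    # Keep only results Amazon Kendra itself is reasonably confident about.
    if confidence in ("VERY_HIGH", "HIGH", "MEDIUM"):
        print(item["Type"], confidence, item.get("DocumentExcerpt", {}).get("Text", "")[:200])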

Conclusion

With tabular search for HTML in Amazon Kendra, you can now search across both unstructured data from various data sources and structured data in the form of tables. This further enhances the user experience: you can get factual responses to your natural language queries from narrative text as well as from tables. The table preview alongside Amazon Kendra’s suggested answers allows you to quickly assess whether the HTML document’s table contains the relevant information you’re looking for, thereby saving time.

Amazon Kendra tabular search is available in the following AWS Regions at launch: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Sydney), Asia Pacific (Singapore), Canada (Central), and AWS GovCloud (US-West).

To learn more about Amazon Kendra, visit the Amazon Kendra product page.


About the authors

Vikas Shah is an Enterprise Solutions Architect at Amazon Web Services. He is a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His areas of interest are ML, IoT, robotics, and storage. In his spare time, Vikas enjoys building robots, hiking, and traveling.

Enterprise administrative controls, simple sign-up, and expanded programming language support for Amazon CodeWhisperer

Amazon CodeWhisperer is a machine learning (ML)-powered service that helps improve developer productivity by generating code recommendations based on developers’ prior code and comments. Today, we are excited to announce that AWS administrators can now enable CodeWhisperer for their organization with single sign-on (SSO) authentication. Administrators can easily integrate CodeWhisperer with their existing workforce identity solutions, provide access to users and groups, and configure organization-wide settings. Additionally, individual users who don’t have AWS accounts can now use CodeWhisperer using their personal email with AWS Builder ID. The sign-up process takes only a few minutes and enables developers to start using CodeWhisperer immediately without any waitlist. We’re also expanding programming language support for CodeWhisperer. In addition to Python, Java, and JavaScript, developers can now use CodeWhisperer to accelerate development of their C# and TypeScript projects.

In this post, we discuss enterprise administrative controls, the new AWS Builder ID sign-up for CodeWhisperer, and support for new programming languages.

Enable CodeWhisperer for your organization

CodeWhisperer is now available on the AWS Management Console. Any user with an AWS administrator role can enable CodeWhisperer, add and remove users, and centrally manage settings for your organization via the console.

As a prerequisite, your AWS administrators have to set up SSO via AWS IAM Identity Center (successor to AWS Single Sign-On), if not already enabled for your organization. IAM Identity Center enables you to use your organization’s SSO to access AWS services by integrating your existing workforce identity solution with AWS. After SSO authentication is set up, your administrators can enable CodeWhisperer and assign access to users and groups, as shown in the following screenshot.

Set up CodeWhisperer

In addition to managing users, AWS administrators can also configure settings for the reference tracker and data sharing. The CodeWhisperer reference tracker detects whether a code recommendation might be similar to particular CodeWhisperer training data and can provide those references to you. CodeWhisperer learns, in part, from open-source projects. Sometimes, a suggestion it’s giving you may be similar to a specific piece of training data. The reference tracker setting enables administrators to decide whether CodeWhisperer is allowed to offer suggestions in such cases. When allowed, CodeWhisperer will also provide references, so that you can learn more about where the training data comes from. AWS administrators can also opt out of data sharing for the purpose of CodeWhisperer service improvement on behalf of your organization (see AI services opt-out policies). Once configured by the administrator, the settings are applied across your organization.
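Organizations that manage the data-sharing preference centrally can also express it as an AI services opt-out policy in AWS Organizations. The following Boto3 sketch is an assumption of how that could be automated; the policy document syntax should be verified against the AI services opt-out policy documentation, the policy type must already be enabled for the organization, and the root ID is a placeholder.

import json
import boto3

org = boto3.client("organizations")

# Opt the whole organization out of AI services data usage for service
# improvement (policy syntax assumed from the AI services opt-out policy docs).
policy_document = {
    "services": {
        "default": {
            "opt_out_policy": {"@@assign": "optOut"}
        }
    }
}

policy = org.create_policy(
    Name="ai-services-opt-out",
    Description="Opt out of AI services data usage for service improvement",
    Type="AISERVICES_OPT_OUT_POLICY",
    Content=json.dumps(policy_document),
)

# Attach the policy to the organization root (root ID is a placeholder).
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="r-examplerootid111",
)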

Developers who were given access can start using CodeWhisperer in their preferred IDE by simply logging in using their SSO login credentials. CodeWhisperer is available as part of the AWS Toolkit extensions for major IDEs, including JetBrains, Visual Studio Code, and AWS Cloud9.

In your preferred IDE, choose the SSO login option and follow the prompts to get authenticated and start getting recommendations from CodeWhisperer, as shown in the following screenshots.

connect using AWS IAM

confirm your input

Sign up within minutes using your personal email

If you’re an individual developer who doesn’t have access to an AWS account, you can use your personal email to sign up and enable CodeWhisperer in your preferred IDE. The sign-up process takes only a few minutes.

We’re introducing a new method of authentication with AWS Builder ID. AWS Builder ID is a new form of authentication that allows you to sign up securely with just your personal email and a password. After you create an AWS Builder account, simply log in and enable CodeWhisperer for your IDE, as shown in the following screenshot. For more information, see AWS Builder ID docs.

sign up using personal email

Build apps faster with TypeScript and C# programming languages

Keeping up with multiple programming languages, frameworks, and software libraries is an arduous task even for the most experienced developers. Looking up correct programming syntax and searching the web for code snippets to complete programming tasks takes a significant amount of time, especially if you consider the cost of distractions.

CodeWhisperer provides ready-to-use real-time recommendations in your IDE to help you finish your coding tasks faster. Today, we’re expanding our support to include TypeScript and C# programming languages, in addition to Python, Java, and JavaScript.

CodeWhisperer understands your intent and provides recommendations based on the most commonly used best practices for a programming language. The following example shows how CodeWhisperer can generate the entire function in TypeScript to render JSON to a table.

TypeScript to render JSON to a table

CodeWhisperer also makes it easy for developers to use AWS services by providing code recommendations for AWS application programming interfaces (APIs) across the most popular services, including Amazon Elastic Compute Cloud (Amazon EC2), AWS Lambda, and Amazon Simple Storage Service (Amazon S3). We also offer a reference tracker with our recommendations that provides valuable information about the similarity of the recommendation to particular CodeWhisperer training data. Furthermore, we have implemented techniques to detect and filter biased code that might be unfair. The following example shows how CodeWhisperer can generate an entire function based on prompts provided in C#.

CodeWhisperer generates entire function based on prompts provided in C#

Get started with CodeWhisperer

During the preview period, CodeWhisperer is available to all developers across the world for free. To access the service in preview, you can enable it for your organization using the console, or you can use the AWS Builder ID to get started as an individual developer. For more information about the service, visit Amazon CodeWhisperer.


About the Authors

Bharadwaj Tanikella is a Senior Product Manager for Amazon CodeWhisperer. He has a background in machine learning, both as a developer and a product manager. In his spare time, he loves to bike, read non-fiction, and learn new languages.

Ankur Desai is a Principal Product Manager within the AWS AI Services team.

Optimize hyperparameters with Amazon SageMaker Automatic Model Tuning

Machine learning (ML) models are taking the world by storm. Their performance relies on using the right training data and choosing the right model and algorithm. But it doesn’t end here. Typically, algorithms defer some design decisions to the ML practitioner to adapt for their specific data and task. These deferred design decisions manifest themselves as hyperparameters.

What does that name mean? The result of ML training, the model, can be largely seen as a collection of parameters that are learned during training. Therefore, the parameters that are used to configure the ML training process are then called hyperparameters—parameters describing the creation of parameters. At any rate, they are of very practical use, such as the number of epochs to train, the learning rate, the max depth of a decision tree, and so forth. And we pay much attention to them because they have a major impact on the ultimate performance of your model.

Just like turning a knob on a radio receiver to find the right frequency, each hyperparameter should be carefully tuned to optimize performance. Searching the hyperparameter space for the optimal values is referred to as hyperparameter tuning or hyperparameter optimization (HPO), and should result in a model that gives accurate predictions.

In this post, we set up and run our first HPO job using Amazon SageMaker Automatic Model Tuning (AMT). We learn about the methods available to explore the results, and create some insightful visualizations of our HPO trials and the exploration of the hyperparameter space!

Amazon SageMaker Automatic Model Tuning

As an ML practitioner using SageMaker AMT, you can focus on the following:

  • Providing a training job
  • Defining the right objective metric matching your task
  • Scoping the hyperparameter search space

SageMaker AMT takes care of the rest, and you don’t need to think about the infrastructure, orchestrating training jobs, and improving hyperparameter selection.

Let’s start by using SageMaker AMT for our first simple HPO job, to train and tune an XGBoost algorithm. We want your AMT journey to be hands-on and practical, so we have shared the example in the following GitHub repository. This post covers the 1_tuning_of_builtin_xgboost.ipynb notebook.

In an upcoming post, we’ll extend the notion of just finding the best hyperparameters and include learning about the search space and to what hyperparameter ranges a model is sensitive. We’ll also show how to turn a one-shot tuning activity into a multi-step conversation with the ML practitioner, to learn together. Stay tuned (pun intended)!

Prerequisites

This post is for anyone interested in learning about HPO and doesn’t require prior knowledge of the topic. Basic familiarity with ML concepts and Python programming is helpful though. For the best learning experience, we highly recommend following along by running each step in the notebook in parallel to reading this post. And at the end of the notebook, you also get to try out an interactive visualization that makes the tuning results come alive.

Solution overview

We’re going to build an end-to-end setup to run our first HPO job using SageMaker AMT. When our tuning job is complete, we look at some of the methods available to explore the results, both via the AWS Management Console and programmatically via the AWS SDKs and APIs.

First, we familiarize ourselves with the environment and SageMaker Training by running a standalone training job, without any tuning for now. We use the XGBoost algorithm, one of many algorithms provided as a SageMaker built-in algorithm (no training script required!).

We see how SageMaker Training operates in the following ways:

  • Starts and stops an instance
  • Provisions the necessary container
  • Copies the training and validation data onto the instance
  • Runs the training
  • Collects metrics and logs
  • Collects and stores the trained model

Then we move to AMT and run an HPO job:

  • We set up and launch our tuning job with AMT
  • We dive into the methods available to extract detailed performance metrics and metadata for each training job, which enables us to learn more about the optimal values in our hyperparameter space
  • We show you how to view the results of the trials
  • We provide you with tools to visualize data in a series of charts that reveal valuable insights into our hyperparameter space

Train a SageMaker built-in XGBoost algorithm

It all starts with training a model. In doing so, we get a sense of how SageMaker Training works.

We want to take advantage of the speed and ease of use offered by the SageMaker built-in algorithms. All we need are a few steps to get started with training:

  1. Prepare and load the data – We download and prepare our dataset as input for XGBoost and upload it to our Amazon Simple Storage Service (Amazon S3) bucket.
  2. Select our built-in algorithm’s image URI – SageMaker uses this URI to fetch our training container, which in our case contains a ready-to-go XGBoost training script. Several algorithm versions are supported.
  3. Define the hyperparameters – SageMaker provides an interface to define the hyperparameters for our built-in algorithm. These are the same hyperparameters as used by the open-source version.
  4. Construct the estimator – We define the training parameters such as instance type and number of instances.
  5. Call the fit() function – We start our training job.

The following diagram shows how these steps work together.

SageMaker training overview

Provide the data

To run ML training, we need to provide data. We provide our training and validation data to SageMaker via Amazon S3.

In our example, for simplicity, we use the SageMaker default bucket to store our data. But feel free to customize the following values to your preference:

sm_sess = sagemaker.session.Session([..])

BUCKET = sm_sess.default_bucket()
PREFIX = 'amt-visualize-demo'
output_path = f's3://{BUCKET}/{PREFIX}/output'

In the notebook, we use a public dataset and store the data locally in the data directory. We then upload our training and validation data to Amazon S3. Later, we also define pointers to these locations to pass them to SageMaker Training.

# acquire and prepare the data (not shown here)
# store the data locally
[..]
train_data.to_csv('data/train.csv', index=False, header=False)
valid_data.to_csv('data/valid.csv', index=False, header=False)
[..]
# upload the local files to S3
boto_sess.resource('s3').Bucket(BUCKET).Object(os.path.join(PREFIX, 'data/train/train.csv')).upload_file('data/train.csv')
boto_sess.resource('s3').Bucket(BUCKET).Object(os.path.join(PREFIX, 'data/valid/valid.csv')).upload_file('data/valid.csv')

In this post, we concentrate on introducing HPO. For illustration, we use a specific dataset and task so that we can obtain measurements of objective metrics, which we then use to optimize the selection of hyperparameters. However, for the overall post, neither the data nor the task matters. To give you the complete picture, here is briefly what we do: we train an XGBoost model to classify handwritten digits from the Optical Recognition of Handwritten Digits Data Set [1], loaded via Scikit-learn. XGBoost is an excellent algorithm for structured data and can even be applied to the Digits dataset. The values are 8×8 images of handwritten digits, for example a 0, a 5, and a 4.
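For reference, here is a minimal sketch of what this data preparation could look like, assuming scikit-learn and pandas; the notebook's exact preprocessing may differ slightly:

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 images flattened to 64 features, labels 0-9

# The SageMaker built-in XGBoost algorithm expects CSV input with the label
# in the first column and no header row
df = pd.DataFrame(digits.data)
df.insert(0, 'label', digits.target)

# These are the train_data and valid_data frames written to CSV in the earlier cell
train_data, valid_data = train_test_split(df, test_size=0.2, random_state=42)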

Select the XGBoost image URI

After choosing our built-in algorithm (XGBoost), we must retrieve the image URI and pass it to SageMaker to load onto our training instance. For this step, we review the available versions. Here we’ve decided to use version 1.5-1, the latest version of the algorithm at the time of writing. Depending on the task, ML practitioners may write their own training script that, for example, includes data preparation steps. But this isn’t necessary in our case.

If you want to write your own training script, then stay tuned, we’ve got you covered in our next post! We’ll show you how to run SageMaker Training jobs with your own custom training scripts.

For now, we retrieve the correct image URI by specifying the algorithm, AWS Region, and version number:

# region is the AWS Region we're running in, e.g. sm_sess.boto_region_name
xgboost_container = sagemaker.image_uris.retrieve('xgboost', region, '1.5-1')

That’s it. Now we have a reference to the XGBoost algorithm.

Define the hyperparameters

Now we define our hyperparameters. These values configure how our model is trained and ultimately influence how it performs on the objective metric we’re measuring, such as accuracy in our case. Note that nothing about the following block of code is specific to SageMaker. We’re actually using the open-source version of XGBoost, just provided by and optimized for SageMaker.

Although each of these hyperparameters is configurable and adjustable, the objective multi:softmax is determined by our dataset and the type of problem we’re solving. In our case, the Digits dataset contains multiple labels (an observation of a handwritten digit can be any of the digits 0–9), making this a multi-class classification problem.

hyperparameters = {
    'num_class': 10,                # one class per digit, 0-9
    'max_depth': 5,                 # maximum depth of a tree
    'eta': 0.2,                     # learning rate
    'alpha': 0.2,                   # L1 regularization term on weights
    'objective': 'multi:softmax',   # multi-class classification
    'eval_metric': 'accuracy',      # evaluation metric reported during training
    'num_round': 200,               # maximum number of boosting rounds
    'early_stopping_rounds': 5      # stop early if the metric stops improving
}

For more information about the other hyperparameters, refer to XGBoost Hyperparameters.

Construct the estimator

We configure the training on an estimator object, which is a high-level interface for SageMaker Training.

Next, we define the number of instances to train on, the instance type (CPU-based or GPU-based), and the size of the attached storage:

estimator = sagemaker.estimator.Estimator(
    image_uri=xgboost_container,
    hyperparameters=hyperparameters,
    role=role,                    # IAM execution role, e.g. sagemaker.get_execution_role()
    instance_count=1,
    instance_type='ml.m5.large',
    volume_size=5,                # size of the attached storage in GB
    output_path=output_path
)

We now have the infrastructure configuration that we need to get started. SageMaker Training will take care of the rest.

Call the fit() function

Remember the data we uploaded to Amazon S3 earlier? Now we create references to it:

from sagemaker.inputs import TrainingInput

s3_input_train = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/data/train', content_type='csv')
s3_input_valid = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/data/valid', content_type='csv')

A call to fit() launches our training. We pass in the references to the training data we just created to point SageMaker Training to our training and validation data:

estimator.fit({'train': s3_input_train, 'validation': s3_input_valid})

Note that to run HPO later on, we don’t actually need to call fit() here. We just need the estimator object for HPO and could jump straight to creating our HPO job. But because we want to learn about SageMaker Training and see how to run a single training job, we call it here and review the output.

After the training starts, we see output appear below the cell, as shown in the following screenshot. The output is available in Amazon CloudWatch as well as in the notebook.

The black text is log output from SageMaker itself, showing the steps involved in training orchestration, such as starting the instance and loading the training image. The blue text is output directly from the training instance itself. We can observe the process of loading and parsing the training data, and visually see the training progress and the improvement in the objective metric directly from the training script running on the instance.

Output from fit() function in Jupyter Notebook

Also note that at the end of the job output, the training duration and the billable time in seconds are shown.

Finally, we see that SageMaker uploads our trained model to the S3 output path we defined on the estimator object. The model is ready to be deployed for inference.
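If you want to check where the model artifact landed, or try a quick deployment, the estimator exposes this directly. The following is a minimal sketch; the instance type and endpoint handling are illustrative choices, not part of the original notebook:

# S3 location of the trained model artifact (model.tar.gz)
print(estimator.model_data)

# Optionally deploy the model behind a real-time endpoint for a quick test
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)
# ... run some predictions ...
predictor.delete_endpoint()  # remember to delete the endpoint to avoid ongoing costs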

In a future post, we’ll create our own training container and define our training metrics to emit. You’ll see how SageMaker is agnostic of which container you pass it for training. This is very handy when you want to get started quickly with a built-in algorithm, but then later decide to pass your own custom training script!

Inspect current and previous training jobs

So far, we have worked from our notebook with our code and submitted training jobs to SageMaker. Let’s switch perspectives and leave the notebook for a moment to check out what this looks like on the SageMaker console.

Console view of SageMaker Training jobs

SageMaker keeps a historical record of the training jobs it ran, including their configuration (such as hyperparameters, algorithm, and data input), the billable time, and the results. In the list in the preceding screenshot, you see the most recent training jobs, filtered for XGBoost. The highlighted training job is the one we just ran from the notebook, whose output you saw earlier. Let’s dive into this individual training job to get more information.

The following screenshot shows the console view of our training job.

Console view of a single SageMaker Training job

On the training job’s detail page in the SageMaker console, we can review the same information we received as cell output from our fit() call, along with the parameters and metadata we defined on our estimator.

Recall the log output from the training instance we saw earlier. We can access the logs of our training job here too, by scrolling to the Monitor section and choosing View logs.

Console View of monitoring tab in training job

This shows us the instance logs inside CloudWatch.

Console view of training instance logs in CloudWatch

Also recall the hyperparameters we specified in our notebook for the training job. We see them here as well, on the same training job page.

Console view of hyperparameters of SageMaker Training job

In fact, the details and metadata we specified earlier for our training job and estimator can be found on this page on the SageMaker console. We have a helpful record of the settings used for the training, such as what training container was used and the locations of the training and validation datasets.

You might be asking at this point, why exactly is this relevant for hyperparameter optimization? It’s because you can search, inspect, and dive deeper into those HPO trials that we’re interested in. Maybe the ones with the best results, or the ones that show interesting behavior. We’ll leave it to you what you define as “interesting.” It gives us a common interface for inspecting our training jobs, and you can use it with SageMaker Search.
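For example, here is a minimal sketch of how you might query training jobs programmatically with the Search API through Boto3; the filter value is an illustrative choice:

import boto3

sm = boto3.client('sagemaker')

# Find the most recent training jobs whose name contains 'xgboost'
response = sm.search(
    Resource='TrainingJob',
    SearchExpression={
        'Filters': [
            {'Name': 'TrainingJobName', 'Operator': 'Contains', 'Value': 'xgboost'}
        ]
    },
    SortBy='CreationTime',
    SortOrder='Descending',
    MaxResults=10
)
for result in response['Results']:
    job = result['TrainingJob']
    print(job['TrainingJobName'], job['TrainingJobStatus'])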

Although SageMaker AMT orchestrates the HPO jobs, the HPO trials are all launched as individual SageMaker Training jobs and can be accessed as such.

With training covered, let’s get tuning!

Train and tune a SageMaker built-in XGBoost algorithm

To tune our XGBoost model, we’re going to reuse our existing hyperparameters and define ranges of values we want to explore for them. Think of this as extending the borders of exploration within our hyperparameter search space. Our tuning job will sample from the search space and run training jobs for new combinations of values. The following code shows how to specify the hyperparameter ranges that SageMaker AMT should sample from:

from sagemaker.tuner import IntegerParameter, ContinuousParameter, HyperparameterTuner

hpt_ranges = {
    'alpha': ContinuousParameter(0.01, .5),
    'eta': ContinuousParameter(0.1, .5),
    'min_child_weight': ContinuousParameter(0., 2.),
    'max_depth': IntegerParameter(1, 10)
}

The ranges for an individual hyperparameter are specified by their type, like ContinuousParameter. For more information and tips on choosing these parameter ranges, refer to Tune an XGBoost Model.

We haven’t run any experiments yet, so we don’t know the ranges of good values for our hyperparameters. Therefore, we start with an educated guess, using our knowledge of the algorithm and the documentation of the hyperparameters for the built-in algorithms. This gives us a starting point for defining the search space.

Then we run a tuning job sampling from hyperparameters in the defined ranges. As a result, we can see which hyperparameter ranges yield good results. With this knowledge, we can refine the search space’s boundaries by narrowing or widening which hyperparameter ranges to use. We demonstrate how to learn from the trials in the next and final section, where we investigate and visualize the results.

In our next post, we’ll continue our journey and dive deeper. In addition, we’ll learn that there are several strategies that we can use to explore our search space. We’ll run subsequent HPO jobs to find even more performant values for our hyperparameters, while comparing these different strategies. We’ll also see how to run a warm start with SageMaker AMT to use the knowledge gained from previously explored search spaces in our exploration beyond those initial boundaries.

For this post, we focus on how to analyze and visualize the results of a single HPO job using the Bayesian search strategy, which is likely to be a good starting point.

If you follow along in the linked notebook, note that we pass the same estimator that we used for our single, built-in XGBoost training job. This estimator object acts as a template for new training jobs that AMT creates. AMT will then vary the hyperparameters inside the ranges we defined.

By specifying that we want to maximize our objective metric, validation:accuracy, we’re telling SageMaker AMT to look for these metrics in the training instance logs and pick hyperparameter values that it believes will maximize the accuracy metric on our validation data. We picked an appropriate objective metric for XGBoost from our documentation.

Additionally, we can take advantage of parallelization with max_parallel_jobs. This can be a powerful tool, especially for strategies whose trials are selected independently, without considering (learning from) the outcomes of previous trials. We’ll explore these other strategies and parameters further in our next post. For this post, we use Bayesian, which is an excellent default strategy.

We also set max_jobs to define how many trials to run in total. Feel free to deviate from our example and use a smaller number to save money.

n_jobs = 50
n_parallel_jobs = 3

tuner_parameters = {
    'estimator': estimator, # The same estimator object we defined above
    'base_tuning_job_name': 'bayesian',
    'objective_metric_name': 'validation:accuracy',
    'objective_type': 'Maximize',
    'hyperparameter_ranges': hpt_ranges,
    'strategy': 'Bayesian',
    'max_jobs': n_jobs,
    'max_parallel_jobs': n_parallel_jobs
}

We once again call fit(), the same way as when we launched a single training job earlier in the post, but this time on the tuner object rather than the estimator object. This kicks off the tuning job, and AMT in turn starts training jobs.

tuner = HyperparameterTuner(**tuner_parameters)
tuner.fit({'train': s3_input_train, 'validation': s3_input_valid}, wait=False)
tuner_name = tuner.describe()['HyperParameterTuningJobName']
print(f'tuning job submitted: {tuner_name}.')

The following diagram expands on our previous architecture by including HPO with SageMaker AMT.

Overview of SageMaker Training and hyperparameter optimization with SageMaker AMT

We see that our HPO job has been submitted. Depending on the number of trials (defined by n_jobs) and the level of parallelization, this may take some time. For our example, it may take up to 30 minutes for 50 trials with a parallelization level of only 3.

tuning job submitted: bayesian-221102-2053.
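Because we submitted the job with wait=False, the cell returns immediately. The following is a minimal sketch of how you could check on the job, or block until it finishes:

import boto3

sm = boto3.client('sagemaker')
status = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner_name
)['HyperParameterTuningJobStatus']
print(status)  # e.g. InProgress or Completed

# Alternatively, block until the tuning job completes:
# tuner.wait()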

When this tuning job is finished, let’s explore the information available to us on the SageMaker console.

Investigate AMT jobs on the console

Let’s find our tuning job on the SageMaker console by choosing Training in the navigation pane and then Hyperparameter tuning jobs. This gives us a list of our AMT jobs, as shown in the following screenshot. Here we locate our bayesian-221102-2053 tuning job and find that it’s now complete.

Console view of the Hyperparameter tuning jobs page. Image shows the list view of tuning jobs, containing our 1 tuning entry

Let’s have a closer look at the results of this HPO job.

We have explored extracting the results programmatically in the notebook: first via the SageMaker Python SDK, a higher-level open-source Python library that provides a dedicated API to SageMaker, and then through Boto3, which gives us lower-level access to SageMaker and other AWS services.

Using the SageMaker Python SDK, we can obtain the results of our HPO job:

sagemaker.HyperparameterTuningJobAnalytics(tuner_name).dataframe()[:10]

This allows us to analyze the results of each of our trials in a pandas DataFrame, as shown in the following screenshot.

Pandas table in Jupyter Notebook showing results and metadata from the trials run for our HPO job
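If you prefer the lower-level route that the notebook also demonstrates, a rough Boto3 equivalent might look like the following; it lists the individual trials of our tuning job sorted by their final objective metric:

import boto3

sm = boto3.client('sagemaker')

trials = sm.list_training_jobs_for_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner_name,
    SortBy='FinalObjectiveMetricValue',
    SortOrder='Descending',
    MaxResults=10
)
for summary in trials['TrainingJobSummaries']:
    metric = summary.get('FinalHyperParameterTuningJobObjectiveMetric', {})
    print(summary['TrainingJobName'], metric.get('Value'))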

Now let’s switch perspectives again and see what the results look like on the SageMaker console. Then we’ll look at our custom visualizations.

On the same page, choosing our bayesian-221102-2053 tuning job shows us the list of trials that were run for it. Each HPO trial here is a SageMaker Training job. Recall earlier when we trained our single XGBoost model and investigated the training job on the SageMaker console. We can do the same thing for our trials here.

As we investigate our trials, we see that bayesian-221102-2053-048-b59ec7b4 created the best performing model, with a validation accuracy of approximately 89.815%. Let’s explore what hyperparameters led to this performance by choosing the Best training job tab.

Console view of a single tuning job, showing a list of the training jobs that were run

We can see a detailed view of the best hyperparameters evaluated.

Console view of a single tuning job, showing the details of the best training job
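The same information is also available programmatically, which can be handy in a notebook. Here is a minimal sketch; tuner and hpt_ranges are the objects we defined earlier:

import boto3

# Name of the best training job according to our objective metric
best_job_name = tuner.best_training_job()

# Look up the hyperparameter values that were used for this trial
sm = boto3.client('sagemaker')
best_hps = sm.describe_training_job(TrainingJobName=best_job_name)['HyperParameters']
print(best_job_name)
print({k: v for k, v in best_hps.items() if k in hpt_ranges})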

We can immediately see what hyperparameter values led to this superior performance. However, we want to know more. We see that alpha takes on an approximate value of 0.052456 and, likewise, eta is set to 0.433495. This tells us that these values worked well, but it tells us little about the hyperparameter space itself. For example, we might wonder whether 0.433495 for eta was the highest value tested, or whether there’s room for growth and model improvement by selecting higher values.

For that, we need to zoom out and take a much wider view to see how other values for our hyperparameters performed. One way to look at a lot of data at once is to plot our hyperparameter values from our HPO trials on a chart. That way we can see how these values performed relative to one another. In the next section, we pull this data from SageMaker and visualize it.

Visualize our trials

The SageMaker SDK provides us with the data for our exploration, and the notebooks give you a peek into it. But there are many ways to utilize and visualize the data. In this post, we share a sample using the Altair statistical visualization library, which we use to produce a more visual overview of our trials. The visualization code is found in the amtviz package, which we provide as part of the sample:

from amtviz import visualize_tuning_job
visualize_tuning_job(tuner, trials_only=True)

The power of these visualizations becomes immediately apparent when plotting our trials’ validation accuracy (y-axis) over time (x-axis), as in the left chart of the following figure. We can clearly see the model performance improving as we run more trials over time. This is a direct and expected outcome of running HPO with a Bayesian strategy. In our next post, we see how this compares to other strategies and observe that this doesn’t need to be the case for all of them.

Two Charts showing HPO trails. Left Chart shows validation accuracy over time. Right chart shows density chart for validation accuracy values

After reviewing the overall progress over time, now let’s look at our hyperparameter space.

The following charts show validation accuracy on the y-axis, with each chart showing max_depth, alpha, eta, and min_child_weight on the x-axis, respectively. We’ve plotted our entire HPO job into each chart. Each point is a single trial, and each chart contains all 50 trials, but separated for each hyperparameter. This means that our best performing trial, #48, is represented by exactly one blue dot in each of these charts (which we have highlighted for you in the following figure). We can visually compare its performance within the context of all other 49 trials. So, let’s look closely.

Fascinating! We see immediately which regions of our defined ranges in our hyperparameter space are most performant. Thinking back to our eta value, it’s clear now that sampling values closer to 0 yielded worse performance, whereas moving closer to our boundary of 0.5 yields better results. The reverse appears to be true for alpha, and max_depth appears to have a more limited set of preferred values. Looking at max_depth, you can also see how the Bayesian strategy instructs SageMaker AMT to sample values it has learned work well more frequently.

Four charts showing validation accuracy on the y-axis, with each chart showing max_depth, alpha, eta, min_child_weight on the x-axis respectively. Each data point represents a single HPO trial
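If you prefer to build your own charts instead of using amtviz, the analytics DataFrame from earlier already contains everything you need. The following is a minimal matplotlib sketch; the column names follow the SageMaker analytics output, and the styling is our own:

import matplotlib.pyplot as plt
import sagemaker

# One column per tuned hyperparameter, plus the final objective value per trial
df = sagemaker.HyperparameterTuningJobAnalytics(tuner_name).dataframe()

fig, axes = plt.subplots(1, 4, figsize=(16, 4), sharey=True)
for ax, hp in zip(axes, ['max_depth', 'alpha', 'eta', 'min_child_weight']):
    ax.scatter(df[hp], df['FinalObjectiveValue'], alpha=0.6)
    ax.set_xlabel(hp)
axes[0].set_ylabel('validation:accuracy')
plt.tight_layout()
plt.show()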

Looking at our eta value, we might wonder whether it’s worth exploring more to the right, perhaps beyond 0.45. Does accuracy continue to trail off, or do we need more data here? Asking such questions is part of the purpose of running our first HPO job. It provides us with insights into which areas of the hyperparameter space we should explore further.

If you’re keen to know more, and are as excited as we are by this introduction to the topic, then stay tuned for our next post, where we’ll talk more about the different HPO strategies, compare them against each other, and practice training with our own Python script.

Clean up

To avoid incurring unwanted costs when you’re done experimenting with HPO, you must remove all files in your S3 bucket with the prefix amt-visualize-demo and also shut down Studio resources.

Run the following code in your notebook to remove all S3 files from this post.

!aws s3 rm s3://{BUCKET}/amt-visualize-demo --recursive

If you wish to keep the datasets or the model artifacts, you may modify the prefix in the code to amt-visualize-demo/data to only delete the data or amt-visualize-demo/output to only delete the model artifacts.
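For example, the following variants delete only the data or only the model artifacts:

!aws s3 rm s3://{BUCKET}/amt-visualize-demo/data --recursive     # remove only the datasets
!aws s3 rm s3://{BUCKET}/amt-visualize-demo/output --recursive   # remove only the model artifacts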

Conclusion

In this post, we trained and tuned a model using the SageMaker built-in version of the XGBoost algorithm. By using HPO with SageMaker AMT, we learned about the hyperparameters that work well for this particular algorithm and dataset.

We saw several ways to review the outcomes of our hyperparameter tuning job. Starting with extracting the hyperparameters of the best trial, we also learned how to gain a deeper understanding of how our trials had progressed over time and what hyperparameter values are impactful.

Using the SageMaker console, we also saw how to dive deeper into individual training runs and review their logs.

We then zoomed out to view all our trials together, and review their performance in relation to other trials and hyperparameters.

Based on the observations from each trial, we learned to navigate the hyperparameter space and saw that small changes to our hyperparameter values can have a large impact on model performance. With SageMaker AMT, we can run hyperparameter optimization to find good hyperparameter values efficiently and maximize model performance.

In the future, we’ll look into different HPO strategies offered by SageMaker AMT and how to use our own custom training code. Let us know in the comments if you have a question or want to suggest an area that we should cover in upcoming posts.

Until then, we wish you and your models happy learning and tuning!

References

Citations:

[1] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.


About the authors

Andrew Ellul is a Solutions Architect with Amazon Web Services. He works with small and medium-sized businesses in Germany. Outside of work, Andrew enjoys exploring nature on foot or by bike.

Elina Lesyk is a Solutions Architect located in Munich. Her focus is on enterprise customers from the Financial Services Industry. In her free time, Elina likes learning guitar theory in Spanish to cross-learn and going for a run.

Mariano Kamp is a Principal Solutions Architect with Amazon Web Services. He works with financial services customers in Germany on machine learning. In his spare time, Mariano enjoys hiking with his wife.
