Gain valuable ML skills with the AWS Machine Learning Engineer Nanodegree Scholarship from Udacity

Amazon Web Services is partnering with Udacity to help educate developers of all skill levels on machine learning (ML) concepts through the AWS Machine Learning Scholarship Program by Udacity, offering 425 scholarships with a focus on women and underrepresented groups.

Machine learning is an exciting and rapidly developing technology that has the power to create millions of jobs and transform our daily lives. According to the Future of Jobs Report 2020 by the World Economic Forum, by 2025, 97 million new roles may be created as a result of ML innovation. However, not enough of today’s developers have the skills to act on these opportunities. Limited access to high-quality education, the cost of traditional education, and the time required to start and complete new learning projects all make learning ML more complicated.

To address this challenge, AWS invests in educating developers, data scientists, and ML practitioners with a variety of education solutions, such as exploring reinforcement learning concepts with AWS DeepRacer, training and validation with the AWS Certified Machine Learning – Specialty certification, and hands-on tutorials from the AWS Machine Learning Community.

Up-leveling ML skills and opening new career opportunities

With the AWS Machine Learning Engineer Nanodegree by Udacity, developers can learn valuable skills for an ML career path through an interactive, cost-effective, and accessible ML education. All students who enroll in the scholarship program have access to AWS Machine Learning Foundations, a free course covering an introduction to ML concepts, including reinforcement learning, computer vision, and generative artificial intelligence, with expert-led interactive tutorials using AWS AI Devices such as AWS DeepRacer, AWS DeepLens, and AWS DeepComposer.

The AWS Machine Learning Scholarship Program is open to all for registration starting May 26, 2021, through June 23, 2021. Your learning journey begins with the free AWS Machine Learning Foundations course on June 28, 2021, in which you learn the fundamental aspects of ML, ML techniques and algorithms, programming best practices, Python coding, and interactive tutorials with AWS AI Devices. You have 3 months to study and complete your assessment by October 11, 2021, with the top 425 students eligible for a scholarship to the AWS Machine Learning Engineer Nanodegree.

The AWS Machine Learning Foundations Course (free) includes the following objectives:

  • Learn the fundamentals of ML
  • Learn object-oriented programming best practices
  • Learn computer vision with AWS DeepLens, reinforcement learning with AWS DeepRacer, and generative AI with AWS DeepComposer
  • Dedicate 3–5 hours a week to the course and work towards earning one of the follow-up Nanodegree program scholarships

In the Machine Learning Engineer Nanodegree program (a $1,000 value course), you learn advanced ML techniques and algorithms, including how to package and deploy models to a production environment.

“The AWS Machine Learning Nanodegree Program enabled me to learn and achieve valuable machine learning skills at my own pace with interactive modules that made learning fun and effective,” said Juv Chan, AWS ML Hero and AWS Machine Learning Nanodegree Program alumnus. “Carving out time to learn machine learning can be very hard, especially under the demanding schedules that software engineers work from. The flexibility offered by Udacity Nanodegrees lets me learn new skills on a timetable that works for me.”

This year, we added 100 additional scholarships on top of the 325 scholarships allocated in 2020, and updated the content for students with advanced ML techniques and algorithms and expert-led tutorials on deploying ML models at scale with Amazon SageMaker.

AWS is also collaborating with several nonprofit organizations through the We Power Tech Program to increase the diversity and talent in technical roles, including organizations like Girls In Tech and the National Society of Black Engineers. As part of these ongoing relationships, the nonprofit organizations will help encourage women and underrepresented groups to participate in the AWS Machine Learning Engineer Nanodegree Scholarship Program. Organizations like these develop programs to inspire, support, train, and empower people from underrepresented groups to pursue careers in tech.

“AWS strives to help level the playing field for women and people of color, who have been underrepresented in the tech industry for far too long. We are thrilled to collaborate with Udacity to make this sort of technical training more widely available and accessible,” said LaDavia Drane, global head of Inclusion, Diversity & Equity at AWS. “We look forward to seeing the incredible innovations in machine learning that are sure to come from this initiative.”

“Tech needs representation from women, BIPOC, and other marginalized communities in every aspect of our industry. Companies must make meaningful and measurable change in the areas of diversity, equity, and inclusion to reach their greatest potential, and skills training programs uniquely tailored to increase representation from these groups are necessary for technology to achieve all that it’s capable of. Girls in Tech applauds our collaborator AWS, as well as Udacity, for breaking down the barriers that so often leave women behind in tech. Together, we aim to give everyone a seat at the table,” said Adriana Gascoigne, Founder and CEO of Girls in Tech.

How the AWS Machine Learning Engineer Nanodegree Scholarship works

Scholarship enrollment is open from May 26, 2021, through June 23, 2021. Students begin their learning journey with the AWS Machine Learning Foundations Course, which runs from June 28, 2021, through October 11, 2021. At the end of the course, learners take an assessment; the top 425 students receive scholarships and go on to the AWS Machine Learning Engineer Nanodegree, which runs from October 25, 2021, through January 25, 2022.

Get started and leverage the community

Get started by enrolling now and accelerate your ML journey! Connect with experts and like-minded aspiring ML developers on the AWS Machine Learning Slack channel.


About the Author

Cameron Peron is Senior Marketing Manager for AWS AI/ML Education and the AWS AI/ML community. He evangelizes how AI/ML innovation solves complex challenges facing community, enterprise, and startups alike. Out of the office, he enjoys staying active with kettlebell-sport, spending time with his family and friends, and is an avid fan of Euro-league basketball.

Read More

How Contentsquare reduced TensorFlow inference latency with TensorFlow Serving on Amazon SageMaker

In this post, we present the results of a model serving experiment made by Contentsquare scientists with an innovative DL model trained to analyze HTML documents. We show how the Amazon SageMaker TensorFlow Serving solution helped Contentsquare address several serving challenges.

Contentsquare’s challenge

Contentsquare is a fast-growing French technology company empowering brands to build better digital experiences. In their own words, “Our experience analytics platform tracks and visualizes billions of digital behaviors, delivering intelligent recommendations that everyone can use to grow revenue, increase loyalty, and fuel innovation.”

Contentsquare scientists developed several ML and deep learning models, and wanted to find solutions for cost-effective and performant real-time model serving. For this experiment, they chose a custom multi-input, multi-task deep neural network developed with TensorFlow-backed Keras, which can answer several questions in one single inference on large payloads consisting of HTML pages.

Baseline deployment: Flask on Amazon EC2

As a baseline deployment, the Contentsquare team served the TensorFlow-backed Keras model from a Flask server hosted on an Amazon Elastic Compute Cloud (Amazon EC2) p2.xlarge GPU instance. Flask is a popular Python web framework (52,000 stars on GitHub) appreciated for its simplicity and large community. The EC2 p2.xlarge instance was fitted with an NVIDIA Tesla K80 GPU card. On a reference input payload used as a benchmark, this design provided a single-request inference latency of approximately 5 seconds.

Optimized deployment: TensorFlow Serving on SageMaker

To reduce management overhead and get a simpler deployment experience, the Contentsquare team experimented with Amazon SageMaker. SageMaker is a managed service supporting the development lifecycle of custom models, from annotation up to production deployment and monitoring. Beyond enabling a faster time to market, SageMaker provides state-of-the-art open-source pre-written serving containers for XGBoost (container, SDK), Scikit-Learn (container, SDK), PyTorch (container, SDK), TensorFlow (container, SDK) and Apache MXNet (container, SDK). In particular, the SageMaker TensorFlow serving container is built on top of TensorFlow Serving (TensorFlow-Serving: Flexible, High-Performance ML Serving, Olston et al.), the official, high-performance serving stack for TensorFlow. The SageMaker team further improved the TensorFlow Serving experience by adding the option to run custom inference code in front of TensorFlow Serving (for example, for pre or postprocessing).

Slim Frikha, a Contentsquare scientist, says, “That is one of the reasons why we use TensorFlow Serving on SageMaker: TensorFlow Serving runs performant inference, SageMaker provides easy deployment, and the combination of both brings the extra possibility to do preprocessing and postprocessing with TensorFlow Serving.”

Preprocessing and postprocessing are important capabilities that ML practitioners look for when choosing an ML serving solution. To use the custom processing capacity of SageMaker TensorFlow Serving, developers can provide a custom inference.py script containing handling functions. For more information, see Create Python Scripts for Custom Input and Output Formats.
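
The handler interface accepts an input_handler and an output_handler (or a single handler) defined in inference.py. The following is a minimal sketch of what such a script could look like for JSON payloads; the preprocessing step and the 'features' key are illustrative assumptions, not Contentsquare’s actual code:

import json

def input_handler(data, context):
    # Pre-process the request before it reaches TensorFlow Serving
    if context.request_content_type == 'application/json':
        payload = json.loads(data.read().decode('utf-8'))
        # Illustrative step: wrap features in the {"instances": [...]} format
        # expected by the TensorFlow Serving REST API
        return json.dumps({'instances': payload['features']})
    raise ValueError(f'Unsupported content type: {context.request_content_type}')

def output_handler(response, context):
    # Post-process the TensorFlow Serving response before returning it to the client
    if response.status_code != 200:
        raise ValueError(response.content.decode('utf-8'))
    return response.content, context.accept_header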

At a high level, the internal architecture of the current SageMaker TensorFlow Serving container works as follows. Two web servers are collocated on each instance of the endpoint’s instance fleet. An NGINX server handles communication with the requesting client and can optionally run ad hoc data processing via an inference.py script running in Gunicorn. A TensorFlow Serving server internally exposes TensorFlow models for consumption by the Gunicorn server. In-server communication between Gunicorn and TensorFlow Serving can use either REST or gRPC when a custom inference.py script is supplied, and REST in the default setup without the custom inference script. In both cases, external requests are made over REST.


The Contentsquare team tested both gRPC and HTTP for internal communication with TensorFlow Serving, and found gRPC to be much faster than HTTP, because HTTP required a JSON dump of the very large preprocessed input. On the specific benchmark inference payload, deploying with SageMaker TensorFlow Serving on an ml.p2.xlarge hosting instance reduced the global serving latency from 5 seconds to 3 seconds compared to Keras deployed in Flask on an Amazon EC2 p2.xlarge instance, a 40% improvement! This gain is driven by serving optimizations internal to TensorFlow Serving and by the decoding of inputs to TensorFlow tensors, which is faster over gRPC.

Conclusion

Contentsquare scientists successfully completed their benchmark and found a cost-effective, high-performance serving solution for their custom TensorFlow model that reduced latency by 40% vs. a reasonable baseline. Another axis of improvement, not evaluated in this benchmark but worth consideration for extra gains, would be to evaluate different instance types. For example, the EC2 G4 instances, more recent than the P2, demonstrated great performance and economics in several inference cases. If you are interested in learning more about TensorFlow Serving on SageMaker, you can find guidance in the documentation, view the container source code on GitHub and navigate our examples gallery.


About the Author

Olivier Cruchant is a Machine Learning Specialist Solutions Architect at AWS, based in Lyon, France. Olivier helps French customers – from small startups to large enterprises – develop and deploy production-grade machine learning applications. In his spare time, he enjoys reading research papers and exploring the wilderness with friends and family.

Read More

Host multiple TensorFlow computer vision models using Amazon SageMaker multi-model endpoints

Amazon SageMaker helps data scientists and developers prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML. SageMaker accelerates innovation within your organization by providing purpose-built tools for every step of ML development, including labeling, data preparation, feature engineering, statistical bias detection, AutoML, training, tuning, hosting, explainability, monitoring, and workflow automation.

Companies are increasingly training ML models based on individual user data. For example, an image sharing service designed to enable discovery of information on the internet trains custom models based on each user’s uploaded images and browsing history to personalize recommendations for that user. The company can also train custom models based on search topics for recommending images per topic. Building custom ML models for each use case leads to higher inference accuracy, but increases the cost of deploying and managing models. These challenges become more pronounced when not all models are accessed at the same rate but still need to be available at all times.

SageMaker multi-model endpoints provide a scalable and cost-effective way to deploy large numbers of ML models in the cloud. SageMaker multi-model endpoints enable you to deploy multiple ML models behind a single endpoint and serve them using a single serving container. Your application simply needs to include an API call with the target model to this endpoint to achieve low-latency, high-throughput inference. Instead of paying for a separate endpoint for every single model, you can host many models for the price of a single endpoint. For more information about SageMaker multi-model endpoints, see Save on inference costs by using Amazon SageMaker multi-model endpoints.

In this post, we demonstrate how to use SageMaker multi-model endpoints to host two computer vision models with different model architectures and datasets for image classification. In practice, you can deploy tens of thousands of models on multi-model endpoints.

Overview of solution

SageMaker multi-model endpoints work with several frameworks, such as TensorFlow, PyTorch, MXNet, and sklearn, and you can build your own container with a multi-model server. Multi-model endpoints are also supported natively in the following popular SageMaker built-in algorithms: XGBoost, Linear Learner, Random Cut Forest (RCF), and K-Nearest Neighbors (KNN). You can directly use the SageMaker-provided containers while using these algorithms without having to build your own custom container.

The following diagram is a simplified illustration of how you can host multiple (for this post, six) models using SageMaker multi-model endpoints. In practice, multi-model endpoints can accommodate hundreds to tens of thousands of ML models behind an endpoint. In our architecture, if we host more models using model artifacts stored in Amazon Simple Storage Service (Amazon S3), multi-model endpoints dynamically unload some of the least-used models to accommodate newer models.

In this post, we show how to host two computer vision models trained using the TensorFlow framework behind a single SageMaker multi-model endpoint. We use the TensorFlow Serving container enabled for multi-model endpoints to host these models. For our first model, we train a smaller version of AlexNet CNN to classify images from the CIFAR-10 dataset. For the second model, we use a VGG16 CNN model pretrained on the ImageNet dataset and fine-tuned on the Sign Language Digits Dataset to classify hand symbol images. We also provide a fully functional notebook to demonstrate all the steps.

Model 1: CIFAR-10 image classification

CIFAR-10 is a benchmark dataset for image classification in computer vision and ML. CIFAR images are colored (three channels) with dramatic variation in how the objects appear. It consists of 32 × 32 color images in 10 classes, with 6,000 images per class. It contains 50,000 training images and 10,000 test images. The following image shows a sample of the images grouped by the labels.

To build the image classifier, we use a simplified version of the classical AlexNet CNN. The original network is composed of five convolutional and pooling layers and three fully connected layers; our simplified architecture stacks three convolutional layers and two fully connected (dense) layers.

The first step is to load the dataset into train and test objects. The TensorFlow framework provides the CIFAR dataset for us to load using the load_data() method. Next, we rescale the input images by dividing the pixel values by 255: [0,255] ⇒ [0,1]. We also need to prepare the labels using one-hot encoding, a process by which categorical variables are converted into a numerical form. The following code snippet shows these steps in action:

import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras import utils

# load dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# rescale input images
X_train = X_train.astype('float32')/255
X_test = X_test.astype('float32')/255

# one hot encode target labels
num_classes = len(np.unique(y_train))
y_train = utils.to_categorical(y_train, num_classes)
y_test = utils.to_categorical(y_test, num_classes)

After the dataset is prepared and ready for training, it’s saved to Amazon S3 to be used by SageMaker. The model architecture and the training code for the image classifier are assembled into a training script (cifar_train.py). To generate batches of tensor image data for the training process, we use ImageDataGenerator. This enables us to apply data augmentation transformations like rotation, width, and height shifts to our training data.
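
The exact augmentation settings live in cifar_train.py; the following sketch shows one plausible configuration (the specific ranges, batch size, and epoch count are assumptions for illustration):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generate augmented batches: small rotations plus width and height shifts
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

# model is the compiled CNN defined in the training script
model.fit(datagen.flow(X_train, y_train, batch_size=64),
          epochs=50,
          validation_data=(X_test, y_test))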

In the next step, we use the training script to create a TensorFlow estimator using the SageMaker SDK (see the following code). We use the estimator to fit the CNN model on CIFAR-10 inputs. When the training is complete, the model is saved to Amazon S3.

from sagemaker.tensorflow import TensorFlow

model_name = 'cifar-10'

hyperparameters = {'epochs': 50}

estimator_parameters = {'entry_point':'cifar_train.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 2,
                        'model_dir': f'/opt/ml/model',
                        'role': role,
                        'hyperparameters': hyperparameters,
                        'output_path': f's3://{BUCKET}/{PREFIX}/cifar_10/out',
                        'base_job_name': f'mme-cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}

estimator_1 = TensorFlow(**estimator_parameters)

estimator_1.fit(inputs)

Later, we demonstrate how to host this model using a SageMaker multi-model endpoint alongside our second model (the sign language digits classifier).

Model 2: Sign language digits classification

For our second model, we use the sign language digits dataset. This dataset distinguishes the sign language digits from 0–9. The following image shows a sample of the dataset.

The dataset contains 100 x 100 images in RGB color and has 10 classes (digits 0–9). The training set contains 1,712 images, the validation set 300, and the test set 50.

This dataset is very small. Training a network from scratch on this small a dataset doesn’t achieve good results. To achieve higher accuracy, we use transfer learning. Transfer learning is usually the go-to approach when starting a classification project, especially when you don’t have much training data. It migrates the knowledge learned from the source dataset to the target dataset, to save training time and computational cost.

To train this model, we use a pretrained VGG16 CNN model trained on the ImageNet dataset and fine-tune it to work on our sign language digits dataset. A pretrained model is a network that has been previously trained on a large dataset, typically on a large-scale image classification task. The VGG16 model architecture we use has 13 convolutional layers in total. For the sign language dataset, because its domain is different from the source domain of the ImageNet dataset, we only fine-tune the last few layers. Fine-tuning here refers to freezing a few of the network layers that are used for feature extraction, and jointly training both the non-frozen layers and the newly added classifier layers of the pretrained model.

The training script (sign_language_train.py) encapsulates the model architecture and the training logic for the sign language digits classifier. First, we load the pretrained weights from the VGG16 network trained on the ImageNet dataset. Next, we freeze part of the feature extractor part, followed by adding the new classifier layers. Finally, we compile the network and run the training process to optimize the model for the smaller dataset.
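
A minimal sketch of this fine-tuning setup with TensorFlow Keras follows; the number of frozen layers, the classifier head, and the 224 x 224 input size are assumptions, and sign_language_train.py may differ in its details:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model

# Load the VGG16 feature extractor with ImageNet weights, without the top classifier
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the earlier convolutional blocks; keep the last few layers trainable
for layer in base_model.layers[:-4]:
    layer.trainable = False

# Add new classifier layers for the 10 sign language digit classes
x = Flatten()(base_model.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = Dense(10, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])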

Next, we use this training script to create a TensorFlow estimator using the SageMaker SDK. This estimator is used to fit the sign language digits classifier on the supplied inputs. When the training is complete, the model is saved to Amazon S3 to be hosted by SageMaker multi-model endpoints. See the following code:

model_name = 'sign-language'

hyperparameters = {'epochs': 50}

estimator_parameters = {'entry_point':'sign_language_train.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 2,
                        'hyperparameters': hyperparameters,
                        'model_dir': f'/opt/ml/model',
                        'role': role,
                        'output_path': f's3://{BUCKET}/{PREFIX}/sign_language/out',
                        'base_job_name': f'cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}

estimator_2 = TensorFlow(**estimator_parameters)

estimator_2.fit({'train': train_input, 'val': val_input})

Deploy a multi-model endpoint

SageMaker multi-model endpoints provide a scalable and cost-effective solution to deploy large numbers of models. They use a shared serving container that is enabled to host multiple models. This reduces hosting costs by improving endpoint utilization compared to using single-model endpoints. It also reduces deployment overhead because SageMaker manages loading models in memory and scaling them based on their traffic patterns.

To create the multi-model endpoint, first we need to copy the trained models for the individual estimators (1 and 2) from their saved S3 locations to a common S3 prefix that can be used by the multi-model endpoint:

tf_model_1 = estimator_1.model_data
output_1 = f's3://{BUCKET}/{PREFIX}/mme/cifar.tar.gz'

tf_model_2 = estimator_2.model_data
output_2 = f's3://{BUCKET}/{PREFIX}/mme/sign-language.tar.gz'

!aws s3 cp {tf_model_1} {output_1}
!aws s3 cp {tf_model_2} {output_2}

After the models are copied to the common location designated by the S3 prefix, we create a serving model using the TensorFlowModel class from the SageMaker SDK. The serving model is created for one of the models to be hosted under the multi-model endpoint. In this case, we use the first model (the CIFAR-10 image classifier). Next, we use the MultiDataModel class from the SageMaker SDK to create a multi-model data model using the serving model for model-1, which we created in the previous step:

from sagemaker.tensorflow.serving import TensorFlowModel
from sagemaker.multidatamodel import MultiDataModel

model_1 = TensorFlowModel(model_data=output_1, 
                          role=role, 
                          image_uri=IMAGE_URI)

mme = MultiDataModel(name=f'mme-tensorflow-{current_time}',
                     model_data_prefix=model_data_prefix,
                     model=model_1,
                     sagemaker_session=sagemaker_session)

Finally, we deploy the MultiDataModel by calling the deploy() method, providing the attributes needed to create the hosting infrastructure required to back the multi-model endpoint:

predictor = mme.deploy(initial_instance_count=2,
                       instance_type='ml.m5.2xlarge',
                       endpoint_name=f'mme-tensorflow-{current_time}')

The deploy call returns a predictor instance, which we can use to make inference calls. We see this in the next section.

Test the multi-model endpoint for real-time inference

Multi-model endpoints enable sharing memory resources across your models. If the model to be referenced is already cached, multi-model endpoints run inference immediately. On the other hand, if the particular requested model isn’t cached, SageMaker has to download the model, which increases latency for that initial request. However, this takes only a fraction of the time it would take to launch an entirely new infrastructure (instances) to host the model individually on SageMaker. After a model is cached in the multi-model endpoint, subsequent requests are served in real time (unless the model is removed). As a result, you can run many models from a single instance, effectively decoupling the quantity of models from the cost of deployment. This makes it easy to manage ML deployments at scale and lowers your model deployment costs through increased usage of the endpoint and its underlying compute instances. For more information and a demonstration of cost savings of over 90% for a 1,000-model example, see Save on inference costs using Amazon SageMaker multi-model endpoints.

Multi-model endpoints also unload unused models from the container when the instances backing the endpoint reach memory capacity and more models need to be loaded into its container. SageMaker deletes unused model artifacts from the instance storage volume when the volume is reaching capacity and new models need to be downloaded. The first invocation to a newly added model takes longer because the endpoint takes time to download the model from Amazon S3 to the container’s memory of the instances backing the multi-model endpoint. Models that are unloaded remain on the instance’s storage volume and can be loaded into the container’s memory later without being downloaded again from the S3 bucket.

Let’s see how to make an inference from the CIFAR-10 image classifier (model-1) hosted under the multi-model endpoint. First, we load a sample image from one of the classes (airplane) and prepare it to be sent to the multi-model endpoint using the predictor we created in the previous step.

With this predictor, we can call the predict() method along with the initial_args parameter, which specifies the name of the target model to invoke. In this case, the target model is cifar.tar.gz. The following snippet demonstrates this process in detail:

import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# load and preprocess the sample image (the CIFAR-10 model expects 32 x 32 RGB inputs)
img = load_img('./data/cifar_10/raw_images/airplane.png', target_size=(32, 32))
data = img_to_array(img)
data = data.astype('float32')
data = data / 255.0
data = data.reshape(1, 32, 32, 3)
payload = {'instances': data}

# invoke the multi-model endpoint, routing the request to the CIFAR-10 model
y_pred = predictor.predict(data=payload, initial_args={'TargetModel': 'cifar.tar.gz'})

# CIFAR10_LABELS is the list of class names defined earlier in the notebook
predicted_label = CIFAR10_LABELS[np.argmax(y_pred)]
print(f'Predicted Label: [{predicted_label}]')

Running the preceding code returns the prediction output as the label airplane, which is correctly interpreted by our served model:

Predicted Label: [airplane]

Next, let’s see how to dynamically load the sign language digit classifier (model-2) into a multi-model endpoint by invoking the endpoint with sign-language.tar.gz as the target model.

We use the following sample image of the hand sign digit 0.

The following snippet shows how to invoke the multi-model endpoint with the sample image to get back the correct response:

import numpy as np
import matplotlib.image as mpimg
from tensorflow.keras.preprocessing import image

test_path = './data/sign_language/test'

# load the sample image (used for display in the notebook)
img = mpimg.imread(f'{test_path}/0/IMG_4159.JPG')

def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

data = path_to_tensor(f'{test_path}/0/IMG_4159.JPG')
payload = {'instances': data}

# invoke the multi-model endpoint, routing the request to the sign language model
y_pred = predictor.predict(data=payload, initial_args={'TargetModel': 'sign-language.tar.gz'})
predicted_label = np.argmax(y_pred)
print(f'Predicted Label: [{predicted_label}]')

The following code is our response, with the label 0:

Predicted Label: [0]

Conclusion

In this post, we demonstrated how to use SageMaker multi-model endpoints to optimize inference costs. Multi-model endpoints are useful when you’re dealing with hundreds to tens of thousands of models and don’t need to deploy each model as an individual endpoint. Models are loaded and unloaded dynamically, according to usage and the amount of memory available on the endpoint.

This post discussed how to host multiple computer vision models trained using the TensorFlow framework under one SageMaker multi-model endpoint. The image classification models were of different model architectures and trained on different datasets. The notebook included with the post provides detailed instructions on training and hosting the models.

Give SageMaker multi-model endpoints a try for your use case and leave your feedback in the comments.


About the Authors

Arunprasath Shankar is an Artificial Intelligence and Machine Learning (AI/ML) Specialist Solutions Architect with AWS, helping global customers scale their AI solutions effectively and efficiently in the cloud. In his spare time, Arun enjoys watching sci-fi movies and listening to classical music.


Mark Roy is a Principal Machine Learning Architect for AWS, helping AWS customers design and build AI/ML solutions. Mark’s work covers a wide range of ML use cases, with a primary interest in computer vision, deep learning, and scaling ML across the enterprise. He has helped companies in many industries, including Insurance, Financial Services, Media and Entertainment, Healthcare, Utilities, and Manufacturing. Mark holds six AWS certifications, including the ML Specialty Certification. Prior to joining AWS, Mark was an architect, developer, and technology leader for 25+ years, including 19 years in financial services.

Read More

Your Guide to the AWS Machine Learning Summit

We’re about a week away from the AWS Machine Learning Summit and if you haven’t registered yet, you better get on it! On June 2, 2021 (Americas) and June 3, 2021 (Asia-Pacific, Japan, Europe, Middle East, and Africa), don’t miss the opportunity to hear from some of the brightest minds in machine learning (ML) at the free virtual AWS Machine Learning Summit. This Summit, which is open to all, brings together industry luminaries, AWS customers, and leading ML experts to share the latest in ML. You’ll learn about science breakthroughs in ML, how ML is impacting business, best practices in building ML, and how to get started now without prior ML expertise. This post is your guide to navigating the Summit.

The day kicks off with a keynote from ML leaders from across AWS, Amazon, and the industry, including Swami Sivasubramanian, VP of AI and Machine Learning, AWS; Bratin Saha, VP of Machine Learning, AWS; and Yoelle Maarek, VP of Research, Alexa Shopping, who will share a keynote on how we’re applying customer-obsessed science to advance ML. You’ll also hear from Ashok Srivastava, Senior Vice President and Chief Data Officer at Intuit, about how the company is scaling its ML and AI to create new customer experiences.

Next, tune in for an exclusive fireside chat with Andrew Ng, founder and CEO of Landing AI and founder of deeplearning.ai, and Swami Sivasubramanian about the future of ML, the skills that are fundamental for the next generation of ML practitioners, and how we can bridge the gap from proof of concept to production in ML.

From there, pick the track that best matches your interests or mix and match throughout the day. We’ll also have expert Q&A available from 11:00am–3:00pm local time.

The science of machine learning

If you’re an advanced practitioner or just really interested in the science of ML, this track provides a technical deep dive into the groundbreaking work that ML scientists within AWS, Amazon, and beyond are doing to advance the science of ML in areas including computer vision, natural language processing, bias, and more.

Speakers include two Amazon Scholars, Michael Kearns and Kathleen McKeown. Kearns is a professor in the Computer and Information Science department at the University of Pennsylvania, where he holds the National Center Chair. He is co-author of the book “The Ethical Algorithm: The Science of Socially Aware Algorithm Design,” and joined Amazon as a scholar in June 2020. McKeown is the Henry and Gertrude Rothschild professor of computer science at Columbia University, and the founding director of the school’s Data Science Institute. She joined Amazon as a scholar in 2019.

You’ll also get an inside look at trends in deep learning and natural language in a powerhouse fireside chat with Amazon distinguished scientists Alex Smola and Bernhard Schölkopf, and Alexa AI Senior Principal Scientist Dilek Hakkani-Tur.

The impact of machine learning

If you’re a technical business leader, you won’t want to miss this track where you’ll learn from AWS customers that are leading the way in ML adoption. Customers including 3M, AstraZeneca, Vanguard, Carbon Lighthouse, ADP, and Bundesliga will share how they’re applying ML to create efficiencies, deliver new revenue streams, and launch entirely new products and business models. You’ll get best practices for scaling ML in an organization and showing impact.

How machine learning is done

If you’re a data scientist or ML developer, join this track for practical deep dives into tools that can speed up the entire ML lifecycle, from building to training to deploying ML models. Sessions include how to choose the right algorithms, more accurate and speedy data prep, model explainability, and more.

Machine learning: No expertise required

If you’re a developer who wants to apply ML and AI to a use case but you don’t have the expertise, this track is for you. Learn how to use AWS AI services and other tools to get started with your ML project right away, for use cases including contact center intelligence, personalization, intelligent document processing, business metrics analysis, computer vision, and more. You’ll also learn from customers like Fidelity on how they’re applying ML to business problems like DevOps.

For more details, visit the website and we’ll see you there!


About the Author

Laura Jones is a product marketing lead for AWS AI/ML where she focuses on sharing the stories of AWS customers and educating organizations on the impact of machine learning. As a Florida native living and surviving in rainy Seattle, she enjoys coffee, attempting to ski and enjoying the great outdoors.

Read More

It’s a wrap for Amazon SageMaker Month, 30 days of content, discussions, and news

Did you miss SageMaker Month? Look no further than this round-up post to get caught up. In this post, we share key highlights and learning materials to accelerate your machine learning (ML) innovation.

On April 20, 2021, we launched the first ever Amazon SageMaker Month, 30 days of hands-on workshops, tech talks, Twitch sessions, blog posts, and playbooks. Our goal with SageMaker Month was to connect you with AWS experts, getting started resources, workshops, and learning content to be successful with ML. The following is a summary of what you can access on-demand to get started on your ML journey with Amazon SageMaker.

Introducing SageMaker Savings Plans

To kick off SageMaker month, we introduced Amazon SageMaker Savings Plans, a flexible, usage-based pricing model for SageMaker. The goal of SageMaker Savings Plans is to offer you the flexibility to save up to 64% on SageMaker ML instance usage in exchange for a commitment of consistent usage for a 1-year or 3-year term. In addition, to help you save even more, we announced a price drop on SageMaker CPU and GPU instances.

To enable customers to save more on SageMaker, we hosted a SageMaker Friday Twitch session with Greg Coquillo, the second-most influential speaker according to LinkedIn Top Voices 2020: Data Science & AI, along with Julien Simon and Segolene Dessertine-Panhard, outlining cost-optimization techniques using SageMaker and SageMaker Savings Plans.

SageMaker Savings Plans enhance the productivity and cost-optimizing capabilities already available in Amazon SageMaker Studio, which can improve your data science team’s productivity up to 10 times. Studio provides a single visual interface where you can perform all your ML development steps. Studio also gives you complete access, control, and visibility into each step required to build, train, and deploy models. To enable your teams to move faster and boost productivity, learn how to customize your Studio notebooks.

Getting started with ML

SageMaker is the most comprehensive ML service, purpose-built for every step of the ML development lifecycle. SageMaker provides all the components used for ML in a single service, so you can prepare data and build, train, and deploy models.

Data preparation is the first step of building an ML model. It’s a time-consuming and involved process that is largely undifferentiated. We hear from our customers that it constitutes up to 80% of their time during ML development. Data preparation has always been considered tedious and resource intensive, due to the inherent nature of data being “dirty” and not ready for ML in its raw form. “Dirty” data could include missing or erroneous values, outliers, and more. Feature engineering is often needed to transform the inputs to deliver more accurate and efficient ML models. To help with feature engineering, Amazon SageMaker Feature Store offers a purpose-built repository to store, update, retrieve, and share ML features within development teams.

Another challenge with data preparation is that it often requires multiple steps. Although most standalone data preparation tools provide data transformation, feature engineering, and visualization, few tools provide built-in model validation. And all of these data preparation steps are considered separate from ML. What’s needed is a framework that provides all these capabilities in one place and is tightly integrated with the rest of the ML pipeline. Most standalone tools for data preparation treat it as an extract, transform, and load (ETL) workload, making it tedious to iteratively prepare data, validate the model on test datasets, deploy it in production, and go back to ingesting new data sources and performing additional feature engineering. Most iterative data preparation is divorced from deployment. Therefore, data preparation modules need curation and integration before they’re deployed in production. These practices in ML are sometimes referred to as MLOps.

To help you overcome these challenges, you can use Amazon SageMaker Data Wrangler, a capability to simplify the process of data preparation, feature engineering, and each step of the data preparation workflow, including data selection, cleansing, and exploration on a single visual interface. As part of SageMaker Month, we created a step-by-step tutorial on how you can prepare data for ML with Data Wrangler. In addition, you can learn how financial customers use SageMaker every day to predict credit risk and approve loans. This example uses Data Wrangler and Amazon SageMaker Clarify to detect bias during the data preparation stage.

Another part of the data preparation stage is labeling data. Data labeling is the task of identifying objects in raw data, such as text, images, and videos, and tagging them with labels that help your ML model make accurate predictions and estimations. For example, in an autonomous vehicle use case, Light Detection and Ranging (LIDAR) devices are commonly used to capture and generate three-dimensional point cloud data, a representation of the physical space at a single point in time. For this use case, you need to label your data captured both in 2D and 3D spaces to produce highly accurate predictions of vehicles, lanes, and pedestrians. Amazon SageMaker Ground Truth, a fully managed data labeling service, makes it easy to build highly accurate training datasets for ML in 2D and 3D spaces using custom or built-in data labeling workflows. To help you label your data, we created how-to blog posts that showcase how to annotate 3D point cloud data and automate data labeling workflows for an autonomous vehicle use case with Ground Truth.

After you build your ML model, you must train and tune it to achieve the highest accuracy. Improving a model’s performance is an experimental and iterative process. For SageMaker Month, we consolidated a few techniques and best practices on how to train and tune high-quality deep learning models with complete visibility using SageMaker.

When you’re satisfied with your model’s accuracy, understanding how to deploy and manage models at scale is key. For model deployment and management, we showcase an example where an application developer is using SageMaker multi-model endpoints to host thousands of models and pipelines to automate retraining to improve recommendations across different US cities.

When it’s time to deploy your model and make predictions, a process called inference, you can use SageMaker for inference in the cloud or on edge devices. Amazon SageMaker Neo automatically compiles ML models for any ML framework and any target hardware. A Neo-compiled model can run YOLOv4 inference twice as fast. You can also reduce ML inference costs on SageMaker with hardware and software acceleration.

As part of SageMaker Month, we also launched an example use case that shows how you can use Amazon SageMaker Edge Manager, a capability to optimize, secure, monitor, and maintain ML models on fleets of smart cameras, robots, personal computers, and mobile devices. This blog outlines how to manage and monitor models on edge devices such as wind turbines.

Finally, to bring all our SageMaker capabilities together and help you move from model ideation to production, we created an on-demand introduction to SageMaker workshop similar to the virtual hands-on workshops we conducted live and during recent AWS Summits. It includes everything you need to get started with SageMaker at your own pace.

ML through our Partners

As part of SageMaker Month, we partnered with Tableau and DOMO to empower data and business analysts with ML-powered insights without needing any ML expertise. With the right data readily available, you can use ML and business intelligence (BI) tools to help make predictions needed to automate and speed up critical business processes and workflows.

We partnered with DOMO to enable ML for everyone with SageMaker. Domo AutoML, powered by Amazon SageMaker Autopilot, provides insights to complex business problems and automates the end-to-end decision-making process. This helps organizations improve decision-making and adapt faster to business changes.

We also partnered with Tableau to create a blog post and tech talk that showcases an end-to-end demo and new Quick Start solution that makes it easy for data analysts to use ML models deployed on SageMaker directly in their Tableau dashboards without writing any custom integration code.

What’s next

SageMaker Month focused on cost savings and optimization, getting started with ML, and learning content to accelerate ML innovation. As we wrap up SageMaker Month, we’re excited to share the upcoming and first ever virtual AWS Machine Learning Summit on June 2, 2021. The summit brings together industry-leading scientists, AWS customers, and experts to dive deep into the art, science, and impact of ML. Attend for free, choose from over 30 sessions, and interact with leaders in a live Q&A.


About the Author

Shashank Murthy is a Senior Product Marketing Manager with AWS Machine Learning. His goal is to make it easy for customers to build, train, and deploy machine learning models using Amazon SageMaker. For fun outside work, Shashank likes to hike the Pacific Northwest, play soccer, and run obstacle course races.

Read More

Enhance sports narratives with natural language generation using Amazon SageMaker

This blog post was co-authored by Arbi Tamrazian, Director of Data Science and Machine Learning at Fox Sports

FOX Sports is the sports television arm of FOX Network. The company used machine learning (ML) and Amazon SageMaker to streamline the production of relevant in-game storylines for commentators to use during live broadcasts.

“We collaborated with the Amazon Machine Learning Solutions Lab to build a natural language generation (NLG) engine that automatically produces sports narratives for commentators to use during games. Leveraging Amazon SageMaker, the Amazon Machine Learning Solutions Lab developed a model pipeline that generates natural-sounding sports narratives from a ML model trained on billions of English texts and sports stats snippets. In just a few short weeks, the NLG solution achieved BLEU scores above 99% on unseen Fox Sports testing dataset, significantly improving the readability of narratives compared to test benchmarks. Standardizing our ML workloads on Amazon SageMaker will enable our broadcasters to engage fans with pertinent gameday stories, in real-time.” – Arbi Tamrazian, Director of Data Science and Machine Learning, Fox Sports

Objectives

As viewers may have noticed, sports broadcasters are increasingly sharing statistical insights throughout the game to tell a richer story for the audience. Thanks to an abundance of data and advanced stats such as NFL Next Gen Stats powered by AWS, broadcasters can quickly tell stories and make comparisons between teams and players to keep viewers engaged.

Due to the fast-paced nature of many games, broadcasters rely on template-generated narratives to speak about in-game statistics in real time. These rule-based templates “stitch” tabular information and create narratives with fixed sentence structures that sometimes sound rigid and are hard to understand. It’s also becoming harder to build and maintain templates to keep up the pace with the introduction of new statistics.

To improve the broadcasting experience, Fox Sports turns to AWS and its artificial intelligence technologies to convert their real-time data into easy-to-understand narratives for commentators and audiences. The Amazon ML Solutions Lab partnered with Fox Sports to design and implement an end-to-end ML system using natural language generation (NLG), a technique to generate natural language descriptions from structured data. The objective of the partnership is to produce more natural-sounding narratives compared to the rule-based templates in a scalable fashion. The system enables Fox Sports to expand their rule-based generation engine into an ML solution. The model is trained to understand the semantic meaning of inputs, and can be expanded to new statistics and other sports by fine-tuning with a few hundred sample narratives.

In this post, we walk you through how to fine-tune a pretrained language model to generate sentences similar to those from rule-based templates. In addition, we show how to use different NLG techniques to make the sentences sound more natural, which leads to improved fan experiences and reduced cost in building and maintaining templates.

Template for an ML approach

The first phase of the NLG-based narrative generation solution relies on tabular features, including player and team names, metrics, and game situations. These features are paired with their target sequences, which are generated using predefined rule-based templates. The goal here is to use NLG to take the tabular features and generate candidate narratives containing all the relevant information.

Dataset

To train this model, we use a dataset synthetically generated by Fox Sports using the current rule-based methodology. The dataset is generated by permuting different statistics, feature values, and team and player names, and includes more than 57,000 samples, each with 8 features. For each sample, we have the narrative generated from a rule-based template as our target. We randomly shuffle and divide the dataset into training, validation, and testing sets based on an 80/10/10 split for training and fine-tuning our models.

The following table shows examples of the raw data used in this experiment—each row represents a record, and each column represents the relevant information associated with the record, including the statistic, values for the statistic, situation that the statistic is calculated upon, and more. For this post, we replace actual team and player names with generic names: team Bobcats and player John Peccy.

Statistic | Situation | Value | Time frame | Rank | Rank Order | Population | Team name / Player name
rec_td | stadium_retractable_dome | 5 | season | 7 | True | 32 | Bobcats
qbkd | score_differential_trailing | 3 | season | 2 | False | 190 | John Peccy

For each row, the raw tabular features are concatenated to form a text sequence. The following table shows examples of the text sequences used as input and the associated narrative from the rule-based template as output.

Template input | Template output
rec_td stadium_retractable_dome 5 season 7 TRUE 32 Bobcats | Bobcats’ 5 caught passes for touchdowns when playing in a retractable roof is the 7th highest out of 32 in the NFL this season.
qbkd score_differential_trailing 3 season 2 FALSE 190 John Peccy | John Peccy’s 3 credited QB knockdowns when trailing is the 2nd lowest out of 190 in the NFL this season.

Methods and metrics

The task of translating tabular features to natural sentences is a subtask of natural language generation. Because transfer learning has proved effective at this task, we utilize a language model called T5 (Text-To-Text Transfer Transformer), which was pretrained on the open-source dataset C4 (Colossal Clean Crawled Corpus). T5 achieves state-of-the-art results on many NLP benchmarks and is flexible to be fine-tuned to different NLP tasks. To fine-tune the T5 model for Fox Sports, we concatenate the tabular features into a single sequence of text as our training input. Then we use the template-generated statements as labels. For example, the following table is translated into the text sequence Team Bobcats, prss, 4, score_differential_leading, 7.

Team name | Metric | Value | Situation | Rank
Bobcats | prss | 4 | score_differential_leading | 7

The corresponding template statement – “The Bobcats’ 4 total times of pressuring the quarterback when leading is the 7th highest in the NFL this season” – is passed in as the target output. After fine-tuning the T5 model with thousands of such examples, the model is able to generate statements similar to the template. It even works for previously unseen input, making it extensible to fresh players and newly created metrics.
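
As an illustration of the inference step, the following sketch loads a fine-tuned checkpoint with the Hugging Face Transformers library and generates a narrative for one concatenated input; the checkpoint path and generation settings are assumptions:

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Hypothetical local path to the fine-tuned checkpoint
model_dir = './t5-narratives-finetuned'
tokenizer = T5Tokenizer.from_pretrained(model_dir)
model = T5ForConditionalGeneration.from_pretrained(model_dir)

# Concatenated tabular features, in the same format used during fine-tuning
input_text = 'Team Bobcats, prss, 4, score_differential_leading, 7'
input_ids = tokenizer(input_text, return_tensors='pt').input_ids

output_ids = model.generate(input_ids, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))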

We use the BLEU (Bilingual Evaluation Understudy) performance metric to quantitatively measure model performance. BLEU measures the matching quality of a generated sentence to a ground truth sentence by assigning a score from 0–100, with 100 being a perfect match to the ground truth. After fine-tuning on a few thousand sentences, the T5 model is able to achieve a BLEU score of above 99 on the test set, an indication that most of the generated sentences are identical to template-generated sentences. It also echoes the usefulness of models pretrained on abundantly available unlabeled text for different downstream tasks.
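
As a sketch of how such an evaluation could be computed with the sacrebleu package (the example sentences and variable names are illustrative):

import sacrebleu

# Model-generated narratives and their template-generated references
hypotheses = ['The Bobcats have pressured the quarterback 4 times when leading, the 7th highest in the NFL this season.']
references = [["The Bobcats' 4 total times of pressuring the quarterback when leading is the 7th highest in the NFL this season."]]

# corpus_bleu expects a list of hypotheses and a list of reference streams
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f'BLEU: {bleu.score:.2f}')  # scored on a 0-100 scale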

Improving comprehensibility

The template-generated narratives capture core details, but are repetitive and sometimes difficult to read because they follow the same predefined sentence structure. This leads to confusion for the broadcasters and fans. To address this drawback, we include a second phase of modeling, which employs language models to enhance the readability and comprehensibility of the fine-tuned T5 model’s generated narratives. This step’s objective is to make the narratives sound more natural, allowing commentators to easily communicate the information during live broadcasting.

Language processing methods

One way to replace unnatural words in sentences is through back translation. Back translation is a two-step translation method. It first translates a sentence into another language, and then translates the sentence back to its original language. It’s a technique used mostly for text data augmentation, namely, increasing the variety of original text. For this use case, we find that translation models trained on a large text corpus can help fix mistakes in the original sentence. During back translation, a singular noun may be corrected to a plural. The model may also choose more natural-sounding language. This approach gives us an automatic way to improve readability for our generated sentences.
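
A sketch of back translation through German using pretrained MarianMT models from the Hugging Face Transformers library follows; the pivot language and model choice are assumptions, not necessarily what the production pipeline uses:

from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    # Load a pretrained translation model and its tokenizer
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors='pt', padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

sentences = ["The Bobcats' 4 total times of pressuring the quarterback when leading is the 7th highest out of 32 in the NFL this season."]

# English -> German, then German -> English
german = translate(sentences, 'Helsinki-NLP/opus-mt-en-de')
back_translated = translate(german, 'Helsinki-NLP/opus-mt-de-en')
print(back_translated[0])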

An alternative natural language processing (NLP) approach to back translation is called paraphrasing—a technique that aims to express semantically similar narratives in different forms. We employ a pretrained T5 model, which is fine-tuned for paraphrasing purposes using the open-sourced paraphraser dataset PAWS. Our paraphrasing model generates several candidates for a given narrative with slightly different content. One major advantage of using this technique is that it offers several narratives per input. This gives us several candidate sentences, from which we can choose the version that best fits Fox Sports’s business needs. An example of the paraphrasing output against a sample sentence is shown in the following table.

Type | Sentence
Original | The Bobcats’ 4 total times of pressuring the quarterback when leading is the 7th highest out of 32 in the NFL this season.
Paraphrased 1 | The Bobcats pressing the quarterback 4 times when leading this season is the 7th best out of 32 in the NFL.
Paraphrased 2 | The Bobcats’ 4 total times of pressuring quarterback in leading is the 7th highest out of 32 in the NFL this season.
Paraphrased 3 | The Bobcats have pressured the quarterback 4 times total when leading—the 7th highest out of 32 in the NFL this season.

Model evaluation

Quantitatively evaluating how natural a sentence sounds is an ongoing challenge in the NLP community. For this project, we use an existing metric called perplexity. Perplexity is a proxy measure of how “surprised” a language model is by a sentence. In other words, it measures how common an evaluation sentence is within the text corpus used to train the language model, which can be used to compare the quality of different sentences. A language model such as GPT-2 typically assigns a low perplexity score to real and syntactically correct sentences and a high score to fake, incorrect, or highly infrequent sentences. For example, GPT-2 assigns a lower score to sentences like “Can you do it?” and a higher score to sentences like “Can you does it?” With this, we can compare the quality of generated sentences sharing similar semantic meanings and output the one with the lowest perplexity score.
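
A minimal sketch of scoring sentences with GPT-2 perplexity using the Transformers library, assuming a stock gpt2 checkpoint rather than any domain-adapted model:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

def perplexity(sentence):
    # Lower perplexity means the sentence looks more natural to the language model
    encodings = tokenizer(sentence, return_tensors='pt')
    with torch.no_grad():
        outputs = model(**encodings, labels=encodings['input_ids'])
    return torch.exp(outputs.loss).item()

print(perplexity('Can you do it?'))    # lower score
print(perplexity('Can you does it?'))  # higher score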

Architecture

Our final product is an end-to-end ML workflow using SageMaker. To meet Fox Sports’ needs, the workflow ensures that the following two criteria are satisfied:

  • The end-to-end results must include all the required features defined by a user
  • The final narrative output of the models shouldn’t be harder to read than the original rule-based template narrative

Our solution consists of two major components:

  1. Replace the current rule-based approach with the fine-tuned T5 model
  2. Enhance the generated narratives through a multi-step ML-based approach

In the combined workflow, the fine-tuned T5 ML model generates the narratives. Next, the narratives are passed through the back translation model in an attempt to produce enhanced narratives. If the back translated results include the necessary keywords and their perplexity scores are lower compared to the T5 model outputs, they’re used as the final outputs. Otherwise, we pass the T5 model outputs through the paraphrasing model and apply the same condition check. If none of our enhancement models reduce the perplexity score, we simply output the T5 model outputs. Through this workflow, we ensure all the required features are captured and improve the readability of the sentence when appropriate, maximizing the benefit ML can bring to the existing solution.
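
The selection logic could be expressed roughly as in the following sketch; back_translate, paraphrase, and contains_keywords are stand-ins for the actual pipeline components, and perplexity is the scoring helper sketched earlier:

def select_narrative(t5_output, required_keywords):
    # Baseline: the fine-tuned T5 narrative and its perplexity score
    best_score = perplexity(t5_output)

    # Try back translation first, then paraphrasing; keep a candidate only if it
    # still contains all required features and sounds more natural (lower perplexity)
    for generate_candidates in (back_translate, paraphrase):
        for candidate in generate_candidates(t5_output):
            if contains_keywords(candidate, required_keywords) and perplexity(candidate) < best_score:
                return candidate

    # Fall back to the T5 output if no enhancement improves perplexity
    return t5_output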

Results

With the models combined into the preceding architecture, the output narrative has, on average, 13% lower perplexity than the original rule-based, template-generated narratives, and all the information is maintained. Fox Sports can display the narratives to broadcasters and sports fans for a more exciting viewing experience!

Conclusion

The ML Solutions Lab and Fox Sports ML team worked closely to build an end-to-end ML solution that converts in-game tabular stats into natural-sounding narratives. Because the solution is built on top of language models pretrained on a huge text corpus, additional metrics and game situations can be passed in directly to generate the desired outputs. The extensibility also enables the solution to be transferred to other sports by simply fine-tuning the model with sample narratives. These capabilities allow the model to scale and adapt to future business needs.

Around the world, many sports leagues and sports networks like Fox Sports are transforming the fan experience with AWS technology. AWS is helping bring fans closer to the game through partnerships with the Bundesliga, F1, NFL, NHL, NASCAR, and many others. Visit AWS Sports for more details.

If you’d like help accelerating your use of ML in your products and processes, please contact the ML Solutions Lab program.


About the Authors

Henry Wang is a Data Scientist at Amazon Machine Learning Solutions Lab. Prior to joining AWS, he was a graduate student at Harvard in Computational Science and Engineering, where he worked on healthcare research with reinforcement learning. In his spare time, he enjoys playing tennis and golf, reading, and watching StarCraft II tournaments.


Saman Sarraf is a Data Scientist at the Amazon ML Solutions Lab. His background is in applied machine learning, including deep learning, computer vision, and time series data prediction.


Arbi Tamrazian is the Director of Data Science and Machine Learning at FOX, where he focuses on building scalable machine learning solutions that can be applied to real-time data feeds and media assets. His main areas of interest are Deep Learning, Computer Vision, and Reinforcement Learning.

Read More

How lekker got more insights into their customer churn model with Amazon SageMaker Debugger

With over 400,000 customers, lekker Energie GmbH is a leading supraregional provider of electricity and gas on the German energy market. lekker is customer- and service-oriented and regularly scores top marks in comparison tests. As one of the most important suppliers of green electricity to private households, the company, with its 220 employees, stands for environmentally and consumer-friendly products.

Germany’s energy market was liberalized in the 1990s. Since then, customers have had free choice of their energy and gas supplier. During the liberalization, the German government standardized the switching processes, so switching your energy or gas supplier is an easy task. However, keeping churn rates low is a challenging task for lekker. Preventing existing customers from leaving is several times cheaper than acquiring new ones. The best way to achieve low churn rates is to keep customers satisfied. Knowledge about a customer’s churn risk is helpful information for targeted campaigns, because it allows lekker to focus on customers who are more likely to churn.

This post discusses how lekker used Amazon SageMaker Debugger to get deep insights into their customer churn model. Debugger automatically collects data during model training and provides built-in rules to automatically detect issues in model training.

Data preprocessing

lekker has a wide range of systems with different databases and data structures, and uses Spark and AWS Step Functions to create a data lake on AWS. In preparation for the churn model, lekker creates a Spark processing job that holds customer-specific information like duration, sales channel, and consumption, along with other information for label creation. lekker makes a distinction between active and passive churn. Active churn describes customers who cancel their contract. Passive churn describes customers who are no longer in lekker’s delivery area or whose contract was cancelled due to late payment. For the model introduced here, lekker uses active churn as the label, which better fits marketing expectations for retention campaigns.
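
As a simplified, hypothetical illustration of this labeling step (column names and S3 paths are placeholders, not lekker’s actual schema), the active-churn label could be derived in a Spark processing job like this:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("churn-label-prep").getOrCreate()

# Placeholder path; in practice the customer data comes from the data lake on Amazon S3
customers = spark.read.parquet("s3://example-data-lake/customers/")

# Active churn (customer-initiated cancellation) becomes the training label;
# passive churn (left the delivery area, cancelled for late payment) does not
labeled = customers.withColumn(
    "churn_label",
    F.when(F.col("cancellation_reason") == "customer_cancelled", 1).otherwise(0)
)
labeled.write.mode("overwrite").parquet("s3://example-data-lake/churn-training/")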

Create a customer churn model

Before lekker started with AWS, data came from an Oracle database, which was used as a business intelligence (BI) platform. The BI team and analysts were organized in different departments and had different access rights. Data scientists needed to access data by schema-on-read. Models were trained on local machines or non-scalable servers, and computational restrictions came up quickly. Once a model was trained, monitoring and debugging it was hard, and management’s skepticism of potential closed-box models grew. Model deployment was also difficult because of missing orchestration tools and limited server availability and capacity.

When lekker decided to use SageMaker, most of these problems were solved, because SageMaker offers solutions along the whole machine learning workflow. lekker can now easily scale computing capacity and access all available data on Amazon S3. Their data scientists can explore and prepare data in the same notebook, and find it easier to create and train models using SageMaker Estimators. Additionally, lekker frequently uses SageMaker automatic model tuning, which figures out the best model by running different hyperparameter configurations. This helped raise model quality tremendously. lekker uses Debugger to evaluate and communicate models’ results and get model insights.
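
lekker’s tuning setup isn’t shown in this post; as an illustrative sketch only (the objective metric, parameter ranges, and job counts are assumptions, and the estimator and training inputs are the ones defined in the following sections), a tuning job for the XGBoost estimator could look like this:

from sagemaker.inputs import TrainingInput
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

tuner = HyperparameterTuner(
    estimator=estimator,                     # the XGBoost Estimator defined below
    objective_metric_name="validation:auc",  # built-in XGBoost validation metric
    objective_type="Maximize",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
        "subsample": ContinuousParameter(0.5, 1.0),
    },
    max_jobs=20,
    max_parallel_jobs=4,
)
tuner.fit({
    "train": TrainingInput(model_train_file, content_type="csv"),
    "validation": TrainingInput(model_test_file, content_type="csv"),
})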

Set up training on Amazon SageMaker

To run the XGBoost training on SageMaker, lekker uses the SageMaker Estimator API. It takes the instance type for the model training (ml.m5.4xlarge), the URI of the XGBoost training image, and a dictionary of model hyperparameters. See the following code:

import sagemaker
from sagemaker.estimator import Estimator

# Generic Estimator for the SageMaker built-in XGBoost container.
# The Debugger hook configuration and rules defined in the next sections
# are also passed to this Estimator (debugger_hook_config=..., rules=...).
estimator = Estimator(
    role=role,
    instance_count=1,
    instance_type='ml.m5.4xlarge',
    hyperparameters={
        'num_round': '20',
        'rate_drop': '0.3',
        'scale_pos_weight': scale_pos_weight,
        'tweedie_variance_power': '1.4',
        'objective': 'binary:logistic'
    },
    image_uri=sagemaker.image_uris.retrieve('xgboost', region, version='1.0-1')
)

Configure Debugger and rules

lekker uses Debugger in three ways:

  • Use built-in rules to identify underperforming training jobs
  • Create automatic visualizations
  • Collect important metrics from training jobs

The following code shows the Debugger hook configuration to collect metrics such as feature importance and Shapley values from churn model training:

from sagemaker.debugger import DebuggerHookConfig, CollectionConfig

# Save metrics, feature importance, and SHAP values every 5 steps
debugger_hook_config = DebuggerHookConfig(
    hook_parameters={'save_interval': '5'},
    collection_configs=[
        CollectionConfig(name="metrics"),
        CollectionConfig(name="feature_importance"),
        CollectionConfig(name="full_shap"),
        CollectionConfig(name="average_shap"),
    ]
)

Debugger provides built-in rules that check for model training issues such as overfitting or loss not decreasing. Those rules run as a SageMaker processing job in a separate container and instance, so the rule analysis doesn’t interfere with the actual training. Users don’t pay to run these built-in rules. lekker frequently uses the loss_not_decreasing and xgboost_report rules. The first rule monitors the loss curves and triggers if loss doesn’t decrease by a certain percentage. The xgboost_report rule captures XGBoost model data and creates a static HTML report with visualizations such as ROC curves, error plots, and more, and provides key insights and recommendations. See the following code:

from sagemaker.debugger import Rule, rule_configs

save_interval = 5  # matches the hook's save_interval above

rules = [
    Rule.sagemaker(
        rule_configs.loss_not_decreasing(),
        rule_parameters={
            "collection_names": "metrics",
            "num_steps": str(save_interval * 2),
        },
    ),
    Rule.sagemaker(rule_configs.create_xgboost_report())
]

After the Debugger hook configuration and list of rules are specified, you start the SageMaker training with estimator.fit(). The fit function takes as input the paths to the training and validation data in Amazon S3. See the following code:

from sagemaker.inputs import TrainingInput

estimator.fit({
    "train": TrainingInput(model_train_file, content_type="csv"),
    "validation": TrainingInput(model_test_file, content_type="csv")
})

SageMaker automatically spins up the ml.m5.4xlarge training instance, downloads the training container and datasets, and runs the model training. It also spins up an instance to run the rule analysis as a SageMaker processing job. You can go to SageMaker Studio and check the rule status or check the status from the Python SDK.
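
For example, checking the rule status from the Python SDK looks roughly like the following sketch, based on the SDK’s rule job summary for the latest training job:

# Poll the status of the Debugger rule evaluation jobs for the latest training job
for summary in estimator.latest_training_job.rule_job_summary():
    print(summary["RuleConfigurationName"], "-", summary["RuleEvaluationStatus"])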

Visualize and perform real-time monitoring

When the training is running, lekker uses Debugger’s open-source smdebug library to fetch and query the data that is uploaded in real time to Amazon S3. The first step is to create a trial object that takes either a local or S3 path:

from smdebug.trials import create_trial

# Point a trial at the S3 path where Debugger uploads tensors in real time
s3_output_path = estimator.latest_job_debugger_artifacts_path()
trial = create_trial(s3_output_path)

Now you can access and query the data. To plot the loss curves, you simply retrieve the metrics collection and the recorded steps:

import matplotlib.pyplot as plt

steps = trial.steps()
fig, ax = plt.subplots()
# Plot each recorded metric (train and validation error) against the training steps
for tname in trial.collection("metrics").tensor_names:
    data = [value for value in trial.tensor(tname).values().values()]
    ax.plot(steps, data, label=tname)
ax.legend()

The following figure shows that train and validation errors fall while training the customer churn model. That’s a sign of a well-trained model, because it shows that the model performs well on the unseen data (validation data). Debugger makes this visualization easy to create.

When the training job is complete, lekker uses the output of the xgboost_report rule to get further insights into the customer churn model. The following figure shows the model’s feature importance for the training job. The most important feature is customer duration (membership in months). lekker offers contracts with a fixed duration, such as 12 or 24 months. If customers cancel their contract, the churn shows up at the end of the fixed duration period. That’s why most churn appears at months 12 and 24.

Knowledge about what influences the model’s outcome is important because it helps explain the model. lekker uses SHapley Additive exPlanations (SHAP) values recorded by Debugger during training. SHAP was designed for local interpretability of a predictive model and uses a game theoretic approach to explain the output of machine learning models.
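
A minimal sketch of reading those recorded SHAP values back from the same smdebug trial might look as follows (tensor names in the average_shap collection follow the model’s feature names):

# Read the average SHAP value per feature that Debugger saved during training
for tname in trial.collection("average_shap").tensor_names:
    last_step = trial.tensor(tname).steps()[-1]
    print(tname, trial.tensor(tname).value(last_step))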

In the following figure, blue represents low feature values and red represents high values. The x-axis shows the SHAP value, which describes the impact on the outcome. High values indicate a predicted value increase, low values indicate a decrease. A line’s thickness represents how many customers are at that specific point. In the churn model, customers with low duration have low predicted churn probabilities. That’s a result of their contract structure, because customer churn can be determined after 12 months at the earliest.

Users running on Amazon SageMaker can obtain SHAP values for their model through either SageMaker Debugger or SageMaker Clarify. The key difference is that Debugger records those values during training, while Clarify captures them after the model has been trained. Inspecting SHAP values during the training phase helps further improve the model by identifying and removing irrelevant input features.

Once the model is trained, you can use Clarify to get SHAP values for any dataset, and once you deploy the model as an endpoint, you can use Clarify to monitor the SHAP values for data captured from the endpoint. Another key difference is that Debugger can collect SHAP values during training for XGBoost models, whereas Clarify is model-agnostic and works with any model.
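
For comparison, a post-training Clarify explainability (SHAP) job can be configured roughly as follows; this is a sketch, and the baseline row, S3 paths, headers, model name, and instance settings are placeholders rather than lekker’s actual configuration:

from sagemaker import clarify

clarify_processor = clarify.SageMakerClarifyProcessor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=sagemaker_session,
)
shap_config = clarify.SHAPConfig(
    baseline=[df_train.iloc[0].values.tolist()],  # a single baseline row from the training data
    num_samples=100,
    agg_method="mean_abs",
)
model_config = clarify.ModelConfig(
    model_name=model_name,            # name of the deployed churn model
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept_type="text/csv",
)
data_config = clarify.DataConfig(
    s3_data_input_path=model_train_file,
    s3_output_path="s3://example-bucket/clarify-output/",
    label="churn_label",              # placeholder label column
    headers=df_train.columns.tolist(),
    dataset_type="text/csv",
)
clarify_processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=shap_config,
)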

Results

With all the tools and services SageMaker provides, lekker was able to raise churn model accuracy by nearly 20%. In addition, the model is more stable than earlier versions. As a result, the F1 score rose to over 80% and the AUC to 96%.

“Since we got all this information about model insights, we are able to get a clear understanding about what’s happening,” says Steffen Kremers, a data scientist at lekker. “Especially the concept of feature gains, which is fully integrated in the Debugger report, gave us useful information about the most influencing features. Important information for both feature engineering and feature selection.”

Since the churn model was deployed, lekker has moved three more models to SageMaker and integrated them into operations. lekker transferred these learnings to all of the new models and has seen that they all yield better results than before. Having seen the insights ML can bring, lekker began expanding its ML activities.

Conclusion

This post demonstrated how lekker moved workloads from on premises to SageMaker, and how it helped their data science teams accelerate and innovate faster. lekker extensively uses Debugger to get deeper insights into their models, which help improve and better explain the models. To learn more about Debugger features and how this service can help your business, see Amazon SageMaker Debugger. To learn more about optimizing for customer churn, check out the blog post Preventing customer churn by optimizing incentive programs using stochastic programming.


About the Authors

Steffen Kremers is a data scientist at lekker based in Germany. He accompanies the whole machine learning process, from developing use case ideas through model building to model deployment.


Nathalie Rauschmayr is an Applied Scientist at AWS, where she helps customers develop deep learning applications.


Lu Huang is a Senior Product Manager on the AWS Deep Engine team, managing Amazon SageMaker Debugger.


Read More

Best practices in customer service automation

Chatbots, virtual assistants, and Interactive Voice Response (IVR) systems are key components of successful customer service strategies.

We had the pleasure of hearing from three AWS Contact Center Intelligence (AWS CCI) Partners, who provided valuable insights and tips for building automated customer-service solutions, as part of our Best Practices in Customer Service Automation webinar.

The panel included:

  • Brad Beumer of UIPath
  • Rebecca Owens of Genesys
  • Pat Higbie of XAPP AI

Why build a chatbot or IVR?

Customers expect great customer service. At the same time, enterprises struggle with the costs and resources necessary to provide high-quality, highly available, live-agent solutions. Automated solutions, like chatbots and IVR, enable enterprises to provide quality support, 24/7, while reducing costs and increasing customer satisfaction.

Although reducing costs is important, a big reason enterprises are implementing automated solutions is to provide a better overall user-experience. As Brad Beumer of UIPath points out, it is what customers are asking for. Customers want a 24/7/365 experience—especially for common tasks they can handle on their own without an agent.

Self-serve, automated solutions help take the pressure off live agents. As Rebecca Owens of Genesys mentions, self-service can help handle the upfront tasks, leaving the more complex tasks to the live agents, who are the contact centers’ most valuable assets.

The impact of COVID-19

COVID-19 has had a significant impact on the interest in chatbots. Shelter-in-place rules affected both the consumers’ ability to go into locations, and the live agents’ ability to work in the same contact center. The need for automated solutions skyrocketed. Genesys saw a large increase in call volumes—in some cases, nearly triple the volume.

Chatbots are not only helping consumers during COVID-19, but work-from-home agents as well. As Beumer mentions, automated solutions help offload more of the agents’ tasks and help them with compliance, security, and even VPN issues related to working from home.

COVID-19 resulted in more stress on existing chatbots too. As Pat Higbie of XAPP AI shares, existing chatbots were not set up to handle the additional use cases people wanted them to handle. These are opportunities to take advantage of AI, through tools like Amazon Lex or Amazon Kendra, for chatbots and natural language search, to enable users to get what they need and improve the customer experience.

Five best practices

Building automated solutions is an iterative process. Our panelists provided insights and best practices when facing common issues.

Getting started

Building conversational interfaces can be challenging because it is hard to know all the things a user may request, or even how they pose the request.

Our panelists see three basic use cases:

  • Task completion – Collecting user information to make an update, like an address change
  • Information requests – Providing information like delivery status or a bank balance
  • Efficient routing – Collecting information to route the user to the most appropriate agent

Our panelists recommend getting started with simpler use cases that have a high impact. As Beumer recommends, start with high-volume, low-complexity tasks like password resets or lost credit cards. Owens adds that starting with high-level Natural Language Understanding (NLU) menus to understand user intent and routing them to the right agent is a simple investment with a significant ROI. Afterwards, move to simple task automation and information requests, and then move into the more advanced use cases that were not possible before conversational AI. As Higbie puts it, start with a quick win, like informational chatbots, especially if you have not done this before. The level of complexity can go up quite dramatically, especially with transactional use cases.

As complexity increases, there are opportunities for more advanced use cases, like transactional or even proactive use cases. Owens mentioned an example of using AI to monitor activity on a website and proactively offering a chatbot when needed. For example, if you can predict the likelihood of an ecommerce user having an issue at checkout, a chatbot can proactively offer to help the user, to lead them through completion so the user does not abandon their cart.

Handling fallbacks gracefully

Fallbacks occur when the automated solution cannot understand the user or cannot handle the request. It is important to handle fallbacks gracefully.

In the past with contact centers, users were often routed to an agent when a fallback occurred. Now with AI, you can better understand the user’s intent and context, and either send them to another AI solution, or more efficiently transfer them to an agent, sending the full context so the user does not have to repeat themselves.

Fallbacks are an opportunity to educate users on what they can say and do—to help get users back on the “happy path.” For example, if the user asks for something the chatbot cannot do, have it respond with a list of what it can do. Predefined buttons, referred to as quick replies, can also help let a user know what the chatbot can do.

Supporting multimodal channels

Our panelists see enterprises building automated solutions across multiple channels, including multi-modal text and voice options on the web, IVR, social media, and email. Enterprises are building solutions where their customers are interacting. There are additional factors to consider when supporting multiple channels.

People ask questions differently across channels. As Higbie points out, users communicating via text tend to do so in “keyword style” with incomplete sentences, whereas in voice, they tend to ask the full question.

The way the chatbot responds across channels can be different as well. In text, the chatbot can provide a menu of options for the user to click. With voice, if there are more than three options, it can be difficult for the user to remember.

Regardless of the channel, it is important to understand the user’s intent. As Beumer mentions, if the intent can be understood, the right automation can be triggered.

It can be helpful to have a common interaction model for understanding across channels, but it is important to optimize the actual responses for each particular channel. As Higbie indicates, model management, dialog management, and content management are all needed to handle the complexities in conversational AI.

Keeping context in mind

Context is important—what is known about the user, where they are, or what they are doing can help improve the user experience.

Chatbots and IVRs can connect to backend CRMs to have additional information to personalize and tailor the experience. They can also pass along information gathered from a user to a live agent for more efficient handling so the user does not have to repeat themselves.

In the case of voice, knowing whether the user has contacted recently can be helpful. Although introductory prompts can be great for educating people, if the user contacts again, it is better to use a tapered approach that reduces some of the default messaging in order to give a quicker opening response.

The context can also be used with proactive solutions that monitor user activity and prompt if help is needed.

Measuring success

Our panelists use a variety of metrics to measure success, such as call deflection rates, self-service containment rates, first response time, and customer satisfaction. The metrics can also be used to calculate operational cost savings by knowing the cost of live agents and the deflection rates.

Customer satisfaction is very important—one of the goals of automated solutions is to provide a better user experience. One way UIPath does this is to look at Net Promoter Scores (NPS) before and after an automated solution is launched. Surveys can be used as well, via outbound calls after an interaction to gather customer feedback. With chatbots, you can immediately ask the user whether the response was helpful and take further action depending on the response.

Automated solutions like chatbots and IVRs need continuous optimization. It is difficult to anticipate all the things a user may ask, or how they may ask them. Monitoring the interactions to understand what users are asking for, how the automated solution is responding, and where it needs improvement is important. It is an iterative process.

What the future looks like

Our panelists shared their thoughts on the future of automated solutions.

Owens sees an increase in usage of automated solutions across all channels as chatbot technologies gain momentum and AI is able to handle even more tasks and complexity. Although customer service is heavily voice today, she is seeing a push to digital, and expects the trend to continue. One area of growth is in the expansion of language support in AI beyond English to support worldwide coverage.

Beumer envisions expansion of automated solutions across all channels, for a more consistent user experience. While automation will increase, it is important to make sure that when a chatbot hands off to a live agent, the handoff is seamless.

Higbie sees a lot of exciting opportunity for automated solutions, and believes we are only in the “first inning” of AI automation. Customers will ask for even more than what chatbots currently do, and they will get the responses instantly. Solutions will move more to the proactive side as well. He sees this as a bigger paradigm shift than either web or mobile. It is important to commit now and not be displaced. As he summarizes, enterprises need to get started, get a quick win, and then expand the sophistication of their AI initiatives.

As the underlying technologies continue to evolve, the opportunities for automated chatbots continue to grow. It is exciting to learn from our panelists and see where automated solutions are going in the future.

About AWS Contact Center Intelligence

AWS CCI solutions can quickly and easily add AI and ML to your existing contact center to improve customer satisfaction and reduce costs. AWS CCI covers three key areas of the contact center workflow: self-service automation, real-time analytics with agent assist, and post-call analytics. Each solution is created using a specific combination of AWS AI services, and is available through select AWS Partners. Join the next CCI Webinar, “Banking on Bots”, on May 25, 2021.


About the Author

Arte Merritt leads partnerships for Contact Center Intelligence and Conversational AI. He is a frequent author and speaker in the conversational AI space. He was the co-founder and CEO of the leading analytics platform for conversational interfaces, leading the company to 20,000 customers, 90 billion messages, and multiple acquisition offers. Previously, he founded Motally, a mobile analytics platform he sold to Nokia. Arte has more than 20 years of experience in big data analytics and is an MIT alum.

Read More