Churn prediction using Amazon SageMaker built-in tabular algorithms LightGBM, CatBoost, TabTransformer, and AutoGluon-Tabular

Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning (ML) practitioners get started on training and deploying ML models quickly. These algorithms and models can be used for both supervised and unsupervised learning. They can process various types of input data, including tabular, image, and text.

Customer churn is a problem faced by a wide range of companies, from telecommunications to banking, where customers are typically lost to competitors. It’s in a company’s best interest to retain existing customers rather than acquire new customers because it usually costs significantly more to attract new customers. Mobile operators have historical records in which customers continued using the service or ultimately ended up churning. We can use this historical information of a mobile operator’s churn to train an ML model. After training this model, we can pass the profile information of an arbitrary customer (the same profile information that we used to train the model) to the model, and have it predict whether this customer is going to churn or not.

In this post, we train and deploy four recently released SageMaker algorithms—LightGBM, CatBoost, TabTransformer, and AutoGluon-Tabular—on a churn prediction dataset. We use SageMaker Automatic Model Tuning (a tool for hyperparameter optimization) to find the best hyperparameters for each model, and compare their performance on a holdout test dataset to select the optimal one.

You can also use this solution as a template to search over a collection of state-of-the-art tabular algorithms and use hyperparameter optimization to find the best overall model. You can easily replace the example dataset with your own to solve real business problems you’re interested in. If you want to jump straight into the SageMaker SDK code we go through in this post, you can refer to the following sample Jupyter notebook.

Benefits of SageMaker built-in algorithms

When selecting an algorithm for your particular type of problem and data, using a SageMaker built-in algorithm is the easiest option, because doing so comes with the following major benefits:

  • Low coding – The built-in algorithms require little coding to start running experiments. The only inputs you need to provide are the data, hyperparameters, and compute resources. This allows you to run experiments more quickly, with less overhead for tracking results and code changes.
  • Efficient and scalable algorithm implementations – The built-in algorithms come with parallelization across multiple compute instances and GPU support right out of the box for all applicable algorithms. If you have a lot of data with which to train your model, most built-in algorithms can easily scale to meet the demand. Even if you already have a pre-trained model, it may still be easier to use its counterpart in SageMaker and input the hyperparameters you already know rather than port it over and write a training script yourself.
  • Transparency – You’re the owner of the resulting model artifacts. You can take that model and deploy it on SageMaker for several different inference patterns (check out all the available deployment types) and easy endpoint scaling and management, or you can deploy it wherever else you need it.

Data visualization and preprocessing

First, we gather our customer churn dataset. It’s a relatively small dataset with 5,000 records, where each record uses 21 attributes to describe the profile of a customer of an unknown US mobile operator. The attributes range from the US state where the customer resides, to the number of calls they placed to customer service, to the cost they are billed for daytime calls. We’re trying to predict whether the customer will churn or not, which is a binary classification problem. The following shows what a subset of those features looks like, with the label as the last column.

The following are some insights into each column, specifically the summary statistics and histograms of selected features.

We then preprocess the data, split it into training, validation, and test sets, and upload the data to Amazon Simple Storage Service (Amazon S3).
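The exact preprocessing steps are in the sample notebook; the following is a minimal sketch of the split-and-upload flow. It assumes the data has already been loaded into a pandas DataFrame named churn with the label in a column named Churn (both names are illustrative), and that the target is moved to the first column with no header row, which is the layout the built-in tabular algorithms generally expect (confirm the exact format in the notebook).

import numpy as np
import pandas as pd
import sagemaker

# Move the (illustrative) label column to the front, then shuffle and split
# into roughly 70% train, 20% validation, and 10% test
churn = pd.concat([churn["Churn"], churn.drop("Churn", axis=1)], axis=1)
churn = churn.sample(frac=1, random_state=0)
train, validation, test = np.split(churn, [int(0.7 * len(churn)), int(0.9 * len(churn))])

# Write CSVs without header or index
train.to_csv("train.csv", header=False, index=False)
validation.to_csv("validation.csv", header=False, index=False)
test.to_csv("test.csv", header=False, index=False)

# Upload to S3; the prefix is illustrative
session = sagemaker.Session()
bucket = session.default_bucket()
prefix = "churn-prediction"
training_dataset_s3_path = session.upload_data("train.csv", bucket=bucket, key_prefix=f"{prefix}/train")
validation_dataset_s3_path = session.upload_data("validation.csv", bucket=bucket, key_prefix=f"{prefix}/validation")
test_dataset_s3_path = session.upload_data("test.csv", bucket=bucket, key_prefix=f"{prefix}/test")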

Automatic model tuning of tabular algorithms

Hyperparameters control how our underlying algorithms operate and influence the performance of the model. Those hyperparameters can be the number of layers, learning rate, weight decay rate, and dropout for neural network-based models, or the number of leaves, iterations, and maximum tree depth for tree ensemble models. To select the best model, we apply SageMaker automatic model tuning to each of the four trained SageMaker tabular algorithms. You need only select the hyperparameters to tune and a range for each parameter to explore. For more information about automatic model tuning, refer to Amazon SageMaker Automatic Model Tuning: Using Machine Learning for Machine Learning or Amazon SageMaker automatic model tuning: Scalable gradient-free optimization.

Let’s see how this works in practice.

LightGBM

We start by running automatic model tuning with LightGBM, and adapt that process to the other algorithms. As is explained in the post Amazon SageMaker JumpStart models and algorithms now available via API, the following artifacts are required to train a pre-built algorithm via the SageMaker SDK:

  • Its framework-specific container image, containing all the required dependencies for training and inference
  • The training and inference scripts for the selected model or algorithm

We first retrieve these artifacts, which depend on the model_id (lightgbm-classification-model in this case) and version:

from sagemaker import image_uris, model_uris, script_uris
train_model_id, train_model_version, train_scope = "lightgbm-classification-model", "*", "training"
training_instance_type = "ml.m5.4xlarge"

# Retrieve the docker image
train_image_uri = image_uris.retrieve(region=None,
                                      framework=None,
                                      model_id=train_model_id,
                                      model_version=train_model_version,
                                      image_scope=train_scope,
                                      instance_type=training_instance_type,
                                      )                                      
# Retrieve the training script
train_source_uri = script_uris.retrieve(model_id=train_model_id,
                                        model_version=train_model_version,
                                        script_scope=train_scope
                                        )
# Retrieve the pre-trained model tarball (in the case of tabular modeling, it is a dummy file)
train_model_uri = model_uris.retrieve(model_id=train_model_id,
                                      model_version=train_model_version,
                                      model_scope=train_scope)

We then get the default hyperparameters for LightGBM, set some of them to selected fixed values such as number of boosting rounds and evaluation metric on the validation data, and define the value ranges we want to search over for others. We use the SageMaker parameters ContinuousParameter and IntegerParameter for this:

from sagemaker import hyperparameters
from sagemaker.tuner import ContinuousParameter, IntegerParameter, HyperparameterTuner

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=train_model_id,
                                                   model_version=train_model_version
                                                   )
# [Optional] Override default hyperparameters with custom values
hyperparameters["num_boost_round"] = "500"
hyperparameters["metric"] = "auc"

# Define search ranges for other hyperparameters
hyperparameter_ranges_lgb = {
    "learning_rate": ContinuousParameter(1e-4, 1, scaling_type="Logarithmic"),
    "num_boost_round": IntegerParameter(2, 30),
    "num_leaves": IntegerParameter(10, 50),
    "feature_fraction": ContinuousParameter(0, 1),
    "bagging_fraction": ContinuousParameter(0, 1),
    "bagging_freq": IntegerParameter(1, 10),
    "max_depth": IntegerParameter(5, 30),
    "min_data_in_leaf": IntegerParameter(5, 50),
}

Finally, we create a SageMaker Estimator, feed it into a HyperparameterTuner, and start the hyperparameter tuning job with tuner.fit():

from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperParameterTuner

# Create SageMaker Estimator instance
tabular_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
)

tuner = HyperparameterTuner(
    tabular_estimator,
    "auc",
    hyperparameter_ranges_lgb,
    [{"Name": "auc", "Regex": r"auc: ([0-9\.]+)"}],
    max_jobs=10,
    max_parallel_jobs=5,
    objective_type="Maximize",
    base_tuning_job_name="some-name",
)

tuner.fit({"training": training_dataset_s3_path}, logs=True)

The max_jobs parameter defines how many total jobs will be run in the automatic model tuning job, and max_parallel_jobs defines how many concurrent training jobs should be started. We also define the objective to “Maximize” the model’s AUC (area under the curve). To dive deeper into the available parameters exposed by HyperParameterTuner, refer to HyperparameterTuner.

Check out the sample notebook to see how we proceed to deploy and evaluate this model on the test set.
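For reference, deploying the best model found by the tuner follows the same pattern as training: retrieve the inference image and script for the same model_id, point a SageMaker Model at the artifacts of the best training job, and deploy. The following is a minimal sketch; the inference entry point name (inference.py) is an assumption here, so check the sample notebook for the exact conventions and request format.

import boto3
from sagemaker import image_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor

# Locate the model artifacts produced by the best training job of the tuning job
best_job_name = tuner.best_training_job()
sm_client = boto3.client("sagemaker")
model_data = sm_client.describe_training_job(
    TrainingJobName=best_job_name
)["ModelArtifacts"]["S3ModelArtifacts"]

# Retrieve the inference container and script for the same model_id
inference_instance_type = "ml.m5.large"
deploy_image_uri = image_uris.retrieve(region=None,
                                       framework=None,
                                       model_id=train_model_id,
                                       model_version=train_model_version,
                                       image_scope="inference",
                                       instance_type=inference_instance_type)
deploy_source_uri = script_uris.retrieve(model_id=train_model_id,
                                         model_version=train_model_version,
                                         script_scope="inference")

# Wrap the artifacts in a Model and deploy a real-time endpoint
model = Model(image_uri=deploy_image_uri,
              source_dir=deploy_source_uri,
              model_data=model_data,
              entry_point="inference.py",  # assumed entry point name
              role=aws_role,
              predictor_cls=Predictor)
predictor = model.deploy(initial_instance_count=1, instance_type=inference_instance_type)
# predictor.predict(...) can then be used to send test records; the payload format
# expected by each algorithm is described in the sample notebook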

CatBoost

The process for hyperparameter tuning on the CatBoost algorithm is the same as before, although we need to retrieve model artifacts under the ID catboost-classification-model and change the range selection of hyperparameters:

from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(
    model_id=train_model_id, model_version=train_model_version
)
# [Optional] Override default hyperparameters with custom values
hyperparameters["iterations"] = "500"
hyperparameters["eval_metric"] = "AUC"

# Define search ranges for other hyperparameters
hyperparameter_ranges_cat = {
    "learning_rate": ContinuousParameter(0.00001, 0.1, scaling_type="Logarithmic"),
    "iterations": IntegerParameter(50, 1000),
    "depth": IntegerParameter(1, 10),
    "l2_leaf_reg": IntegerParameter(1, 10),
    "random_strength": ContinuousParameter(0.01, 10, scaling_type="Logarithmic"),
}

TabTransformer

The process for hyperparameter tuning on the TabTransformer model is the same as before, although we need to retrieve model artifacts under the ID pytorch-tabtransformerclassification-model and change the range selection of hyperparameters.

We also change the training instance_type to ml.p3.2xlarge. TabTransformer is a model recently developed by Amazon research, which brings the power of deep learning to tabular data using Transformer models. To train this model in an efficient manner, we need a GPU-backed instance. For more information, refer to Bringing the power of deep learning to data in tables. Because both the model ID and the instance type change, the artifact retrieval step is repeated with the new values, as sketched after the following hyperparameter ranges.

from sagemaker import hyperparameters
from sagemaker.tuner import CategoricalParameter

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(
    model_id=train_model_id, model_version=train_model_version
)
# [Optional] Override default hyperparameters with custom values
hyperparameters["n_epochs"] = 40  # The same hyperparameter is named as "iterations" for CatBoost
hyperparameters["patience"] = 10

# Define search ranges for other hyperparameters
hyperparameter_ranges_tab = {
    "learning_rate": ContinuousParameter(0.001, 0.01, scaling_type="Auto"),
    "batch_size": CategoricalParameter([64, 128, 256, 512]),
    "attn_dropout": ContinuousParameter(0.0, 0.8, scaling_type="Auto"),
    "mlp_dropout": ContinuousParameter(0.0, 0.8, scaling_type="Auto"),
    "input_dim": CategoricalParameter(["16", "32", "64", "128", "256"]),
    "frac_shared_embed": ContinuousParameter(0.0, 0.5, scaling_type="Auto"),
}
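A minimal sketch of the repeated artifact retrieval with the new values, reusing the same helpers as in the LightGBM section:

from sagemaker import image_uris, model_uris, script_uris

train_model_id, train_model_version, train_scope = "pytorch-tabtransformerclassification-model", "*", "training"
training_instance_type = "ml.p3.2xlarge"  # GPU-backed instance for TabTransformer

train_image_uri = image_uris.retrieve(region=None,
                                      framework=None,
                                      model_id=train_model_id,
                                      model_version=train_model_version,
                                      image_scope=train_scope,
                                      instance_type=training_instance_type)
train_source_uri = script_uris.retrieve(model_id=train_model_id,
                                        model_version=train_model_version,
                                        script_scope=train_scope)
train_model_uri = model_uris.retrieve(model_id=train_model_id,
                                      model_version=train_model_version,
                                      model_scope=train_scope)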

AutoGluon-Tabular

In the case of AutoGluon, we don’t run hyperparameter tuning. This is by design, because AutoGluon focuses on ensembling multiple models with sane choices of hyperparameters and stacking them in multiple layers. This ends up being more performant than training one model with the perfect selection of hyperparameters and is also computationally cheaper. For details, check out AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data.

Therefore, we switch the model_id to autogluon-classification-ensemble, and only fix the evaluation metric hyperparameter to our desired AUC score:

from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(
    model_id=train_model_id, model_version=train_model_version
)

hyperparameters["eval_metric"] = "roc_auc"

Instead of calling tuner.fit(), we call estimator.fit() to start a single training job.
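A minimal sketch of that single training job, reusing the Estimator pattern from the LightGBM section (the artifact URIs are assumed to have been re-retrieved for the autogluon-classification-ensemble model ID):

from sagemaker.estimator import Estimator

ag_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,        # retrieved for autogluon-classification-ensemble
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
)
ag_estimator.fit({"training": training_dataset_s3_path}, logs=True)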

Benchmarking the trained models

After we deploy all four models, we send the full test set to each endpoint for prediction and calculate accuracy, F1, and AUC metrics for each (see code in the sample notebook). We present the results in the following table, with an important disclaimer: results and relative performance between these models will depend on the dataset you use for training. These results are representative, and even though the tendency for certain algorithms to perform better is based on relevant factors (for example, AutoGluon intelligently ensembles the predictions of both LightGBM and CatBoost models behind the scenes), the balance in performance might change given a different data distribution.

Algorithm                                    Accuracy   F1       AUC
LightGBM with Automatic Model Tuning         0.8977     0.8986   0.9629
CatBoost with Automatic Model Tuning         0.9622     0.9624   0.9907
TabTransformer with Automatic Model Tuning   0.9511     0.9517   0.989
AutoGluon-Tabular                            0.98       0.98     0.9979
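For reference, the metric computation itself is straightforward; a minimal sketch, assuming the true test labels (y_test) and the churn probabilities returned by an endpoint (y_prob) have already been collected as arrays:

import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_prob = np.asarray(y_prob)            # predicted churn probabilities from one endpoint
y_pred = (y_prob >= 0.5).astype(int)   # hard labels at an assumed 0.5 threshold

print("Accuracy:", accuracy_score(y_test, y_pred))
print("F1:", f1_score(y_test, y_pred))
print("AUC:", roc_auc_score(y_test, y_prob))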

Conclusion

In this post, we trained four different SageMaker built-in algorithms to solve the customer churn prediction problem with low coding effort. We used SageMaker automatic model tuning to find the best hyperparameters to train these algorithms with, and compared their performance on a selected churn prediction dataset. You can use the related sample notebook as a template, replacing the dataset with your own to solve your desired tabular data-based problem.

Make sure to try these algorithms on SageMaker, and check out the sample notebooks on GitHub that show how to use the other available built-in algorithms.


About the authors

Dr. Xin Huang is an Applied Scientist for Amazon SageMaker JumpStart and Amazon SageMaker built-in algorithms. He focuses on developing scalable machine learning algorithms. His research interests are in the area of natural language processing, explainable deep learning on tabular data, and robust analysis of non-parametric space-time clustering. He has published many papers in ACL, ICDM, KDD conferences, and Royal Statistical Society: Series A journal.

João Moura is an AI/ML Specialist Solutions Architect at Amazon Web Services. He is mostly focused on NLP use-cases and helping customers optimize Deep Learning model training and deployment. He is also an active proponent of low-code ML solutions and ML-specialized hardware.

Read More

Parallel data processing with RStudio on Amazon SageMaker

Last year, we announced the general availability of RStudio on Amazon SageMaker, the industry’s first fully managed RStudio Workbench integrated development environment (IDE) in the cloud. You can quickly launch the familiar RStudio IDE, and dial up and down the underlying compute resources without interrupting your work, making it easy to build machine learning (ML) and analytics solutions in R at scale.

With ever-increasing volumes of data being generated, datasets used for ML and statistical analysis are growing in tandem. This brings the challenges of increased development time and compute infrastructure management. To solve these challenges, data scientists have looked to implement parallel data processing techniques. Parallel data processing, or data parallelization, takes large existing datasets and distributes them across multiple processors or nodes to operate on the data simultaneously. This allows for faster processing of larger datasets, along with optimized compute usage. It can help ML practitioners create reusable patterns for dataset generation, and also help reduce compute infrastructure load and cost.

Solution overview

Within Amazon SageMaker, many customers use SageMaker Processing to help implement parallel data processing. With SageMaker Processing, you can use a simplified, managed experience on SageMaker to run your data processing workloads, such as feature engineering, data validation, model evaluation, and model interpretation. This brings many benefits because there’s no long-running infrastructure to manage—processing instances spin down when jobs are complete, environments can be standardized via containers, data within Amazon Simple Storage Service (Amazon S3) is natively distributed across instances, and infrastructure settings are flexible in terms of memory, compute, and storage.

SageMaker Processing offers options for how to distribute data. For parallel data processing, you must use the ShardedByS3Key option for the S3DataDistributionType. When this option is selected, SageMaker Processing takes the n instances provided and distributes 1/n of the objects from the input data source to each instance. For example, if two instances are provided with four data objects, each instance receives two objects.

SageMaker Processing requires three components to run processing jobs:

  • A container image that has your code and dependencies to run your data processing workloads
  • A path to an input data source within Amazon S3
  • A path to an output data source within Amazon S3

The process is depicted in the following diagram.

In this post, we show you how to use RStudio on SageMaker to interface with a series of SageMaker Processing jobs to create a parallel data processing pipeline using the R programming language.

The solution consists of the following steps:

  1. Set up the RStudio project.
  2. Build and register the processing container image.
  3. Run the two-step processing pipeline:
    1. The first step takes multiple data files and processes them across a series of processing jobs.
    2. The second step concatenates the output files and splits them into train, test, and validation datasets.

Prerequisites

Complete the following prerequisites:

  1. Set up the RStudio on SageMaker Workbench. For more information, refer to Announcing Fully Managed RStudio on Amazon SageMaker for Data Scientists.
  2. Create a user with RStudio on SageMaker with appropriate access permissions.

Set up the RStudio project

To set up the RStudio project, complete the following steps:

  1. Navigate to your Amazon SageMaker Studio control panel on the SageMaker console.
  2. Launch your app in the RStudio environment.
  3. Start a new RStudio session.
  4. For Session Name, enter a name.
  5. For Instance Type and Image, use the default settings.
  6. Choose Start Session.
  7. Navigate into the session.
  8. Choose New Project, Version control, and then Select Git.
  9. For Repository URL, enter https://github.com/aws-samples/aws-parallel-data-processing-r.git
  10. Leave the remaining options as default and choose Create Project.

You can navigate to the aws-parallel-data-processing-R directory on the Files tab to view the repository. The repository contains the following files:

  • Container_Build.rmd
  • /dataset

    • bank-additional-full-data1.csv
    • bank-additional-full-data2.csv
    • bank-additional-full-data3.csv
    • bank-additional-full-data4.csv
  • /docker
  • Dockerfile-Processing
  • Parallel_Data_Processing.rmd
  • /preprocessing

    • filter.R
    • process.R

Build the container

In this step, we build our processing container image and push it to Amazon Elastic Container Registry (Amazon ECR). Complete the following steps:

  1. Navigate to the Container_Build.rmd file.
  2. Install the SageMaker Studio Image Build CLI by running the following cell. Make sure you have the required permissions prior to completing this step; this CLI is designed to push and register container images within Studio.
    pip install sagemaker-studio-image-build

  3. Run the next cell to build and register our processing container:
    /home/sagemaker-user/.local/bin/sm-docker build . --file ./docker/Dockerfile-Processing --repository sagemaker-rstudio-parallel-processing:1.0

After the job has successfully run, you receive an output that looks like the following:

Image URI: <Account_Number>.dkr.ecr.<Region>.amazonaws.com/sagemaker-rstudio-parallel-processing:1.0

Run the processing pipeline

After you build the container, navigate to the Parallel_Data_Processing.rmd file. This file contains a series of steps that helps us create our parallel data processing pipeline using SageMaker Processing. The following diagram depicts the steps of the pipeline that we complete.

Start by running the package import step. Import the required RStudio packages along with the SageMaker SDK:

suppressWarnings(library(dplyr))
suppressWarnings(library(reticulate))
suppressWarnings(library(readr))
path_to_python <- system('which python', intern = TRUE)

use_python(path_to_python)
sagemaker <- import('sagemaker')

Now set up your SageMaker execution role and environment details:

role = sagemaker$get_execution_role()
session = sagemaker$Session()
bucket = session$default_bucket()
account_id <- session$account_id()
region <- session$boto_region_name
local_path <- dirname(rstudioapi::getSourceEditorContext()$path)

Initialize the container that we built and registered in the earlier step:

container_uri <- paste(account_id, "dkr.ecr", region, "amazonaws.com/sagemaker-rstudio-parallel-processing:1.0", sep=".")
print(container_uri)

From here we dive into each of the processing steps in more detail.

Upload the dataset

For our example, we use the Bank Marketing dataset from UCI. We have already split the dataset into multiple smaller files. Run the following code to upload the files to Amazon S3:

local_dataset_path <- paste0(local_path,"/dataset/")

dataset_files <- list.files(path=local_dataset_path, pattern="\\.csv$", full.names=TRUE)
for (file in dataset_files){
  session$upload_data(file, bucket=bucket, key_prefix="sagemaker-rstudio-example/split")
}

input_s3_split_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/split")

After the files are uploaded, move to the next step.

Perform parallel data processing

In this step, we take the data files and perform feature engineering to filter out certain columns. This job is distributed across a series of processing instances (for our example, we use two).

We use the filter.R file to process the data, and configure the job as follows:

filter_processor <- sagemaker$processing$ScriptProcessor(command=list("Rscript"),
                                                        image_uri=container_uri,
                                                        role=role,
                                                        instance_count=2L,
                                                        instance_type="ml.m5.large")

output_s3_filter_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/filtered")
s3_filter_input <- sagemaker$processing$ProcessingInput(source=input_s3_split_location,
                                                        destination="/opt/ml/processing/input",
                                                        s3_data_distribution_type="ShardedByS3Key",
                                                        s3_data_type="S3Prefix")
s3_filter_output <- sagemaker$processing$ProcessingOutput(output_name="bank-additional-full-filtered",
                                                         destination=output_s3_filter_location,
                                                         source="/opt/ml/processing/output")

filtering_step <- sagemaker$workflow$steps$ProcessingStep(name="FilterProcessingStep",
                                                      code=paste0(local_path, "/preprocessing/filter.R"),
                                                      processor=filter_processor,
                                                      inputs=list(s3_filter_input),
                                                      outputs=list(s3_filter_output))

As mentioned earlier, when running a parallel data processing job, you must adjust the input parameter with how the data will be sharded and the type of data. Therefore, we set the data distribution type to ShardedByS3Key and the data type to S3Prefix:

s3_data_distribution_type="ShardedByS3Key",
s3_data_type="S3Prefix"

After you insert these parameters, SageMaker Processing will equally distribute the data across the number of instances selected.

Adjust the parameters as necessary, and then run the cell to instantiate the job.

Generate training, test, and validation datasets

In this step, we take the processed data files, combine them, and split them into test, train, and validation datasets. This allows us to use the data for building our model.

We use the process.R file to process the data, and configure the job as follows:

script_processor <- sagemaker$processing$ScriptProcessor(command=list("Rscript"),
                                                         image_uri=container_uri,
                                                         role=role,
                                                         instance_count=1L,
                                                         instance_type="ml.m5.large")

output_s3_processed_location <- paste0("s3://", bucket, "/sagemaker-rstudio-example/processed")
s3_processed_input <- sagemaker$processing$ProcessingInput(source=output_s3_filter_location,
                                                         destination="/opt/ml/processing/input",
                                                         s3_data_type="S3Prefix")
s3_processed_output <- sagemaker$processing$ProcessingOutput(output_name="bank-additional-full-processed",
                                                         destination=output_s3_processed_location,
                                                         source="/opt/ml/processing/output")

processing_step <- sagemaker$workflow$steps$ProcessingStep(name="ProcessingStep",
                                                      code=paste0(local_path, "/preprocessing/process.R"),
                                                      processor=script_processor,
                                                      inputs=list(s3_processed_input),
                                                      outputs=list(s3_processed_output),
                                                      depends_on=list(filtering_step))

Adjust the parameters as necessary, and then run the cell to instantiate the job.

Run the pipeline

After all the steps are instantiated, start the processing pipeline to run each step by running the following cell:

pipeline = sagemaker$workflow$pipeline$Pipeline(
  name="BankAdditionalPipelineUsingR",
  steps=list(filtering_step, processing_step)
)

upserted <- pipeline$upsert(role_arn=role)
execution <- pipeline$start()

execution$describe()
execution$wait()

The time each of these jobs takes will vary based on the instance size and count selected.

Navigate to the SageMaker console to see all your processing jobs.

We start with the filtering job, as shown in the following screenshot.

When that’s complete, the pipeline moves to the data processing job.

When both jobs are complete, navigate to your S3 bucket. Look within the sagemaker-rstudio-example folder, under processed. You can see the files for the train, test, and validation datasets.

Conclusion

As more and more data is required to build increasingly sophisticated models, we need to change our approach to how we process data. Parallel data processing is an efficient method for accelerating dataset generation. When coupled with modern cloud environments and tooling such as RStudio on SageMaker and SageMaker Processing, it can remove much of the undifferentiated heavy lifting of infrastructure management, boilerplate code generation, and environment management. In this post, we walked through how you can implement parallel data processing within RStudio on SageMaker. We encourage you to try it out by cloning the GitHub repository, and if you have suggestions on how to make the experience better, please submit an issue or a pull request.

To learn more about the features and services used in this solution, refer to RStudio on Amazon SageMaker and Amazon SageMaker Processing.


About the authors

Raj Pathak is a Solutions Architect and Technical advisor to Fortune 50 and Mid-Sized FSI (Banking, Insurance, Capital Markets) customers across Canada and the United States. Raj specializes in Machine Learning with applications in Document Extraction, Contact Center Transformation and Computer Vision.

Jake Wen is a Solutions Architect at AWS with passion for ML training and Natural Language Processing. Jake helps Small Medium Business customers with design and thought leadership to build and deploy applications at scale. Outside of work, he enjoys hiking.

Aditi Rajnish is a first-year software engineering student at University of Waterloo. Her interests include computer vision, natural language processing, and edge computing. She is also passionate about community-based STEM outreach and advocacy. In her spare time, she can be found rock climbing, playing the piano, or learning how to bake the perfect scone.

Sean Morgan is an AI/ML Solutions Architect at AWS. He has experience in the semiconductor and academic research fields, and uses his experience to help customers reach their goals on AWS. In his free time, Sean is an active open-source contributor and maintainer, and is the special interest group lead for TensorFlow Add-ons.

Paul Wu is a Solutions Architect working in AWS’ Greenfield Business in Texas. His areas of expertise include containers and migrations.

Read More

Discover insights from Zendesk with Amazon Kendra intelligent search

Customer relationship management (CRM) is a critical tool that organizations maintain to manage customer interactions and build business relationships. Zendesk is a CRM tool that makes it easy for customers and businesses to keep in sync. Zendesk captures a wealth of customer data, such as support tickets created and updated by customers and service agents, community discussions, and helpful guides. With such a wealth of complex data, simple keyword searches don’t suffice when it comes to discovering meaningful, accurate customer information.

Now you can use the Amazon Kendra Zendesk connector to index your Zendesk service tickets, help guides, and community posts, and perform intelligent search powered by machine learning (ML). Amazon Kendra smartly and efficiently answers natural language-based queries using advanced natural language processing (NLP) techniques. It can learn effectively from your Zendesk data, extracting meaning and context.

This post shows how to configure the Amazon Kendra Zendesk connector to index your Zendesk domain and take advantage of Amazon Kendra intelligent search. We use an example of an illustrative Zendesk domain to discuss technical topics related to AWS services.

Overview of solution

Amazon Kendra was built for intelligent search using NLP. You can use Amazon Kendra to ask factoid questions, descriptive questions, and perform keyword searches. You can use the Amazon Kendra connector for Zendesk to crawl your Zendesk domain and index service tickets, guides, and community posts to discover answers for your questions faster.

In this post, we show how to use the Amazon Kendra connector for Zendesk to index data from your Zendesk domain for intelligent search.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • An AWS account
  • Administrator level access to your Zendesk domain
  • Privileges to create an Amazon Kendra index, AWS resources, and AWS Identity and Access Management (IAM) roles and policies
  • Basic knowledge of AWS services and working knowledge of Zendesk

Configure your Zendesk domain

Your Zendesk domain has a domain owner or administrator, service group administrators, and a customer. Sample service tickets, community posts, and guides have been created for the purpose of this walkthrough. A Zendesk API client with the unique identifier amazon_kendra is registered to create an OAuth token for accessing your Zendesk domain from Amazon Kendra for crawling and indexing. The following screenshot shows the details of the OAuth configuration for the Zendesk API client.

Configure the data source using the Amazon Kendra connector for Zendesk

You can add the Zendesk connector data source to an existing Amazon Kendra index or create a new index. Then complete the following steps to configure the Zendesk connector:

  1. On the Amazon Kendra console, open the index and choose Data sources in the navigation pane.
  2. Under Zendesk, choose Add connector.
  3. Choose Add connector.
  4. In the Specify data source details section, enter the name and description of your data source and choose Next.
  5. In the Define access and security section, for Zendesk URL, enter the URL to your Zendesk domain. Use the URL format https://<domain>.zendesk.com/.
  6. Under Authentication, you can either choose Create to add a new secret using the user OAuth token created for the amazon_kendra API client, or use an existing AWS Secrets Manager secret that has the user OAuth token for the Zendesk domain that you want the connector to access.
  7. Optionally, configure a new AWS secret for Zendesk API access.
  8. For IAM role, you can choose Create a new role or choose an existing IAM role configured with appropriate IAM policies to access the Secrets Manager secret, Amazon Kendra index, and data source.
  9. Choose Next.
  10. In the Configure sync settings section, provide information regarding your sync scope and run schedule.
  11. Choose Next.
  12. In the Set field mappings section, you can optionally configure the field mappings, or how the Zendesk field names are mapped to Amazon Kendra attributes or facets.
  13. Choose Next.
  14. Review your settings and confirm to add the data source.
  15. After the data source is created, select the data source and choose Sync Now.
  16. Choose Facet definition in the navigation pane.
  17. Select the check box in the Facetable column for the facet _category.

Run queries with the Amazon Kendra search console

Now that the data is synced, we can run a few search queries on the Amazon Kendra search console by navigating to the Search indexed content page.

For the first query, we ask Amazon Kendra a general question related to AWS service durability. The following screenshot shows the response. The suggested answer provides the correct answer to the query by applying natural language comprehension.

For our second query, let’s query Amazon Kendra to search for product issues from Zendesk service tickets. The following screenshot shows the response for the search, along with facets showing various categories of documents included in the result.

Notice the search result includes the URL to the source document as well. Choosing the URL takes us directly to the Zendesk service ticket page, as shown in the following screenshot.

Clean up

To avoid incurring future charges, clean up any resources created as part of this solution. Delete the Zendesk connector data source so any data indexed from the source is removed from the index. If you created a new Amazon Kendra index, delete the index as well.

Conclusion

In this post, we discussed how to configure the Amazon Kendra connector for Zendesk to crawl and index service tickets, community posts, and help guides. We showed how Amazon Kendra ML-based search enables your business leaders and agents to discover insights from your Zendesk content quicker and respond to customer needs faster.

To learn more about the Amazon Kendra connector for Zendesk, refer to the Amazon Kendra Developer Guide.


About the author

Rajesh Kumar Ravi is a Senior AI Services Solution Architect at Amazon Web Services specializing in intelligent document search with Amazon Kendra. He is a builder and problem solver, and contributes to the development of new ideas. He enjoys walking and loves to go on short hiking trips outside of work.

Read More

Amazon SageMaker Automatic Model Tuning now provides up to three times faster hyperparameter tuning with Hyperband

Amazon SageMaker Automatic Model Tuning introduces Hyperband, a multi-fidelity technique to tune hyperparameters as a faster and more efficient way to find an optimal model. In this post, we show how automatic model tuning with Hyperband can provide faster hyperparameter tuning—up to three times as fast.

The benefits of Hyperband

Hyperband presents two advantages over existing black-box tuning strategies: efficient resource utilization and a better time-to-convergence.

Machine learning (ML) training is increasingly intensive, involving complex models and large datasets, and requires a lot of effort and resources to find the optimal hyperparameters. Traditional black-box search strategies, such as Bayesian optimization, random search, or grid search, tend to scale linearly with the complexity of the ML problem at hand, requiring longer training time.

To speed up hyperparameter tuning and optimize training cost, Hyperband uses Asynchronous Successive Halving Algorithm (ASHA), a strategy that massively parallelizes hyperparameter tuning and automatically stops training jobs early by using previously evaluated configurations to predict whether a specific candidate is promising or not.

As we demonstrate in this post, Hyperband converges to the optimal objective metric faster than most black-box strategies and therefore saves training time. This allows you to tune larger-scale models where evaluating each hyperparameter configuration requires running an expensive training loop, such as in computer vision and natural language processing (NLP) applications. If you’re interested in finding the most accurate models, Hyperband also allows you to run your tuning jobs with more resources and converge to a better solution.

Hyperband with SageMaker

The new Hyperband strategy for hyperparameter tuning introduces a few new data elements that you configure through AWS API calls. Configuration via the AWS Management Console is not available at this time. Let’s look at the data elements introduced for Hyperband:

  • Strategy – This defines the hyperparameter tuning approach you want to use. A new Hyperband value is introduced with this change. Valid values are Bayesian, Random, and Hyperband.
  • MinResource – Defines the minimum number of epochs or iterations to be used for a training job before a decision is made to stop the training.
  • MaxResource – Specifies the maximum number of epochs or iterations to be used for a training job to achieve the objective. This parameter is not required if you have the number of training epochs defined as a hyperparameter in the tuning job.

Implementation

The following sample code shows the tuning job config (the accompanying training job definition, which specifies the algorithm container, data channels, and compute, is configured separately):

tuning_job_config = {
    "Strategy": "Hyperband",
    "StrategyConfig": {
        "HyperbandStrategyConfig": {
            "MinResource": 10,
            "MaxResource": 100
        }
    },
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 20,
        "MaxParallelTrainingJobs": 5
    },
    "HyperParameterTuningJobObjective": {
        "MetricName": "validation:auc",
        "Type": "Maximize"
    },
    "ParameterRanges": {
        "CategoricalParameterRanges": [],
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0", "MaxValue": "1"},
            {"Name": "min_child_weight", "MinValue": "1", "MaxValue": "10"},
            {"Name": "alpha", "MinValue": "0", "MaxValue": "2"},
        ],
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "1", "MaxValue": "10"},
        ],
    },
}

The preceding code defines strategy as Hyperband and also defines the lower and upper bound resource limits inside the strategy configuration using HyperbandStrategyConfig, which serves as a lever to control the training runtime. For more details on how to configure and run automatic model tuning, refer to Specify the Hyperparameter Tuning Job Settings.
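For illustration, the following is a minimal sketch of launching the tuning job with this config through the low-level API. The job name is an arbitrary example, and training_job_definition (which specifies the algorithm container, data channels, role, and stopping condition) is assumed to be defined separately.

import boto3

sm_client = boto3.client("sagemaker")

sm_client.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="hyperband-tuning-demo",   # arbitrary example name
    HyperParameterTuningJobConfig=tuning_job_config,
    TrainingJobDefinition=training_job_definition,         # assumed to be defined elsewhere
)

# Poll the tuning job status
response = sm_client.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="hyperband-tuning-demo"
)
print(response["HyperParameterTuningJobStatus"])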

Hyperband compared to black-box search

In this section, we perform two experiments to compare Hyperband to a black-box search.

First experiment

In the first experiment, given a binary classification task, we aim to optimize a three-layer fully-connected network on a synthetic dataset, which contains 5,000 data points. The hyperparameters for the network include the number of units per layer, the learning rate of the Adam optimizer, the L2 regularization parameter, and the batch size. The ranges of these parameters are:

  • Number of units per layer – 10 to 1e3
  • Learning rate – 1e-4 to 0.1
  • L2 regularization parameter – 1e-6 to 2
  • Batch size – 10 to 200

The following graph shows our findings.

Comparing Hyperband with other strategies, given a target accuracy of 96%, Hyperband can achieve this target in 357 seconds, while Bayesian needs 560 seconds and random search needs 614 seconds. Hyperband consistently finds a more optimal solution at any given wall-clock time budget. This shows a clear advantage of a multi-fidelity optimization algorithm.

Second experiment

In this second experiment, we consider the Cifar10 dataset and train ResNet-20, a popular network architecture for computer vision tasks. We run all experiments on ml.g4dn.xlarge instances and optimize the neural network through SGD. We also apply standard image augmentation including random flip and random crop. The search space contains the following parameters:

  • Mini-batch size: an integer from 4 to 512
  • Learning rate of SGD: a float from 1e-6 to 1e-1
  • Momentum of SGD: a float from 0.01 to 0.99
  • Weight decay: a float from 1e-5 to 1

The following graph illustrates our findings.

Given a target validation accuracy of 0.87, Hyperband can reach it in fewer than 2,000 seconds, while Random and Bayesian require 10,000 and almost 9,000 seconds, respectively. This amounts to a speed-up factor of ~5x and ~4.5x for Hyperband on this task compared to the Random and Bayesian strategies. This shows a clear advantage for the multi-fidelity optimization algorithm, which significantly reduces the wall-clock time it takes to tune your deep learning models.

Conclusion

SageMaker Automatic Model Tuning allows you to reduce the time to tune a model by automatically searching for the best hyperparameter configuration within the ranges that you specify. You can find the best version of your model by running training jobs on your dataset with several search strategies, such as black-box or multi-fidelity strategies.

In this post, we discussed how you can now use a multi-fidelity strategy called Hyperband in SageMaker to find the best model. The support for Hyperband makes it possible for SageMaker Automatic Model Tuning to tune larger-scale models where evaluating each hyperparameter configuration requires running an expensive training loop, such as in computer vision and NLP applications.

Finally, we saw how Hyperband further optimizes runtime compared to black-box strategies with early stopping by using previously evaluated configurations to predict whether a specific candidate is promising and, if not, stop the evaluation to reduce the overall time and compute cost. Using Hyperband in SageMaker also allows you to specify the minimum and maximum resource in the HyperbandStrategyConfig parameter for further runtime controls.

To learn more, visit Perform Automatic Model Tuning with SageMaker.


About the authors

Doug Mbaya is a Senior Partner Solution architect with a focus in data and analytics. Doug works closely with AWS partners, helping them integrate data and analytics solutions in the cloud.

Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the Financial Services industry with their operations in AWS. As a machine learning specialist, Gopi works to help customers succeed in their ML journey.

Xingchen Ma is an Applied Scientist at AWS. He works in the team owning the service for SageMaker Automatic Model Tuning.

Read More

Read webpages and highlight content using Amazon Polly

In this post, we demonstrate how to use Amazon Polly—a leading cloud service that converts text into lifelike speech—to read the content of a webpage and highlight the content as it’s being read. Adding audio playback to a webpage improves the accessibility and visitor experience of the page. Audio-enhanced content is more impactful and memorable, draws more traffic to the page, and taps into the spending power of visitors. It also improves the brand of the company or organization that publishes the page. Text-to-speech technology makes these business benefits attainable. We accelerate that journey by demonstrating how to achieve this goal using Amazon Polly.

This capability improves accessibility for visitors with disabilities, and could be adopted as part of your organization’s accessibility strategy. Just as importantly, it enhances the page experience for visitors without disabilities. Both groups have significant spending power and spend more freely from pages that use audio enhancement to grab their attention.

Overview of solution

PollyReadsThePage (PRTP)—as we refer to the solution—allows a webpage publisher to drop an audio control onto their webpage. When the visitor chooses Play on the control, the control reads the page and highlights the content. PRTP uses the general capability of Amazon Polly to synthesize speech from text. It invokes Amazon Polly to generate two artifacts for each page:

  • The audio content in a format playable by the browser: MP3
  • A speech marks file that indicates for each sentence of text:
    • The time during playback that the sentence is read
    • The location on the page the sentence appears

When the visitor chooses Play, the browser plays the MP3 file. As the audio is read, the browser checks the time, finds in the marks file which sentence to read at that time, locates it on the page, and highlights it.

PRTP allows the visitor to read in different voices and languages. Each voice requires its own pair of files. PRTP uses neural voices. For a list of supported neural voices and languages, see Neural Voices. For a full list of standard and neural voices in Amazon Polly, see Voices in Amazon Polly.

We consider two types of webpages: static and dynamic pages. In a static page, the content is contained within the page and changes only when a new version of the page is published. The company might update the page daily or weekly as part of its web build process. For this type of page, it’s possible to pre-generate the audio files at build time and place them on the web server for playback. As the following figure shows, the script PRTP Pre-Gen invokes Amazon Polly to generate the audio. It takes as input the HTML page itself and, optionally, a configuration file that specifies which text from the page to extract (Text Extract Config). If the extract config is omitted, the pre-gen script makes a sensible choice of text to extract from the body of the page. Amazon Polly outputs the files in an Amazon Simple Storage Service (Amazon S3) bucket; the script copies them to your web server. When the visitor plays the audio, the browser downloads the MP3 directly from the web server. For highlights, a drop-in library, PRTP.js, uses the marks file to highlight the text being read.

Solution Architecture Diagram

The content of a dynamic page changes in response to the visitor interaction, so audio can’t be pre-generated but must be synthesized dynamically. As the following figure shows, when the visitor plays the audio, the page uses PRTP.js to generate the audio in Amazon Polly, and it highlights the synthesized audio using the same approach as with static pages. To access AWS services from the browser, the visitor requires an AWS identity. We show how to use an Amazon Cognito identity pool to allow the visitor just enough access to Amazon Polly and the S3 bucket to render the audio.

Dynamic Content

Generating both MP3 audio and speech marks requires Amazon Polly to synthesize the same input twice. Refer to the Amazon Polly Pricing Page to understand the cost implications. Pre-generation saves costs because synthesis is performed at build time rather than on demand for each visitor interaction.

The code accompanying this post is available as an open-source repository on GitHub.

To explore the solution, we follow these steps:

  1. Set up the resources, including the pre-gen build server, S3 bucket, web server, and Amazon Cognito identity.
  2. Run the static pre-gen build and test static pages.
  3. Test dynamic pages.

Prerequisites

To run this example, you need an AWS account with permission to use Amazon Polly, Amazon S3, Amazon Cognito, and (for demo purposes) AWS Cloud9.

Provision resources

We share an AWS CloudFormation template to create in your account a self-contained demo environment to help you follow along with the post. If you prefer to set up PRTP in your own environment, refer to instructions in README.md.

To provision the demo environment using CloudFormation, first download a copy of the CloudFormation template. Then complete the following steps:

  1. On the AWS CloudFormation console, choose Create stack.
  2. Choose With new resources (standard).
  3. Select Upload a template file.
  4. Choose Choose file to upload the local copy of the template that you downloaded. The name of the file is prtp.yml.
  5. Choose Next.
  6. Enter a stack name of your choosing. Later you enter this again as a replacement for <StackName>.
  7. You may keep default values in the Parameters section.
  8. Choose Next.
  9. Continue through the remaining sections.
  10. Read and select the check boxes in the Capabilities section.
  11. Choose Create stack.
  12. When the stack is complete, find the value for BucketName in the stack outputs.

We encourage you to review the stack with your security team prior to using it in a production environment.

Set up the web server and pre-gen server in an AWS Cloud9 IDE

Next, on the AWS Cloud9 console, locate the environment PRTPDemoCloud9 created by the CloudFormation stack. Choose Open IDE to open the AWS Cloud9 environment. Open a terminal window and run the following commands, which clone the PRTP code, set up the pre-gen dependencies, and start a web server to test with:

#Obtain PRTP code
cd /home/ec2-user/environment
git clone https://github.com/aws-samples/amazon-polly-reads-the-page.git

# Navigate to that code
cd amazon-polly-reads-the-page/setup

# Install Saxon and html5 Python lib. For pre-gen.
sh ./setup.sh <StackName>

# Run Python simple HTTP server
cd ..
./runwebserver.sh <IngressCIDR> 

For <StackName>, use the name you gave the CloudFormation stack. For <IngressCIDR>, specify a range of IP addresses allowed to access the web server. To restrict access to the browser on your local machine, find your IP address using https://whatismyipaddress.com/ and append /32 to specify the range. For example, if your IP is 10.2.3.4, use 10.2.3.4/32. The server listens on port 8080. The public IP address on which the server listens is given in the output. For example:

Public IP is

3.92.33.223

Test static pages

In your browser, navigate to PRTPStaticDefault.html. (If you’re using the demo, the URL is http://<cloud9host>:8080/web/PRTPStaticDefault.html, where <cloud9host> is the public IP address that you discovered in setting up the IDE.) Choose Play on the audio control at the top. Listen to the audio and watch the highlights. Explore the control by changing speeds, changing voices, pausing, fast-forwarding, and rewinding. The following screenshot shows the page; the text “Skips hidden paragraph” is highlighted because it is currently being read.

Try the same for PRTPStaticConfig.html and PRTPStaticCustom.html. The results are similar. For example, all three read the alt text for the photo of the cat (“Random picture of a cat”). All three read NE, NW, SE, and SW as full words (“northeast,” “northwest,” “southeast,” “southwest”), taking advantage of Amazon Polly lexicons.

Notice the main differences in audio:

  • PRTPStaticDefault.html reads all the text in the body of the page, including the wrapup portion at the bottom with “Your thoughts in one word,” “Submit Query,” “Last updated April 1, 2020,” and “Questions for the dev team.” PRTPStaticConfig.html and PRTPStaticCustom.html don’t read these because they explicitly exclude the wrapup from speech synthesis.
  • PRTPStaticCustom.html reads the QB Best Sellers table differently from the others. It reads the first three rows only, and reads the row number for each row. It repeats the columns for each row. PRTPStaticCustom.html uses a custom transformation to tailor the readout of the table. The other pages use default table rendering.
  • PRTPStaticCustom.html reads “Tom Brady” at a louder volume than the rest of the text. It uses the speech synthesis markup language (SSML) prosody tag to tailor the reading of Tom Brady. The other pages don’t tailor in this way.
  • PRTPStaticCustom.html, thanks to a custom transformation, reads the main tiles in NW, SW, NE, SE order; that is, it reads “Today’s Articles,” “Quote of the Day,” “Photo of the Day,” “Jokes of the Day.” The other pages read in the order the tiles appear in the natural NW, NE, SW, SE order they appear in the HTML: “Today’s Articles,” “Photo of the Day,” “Quote of the Day,” “Jokes of the Day.”

Let’s dig deeper into how the audio is generated, and how the page highlights the text.

Static pre-generator

Our GitHub repo includes pre-generated audio files for the PRTPStatic pages, but if you want to generate them yourself, from the bash shell in the AWS Cloud9 IDE, run the following commands:

# navigate to examples
cd /home/ec2-user/environment/amazon-polly-reads-the-page/pregen/examples

# Set env var for my S3 bucket. Example, I called mine prtp-output
S3_BUCKET=prtp-output # Use output BucketName from CloudFormation

#Add lexicon for pronunciation of NE NW SE SW
#Script invokes aws polly put-lexicon
./addlexicon.sh

#Gen each variant
./gen_default.sh
./gen_config.sh
./gen_custom.sh

Now let’s look at how those scripts work.

Default case

We begin with gen_default.sh:

cd ..
python FixHTML.py ../web/PRTPStaticDefault.html  
   example/tmp_wff.html
./gen_ssml.sh example/tmp_wff.html generic.xslt example/tmp.ssml
./run_polly.sh example/tmp.ssml en-US Joanna 
   ../web/polly/PRTPStaticDefault compass
./run_polly.sh example/tmp.ssml en-US Matthew 
   ../web/polly/PRTPStaticDefault compass

The script begins by running the Python program FixHTML.py to make the source HTML file PRTPStaticDefault.html well-formed. It writes the well-formed version of the file to example/tmp_wff.html. This step is crucial for two reasons:

  • Most source HTML is not well formed. This step repairs the source HTML to be well formed. For example, many HTML pages don’t close P elements. This step closes them.
  • We keep track of where in the HTML page we find text. We need to track locations using the same document object model (DOM) structure that the browser uses. For example, the browser automatically adds a TBODY to a TABLE. The Python program follows the same well-formed repairs as the browser.

gen_ssml.sh takes the well-formed HTML as input, applies an XML stylesheet transformation (XSLT) transformation to it, and outputs an SSML file. (SSML is the language in Amazon Polly to control how audio is rendered from text.) In the current example, the input is example/tmp_wff.html. The output is example/tmp.ssml. The transform’s job is to decide what text to extract from the HTML and feed to Amazon Polly. generic.xslt is a sensible default XSLT transform for most webpages. In the following example code snippet, it excludes the audio control, the HTML header, as well as HTML elements like script and form. It also excludes elements with the hidden attribute. It includes elements that typically contain text, such as P, H1, and SPAN. For these, it renders both a mark, including the full XPath expression of the element, and the value of the element.

<!-- skip the header -->
<xsl:template match="html/head">
</xsl:template>

<!-- skip the audio itself -->
<xsl:template match="html/body/table[@id='prtp-audio']">
</xsl:template>

<!-- For the body, work through it by applying its templates. This is the default. -->
<xsl:template match="html/body">
<speak>
      <xsl:apply-templates />
</speak>
</xsl:template>

<!-- skip these -->
<xsl:template match="audio|option|script|form|input|*[@hidden='']">
</xsl:template>

<!-- include these -->
<xsl:template match="p|h1|h2|h3|h4|li|pre|span|a|th/text()|td/text()">
<xsl:for-each select=".">
<p>
      <mark>
          <xsl:attribute name="name">
          <xsl:value-of select="prtp:getMark(.)"/>
          </xsl:attribute>
      </mark>
      <xsl:value-of select="normalize-space(.)"/>
</p>
</xsl:for-each>
</xsl:template>

The following is a snippet of the SSML that is rendered. This is fed as input to Amazon Polly. Notice, for example, that the text “Skips hidden paragraph” is to be read in the audio, and we associate it with a mark, which tells us that this text occurs in the location on the page given by the XPath expression /html/body[1]/div[2]/ul[1]/li[1].

<speak>
<p><mark name="/html/body[1]/div[1]/h1[1]"/>PollyReadsThePage Normal Test Page</p>
<p><mark name="/html/body[1]/div[2]/p[1]"/>PollyReadsThePage is a test page for audio readout with highlights.</p>
<p><mark name="/html/body[1]/div[2]/p[2]"/>Here are some features:</p>
<p><mark name="/html/body[1]/div[2]/ul[1]/li[1]"/>Skips hidden paragraph</p>
<p><mark name="/html/body[1]/div[2]/ul[1]/li[2]"/>Speaks but does not highlight collapsed content</p>
…
</speak>

To generate audio in Amazon Polly, we call the script run_polly.sh. It runs the AWS Command Line Interface (AWS CLI) command aws polly start-speech-synthesis-task twice: once to generate MP3 audio, and again to generate the marks file. Because the generation is asynchronous, the script polls until it finds the output in the specified S3 bucket. When it finds the output, it downloads the files to the build server and copies them to the web/polly folder. The following is a listing of the web folders:

  • PRTPStaticDefault.html
  • PRTPStaticConfig.html
  • PRTPStaticCustom.html
  • PRTP.js
  • polly/PRTPStaticDefault/Joanna.mp3, Joanna.marks, Matthew.mp3, Matthew.marks
  • polly/PRTPStaticConfig/Joanna.mp3, Joanna.marks, Matthew.mp3, Matthew.marks
  • polly/PRTPStaticCustom/Joanna.mp3, Joanna.marks, Matthew.mp3, Matthew.marks

Each page has its own set of voice-specific MP3 and marks files. These files are the pre-generated files. The page doesn’t need to invoke Amazon Polly at runtime; the files are part of the web build.
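
For reference, the following is a hedged boto3 sketch of the same asynchronous flow that run_polly.sh drives through the AWS CLI; the bucket, key prefix, and lexicon name are assumptions based on the script arguments:

# Hedged sketch only; run_polly.sh does this with the AWS CLI.
import time
import boto3

polly = boto3.client("polly")

def start_task(output_format, **extra):
    with open("example/tmp.ssml") as f:
        ssml = f.read()
    task = polly.start_speech_synthesis_task(
        OutputFormat=output_format,           # "mp3" for audio, "json" for marks
        OutputS3BucketName="prtp-output",     # placeholder bucket
        OutputS3KeyPrefix="PRTPStaticDefault/",
        Text=ssml,
        TextType="ssml",
        VoiceId="Joanna",
        LanguageCode="en-US",
        LexiconNames=["compass"],             # assumption: lexicon added by addlexicon.sh
        **extra,
    )
    return task["SynthesisTask"]["TaskId"]

task_ids = [
    start_task("mp3"),                                         # audio
    start_task("json", SpeechMarkTypes=["sentence", "ssml"]),  # marks
]

# Poll until both tasks finish; the outputs appear in the S3 bucket.
for task_id in task_ids:
    while True:
        status = polly.get_speech_synthesis_task(TaskId=task_id)["SynthesisTask"]
        if status["TaskStatus"] in ("completed", "failed"):
            print(task_id, status["TaskStatus"], status.get("OutputUri"))
            break
        time.sleep(5)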

Config-driven case

Next, consider gen_config.sh:

cd ..
python FixHTML.py ../web/PRTPStaticConfig.html example/tmp_wff.html
python ModGenericXSLT.py example/transform_config.json example/tmp.xslt
./gen_ssml.sh example/tmp_wff.html example/tmp.xslt example/tmp.ssml
./run_polly.sh example/tmp.ssml en-US Joanna ../web/polly/PRTPStaticConfig compass
./run_polly.sh example/tmp.ssml en-US Matthew ../web/polly/PRTPStaticConfig compass

The script is similar to the script in the default case; the main differences are the ModGenericXSLT.py step and the use of the generated example/tmp.xslt as the transform. Our approach is config-driven: we tailor the content to be extracted from the page by specifying what to extract through configuration, not code. In particular, we use the JSON file transform_config.json, which specifies that the content to include is the elements with IDs title, main, maintable, and qbtable, and that the element with ID wrapup should be excluded. See the following code:

{
 "inclusions": [ 
 	{"id" : "title"} , 
 	{"id": "main"}, 
 	{"id": "maintable"}, 
 	{"id": "qbtable" }
 ],
 "exclusions": [
 	{"id": "wrapup"}
 ]
}

We run the Python program ModGenericXSLT.py to modify generic.xslt, used in the default case, to use the inclusions and exclusions that we specify in transform_config.json. The program writes the results to a temp file (example/tmp.xslt), which it passes to gen_ssml.sh as its XSLT transform.

This is a low-code option. The web publisher doesn’t need to know how to write XSLT. But they do need to understand the structure of the HTML page and the IDs used in its main organizing elements.
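
As a hypothetical illustration of the idea (this is not the repo’s ModGenericXSLT.py), the configuration reduces to XPath expressions that a stylesheet template can match on:

# Hypothetical illustration: map the inclusion/exclusion config to XPath
# match expressions. The real ModGenericXSLT.py rewrites generic.xslt.
import json

def config_to_xpaths(config_path):
    with open(config_path) as f:
        cfg = json.load(f)
    include = " | ".join(f"//*[@id='{e['id']}']" for e in cfg.get("inclusions", []))
    exclude = " | ".join(f"//*[@id='{e['id']}']" for e in cfg.get("exclusions", []))
    return include, exclude

include_xpath, exclude_xpath = config_to_xpaths("example/transform_config.json")
print("include:", include_xpath)  # //*[@id='title'] | //*[@id='main'] | ...
print("exclude:", exclude_xpath)  # //*[@id='wrapup']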

Customization case

Finally, consider gen_custom.sh:

cd ..
python FixHTML.py ../web/PRTPStaticCustom.html example/tmp_wff.html
./gen_ssml.sh example/tmp_wff.html example/custom.xslt example/tmp.ssml
./run_polly.sh example/tmp.ssml en-US Joanna ../web/polly/PRTPStaticCustom compass
./run_polly.sh example/tmp.ssml en-US Matthew ../web/polly/PRTPStaticCustom compass

This script is nearly identical to the default script, except it uses its own XSLT—example/custom.xslt—rather than the generic XSLT. The following is a snippet of the XSLT:

<!-- Use NW, SW, NE, SE order for main tiles! -->
<xsl:template match="*[@id='maintable']">
    <mark>
        <xsl:attribute name="name">
        <xsl:value-of select="stats:getMark(.)"/>
        </xsl:attribute>
    </mark>
    <xsl:variable name="tiles" select="./tbody"/>
    <xsl:variable name="tiles-nw" select="$tiles/tr[1]/td[1]"/>
    <xsl:variable name="tiles-ne" select="$tiles/tr[1]/td[2]"/>
    <xsl:variable name="tiles-sw" select="$tiles/tr[2]/td[1]"/>
    <xsl:variable name="tiles-se" select="$tiles/tr[2]/td[2]"/>
    <xsl:variable name="tiles-seq" select="($tiles-nw,  $tiles-sw, $tiles-ne, $tiles-se)"/>
    <xsl:for-each select="$tiles-seq">
         <xsl:apply-templates />  
    </xsl:for-each>
</xsl:template>   

<!-- Say Tom Brady loud! -->
<xsl:template match="span[@style = 'color:blue']" >
<p>
      <mark>
          <xsl:attribute name="name">
          <xsl:value-of select="prtp:getMark(.)"/>
          </xsl:attribute>
      </mark>
      <prosody volume="x-loud">Tom Brady</prosody>
</p>
</xsl:template>

If you want to study the code in detail, refer to the scripts and programs in the GitHub repo.

Browser setup and highlights

The static pages include an HTML5 audio control, which takes as its audio source the MP3 file generated by Amazon Polly and residing on the web server:

<audio id="audio" controls>
  <source src="polly/PRTPStaticDefault/en/Joanna.mp3" type="audio/mpeg">
</audio>

At load time, the page also loads the Amazon Polly-generated marks file. This occurs in the PRTP.js file, which the HTML page includes. The following is a snippet of the marks file for PRTPStaticDefault:

{"time":11747,"type":"sentence","start":289,"end":356,"value":"PollyReadsThePage is a test page for audio readout with highlights."}
{"time":15784,"type":"ssml","start":363,"end":403,"value":"/html/body[1]/div[2]/p[2]"}
{"time":16427,"type":"sentence","start":403,"end":426,"value":"Here are some features:"}
{"time":17677,"type":"ssml","start":433,"end":480,"value":"/html/body[1]/div[2]/ul[1]/li[1]"}
{"time":18344,"type":"sentence","start":480,"end":502,"value":"Skips hidden paragraph"}
{"time":19894,"type":"ssml","start":509,"end":556,"value":"/html/body[1]/div[2]/ul[1]/li[2]"}
{"time":20537,"type":"sentence","start":556,"end":603,"value":"Speaks but does not highlight collapsed content"}

During audio playback, an audio timer event handler in PRTP.js checks the audio’s current time, finds the text to highlight, finds its location on the page, and highlights it. The text to be highlighted is an entry of type sentence in the marks file. The location is the XPath expression in the name attribute of the entry of type ssml that precedes the sentence. For example, if the time is 18400, according to the marks file, the sentence to be highlighted is “Skips hidden paragraph,” which starts at 18344. The location is the ssml entry at time 17677: /html/body[1]/div[2]/ul[1]/li[1].
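
The lookup itself is straightforward. The following Python sketch shows the idea (PRTP.js implements the equivalent in JavaScript):

# Illustration of the highlight lookup; entries come from the marks file
# and are ordered by time (milliseconds).
def find_highlight(marks, current_ms):
    sentence, xpath, last_ssml = None, None, None
    for m in marks:
        if m["time"] > current_ms:
            break
        if m["type"] == "ssml":
            last_ssml = m["value"]
        elif m["type"] == "sentence":
            # Pair the sentence with the ssml mark that precedes it.
            sentence, xpath = m["value"], last_ssml
    return sentence, xpath

marks = [
    {"time": 17677, "type": "ssml", "value": "/html/body[1]/div[2]/ul[1]/li[1]"},
    {"time": 18344, "type": "sentence", "value": "Skips hidden paragraph"},
]
print(find_highlight(marks, 18400))
# ('Skips hidden paragraph', '/html/body[1]/div[2]/ul[1]/li[1]')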

Test dynamic pages

The page PRTPDynamic.html demonstrates dynamic audio readback using default, configuration-driven, and custom audio extraction approaches.

Default case

In your browser, navigate to PRTPDynamic.html. The page has one query parameter, dynOption, which accepts values default, config, and custom. It defaults to default, so you may omit it in this case. The page has two sections with dynamic content:

  • Latest Articles – Changes frequently throughout the day
  • Greek Philosophers Search By Date – Allows the visitor to search for Greek philosophers by date and shows the results in a table

Create some content in the Greek Philosophers section by entering a date range, such as -800 to 0, and then choosing Find.

Now play the audio by choosing Play in the audio control.

Behind the scenes, the page runs the following code to render and play the audio:

   buildSSMLFromDefault();
   chooseRenderAudio();
   setVoice();

First it calls the function buildSSMLFromDefault in PRTP.js to extract most of the text from the HTML page body. That function walks the DOM tree, looking for text in common elements such as p, h1, pre, span, and td. It ignores text in elements that usually don’t contain text to be read aloud, such as audio, option, and script. It builds SSML markup to be input to Amazon Polly. The following is a snippet showing extraction of the first row from the philosopher table:

<speak>
...
  <p><mark name="/HTML[1]/BODY[1]/DIV[3]/DIV[1]/DIV[1]/TABLE[1]/TBODY[1]/TR[2]/TD[1]"/>Thales</p>
  <p><mark name="/HTML[1]/BODY[1]/DIV[3]/DIV[1]/DIV[1]/TABLE[1]/TBODY[1]/TR[2]/TD[2]"/>-624 to -546</p>
  <p><mark name="/HTML[1]/BODY[1]/DIV[3]/DIV[1]/DIV[1]/TABLE[1]/TBODY[1]/TR[2]/TD[3]"/>Miletus</p>
  <p><mark name="/HTML[1]/BODY[1]/DIV[3]/DIV[1]/DIV[1]/TABLE[1]/TBODY[1]/TR[2]/TD[4]"/>presocratic</p>
...
</speak>

The chooseRenderAudio function in PRTP.js begins by initializing the AWS SDK for Amazon Cognito, Amazon S3, and Amazon Polly. This initialization occurs only once. If chooseRenderAudio is invoked again because the content of the page has changed, the initialization is skipped. See the following code:

AWS.config.region = env.REGION
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
            IdentityPoolId: env.IDP});
audioTracker.sdk.connection = {
   polly: new AWS.Polly({apiVersion: '2016-06-10'}),
   s3: new AWS.S3()
};

It generates MP3 audio from Amazon Polly. The generation is synchronous for small SSML inputs and asynchronous (with output sent to the S3 bucket) for large SSML inputs (greater than 6,000 characters). In the synchronous case, we ask Amazon Polly to provide the MP3 file using a presigned URL. When the synthesized output is ready, we set the src attribute of the audio control to that URL and load the control. We then request the marks file and load it the same way as in the static case. See the following code:

// create signed URL
const signer = new AWS.Polly.Presigner(pollyAudioInput, audioTracker.sdk.connection.polly);

// call Polly to get MP3 into signed URL
signer.getSynthesizeSpeechUrl(pollyAudioInput, function(error, url) {
  // Audio control uses signed URL
  audioTracker.audioControl.src =
    audioTracker.sdk.audio[audioTracker.voice];
  audioTracker.audioControl.load();

  // call Polly to get marks
  audioTracker.sdk.connection.polly.synthesizeSpeech(
    pollyMarksInput, function(markError, markData) {
    const marksStr = new
      TextDecoder().decode(markData.AudioStream);
    // load marks into page the same as with static
    doLoadMarks(marksStr);
  });
});

Config-driven case

In your browser, navigate to PRTPDynamic.html?dynOption=config. Play the audio. The audio playback is similar to the default case, but there are minor differences. In particular, some content is skipped.

Behind the scenes, when using the config option, the page extracts content differently than in the default case. In the default case, the page uses buildSSMLFromDefault. In the config-driven case, the page specifies the sections it wants to include and exclude:

const ssml = buildSSMLFromConfig({
	 "inclusions": [ 
	 	{"id": "title"}, 
	 	{"id": "main"}, 
	 	{"id": "maintable"}, 
	 	{"id": "phil-result"},
	 	{"id": "qbtable"}, 
	 ],
	 "exclusions": [
	 	{"id": "wrapup"}
	 ]
	});

The buildSSMLFromConfig function, defined in PRTP.js, walks the DOM tree in each of the sections whose ID is provided under inclusions. It extracts content from each and combines them together, in the order specified, to form an SSML document. It excludes the sections specified under exclusions. It extracts content from each section in the same way buildSSMLFromDefault extracts content from the page body.

Customization case

In your browser, navigate to PRTPDynamic.html?dynOption=custom. Play the audio. There are three noticeable differences. Let’s note these and consider the custom code that runs behind the scenes:

  • It reads the main tiles in NW, SW, NE, SE order. The custom code gets each of these cell blocks from maintable and adds them to the SSML in NW, SW, NE, SE order:
const nw = getElementByXpath("//*[@id='maintable']//tr[1]/td[1]");
const sw = getElementByXpath("//*[@id='maintable']//tr[2]/td[1]");
const ne = getElementByXpath("//*[@id='maintable']//tr[1]/td[2]");
const se = getElementByXpath("//*[@id='maintable']//tr[2]/td[2]");
[nw, sw, ne, se].forEach(dir => buildSSMLSection(dir, []));
  • “Tom Brady” is spoken loudly. The custom code puts “Tom Brady” text inside an SSML prosody tag:
if (cellText == "Tom Brady") {
   addSSMLMark(getXpathOfNode( node.childNodes[tdi]));
   startSSMLParagraph();
   startSSMLTag("prosody", {"volume": "x-loud"});
   addSSMLText(cellText);
   endSSMLTag();
   endSSMLParagraph();
}
  • It reads only the first three rows of the quarterback table. It reads the column headers for each row. Check the code in the GitHub repo to discover how this is implemented.

Clean up

To avoid incurring future charges, delete the CloudFormation stack.

Conclusion

In this post, we demonstrated a technical solution to a high-value business problem: how to use Amazon Polly to read the content of a webpage and highlight the content as it’s being read. We showed this using both static and dynamic pages. To extract content from the page, we used DOM traversal and XSLT. To facilitate highlighting, we used the speech marks capability in Amazon Polly.

Learn more about Amazon Polly by visiting its service page.

Feel free to ask questions in the comments.


About the authors

Mike Havey is a Solutions Architect for AWS with over 25 years of experience building enterprise applications. Mike is the author of two books and numerous articles. Visit his Amazon author page to read more.

Vineet Kachhawaha is a Solutions Architect at AWS with expertise in machine learning. He is responsible for helping customers architect scalable, secure, and cost-effective workloads on AWS.

Read More

Use Amazon SageMaker Data Wrangler for data preparation and Studio Labs to learn and experiment with ML

Amazon SageMaker Studio Lab is a free machine learning (ML) development environment based on open-source JupyterLab for anyone to learn and experiment with ML using AWS ML compute resources. It’s based on the same architecture and user interface as Amazon SageMaker Studio, but with a subset of Studio capabilities.

When you begin working on ML initiatives, you need to perform exploratory data analysis (EDA) or data preparation before proceeding with model building. Amazon SageMaker Data Wrangler is a capability of Amazon SageMaker that makes it faster for data scientists and engineers to prepare data for ML applications via a visual interface. Data Wrangler reduces the time it takes to aggregate and prepare data for ML from weeks to minutes.

A key accelerator of feature preparation in Data Wrangler is the Data Quality and Insights Report. This report checks data quality and helps detect abnormalities in your data, so that you can perform the required data engineering to fix your dataset. You can use the Data Quality and Insights Report to perform an analysis of your data to gain insights into your dataset such as the number of missing values and number of outliers. If you have issues with your data, such as target leakage or imbalance, the insights report can bring those issues to your attention and help you identify the data preparation steps you need to perform.

Studio Lab users can benefit from Data Wrangler because data quality and feature engineering are critical for the predictive performance of your model. Data Wrangler helps with data quality and feature engineering by giving insights into data quality issues and easily enabling rapid feature iteration and engineering using a low-code UI.

In this post, we show you how to perform exploratory data analysis, prepare and transform data using Data Wrangler, and export the transformed and prepared data to Studio Lab to carry out model building.

Solution overview

The solution includes the following high-level steps:

  1. Create an AWS account and an admin user (this is a prerequisite).
  2. Download the dataset churn.csv.
  3. Load the dataset to Amazon Simple Storage Service (Amazon S3).
  4. Create a SageMaker Studio domain and launch Data Wrangler.
  5. Import the dataset into the Data Wrangler flow from Amazon S3.
  6. Create the Data Quality and Insights Report and draw conclusions on necessary feature engineering.
  7. Perform the necessary data transforms in Data Wrangler.
  8. Download the Data Quality and Insights Report and the transformed dataset.
  9. Upload the data to a Studio Lab project for model training.

The following diagram illustrates this workflow.
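
If you prefer to script steps 2 and 3, the following is a minimal sketch that uploads the downloaded churn.csv to Amazon S3 with boto3; the bucket name and key are placeholders:

# Minimal sketch for step 3: upload churn.csv to S3 (bucket and key are placeholders).
import boto3

s3 = boto3.client("s3")
s3.upload_file("churn.csv", "my-data-wrangler-bucket", "datasets/churn.csv")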

Prerequisites

To use Data Wrangler and Studio Lab, you need the following prerequisites:

  • An AWS account with an admin user (see step 1 of the workflow)
  • An Amazon SageMaker Studio Lab account

Build a data preparation workflow with Data Wrangler

To get started, complete the following steps:

  1. Upload your dataset to Amazon S3.
  2. On the SageMaker console, under Control panel in the navigation pane, choose Studio.
  3. On the Launch app menu next to your user profile, choose Studio.

    After you successfully log in to Studio, you should see a development environment like the following screenshot.
  4. To create a new Data Wrangler workflow, on the File menu, choose New, then choose Data Wrangler Flow.

    The first step in Data Wrangler is to import your data. You can import data from multiple data sources, such as Amazon S3, Amazon Athena, Amazon Redshift, Snowflake, and Databricks. In this example, we use Amazon S3. If you just want to see how Data Wrangler works, you can always choose Use sample dataset.
  5. Choose Import data.
  6. Choose Amazon S3.
  7. Choose the dataset you uploaded and choose Import.

    Data Wrangler enables you to either import the entire dataset or sample a portion of it.
  8. To quickly get insights on the dataset, choose First K for Sampling and enter 50000 for Sample size.

Understand data quality and get insights

Let’s use the Data Quality and Insights Report to perform an analysis of the data that we imported into Data Wrangler. You can use the report to understand what steps you need to take to clean and process your data. This report provides information such as the number of missing values and the number of outliers. If you have issues with your data, such as target leakage or imbalance, the insights report can bring those issues to your attention.

  1. Choose the plus sign next to Data types and choose Get data insights.
  2. For Analysis type, choose Data Quality and Insights Report.
  3. For Target column, choose Churn?.
  4. For Problem type, select Classification.
  5. Choose Create.

You’re presented with a detailed report that you can review and download. The report includes several sections such as quick model, feature summary, feature correlation, and data insights. The following screenshots provide examples of these sections.

Observations from the report

From the report, we can make the following observations:

  • No duplicate rows were found.
  • The State column appears to be quite evenly distributed, so the data is balanced in terms of state population.
  • The Phone column has too many unique values to be of practical use, so we can drop it in our transformation.
  • Based on the feature correlation section of the report, each Mins feature is highly correlated with its corresponding Charge feature; we can remove one feature from each pair.

Transformation

Based on our observations, we want to make the following transformations:

  • Remove the Phone column because it has many unique values.
  • We also see several features that essentially have 100% correlation with one another. Including these feature pairs in some ML algorithms can create undesired problems, whereas in others it only introduces minor redundancy and bias. Let’s remove one feature from each of the highly correlated pairs: Day Charge from the pair with Day Mins, Eve Charge from the pair with Eve Mins, Night Charge from the pair with Night Mins, and Intl Charge from the pair with Intl Mins.
  • Convert True or False in the Churn? column to a numerical value of 1 or 0.

To apply these transforms in Data Wrangler, complete the following steps:

  1. Return to the data flow and choose the plus sign next to Data types.
  2. Choose Add transform.
  3. Choose Add step.
  4. Search for the transform you’re looking for (in our case, manage columns).
  5. Choose Manage columns.
  6. For Transform, choose Drop column.
  7. For Columns to drop, choose Phone, Day Charge, Eve Charge, Night Charge, and Intl Charge.
  8. Choose Preview, then choose Update.

    Let’s add another transform to perform a categorical encode on the Churn? column.
  9. Choose the transform Encode categorical.
  10. For Transform, choose Ordinal encode.
  11. For Input columns, choose the Churn? column.
  12. For Invalid handling strategy, choose Replace with NaN.
  13. Choose Preview, then choose Update.

Now True and False are converted to 1 and 0, respectively.
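
For reference, the following is a rough pandas equivalent of these transforms (Data Wrangler performs them without code; the column names follow the churn dataset):

# Rough pandas equivalent of the Data Wrangler steps above (illustrative only).
import pandas as pd

df = pd.read_csv("churn.csv")

# Drop Phone and one feature from each highly correlated Mins/Charge pair.
df = df.drop(columns=["Phone", "Day Charge", "Eve Charge", "Night Charge", "Intl Charge"])

# Ordinal-encode the target so False/True become 0/1.
df["Churn?"] = df["Churn?"].astype("category").cat.codes

print(df["Churn?"].value_counts())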

Now that we have a good understanding of the data and have prepared and transformed it, we can move the data to Studio Lab for model building.

Upload the data to Studio Lab

To start using the data in Studio Lab, complete the following steps:

  1. Choose Export data to export to an S3 bucket.
  2. For Amazon S3 location, enter your S3 path.
  3. Specify the file type.
  4. Choose Export data.
  5. After you export the data, you can download the data from the S3 bucket to your local computer.
  6. Go to Studio Lab and upload the file there.

    Alternatively, you can connect to Amazon S3 from Studio Lab. For more information, refer to Use external resources in Amazon SageMaker Studio Lab.
  7. Install the SageMaker SDK and import pandas.
  8. Import any other libraries you need.
  9. Read the CSV file into a DataFrame.
  10. Print the churn DataFrame to confirm the dataset loaded correctly (see the example following this list).
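
The following is a minimal example of those steps in a Studio Lab notebook; the file name is a placeholder for the CSV you exported from Data Wrangler:

# Minimal sketch of steps 7-10. Install the SDKs once per environment:
#   %pip install sagemaker pandas
import pandas as pd

# Read the CSV file exported from Data Wrangler (file name is a placeholder).
churn = pd.read_csv("churn_transformed.csv")

# Print the first rows to confirm the dataset loaded correctly.
print(churn.head())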

Now that you have the processed dataset in Studio Lab, you can carry out further steps required for model building.

Data Wrangler pricing

You can perform all the steps in this post for EDA or data preparation within Data Wrangler and pay standard instance, job, and storage pricing based on usage or consumption. No upfront or licensing fees are required.

Clean up

When you’re not using Data Wrangler, it’s important to shut down the instance on which it runs to avoid incurring additional fees. To avoid losing work, save your data flow before shutting Data Wrangler down.

  1. To save your data flow in Studio, choose File, then choose Save Data Wrangler Flow.
    Data Wrangler automatically saves your data flow every 60 seconds.
  2. To shut down the Data Wrangler instance, in Studio, choose Running Instances and Kernels.
  3. Under RUNNING APPS, choose the shutdown icon next to the sagemaker-data-wrangler-1.0 app.
  4. Choose Shut down all to confirm.

Data Wrangler runs on an ml.m5.4xlarge instance. This instance disappears from RUNNING INSTANCES when you shut down the Data Wrangler app.

After you shut down the Data Wrangler app, it has to restart the next time you open a Data Wrangler flow file. This can take a few minutes.

Conclusion

In this post, we saw how you can gain insights into your dataset, perform exploratory data analysis, prepare and transform data using Data Wrangler within Studio, and export the transformed and prepared data to Studio Lab and carry out model building and other steps.

With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization from a single visual interface.


About the authors

Rajakumar Sampathkumar is a Principal Technical Account Manager at AWS, providing customers guidance on business-technology alignment and supporting the reinvention of their cloud operation models and processes. He is passionate about the cloud and machine learning. Raj is also a machine learning specialist and works with AWS customers to design, deploy, and manage their AWS workloads and architectures.

Meenakshisundaram Thandavarayan is a Senior AI/ML specialist with a passion for designing, creating, and promoting human-centered data and analytics experiences. He supports AWS strategic customers on their transformation towards becoming data-driven organizations.

James Wu is a Senior AI/ML Specialist Solutions Architect at AWS, helping customers design and build AI/ML solutions. James’s work covers a wide range of ML use cases, with a primary interest in computer vision, deep learning, and scaling ML across the enterprise. Prior to joining AWS, James was an architect, developer, and technology leader for over 10 years, including 6 years in engineering and 4 years in the marketing and advertising industries.

Read More

Announcing Visual Conversation Builder for Amazon Lex

Amazon Lex is a service for building conversational interfaces using voice and text. Amazon Lex provides high-quality speech recognition and language understanding capabilities. With Amazon Lex, you can add sophisticated, natural language bots to new and existing applications. Amazon Lex reduces multi-platform development efforts, allowing you to easily publish your speech or text chatbots to mobile devices and multiple chat services, like Facebook Messenger, Slack, Kik, or Twilio SMS.

Today, we added the Visual Conversation Builder (VCB) to Amazon Lex: a drag-and-drop conversation builder that lets users define bot information by manipulating visual objects, which are used to design and edit conversation flows in a no-code environment. There are three main benefits of the VCB:

  • It’s easier to collaborate through a single pane of glass
  • It simplifies conversational design and testing
  • It reduces code complexity

In this post, we introduce the VCB, how to use it, and share customer success stories.

Overview of the Visual Conversation Builder

In addition to the already available menu-based editor and Amazon Lex APIs, the visual builder gives a single view of an entire conversation flow in one location, simplifying bot design and reducing dependency on development teams. Conversational designers, UX designers, and product managers—anyone with an interest in building a conversation on Amazon Lex—can utilize the builder.

Designers and developers can now collaborate and build conversations easily in the VCB without coding the business logic behind the conversation. The visual builder helps accelerate time to market for Amazon Lex-based solutions by providing better collaboration, easier iterations of the conversation design, and reduced code complexity.

With the visual builder, it’s now possible to quickly view the entire conversation flow of the intent at a glance and get visual feedback as changes are made. Changes to your design are instantly reflected in the view, and any effects on dependencies or branching logic are immediately apparent to the designer. You can use the visual builder to make any changes to the intent, such as adding utterances, slots, prompts, or responses. Each block type has its own settings that you can configure to tailor the flow of the conversation.

Previously, complex branching of conversations required implementation of AWS Lambda—a serverless, event-driven compute service—to achieve the desired pathing. The visual builder reduces the need for Lambda integrations, and designers can perform conversation branching without the need for Lambda code, as shown in the following example. This helps to decouple conversation design activities from Lambda business logic and integrations. You can still use the existing intent editor in conjunction with the visual builder, or switch between them at any time when creating and modifying intents.

The VCB is a no-code method of designing complex conversations. For example, you can now add a confirmation prompt in an intent and branch based on a Yes or No response to different paths in the flow without code. Where future Lambda business logic is needed, conversation designers can add placeholder blocks into the flow so developers know what needs to be addressed through code. Code hook blocks with no Lambda functions attached automatically take the Success pathway so testing of the flow can continue until the business logic is completed and implemented. In addition to branching, the visual builder offers designers the ability to go to another intent as part of the conversation flow.

Upon saving, the VCB automatically scans the build to detect any errors in the conversation flow. In addition, the VCB detects missing failure paths and can automatically add those paths into the flow, as shown in the following example.

Using the Visual Conversation Builder

You can access the VCB via the Amazon Lex console by going to a bot and editing or creating a new intent. On the intent page, you can now switch between the visual builder interface and the traditional intent editor, as shown in the following screenshot.

The visual builder displays existing intents graphically on the canvas in a visual layout. New intents start with a blank canvas: drag the components you want onto the canvas and connect them to create the conversation flow.

The visual builder has three main components: blocks, ports, and edges. Let’s get into how these are used in conjunction to create a conversation from beginning to end within an intent.

The basic building unit of a conversation flow is called a block. The top menu of the visual builder contains all the blocks you are able to use. To add a block to a conversation flow, drag it from the top menu onto the flow.

Each block has a specific functionality to handle different use cases of a conversation. The currently available block types are as follows:

  • Start – The root or first block of the conversation flow that can also be configured to send an initial response
  • Get slot value – Tries to elicit a value for a single slot
  • Condition – Can contain up to four custom branches (with conditions) and one default branch
  • Dialog code hook – Handles invocation of the dialog Lambda function and includes bot responses based on dialog Lambda functions succeeding, failing, or timing out
  • Confirmation – Queries the customer prior to fulfillment of the intent and includes bot responses based on the customer saying yes or no to the confirmation prompt
  • Fulfillment – Handles fulfillment of the intent and can be configured to invoke Lambda functions and respond with messages if fulfillment succeeds or fails
  • Closing response – Allows the bot to respond with a message before ending the conversation
  • Wait for user input – Captures input from the customer and switches to another intent based on the utterance
  • End conversation – Indicates the end of the conversation flow

Take the Order Flowers bot as an example. The OrderFlowers intent, when viewed in the visual builder, uses five blocks: Start, three different Get slot value blocks, and Confirmation.

Each block can contain one or more ports, which are used to connect one block to another. Blocks contain an input port and one or more output ports based on desired paths for states such as success, timeout, and error.

The connection between the output port of one block and the input port of another block is referred to as an edge.

In the OrderFlowers intent, when the conversation starts, the Start output port is connected to the Get slot value: FlowerType input port using an edge. Each Get slot value block is connected using ports and edges to create a sequence in the conversation flow, which ensures the intent has all the slot values it needs to put in the order.

Notice that currently there is no edge connected to the failure output port of these blocks, but the builder will automatically add these if you choose Save intent and then choose Confirm in the pop-up Auto add block and edges for failure paths. The visual builder then adds an End conversation block and a Go to intent block, connecting the failure and error output ports to Go to intent and connecting the Yes/No ports of the Confirmation block to End conversation.

After the builder adds the blocks and edges, the intent is saved and the conversation flow can be built and tested. Let’s add a Welcome intent to the bot using the visual builder. From the OrderFlowers intent visual builder, choose Back to intents list in the navigation pane. On the Intents page, choose Add intent followed by Add empty intent. In the Intent name field, enter Welcome and choose Add.

Switch to the Visual builder tab and you will see an empty intent, with only the Start block currently on the canvas. To start, add some utterances to this intent so that the bot will be able to direct users to the Welcome intent. Choose the edit button of the Start block and scroll down to Sample utterances. Add the following utterances to this intent and then close the block:

  • Can you help me?
  • Hi
  • Hello
  • I need help

Now let’s add a response for the bot to give when it hits this intent. Because the Welcome intent won’t be processing any logic, we can drag a Closing response block into the canvas to add this message. After you add the block, choose the edit icon on the block and enter the following response:

Hi! I am the Order Flowers Bot. How can I help you today?

The canvas should now have two blocks, but they aren’t connected to each other. We can connect the ports of these two blocks using an edge.

To connect the two ports, simply click and drag from the No response output port of the Start block to the input port of the Closing response block.

At this point, you can complete the conversation flow in two different ways:

  • First, you can manually add the End conversation block and connect it to the Closing response block.
  • Alternatively, choose Save intent and then choose Confirm to have the builder create this block and connection for you.

After the intent is saved, choose Build and wait for the build to complete, then choose Test.

The bot will now properly greet the customer if an utterance matches this newly created intent.
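
If you prefer to script the same setup, the following is a hedged boto3 sketch that creates a similar intent through the Amazon Lex V2 model-building API; the bot ID is a placeholder, and the closing response and build step would still follow:

# Hedged sketch: create a similar Welcome intent via the Lex V2 API.
# The botId is a placeholder; botVersion must be DRAFT for editing.
import boto3

lex = boto3.client("lexv2-models")

lex.create_intent(
    botId="YOUR_BOT_ID",
    botVersion="DRAFT",
    localeId="en_US",
    intentName="Welcome",
    sampleUtterances=[
        {"utterance": "Can you help me?"},
        {"utterance": "Hi"},
        {"utterance": "Hello"},
        {"utterance": "I need help"},
    ],
    # The closing response ("Hi! I am the Order Flowers Bot. ...") can be
    # attached with the intentClosingSetting parameter; the bot locale is
    # then built and tested as in the console walkthrough.
)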

Customer stories

NeuraFlash is an Advanced AWS Partner with over 40 collective years of experience in the voice and automation space. With a dedicated team of Conversational Experience Designers, Speech Scientists, and AWS developers, NeuraFlash helps customers take advantage of the power of Amazon Lex in their contact centers.

“One of our key focus areas is helping customers leverage AI capabilities for developing conversational interfaces. These interfaces often require specialized bot configuration skills to build effective flows. With the Visual Conversation Builder, our designers can quickly and easily build conversational interfaces, allowing them to experiment at a faster rate and deliver quality products for our customers without requiring developer skills. The drag-and-drop UI and the visual conversation flow is a game-changer for reinventing the contact center experience.”

The SmartBots ML-powered platform lies at the core of the design, prototyping, testing, validation, and deployment of AI-driven chatbots. This platform supports the development of custom enterprise bots that can easily integrate with any application—even an enterprise’s custom application ecosystem.

“The Visual Conversation Builder’s easy-to-use drag-and-drop interface enables us to easily onboard Amazon Lex, and build complex conversational experiences for our customers’ contact centers. With this new functionality, we can improve Interactive Voice Response (IVR) systems faster and with minimal effort. Implementing new technology can be difficult with a steep learning curve, but we found that the drag-and-drop features were easy to understand, allowing us to realize value immediately.”

Conclusion

The Visual Conversation Builder for Amazon Lex is now generally available, for free, in all AWS Regions where Amazon Lex V2 operates.

Additionally, on August 17, 2022, Amazon Lex V2 released a change to the way conversations are managed with the user. This change gives you more control over the path that the user takes through the conversation. For more information, see Understanding conversation flow management. Note that bots created before August 17, 2022, do not support the VCB for creating conversation flows.

To learn more, see Amazon Lex FAQs and the Amazon Lex V2 Developer Guide. Please send feedback to AWS re:Post for Amazon Lex or through your usual AWS support contacts.


About the authors

Thomas Rindfuss is a Sr. Solutions Architect on the Amazon Lex team. He invents, develops, prototypes, and evangelizes new technical features and solutions for Language AI services that improve the customer experience and ease adoption.

Austin Johnson is a Solutions Architect at AWS, helping customers on their cloud journey. He is passionate about building and utilizing conversational AI platforms to add sophisticated, natural language interfaces to applications.

Read More