Customize business rules for intelligent document processing with human review and BI visualization

A massive amount of business documents are processed daily across industries. Many of these documents are paper-based, scanned into your system as images, or in an unstructured format like PDF. Each company may apply unique rules associated with its business background while processing these documents. How to extract information accurately and process them flexibly is a challenge many companies face.

Amazon Intelligent Document Processing (IDP) allows you to take advantage of industry-leading machine learning (ML) technology without previous ML experience. This post introduces a solution included in the Amazon IDP workshop showcasing how to process documents to serve flexible business rules using Amazon AI services. You can use the following step-by-step Jupyter notebook to complete the lab.

Amazon Textract helps you easily extract text from various documents, and Amazon Augmented AI (Amazon A2I) allows you to implement a human review of ML predictions. The default Amazon A2I template allows you to build a human review pipeline based on rules, such as when the extraction confidence score is lower than a pre-defined threshold or required keys are missing. But in a production environment, you need the document processing pipeline to support flexible business rules, such as validating the string format, verifying the data type and range, and validating fields across documents. This post shows how you can use Amazon Textract and Amazon A2I to customize a generic document processing pipeline supporting flexible business rules.

Solution overview

For our sample solution, we use the Tax Form 990, a US IRS (Internal Revenue Service) form that provides the public with financial information about a non-profit organization. For this example, we only cover the extraction logic for some of the fields on the first page of the form. You can find more sample documents on the IRS website.

The following diagram illustrates the IDP pipeline that supports customized business rules with human review.IDP HITM Overview

The architecture is composed of three logical stages:

  • Extraction – Extract data from the 990 Tax Form (we use page 1 as an example).

    • Retrieve a sample image stored in an Amazon Simple Storage Service (Amazon S3) bucket.
    • Call the Amazon Textract analyze_document API using the Queries feature to extract text from the page.
  • Validation – Apply flexible business rules with a human-in-the-loop review.

    • Validate the extracted data against business rules, such as validating the length of an ID field.
    • Send the document to Amazon A2I for a human to review if any business rules fail.
    • Reviewers use the Amazon A2I UI (a customizable website) to verify the extraction result.
  • BI visualization – We use Amazon QuickSight to build a business intelligence (BI) dashboard showing the process insights.

Customize business rules

You can define a generic business rule in the following JSON format. In the sample code, we define three rules:

  • The first rule is for the employer ID field. The rule fails if the Amazon Textract confidence score is lower than 99%. For this post, we set the confidence score threshold high, which will break by design. You could adjust the threshold to a more reasonable value to reduce unnecessary human effort in a real-world environment, such as 90%.
  • The second rule is for the DLN field (the unique identifier of the tax form), which is required for the downstream processing logic. This rule fails if the DLN field is missing or has an empty value.
  • The third rule is also for the DLN field but with a different condition type: LengthCheck. The rule breaks if the DLN length is not 16 characters.

The following code shows our business rules in JSON format:

rules = [
    {
        "description": "Employee Id confidence score should greater than 99",
        "field_name": "d.employer_id",
        "field_name_regex": None, # support Regex: "_confidence$",
        "condition_category": "Confidence",
        "condition_type": "ConfidenceThreshold",
        "condition_setting": "99",
    },
    {
        "description": "dln is required",
        "field_name": "dln",
        "condition_category": "Required",
        "condition_type": "Required",
        "condition_setting": None,
    },
    {
        "description": "dln length should be 16",
        "field_name": "dln",
        "condition_category": "LengthCheck",
        "condition_type": "ValueRegex",
        "condition_setting": "^[0-9a-zA-Z]{16}$",
    }
]

You can expand the solution by adding more business rules following the same structure.

Extract text using an Amazon Textract query

In the sample solution, we call the Amazon Textract analyze_document API query feature to extract fields by asking specific questions. You don’t need to know the structure of the data in the document (table, form, implied field, nested data) or worry about variations across document versions and formats. Queries use a combination of visual, spatial, and language cues to extract the information you seek with high accuracy.

To extract value for the DLN field, you can send a request with questions in natural languages, such as “What is the DLN?” Amazon Textract returns the text, confidence, and other metadata if it finds corresponding information on the image or document. The following is an example of an Amazon Textract query request:

textract.analyze_document(
        Document={'S3Object': {'Bucket': data_bucket, 'Name': s3_key}},
        FeatureTypes=["QUERIES"],
        QueriesConfig={
                'Queries': [
                    {
                        'Text': 'What is the DLN?',
                       'Alias': 'The DLN number - unique identifier of the form'
                    }
               ]
        }
)

Define the data model

The sample solution constructs the data in a structured format to serve the generic business rule evaluation. To keep extracted values, you can define a data model for each document page. The following image shows how the text on page 1 maps to the JSON fields.Custom data model

Each field represents a document’s text, check box, or table/form cell on the page. The JSON object looks like the following code:

{
    "dln": {
        "value": "93493319020929",
        "confidence": 0.9765, 
        "block": {} 
    },
    "omb_no": {
        "value": "1545-0047",
        "confidence": 0.9435,
        "block": {}
    },
    ...
}

You can find the detailed JSON structure definition in the GitHub repo.

Evaluate the data against business rules

The sample solution comes with a Condition class—a generic rules engine that takes the extracted data (as defined in the data model) and the rules (as defined in the customized business rules). It returns two lists with failed and satisfied conditions. We can use the result to decide if we should send the document to Amazon A2I for human review.

The Condition class source code is in the sample GitHub repo. It supports basic validation logic, such as validating a string’s length, value range, and confidence score threshold. You can modify the code to support more condition types and complex validation logic.

Create a customized Amazon A2I web UI

Amazon A2I allows you to customize the reviewer’s web UI by defining a worker task template. The template is a static webpage in HTML and JavaScript. You can pass data to the customized reviewer page using the Liquid syntax.

In the sample solution, the custom Amazon A2I UI template displays the page on the left and the failure conditions on the right. Reviewers can use it to correct the extraction value and add their comments.

The following screenshot shows our customized Amazon A2I UI. It shows the original image document on the left and the following failed conditions on the right:

  • The DLN numbers should be 16 characters long. The actual DLN has 15 characters.
  • The confidence score of employer_id is lower than 99%. The actual confidence score is around 98%.

The reviewers can manually verify these results and add comments in the CHANGE REASON text boxes.Customized A2I review UI

For more information about integrating Amazon A2I into any custom ML workflow, refer to over 60 pre-built worker templates on the GitHub repo and Use Amazon Augmented AI with Custom Task Types.

Process the Amazon A2I output

After the reviewer using the Amazon A2I customized UI verifies the result and chooses Submit, Amazon A2I stores a JSON file in the S3 bucket folder. The JSON file includes the following information on the root level:

  • The Amazon A2I flow definition ARN and human loop name
  • Human answers (the reviewer’s input collected by the customized Amazon A2I UI)
  • Input content (the original data sent to Amazon A2I when starting the human loop task)

The following is a sample JSON generated by Amazon A2I:

{
  "flowDefinitionArn": "arn:aws:sagemaker:us-east-1:711334203977:flow-definition/a2i-custom-ui-demo-workflow",
  "humanAnswers": [
    {
      "acceptanceTime": "2022-08-23T15:23:53.488Z",
      "answerContent": {
        "Change Reason 1": "Missing X at the end.",
        "True Value 1": "93493319020929X",
        "True Value 2": "04-3018996"
      },
      "submissionTime": "2022-08-23T15:24:47.991Z",
      "timeSpentInSeconds": 54.503,
      "workerId": "94de99f1bc6324b8",
      "workerMetadata": {
        "identityData": {
          "identityProviderType": "Cognito",
          "issuer": "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_URd6f6sie",
          "sub": "cef8d484-c640-44ea-8369-570cdc132d2d"
        }
      }
    }
  ],
  "humanLoopName": "custom-loop-9b4e67ff-2c9f-40f9-aae5-0e26316c905c",
  "inputContent": {...} # the original input send to A2I when starting the human review task
}

You can implement extract, transform, and load (ETL) logic to parse information from the Amazon A2I output JSON and store it in a file or database. The sample solution comes with a CSV file with processed data. You can use it to build a BI dashboard by following the instructions in the next section.

Create a dashboard in Amazon QuickSight

The sample solution includes a reporting stage with a visualization dashboard served by Amazon QuickSight. The BI dashboard shows key metrics such as the number of documents processed automatically or manually, the most popular fields that required human review, and other insights. This dashboard can help you get an oversight of the document processing pipeline and analyze the common reasons causing human review. You can optimize the workflow by further reducing human input.

The sample dashboard includes basic metrics. You can expand the solution using Amazon QuickSight to show more insights into the data.BI dashboard

Expand the solution to support more documents and business rules

To expand the solution to support more document pages with corresponding business rules, you need to make the following changes:

  • Create a data model for the new page in JSON structure representing all the values you want to extract out of the pages. Refer to the Define the data model section for a detailed format.
  • Use Amazon Textract to extract text out of the document and populate values to the data model.
  • Add business rules corresponding to the page in JSON format. Refer to the Customize business rules section for the detailed format.

The custom Amazon A2I UI in the solution is generic, which doesn’t require a change to support new business rules.

Conclusion

Intelligent document processing is in high demand, and companies need a customized pipeline to support their unique business logic. Amazon A2I also offers a built-in template integrated with Amazon Textract to implement your human review use cases. It also allows you to customize the reviewer page to serve flexible requirements.

This post guided you through a reference solution using Amazon Textract and Amazon A2I to build an IDP pipeline that supports flexible business rules. You can try it out using the Jupyter notebook in the GitHub IDP workshop repo.


About the authors

Lana Zhang is a Sr. Solutions Architect at the AWS WWSO AI Services team with expertise in AI and ML for intelligent document processing and content moderation. She is passionate about promoting AWS AI services and helping customers transform their business solutions.


Sonali Sahu is leading Intelligent Document Processing AI/ML Solutions Architect team at Amazon Web Services. She is a passionate technophile and enjoys working with customers to solve complex problems using innovation. Her core area of focus are Artificial Intelligence & Machine Learning for Intelligent Document Processing.

Read More

Automate classification of IT service requests with an Amazon Comprehend custom classifier

Enterprises often deal with large volumes of IT service requests. Traditionally, the burden is put on the requester to choose the correct category for every issue. A manual error or misclassification of a ticket usually means a delay in resolving the IT service request. This can result in reduced productivity, a decrease in customer satisfaction, an impact to service level agreements (SLAs), and broader operational impacts. As your enterprise grows, the problem of getting the right service request to the right team becomes even more important. Using an approach based on machine learning (ML) and artificial intelligence can help with your enterprise’s ever-evolving needs.

Supervised ML is a process that uses labeled datasets and outputs to train learning algorithms on how to classify data or predict an outcome. Amazon Comprehend is a natural language processing (NLP) service that uses ML to uncover valuable insights and connections in text. It provides APIs powered by ML to extract key phrases, entities, sentiment analysis, and more.

In this post, we show you how to implement a supervised ML model that can help classify IT service requests automatically using Amazon Comprehend custom classification. Amazon Comprehend custom classification helps you customize Amazon Comprehend for your specific requirements without the skillset required to build ML-based NLP solutions. With automatic ML, or AutoML, Amazon Comprehend custom classification builds customized NLP models on your behalf, using the training data that you provide.

Overview of solution

To illustrate the IT service request classification, this solution uses the SEOSS dataset. This dataset is a systematically retrieved dataset consisting of 33 open-source software projects that contains a large number of typed artifacts and trace links between them. This solution uses the issue data from these 33 open-source projects, summaries, and descriptions as reported by end-users to build a custom classifier model using Amazon Comprehend.

This post demonstrates how to implement and deploy the solution using the AWS Cloud Development Kit (AWS CDK) in an isolated Amazon Virtual Private Cloud (Amazon VPC) environment consisting of only private subnets. We also use the code to demonstrate how you can use the AWS CDK provider framework, a mini-framework for implementing a provider for AWS CloudFormation custom resources to create, update, or delete a custom resource, such as an Amazon Comprehend endpoint. The Amazon Comprehend endpoint includes managed resources that make your custom model available for real-time inference to a client machine or third-party applications. The code for this solution is available on Github.

You use the AWS CDK to deploy the infrastructure, application code, and configuration for the solution. You also need an AWS account and the ability to create AWS resources. You use the AWS CDK to create AWS resources such as a VPC with private subnets, Amazon VPC endpoints, Amazon Elastic File System (Amazon EFS), an Amazon Simple Notification Service (Amazon SNS) topic, an Amazon Simple Storage Service (Amazon S3) bucket, Amazon S3 event notifications, and AWS Lambda functions. Collectively, these AWS resources constitute the training stack, which you use to build and train the custom classifier model.

After you create these AWS resources, you download the SEOSS dataset and upload the dataset to the S3 bucket created by the solution. If you’re deploying this solution in AWS Region us-east-2, the format of the S3 bucket name is comprehendcustom-<AWS-ACCOUNT-NUMBER>-us-east-2-s3stack. The solution uses the Amazon S3 multi-part upload trigger to invoke a Lambda function that starts the pre-processing of the input data, and uses the preprocessed data to train the Amazon Comprehend custom classifier to create the custom classifier model. You then use the Amazon Resource Name (ARN) of the custom classifier model to create the inference stack, which creates an Amazon Comprehend endpoint using the AWS CDK provider framework, which you can then use for inferences from a third-party application or client machine.

The following diagram illustrates the architecture of the training stack.

Training stack architecture

The workflow steps are as follows:

  1. Upload the SEOSS dataset to the S3 bucket created as part of the training stack deployment process. This creates an event trigger that invokes the etl_lambda function.
  2. The etl_lambda function downloads the raw data set from Amazon S3 to Amazon EFS.
  3. The etl_lambda function performs the data preprocessing task of the SEOSS dataset.
  4. When the function execution completes, it uploads the transformed data with prepped_data prefix to the S3 bucket.
  5. After the upload of the transformed data is complete, a successful ETL completion message is send to Amazon SNS.
  6. In Amazon Comprehend, you can classify your documents using two modes: multi-class or multi-label. Multi-class mode identifies one and only one class for each document, and multi-label mode identifies one or more labels for each document. Because we want to identify a single class to each document, we train the custom classifier model in multi-class mode. Amazon SNS triggers the train_classifier_lambda function, which initiates the Amazon Comprehend classifier training in a multi-class mode.
  7. The train_classifier_lambda function initiates the Amazon Comprehend custom classifier training.
  8. Amazon Comprehend downloads the transformed data from the prepped_data prefix in Amazon S3 to train the custom classifier model.
  9. When the model training is complete, Amazon Comprehend uploads the model.tar.gz file to the output_data prefix of the S3 bucket. The average completion time to train this custom classifier model is approximately 10 hours.
  10. The Amazon S3 upload trigger invokes the extract_comprehend_model_name_lambda function, which retrieves the custom classifier model ARN.
  11. The function extracts the custom classifier model ARN from the S3 event payload and the response of list-document-classifiers call.
  12. The function sends the custom classifier model ARN to the email address that you had subscribed earlier as part of the training stack creation process. You then use this ARN to deploy the inference stack.

This deployment creates the inference stack, as shown in the following figure. The inference stack provides you with a REST API secured by an AWS Identity and Access Management (IAM) authorizer, which you can then use to generate confidence scores of the labels based on the input text supplied from a third-party application or client machine.

Inference stack architecture

Prerequisites

For this demo, you should have the following prerequisites:

  • An AWS account.
  • Python 3.7 or later, Node.js, and Git in the development machine. The AWS CDK uses specific versions of Node.js (>=10.13.0, except for version 13.0.0 – 13.6.0). A version in active long-term support (LTS) is recommended.
    To install the active LTS version of Node.js, you can use the following install script for nvm and use nvm to install the Node.js LTS version. You can also install the current active LTS Node.js via package manager depending on the operating system of your choice.

    For macOS, you can install the Node.js via package manager using the following instructions.

    For Windows, you can install the Node.js via package manager using the following instructions.

  • AWS CDK v2 is pre-installed if you’re using an AWS Cloud9 IDE. If you’re using AWS Cloud9 IDE, you can skip this step.If you don’t have the AWS CDK installed in the development machine, install AWS CDK v2 globally using the Node Package Manager command npm install -g aws-cdk. This step requires Node.js to be installed in the development machine.
  • Configure your AWS credentials to access and create AWS resources using the AWS CDK. For instructions, refer to Specifying credentials and region.
  • Download the SEOSS dataset consisting of requirements, bug reports, code history, and trace links of 33 open-source software projects. Save the file dataverse_files.zip on your local machine.

SEOSS dataset

Deploy the AWS CDK training stack

For AWS CDK deployment, we start with the training stack. Complete the following steps:

  1. Clone the GitHub repository:
$ git clone https://github.com/aws-samples/amazon-comprehend-custom-automate-classification-it-service-request.git
  1. Navigate to the amazon-comprehend-custom-automate-classification-it-service-request folder:
$ cd amazon-comprehend-custom-automate-classification-it-service-request/

All the following commands are run within the amazon-comprehend-custom-automate-classification-it-service-request directory.

  1. In the amazon-comprehend-custom-automate-classification-it-service-request directory, initialize the Python virtual environment and install requirements.txt with pip:
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt
  1. If you’re using the AWS CDK in a specific AWS account and Region for the first time, see the instructions for bootstrapping your AWS CDK environment:
$ cdk bootstrap aws://<AWS-ACCOUNT-NUMBER>/<AWS-REGION>
  1. Synthesize the CloudFormation templates for this solution using cdk synth and use cdk deploy to create the AWS resources mentioned earlier:
$ cdk synth
$ cdk deploy VPCStack EFSStack S3Stack SNSStack ExtractLoadTransformEndPointCreateStack --parameters SNSStack:emailaddressarnnotification=<emailaddress@example.com>

After you enter cdk deploy, the AWS CDK prompts whether you want to deploy changes for each of the stacks called out in the cdk deploy command.

  1. Enter y for each of the stack creation prompts, then the cdk deploy step creates these stacks. Subscribe the email address provide by you to the SNS topic created as part of the cdk deploy.
  2. After cdk deploy completes successfully, create a folder called raw_data in the S3 bucket comprehendcustom-<AWS-ACCOUNT-NUMBER>-<AWS-REGION>-s3stack.
  3. Upload the SEOSS dataset dataverse_files.zip that you downloaded earlier to this folder.

After the upload is complete, the solution invokes the etl_lambda function using an Amazon S3 event trigger to start the extract, transform, and load (ETL) process. After the ETL process completes successfully, a message is sent to the SNS topic, which invokes the train_classifier_lambda function. This function triggers an Amazon Comprehend custom classifier model training. Depending on whether you train your model on the complete SEOSS dataset, training could take up to 10 hours. When the training process is complete, Amazon Comprehend uploads the model.tar.gz file to the output_data prefix in the S3 bucket.

This upload triggers the extract_comprehend_model_name_lambda function using a S3 event trigger that extracts the custom classifier model ARN and sends it to the email address you had subscribed earlier. This custom classifier model ARN is then used to create the inference stack. When the model training is complete, you can view the performance metrics of the custom classifier model by navigating to the version details section in the Amazon Comprehend console (see the following screenshot), or by using the Amazon Comprehend Boto3 SDK.

Perfomance metrics

Deploy the AWS CDK inference stack

Now you’re ready to deploy the inference stack.

  1. Copy the custom classifier model ARN from the email you received and use the following cdk deploy command to create the inference stack.

This command deploys an API Gateway REST API secured by an IAM authorizer, which you use for inference with an AWS user ID or IAM role that just has the execute-api:Invoke IAM privilege. The following cdk deploy command deploys the inference stack. This stack uses the AWS CDK provider framework to create the Amazon Comprehend endpoint as a custom resource, so that creating, deleting, and updating of the Amazon Comprehend endpoint can be done as part of the inference stack lifecycle using the cdk deploy and cdk destroy commands.

Because you need to run the following command after model training is complete, which could take up to 10 hours, ensure that you’re in the Python virtual environment that you initialized in an earlier step and in the amazon-comprehend-custom-automate-classification-it-service-request directory:

$ cdk deploy APIGWInferenceStack --parameters APIGWInferenceStack:documentclassifierarn=<custom classifier model ARN retrieved from email>

For example:

$ cdk deploy APIGWInferenceStack --parameters APIGWInferenceStack:documentclassifierarn=arn:aws:comprehend:us-east-2:111122223333:document-classifier/ComprehendCustomClassifier-11111111-2222-3333-4444-abc5d67e891f/version/v1
  1. After the cdk deploy command completes successfully, copy the APIGWInferenceStack.ComprehendCustomClassfierInvokeAPI value from the console output, and use this REST API to generate inferences from a client machine or a third-party application that has execute-api:Invoke IAM privilege. If you’re running this solution in us-east-2, the format of this REST API is https://<restapi-id>.execute-api.us-east-2.amazonaws.com/prod/invokecomprehendV1.

Alternatively, you can use the test client apiclientinvoke.py from the GitHub repository to send a request to the custom classifier model. Before using the apiclientinvoke.py, ensure that the following prerequisites are in place:

  • You have the boto3 and requests Python package installed using pip on the client machine.
  • You have configured Boto3 credentials. By default, the test client assumes that a profile named default is present and it has the execute-api:Invoke IAM privilege on the REST API.
  • SigV4Auth points to the Region where the REST API is deployed. Update the <AWS-REGION> value to us-east-2 in apiclientinvoke.py if your REST API is deployed in us-east-2.
  • You have assigned the raw_data variable with the text on which you want to make the class prediction or the classification request:
raw_data="""Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis."""
  • You have assigned the restapi variable with the REST API copied earlier:

restapi="https://<restapi-id>.execute-api.us-east-2.amazonaws.com/prod/invokecomprehendV1"

  1. Run the apiclientinvoke.py after the preceding updates:
$ python3 apiclientinvoke.py

You get the following response from the custom classifier model:

{
 "statusCode": 200,
 "body": [
	{
	 "Name": "SPARK",
	 "Score": 0.9999773502349854
	},
	{
	 "Name": "HIVE",
	 "Score": 1.1613215974648483e-05
	},
	{
	 "Name": "DROOLS",
	 "Score": 1.1110682862636168e-06
	}
   ]
}

Amazon Comprehend returns confidence scores for each label that it has attributed correctly. If the service is highly confident about a label, the score will be closer to 1. Therefore, for the Amazon Comprehend custom classifier model that was trained using the SEOSS dataset, the custom classifier model predicts that the text belongs to class SPARK. This classification returned by the Amazon Comprehend custom classifier model can then be used to classify the IT service requests or predict the correct category of the IT service requests, thereby reducing manual errors or misclassification of service requests.

Clean up

To clean up all the resources created in this post that were created as part of the training stack and inference stack, use the following command. This command deletes all the AWS resources created as part of the previous cdk deploy commands:

$ cdk destroy --all

Conclusion

In this post, we showed you how enterprises can implement a supervised ML model using Amazon Comprehend custom classification to predict the category of IT service requests based on either the subject or the description of the request submitted by the end-user. After you build and train a custom classifier model, you can run real-time analysis for custom classification by creating an endpoint. After you deploy this model to an Amazon Comprehend endpoint, it can be used to run real-time inference by third-party applications or other client machines, including IT service management tools. You can then use this inference to predict the defect category and reduce manual errors or misclassifications of tickets. This helps reduce delays for ticket resolution and increases resolution accuracy and customer productivity, which ultimately results in increased customer satisfaction.

You can extend the concepts in this post to other use cases, such as routing business or IT tickets to various internal teams such as business departments, customer service agents, and Tier 2/3 IT support, created either by end-users or through automated means.

References

  • Rath, Michael; Mäder, Patrick, 2019, “The SEOSS Dataset – Requirements, Bug Reports, Code History, and Trace Links for Entire Projects”, https://doi.org/10.7910/DVN/PDDZ4Q, Harvard Dataverse, V1

About the Authors

Arnab Chakraborty is a Sr. Solutions Architect at AWS based out of Cincinnati, Ohio. He is passionate about topics in Enterprise & Solution architecture, Data analytics, Serverless and Machine Learning. In his spare time, he enjoys watching movies , travel shows and sports.

Viral Desai is a Principal Solutions Architect at AWS. With more than 25 years of experiences in information technology, he has been helping customers adopt AWS and modernize their architectures. He likes hiking, and enjoys diving deep with customers on all things AWS.

Read More

Detect fraud in mobile-oriented businesses using GrabDefence device intelligence and Amazon Fraud Detector

In this post, we present a solution that combines rich mobile device intelligence with customized machine learning (ML) modeling to help you catch fraudsters who exploit mobile apps.

GrabDefence (GD), Grab’s proprietary fraud detection and prevention technology, and AWS have launched GDxAFD, a fraud detection solution tailored for mobile apps that integrates GD’s device intelligence capabilities with Amazon Fraud Detector, AWS’s fully managed ML fraud detection solution. With GDxAFD, you can take advantage of more than 20 years of fraud detection expertise from Amazon as well as extensive mobile fraud experience from Southeast Asia’s leading superapp to safeguard your mobile application from fraudsters.

This solution rides on a larger global wave of anti-fraud efforts, which experts forecast to grow to USD $62.70 billion by 2028. With the rise of the digital economy, fraud syndicates increasingly target online businesses, causing financial loss and destroying the trust between end-users and the platform. The true cost to battle fraud is also increasing rapidly as more fraud checks leads to poorer customer experience, false positives, as well as operational burden, which as a whole is estimated to be three times larger than the actual fraud losses from the True Cost of FraudTM APAC Study by LexisNexis® Risk Solutions.

From the combined industry experience, the solution team believes that many of the modus operandi in a mobile environment is driven by fraudsters having tools and methods to create fake accounts at scale and bypass a platform’s security checks on the device, thereby enabling them to exploit the platform for large returns. Therefore, preventing mobile fraud starts from clearly understanding the risk profile of the devices used to access the mobile app and then using the device risk intelligence gathered, together with additional data about the user, event, or account, to detect potential fraudulent behavior in real time and at scale. By combining rich device intelligence and ML, companies are better positioned to stay ahead of mobile-focused fraud syndicates, and reduce fraud on their platforms.

GD device intelligence

GD is a product from Grab’s fraud prevention team, which has years of experience building solutions for Grab. Grab is a NASDAQ listed company and a leading superapp in South East Asia, with over 30 million monthly transacting users (as per Grab’s Q1 2022 Results). Due to the scale of its operations as a leading superapp in SEA and the nature of a mobile-first business, Grab has been investing heavily in building fraud prevention solutions enabled by rich data, technology focus, and insights gathered from its operational experience and exposure. GD’s device intelligence service collects rich device-level data, excluding any personally identifiable information (PII), from mobile application users and securely analyzes it to understand the risk profile of the device. Learning from a large device network built via Grab’s superapp, GD’s device intelligence service can accurately generate device fingerprints and detect risky attributes such as device or app modification or tampering, emulator usage, and GPS spoofing. As mentioned earlier, many fraud modus operandi on mobile platforms involve mass creation of fake accounts, device reengineering, and location spoofing, which GD device intelligence is capable of detecting. As a result, by integrating GD device intelligence and Amazon Fraud Detector, platforms that face similar fraud attacks can expect up to a 23% increase in fraud detection based on statistical studies done by GrabDefence on Grab’s fraud prevention systems.

Custom fraud detection ML models in Amazon Fraud Detector

Amazon Fraud Detector customizes each model it creates to your own dataset, making the accuracy of models higher than one-size-fits-all ML solutions. During the fully automated model training process, a series of models that have learned patterns of fraud from AWS and Amazon’s own fraud expertise are used to boost your model performance even further.

With the GDxAFD solution, you now have step-by-step guidance and a reference architecture for how to use flexible event schemas in Amazon Fraud Detector to add GD device intelligence findings into your custom fraud detector models. The end result is an ML model that, once trained, has the benefit of learning from multiple data sources, including your own historical data, GD’s device intelligence data, fraud patterns seen across Amazon, and additional third-party data (added automatically by Amazon Fraud Detector). Based on our pilot between GD and Amazon Fraud Detector, our model using GD device intelligence has shown a 23% increase in detection performance for detecting fake account registrations. You can deploy these models to detect mobile fraud to prevent not only fake account registration but also fraudulent payments, promotion abuse, or loyalty program abuse, among others.

To get started, you first integrate GD’s mobile SDK into your mobile application to collect device-level data. Next, you use Amazon Fraud Detector to define the event you want to evaluate for fraud by specifying the event and account data points you have available for the event or account, including the device risk intelligence data points from GD. After this, you train your ML model in Amazon Fraud Detector in just a few steps. After you train the model, you can add it to a detector.

To begin performing real-time predictions, you integrate Amazon Fraud Detector’s low-latency prediction API into your application and begin sending new mobile events to generate fraud predictions. Each fraud prediction considers the GD device intelligence data for the device associated with the event as well as additional data and intelligence automatically added by Amazon Fraud Detector, including signals from fraud patterns experienced across Amazon.

Solution overview

Device intelligence is a critical type of input for risk decisions. One of the common challenges faced in fraud detection in the mobile space is the lack of enriched data availability to make risk decisions. On the other hand, mobile devices are typically the most expensive asset the fraudsters and fraud syndicates possess and, therefore, a significant level of effort is put into masking the true identity and profile of the device being used. Understanding the risk profile of the mobile device (which sometimes isn’t even a real device) and being able to drive insights from the relationship between different mobile devices can significantly improve risk decisions for any mobile business, and becomes central to any mobile-based fraud management strategy.

For generating real-time fraud predictions, the GDxAFD solution uses Amazon Fraud Detector and GrabDefence’s device intelligence SDK, along with Amazon API Gateway and AWS Lambda. You can provision the AWS portions of the solution using AWS CloudFormation.

The following diagram illustrates our solution architecture.

The workflow consists of the following steps:

  1. When an end-user interacts with your mobile app, GD’s mobile SDK passively gathers device data and streams this data to GD’s device intelligence service, where a risk profile for the device is generated.
  2. Then, when that user transacts using the mobile app and you want to assess fraud risk in real time, the mobile app sends the transaction data gathered by the app via API Gateway to a Lambda function.
  3. The Lambda function gathers the GrabDefence risk profile for the device used during the transaction, combines that profile data with the other transaction data, and sends it to the fraud detector.
  4. The fraud detector performs a fraud prediction using your custom fraud detection ML model and ruleset, and returns a risk score and outcome to the Lambda function. This result is sent back to your mobile app via API Gateway.
  5. If desired, the mobile app can then choose to adjust the end-user experience accordingly based on this risk assessment.

Use cases for device intelligence with Amazon Fraud Detector

The ideal end-state solution is an Amazon Fraud Detector model that is trained on a dataset of your historical events and their associated historical GD device intelligence data. To achieve this, you need to integrate the GD Guardian SDK for mobile devices and then gather device intelligence data for your events until you have enough to train a model (for example,10,000 events with at least 400 examples of fraud events). Depending on your use case and availability of fraud labels, you have a couple of ways to get started sooner as you gather data for this solution:

  • Use case A: Use GD device intelligence data directly in the fraud detector rules – With this use case, you create a detector in Amazon Fraud Detector with a ruleset designed to flag high-risk events provided by the device intelligence. This works effectively when you have clear risk mitigation policies that you want to deploy for your platform. (for example, act on the user if the device is jailbroken, or don’t allow redemption of a promo if the device has more than five accounts) In such cases, you can set up your detector rules to flag events based on a combination of GD device risk score and GD device verdicts. This option requires no historical event data or labels to get started, so it can be ready to use sooner than the ML-based detection options.
  • Use case B: Use GD device intelligence and an Amazon Fraud Detector ML model with the fraud detector rules – If you have a historical event dataset and are able to train an Amazon Fraud Detector ML model immediately, you can build on use case A by adding an Amazon Fraud Detector model to your rules-based detector. This way, your detector logic is evaluating device intelligence with rules and all other event data with a customized ML model. This allows you to solve for more complex fraud tactics where statistical methods are required to separate fraud from non-fraud.

Best results are often achieved when both of these scenarios work in tandem, because they can serve different use cases over time even after you have more historical data. With these methods, Amazon Fraud Detector makes it easy to transition to the ideal solution in a few steps.

In the following sections, we walk through the steps to get started using Amazon Fraud Detector with GD device intelligence data.

Integrate the GD mobile SDK and start collecting device intelligence data

Prior to using GrabDefence device intelligence within your application, you must first register as a GrabDefence client. You receive the following credentials from the GrabDefence team:

  • tenant_id – A unique client identifier that represents your organization
  • app_id – A unique application identifier that represents the application you’re integrating

Refer to the GrabDefence documentation for further guidance on how to integrate this SDK.

Create your event type in Amazon Fraud Detector

An event type defines the schema for the event you want to assess for fraud. When creating an event type in Amazon Fraud Detector, you define all the data elements you will have available at the time of the fraud evaluation, including the GD device intelligence risk profile data elements such as the unique device ID and various device verdicts, to Amazon Fraud Detector variables. You need to include event variables (such as IP, email, or billing address) that are unique to the type of event you’re evaluating for fraud, as well as GD device intelligence data. The following table shows examples of event variables, the GD device intelligence data, and the recommended Amazon Fraud Detector variable type to map each element to.

Event Variable Type Event Variable (Not Exhaustive) Amazon Fraud Detector Event Variable Example
Event Metadata EVENT_TIMESTAMP EVENT_TIMESTAMP 2019-11-30T13:01:01Z
EVENT_ID EVENT_ID test0299df10-e2db-11eb-96e2-f7dgje3d3k03
ENTITY_ID ENTITY_ID 123
EVENT_LABEL EVENT_LABEL FRAUD or LEGIT
LABEL_TIMESTAMP LABEL_TIMESTAMP 2019-11-30T13:01:01Z
Event Variables Email EMAIL_ADDRESS test@example.com
IP IP_ADDRESS 192.0.2.1
Phone PHONE_NUMBER 555-0123
GD Device Intelligence Verdicts Verdict: IOS Jailbroken Device CUSTOM: CATEGORICAL GV_IOS_JAIL_BROKEN
Verdict: Debugger Detected CUSTOM: CATEGORICAL GV_DEBUGGER_DETECTED
Verdict: Event Token Signature Mismatch CUSTOM: CATEGORICAL GV_EVENT_TOKEN_SIGNATURE_MISMATCH
Verdict: Server Challenge Mismatch CUSTOM: CATEGORICAL GV_SERVER_CHALLENGE_MISMATCH
GD Risk Scores User account risk score CUSTOM: NUMERICAL 0.9 etc

Build your detection logic in Amazon Fraud Detector

At this point, you need to decide whether you want to start with use case A or use case B. For use case A, you start building a rules-based detector. For use case B, you build an Amazon Fraud Detector model first and, once finished, add the model to your detector.

For instructions on building an Amazon Fraud Detector model and detector, refer to the Amazon Fraud Detector user guide.

The following screenshot shows sample detector rules on the Amazon Fraud Detector console.

Test your detector using Amazon Fraud Detector batch predictions

You can use a batch predictions job to test your detector against a set of events using either the Amazon Fraud Detector console or the CreateBatchPredictionJob API. You need to specify the detector version (created in the previous step) and provide the events via a CSV file (up to 50 MB large) stored in an Amazon Simple Storage Service (Amazon S3) bucket. The output file containing the original input data along with appended results of the detector’s predictions will be available in the same S3 bucket (unless you specify a different location).

For more information on running an Amazon Fraud Detector batch prediction, refer to Amazon Fraud Detector batch predictions documentation page.

Set up the supporting infrastructure

To perform real-time predictions using the detector you built, you must set up a Lambda function that performs the following actions:

  1. Receives transaction data (via API Gateway) gathered from your mobile app. This includes data such as IP address, email address, shipping and billing info, and so on, that is unique to the transaction and use case.
  2. Collects the risk profile from the GD API. This includes device intelligence data and risk signals from GD. You need to convert the GD verdicts to the appropriate Amazon Fraud Detector variable CUSTOM: CATEGORICAL types. For example, if the GD verdict list contains GV_IOS_JAIL_BROKEN, you need to set the Verdict: IOS Jailbroken Device variable to TRUE when sending to Amazon Fraud Detector (as detailed in the next section).
  3. Sends the data to the detector using the GetEventPrediction API (see the next section).

Perform real-time predictions using the Amazon Fraud Detector GetEventPrediction API

Your Lambda function can call the Amazon Fraud Detector GetEventPrediction API to perform real-time predictions and obtain results synchronously. The GetEventPrediction API returns matched outcomes based on the rules you set up earlier. If you attached a model to your detector in Amazon Fraud Detector, the model score is also returned as part of the GetEventPrediction API response. You can find examples of GetEventPrediction requests on the aws-fraud-detector-samples GitHub repository.

You can configure your Lambda function accordingly to parse the response from this API, and return the appropriate action to the mobile application (via API Gateway).

Build and train your model

After you integrate the GD SDK and are generating predictions with Amazon Fraud Detector, your events are stored in Amazon Fraud Detector and you can use the UpdateEventLabel API to add fraud labels for confirmed fraud events. When your stored dataset has 10,000 events with device data and at least 400 labelled as fraud, you can start building a custom Amazon Fraud Detector model that learns from GD’s device intelligence data.

At this point, you’re ready to train the model. This takes a few steps on the Amazon Fraud Detector console, and model training typically takes around an hour but can be longer depending on the size of your training dataset.

  1. On the Amazon Fraud Detector console, choose Create model.
  2. Choose Transaction Fraud Insights as the model type.
  3. Choose the event type you created earlier.
  4. Choose the date range for your training dataset that encompasses the period where you’ve collected GD device intelligence data.
  5. Add all the event type variables, including the GD device-specific elements, to your model’s input configuration.
  6. Strat training the model.

After your model is trained, you can review performance metrics and then deploy it by changing its status to Active. To learn more about model scores and performance metrics, see Model scores and Training performance metrics. At this point, you can now add your model to your detector, add threshold rules to interpret the risk scores that the model outputs, and continue making predictions using the GetEventPrediction API.

Automate the solution

You can use AWS CloudFormation to automate the creation of your Amazon Fraud Detector event type and related resources. For more details, refer to managing resources using AWS CloudFormation.

Conclusion

Congrats! You have successfully built an Amazon Fraud Detector model that integrates GD device intelligence into your detector. The Amazon Fraud Detector ML model you trained has learned from multiple data sources, including your own historical data, GD’s device intelligence data, fraud patterns seen across Amazon, and additional third-party data (added automatically by Amazon Fraud Detector). You can deploy this solution on your mobile apps to detect and capture various types of mobile fraud.

Special thanks to everyone who contributed to this blog including, Abhishek Ravi, Tanay Bhargava, Eric Burris, Puneet Gambhir (GrabDefence), Brian Kim (GrabDefence), and Sing Kwan Ng (GrabDefence).


About the author

Marcel Pividal is a Sr. AI Services Solutions Architect in the World-Wide Specialist Organization. Marcel has more than 20 years of experience solving business problems through technology for Fintechs, Payment Providers, Pharma, and government agencies. His current areas of focus are Risk Management, Fraud Prevention, and Identity Verification.

Adriaan de Jonge is Partner Solutions Architect at AWS in Singapore. He is part of the AWS GSI team in the ASEAN geography. Adriaan is particularly interested in serverless, cloud-native development, and DevOps. In his spare time, he likes to bake cakes that are suitable for people with allergies.

Jianbo Liu is a Research Scientist with Amazon Fraud Detector.

Read More

How Synamedia uses Amazon Rekognition Video to build advanced video search capabilities for long-form video

Synamedia is a leading video technology provider addressing the needs for premium video service providers and direct-to-consumer (D2C) with a comprehensive solution portfolio. Synamedia solutions spread across several pillars such as video networks, TV platforms, advertisement and monetization, and content protection and piracy disruption.

Synamedia partnered with AWS to use artificial intelligence (AI) to develop enhanced video search capabilities for long-form video. This is to assist their customers in searching for videos based on a description of scenes that aren’t described in the metadata of the assets. For example, searching for a video (even within a series) that contained a scene on a boat that isn’t significant enough to be mentioned in the metadata. This enables content discovery driven from real-world objects.

With Amazon Rekognition Video, Synamedia built an AI solution that was able to perform label detection in videos and in images using standard and custom models. This enabled scene-level detection of specific objects in long-form video, based on what is actually in the scene at the time. This new capability allows users to find specific occurrences within the long-form video, based only on a general description of what they’re looking for. This enables Synamedia to perform extremely fast when onboarding new content, which now takes a few hours to spin up and get results. The solution is simple to use and extensive by providing the ability to add further custom models for domain-specific images.

“Amazon Rekognition Video is a powerful service that is simple to use. It gave us ready-made access to best-in-class computer vision capabilities, which we could use to build and test innovative video search features in a matter of weeks.”

– Avi Fruchter, Software Engineering Fellow at Synamedia.

Using AI to index visual content

As both supply of video content and demand for greater video insights continue to grow, effective video search capabilities are becoming more important. Traditional video search, however, is typically limited to basic information such as the video title, or in some instances, to metadata attached as tags that describe the key themes or content of the video.

Most descriptive information needs to be added manually, but this becomes prohibitive as the quantity of video grows. As a result, traditional video search performance is often limited. This limitation is even more pronounced for long-form video content, for which scene-level metadata usually doesn’t exist, given how expensive and time-consuming it is to produce.

To address this limitation, Synamedia set out to develop an AI-powered video search solution using computer vision to automatically identify scene-level details in any given video, and make that information discoverable to users based on general descriptions of those scenes.

Using Amazon Rekognition to build a custom computer vision solution in just 2 weeks

To accomplish this goal, Synamedia’s Software Engineering Fellow, Avi Fruchter, turned to Amazon Rekognition, a fully managed video analysis service that helps accelerate the process of using computer vision models to detect relevant scene-level occurrences such as objects, activities, and even text and scenes.

Amazon Rekognition Video accelerates the development of computer vision solutions for video by automatically processing and tagging video content using computer vision models. These models are fully managed and maintained by Amazon Rekognition. It removes the undifferentiated heavy lifting of managing the necessary infrastructure, and also reduces the technical expertise required to build and deploy these models.

To get started, you simply choose which of Amazon Rekognition’s wide range of capabilities is relevant to your task, and call the relevant API. The results are then returned as an easy-to-manage JSON response for each job.

For example, Synamedia used the StartLabelDetection API to automatically generate a list of labels for objects detected in each video frame of their video library. From this simple API call, Amazon Rekognition returned the list of labels, the confidence score of each, and the relevant timestamps for each frame. This enabled Synamedia to immediately create an entirely new set of search metadata for each video in their test library. Users are then able to search for specific video content just by describing specific objects or scenery they’re interested in, and get results that not only match their query, but that also point them to the specific scene in the video that featured that content.

Other relevant Amazon Rekognition APIs for video analysis are StartFaceDetection, StartPersonTracking, and StartSegmentDetection—a feature that can identify the moment that scenes in a video change.

Amazon Rekognition works on both pre-recorded and live video. Pre-recorded video is read from Amazon Simple Storage Service (Amazon S3), and live video can be processed from Amazon Kinesis Video Streams.

Synamedia chose Amazon Rekogntion for its ability to rapidly expand their capabilities. Synamedia’s innovation team is dedicated solely to building new technical innovations in video and has strong technical expertise. However, even for them it’s not always possible to have deep domain expertise in all areas of video technology. Enter Amazon Rekogntion, which extended their capabilities in computer vision, enabling them to conceptualize a use case and quickly test its viability.

“It was extremely fast to onboard, and the results were extremely quick,” Avi Fruchter says. “We are not always domain experts in all areas of ML, and Amazon Rekognition gives us the ability to leverage our existing expertise into new types of enhanced use cases for our customers.”

Synamedia anticipates their solution will have broad benefits for a wide range of customers, including companies with large video libraries as well as the growing number of companies who need to monitor specific events in live video feeds, such as health and safety risks.

Summary

With Amazon Rekognition Video, Synamedia was able to build and test an advanced video search capability in a matter of weeks, without needing to hire or develop additional specialized computer vision expertise.

This new capability has enabled Synamedia to expand the impact of its innovation team and continue with its mission to drive new video innovation for its customers.

Learn more about how you can quickly build advanced computer vision solutions for video by visiting Amazon Rekognition Video or referring to Amazon Rekognition resources.


About the authors

Daniel Burke is the European lead for AI and ML in the Private Equity group at AWS. Daniel works directly with Private Equity funds and their portfolio companies, helping them accelerate their AI and ML adoption to improve innovation and increase enterprise value.

John Shaw is the North American lead for AI and ML in the Private Equity group at AWS. John works directly with Private Equity funds and their portfolio companies, helping them accelerate their AI and ML adoption to improve innovation and increase enterprise value.

Read More