Seamlessly connect Amazon Athena with Amazon Lookout for Metrics to detect anomalies

Amazon Lookout for Metrics is an AWS service that uses machine learning (ML) to automatically monitor the metrics that matter most to businesses with greater speed and accuracy. The service also makes it easier to diagnose the root cause of anomalies, such as unexpected dips in revenue, high rates of abandoned shopping carts, spikes in payment transaction failures, and increases in new user sign-ups. Lookout for Metrics goes beyond simple anomaly detection: it lets developers set up autonomous monitoring for important metrics, detecting anomalies and identifying their root cause in a matter of a few clicks, all with no ML experience required.

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) using standard SQL. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.

With today’s launch, Lookout for Metrics can now seamlessly connect to your data in Athena to set up highly accurate anomaly detectors. This lets you quickly deploy state-of-the-art anomaly detection via ML with Lookout for Metrics against any datasets that are available in Athena.

Athena connectivity extends the capabilities of Lookout for Metrics by bringing the following benefits:

  • It extends the file type support of Lookout for Metrics. Previously, Lookout for Metrics supported CSV and JSON Lines formatted files; with Athena, this expands to Parquet, Avro, plaintext, and more. If you can parse it via Athena, you can now import it into Lookout for Metrics.
  • It also introduces support for data spread across sources via federated queries. Before this launch, if your data was stored in multiple databases or sources, you had to define a complex ETL process, manage its performance characteristics, and export all of the data into a CSV or JSON Lines file before Lookout for Metrics could run anomaly detection on it. With federated queries in Athena, you define the disparate sources and how they should be joined; as soon as the data can be queried by Athena, it's ready for Lookout for Metrics. This lets you hand the burden of data transformation, aggregation, and delivery over to Athena and focus on the anomalies that Lookout for Metrics identifies. A sketch of such a federated query follows this list.
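
To make the idea concrete, here is a hedged sketch of running a federated query with boto3. The catalog, database, table, and bucket names are illustrative only; substitute the data source names registered in your account.

import boto3

athena = boto3.client("athena")

# Illustrative federated query: join live orders from a hypothetical MySQL
# connector catalog with reference data in the AWS Glue Data Catalog.
query = """
SELECT o.order_ts, o.revenue, i.region
FROM mysql_catalog.sales.orders o
JOIN awsdatacatalog.sample_db.inventory i ON o.item_id = i.item_id
"""

athena.start_query_execution(
    QueryString=query,
    ResultConfiguration={"OutputLocation": "s3://sample-results-bucket/athena/"},
)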

Solution overview

In this post, we demonstrate how to integrate an Athena table and detect anomalies in the revenue metric. We also track how the order rate and inventory metrics are impacted. The source data resides in Amazon S3, and we've configured Athena tables to query it. An AWS Lambda function is responsible for updating the partitions in Athena, which Lookout for Metrics uses to detect anomalies. This solution enables you to use an Athena data source for Lookout for Metrics. A sketch of such a partition-updating function appears after this paragraph.
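
The following is a minimal sketch (with hypothetical database, table, and bucket names) of a Lambda handler that registers the current hour's partition in Athena so that Lookout for Metrics always queries fresh data:

import datetime

import boto3

athena = boto3.client("athena")

def handler(event, context):
    now = datetime.datetime.utcnow()
    # Hypothetical layout: s3://sample-data-bucket/live/yyyy/mm/dd/hh/
    ddl = (
        "ALTER TABLE sample_db.live_metrics ADD IF NOT EXISTS "
        f"PARTITION (yyyy='{now:%Y}', mm='{now:%m}', dd='{now:%d}', hh='{now:%H}')"
    )
    return athena.start_query_execution(
        QueryString=ddl,
        QueryExecutionContext={"Database": "sample_db"},
        ResultConfiguration={"OutputLocation": "s3://sample-results-bucket/athena/"},
    )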

You can use the provided AWS CloudFormation stack to set up the resources for the walkthrough. It contains resources that continuously generate live data and make it queryable in Athena.

  • Launch the stack from the following link and select Next on the Create stack page.
  • On the Specify stack details page, add the values from above, give it a Stack name (for example, L4MAthenaDetector), and select Next.
  • On the Configure stack options page, leave everything as-is and select Next.

Set up a new detector with Athena as the data source

Step 1

Log in to the AWS Management Console to get started with creating an anomaly detector with Lookout for Metrics. The first step is to select the “Create detector” button.

Step 2

Fill out the mandatory detector fields, such as the name. Select the detection interval for the detector, which determines how frequently Lookout for Metrics queries your data and monitors it for anomalies. Encryption information is optional; it allows Lookout for Metrics to encrypt your data using your AWS Key Management Service (AWS KMS) key. In this example, we skip adding an encryption key; when none is provided, Lookout for Metrics uses default encryption to encrypt your data. Proceed by selecting the “Create” button.

Step 3

Upon creation of the anomaly detector, you’ll see a confirmation banner at the top. You can proceed by selecting “Add a dataset” through either the banner or the button of the same name.

Fill out the basic information for the data source. Timezone is an optional field. Use the dropdown to select a data source.

Lookout for Metrics supports multiple data sources as a convenience for customers. For this example, we’ll select Athena.

Once Athena is selected as the data source, you’ll have the option of selecting Backtest or Continuous mode for the detector. For this example, we’ll proceed by using the Continuous mode. Proceed by adding details for the Athena table that you want to monitor for anomalies.

You can allow the service to create a service role, or you can use an existing AWS Identity and Access Management (IAM) role in your account. Note that Lookout for Metrics doesn’t support automated creation of IAM roles for federated queries, so in that case you have to create a new IAM role that allows Athena to perform the following actions on your data:

  • CreatePreparedStatement
  • GetPreparedStatement
  • GetQueryResultsStream
  • DeletePreparedStatement
  • GetDatabase
  • GetQueryResults
  • GetWorkGroup
  • GetTableMetadata
  • StartQueryExecution
  • GetQueryExecution

The policy attached to the IAM role created by the service looks like the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:GetPartitions"
      ],
      "Resource": [
        "arn:aws:glue:us-east-1:111122223333:catalog",
        "arn:aws:glue:us-east-1:111122223333:database/sample_db",
        "arn:aws:glue:us-east-1:111122223333:table/sample_db/*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "athena:CreatePreparedStatement",
        "athena:GetPreparedStatement",
        "athena:GetQueryResultsStream",
        "athena:DeletePreparedStatement",
        "athena:GetDatabase",
        "athena:GetQueryResults",
        "athena:GetWorkGroup",
        "athena:GetTableMetadata",
        "athena:StartQueryExecution",
        "athena:GetQueryExecution"
      ],
      "Resource": [
          "arn:aws:athena:us-east-1:111122223333:datacatalog/AWSDataCatalog",
          "arn:aws:athena:us-east-1:111122223333:workgroup/sampleWorkgroup"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::sample-data-bucket",
        "arn:aws:s3:::sample-results-bucket",
        "arn:aws:s3:::sample-data-bucket/*",
        "arn:aws:s3:::sample-results-bucket/*"
      ],
      "Effect": "Allow"
    },
    {
      "Effect": "Allow",
      "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "ForAllValues:StringEquals": {
          "kms:ViaService": "s3.us-east-1.amazonaws.com",
          "kms:CallerAccount": [
            "111122223333"
          ]
        }
      }
    }
  ]
}

Step 4

Now we’ll define the relevant metrics for the detector. Lookout for Metrics populates the drop-downs with the columns present in the supplied Athena table. You can select up to five metrics and five dimensions. Lookout for Metrics requires the data in your table to be partitioned on the timestamp column. You also have the option to estimate the cost of this detector by entering the number of values across your dimensions.

Once you have selected all of the metrics, proceed by selecting the “Next” button. Review the details and select the “Save dataset” button to save the dataset.

Step 5

Once the dataset is created, we’ll activate the detector by either selecting the “Activate” button at the top or the “Activate Detector” button under the “How it works” section.

You’ll be prompted to confirm if you want to activate the detector for continuous detection. Select “Activate” to confirm.

You’ll see a confirmation that the detector is activating.

Step 6

Once the anomaly detector is active, you can use the “Detector log” tab on the detector details page to review the detection executions performed by the service.

Step 7

You can select the “View anomalies” button from the detector details page to manually inspect anomalies that may have been detected by the service.

Step 8

On the Anomalies review page, you can adjust the severity score threshold on the threshold dial to filter anomalies above a selected score.

Review and analyze the results

When detecting an anomaly, Lookout for Metrics helps you focus on what matters most by assigning a severity score to aid prioritization. To help you find the root cause, it intelligently groups anomalies that may be related to the same incident, and then summarizes the different sources of impact.

Lookout for Metrics also lets you provide real-time feedback on the relevance of detected anomalies, thereby enabling a powerful human-in-the-loop mechanism. This information is fed back to the anomaly detection model to improve its accuracy in near-real time.
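
Feedback can also be submitted programmatically. The following is a minimal sketch using the PutFeedback API via boto3; the detector ARN, anomaly group ID, and time series ID are placeholders you would obtain from your own detector and its detected anomalies:

import boto3

l4m = boto3.client("lookoutmetrics")

# Mark a specific time series in an anomaly group as "not an anomaly"
l4m.put_feedback(
    AnomalyDetectorArn="arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:L4MAthenaDetector",
    AnomalyGroupTimeSeriesFeedback={
        "AnomalyGroupId": "<anomaly-group-id>",
        "TimeSeriesId": "<time-series-id>",
        "IsAnomaly": False,
    },
)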

Clean up

To avoid incurring additional charges for the resources set up for the demo, delete the created detector in Lookout for Metrics and the stack created via CloudFormation. A programmatic cleanup sketch follows.
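
If you prefer to clean up from code, a hedged sketch with boto3 follows; the detector ARN and stack name are the placeholders used in this walkthrough:

import boto3

# Delete the anomaly detector (this also deletes its associated resources)
boto3.client("lookoutmetrics").delete_anomaly_detector(
    AnomalyDetectorArn="arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:L4MAthenaDetector"
)

# Delete the CloudFormation stack that generates the sample data
boto3.client("cloudformation").delete_stack(StackName="L4MAthenaDetector")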

Conclusion

You can seamlessly connect your data in Athena to Lookout for Metrics to set up highly accurate anomaly detectors across the metrics and dimensions within your Athena tables. To get started with this capability, see Using Amazon Athena with Lookout for Metrics. You can use this capability in all Regions where Lookout for Metrics is publicly available. For more information about Region availability, see AWS Regional Services.


About the Authors

Devesh Ratho is a Software Development Engineer in the Lookout for Metrics team. His interests lie in building scalable distributed systems. In his spare time, he enjoys sim racing.

Chris King is a Senior Solutions Architect in Applied AI with AWS. He has a special interest in launching AI services and helped grow and build Amazon Personalize and Amazon Forecast before focusing on Amazon Lookout for Metrics. In his spare time he enjoys cooking, reading, boxing, and building models to predict the outcome of combat sports.

Read More

Detect social media fake news using graph machine learning with Amazon Neptune ML

In recent years, social media has become a common means for sharing and consuming news. However, the spread of misinformation and fake news on these platforms has posed a major challenge to the well-being of individuals and societies. Therefore, it is imperative that we develop robust and automated solutions for early detection of fake news on social media. Traditional approaches rely purely on the news content (using natural language processing) to mark information as real or fake. However, the social context in which the news is published and shared can provide additional insights into the nature of fake news on social media and improve the predictive capabilities of fake news detection tools. In this post, we demonstrate how to use Amazon Neptune ML to detect fake news based on the content and social context of the news on social media.

Neptune ML is a new capability of Amazon Neptune that uses graph neural networks (GNNs), a machine learning (ML) technique purpose-built for graphs, to make easy, fast, and accurate predictions using graph data. Making accurate predictions on graphs with billions of relationships requires expertise. Existing ML approaches such as XGBoost can’t operate effectively on graphs because they’re designed for tabular data. As a result, using these methods on graphs can take time, require specialized skills, and produce suboptimal predictions.

Neptune ML uses the Deep Graph Library (DGL), an open-source library to which AWS contributes, and Amazon SageMaker to build and train GNNs, including Relational Graph Convolutional Networks (R-GCNs) for tasks such as node classification, node regression, link prediction, or edge classification.

The DGL makes it easy to apply deep learning to graph data and provides fast and memory-efficient message-passing primitives for training GNNs. Neptune ML uses the DGL to automatically choose and train the best ML model for your workload, automating the heavy lifting of model selection and training. This enables you to make ML-based predictions on graph data in hours instead of weeks. For more information, see Amazon Neptune ML for machine learning on graphs.

Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to prepare, build, train, and deploy ML models quickly.

Overview of GNNs

GNNs are neural networks that take graphs as input. These models operate on the relational information in data to produce insights not possible in other neural network architectures and algorithms. A graph (sometimes called a network) is a data structure that highlights the relationships between components in the data. It consists of nodes (or vertices) and edges (or links) that act as connections between the nodes. Such a data structure has an advantage when dealing with entities that have multiple relationships. Graph data structures have been around for centuries, with a wide variety of modern use cases.

GNNs are emerging as an important class of deep learning (DL) models. GNNs learn embeddings on nodes, edges, and graphs. GNNs have been around for about 20 years, but interest in them has dramatically increased in the last 5 years. In this time, we’ve seen new architectures emerge, novel applications realized, and new platforms and libraries enter the scene. There are several potential research and industry use cases for GNNs, including the following (a toy node classification sketch with the DGL appears after this list):

  • Computer vision – Generating scene graphs
  • Forecasting – Predicting traffic volume
  • Node classification – Implementing targeted campaigns, detecting fake news
  • Graph classification – Predicting the properties of a chemical compound
  • Link prediction – Building recommendation systems
  • Other – Predicting adversarial attacks
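
To give a feel for what training a GNN looks like in practice, here is a self-contained toy example of node classification with the DGL and PyTorch. This is an illustration only, not Neptune ML's internal code; the graph, features, and labels are random:

import dgl
import dgl.nn as dglnn
import torch
import torch.nn.functional as F

# Tiny graph: 4 nodes connected by 3 directed edges (src -> dst)
g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])))
g = dgl.add_self_loop(g)             # GraphConv expects self-loops
feats = torch.randn(4, 8)            # 8-dimensional node features
labels = torch.tensor([0, 1, 0, 1])  # binary node labels

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = dglnn.GraphConv(8, 16)
        self.conv2 = dglnn.GraphConv(16, 2)

    def forward(self, g, x):
        h = F.relu(self.conv1(g, x))  # message passing + nonlinearity
        return self.conv2(g, h)       # per-node class logits

model = GCN()
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(50):
    loss = F.cross_entropy(model(g, feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()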

Dataset

For this post, we use the BuzzFeed dataset from the 2018 version of FakeNewsNet. The BuzzFeed dataset consists of a sample of news articles shared on Facebook from nine news agencies over 1 week leading up to the 2016 US election. Every post and the corresponding news article have been fact-checked by BuzzFeed journalists. The following table summarizes key statistics about the BuzzFeed dataset from FakeNewsNet.

Category Amount
Users 15,257
Authors 126
Publishers 28
Social Links 634,750
Engagements 25,240
News Articles 182
Fake News 91
Real News 91

To get the raw data, you can complete the following steps:

  1. Clone the FakeNewsNet repository from GitHub.
  2. Check out the old version branch.
  3. Change the directory to Data/BuzzFeed.

Each row in the Users.txt file provides a UUID for the corresponding user.

Each row in the News.txt file provides a name and ID for the corresponding news in the dataset.

In the BuzzFeedNewsUser.txt file, the news_id in the first column is posted or shared by the user_id in the second column n times, where n is the value in the third column.

In the BuzzFeedUserUser.txt file, the user_id in the first column follows the user_id in the second column.

User features such as age, gender, and historical social media activities (109,626 features for each user) are made available in the UserFeature.mat file. Sample news content files, shown in the following screenshot, contain information such as the news title, news text, author name, and publisher web address. A hedged sketch of loading the relation files appears below.
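
For example, the relation files can be loaded with pandas. This sketch assumes tab-separated columns in the order described above; adjust the separator and column names if your checkout differs:

import pandas as pd

# news_id, user_id, and how many times the user posted/shared the news
news_user = pd.read_csv(
    "Data/BuzzFeed/BuzzFeedNewsUser.txt",
    sep="\t", names=["news_id", "user_id", "times_shared"],
)

# follower user_id, followee user_id
user_user = pd.read_csv(
    "Data/BuzzFeed/BuzzFeedUserUser.txt",
    sep="\t", names=["follower_id", "followee_id"],
)

print(news_user.head())
print(user_user.head())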

We processed the raw data from the FakeNewsNet repository and converted it into CSV format for vertices and edges in a heterogeneous property graph that can be readily loaded into a Neptune database with Apache TinkerPop Gremlin. The constructed property graph is composed of four vertex types and five edge types, as demonstrated in the following schematic, which together describe the social context in which each news item is published and shared. The News vertices have two properties: news_title and news_type (Fake or Real). The edges connecting News and User vertices have a weight property describing how many times the user has shared the news. The User vertices have a 100-dimension property representing user features such as age, gender, and historical social media activities (reduced from 109,626 to 100 using principal coordinate analysis).

The following screenshot shows the first 10 rows of the processed nodes.csv file.

The following screenshot shows the first 10 rows of the processed edges.csv file.
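
For reference, Neptune’s Gremlin bulk-load CSV format uses system columns (~id and ~label for vertices; ~id, ~from, ~to, and ~label for edges) plus typed property columns. Illustrative headers for files like ours might look like the following; treat the exact property layout as a sketch rather than the verbatim files:

~id,~label,news_title:String,news_type:String
~id,~from,~to,~label,weight:Int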

To follow along with this post, start by using the following AWS CloudFormation quick-start template to quickly spin up an associated Neptune cluster and AWS graph notebook, and set up all the configurations needed to work with Neptune ML in a graph notebook. You then need to download and save the sample dataset in the default Amazon Simple Storage Service (Amazon S3) bucket associated with your SageMaker session, or in an S3 bucket of your choice. For rapid experimentation and initial data exploration, you can save a copy of the dataset under the home directory of the local volume attached to your SageMaker notebook instance, and follow the create_graph_dataset.ipynb Jupyter notebook. After you generate the processed nodes and edges files, you can run the following commands to upload the transformed graph data to Amazon S3:

import boto3

bucket = '<bucket-name>'
prefix = 'fake-news-detection/data'
s3_client = boto3.client('s3')
resp = s3_client.upload_file('./Data/upload/nodes.csv', bucket, f"{prefix}/nodes.csv")
resp = s3_client.upload_file('./Data/upload/edges.csv', bucket, f"{prefix}/edges.csv")

You can use the %load magic command, which is available as part of the AWS graph notebook, to bulk load data to Neptune:

%load -s {s3_uri} -f csv -p OVERSUBSCRIBE --run

You can use the %graph_notebook_config magic command to see information about the Neptune cluster associated with your graph notebook. You can also use the %status magic command to see the status of your Neptune cluster, as shown in the following screenshot.

Solution overview

Neptune ML uses graph neural network technology to automatically create, train, and deploy ML models on your graph data. Neptune ML supports common graph prediction tasks, such as node classification and regression, edge classification and regression, and link prediction. In our solution, we use node classification to classify news nodes according to the news_type property.

The following diagram illustrates the high-level process flow to develop the best model for fake news detection.

Graph ML with Neptune ML involves five main steps:

  1. Export and configure the data – The data export step uses the Neptune-Export service to export data from Neptune into Amazon S3 in CSV format. A configuration file named training-data-configuration.json is automatically generated, which specifies how the exported data can be loaded into a trainable graph.
  2. Preprocess the data – The exported dataset is preprocessed using standard techniques to prepare it for model training. Feature normalization can be performed for numeric data, and text features can be encoded using word2vec. At the end of this step, a DGL graph is generated from the exported dataset for the model training step. This step is implemented using a SageMaker processing job, and the resulting data is stored in an Amazon S3 location that you have specified.
  3. Train the model – This step trains the ML model that will be used for predictions. Model training is done in two stages:

    1. The first stage uses a SageMaker processing job to generate a model training strategy configuration set that specifies what type of model and model hyperparameter ranges are used for the model training.
    2. The second stage uses a SageMaker model tuning job to try different hyperparameter configurations and select the training job that produced the best-performing model. The tuning job runs a pre-specified number of model training job trials on the processed data. At the end of this stage, the trained model parameters of the best training job are used to generate model artifacts for inference.
  4. Create an inference endpoint in SageMaker – The inference endpoint is a SageMaker endpoint instance that is launched with the model artifacts produced by the best training job. The endpoint is able to accept incoming requests from the graph database and return the model predictions for inputs in the requests.
  5. Query the ML model using Gremlin – You can use extensions to the Gremlin query language to query predictions from the inference endpoint.

Before we proceed with the first step of machine learning, let’s verify that the graph dataset is loaded in the Neptune cluster. Run the following Gremlin traversal to see the count of nodes by label:

%%gremlin
g.V().groupCount().by(label).unfold().order().by(keys)

If nodes are loaded correctly, the output is as follows:

  • 126 author nodes
  • 182 news nodes
  • 28 publisher nodes
  • 15,257 user nodes

Use the following code to see the count of edges by label:

%%gremlin
g.E().groupCount().by(label).unfold().order().by(keys)

If edges are loaded correctly, the output is as follows:

  • 634,750 follows edges
  • 174 published edges
  • 250 wrote edges
  • 250 wrote_for edges

Now let’s go through the ML development process in detail.

Export and configure the data

The export process is triggered by calling the Neptune-Export service endpoint. This call contains a configuration object that specifies the type of ML model to build, in our case node classification, as well as any feature configurations required.

The configuration options provided to the Neptune-Export service are broken into two main sections: selecting the target and configuring features. Here we want to classify news nodes according to the news_type property.

The second section of the configuration, configuring features, is where we specify details about the types of data stored in our graph and how the ML model should interpret that data. When data is exported from Neptune, all properties of all nodes are included. Each property is treated as a separate feature for the ML model. Neptune ML does its best to infer the correct type of feature for a property, but in many cases, the accuracy of the model can be improved by specifying information about the property used for a feature. We use word2vec to encode the news_title property of news nodes, and the numerical type for user_features property of user nodes. See the following code:

export_params = {
    "command": "export-pg",
    "params": {
        "endpoint": neptune_ml.get_host(),
        "profile": "neptune_ml",
        "useIamAuth": neptune_ml.get_iam(),
        "cloneCluster": False
    },
    "outputS3Path": f"{s3_uri}/neptune-export",
    "additionalParams": {
        "neptune_ml": {
            "version": "v2.0",
            "targets": [
                {
                    "node": "news",
                    "property": "news_type",
                    "type": "classification"
                }
            ],
            "features": [
                {
                    "node": "news",
                    "property": "news_title",
                    "type": "text_word2vec"
                },
                {
                    "node": "user",
                    "property": "user_features",
                    "type": "numerical"
                }
            ]
        }
    },
    "jobSize": "medium"
}

Start the export process by running the following command:

%%neptune_ml export start --export-url {neptune_ml.get_export_service_host()} --export-iam --wait --store-to export_results
${export_params}

Preprocess the data

When the export job is complete, we’re ready to train our ML model. There are three machine learning steps in Neptune ML. The first step (data processing) processes the exported graph dataset using standard feature preprocessing techniques to prepare it for use by the DGL. This step performs functions such as feature normalization for numeric data and encoding text features using word2vec. At the conclusion of this step, the dataset is formatted for model training. This step is implemented using a SageMaker processing job, and data artifacts are stored in a pre-specified Amazon S3 location when the job is complete. Run the following code to create the data processing configuration; the notebook magic that starts the processing job with this configuration is shown right after it:

# The training_job_name can be set to a unique value below, otherwise one will be auto generated
training_job_name=neptune_ml.get_training_job_name('fake-news-detection')

processing_params = f"""
--config-file-name training-data-configuration.json
--job-id {training_job_name} 
--s3-input-uri {export_results['outputS3Uri']} 
--s3-processed-uri {str(s3_uri)}/preloading """
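
With the configuration defined, the processing job is started with the Neptune ML notebook magic. The following invocation is a sketch; the --store-to name is our choice, and it holds the job metadata used in later steps:

%neptune_ml dataprocessing start --wait --store-to processing_results {processing_params}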

Train the model

Now that you have the data processed in the desired format, this step trains the ML model that is used for predictions. The model training is done in two stages. The first stage uses a SageMaker processing job to generate a model training strategy. A model training strategy is a configuration set that specifies what type of model and model hyperparameter ranges are used for the model training. After the first stage is complete, the SageMaker processing job launches a SageMaker hyperparameter tuning job. The hyperparameter tuning job runs a pre-specified number of model training job trials on the processed data, and stores the model artifacts generated by the training in the output Amazon S3 location. When all the training jobs are complete, the hyperparameter tuning job also notes the training job that produced the best performing model.

We use the following training parameters:

training_params=f"""
--job-id {training_job_name}
--data-processing-id {training_job_name} 
--instance-type ml.c5.18xlarge
--s3-output-uri {str(s3_uri)}/training
--max-hpo-number 20
--max-hpo-parallel 4 """
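
The training job itself is then launched with the Neptune ML notebook magic. As a sketch, we store the results under training_results, the name referenced in the next code block:

%neptune_ml training start --wait --store-to training_results {training_params}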

The hyperparameter tuning finds the best version of a model by running many training jobs on the dataset. You can summarize hyperparameters of the five best training jobs and their respective model performance as follows:

import sagemaker
import pandas as pd

tuning_job_name = training_results['hpoJob']['name']
tuner = sagemaker.HyperparameterTuningJobAnalytics(tuning_job_name)

full_df = tuner.dataframe()

if len(full_df) > 0:
    df = full_df[full_df["FinalObjectiveValue"] > -float("inf")]
    if len(df) > 0:
        df = df.sort_values("FinalObjectiveValue", ascending=False)
        print("Number of training jobs with valid objective: %d" % len(df))
        print({"lowest": min(df["FinalObjectiveValue"]), "highest": max(df["FinalObjectiveValue"])})
        pd.set_option("display.max_colwidth", None)  # Don't truncate TrainingJobName
        print(df.head(5))  # The five best training jobs and their metrics
    else:
        print("No training jobs have reported valid results yet.")

We can see that the best performing training job achieved an accuracy of approximately 94%. This training job will be automatically selected by Neptune ML for creating an endpoint in the next step.

Create an endpoint

The final step of machine learning is to create an inference endpoint, which is a SageMaker endpoint instance that is launched with the model artifacts produced by the best training job. We use this endpoint in our graph queries to return the model predictions for the inputs in the request. After the endpoint is created, it stays active until it’s manually deleted. Create the endpoint with the following code:

endpoint_params=f"""
--id {training_job_name}
--model-training-job-id {training_job_name} """

#Create endpoint
%neptune_ml endpoint create --wait --store-to endpoint_results {endpoint_params}

Our new endpoint is now up and running.

Query the ML model

Now let’s query your trained graph to see how the model predicts news_type for one unseen news node. First, retrieve the actual label stored on the node:

%%gremlin
g.V().has('news_title', 'BREAKING: Steps to FORCE FBI Director Comey to Resign In Process – Hearing Decides His Fate Sept 28').properties("news_type").value()

Then ask the inference endpoint for its prediction:

%%gremlin
g.with("Neptune#ml.endpoint", "${endpoint}").
  V().has('news_title', "BREAKING: Steps to FORCE FBI Director Comey to Resign In Process – Hearing Decides His Fate Sept 28").properties("news_type").with("Neptune#ml.classification").value()

If your graph is continuously changing, you may need to update ML predictions frequently using the newest data. Although you can do this simply by rerunning the earlier steps (from data export and configuration to creating your inference endpoint), Neptune ML supports simpler ways to update your ML predictions using new data. See Workflows for handling evolving graph data for more details.

Conclusion

In this post, we showed how Neptune ML and GNNs can help detect social media fake news using node classification on graph data by combining information from the complex interaction patterns in the graph. For instructions on implementing this solution, see the GitHub repo. You can also clone and extend this solution with additional data sources for model retraining and tuning. We encourage you to reach out and discuss your use cases with the authors via your AWS account manager.


About the Authors

Hasan Shojaei is a Data Scientist with AWS Professional Services, where he helps customers across different industries such as sports, insurance, and financial services solve their business challenges through the use of big data, machine learning, and cloud technologies. Prior to this role, Hasan led multiple initiatives to develop novel physics-based and data-driven modeling techniques for top energy companies. Outside of work, Hasan is passionate about books, hiking, photography, and ancient history.

Sarita Joshi is a Senior Data Science Manager with the AWS Professional Services Intelligence team. Together with her team, Sarita plays a strategic role for our customers and partners by helping them achieve their business outcomes through machine learning and artificial intelligence solutions at scale. She has several years of experience as a consultant advising clients across many industries and technical domains, including AI, ML, analytics, and SAP. She holds a master’s degree in Computer Science, Specialty Data Science from Northeastern University.
