Leveraging artificial intelligence and machine learning at Parsons with AWS DeepRacer

This post is co-written with Jennifer Bergstrom, Sr. Technical Director, ParsonsX.

Parsons Corporation (NYSE:PSN) is a leading disruptive technology company in critical infrastructure, national defense, space, intelligence, and security markets providing solutions across the globe to help make the world safer, healthier, and more connected. Parsons provides services and capabilities across cybersecurity, missile defense, space ground station technology, transportation, environmental remediation, and water/wastewater treatment to name a few.

Parsons is a builder community and invests heavily in employee development programs and upskilling. With programs such as ParsonsX, Parsons’s digital transformation initiative, and ‘The Guild,’ an employee-focused community, Parsons strives to be an employer of choice and engages its employees in career development programs year-round to create a workforce of the future.

In this post, we show you how Parsons is building its next generation workforce by using machine learning (ML) and artificial intelligence (AI) with AWS DeepRacer in a fun and collaborative way.

As Parsons’ footprint moves into the cloud, their leadership recognized the need for a change in culture and a fundamental requirement to educate their engineering workforce on the new cloud operating model, tools, and technologies. Parsons is observing industry trends that make it imperative to incorporate AI and ML capabilities into the organization’s strategic and tactical decision-making processes. To best serve customer needs, Parsons must upskill its workforce across the board in AI/ML tooling and how to scale it in an enterprise organization. Parsons is on a mission to make AI/ML a foundation of business across the company.

Parsons chose AWS DeepRacer because it’s a fun, interactive, and exciting challenge that appealed to a broad range of employees and didn’t mandate a significant level of expertise to compete. Parsons found that AWS has many dedicated AWS DeepRacer experts in the field who would help plan, set up, and run a series of AI/ML events and challenges. Parsons realized that the success of this event would be driven by the efficient mechanisms and processes the AWS DeepRacer community has in place.

Parsons’ goal was to upskill their employees in an enjoyable and competitive way, with virtual leagues among peer groups and an in-person event for the top racers. The education initiative, in partnership with AWS, consisted of four phases.

First, Parsons hosted a virtual live workshop with AWS experts in the AI/ML and DeepRacer community. The workshop taught the basics of reinforcement learning, reward functions, hyperparameter tuning, and accessing the AWS DeepRacer console to train and submit a model.
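To give a flavor of what participants worked with, the following is a minimal sketch of a centerline-following reward function, similar to the basic examples in the AWS DeepRacer documentation. The thresholds are illustrative only and are not the functions Parsons’ racers used.

def reward_function(params):
    """Reward the agent for staying close to the track centerline."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Markers at increasing distances from the centerline
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0   # very close to the centerline
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track or close to it

    return float(reward)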

In the next phase, they hosted a virtual community league race for all participating Parsons employees. Models were optimized, submitted, and raced, and winners were announced at the end of racing. Participants in the virtual leagues included individual contributors and frontline managers from various job roles across Parsons, including civil engineers, bridge engineers, systems and software engineers, data analysts, project managers, and program managers. Joining them in the league were business unit presidents, SVPs, VPs, senior directors, and directors.

In the third phase, an in-person league was held in Maryland. The top four participants from the virtual leagues saw their models loaded into and raced in physical AWS DeepRacer cars on a track built onsite. The top four competitors at this event included a market CTO, a project signaling engineer, a project engineer, and an engineer intern.

The fourth and final phase of the event had each of the four competitors provide a technical walkthrough of the techniques used to develop, train, and test their models. Through AWS DeepRacer, Parsons not only showcased the event’s global impact across all of its divisions, but also created a memorable experience for participants.

Over 500 employees from various business units and service organizations across Parsons worldwide registered right after the AWS DeepRacer challenge was announced internally. The AWS DeepRacer workshop saw unprecedented interest, with over 470 Parsons employees joining the initial session. The virtual workshop generated significant engagement: 245 active users developed more than 1,500 models and spent over 500 hours training them on the AWS DeepRacer console. The virtual league was a resounding success, with 185 racers from across the country participating and submitting 1,415 models into the competition!

The virtual AWS DeepRacer league at Parsons provided a fun and inviting environment with lots of iterations, learning, and experimentation. Parsons’ Market CTO, John Statuli, who was one of the top four contenders at the race, said, “It was a lot of fun to participate at the AWS DeepRacer event. I have not done any programming in a long time, but the combination of the AWS DeepRacer virtual workshop and the AWS DeepRacer program provided an easy way by which I could participate and compete for the top spot.”

At the final race held in Maryland, Parsons broadcast a companywide virtual event that showcased a tough competition between the top four competitors from three different business units. Parsons’ top leadership joined the event, including CTO Rico Lorenzo, D&I CTO Ryan Gabrielle, President of Connected Communities Peter Torrellas, CDO Tim LaChapelle, and ParsonsX Sr. Director Jennifer Bergstrom. At the event, Parsons hosted a webinar with over 100 attendees and a walkthrough of the winners’ models.

With such an overwhelming response from employees across the globe and an interest in AI/ML learning, Parsons is now planning several additional events to continue growing its employees’ knowledge base. To continue to upskill and educate their workforce, Parsons intends to run more AWS DeepRacer events and workshops focused on object avoidance, an Amazon SageMaker deep dive workshop, and an AWS DeepRacer head-to-head race. Parsons continues to engage with AWS on AI/ML services to build world-class solutions in the fields of critical infrastructure, national defense, space, and cybersecurity.

Whether your organization is new to machine learning or ready to build on existing skills, AWS DeepRacer can help you get there. To learn more, visit Getting Started with AWS DeepRacer.


About the Authors

Jenn Bergstrom is a Parsons Fellow and Senior Technical Director. She is passionate about innovative technological solutions and strategies and enjoys designing well-architected cloud solutions for programs across all of Parsons’s domains. When not driving innovation at Parsons, she loves exploring the world with her husband and daughters, and mentoring diverse individuals transitioning into the tech industry. You can reach her on LinkedIn.

Deval Parikh is a Sr. Enterprise Solutions Architect at Amazon Web Services. She is passionate about helping enterprises reimagine their businesses in the cloud by leading them with strategic architectural guidance and building prototypes as an AWS expert. She is also an active board member of the Women at AWS affinity group where she oversees university programs to educate students on cloud technology and careers. She is also an avid hiker and a painter of oil on canvas. You can see many of her paintings at www.devalparikh.com. You can reach her on LinkedIn.

How Thomson Reuters built an AI platform using Amazon SageMaker to accelerate delivery of ML projects

This post is co-written by Ramdev Wudali and Kiran Mantripragada from Thomson Reuters.

In 1992, Thomson Reuters (TR) released its first AI legal research service, WIN (Westlaw Is Natural), an innovation at the time, as most search engines only supported Boolean terms and connectors. Since then, TR has achieved many more milestones as its AI products and services are continuously growing in number and variety, supporting legal, tax, accounting, compliance, and news service professionals worldwide, with billions of machine learning (ML) insights generated every year.

With this tremendous increase of AI services, the next milestone for TR was to streamline innovation and facilitate collaboration by standardizing how AI solutions are built and reused across business functions and AI practitioner personas, while ensuring adherence to enterprise best practices:

  • Automate and standardize the repetitive undifferentiated engineering effort
  • Ensure the required isolation and control of sensitive data according to common governance standards
  • Provide easy access to scalable computing resources

To fulfill these requirements, TR built the Enterprise AI platform around the following five pillars: a data service, experimentation workspace, central model registry, model deployment service, and model monitoring.

In this post, we discuss how TR and AWS collaborated to develop TR’s first ever Enterprise AI Platform, a web-based tool that provides capabilities spanning ML experimentation, training, a central model registry, model deployment, and model monitoring. All these capabilities are built to address TR’s ever-evolving security standards and provide simple, secure, and compliant services to end-users. We also share how TR enabled monitoring and governance for ML models created across different business units with a single pane of glass.

The challenges

Historically at TR, ML has been a capability reserved for teams with advanced data scientists and engineers. Teams with highly skilled resources were able to implement complex ML processes as per their needs, but those efforts quickly became siloed. Siloed approaches provided no visibility or governance into extremely critical decision-making predictions.

TR business teams have vast domain knowledge; however, the technical skills and heavy engineering effort required in ML make it difficult for them to use their deep expertise to solve business problems with the power of ML. TR wants to democratize these skills, making ML accessible to more people within the organization.

Different teams in TR follow their own practices and methodologies. TR wants to provide capabilities that span the ML lifecycle to its users to accelerate the delivery of ML projects by enabling teams to focus on business goals rather than on repetitive, undifferentiated engineering effort.

Additionally, regulations around data and ethical AI continue to evolve, mandating common governance standards across TR’s AI solutions.

Solution overview

TR’s Enterprise AI Platform was envisioned to provide simple and standardized services to different personas, offering capabilities for every stage of the ML lifecycle. TR has identified five major categories that modularize all TR’s requirements:

  • Data service – To enable easy and secured access to enterprise data assets
  • Experimentation workspace – To provide capabilities to experiment and train ML models
  • Central model registry – An enterprise catalog for models built across different business units
  • Model deployment service – To provide various inference deployment options following TR’s enterprise CI/CD practices
  • Model monitoring services – To provide capabilities to monitor data and model bias and drifts

As shown in the following diagram, these microservices are built with a few key principles in mind:

  • Remove the undifferentiated engineering effort from users
  • Provide the required capabilities at the click of a button
  • Secure and govern all capabilities as per TR’s enterprise standards
  • Bring a single pane of glass for ML activities

Thomson Reuters AI Platform

TR’s AI Platform microservices are built with Amazon SageMaker as the core engine, AWS serverless components for workflows, and AWS DevOps services for CI/CD practices. SageMaker Studio is used for experimentation and training, and the SageMaker model registry is used to register models. The central model registry comprises both the SageMaker model registry and an Amazon DynamoDB table. SageMaker hosting services are used to deploy models, while SageMaker Model Monitor and SageMaker Clarify are used to monitor models for drift and bias, compute custom metrics, and provide explainability.

The following sections describe these services in detail.

Data service

A traditional ML project lifecycle starts with finding data. In general, data scientists spend 60% or more of their time finding the right data when they need it. Like every organization, TR has multiple data stores that serve as a single point of truth for different data domains. TR identified two key enterprise data stores that provide data for most of their ML use cases: an object store and a relational data store. TR built an AI Platform data service to seamlessly provide access to both data stores from users’ experimentation workspaces and remove the burden of navigating complex processes to acquire data. TR’s AI Platform follows all the compliance requirements and best practices defined by the Data and Model Governance team. This includes a mandatory Data Impact Assessment that helps ML practitioners understand and follow the ethical and appropriate use of data, with formal approval processes to ensure appropriate access to the data. Core to this service, as well as all platform services, is security and compliance according to the best practices determined by TR and the industry.

Amazon Simple Storage Service (Amazon S3) object storage acts as a content data lake. TR built processes to securely access data from the content data lake into users’ experimentation workspaces while maintaining required authorization and auditability. Snowflake is used as the primary enterprise relational data store. Upon user request, and based on approval from the data owner, the AI Platform data service delivers a snapshot of the data directly into the user’s experimentation workspace.

Accessing data from various sources is a technical problem that can be easily solved. The complexity TR has solved is building approval workflows that automate identifying the data owner, sending an access request, making sure the data owner is notified of the pending request, and, based on the approval status, taking action to provide data to the requester. All the events throughout this process are tracked and logged for auditability and compliance.

As shown in the following diagram, TR uses AWS Step Functions to orchestrate the workflow and AWS Lambda to run the functionality. Amazon API Gateway is used to expose the functionality with an API endpoint to be consumed from their web portal.
Data service
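The following is a minimal sketch of how such a request could be wired together with boto3: an API Gateway-backed Lambda handler that starts a Step Functions execution for the approval workflow. The environment variable, payload fields, and state machine are hypothetical placeholders, not TR’s actual implementation.

import json
import os
import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical ARN of the data-access approval state machine, injected via configuration
STATE_MACHINE_ARN = os.environ["DATA_ACCESS_STATE_MACHINE_ARN"]


def lambda_handler(event, context):
    """Triggered by API Gateway when a user requests a dataset from the portal."""
    body = json.loads(event["body"])

    # Start the approval workflow; Step Functions tracks every state transition for auditability
    execution = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps(
            {
                "requester": body["requester"],      # hypothetical field names
                "datasetId": body["datasetId"],
                "workspaceId": body["workspaceId"],
            }
        ),
    )

    return {
        "statusCode": 202,
        "body": json.dumps({"executionArn": execution["executionArn"]}),
    }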

Model experimentation and development

An essential capability for standardizing the ML lifecycle is an environment that allows data scientists to experiment with different ML frameworks and data sizes. Provisioning such a secure, compliant environment in the cloud within minutes relieves data scientists of the burden of handling cloud infrastructure, networking requirements, and security standards, and lets them focus on the data science problem instead.

TR built an experimentation workspace that offers access to services such as AWS Glue, Amazon EMR, and SageMaker Studio to enable data processing and ML capabilities while adhering to enterprise cloud security standards and the required account isolation for every business unit. TR encountered the following challenges while implementing the solution:

  • Orchestration early on wasn’t fully automated and involved several manual steps, which made it hard to track down where problems were occurring. TR overcame this by orchestrating the workflows with Step Functions. With Step Functions, building complex workflows, managing state, and handling errors became much easier.
  • Defining a proper AWS Identity and Access Management (IAM) role for the experimentation workspace was hard. To comply with TR’s internal security standards and least privilege model, the workspace role was originally defined with inline policies. Consequently, the inline policy grew over time and became verbose, exceeding the policy size limit allowed for an IAM role. To mitigate this, TR switched to customer-managed policies and referenced them in the workspace role definition (see the sketch after this list).
  • TR occasionally reached the default resource limits applied at the AWS account level. This caused occasional failures when launching SageMaker jobs (for example, training jobs) because the limit for the desired resource type had been reached. TR worked closely with the SageMaker service team on this issue, which was resolved after the AWS team launched SageMaker as a supported service in Service Quotas in June 2022.
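As a minimal illustration of that mitigation, the following boto3 sketch creates a customer-managed policy and attaches it to a workspace role. The policy document, role name, and policy name are hypothetical examples, not TR’s actual definitions.

import json
import boto3

iam = boto3.client("iam")

# Hypothetical least-privilege policy for reading a workspace's S3 prefix
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-workspace-bucket",
                "arn:aws:s3:::example-workspace-bucket/team-a/*",
            ],
        }
    ],
}

# Create the customer-managed policy once...
policy = iam.create_policy(
    PolicyName="workspace-team-a-data-access",   # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)

# ...and attach it to the workspace role instead of growing an inline policy
iam.attach_role_policy(
    RoleName="workspace-team-a-execution-role",  # hypothetical role
    PolicyArn=policy["Policy"]["Arn"],
)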

Today, data scientists at TR can launch an ML project by creating an independent workspace and adding the required team members to collaborate. The scale offered by SageMaker is at their fingertips through custom kernel images in a range of sizes. SageMaker Studio quickly became a crucial component in TR’s AI Platform and has changed user behavior from using constrained desktop applications to scalable and ephemeral purpose-built engines. The following diagram illustrates this architecture.

Model experimentation and development

Central model registry

The model registry provides a central repository for all of TR’s machine learning models, enables risk and health management of those models in a standardized manner across business functions, and streamlines model reuse. Therefore, the service needed to do the following:

  • Provide the capability to register both new and legacy models, whether developed within or outside SageMaker
  • Implement governance workflows, enabling data scientists, developers, and stakeholders to view and collectively manage the lifecycle of models
  • Increase transparency and collaboration by creating a centralized view of all models across TR alongside metadata and health metrics

TR started the design with just the SageMaker model registry, but one of TR’s key requirements is the capability to register models created outside of SageMaker. TR evaluated different relational databases but ended up choosing DynamoDB because the metadata schema for models coming from legacy sources varies widely. TR also didn’t want to impose any additional work on users, so it implemented a seamless automatic synchronization between the AI Platform workspace SageMaker registries and the central SageMaker registry using Amazon EventBridge rules and the required IAM roles. TR enhanced the central registry with DynamoDB to extend the capabilities to register legacy models that were created on users’ desktops.
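The following is a minimal sketch of that synchronization pattern: a Lambda function subscribed to SageMaker model package state-change events through an EventBridge rule, writing model metadata into a DynamoDB table. The table name and the specific detail fields read from the event are assumptions for illustration, not TR’s actual schema.

import boto3

dynamodb = boto3.resource("dynamodb")
# Hypothetical central registry table
registry_table = dynamodb.Table("central-model-registry")


def lambda_handler(event, context):
    """Invoked by an EventBridge rule on SageMaker model package state changes."""
    detail = event.get("detail", {})

    # Field names below are assumptions about the event payload, used for illustration
    item = {
        "model_package_arn": detail.get("ModelPackageArn", "unknown"),
        "model_group": detail.get("ModelPackageGroupName", "unknown"),
        "approval_status": detail.get("ModelApprovalStatus", "unknown"),
        "source": "sagemaker-workspace-registry",
    }

    # Upsert the record so the central registry stays in sync with workspace registries
    registry_table.put_item(Item=item)
    return {"status": "synced", "model_package_arn": item["model_package_arn"]}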

TR’s AI Platform central model registry is integrated into the AI Platform portal and provides a visual interface to search models, update model metadata, and understand model baseline metrics and periodic custom monitoring metrics. The following diagram illustrates this architecture.

Central model registry

Model deployment

TR identified two major patterns to automate deployment:

  • Models developed using SageMaker are deployed through SageMaker batch transform jobs to get inferences on a preferred schedule
  • Models developed outside SageMaker on local desktops using open-source libraries are deployed through the bring your own container (BYOC) approach, using SageMaker processing jobs to run custom inference code, which is an efficient way to migrate those models without refactoring the code

With the AI Platform deployment service, TR users (data scientists and ML engineers) can identify a model from the catalog and deploy an inference job into their chosen AWS account by providing the required parameters through a UI-driven workflow.

TR automated this deployment using AWS DevOps services like AWS CodePipeline and AWS CodeBuild. TR uses Step Functions to orchestrate the workflow of reading and preprocessing data to creating SageMaker inference jobs. TR deploys the required components as code using AWS CloudFormation templates. The following diagram illustrates this architecture.

Model deployment
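To make the first deployment pattern concrete, the following is a minimal boto3 sketch that creates a SageMaker batch transform job for a model already registered in SageMaker. The job name, model name, S3 locations, and instance type are hypothetical parameters a user would supply through the portal.

import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_transform_job(
    TransformJobName="mortality-model-batch-2023-01-15",   # hypothetical
    ModelName="mortality-model-v3",                        # hypothetical registered model
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/inference-input/",
            }
        },
        "ContentType": "text/csv",
        "SplitType": "Line",
    },
    TransformOutput={"S3OutputPath": "s3://example-bucket/inference-output/"},
    TransformResources={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
)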

Model monitoring

The ML lifecycle is not complete without the ability to monitor models. TR’s enterprise governance team also mandates and encourages business teams to monitor their model performance over time to address any regulatory challenges. TR started with monitoring models and data for drift. TR used SageMaker Model Monitor to capture a data baseline and inference ground truth, and to periodically monitor how TR’s data and inferences are drifting. Along with the SageMaker model monitoring metrics, TR enhanced the monitoring capability by developing custom metrics specific to their models. This helps TR’s data scientists understand when to retrain their models.
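The following is a minimal sketch of that pattern using the SageMaker Python SDK: suggesting a baseline from training data and scheduling a daily monitoring job against an endpoint. The S3 paths, endpoint name, and schedule are hypothetical examples, not TR’s configuration.

from sagemaker import get_execution_role
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

role = get_execution_role()

monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

# Compute baseline statistics and constraints from the training data (hypothetical paths)
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/train/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/monitoring/baseline",
)

# Schedule a daily drift-monitoring job against a deployed endpoint (hypothetical name)
monitor.create_monitoring_schedule(
    monitor_schedule_name="mortality-model-data-drift",
    endpoint_input="mortality-model-endpoint",
    output_s3_uri="s3://example-bucket/monitoring/reports",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.daily(),
)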

Along with drift monitoring, TR also wants to understand bias in the models. The out-of-the-box capabilities of SageMaker Clarify are used to build TR’s bias service. TR monitors both data and model bias and makes those metrics available for their users through the AI Platform portal.

To help all teams to adopt these enterprise standards, TR has made these services independent and readily available via the AI Platform portal. TR’s business teams can go into the portal and deploy a model monitoring job or bias monitoring job on their own and run them on their preferred schedule. They’re notified on the status of the job and the metrics for every run.

TR used AWS services for CI/CD deployment, workflow orchestration, serverless frameworks, and API endpoints to build microservices that can be triggered independently, as shown in the following architecture.
Model monitoring

Results and future improvements

TR’s AI Platform went live in Q3 2022 with all five major components: a data service, experimentation workspace, central model registry, model deployment, and model monitoring. TR conducted internal training sessions for its business units to onboard the platform and offered them self-guided training videos.

The AI Platform has provided capabilities to TR’s teams that never existed before; it has opened a wide range of possibilities for TR’s enterprise governance team to enhance compliance standards and centralize the registry, providing a single pane of glass view across all ML models within TR.

TR acknowledges that no product is at its best on initial release. All TR’s components are at different levels of maturity, and TR’s Enterprise AI Platform team is in a continuous enhancement phase to iteratively improve product features. TR’s current advancement pipeline includes adding additional SageMaker inference options like real-time, asynchronous, and multi-model endpoints. TR is also planning to add model explainability as a feature to its model monitoring service. TR plans to use the explainability capabilities of SageMaker Clarify to develop its internal explainability service.

Conclusion

TR can now process vast amounts of data securely and use advanced AWS capabilities to take an ML project from ideation to production in a span of weeks, compared to the months it took before. With the out-of-the-box capabilities of AWS services, teams within TR can register and monitor ML models for the first time ever, achieving compliance with their evolving model governance standards. TR has empowered data scientists and product teams to effectively unleash their creativity and solve their most complex problems.

To learn more about TR’s Enterprise AI Platform on AWS, check out the AWS re:Invent 2022 session. To learn how TR accelerated the use of machine learning using the AWS Data Lab program, refer to the case study.


About the Authors

Ramdev Wudali is a Data Architect, helping architect and build the AI/ML Platform to enable data scientists and researchers to develop machine learning solutions by focusing on the data science and not on the infrastructure needs. In his spare time, he loves to fold paper to create origami tessellations and wear irreverent T-shirts.

Kiran Mantripragada is the Senior Director of AI Platform at Thomson Reuters. The AI Platform team is responsible for enabling production-grade AI software applications and enabling the work of data scientists and machine learning researchers. With a passion for science, AI, and engineering, Kiran likes to bridge the gap between research and productization to bring the real innovation of AI to the final consumers.

Bhavana Chirumamilla is a Sr. Resident Architect at AWS. She is passionate about data and ML operations, and brings lots of enthusiasm to help enterprises build data and ML strategies. In her spare time, she enjoys time with her family traveling, hiking, gardening, and watching documentaries.

Srinivasa Shaik is a Solutions Architect at AWS based in Boston. He helps enterprise customers accelerate their journey to the cloud. He is passionate about containers and machine learning technologies. In his spare time, he enjoys spending time with his family, cooking, and traveling.

Qingwei Li is a Machine Learning Specialist at Amazon Web Services. He received his PhD in Operations Research after he broke his advisor’s research grant account and failed to deliver the Nobel Prize he promised. Currently, he helps customers in the financial service and insurance industry build machine learning solutions on AWS. In his spare time, he likes reading and teaching.

Federated Learning on AWS with FedML: Health analytics without sharing sensitive data – Part 2

Analyzing real-world healthcare and life sciences (HCLS) data poses several practical challenges, such as distributed data silos, lack of sufficient data at a single site for rare events, regulatory guidelines that prohibit data sharing, infrastructure requirements, and the cost incurred in creating a centralized data repository. Because they’re in a highly regulated domain, HCLS partners and customers seek privacy-preserving mechanisms to manage and analyze large-scale, distributed, and sensitive data.

To mitigate these challenges, we propose a federated learning (FL) framework, based on open-source FedML on AWS, which enables analyzing sensitive HCLS data. It involves training a global machine learning (ML) model from distributed health data held locally at different sites. It doesn’t require moving or sharing data across sites or with a centralized server during the model training process.

Deploying an FL framework on the cloud has several challenges. Automating the client-server infrastructure to support multiple accounts or virtual private clouds (VPCs) requires VPC peering and efficient communication across VPCs and instances. In a production workload, a stable deployment pipeline is needed to seamlessly add and remove clients and update their configurations without much overhead. Furthermore, in a heterogeneous setup, clients may have varying requirements for compute, network, and storage. In this decentralized architecture, logging and debugging errors across clients can be difficult. Finally, determining the optimal approach to aggregate model parameters, maintain model performance, ensure data privacy, and improve communication efficiency is an arduous task. In this post, we address these challenges by providing a federated learning operations (FLOps) template that hosts an HCLS solution. The solution is agnostic to use cases, which means you can adapt it for your own use cases by changing the model and data.

In this two-part series, we demonstrate how you can deploy a cloud-based FL framework on AWS. In the first post, we described FL concepts and the FedML framework. In this second post, we present a proof-of-concept healthcare and life sciences use case built on a real-world dataset, eICU. This dataset comprises a multi-center critical care database collected from over 200 hospitals, which makes it ideal for testing our FL experiments.

HCLS use case

For the purpose of demonstration, we built an FL model on a publicly available dataset to manage critically ill patients. We used the eICU Collaborative Research Database, a multi-center intensive care unit (ICU) database comprising 200,859 patient unit encounters for 139,367 unique patients. The patients were admitted to one of 335 units at 208 hospitals located throughout the US between 2014–2015. Due to the underlying heterogeneity and distributed nature of the data, it provides an ideal real-world example for testing this FL framework. The dataset includes laboratory measurements, vital signs, care plan information, medications, patient history, admission diagnosis, time-stamped diagnoses from a structured problem list, and similarly chosen treatments. It is available as a set of CSV files, which can be loaded into any relational database system. The tables are de-identified to meet the regulatory requirements of the US Health Insurance Portability and Accountability Act (HIPAA). The data can be accessed via a PhysioNet repository, and details of the data access process can be found in [1].

The eICU data is ideal for developing ML algorithms, decision support tools, and advancing clinical research. For benchmark analysis, we considered the task of predicting the in-hospital mortality of patients [2]. We defined it as a binary classification task, where each data sample spans a 1-hour window. To create a cohort for this task, we selected patients with a hospital discharge status in the patient record and a length of stay of at least 48 hours, because we focus on predicting mortality during the first 24 and 48 hours. This created a cohort of 30,680 patients containing 1,164,966 records. We adopted the domain-specific data preprocessing and methods described in [3] for mortality prediction. This resulted in an aggregated dataset comprising several columns per patient per record, as shown in the following table, which presents a patient record with time in columns (5 intervals over 48 hours) and vital sign observations in rows. Each row represents a physiological variable, and each column represents its value recorded over a time window of 48 hours for that patient.

Physiologic Parameter Chart_Time_0 Chart_Time_1 Chart_Time_2 Chart_Time_3 Chart_Time_4
Glasgow Coma Score Eyes 4 4 4 4 4
FiO2 15 15 15 15 15
Glasgow Coma Score Eyes 15 15 15 15 15
Heart Rate 101 100 98 99 94
Invasive BP Diastolic 73 68 60 64 61
Invasive BP Systolic 124 122 111 105 116
Mean arterial pressure (mmHg) 77 77 77 77 77
Glasgow Coma Score Motor 6 6 6 6 6
02 Saturation 97 97 97 97 97
Respiratory Rate 19 19 19 19 19
Temperature (C) 36 36 36 36 36
Glasgow Coma Score Verbal 5 5 5 5 5
admissionheight 162 162 162 162 162
admissionweight 96 96 96 96 96
age 72 72 72 72 72
apacheadmissiondx 143 143 143 143 143
ethnicity 3 3 3 3 3
gender 1 1 1 1 1
glucose 128 128 128 128 128
hospitaladmitoffset -436 -436 -436 -436 -436
hospitaldischargestatus 0 0 0 0 0
itemoffset -6 -1 0 1 2
pH 7 7 7 7 7
patientunitstayid 2918620 2918620 2918620 2918620 2918620
unitdischargeoffset 1466 1466 1466 1466 1466
unitdischargestatus 0 0 0 0 0

We used both numerical and categorical features and grouped all records of each patient to flatten them into a single-record time series. The seven categorical features (admission diagnosis, ethnicity, gender, Glasgow Coma Score Total, Glasgow Coma Score Eyes, Glasgow Coma Score Motor, and Glasgow Coma Score Verbal) contained 429 unique values and were converted into one-hot embeddings. To prevent data leakage across training node servers, we split the data by hospital IDs and kept all records of a hospital on a single node.
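The following is a minimal pandas sketch of this preparation step. The file name and column names (such as hospitalid and the Glasgow Coma Score columns) are hypothetical placeholders, not the preprocessing code used in our experiments.

import pandas as pd

df = pd.read_csv("eicu_aggregated.csv")  # hypothetical preprocessed file

# One-hot encode the categorical features (hypothetical column names)
categorical_cols = [
    "apacheadmissiondx", "ethnicity", "gender",
    "gcs_total", "gcs_eyes", "gcs_motor", "gcs_verbal",
]
df = pd.get_dummies(df, columns=categorical_cols)

# Split by hospital ID so that all records of a hospital stay on a single client node
hospital_ids = sorted(df["hospitalid"].unique())
midpoint = len(hospital_ids) // 2
client_1 = df[df["hospitalid"].isin(hospital_ids[:midpoint])]
client_2 = df[df["hospitalid"].isin(hospital_ids[midpoint:])]

client_1.to_csv("client_1_data.csv", index=False)
client_2.to_csv("client_2_data.csv", index=False)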

Solution overview

The following diagram shows the architecture of multi-account deployment of FedML on AWS. This includes two clients (Participant A and Participant B) and a model aggregator.

The architecture consists of three separate Amazon Elastic Compute Cloud (Amazon EC2) instances, each running in its own AWS account. The first two instances are owned by the clients, and the third instance is owned by the model aggregator. The accounts are connected via VPC peering to allow ML models and weights to be exchanged between the clients and the aggregator. gRPC is used as the communication backend between the model aggregator and the clients. We tested a single-account distributed computing setup with one server and two client nodes. Each of these instances was created using a custom Amazon EC2 AMI with FedML dependencies installed as per the FedML.ai installation guide.

Set up VPC peering

After you launch the three instances in their respective AWS accounts, you establish VPC peering between the accounts via Amazon Virtual Private Cloud (Amazon VPC). To set up a VPC peering connection, first create a request to peer with another VPC. You can request a VPC peering connection with another VPC in your account, or with a VPC in a different AWS account. To activate the request, the owner of the accepter VPC must accept it. For the purpose of this demonstration, we set up the peering connection between VPCs in different accounts but in the same Region. For other configurations of VPC peering, refer to Create a VPC peering connection.

Before you begin, make sure that you have the AWS account number and VPC ID of the VPC to peer with.

Request a VPC peering connection

To create the VPC peering connection, complete the following steps:

  1. On the Amazon VPC console, in the navigation pane, choose Peering connections.
  2. Choose Create peering connection.
  3. For Peering connection name tag, you can optionally name your VPC peering connection. Doing so creates a tag with a key of the name and a value that you specify. This tag is only visible to you; the owner of the peer VPC can create their own tags for the VPC peering connection.
  4. For VPC (Requester), choose the VPC in your account to create the peering connection.
  5. For Account, choose Another account.
  6. For Account ID, enter the AWS account ID of the owner of the accepter VPC.
  7. For VPC (Accepter), enter the VPC ID with which to create the VPC peering connection.
  8. In the confirmation dialog box, choose OK.
  9. Choose Create peering connection.

Accept a VPC peering connection

As mentioned earlier, the VPC peering connection needs to be accepted by the owner of the VPC the connection request has been sent to. Complete the following steps to accept the peering connection request:

  1. On the Amazon VPC console, use the Region selector to choose the Region of the accepter VPC.
  2. In the navigation pane, choose Peering connections.
  3. Select the pending VPC peering connection (the status is pending-acceptance), and on the Actions menu, choose Accept Request.
  4. In the confirmation dialog box, choose Yes, Accept.
  5. In the second confirmation dialog, choose Modify my route tables now to go directly to the route tables page, or choose Close to do this later.

Update route tables

To enable private IPv4 traffic between instances in peered VPCs, add a route to the route tables associated with the subnets for both instances. The route destination is the CIDR block (or portion of the CIDR block) of the peer VPC, and the target is the ID of the VPC peering connection. For more information, see Configure route tables.

Update your security groups to reference peer VPC groups

Update the inbound or outbound rules for your VPC security groups to reference security groups in the peered VPC. This allows traffic to flow across instances that are associated with the referenced security group in the peered VPC. For more details about setting up security groups, refer to Update your security groups to reference peer security groups.
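For teams that prefer scripting these steps, the following is a minimal boto3 sketch of the same flow: the requester account creates the peering connection, the accepter account accepts it, and each side adds a route toward the peer VPC’s CIDR block. The Region, account IDs, VPC IDs, CIDR blocks, and route table IDs are placeholders.

import boto3

# Run in the requester account
requester_ec2 = boto3.client("ec2", region_name="us-east-1")
peering = requester_ec2.create_vpc_peering_connection(
    VpcId="vpc-11111111",        # requester VPC (placeholder)
    PeerVpcId="vpc-22222222",    # accepter VPC (placeholder)
    PeerOwnerId="222222222222",  # accepter AWS account ID (placeholder)
)
peering_id = peering["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# Run in the accepter account
accepter_ec2 = boto3.client("ec2", region_name="us-east-1")
accepter_ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=peering_id)

# Add routes on both sides so private IPv4 traffic can flow between the peered VPCs
requester_ec2.create_route(
    RouteTableId="rtb-aaaaaaaa",         # requester route table (placeholder)
    DestinationCidrBlock="10.1.0.0/16",  # accepter VPC CIDR (placeholder)
    VpcPeeringConnectionId=peering_id,
)
accepter_ec2.create_route(
    RouteTableId="rtb-bbbbbbbb",         # accepter route table (placeholder)
    DestinationCidrBlock="10.0.0.0/16",  # requester VPC CIDR (placeholder)
    VpcPeeringConnectionId=peering_id,
)

The security group updates described above can be made in the same way with the authorize_security_group_ingress API, referencing the peer security group.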

Configure FedML

After you have the three EC2 instances running, connect to each of them and perform the following steps:

  1. Clone the FedML repository.
  2. Provide topology data about your network in the config file grpc_ipconfig.csv.

This file can be found at FedML/fedml_experiments/distributed/fedavg in the FedML repository. The file includes data about the server and clients and their designated node mapping, such as FL Server – Node 0, FL Client 1 – Node 1, and FL Client 2 – Node 2.

  3. Define the GPU mapping config file.

This file can be found at FedML/fedml_experiments/distributed/fedavg in the FedML repository. The gpu_mapping.yaml file contains configuration data that maps each client and server process to a corresponding GPU.

After you define these configurations, you’re ready to run the clients. Note that the clients must be run before kicking off the server. Before doing that, let’s set up the data loaders for the experiments.

Customize FedML for eICU

To customize the FedML repository for the eICU dataset, make the following changes to the data and the data loader.

Data

Add the data to the pre-assigned data folder. You can place the data in any folder of your choice, as long as the path is consistently referenced in the training script and access is enabled. To follow a real-world HCLS scenario, where local data isn’t shared across sites, split and sample the data so there’s no overlap of hospital IDs across the two clients. This ensures the data of a hospital is hosted on its own server. We also enforced the same constraint to split the data into train/test sets within each client. Each of the train/test sets across the clients had a 1:10 ratio of positive to negative labels, with roughly 27,000 samples in training and 3,000 samples in test. We handle the data imbalance in model training with a weighted loss function.

Data loader

Each of the FedML clients loads the data and converts it into PyTorch tensors for efficient training on GPU. Extend the existing FedML nomenclature to add a folder for eICU data in the data_processing folder.

The following code snippet loads the data from the data source. It preprocesses the data and returns one item at a time through the __getitem__ function.

import logging
import pickle
import random
import numpy as np
import torch.utils.data as data


class eicu_truncated(data.Dataset):

    def __init__(self, file_path, dataidxs=None, transform=None, target_transform=None, 
                 task='mort', ohe=True, cat=True, num=True, n_cat_class=429):
        <code to initialize class variables>

    def _load_data(self, file_path):
        <code to load data files for each client>

    def __getitem__(self, index):
        <code to process data and return input and labels>
        return x.astype(np.float32), y

    def __len__(self):
        return len(self.data)

Training ML models with a single data point at a time is tedious and time-consuming. Model training is typically done on a batch of data points at each client. To implement this, the data loader in the data_loader.py script converts NumPy arrays into Torch tensors, as shown in the following code snippet. Note that FedML provides dataset.py and data_loader.py scripts for both structured and unstructured data that you can use for data-specific alterations, as in any PyTorch project.

import logging
import numpy as np
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
from .dataset import eicu_truncated #load the dataset.py file mentioned above
.
.
.
.
# Use standard FedML functions for data distribution and split here
.
.
.
.
# Invoke load_partition_data function for model training. Adapt this function for your dataset
def load_partition_data_eicu(dataset, train_file, test_file, partition_method, partition_alpha, client_number, batch_size):
    <code to partition eicu data and its aggregated statistics>
    return (train_data_num, test_data_num, train_data_global, test_data_global,
            data_local_num_dict, train_data_local_dict, test_data_local_dict, class_num, net_dataidx_map)

Import the data loader into the training script

After you create the data loader, import it into the FedML code for ML model training. Like any other dataset (for example, CIFAR-10 and CIFAR-100), load the eICU data to the main_fedavg.py script in the path FedML/fedml_experiments/distributed/fedavg/. Here, we used the federated averaging (fedavg) aggregation function. You can follow a similar method to set up the main file for any other aggregation function.

from fedml_api.data_preprocessing.cifar100.data_loader import load_partition_data_cifar100
from fedml_api.data_preprocessing.cinic10.data_loader import load_partition_data_cinic10

# For eicu
from fedml_api.data_preprocessing.eicu.data_loader import load_partition_data_eicu

We call the data loader function for eICU data with the following code:

    elif dataset_name == "eicu":
        logging.info("load_data. dataset_name = %s" % dataset_name)
        args.client_num_in_total = 2
        train_data_num, test_data_num, train_data_global, test_data_global, 
        train_data_local_num_dict, train_data_local_dict, test_data_local_dict, 
        class_num, net_dataidx_map = load_partition_data_eicu(dataset=dataset_name, train_file=args.train_file,
                                                  test_file=args.test_file, partition_method=args.partition_method, partition_alpha=args.partition_alpha,
                                                  client_number=args.client_num_in_total, batch_size=args.batch_size)

Define the model

FedML supports several out-of-the-box deep learning algorithms for various data types, such as tabular, text, image, graphs, and Internet of Things (IoT) data. Load the model specific to eICU with input and output dimensions defined based on the dataset. For this proof-of-concept development, we used a logistic regression model with default configurations to train and predict the mortality rate of patients. The following code snippet shows the updates we made to the main_fedavg.py script. Note that you can also use custom PyTorch models with FedML and import them into the main_fedavg.py script.

if model_name == "lr" and args.dataset == "mnist":
        logging.info("LogisticRegression + MNIST")
        model = LogisticRegression(28 * 28, output_dim)
elif model_name == "lr" and args.dataset == "eicu":
        logging.info("LogisticRegression + eicu")
        model = LogisticRegression(22100, output_dim)
elif model_name == "rnn" and args.dataset == "shakespeare":
        logging.info("RNN + shakespeare")
        model = RNN_OriginalFedAvg()

Run and monitor FedML training on AWS

The following video shows the training process being initialized in each of the clients. After both clients have registered with the server, create the server training process that performs federated aggregation of the models.

To configure the FL server and clients, complete the following steps:

  1. Run Client 1 and Client 2.

To run a client, enter the following command with its corresponding node ID. For instance, to run Client 1 with node ID 1, run from the command line:

> sh run_fedavg_cross_zone_eICU.sh 1
  2. After both client instances are started, start the server instance using the same command and the appropriate node ID per your configuration in the grpc_ipconfig.csv file. You can see the model weights being passed to the server from the client instances.
  3. We train the FL model for 50 epochs. As you can see in the video below, the weights are transferred between nodes 0, 1, and 2, indicating that training is progressing as expected in a federated manner.
  4. Finally, monitor and track the FL model training progression across the different nodes in the cluster using the Weights & Biases (wandb) tool, as shown in the following screenshot. Follow the steps listed here to install wandb and set up monitoring for this solution.
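If you haven’t used wandb before, the following is a minimal standalone sketch of logging a custom metric; in this solution, the FedML training scripts handle wandb initialization through their run arguments, so this snippet is only for orientation. The project name, run name, and metric values are hypothetical.

import random
import wandb

# Authenticate once with `wandb login` on each node, then initialize a run
wandb.init(project="fedml-eicu", name="fedavg-lr-50-rounds")  # hypothetical names

# Inside the training loop, log metrics so every node shows up in one dashboard
for round_idx in range(50):
    test_accuracy = random.random()  # stand-in value; FedML logs real metrics per round
    wandb.log({"round": round_idx, "test_accuracy": test_accuracy})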

The following video captures all these steps to provide an end-to-end demonstration of FL on AWS using FedML:

Conclusion

In this post, we showed how you can deploy an FL framework, based on open-source FedML, on AWS. It allows you to train an ML model on distributed data, without the need to share or move it. We set up a multi-account architecture, where in a real-world scenario, hospitals or healthcare organizations can join the ecosystem to benefit from collaborative learning while maintaining data governance. We used the multi-hospital eICU dataset to test this deployment. This framework can also be applied to other use cases and domains. We will continue to extend this work by automating deployment through infrastructure as code (using AWS CloudFormation), further incorporating privacy-preserving mechanisms, and improving interpretability and fairness of the FL models.

Please review the presentation at re:MARS 2022 focused on “Managed Federated Learning on AWS: A case study for healthcare” for a detailed walkthrough of this solution.

References

[1] Pollard, Tom J., et al. “The eICU Collaborative Research Database, a freely available multi-center database for critical care research.” Scientific data 5.1 (2018): 1-13.

[2] Yin, X., Zhu, Y. and Hu, J., 2021. A comprehensive survey of privacy-preserving federated learning: A taxonomy, review, and future directions. ACM Computing Surveys (CSUR), 54(6), pp.1-36.

[3] Sheikhalishahi, Seyedmostafa, Vevake Balaraman, and Venet Osmani. “Benchmarking machine learning models on multi-centre eICU critical care dataset.” Plos one 15.7 (2020): e0235424.


About the Authors

Vidya Sagar Ravipati is a Manager at the Amazon ML Solutions Lab, where he leverages his vast experience in large-scale distributed systems and his passion for machine learning to help AWS customers across different industry verticals accelerate their AI and cloud adoption. Previously, he was a Machine Learning Engineer in Connectivity Services at Amazon who helped to build personalization and predictive maintenance platforms.

Olivia Choudhury, PhD, is a Senior Partner Solutions Architect at AWS. She helps partners, in the Healthcare and Life Sciences domain, design, develop, and scale state-of-the-art solutions leveraging AWS. She has a background in genomics, healthcare analytics, federated learning, and privacy-preserving machine learning. Outside of work, she plays board games, paints landscapes, and collects manga.

Wajahat Aziz is a Principal Machine Learning and HPC Solutions Architect at AWS, where he focuses on helping healthcare and life sciences customers leverage AWS technologies for developing state-of-the-art ML and HPC solutions for a wide variety of use cases such as Drug Development, Clinical Trials, and Privacy Preserving Machine Learning. Outside of work, Wajahat likes to explore nature, hiking, and reading.

Divya Bhargavi is a Data Scientist and Media and Entertainment Vertical Lead at the Amazon ML Solutions Lab, where she solves high-value business problems for AWS customers using Machine Learning. She works on image/video understanding, knowledge graph recommendation systems, predictive advertising use cases.

Ujjwal Ratan is the leader for AI/ML and Data Science in the AWS Healthcare and Life Science Business Unit and is also a Principal AI/ML Solutions Architect. Over the years, Ujjwal has been a thought leader in the healthcare and life sciences industry, helping multiple Global Fortune 500 organizations achieve their innovation goals by adopting machine learning. His work involving the analysis of medical imaging, unstructured clinical text and genomics has helped AWS build products and services that provide highly personalized and precisely targeted diagnostics and therapeutics. In his free time, he enjoys listening to (and playing) music and taking unplanned road trips with his family.

Chaoyang He is Co-founder and CTO of FedML, Inc., a startup running for a community building open and collaborative AI from anywhere at any scale. His research focuses on distributed/federated machine learning algorithms, systems, and applications. He received his Ph.D. in Computer Science from the University of Southern California, Los Angeles, USA.

Salman Avestimehr is Co-founder and CEO of FedML, Inc., a startup running for a community building open and collaborative AI from anywhere at any scale. Salman Avestimehr is a world-renowned expert in federated learning with over 20 years of R&D leadership in both academia and industry. He is a Dean’s Professor and the inaugural director of the USC-Amazon Center on Trustworthy Machine Learning at the University of Southern California. He has also been an Amazon Scholar in Amazon. He is a United States Presidential award winner for his profound contributions in information technology, and a Fellow of IEEE.

Federated Learning on AWS with FedML: Health analytics without sharing sensitive data – Part 1

Analyzing real-world healthcare and life sciences (HCLS) data poses several practical challenges, such as distributed data silos, lack of sufficient data at any single site for rare events, regulatory guidelines that prohibit data sharing, infrastructure requirements, and the cost incurred in creating a centralized data repository. Because they are in a highly regulated domain, HCLS partners and customers seek privacy-preserving mechanisms to manage and analyze large-scale, distributed, and sensitive data.

To mitigate these challenges, we propose using an open-source federated learning (FL) framework called FedML, which enables you to analyze sensitive HCLS data by training a global machine learning model from distributed data held locally at different sites. FL doesn’t require moving or sharing data across sites or with a centralized server during the model training process.

In this two-part series, we demonstrate how you can deploy a cloud-based FL framework on AWS. In the first post, we described FL concepts and the FedML framework. In the second post, we present the use cases and dataset to show its effectiveness in analyzing real-world healthcare datasets, such as the eICU data, which comprises a multi-center critical care database collected from over 200 hospitals.

Background

Although the volume of HCLS-generated data has never been greater, the challenges and constraints associated with accessing such data limits its utility for future research. Machine learning (ML) presents an opportunity to address some of these concerns and is being adopted to advance data analytics and derive meaningful insights from diverse HCLS data for use cases like care delivery, clinical decision support, precision medicine, triage and diagnosis, and chronic care management. Because ML algorithms are often not adequate in protecting the privacy of patient-level data, there is a growing interest among HCLS partners and customers to use privacy-preserving mechanisms and infrastructure for managing and analyzing large-scale, distributed, and sensitive data. [1]

We have developed an FL framework on AWS that enables analyzing distributed and sensitive health data in a privacy-preserving manner. It involves training a shared ML model without moving or sharing data across sites or with a centralized server during the model training process, and can be implemented across multiple AWS accounts. Participants can either choose to maintain their data in their on-premises systems or in an AWS account that they control. Therefore, it brings analytics to data, rather than moving data to analytics.

In this post, we show how you can deploy the open-source FedML framework on AWS. We test the framework on eICU data, a multi-center critical care database collected from over 200 hospitals, to predict in-hospital patient mortality. We can use this FL framework to analyze other datasets, including genomic and life sciences data. It can also be adopted by other domains that are rife with distributed and sensitive data, including the finance and education sectors.

Federated learning

Advancements in technology have led to an explosive growth of data across industries, including HCLS. HCLS organizations often store data in siloes. This poses a major challenge in data-driven learning, which requires large datasets to generalize well and achieve the desired level of performance. Moreover, gathering, curating, and maintaining high-quality datasets incur significant time and cost.

Federated learning mitigates these challenges by collaboratively training ML models on distributed data, without the need to share or centralize it. It allows diverse sites to be represented within the final model, reducing the potential risk of site-based bias. The framework follows a client-server architecture, where the server shares a global model with the clients. The clients train the model on local data and share parameters (such as gradients or model weights) with the server. The server aggregates these parameters to update the global model, which is then shared with the clients for the next round of training, as shown in the following figure. This iterative process of model training continues until the global model converges.

Iterative process of model training
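To make the aggregation step concrete, the following is a minimal numerical sketch of federated averaging (FedAvg) over PyTorch state dicts, weighted by each client’s number of samples. It illustrates the idea only and is not FedML’s actual implementation.

import copy
import torch

def federated_average(client_states, client_sample_counts):
    """Weighted average of client model parameters (FedAvg)."""
    total_samples = sum(client_sample_counts)
    global_state = copy.deepcopy(client_states[0])

    for key in global_state.keys():
        # Accumulate each client's parameters weighted by its share of the data
        global_state[key] = sum(
            (n / total_samples) * state[key]
            for state, n in zip(client_states, client_sample_counts)
        )
    return global_state

# Usage: average two client models of the same architecture
model_a = torch.nn.Linear(10, 1)
model_b = torch.nn.Linear(10, 1)
global_state = federated_average(
    [model_a.state_dict(), model_b.state_dict()],
    client_sample_counts=[27000, 3000],
)
model_a.load_state_dict(global_state)  # the updated global model is sent back to clients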

In recent years, this new learning paradigm has been successfully adopted to address the concern of data governance in training ML models. One such effort is MELLODDY, an Innovative Medicines Initiative (IMI)-led consortium, powered by AWS. It’s a 3-year program involving 10 pharmaceutical companies, 2 academic institutions, and 3 technology partners. Its primary goal is to develop a multi-task FL framework to improve the predictive performance and chemical applicability of drug discovery-based models. The platform comprises multiple AWS accounts, with each pharma partner retaining full control of their respective accounts to maintain their private datasets, and a central ML account coordinating the model training tasks.

The consortium trained models on billions of data points, consisting of over 20 million small molecules in over 40,000 biological assays. Based on experimental results, the collaborative models demonstrated a 4% improvement in categorizing molecules as either pharmacologically or toxicologically active or inactive. It also led to a 10% increase in its ability to yield confident predictions when applied to new types of molecules. Finally, the collaborative models were typically 2% better at estimating values of toxicological and pharmacological activities.

FedML

FedML is an open-source library to facilitate FL algorithm development. It supports three computing paradigms: on-device training for edge devices, distributed computing, and single-machine simulation. It also offers diverse algorithmic research with flexible and generic API design and comprehensive reference baseline implementations (optimizer, models, and datasets). For a detailed description of the FedML library, refer to FedML.

The following figure presents the open-source library architecture of FedML.

open-source library architecture of FedML

Open-source library architecture of FedML

As seen in the preceding figure, from the application point of view, FedML shields details of the underlying code and complex configurations of distributed training. At the application level, such as computer vision, natural language processing, and data mining, data scientists and engineers only need to write the model, data, and trainer in the same way as a standalone program and then pass them to the FedMLRunner object to complete all the processes, as shown in the following code. This greatly reduces the overhead for application developers performing FL.

import fedml
from my_model_trainer import MyModelTrainer
from my_server_aggregator import MyServerAggregator
from fedml import FedMLRunner

if __name__ == "__main__":
# init FedML framework
args = fedml.init()

# init device
device = fedml.device.get_device(args)

# load data
dataset, output_dim = fedml.data.load(args)

# load model
model = fedml.model.create(args, output_dim)

# my customized trainer and aggregator
trainer = MyModelTrainer(model, args)
aggregator = MyServerAggregator(model, args)

# start training
fedml_runner = FedMLRunner(args, device, dataset, model, trainer, aggregator)
fedml_runner.run()

The FedML library is still a work in progress and is constantly being improved. To this end, FedML abstracts the core trainer and aggregator and provides users with two abstract objects, FedML.core.ClientTrainer and FedML.core.ServerAggregator. Users only need to inherit from these two abstract objects, implement their interfaces, and pass them to FedMLRunner. Such customization provides ML developers with maximum flexibility: you can define arbitrary model structures, optimizers, loss functions, and more. These customizations can also be seamlessly connected with the open-source community, open platform, and application ecology mentioned earlier with the help of FedMLRunner, which closes the long lag from innovative algorithms to commercialization.

Finally, as shown in the preceding figure, FedML supports distributed computing processes, such as complex security protocols and distributed training, as a Directed Acyclic Graph (DAG) flow computing process, making the writing of complex protocols similar to writing standalone programs. Based on this idea, the security protocol (Flow Layer 1) and the ML algorithm process (Flow Layer 2) can be easily separated, so that security engineers and ML engineers can work independently while maintaining a modular architecture.

The FedML open-source library supports federated ML use cases at the edge as well as in the cloud. On the edge, the framework facilitates training and deployment of edge models to mobile phones and Internet of Things (IoT) devices. In the cloud, it enables global collaborative ML, including multi-Region and multi-tenant public cloud aggregation servers, as well as private cloud deployment in Docker mode. The framework addresses key concerns in privacy-preserving FL, such as security, privacy, efficiency, weak supervision, and fairness.

Conclusion

In this post, we showed how you can deploy the open-source FedML framework on AWS. This allows you to train an ML model on distributed data, without the need to share or move it. We set up a multi-account architecture, where in a real-world scenario, organizations can join the ecosystem to benefit from collaborative learning while maintaining data governance. In the next post, we use the multi-hospital eICU dataset to demonstrate its effectiveness in a real-world scenario.

For a detailed walkthrough of this solution, see the re:MARS 2022 presentation “Managed Federated Learning on AWS: A case study for healthcare.”

References

[1] Kaissis, G.A., Makowski, M.R., Rückert, D. et al. Secure, privacy-preserving and federated machine learning in medical imaging. Nat Mach Intell 2, 305–311 (2020). https://doi.org/10.1038/s42256-020-0186-1
[2] FedML https://fedml.ai


About the Authors

Olivia Choudhury, PhD, is a Senior Partner Solutions Architect at AWS. She helps partners, in the Healthcare and Life Sciences domain, design, develop, and scale state-of-the-art solutions leveraging AWS. She has a background in genomics, healthcare analytics, federated learning, and privacy-preserving machine learning. Outside of work, she plays board games, paints landscapes, and collects manga.

Vidya Sagar Ravipati is a Manager at the Amazon ML Solutions Lab, where he leverages his vast experience in large-scale distributed systems and his passion for machine learning to help AWS customers across different industry verticals accelerate their AI and cloud adoption. Previously, he was a Machine Learning Engineer in Connectivity Services at Amazon who helped to build personalization and predictive maintenance platforms.

Wajahat Aziz is a Principal Machine Learning and HPC Solutions Architect at AWS, where he focuses on helping healthcare and life sciences customers leverage AWS technologies for developing state-of-the-art ML and HPC solutions for a wide variety of use cases such as Drug Development, Clinical Trials, and Privacy Preserving Machine Learning. Outside of work, Wajahat likes to explore nature, hiking, and reading.

Divya Bhargavi is a Data Scientist and Media and Entertainment Vertical Lead at the Amazon ML Solutions Lab, where she solves high-value business problems for AWS customers using Machine Learning. She works on image/video understanding, knowledge graph recommendation systems, predictive advertising use cases.

Ujjwal Ratan is the leader for AI/ML and Data Science in the AWS Healthcare and Life Science Business Unit and is also a Principal AI/ML Solutions Architect. Over the years, Ujjwal has been a thought leader in the healthcare and life sciences industry, helping multiple Global Fortune 500 organizations achieve their innovation goals by adopting machine learning. His work involving the analysis of medical imaging, unstructured clinical text and genomics has helped AWS build products and services that provide highly personalized and precisely targeted diagnostics and therapeutics. In his free time, he enjoys listening to (and playing) music and taking unplanned road trips with his family.

Chaoyang He is Co-founder and CTO of FedML, Inc., a startup building a community for open and collaborative AI from anywhere at any scale. His research focuses on distributed/federated machine learning algorithms, systems, and applications. He received his Ph.D. in Computer Science from the University of Southern California, Los Angeles, USA.

Salman Avestimehr is a Professor, the inaugural director of the USC-Amazon Center for Secure and Trusted Machine Learning (Trusted AI), and the director of the Information Theory and Machine Learning (vITAL) research lab at the Electrical and Computer Engineering Department and Computer Science Department of the University of Southern California. He is also the co-founder and CEO of FedML. He received his Ph.D. in Electrical Engineering and Computer Sciences from UC Berkeley in 2008. His research focuses on information theory, decentralized and federated machine learning, and secure and privacy-preserving learning and computing.

Read More

Multilingual customer support translation made easy on Salesforce Service Cloud using Amazon Translate

Multilingual customer support translation made easy on Salesforce Service Cloud using Amazon Translate

This post was co-authored with Mark Lott, Distinguished Technical Architect, Salesforce, Inc.

Enterprises that operate globally often struggle to source customer support professionals with multilingual experience. Hiring them can be cost-prohibitive and difficult to scale, leading many enterprises to support only English for chats. Using human interpreters for translation support is expensive, and infeasible because chats need real-time translation. Adding multilingual machine translation to these customer support chat workflows provides a cost-effective, scalable option that improves the customer experience with automated translations for users and agents, creates a more inclusive experience, and improves brand loyalty.

Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. Service Cloud by Salesforce is one of the world’s most popular and highly rated customer service software solutions. Whether by phone, web, chat, or email, this customer support software enables agents and customers to quickly connect and solve customer problems. AWS and Salesforce have been in a strategic partnership since 2016, and are working together to innovate on behalf of customers.

In this post, we demonstrate how to link Salesforce and AWS in real time and use Amazon Translate from within Service Cloud.

Solution overview

The following diagram shows the solution architecture.

Solution overview diagram

There are two personas. The contact center agent persona uses the Service Cloud console, and the customer persona initiates the chat session via a customer support portal enabled by Salesforce Experience Cloud.

The solution is composed of the following components:

  1. A Lightning Web Component that implements a custom header for the customer chat. This component lets the customer toggle between languages.
  2. A Lightning Web Component that overrides the chat for the customer and invokes Amazon Translate to translate the text in real time. This is also referred to as a snap-in.
  3. An Aura-based web component that provides real-time chat translation services to the call center agent.
  4. A Salesforce Apex Callout class, which makes real-time calls to AWS to translate chat messages for the agent and the customer.
  5. Amazon API Gateway with AWS Lambda integration that converts the input text to the target language using the Amazon Translate SDK.
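To make the last integration concrete, the following is a minimal sketch of what the Lambda function behind API Gateway could look like. The request and response field names (text, sourceLanguage, targetLanguage, translatedText) are assumptions for illustration only; the sample repository deployed with the AWS CDK defines the actual contract used by the Apex callout.

import json

import boto3

translate = boto3.client("translate")


def lambda_handler(event, context):
    # Field names below are illustrative assumptions, not the repository's contract
    body = json.loads(event.get("body") or "{}")
    result = translate.translate_text(
        Text=body["text"],
        SourceLanguageCode=body.get("sourceLanguage", "auto"),
        TargetLanguageCode=body["targetLanguage"],
    )
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "translatedText": result["TranslatedText"],
            "detectedSourceLanguage": result["SourceLanguageCode"],
        }),
    }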

Prerequisites

This solution has the following prerequisites:

Deploy resources using the AWS CDK

You can deploy the resources using the AWS CDK, an open-source development framework that lets developers define cloud resources using familiar programming languages. The following steps set up API Gateway, Lambda, and Amazon Translate resources using the AWS CDK. It may take up to 15 minutes to complete the deployment.

  1. From a command prompt, run the following commands:
git clone https://github.com/aws-samples/amazon-translate-service-cloud-chat.git
cd amazon-translate-service-cloud-chat/aws
npm i -g aws-cdk
npm i
cdk deploy
  2. Take note of the API key and the API endpoint created during the deployment. You need those values later when configuring Salesforce to communicate with API Gateway.

Configure Salesforce Service Cloud

In this section, you use the Service Setup Assistant to enable an out-of-the-box Service Cloud app with optimal settings and layouts. To configure Service Cloud, complete the following steps:

  1. Log in to your Salesforce org, choose the gear icon, and choose Service Setup (the purple gear icon).
  2. Under Open the Service Setup Assistant, choose Go to Assistant.
  3. On the Service Setup Assistant page, in the Turn on your Service app section, toggle Service Setup Assistant to On.

This process may take a couple of minutes to complete. You can choose Check Status to see if the job is finished.

  4. When the status shows Ready, choose Get Started.
  5. Choose Yes, Let’s Do It.
  6. Ignore the Personalize Service section.

At this point, we have enabled Service Cloud.

Enable Salesforce Sites

Salesforce Sites lets you create public websites that are integrated with your Salesforce org. In this step, you register a Salesforce Sites domain, which you customize to embed a chat component that allows the customer persona to engage with the agent. To enable Salesforce Sites, complete the following steps:

  1. Log in to your Salesforce org.
  2. Choose the gear icon and choose Setup.
  3. Under User Interface, choose Sites and Domains, then choose Sites.
  4. Select the check box accepting the Sites terms of service and choose Register My Salesforce Site Domain.
  5. If a pop-up window appears, choose OK.
  6. Make a note of the URL under Sample Domain Name. You need this information in the next step.

Configure Salesforce Chat

In this step, you use Service Setup to configure Salesforce Chat. This walks you through a setup wizard to create chat queues, a team that the agent belongs to, and prioritization. To configure Salesforce Chat, complete the following steps:

  1. Choose the gear icon and choose Service Setup.
  2. Within the Service Setup home page, choose View All under Recommended Setup.

A dialog box opens with a list of configuration wizards.

  3. Choose the Chat with Customers configuration wizard, either by scrolling down or entering chat in the search box, then choose Start.
  4. In the Create a chat queue section, enter ChatQueue for Queue Name, and Chat Team for Name This Group.
  5. Select yourself as a member of the chat team and choose Next.

This allows your developer edition user account to be an agent within the Service Console.

  6. In the Prioritize chats with your other work section, set the ChatQueue priority to 1 and choose Next.
  7. In the Adjust your agents’ chat workload section, accept the defaults and choose Next.
  8. In the Let’s make chat work on your website section, enter the URL you saved (add https://) and choose Next.
  9. In the What’s your type? section, choose Just Contacts, then choose Next.
  10. In the In case your team’s busy section, accept the defaults and choose Next.

You don’t need the code snippet because we will drag and drop the predefined chat component in the next section.

  11. Choose Next followed by Done.

Configure your customer support digital experience

In this section, you configure the digital experience (the customer persona’s view) to embed a chat widget that the customer will use when they need help. To configure the digital experience, complete the following steps:

  1. Choose the gear icon followed by Setup.
  2. Under Digital Experiences, choose All Sites.
  3. In the Action column under All Sites, choose the Builder link.
  4. In the navigation pane, choose Components, and search for chat.
  5. Drag Embedded Service Chat to the Content Footer section, which requires you to scroll the window while dragging.
  6. You may see a pop-up indicating you cannot access the resources due to a Content Security Policy (CSP) issue. Ignore these errors, and choose OK. We will address these errors in the next step.
  7. Choose the settings gear in the navigation pane, then choose Security & Privacy.
  8. Under Content Security Policy (CSP), change Security Level to Relaxed CSP.
  9. Accept any pop-ups asking confirmation and ignore any errors.
  10. Under CSP Errors, identify the blocked resources, choose the Allow URL, and choose Allow on any confirmation dialog. This gets rid of the CSP error pop-ups.
  11. Close the security setting screen, then choose Publish, then Got it in the resultant dialog.
  12. If you continue to get CSP errors, go back to the security settings and manually choose Allow URL for the sites that were blocked under CSP Errors.
  13. Choose the Workspaces icon.
  14. Choose Administration.
  15. Choose Settings, then choose Activate, followed by OK.

Customize Salesforce Chat

You add yourself as a valid user for the CodeBuilder permission set, which lets you create and launch a Salesforce Code Builder project. You then deploy the customizations using the Salesforce CLI. Finally, you (unit) test that the translation is working as intended. To customize chat, complete the following steps:

  1. Choose the gear icon and choose Setup.
  2. Search for Permission Sets and then choose CodeBuilder on the Permission Sets page.
  3. Choose Manage Assignments, followed by Add Assignments.
  4. Choose yourself by selecting your name or login.
  5. Choose Next, then Assign, then Done.

Your name is now listed under Current Assignments.

  6. Under App Launcher, choose Code Builder (Beta).
  7. Choose Get Started, followed by New Project.
  8. Enter amazon-translate-service for Project Name and Empty for Project Type.
  9. Choose Next.
  10. Choose Connect a Development Org, then choose Next.
  11. If prompted, log in again using the credentials for your development org.
  12. Enter amazon-translate-service for Org Alias and choose Create.

It takes a few minutes to create the environment.

  13. When the environment is available, choose Launch.
  14. On the Terminal tab, enter the following commands:
git init
git remote add origin https://github.com/aws-samples/amazon-translate-service-cloud-chat.git
git fetch origin
git checkout main -f
cd salesforce
  15. In the navigation pane, open and edit the file force-app/main/default/externalCredentials/TranslationServiceExtCred.externalCredential-meta.xml.
  16. Replace parameterValue of the AuthHeader parameterType with your API key.
  17. Save the file.
  18. Edit the file force-app/main/default/namedCredentials/TranslateService.namedCredential-meta.xml.
  19. Replace parameterValue of the Url parameterType with your API Gateway URL.
  20. Save the file.
  21. On the Terminal tab, enter the following commands:
sfdx force:source:deploy --sourcepath ./force-app/main/default
sfdx force:apex:execute -f ./scripts/apex/addUsersToPermSet.apex
sfdx force:apex:execute -f ./scripts/apex/testTranslation.apex

The first command pushes the code and metadata into your Salesforce developer environment.

The second command runs a script that assigns your user to a permission set within your Salesforce developer environment. Each user has to be authorized to use the named credential, which contains the information necessary to connect to AWS.

The last command runs a script that tests the integration between your Salesforce developer environment and the Amazon Translate service. If everything is configured correctly and deployed successfully, you will see that Salesforce can now call Amazon Translate.

Now that we’ve configured, pushed, and tested the project, it’s time to configure the Salesforce user interface to include the translation web components.

  1. Choose the gear icon and choose Setup.
  2. Under Service, choose Embedded Service, then choose Embedded Service Deployments.
  3. For Chat Team, choose View.
  4. For Chat Settings, choose Edit.
  5. Under Customize with Lightning Components, choose Edit.
  6. Choose translationHeaderSnapin for Chat Header and translationSnapin for Chat Messages (Text).
  7. Choose Save.

Configure the components in the Agent’s desktop interface

You now create a new Lightning app page and add a custom component that displays the customer’s translated messages. To configure the agent’s desktop interface, complete the following steps:

  1. Choose the gear icon and choose Setup.
  2. Choose User Interface, then Lightning App Builder.
  3. Choose New in the Lightning Pages section.
  4. Choose Record Page, then choose Next.
  5. Choose Translation Chat Transcript for Label and Chat Transcript for Object.
  6. Choose Next.
  7. Choose Header and Two Equal Regions as the page template and choose Finish.
  8. Drag the Conversation component into the left-hand view and the TranslationReceiver component into the right-hand view.
  9. Choose Save, then choose Activate.
  10. Choose Assign as Org Default, then choose Desktop, and Next.
  11. Review the assignment and choose Save.
  12. Exit from the Lightning App Builder by choosing Save.

Test the translation feature

It’s time to test the feature. The easiest way is to open two browsers side by side: the first set up as the agent, and the second as the customer. Make sure you set the customer persona’s language to a language other than English, and initiate the chat by choosing Chat with an Expert. Complete the following steps to initiate a conversation:

  1. Under App Launcher, choose Service Console.
  2. Choose Omni-Channel to open the agent interface.
  3. Make yourself available by choosing Available – Chat as your status.
  4. Open a separate tab or browser and choose Setup.
  5. Choose Digital Experiences, then All Sites.
  6. Choose the URL to launch the customer view.
  7. Choose Chat with an Expert, and choose es (Spanish) in the language drop-down menu at the top of the chat pane.
  8. Provide your name and email.
  9. Choose Start Chatting.
  10. Go to the agent tab and accept the incoming chat.
  11. You can now chat back and forth, with the customer writing in Spanish (or another supported language) and the agent writing in English.

Clean up

To clean up your resources, complete the following steps:

  1. Run cdk destroy to delete the provisioned resources.
  2. Follow the instructions in Deactivate a Developer Edition Org to deactivate your Salesforce Developer org.

Conclusion

In this post, we demonstrated how you can set up and configure real-time translations powered by Amazon Translate for Salesforce Service Cloud chat conversations. The combination of Salesforce Service Cloud and Amazon Translate enables a scalable, cost-effective solution for your customer support agents to communicate in real time with customers in their preferred languages. Amazon Translate can help you scale this solution to support over 5,550 translation pairs out of the box.

For more details about Amazon Translate, visit Amazon Translate resources to find video resources and blog posts, and also refer to Amazon Translate FAQs. If you’re new to Amazon Translate, try it out using the Free Tier, which offers up to 2 million characters per month for free for the first 12 months, starting from your first translation request.


About the Authors

Mark Lott is a Distinguished Technical Architect at Salesforce. He has over 25 years working in the software industry and works with customers of all sizes to design custom solutions using the Salesforce platform.

Kishore Dhamodaran is a Senior Solutions Architect at AWS. Kishore helps strategic customers with their cloud enterprise strategy and migration journey, leveraging his years of industry and cloud experience.

Tim McLaughlin is a Product Manager at Amazon Web Services in the AWS Language AI Services team. He works closely with customers around the world by supporting their AWS adoption journey with Language AI services.

Jared Wiener is a Solutions Architect at AWS.

Read More

Redacting PII data at The Very Group with Amazon Comprehend

Redacting PII data at The Very Group with Amazon Comprehend

This is a guest post by Andy Whittle, Principal Platform Engineer – Application & Reliability Frameworks at The Very Group.

At The Very Group, which operates digital retailer Very, security is a top priority in handling data for millions of customers. Part of how The Very Group secures and tracks business operations is through activity logging between business systems (for example, across the stages of a customer order). It is a critical operating requirement and enables The Very Group to trace incidents and proactively identify problems and trends. However, this can mean processing customer data in the form of personally identifiable information (PII) in relation to activities such as purchases, returns, use of flexible payment options, and account management.

In this post, The Very Group shows how they use Amazon Comprehend to add a further layer of automated defense, on top of policies that design threat modelling into all systems, and prevent PII from being sent in log data to Elasticsearch for indexing. Amazon Comprehend is a fully managed and continuously trained natural language processing (NLP) service that can extract insights about the content of a document or text.

Overview of solution

The overriding goal for The Very Group’s engineering team was to prevent any PII data from reaching documents within Elasticsearch. To accomplish this and automate removal of PII from millions of identified records per day, The Very Group’s engineering team created an Application Observability module in Terraform. This module implements an observability solution, including application logs, application performance monitoring (APM), and metrics. Within the module, the team used Amazon Comprehend to highlight PII within log data with the option of removing it before sending to Elasticsearch.

Amazon Comprehend was identified as part of an internal platform engineering initiative to investigate how AWS AI services can be used to improve efficiency and reduce risk in repetitive business activities. The Very Group’s culture of learning and experimentation meant Amazon Comprehend was evaluated for applicability using a Java application with test PII data. The team used code examples in the documentation to accelerate the proof of concept and demonstrated its potential within a day.

The engineering team developed a schematic demonstrating how a PII redaction service could integrate with The Very Group’s logging. It involved developing a microservice that calls Amazon Comprehend to detect PII data. The solution passes The Very Group’s log data through a Logstash instance running on AWS Fargate, which cleanses the data using another Fargate-hosted pii-logstash-redaction service: a Spring Boot Java application that calls Amazon Comprehend to remove PII. The following diagram illustrates this architecture.

Very Group Comprehend PII Redaction Architecture Diagram

The Very Group’s solution takes logs from Amazon CloudWatch and Amazon Elastic Container Service (Amazon ECS) and passes cleansed versions to Elasticsearch to be indexed. Amazon Kinesis is used in the solution to capture and store logs for short periods, with Logstash pulling logs down every few seconds.

Logs are sourced across the many business processes, including ordering, returns, and financial services. They include logs from over 200 Amazon ECS apps across test and production environments in Fargate that push logs into Logstash. Another source is AWS Lambda logs, which are pulled into Kinesis and then into Logstash. Finally, a separate standalone instance of Filebeat pulls log analysis data and puts it into CloudWatch and then into Logstash. The result is that many log sources are pulled or pushed into Logstash and processed by the Application Observability module and Amazon Comprehend before being stored in Elasticsearch.

A separate Terraform module provides all the infrastructure required to stand up a Logstash service capable of exporting logs from CloudWatch log groups into Elasticsearch via an AWS PrivateLink VPC endpoint. The Logstash service can also be integrated with Amazon ECS via a firelens log configuration, with Amazon ECS establishing connectivity over an Amazon Route 53 record. Scalability is built in: Kinesis scales on demand (the team started with fixed shards but is now switching to on-demand capacity), and Logstash scales out with additional Amazon Elastic Compute Cloud (Amazon EC2) instances behind a Network Load Balancer, which accommodates the protocols used by Filebeat and enables Logstash to pull logs from Kinesis more effectively.

Finally, the Logstash service consists of a task definition containing a Logstash container and PII redaction container, ensuring the removal of PII prior to exporting to Elasticsearch.

Results

The engineering team was able to build and test the solution within a week, without needing to understand machine learning (ML) or the inner workings of AI, using Amazon Comprehend video guidance, API reference documentation, and example code. Because business value was demonstrated so quickly, the business product owners have begun developing new use cases to take advantage of the service. Some decisions had to be made to enable the solution. Although the platform engineering team knew they could redact the data, they wanted to intercept the logs from the current solution (based on a Fluent Bit sidecar that redirects logs to an endpoint). They decided to adopt Logstash to intercept log fields through pipelines that integrate with their PII service (comprising the Terraform module and Java service).

The adoption of Logstash went smoothly. The Very Group engineering squads are now using the service directly through an API endpoint to put logs straight into Elasticsearch. This has allowed them to switch from the sidecar to the new endpoint and deploy it through the Terraform module. The only problem the team encountered came from initial tests, which revealed throughput issues under peak trading loads. This was overcome through adjustments to the Java code.

The following code shows how The Very Group use Amazon Comprehend to remove PII from log messages. It detects any PII and creates a list of entity types to record. To accelerate development, the code was taken from the AWS documentation and adapted for use in the Java application service deployed on Fargate.

        
private List<EntityLabel> getEntityLabels(String logData) {
    // Ask Amazon Comprehend which PII entity types the log message contains
    ContainsPiiEntitiesRequest request = ContainsPiiEntitiesRequest.builder()
            .languageCode(LanguageCode.EN)
            .text(logData)
            .build();

    ContainsPiiEntitiesResponse response = comprehendClient.containsPiiEntities(request);

    // Keep only labels that clear the confidence threshold and are not excluded by configuration
    List<EntityLabel> labels = new ArrayList<>();
    if (response != null && response.hasLabels() && !response.labels().isEmpty()) {
        for (EntityLabel el : response.labels()) {
            if (el.score() > minScore && !redactionConfig.getComprehendExcludedTypes().contains(el.nameAsString())) {
                labels.add(el);
            }
        }
    }
    return labels;
}
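For readers who want to prototype the same check outside the JVM, the following is a rough Boto3 equivalent. It is an illustrative sketch, not The Very Group’s production code; the minimum score and excluded types are hypothetical parameters mirroring the Java configuration above.

import boto3

comprehend = boto3.client("comprehend")


def get_entity_labels(log_data, min_score=0.7, excluded_types=()):
    # Ask Amazon Comprehend which PII entity types the text contains
    response = comprehend.contains_pii_entities(Text=log_data, LanguageCode="en")
    # Keep labels above the (hypothetical) confidence threshold that aren't excluded
    return [
        label["Name"]
        for label in response.get("Labels", [])
        if label["Score"] > min_score and label["Name"] not in excluded_types
    ]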

The following screenshot shows the output sent to Elasticsearch as part of the PII redaction process. The service generates around 1 million records per day, creating a record each time a redaction is made.

PII redacted output record sent to Elasticsearch

The log message is redacted, and the redacted_entities field contains a list of the entity types found in the message. In this case, the service found a URL, but it could have identified any of the built-in PII types. A bespoke PII type for customer account numbers was also added through Amazon Comprehend, but hasn’t been needed so far. How engineering squads can apply their own overrides is documented in GitHub.

Conclusion

This project allowed The Very Group to implement a quick and simple solution for redacting sensitive PII in logs. The engineering team added further flexibility by allowing overrides for entity types, using Amazon Comprehend to redact PII according to business needs. In the future, the engineering team is looking into training custom Amazon Comprehend entities to redact strings such as our customer IDs.

As a result, The Very Group can pass logs through without worrying about PII exposure. The solution enforces the policy of not storing PII in logs, thereby reducing risk and improving compliance. Furthermore, metadata about the redactions is reported back to the business through an Elasticsearch dashboard, enabling alerts and further action.

Make time to assess AWS AI/ML services that your organization hasn’t used yet and foster a culture of experimentation. Starting simple can quickly lead to business benefit, just as The Very Group proved.


About the Author

Andy Whittle is Principal Platform Engineer – Application & Reliability Frameworks at The Very Group, which operates UK-based digital retailer Very. Andy helps deliver performance monitoring across the organization’s tribes, and has a particular interest in application monitoring, observability, and performance. Since joining Very in 1998, Andy has undertaken a wide variety of roles covering content management and catalog production, stock management, production support, DevOps, and Fusion Middleware. For the past 4 years, he has been part of the platform engineering team.

Read More

Enriching real-time news streams with the Refinitiv Data Library, AWS services, and Amazon SageMaker

Enriching real-time news streams with the Refinitiv Data Library, AWS services, and Amazon SageMaker

This post is co-authored by Marios Skevofylakas, Jason Ramchandani and Haykaz Aramyan from Refinitiv, An LSEG Business.

Financial service providers often need to identify relevant news, analyze it, extract insights, and take actions in real time, like trading specific instruments (such as commodities, shares, funds) based on additional information or context of the news item. One such additional piece of information (which we use as an example in this post) is the sentiment of the news.

Refinitiv Data (RD) Libraries provide a comprehensive set of interfaces for uniform access to the Refinitiv Data Catalogue. The library offers multiple layers of abstraction providing different styles and programming techniques suitable for all developers, from low-latency, real-time access to batch ingestions of Refinitiv data.

In this post, we present a prototype AWS architecture that ingests our news feeds using RD Libraries and enhances them with machine learning (ML) model predictions using Amazon SageMaker, a fully managed ML service from AWS.

In an effort to design a modular architecture that could be used in a variety of use cases, like sentiment analysis, named entity recognition, and more, regardless of the ML model used for enhancement, we decided to focus on the real-time space. The reason for this decision is that real-time use cases are generally more complex and that the same architecture can also be used, with minimal adjustments, for batch inference. In our use case, we implement an architecture that ingests our real-time news feed, calculates sentiment on each news headline using ML, and re-serves the AI enhanced feed through a publisher/subscriber architecture.

Moreover, to present a comprehensive and reusable way to productionize ML models by adopting MLOps practices, we introduce the concept of infrastructure as code (IaC) during the entire MLOps lifecycle of the prototype. By using Terraform and a single entry point configurable script, we are able to instantiate the entire infrastructure, in production mode, on AWS in just a few minutes.

In this solution, we don’t address the MLOps aspect of the development, training, and deployment of the individual models. If you’re interested in learning more on this, refer to MLOps foundation roadmap for enterprises with Amazon SageMaker, which explains in detail a framework for model building, training, and deployment following best practices.

Solution overview

In this prototype, we follow a fully automated provisioning methodology in accordance with IaC best practices. IaC is the process of provisioning resources programmatically using automated scripts rather than using interactive configuration tools. Resources can include both hardware and the required software. In our case, we use Terraform to accomplish the implementation of a single configurable entry point that can automatically spin up the entire infrastructure we need, including security and access policies, as well as automated monitoring. With this single entry point that triggers a collection of Terraform scripts, one per service or resource entity, we can fully automate the lifecycle of all or parts of the components of the architecture, allowing us to implement granular control both on the DevOps as well as the MLOps side. After Terraform is correctly installed and integrated with AWS, we can replicate most operations that can be done on the AWS service dashboards.

The following diagram illustrates our solution architecture.

The architecture consists of three stages: ingestion, enrichment, and publishing. During the first stage, the real-time feeds are ingested on an Amazon Elastic Compute Cloud (Amazon EC2) instance that is created through a Refinitiv Data Library-ready AMI. The instance also connects to a data stream via Amazon Kinesis Data Streams, which triggers an AWS Lambda function.

In the second stage, the Lambda function that is triggered from Kinesis Data Streams connects to and sends the news headlines to a SageMaker FinBERT endpoint, which returns the calculated sentiment for the news item. This calculated sentiment is the enrichment of the real-time data; the Lambda function wraps the news item with it and stores the result in an Amazon DynamoDB table.
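The following is a minimal sketch of what that second-stage Lambda function could look like. The endpoint name, table name, and payload field names are illustrative assumptions; the prototype’s Terraform scripts and blueprints define the actual values.

import base64
import json
import os

import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")
table = boto3.resource("dynamodb").Table(os.environ.get("NEWS_TABLE", "news-sentiment"))
ENDPOINT_NAME = os.environ.get("SM_ENDPOINT", "finbert-endpoint")


def lambda_handler(event, context):
    for record in event["Records"]:
        # Kinesis delivers the news item base64-encoded
        news_item = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Score the headline against the FinBERT endpoint
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="application/json",
            Body=json.dumps({"inputs": news_item["headline"]}),
        )
        sentiment = json.loads(response["Body"].read())

        # Wrap the original item with the enrichment and persist it; the item must
        # include the table's partition key, and the sentiment is stored as a JSON
        # string to avoid DynamoDB's restriction on float types
        news_item["sentiment"] = json.dumps(sentiment)
        table.put_item(Item=news_item)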

In the third stage of the architecture, new item inserts into DynamoDB trigger a Lambda function through a DynamoDB stream; this function is integrated with an Amazon MQ server running RabbitMQ, which re-serves the AI enhanced stream.

We chose this three-stage design, rather than having the first Lambda layer communicate directly with the Amazon MQ server or implementing more functionality in the EC2 instance, to enable exploration of more complex, less coupled AI design architectures in the future.

Building and deploying the prototype

We present this prototype in a series of three detailed blueprints. In each blueprint and for every service used, you will find overviews and relevant information on its technical implementations as well as Terraform scripts that allow you to automatically start, configure, and integrate the service with the rest of the structure. At the end of each blueprint, you will find instructions on how to make sure that everything is working as expected up to each stage. The blueprints are as follows:

To start the implementation of this prototype, we suggest creating a new Python environment dedicated to it and installing the necessary packages and tools separately from other environments you may have. To do so, create and activate the new environment in Anaconda using the following commands:

conda create --name rd_news_aws_terraform python=3.7
conda activate rd_news_aws_terraform

We’re now ready to install the AWS Command Line Interface (AWS CLI) toolset that will allow us to build all the necessary programmatic interactions in and between AWS services:

pip install awscli

Now that the AWS CLI is installed, we need to install Terraform. HashiCorp provides Terraform with a binary installer, which you can download and install.

After you have both tools installed, ensure that they properly work using the following commands:

terraform -help
aws --version

You’re now ready to follow the detailed blueprints on each of the three stages of the implementation.

Blueprint I: Real-time news ingestion using Amazon EC2 and Kinesis Data Streams

This blueprint represents the initial stages of the architecture that allow us to ingest the real-time news feeds. It consists of the following components:

  • Amazon EC2 preparing your instance for RD News ingestion – This section sets up an EC2 instance in a way that it enables the connection to the RD Libraries API and the real-time stream. We also show how to save the image of the created instance to ensure its reusability and scalability.
  • Real-time news ingestion from Amazon EC2 – A detailed implementation of the configurations needed to enable Amazon EC2 to connect the RD Libraries as well as the scripts to start the ingestion.
  • Creating and launching Amazon EC2 from the AMI – Launch a new instance by simultaneously transferring ingestion files to the newly created instance, all automatically using Terraform.
  • Creating a Kinesis data stream – This section provides an overview of Kinesis Data Streams and how to set up a stream on AWS.
  • Connecting and pushing data to Kinesis – Once the ingestion code is working, we need to connect it and send data to a Kinesis stream (a minimal sketch follows this list).
  • Testing the prototype so far – We use Amazon CloudWatch and command line tools to verify that the prototype is working up to this point and that we can continue to the next blueprint. The log of ingested data should look like the following screenshot.
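As referenced in the “Connecting and pushing data to Kinesis” step, the following is a minimal sketch of pushing an ingested news item to the Kinesis data stream with Boto3. The stream name and the shape of the news item are assumptions for illustration; the blueprint’s ingestion scripts define the real ones.

import json

import boto3

kinesis = boto3.client("kinesis")


def push_news_item(news_item, stream_name="rd-news-stream"):
    # Use a stable field from the news item as the partition key so related
    # records land on the same shard
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(news_item).encode("utf-8"),
        PartitionKey=str(news_item.get("id", "news")),
    )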

Blueprint II: Real-time serverless AI news sentiment analysis using Kinesis Data Streams, Lambda, and SageMaker

In this second blueprint, we focus on the main part of the architecture: the Lambda function that ingests and analyzes the news item stream, attaches the AI inference to it, and stores it for further use. It includes the following components:

  • Lambda – Define a Terraform Lambda configuration allowing it to connect to a SageMaker endpoint.
  • Amazon S3 – To implement Lambda, we need to upload the appropriate code to Amazon Simple Storage Service (Amazon S3) and allow the Lambda function to ingest it in its environment. This section describes how we can use Terraform to accomplish that.
  • Implementing the Lambda function: Step 1, Handling the Kinesis event – In this section, we start building the Lambda function. Here, we build the Kinesis data stream response handler part only.
  • SageMaker – In this prototype, we use a pre-trained Hugging Face model that we deploy to a SageMaker endpoint. Here, we present how this can be achieved using Terraform scripts and how the appropriate integrations take place to allow SageMaker endpoints and Lambda functions to work together.
    • At this point, you can instead use any other model that you have developed and deployed behind a SageMaker endpoint. Such a model could provide a different enhancement to the original news data, based on your needs. Optionally, this can be extrapolated to multiple models for multiple enhancements if such exist. Thanks to the rest of the architecture, any such models will enrich your data sources in real time.
  • Building the Lambda function: Step 2, Invoking the SageMaker endpoint – In this section, we build up our original Lambda function by adding the SageMaker block to get a sentiment enhanced news headline by invoking the SageMaker endpoint.
  • DynamoDB – Finally, when the AI inference is in the memory of the Lambda function, it re-bundles the item and sends it to a DynamoDB table for storage. Here, we discuss both the appropriate Python code needed to accomplish that, as well as the necessary Terraform scripts that enable these interactions.
  • Building the Lambda function: Step 3, Pushing enhanced data to DynamoDB – Here, we continue building up our Lambda function by adding the last part, which creates an entry in the DynamoDB table.
  • Testing the prototype so far – We can navigate to the DynamoDB table on the DynamoDB console to verify that our enhancements are appearing in the table.

Blueprint III: Real-time streaming using DynamoDB Streams, Lambda, and Amazon MQ

This third blueprint finalizes the prototype. It focuses on redistributing the newly created, AI enhanced data item to a RabbitMQ server in Amazon MQ, allowing consumers to connect and retrieve the enhanced news items in real time. It includes the following components:

  • DynamoDB Streams – When the enhanced news item is in DynamoDB, we set up an event getting triggered that can then be captured from the appropriate Lambda function.
  • Writing the Lambda producer – This Lambda function captures the event and acts as a producer of the RabbitMQ stream. This new function introduces the concept of Lambda layers, because it uses Python libraries to implement the producer functionality (a minimal sketch follows this list).
  • Amazon MQ and RabbitMQ consumers – The final step of the prototype is setting up the RabbitMQ service and implementing an example consumer that will connect to the message stream and receive the AI enhanced news items.
  • Final test of the prototype – We use an end-to-end process to verify that the prototype is fully working, from ingestion to re-serving and consuming the new AI-enhanced stream.
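As referenced in the “Writing the Lambda producer” step, the following is a minimal sketch of such a producer using the pika library packaged in a Lambda layer. The queue name, environment variable, and use of pika are illustrative assumptions; the blueprint documents the exact implementation.

import json
import os

import pika


def lambda_handler(event, context):
    # An amqps:// URL enables TLS, which Amazon MQ brokers require
    params = pika.URLParameters(os.environ["AMQP_URL"])
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue="enriched-news", durable=True)

    for record in event["Records"]:
        # Only republish newly inserted (AI-enriched) items
        if record["eventName"] != "INSERT":
            continue
        new_image = record["dynamodb"]["NewImage"]  # typed DynamoDB attribute format
        channel.basic_publish(
            exchange="",
            routing_key="enriched-news",
            body=json.dumps(new_image),
        )

    connection.close()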

At this stage, you can validate that everything has been working by navigating to the RabbitMQ dashboard, as shown in the following screenshot.

In the final blueprint, you also find a detailed test vector to make sure that the entire architecture is behaving as planned.

Conclusion

In this post, we shared a solution using ML on the cloud with AWS services like SageMaker (ML), Lambda (serverless), and Kinesis Data Streams (streaming) to enrich streaming news data provided by Refinitiv Data Libraries. The solution adds a sentiment score to news items in real time and scales the infrastructure using code.

The benefit of this modular architecture is that you can reuse it with your own model to perform other types of data augmentation, in a serverless, scalable, and cost-efficient way that can be applied on top of the Refinitiv Data Library. This can add value for trading, investment, and risk management workflows.

If you have any comments or questions, please leave them in the comments section.

Related Information


 About the Authors

Marios Skevofylakas comes from a financial services, investment banking and consulting technology background. He holds an engineering Ph.D. in Artificial Intelligence and an M.Sc. in Machine Vision. Throughout his career, he has participated in numerous multidisciplinary AI and DLT projects. He is currently a Developer Advocate with Refinitiv, an LSEG business, focusing on AI and Quantum applications in financial services.

Jason Ramchandani has worked at Refinitiv, an LSEG Business, for 8 years as Lead Developer Advocate helping to build their Developer Community. Previously he has worked in financial markets for over 15 years with a quant background in the equity/equity-linked space at Okasan Securities, Sakura Finance and Jefferies LLC. His alma mater is UCL.

Haykaz Aramyan comes from a finance and technology background. He holds a Ph.D. in Finance, and an M.Sc. in Finance, Technology and Policy. Through 10 years of professional experience Haykaz worked on several multidisciplinary projects involving pension, VC funds and technology startups. He is currently a Developer Advocate with Refinitiv, An LSEG Business, focusing on AI applications in financial services.

Georgios Schinas is a Senior Specialist Solutions Architect for AI/ML in the EMEA region. He is based in London and works closely with customers in UK and Ireland. Georgios helps customers design and deploy machine learning applications in production on AWS with a particular interest in MLOps practices and enabling customers to perform machine learning at scale. In his spare time, he enjoys traveling, cooking and spending time with friends and family.

Muthuvelan Swaminathan is an Enterprise Solutions Architect based out of New York. He works with enterprise customers providing architectural guidance in building resilient, cost-effective, innovative solutions that address their business needs and help them execute at scale using AWS products and services.

Mayur Udernani leads the AWS AI & ML business with commercial enterprises in the UK & Ireland. In his role, Mayur spends the majority of his time with customers and partners to help create impactful solutions that solve the most pressing needs of a customer or for a wider industry leveraging AWS Cloud, AI & ML services. Mayur lives in the London area. He has an MBA from the Indian Institute of Management and a Bachelor’s in Computer Engineering from Mumbai University.

Read More