Redacting PII data at The Very Group with Amazon Comprehend

This is a guest post by Andy Whittle, Principal Platform Engineer – Application & Reliability Frameworks at The Very Group.

At The Very Group, which operates digital retailer Very, security is a top priority in handling data for millions of customers. Part of how The Very Group secures and tracks business operations is through activity logging between business systems (for example, across the stages of a customer order). It is a critical operating requirement and enables The Very Group to trace incidents and proactively identify problems and trends. However, this can mean processing customer data in the form of personally identifiable information (PII) in relation to activities such as purchases, returns, use of flexible payment options, and account management.

In this post, The Very Group shows how they use Amazon Comprehend to add a further layer of automated defense, on top of policies that build threat modelling into all systems, to prevent PII from being sent in log data to Elasticsearch for indexing. Amazon Comprehend is a fully managed and continuously trained natural language processing (NLP) service that can extract insights about the content of a document or text.

Overview of solution

The overriding goal for The Very Group’s engineering team was to prevent any PII data from reaching documents within Elasticsearch. To accomplish this and automate removal of PII from millions of identified records per day, The Very Group’s engineering team created an Application Observability module in Terraform. This module implements an observability solution, including application logs, application performance monitoring (APM), and metrics. Within the module, the team used Amazon Comprehend to highlight PII within log data with the option of removing it before sending to Elasticsearch.

Amazon Comprehend was identified as part of an internal platform engineering initiative to investigate how AWS AI services can improve efficiency and reduce risk in repetitive business activities. The Very Group’s culture of learning and experimentation meant Amazon Comprehend was evaluated for applicability using a Java application and test PII data. The team used code examples in the documentation to accelerate the proof of concept and proved its potential within a day.

The engineering team developed a schematic demonstrating how a PII redaction service could integrate with The Very Group’s logging. It involved developing a microservice to call Amazon Comprehend to detect PII data. The solution passes The Very Group’s log data through a Logstash instance running on AWS Fargate, which cleanses the data using a second Fargate-hosted service, pii-logstash-redaction, a Spring Boot Java application that calls Amazon Comprehend to remove PII. The following diagram illustrates this architecture.

Very Group Comprehend PII Redaction Architecture Diagram

The Very Group’s solution takes logs from Amazon CloudWatch and Amazon Elastic Container Service (Amazon ECS) and passes cleansed versions to Elasticsearch to be indexed. Amazon Kinesis is used in the solution to capture and store logs for short periods, with Logstash pulling logs down every few seconds.

Logs are sourced from many business processes, including ordering, returns, and Financial Services. They include logs from over 200 Amazon ECS apps on Fargate, across test and production environments, that push logs into Logstash. Another source is AWS Lambda logs, which are pulled into Kinesis and then into Logstash. Finally, a separate standalone Filebeat instance collects logs and puts them into CloudWatch, from which they are pulled into Logstash. The result is that many sources of logs are pulled or pushed into Logstash and processed by the Application Observability module and Amazon Comprehend before being stored in Elasticsearch.

A separate Terraform module provides all the infrastructure required to stand up a Logstash service capable of exporting logs from CloudWatch log groups into Elasticsearch via an AWS PrivateLink VPC endpoint. The Logstash service can also be integrated with Amazon ECS via a FireLens log configuration, with Amazon ECS establishing connectivity over an Amazon Route 53 record. Scalability is built in: Kinesis scales on demand (the team started with fixed shards but is now switching to on-demand capacity), and Logstash scales out with additional Amazon Elastic Compute Cloud (Amazon EC2) instances behind a Network Load Balancer, which suits the protocols used by Filebeat and enables Logstash to pull logs from Kinesis more effectively.

Finally, the Logstash service consists of a task definition containing a Logstash container and PII redaction container, ensuring the removal of PII prior to exporting to Elasticsearch.

Results

The engineering team was able to build and test the solution within a week, without needing to understand machine learning (ML) internals, using Amazon Comprehend video guidance, API reference documentation, and example code. Because the solution demonstrated business value so quickly, business product owners have begun to develop new use cases for the service. Some decisions had to be made to enable the solution. Although the platform engineering team knew they could redact the data, they wanted to intercept the logs from the current solution (based on a Fluent Bit sidecar that redirects logs to an endpoint). They decided to adopt Logstash so that log fields could be intercepted through pipelines and integrated with their PII service (comprising the Terraform module and the Java service).

The adoption of Logstash went smoothly. The Very Group engineering squads now use the service directly through an API endpoint to put logs straight into Elasticsearch, which has allowed them to switch from the sidecar to the new endpoint and deploy it through the Terraform module. The only issue came from initial tests, which revealed a throughput problem under peak trading loads; this was overcome through adjustments to the Java code.

The following code shows how The Very Group use Amazon Comprehend to detect PII in log messages: it identifies any PII and builds a list of entity types to redact and record. To accelerate development, the code was taken from the AWS documentation and adapted for use in the Java application service deployed on Fargate.

private List<EntityLabel> getEntityLabels(String logData) {
    // Ask Amazon Comprehend whether this log line contains PII
    ContainsPiiEntitiesRequest request = ContainsPiiEntitiesRequest
            .builder()
            .languageCode(LanguageCode.EN)
            .text(logData)
            .build();

    ContainsPiiEntitiesResponse response = comprehendClient.containsPiiEntities(request);

    // Keep only labels above the confidence threshold whose entity types
    // are not excluded by the redaction configuration
    List<EntityLabel> labels = new ArrayList<>();
    if (response != null && response.hasLabels() && !response.labels().isEmpty()) {
        for (EntityLabel el : response.labels()) {
            if (el.score() > minScore && !redactionConfig.getComprehendExcludedTypes().contains(el.nameAsString())) {
                labels.add(el);
            }
        }
    }
    return labels;
}

The following screenshot shows the output sent to Elasticsearch as part of the PII redaction process. The service generates around 1 million records per day, one each time a redaction is made.

PII redacted output record sent to Elasticsearch

The log message is redacted, and the field redacted_entities contains a list of the entity types found in the message. In this case, the example found a URL, but the service could have identified any PII, based largely on Amazon Comprehend’s built-in entity types. An additional bespoke PII type for customer account numbers was added through Amazon Comprehend, but hasn’t been needed so far. Squad-level overrides, and how to use them, are documented in GitHub.
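
As an illustrative sketch only (the production service is written in Java), the following Python snippet shows how PII offsets returned by Amazon Comprehend can drive redaction of a log message; the client setup, confidence threshold, and example values are assumptions rather than The Very Group’s exact implementation.

import boto3

comprehend = boto3.client("comprehend", region_name="eu-west-1")  # assumed region

def redact(log_line, min_score=0.8, excluded_types=()):
    # detect_pii_entities returns character offsets for each PII span
    entities = comprehend.detect_pii_entities(Text=log_line, LanguageCode="en")["Entities"]
    redacted_entities = []
    # Splice right to left so earlier offsets remain valid
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if e["Score"] > min_score and e["Type"] not in excluded_types:
            log_line = log_line[:e["BeginOffset"]] + "[" + e["Type"] + "]" + log_line[e["EndOffset"]:]
            redacted_entities.append(e["Type"])
    return log_line, redacted_entities

# Example: "Order placed by jane@example.com" becomes "Order placed by [EMAIL]"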

Conclusion

This project allowed The Very Group to implement a quick and simple solution to redact sensitive PII in logs. The engineering team added further flexibility through entity type overrides, using Amazon Comprehend to redact PII according to business needs. In the future, the team is looking into training custom Amazon Comprehend entity types to redact strings such as customer IDs.

The result is that The Very Group can put logs through the pipeline without worrying about PII leakage. The solution enforces the policy of not storing PII in logs, thereby reducing risk and improving compliance. Furthermore, metadata about redactions is reported back to the business through an Elasticsearch dashboard, enabling alerts and further action.

Make time to assess AWS AI/ML services that your organization hasn’t used yet and foster a culture of experimentation. Starting simple can quickly lead to business benefit, just as The Very Group proved.


About the Author

Andy Whittle is Principal Platform Engineer – Application & Reliability Frameworks at The Very Group, which operates UK-based digital retailer Very. Andy helps deliver performance monitoring across the organization’s tribes, and has a particular interest in application monitoring, observability, and performance. Since joining Very in 1998, Andy has undertaken a wide variety of roles covering content management and catalog production, stock management, production support, DevOps, and Fusion Middleware. For the past 4 years, he has been part of the platform engineering team.

Read More

Advancing human-centered AI: Updates on responsible AI research

Editor’s note: All papers referenced here represent collaborations throughout Microsoft and across academia and industry that include authors who contribute to Aether, the Microsoft internal advisory body for AI Ethics and Effects in Engineering and Research.


Video: A human-centered approach to AI. Learn how considering potential benefits and harms to people and society helps create better AI in the keynote “Challenges and opportunities in responsible AI” (2022 ACM SIGIR Conference on Human Information Interaction and Retrieval).

Artificial intelligence, like all tools we build, is an expression of human creativity. As with all creative expression, AI manifests the perspectives and values of its creators. A stance that encourages reflexivity among AI practitioners is a step toward ensuring that AI systems are human-centered, developed and deployed with the interests and well-being of individuals and society front and center. This is the focus of research scientists and engineers affiliated with Aether, the advisory body for Microsoft leadership on AI ethics and effects. Central to Aether’s work is the question of who we’re creating AI for—and whether we’re creating AI to solve real problems with responsible solutions. With AI capabilities accelerating, our researchers work to understand the sociotechnical implications and find ways to help on-the-ground practitioners envision and realize these capabilities in line with Microsoft AI principles.

The following is a glimpse into the past year’s research for advancing responsible AI with authors from Aether. Throughout this work are repeated calls for reflexivity in AI practitioners’ processes—that is, self-reflection to help us achieve clarity about who we’re developing AI systems for, who benefits, and who may potentially be harmed—and for tools that help practitioners with the hard work of uncovering assumptions that may hinder the potential of human-centered AI. The research discussed here also explores critical components of responsible AI, such as being transparent about technology limitations, honoring the values of the people using the technology, enabling human agency for optimal human-AI teamwork, improving effective interaction with AI, and developing appropriate evaluation and risk-mitigation techniques for multimodal machine learning (ML) models.

Considering who AI systems are for

The need to cultivate broader perspectives and, for society’s benefit, reflect on why and for whom we’re creating AI is not only the responsibility of AI development teams but also of the AI research community. In the paper “REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research,” the authors point out that machine learning publishing often exhibits a bias toward emphasizing exciting progress, which tends to propagate misleading expectations about AI. They urge reflexivity on the limitations of ML research to promote transparency about findings’ generalizability and potential impact on society—ultimately, an exercise in reflecting on who we’re creating AI for. The paper offers a set of guided activities designed to help articulate research limitations, encouraging the machine learning research community toward a standard practice of transparency about the scope and impact of their work.

Walk through REAL ML’s instructional guide and worksheet that help researchers with defining the limitations of their research and identifying societal implications these limitations may have in the practical use of their work.

Despite many organizations formulating principles to guide the responsible development and deployment of AI, a recent survey highlights that there’s a gap between the values prioritized by AI practitioners and those of the general public. The survey, which included a representative sample of the US population, found AI practitioners often gave less weight than the general public to values associated with responsible AI. This raises the question of whose values should inform AI systems and shifts attention toward considering the values of the people we’re designing for, aiming for AI systems that are better aligned with people’s needs.

Related papers

Creating AI that empowers human agency

Supporting human agency and emphasizing transparency in AI systems are proven approaches to building appropriate trust with the people those systems are designed to help. In human-AI teamwork, interactive visualization tools can enable people to capitalize on their own domain expertise and easily edit state-of-the-art models. For example, physicians using GAM Changer can edit risk prediction models for pneumonia and sepsis to incorporate their own clinical knowledge and make better treatment decisions for patients.

A study examining how AI can improve the value of rapidly growing citizen-science contributions found that emphasizing human agency and transparency increased productivity in an online workflow where volunteers provide valuable information to help AI classify galaxies. Participants who opted in to the new workflow and received messages stressing that human assistance was necessary for difficult classification tasks were more productive without sacrificing the quality of their input, and they returned to volunteer more often.

Failures are inevitable in AI because no model that interacts with the ever-changing physical world can be complete. Human input and feedback are essential to reducing risks. Investigating reliability and safety mitigations for systems such as robotic box pushing and autonomous driving, researchers formalize the problem of negative side effects (NSEs), the undesirable behavior of these systems. The researchers experimented with a framework in which the AI system uses immediate human assistance in the form of feedback—either about the user’s tolerance for an NSE occurrence or their decision to modify the environment. Results demonstrate that AI systems can adapt to successfully mitigate NSEs from feedback, but among future considerations, there remains the challenge of developing techniques for collecting accurate feedback from individuals using the system.

The goal of optimizing human-AI complementarity highlights the importance of engaging human agency. In a large-scale study examining how bias in models influences humans’ decisions in a job recruiting task, researchers made a surprising discovery: when working with a black-box deep neural network (DNN) recommender system, people made significantly fewer gender-biased decisions than when working with a bag-of-words (BOW) model, which is perceived as more interpretable. This suggests that people tend to reflect and rely on their own judgment before accepting a recommendation from a system for which they can’t comfortably form a mental model of how its outputs are derived. Researchers call for exploring techniques to better engage human reflexivity when working with advanced algorithms, which can be a means for improving hybrid human-AI decision-making and mitigating bias. 

How we design human-AI interaction is key to complementarity and empowering human agency. We need to carefully plan how people will interact with AI systems that are stochastic in nature and present inherently different challenges than deterministic systems. Designing and testing human interaction with AI systems as early as possible in the development process, even before teams invest in engineering, can help avoid costly failures and redesign. Toward this goal, researchers propose early testing of human-AI interaction through factorial surveys, a method from the social sciences that uses short narratives for deriving insights about people’s perceptions.

But testing for optimal user experience before teams invest in engineering can be challenging for AI-based features that change over time. The ongoing nature of a person adapting to a constantly updating AI feature makes it difficult to observe user behavior patterns that can inform design improvements before deploying a system. However, experiments demonstrate the potential of HINT (Human-AI INtegration Testing), a framework for uncovering over-time patterns in user behavior during pre-deployment testing. Using HINT, practitioners can design test setup, collect data via a crowdsourced workflow, and generate reports of user-centered and offline metrics.

Check out the 2022 anthology of this annual workshop that brings human-computer interaction (HCI) and natural language processing (NLP) research together for improving how people can benefit from NLP apps they use daily.

Related papers

Building responsible AI tools for foundation models

Although we’re still in the early stages of understanding how to responsibly harness the potential of large language and multimodal models that can be used as foundations for building a variety of AI-based systems, researchers are developing promising tools and evaluation techniques to help on-the-ground practitioners deliver responsible AI. The reflexivity and resources required for deploying these new capabilities with a human-centered approach are fundamentally compatible with business goals of robust services and products.

Natural language generation with open-ended vocabulary has sparked a lot of imagination in product teams. Challenges persist, however, including improving toxic language detection: content moderation tools often over-flag content that mentions minority groups regardless of context, while missing implicit toxicity. To help address this, a new large-scale machine-generated dataset, ToxiGen, enables practitioners to fine-tune pretrained hate classifiers for improved detection of implicit toxicity for 13 minority groups in both human- and machine-generated text.

Download the large-scale machine-generated ToxiGen dataset and install source code for fine-tuning toxic language detection systems for adversarial and implicit hate speech for 13 demographic minority groups. Intended for research purposes.

Multimodal models are proliferating, such as those that combine natural language generation with computer vision for services like image captioning. These complex systems can surface harmful societal biases in their output and are challenging to evaluate for mitigating harms. Using a state-of-the-art image captioning service with two popular image-captioning datasets, researchers isolate where in the system fairness-related harms originate and present multiple measurement techniques for five specific types of representational harm: denying people the opportunity to self-identify, reifying social groups, stereotyping, erasing, and demeaning.

The commercial advent of AI-powered code generators has introduced novice developers alongside professionals to large language model (LLM)-assisted programming. An overview of the LLM-assisted programming experience reveals unique considerations. Programming with LLMs invites comparison to related ways of programming, such as search, compilation, and pair programming. While there are indeed similarities, the empirical reports suggest it is a distinct way of programming with its own unique blend of behaviors. For example, additional effort is required to craft prompts that generate the desired code, and programmers must check the suggested code for correctness, reliability, safety, and security. Still, a user study examining what programmers value in AI code generation shows that programmers do find value in suggested code because it’s easy to edit, increasing productivity. Researchers propose a hybrid metric that combines functional correctness and similarity-based metrics to best capture what programmers value in LLM-assisted programming, because human judgment should determine how a technology can best serve us.

Related papers

Understanding and supporting AI practitioners

Organizational culture and business goals can often be at odds with what practitioners need for mitigating fairness and other responsible AI issues when their systems are deployed at scale. Responsible, human-centered AI requires a thoughtful approach: just because a technology is technically feasible does not mean it should be created.

Similarly, just because a dataset is available doesn’t mean it’s appropriate to use. Knowing why and how a dataset was created is crucial for helping AI practitioners decide on whether it should be used for their purposes and what its implications are for fairness, reliability, safety, and privacy. A study focusing on how AI practitioners approach datasets and documentation reveals current practices are informal and inconsistent. It points to the need for data documentation frameworks designed to fit within practitioners’ existing workflows and that make clear the responsible AI implications of using a dataset. Based on these findings, researchers iterated on Datasheets for Datasets and proposed the revised Aether Data Documentation Template.

Use this flexible template to reflect and help document underlying assumptions, potential risks, and implications of using your dataset.

AI practitioners find themselves balancing the pressures of delivering to meet business goals and the time requirements necessary for the responsible development and evaluation of AI systems. Examining these tensions across three technology companies, researchers conducted interviews and workshops to learn what practitioners need for measuring and mitigating AI fairness issues amid time pressure to release AI-infused products to wider geographic markets and for more diverse groups of people. Participants disclosed challenges in collecting appropriate datasets and finding the right metrics for evaluating how fairly their system will perform when they can’t identify direct stakeholders and demographic groups who will be affected by the AI system in rapidly broadening markets. For example, hate speech detection may not be adequate across cultures or languages. A look at what goes into AI practitioners’ decisions around what, when, and how to evaluate AI systems that use natural language generation (NLG) further emphasizes that when practitioners don’t have clarity about deployment settings, they’re limited in projecting failures that could cause individual or societal harm. Beyond concerns for detecting toxic speech, other issues of fairness and inclusiveness—for example, erasure of minority groups’ distinctive linguistic expression—are rarely a consideration in practitioners’ evaluations.

Coping with time constraints and competing business objectives is a reality for teams deploying AI systems. There are many opportunities for developing integrated tools that can prompt AI practitioners to think through potential risks and mitigations for sociotechnical systems.

Related papers

Thinking about it: Reflexivity as an essential for society and industry goals

As we continue to envision all that is possible with AI’s potential, one thing is clear: developing AI designed with the needs of people in mind requires reflexivity. We have been thinking about human-centered AI as being focused on users and stakeholders. Understanding who we are designing for, empowering human agency, improving human-AI interaction, and developing harm mitigation tools and techniques are as important as ever. But we also need to turn a mirror toward ourselves as AI creators. What values and assumptions do we bring to the table? Whose values get to be included and whose are left out? How do these values and assumptions influence what we build, how we build, and for whom? How can we navigate complex and demanding organizational pressures as we endeavor to create responsible AI? With technologies as powerful as AI, we can’t afford to be focused solely on progress for its own sake. While we work to evolve AI technologies at a fast pace, we need to pause and reflect on what it is that we are advancing—and for whom.

The post Advancing human-centered AI: Updates on responsible AI research appeared first on Microsoft Research.

Read More

NVIDIA, Evozyne Create Generative AI Model for Proteins

Using a pretrained AI model from NVIDIA, startup Evozyne created two proteins with significant potential in healthcare and clean energy.

A joint paper released today describes the process and the biological building blocks it produced. One protein aims to help cure a congenital disease; the other is designed to consume carbon dioxide to help reduce global warming.

Initial results show a new way to accelerate drug discovery and more.

“It’s been really encouraging that even in this first round the AI model has produced synthetic proteins as good as naturally occurring ones,” said Andrew Ferguson, Evozyne’s co-founder and a co-author of the paper. “That tells us it’s learned nature’s design rules correctly.”

A Transformational AI Model

Evozyne used NVIDIA’s implementation of ProtT5, a transformer model that’s part of NVIDIA BioNeMo, a software framework and service for creating AI models for healthcare.

“BioNeMo really gave us everything we needed to support model training and then run jobs with the model very inexpensively — we could generate millions of sequences in just a few seconds,” said Ferguson, a molecular engineer working at the intersection of chemistry and machine learning.

The model lies at the heart of Evozyne’s process, called ProT-VAE. It’s a workflow that combines BioNeMo with a variational autoencoder that acts as a filter.

“Using large language models combined with variational autoencoders to design proteins was not on anybody’s radar just a few years ago,” he said.

Model Learns Nature’s Ways

Like a student reading a book, NVIDIA’s transformer model reads sequences of amino acids in millions of proteins. Using the same techniques neural networks employ to understand text, it learned how nature assembles these powerful building blocks of biology.

The model then predicted how to assemble new proteins suited for functions Evozyne wants to address.

“The technology is enabling us to do things that were pipe dreams 10 years ago,” he said.

A Sea of Possibilities

Machine learning helps navigate the astronomical number of possible protein sequences, then efficiently identifies the most useful ones.

The traditional method of engineering proteins, called directed evolution, uses a slow, hit-or-miss approach. It typically only changes a few amino acids in sequence at a time.

Evozyne’s ProT-VAE process uses a powerful transformer model in NVIDIA BioNeMo to generate useful proteins for drug discovery and energy sustainability.

By contrast, Evozyne’s approach can alter half or more of the amino acids in a protein in a single round. That’s the equivalent of making hundreds of mutations.

“We’re taking huge jumps which allows us to explore proteins never seen before that have new and useful functions,” he said.

Using the new process, Evozyne plans to build a range of proteins to fight diseases and climate change.

Slashing Training Time, Scaling Models

“NVIDIA’s been an incredible partner on this work,” he said.

“They scaled jobs to multiple GPUs to speed up training,” said Joshua Moller, a data scientist at Evozyne. “We were getting through entire datasets every minute.”

That reduced the time to train large AI models from months to a week. “It allowed us to train models — some with billions of trainable parameters — that just would not be possible otherwise,” Ferguson said.

Much More to Come

The horizon for AI-accelerated protein engineering is wide.

“The field is moving incredibly quickly, and I’m really excited to see what comes next,” he said, noting the recent rise of diffusion models.

“Who knows where we will be in five years’ time.”

Sign up for early access to NVIDIA BioNeMo to see how it can accelerate your applications.

Read More

GFN Thursday Adds New Titles From THQ Nordic to GeForce NOW

GFN Thursday kicks each weekend off with new games and updates straight from the cloud. This week adds more games from publisher THQ Nordic to the GeForce NOW library as part of seven total additions.

Members can gear up to play these new titles the ultimate way with the upcoming release of the new Ultimate membership, delivering RTX 4080-class performance and elevated cloud gaming perks.

Just announced at CES 2023, HP is adding support for NVIDIA GeForce NOW through its OMEN Gaming Hub. Members will have access to the GeForce NOW library of over 1,500 titles built right into their latest HP laptops, making it even easier to stream at GeForce quality.

New Titles From THQ Nordic

Adventure to new and strange worlds with support for five THQ Nordic titles coming to the GeForce NOW library. Members can stream the Steam and Epic Games Store versions from their favorite digital stores across all GeForce NOW-compatible devices.

Destroy All Humans 2 Reprobed on GeForce NOW
Great gaming is just a cloud away.

Follow the story of a former crusader knight called back into action to stop the unification of a powerful ancient artifact that could bring untold evil to the world in The Valiant. Rally warriors with different skills to the cause and build custom hero-squads to defeat your enemies. Stream from PC and Mac apps even in 4K resolution with the GeForce NOW Ultimate membership.

Play as evil alien Crypto-137 harvesting DNA from Earth’s citizens in the brazen action-adventure title Destroy All Humans!. Use an assortment of alien weaponry and psychic skills to bring down the government and reduce cities of the 1950s to rubble with a flying saucer.

Crypto the alien invader returns, groovier than ever, in Destroy All Humans! 2: Reprobed, the swinging sequel set in the ‘60s. Stream your intergalactic adventures on the big screen with NVIDIA SHIELD or Samsung Smart TVs in beautiful 4K.

Become the new owner of a hunting lodge, explore vast open-world environments, and hunt with a premium selection of firearms and equipment in Way of the Hunter. Enjoy the hunt on your own or with a friend in multiplayer co-op, and experience the great outdoors on the go playing from mobile devices.

Experience high-intensity outdoor racing and become a world-famous, professional off-road rider in MX vs ATV Legends. Compete against others in the new career mode, where choices lead to different paths on devices designed for enhanced streaming experiences like the Logitech G CLOUD or cloud gaming Chromebooks.

Upgrade to Ultimate Gaming

Ready for the ultimate cloud gaming performance? Upgrade to a GeForce NOW Ultimate membership and get ready for RTX 4080-class performance the moment it’s available.

GeForce NOW Ultimate Membership
Ultimate PC gaming power, itty-bitty living space in the cloud.

GeForce NOW Ultimate is cloud gaming that is “beyond fast.” Powered by the NVIDIA Ada Lovelace architecture in upgraded GeForce NOW RTX 4080 SuperPODs, Ultimate members can stream at up to 240 frames per second for the lowest latency ever from the cloud, or up to 4K 120 fps.

Ultimate members can also take advantage of new ultrawide resolution support for their favorite PC games, and experience full ray tracing and DLSS 3 in supported titles for beautiful, cinematic-quality graphics.

Ultimate members can play today on GeForce NOW RTX 3080 rigs for the highest performance and lowest latency available in cloud gaming. And when GeForce RTX 4080-powered SuperPODs begin rolling out in North America and Europe later this month, Ultimate members will be the first to stream at RTX 4080-class power.

Sign up today — quantities are limited.

Gamers, Come Out to Play!

Tom Clancy's The Division 2 on GeForce NOW
Calling all active Division agents. Save the city from chaos before it’s too late.

Get the gaming going this weekend with seven more titles supported on GeForce NOW:

While you’re getting ready for an out-of-this-world weekend full of gaming, we’ve got a question for you. Let us know your answer on Twitter or in the comments below.

Read More

NVIDIA Helps Retail Industry Tackle Its $100 Billion Shrink Problem

The global retail industry has a $100 billion problem.

“Shrinkage” — the loss of goods due to theft, damage and misplacement — significantly crimps retailers’ profits.

An estimated 65% of shrinkage is due to theft, according to the National Retail Federation’s 2022 Retail Security Survey, conducted in partnership with the Loss Prevention Research Council. And many retailers are reporting theft has more than doubled recently, driven by rising prices of food and other essentials.

To make it easier for developers to quickly build and roll out applications designed to prevent theft, NVIDIA today announced three Retail AI Workflows, built on its Metropolis microservices. They can serve as no-code or low-code building blocks for loss-prevention applications: they come pretrained with images of the most commonly stolen products, along with software that plugs into existing store applications for point-of-sale machines and for object and product tracking across entire stores.

“Retail theft is growing due to macro-dynamics, and threatens to overwhelm the industry,” said Read Hayes, director of the Loss Prevention Research Council. “Businesses are now facing the reality that investment in loss-prevention solutions is a critical requirement.”

The NVIDIA Retail AI Workflows, which are available through the NVIDIA AI Enterprise software suite, include:

  • Retail Loss Prevention AI Workflow: The AI models within this workflow come pretrained to recognize hundreds of products most frequently lost to theft — including meat, alcohol and laundry detergent — and to recognize them in the varying sizes and shapes they’re offered. With synthetic data generation from NVIDIA Omniverse, retailers and independent software vendors can customize and further train the models to hundreds of thousands of store products. The workflow is based on a state-of-the-art few-shot learning technique developed by NVIDIA Research which, combined with active learning, identifies and captures any new products scanned by customers and sales associates during checkout to ultimately improve model accuracy.
  • Multi-Camera Tracking AI Workflow: Delivers multi-target, multi-camera (MTMC) capabilities that allow application developers to more easily create systems that track objects across multiple cameras throughout the store. The workflow tracks objects and store associates across cameras and maintains a unique ID for each object. Objects are tracked through visual embeddings or appearance, rather than personal biometric information, to maintain full shopper privacy.
  • Retail Store Analytics Workflow: Uses computer vision to provide insights for store analytics, such as store traffic trends, counts of customers with shopping baskets, aisle occupancy and more via custom dashboards.

The workflows are built on NVIDIA Metropolis microservices, a low- or no-code way of building AI applications. The microservices provide the building blocks for developing complex AI workflows and allow them to rapidly scale into production-ready AI apps.

Developers can easily customize and extend these AI workflows, including by integrating their own models. The microservices also make it easier to integrate new offerings with legacy systems, such as point-of-sale systems.

“NVIDIA’s new Retail AI Workflows built on Metropolis microservices allow us to customize our product, scale rapidly to fit our ever-growing customers’ needs better and continue to drive innovation in the retail space,” said Bobby Chowdary, chief technology officer at Radius.ai.

“As part of our applied AI offerings, Infosys is developing state-of-the-art loss prevention systems leveraging NVIDIA’s new workflows comprising pretrained models for retail SKU recognition and microservices architecture,” said Balakrishna D R, executive vice president and head of AI and Automation at Infosys. “It will enable us to deploy these solutions faster and rapidly scale across stores and product lines while also getting much higher levels of accuracy than before.”

NVIDIA will unveil additional details of its Retail AI Workflows at the National Retail Federation Conference in New York, Jan. 15-17.

Sign up for early access to the new NVIDIA Retail AI Workflows for developers and learn more in the NVIDIA Technical Blog. Join NVIDIA at #NRF2023.

Read More

Enriching real-time news streams with the Refinitiv Data Library, AWS services, and Amazon SageMaker

This post is co-authored by Marios Skevofylakas, Jason Ramchandani, and Haykaz Aramyan from Refinitiv, an LSEG Business.

Financial service providers often need to identify relevant news, analyze it, extract insights, and take actions in real time, like trading specific instruments (such as commodities, shares, funds) based on additional information or context of the news item. One such additional piece of information (which we use as an example in this post) is the sentiment of the news.

Refinitiv Data (RD) Libraries provide a comprehensive set of interfaces for uniform access to the Refinitiv Data Catalogue. The library offers multiple layers of abstraction providing different styles and programming techniques suitable for all developers, from low-latency, real-time access to batch ingestions of Refinitiv data.

In this post, we present a prototype AWS architecture that ingests our news feeds using RD Libraries and enhances them with machine learning (ML) model predictions using Amazon SageMaker, a fully managed ML service from AWS.

In an effort to design a modular architecture that could be used in a variety of use cases, like sentiment analysis, named entity recognition, and more, regardless of the ML model used for enhancement, we decided to focus on the real-time space. The reason for this decision is that real-time use cases are generally more complex and that the same architecture can also be used, with minimal adjustments, for batch inference. In our use case, we implement an architecture that ingests our real-time news feed, calculates sentiment on each news headline using ML, and re-serves the AI enhanced feed through a publisher/subscriber architecture.

Moreover, to present a comprehensive and reusable way to productionize ML models by adopting MLOps practices, we introduce the concept of infrastructure as code (IaC) during the entire MLOps lifecycle of the prototype. By using Terraform and a single entry point configurable script, we are able to instantiate the entire infrastructure, in production mode, on AWS in just a few minutes.

In this solution, we don’t address the MLOps aspect of the development, training, and deployment of the individual models. If you’re interested in learning more on this, refer to MLOps foundation roadmap for enterprises with Amazon SageMaker, which explains in detail a framework for model building, training, and deployment following best practices.

Solution overview

In this prototype, we follow a fully automated provisioning methodology in accordance with IaC best practices. IaC is the process of provisioning resources programmatically using automated scripts rather than interactive configuration tools. Resources can include both hardware and the needed software. In our case, we use Terraform to implement a single configurable entry point that can automatically spin up the entire infrastructure we need, including security and access policies, as well as automated monitoring. With this single entry point, which triggers a collection of Terraform scripts, one per service or resource entity, we can fully automate the lifecycle of all or part of the components of the architecture, allowing us to implement granular control on both the DevOps and the MLOps side. After Terraform is correctly installed and integrated with AWS, we can replicate most operations that can be done on the AWS service dashboards.

The following diagram illustrates our solution architecture.

The architecture consists of three stages: ingestion, enrichment, and publishing. During the first stage, the real-time feeds are ingested on an Amazon Elastic Compute Cloud (Amazon EC2) instance that is created through a Refinitiv Data Library-ready AMI. The instance also connects to a data stream via Amazon Kinesis Data Streams, which triggers an AWS Lambda function.

In the second stage, the Lambda function triggered by Kinesis Data Streams connects to a SageMaker FinBERT endpoint and sends it the news headlines; the endpoint returns the calculated sentiment for each news item. This calculated sentiment is the enrichment: the Lambda function wraps the news item with it and stores the result in an Amazon DynamoDB table.

In the third stage of the architecture, a DynamoDB stream triggers a Lambda function on new item inserts; this function is integrated with an Amazon MQ server running RabbitMQ, which re-serves the AI-enhanced stream.

The decision on this three-stage engineering design, rather than the first Lambda layer directly communicating with the Amazon MQ server or implementing more functionality in the EC2 instance, was made to enable exploration of more complex, less coupled AI design architectures in the future.

Building and deploying the prototype

We present this prototype in a series of three detailed blueprints. In each blueprint and for every service used, you will find overviews and relevant information on its technical implementations as well as Terraform scripts that allow you to automatically start, configure, and integrate the service with the rest of the structure. At the end of each blueprint, you will find instructions on how to make sure that everything is working as expected up to each stage. The blueprints are as follows:

To start the implementation of this prototype, we suggest creating a new Python environment dedicated to it and installing the necessary packages and tools separately from other environments you may have. To do so, create and activate the new environment in Anaconda using the following commands:

conda create --name rd_news_aws_terraform python=3.7
conda activate rd_news_aws_terraform

We’re now ready to install the AWS Command Line Interface (AWS CLI) toolset that will allow us to build all the necessary programmatic interactions in and between AWS services:

pip install awscli

Now that the AWS CLI is installed, we need to install Terraform. HashiCorp provides Terraform with a binary installer, which you can download and install.

After you have both tools installed, ensure that they properly work using the following commands:

terraform -help
aws --version

You’re now ready to follow the detailed blueprints on each of the three stages of the implementation.

Blueprint I: Real-time news ingestion using Amazon EC2 and Kinesis Data Streams

This blueprint represents the initial stages of the architecture that allow us to ingest the real-time news feeds. It consists of the following components:

  • Amazon EC2 preparing your instance for RD News ingestion – This section sets up an EC2 instance in a way that it enables the connection to the RD Libraries API and the real-time stream. We also show how to save the image of the created instance to ensure its reusability and scalability.
  • Real-time news ingestion from Amazon EC2 – A detailed implementation of the configurations needed to enable Amazon EC2 to connect to the RD Libraries, as well as the scripts to start the ingestion.
  • Creating and launching Amazon EC2 from the AMI – Launch a new instance by simultaneously transferring ingestion files to the newly created instance, all automatically using Terraform.
  • Creating a Kinesis data stream – This section provides an overview of Kinesis Data Streams and how to set up a stream on AWS.
  • Connecting and pushing data to Kinesis – Once the ingestion code is working, we need to connect it and send data to a Kinesis stream (see the sketch after this list).
  • Testing the prototype so far – We use Amazon CloudWatch and command line tools to verify that the prototype is working up to this point and that we can continue to the next blueprint. The log of ingested data should look like the following screenshot.
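
To make the push step concrete, here is a minimal Python sketch, assuming boto3 and a hypothetical stream name; the blueprint itself creates and configures the stream through Terraform.

import json

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")  # assumed region

def push_headline(story_id: str, headline: str) -> None:
    # Each news item becomes one Kinesis record; using the story ID as the
    # partition key keeps updates to the same story on the same shard
    kinesis.put_record(
        StreamName="rd-news-stream",  # hypothetical stream name
        Data=json.dumps({"id": story_id, "headline": headline}).encode("utf-8"),
        PartitionKey=story_id,
    )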

Blueprint II: Real-time serverless AI news sentiment analysis using Kinesis Data Streams, Lambda, and SageMaker

In this second blueprint, we focus on the main part of the architecture: the Lambda function that ingests and analyzes the news item stream, attaches the AI inference to it, and stores it for further use. It includes the following components:

  • Lambda – Define a Terraform Lambda configuration allowing it to connect to a SageMaker endpoint.
  • Amazon S3 – To implement Lambda, we need to upload the appropriate code to Amazon Simple Storage Service (Amazon S3) and allow the Lambda function to ingest it in its environment. This section describes how we can use Terraform to accomplish that.
  • Implementing the Lambda function: Step 1, Handling the Kinesis event – In this section, we start building the Lambda function. Here, we build the Kinesis data stream response handler part only.
  • SageMaker – In this prototype, we use a pre-trained Hugging Face model that we store in a SageMaker endpoint. Here, we present how this can be achieved using Terraform scripts and how the appropriate integrations take place to allow SageMaker endpoints and Lambda functions to work together.
    • At this point, you can instead use any other model that you have developed and deployed behind a SageMaker endpoint. Such a model could provide a different enhancement to the original news data, based on your needs. Optionally, this can be extrapolated to multiple models for multiple enhancements if such exist. Thanks to the rest of the architecture, any such models will enrich your data sources in real time.
  • Building the Lambda function: Step 2, Invoking the SageMaker endpoint – In this section, we build up our original Lambda function by adding the SageMaker block to get a sentiment enhanced news headline by invoking the SageMaker endpoint.
  • DynamoDB – Finally, when the AI inference is in the memory of the Lambda function, it re-bundles the item and sends it to a DynamoDB table for storage. Here, we discuss both the appropriate Python code needed to accomplish that, as well as the necessary Terraform scripts that enable these interactions.
  • Building the Lambda function: Step 3, Pushing enhanced data to DynamoDB – Here, we continue building up our Lambda function by adding the last part, which creates an entry in the DynamoDB table (a combined sketch of the three steps follows this list).
  • Testing the prototype so far – We can navigate to the DynamoDB table on the DynamoDB console to verify that our enhancements are appearing in the table.
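
As a rough, combined illustration of the three Lambda-building steps above, here is a minimal Python handler; the endpoint name, table name, and payload format are hypothetical placeholders, and the blueprints wire in the real values through Terraform.

import base64
import json

import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")
table = boto3.resource("dynamodb").Table("enriched-news")  # hypothetical table name

def handler(event, context):
    # Step 1: handle the Kinesis event; record payloads arrive base64-encoded
    for record in event["Records"]:
        item = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Step 2: invoke the SageMaker endpoint to score the headline
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName="finbert-endpoint",  # hypothetical endpoint name
            ContentType="application/json",
            Body=json.dumps({"inputs": item["headline"]}),
        )
        # Stored as a JSON string because DynamoDB rejects Python floats
        item["sentiment"] = response["Body"].read().decode("utf-8")

        # Step 3: push the enriched item to DynamoDB
        table.put_item(Item=item)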

Blueprint III: Real-time streaming using DynamoDB Streams, Lambda, and Amazon MQ

This third blueprint finalizes the prototype. It focuses on redistributing the newly created, AI-enhanced data items to a RabbitMQ server in Amazon MQ, allowing consumers to connect and retrieve the enhanced news items in real time. It includes the following components:

  • DynamoDB Streams – When the enhanced news item is in DynamoDB, we set up an event that is triggered and can then be captured by the appropriate Lambda function.
  • Writing the Lambda producer – This Lambda function captures the event and acts as a producer of the RabbitMQ stream (a minimal sketch follows this list). This new function introduces the concept of Lambda layers, as it uses Python libraries to implement the producer functionality.
  • Amazon MQ and RabbitMQ consumers – The final step of the prototype is setting up the RabbitMQ service and implementing an example consumer that will connect to the message stream and receive the AI enhanced news items.
  • Final test of the prototype – We use an end-to-end process to verify that the prototype is fully working, from ingestion to re-serving and consuming the new AI-enhanced stream.
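
As orientation for the producer step, the following minimal Python sketch uses the pika library; the broker endpoint, credentials, and queue name are placeholders, and in the prototype the library is packaged as a Lambda layer as noted above.

import json
import ssl

import pika

def publish(enhanced_item: dict) -> None:
    # Amazon MQ for RabbitMQ accepts TLS connections only (port 5671)
    params = pika.ConnectionParameters(
        host="b-1234-example.mq.us-east-1.amazonaws.com",  # placeholder broker endpoint
        port=5671,
        credentials=pika.PlainCredentials("mq_user", "mq_password"),  # placeholders
        ssl_options=pika.SSLOptions(ssl.create_default_context()),
    )
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue="enhanced-news", durable=True)  # hypothetical queue
    channel.basic_publish(exchange="", routing_key="enhanced-news",
                          body=json.dumps(enhanced_item))
    connection.close()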

At this stage, you can validate that everything has been working by navigating to the RabbitMQ dashboard, as shown in the following screenshot.

In the final blueprint, you also find a detailed test vector to make sure that the entire architecture is behaving as planned.

Conclusion

In this post, we shared a solution using ML on the cloud with AWS services like SageMaker (ML), Lambda (serverless), and Kinesis Data Streams (streaming) to enrich streaming news data provided by Refinitiv Data Libraries. The solution adds a sentiment score to news items in real time and scales the infrastructure using code.

The benefit of this modular architecture is that you can reuse it with your own model to perform other types of data augmentation in a serverless, scalable, and cost-efficient way, applied on top of the Refinitiv Data Library. This can add value for trading, investment, and risk management workflows.

If you have any comments or questions, please leave them in the comments section.

Related Information


About the Authors

Marios Skevofylakas comes from a financial services, investment banking and consulting technology background. He holds an engineering Ph.D. in Artificial Intelligence and an M.Sc. in Machine Vision. Throughout his career, he has participated in numerous multidisciplinary AI and DLT projects. He is currently a Developer Advocate with Refinitiv, an LSEG business, focusing on AI and Quantum applications in financial services.

Jason Ramchandani has worked at Refinitiv, an LSEG Business, for 8 years as Lead Developer Advocate helping to build their Developer Community. Previously he has worked in financial markets for over 15 years with a quant background in the equity/equity-linked space at Okasan Securities, Sakura Finance and Jefferies LLC. His alma mater is UCL.

Haykaz Aramyan comes from a finance and technology background. He holds a Ph.D. in Finance, and an M.Sc. in Finance, Technology and Policy. Through 10 years of professional experience Haykaz worked on several multidisciplinary projects involving pension, VC funds and technology startups. He is currently a Developer Advocate with Refinitiv, An LSEG Business, focusing on AI applications in financial services.

Georgios Schinas is a Senior Specialist Solutions Architect for AI/ML in the EMEA region. He is based in London and works closely with customers in UK and Ireland. Georgios helps customers design and deploy machine learning applications in production on AWS with a particular interest in MLOps practices and enabling customers to perform machine learning at scale. In his spare time, he enjoys traveling, cooking and spending time with friends and family.

Muthuvelan Swaminathan is an Enterprise Solutions Architect based out of New York. He works with enterprise customers providing architectural guidance in building resilient, cost-effective, innovative solutions that address their business needs and help them execute at scale using AWS products and services.

Mayur Udernani leads AWS AI & ML business with commercial enterprises in UK & Ireland. In his role, Mayur spends majority of his time with customers and partners to help create impactful solutions that solve the most pressing needs of a customer or for a wider industry leveraging AWS Cloud, AI & ML services. Mayur lives in the London area. He has an MBA from Indian Institute of Management and Bachelors in Computer Engineering from Mumbai University.

Read More

Research Focus: Week of January 9, 2023

Welcome to Research Focus, a new series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

High-throughput ab initio reaction mechanism exploration in the cloud with automated multi-reference validation

Jan P. Unsleber, Hongbin Liu, Leopold Talirz, Thomas Weymuth, Maximilian Mörchen, Adam Grofe, Dave Wecker, Christopher J. Stein, Ajay Panyala, Bo Peng, Karol Kowalski, Matthias Troyer, Markus Reiher

Quantum chemical calculations on atomistic systems have evolved into a standard approach to studying molecular matter. These calculations often involve a significant amount of manual input and specific process considerations, which could be automated to allow for further efficiencies. In our recent paper: High-throughput ab initio reaction mechanism exploration in the cloud with automated multi-reference validation, we present the AutoRXN workflow, an automated workflow for exploratory high-throughput electronic structure calculations of molecular systems. In this workflow, (i) density functional theory methods are exploited to deliver minimum and transition-state structures and corresponding energies and properties, (ii) coupled cluster calculations are then launched for optimized structures to provide more accurate energy and property estimates, and (iii) multi-reference diagnostics are evaluated to check the coupled cluster results, with potential multi-configurational cases subjected to automated multi-configurational calculations. All calculations are carried out in a cloud environment and support massive computational campaigns. Key features of all components of the AutoRXN workflow are autonomy, stability, and minimal operator interference. We highlight the AutoRXN workflow with the example of an autonomous reaction mechanism exploration of the mode of action of a homogeneous catalyst for the asymmetric reduction of ketones.


Disparate Impacts on Online Information Access during the COVID-19 Pandemic

Jina Suh, Eric Horvitz, Ryen W. White, Tim Althoff

Despite efforts to close the long-term and emergent health equity gap, studies during the COVID-19 pandemic show that socioeconomically and environmentally disadvantaged subpopulations have been disproportionately harmed by the disease[1]. Digital access to health services and information has also emerged as an important factor modulating health outcomes. During the pandemic, digital engagement in resources across health, educational, economic, and social needs became a necessity due to lockdown mandates and increased use of internet-based communication by public institutions. Unfortunately, disparities in digital access also reflect socioeconomic and environmental dimensions, which can lead to negative offline consequences, creating a “digital vicious cycle”[2]. Therefore, it is a public health priority to identify vulnerable populations and to understand potential barriers to critical digital resources.

In a new paper: Disparate Impacts on Online Information Access during the COVID-19 Pandemic, published in Nature Communications, researchers from Microsoft Research and the University of Washington have collaborated to harness the centrality of web search engines for online information access to observe digital disparities during the pandemic. They analyzed over 55 billion web search interactions on Bing during the pandemic across 25,150 U.S. ZIP codes to reveal that socioeconomic and environmental factors are associated with the differential use of digital resources across different communities – even if they were digitally connected.


DeepSpeed Data Efficiency library: Towards less data, faster training, and higher model quality

DeepSpeed Team, Andrey Proskurin

DeepSpeed has released a new Data Efficiency library to optimize deep learning training efficiency and cost. The library offers new algorithms on efficient data sampling/scheduling via curriculum learning and efficient data routing via random layerwise token dropping, together with composable and customizable library support. The library greatly reduces training cost while maintaining model quality (1.5-2x less data and time for GPT-3/BERT pretraining), or further improves model quality under the same training cost (>1 point gain for GPT-3-1.3B zero/few-shot evaluation). The code is open-sourced at https://github.com/microsoft/DeepSpeed.
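
As a concept illustration only (not the DeepSpeed API), this Python sketch shows the core idea behind curriculum-based data sampling: begin training on easier examples and gradually admit harder ones.

import random

def curriculum_batches(samples, difficulty, steps, batch_size=8):
    # Sort once by a difficulty proxy (for text, sequence length is common)
    ordered = sorted(samples, key=difficulty)
    for step in range(steps):
        # Fraction of the difficulty-sorted pool available at this step
        frac = min(1.0, 0.1 + 0.9 * step / max(1, steps - 1))
        pool = ordered[: max(batch_size, int(frac * len(ordered)))]
        yield random.sample(pool, min(batch_size, len(pool)))

# Example: draw batches of short-to-long sentences for pretraining
sentences = ["a b", "a b c d", "a b c d e f g h"]
for batch in curriculum_batches(sentences, difficulty=len, steps=3, batch_size=2):
    pass  # feed each batch to the trainer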

You can learn more in our blog post and in the papers below.


Research Fellows Program at Microsoft Research India – Apply now

The Research Fellows Program at Microsoft Research India is now accepting applications for Fall 2023. This is an opportunity to work with world-class researchers on state-of-the-art technology. The program prepares students for careers in research, engineering, and entrepreneurship, while pushing the frontiers of computer science and technology. Previous Research Fellows have contributed to all aspects of the research lifecycle, spanning ideation, implementation, evaluation, and deployment.

Selected candidates spend one to two years with Microsoft Research India. Candidates should have completed BS/BE/BTech or MS/ME/MTech in Computer Science or related areas, graduating by summer 2023. Apply before February 3, 2023.


The post Research Focus: Week of January 9, 2023 appeared first on Microsoft Research.

Read More