October 2023 – Page 6

Simulated Spotify Listening Experiences for Reinforcement Learning with TensorFlow and TF-Agents

Posted by Surya Kanoria, Joseph Cauteruccio, Federico Tomasi, Kamil Ciosek, Matteo Rinaldi, and Zhenwen Dai – Spotify

Introduction

Many of our music recommendation problems involve providing users with ordered sets of items that satisfy users’ listening preferences and intent at that point in time. We base current recommendations on previous interactions with our application and, in the abstract, are faced with a sequential decision making process as we continually recommend content to users.

Reinforcement Learning (RL) is an established tool for sequential decision making that can be leveraged to solve sequential recommendation problems. We decided to explore how RL could be used to craft listening experiences for users. Before we could start training Agents, we needed to pick an RL library that allowed us to easily prototype, test, and potentially deploy our solutions.

At Spotify we leverage TensorFlow and the extended TensorFlow Ecosystem (TFX, TensorFlow Serving, and so on) as part of our production Machine Learning Stack. We made the decision early on to leverage TensorFlow Agents as our RL Library of choice, knowing that integrating our experiments with our production systems would be vastly more efficient down the line.

One missing bit of technology we required was an offline Spotify environment we could use to prototype, analyze, explore, and train Agents offline prior to online testing. The flexibility of the TF-Agents library, coupled with the broader advantages of TensorFlow and its ecosystem, allowed us to cleanly design a robust and extendable offline Spotify simulator.

We based our simulator design on TF-Agents Environment primitives. Using this simulator, we developed, trained, and evaluated sequential models for item recommendations, vanilla RL Agents (PPG, DQN), and a modified deep Q-Network, which we call the Action-Head DQN (AH-DQN), that addresses the specific challenges imposed by the large state and action space of our RL formulation.

Through live experiments we were able to show that our offline performance estimates were strongly correlated with online results. This then opened the door for large scale experimentation and application of Reinforcement Learning across Spotify, enabled by the technological foundations unlocked by TensorFlow and TF-Agents.

In this post we’ll provide more details about our RL problem and how we used TF-Agents to enable this work end to end.

The RL Loop and Simulated Users

In RL, Agents interact with the environment continuously. At a given time step t, the Agent consumes an observation from the environment and, using this observation, produces an action according to its policy. The environment then processes the action and emits both a reward and the next observation. (Although the terms are often used interchangeably, the State is the complete information required to summarize the environment after an action, whereas the Observation is the portion of this information actually exposed to the Agent.)

In our case the reward emitted from the environment is the response of a user to music recommendations driven by the Agent’s action. In the absence of a simulator we would need to expose real users to Agents to observe rewards. We utilize a model-based RL approach to avoid letting an untrained Agent interact with real users (with the potential of hurting user satisfaction in the training process).

In this model-based RL formulation the Agent is not trained online against real users. Instead, it makes use of a user model that predicts responses to a list of tracks derived via the Agent’s action. Using this model we optimize actions in such a way as to maximize a (simulated) user satisfaction metric. During the training phase the environment makes use of this user model to return a predicted user response to the action recommended by the Agent.
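The loop described above can be sketched in a few lines of Python. This is a toy illustration only, not the Spotify simulator: `ToyUserModel`, `ToyEnvironment`, and `run_episode` are hypothetical names, and the "user model" is a fixed lookup table standing in for a trained model.

```python
from typing import Callable, Dict, List, Tuple


class ToyUserModel:
    """Hypothetical stand-in for a trained user model."""

    def __init__(self, completion_prob: Dict[str, float]):
        self._completion_prob = completion_prob

    def predict(self, track: str) -> float:
        # Predicted probability that the simulated user completes the track.
        return self._completion_prob[track]


class ToyEnvironment:
    """Minimal environment whose rewards come from the user model, not real users."""

    def __init__(self, user_model: ToyUserModel, tracks: List[str], max_steps: int = 3):
        self._user_model = user_model
        self._tracks = tracks
        self._max_steps = max_steps
        self._t = 0

    def reset(self) -> List[float]:
        self._t = 0
        return [0.0]  # Observation: the last reward seen (none yet).

    def step(self, action: int) -> Tuple[List[float], float, bool]:
        # The action indexes a track; the user model supplies the reward.
        reward = self._user_model.predict(self._tracks[action])
        self._t += 1
        done = self._t >= self._max_steps
        return [reward], reward, done


def run_episode(env: ToyEnvironment, policy: Callable[[List[float]], int]) -> float:
    """Runs one observation -> action -> reward loop and returns the total reward."""
    observation, total, done = env.reset(), 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total += reward
    return total


def always_second_track(observation: List[float]) -> int:
    # A fixed policy used only to exercise the loop.
    return 1


env = ToyEnvironment(ToyUserModel({"a": 0.2, "b": 0.9}), tracks=["a", "b"])
episode_return = run_episode(env, always_second_track)
```

The point of the sketch is that the reward in `step` is produced entirely by the user model, so an untrained policy can be evaluated without exposing it to real listeners.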

We use Keras to design and train our user model. The serialized user model is then unpacked by the simulator and used to calculate rewards during Agent training and evaluation.

Simulator Design

In the abstract, what we needed to build was clear. We needed a way to simulate user listening sessions for the Agent. Given a simulated user and some content, instantiate a listening session and let the Agent drive recommendations in that session. Allow the simulated user to “react” to these recommendations and let the Agent adjust its strategy based on this result to drive some expected cumulative reward.

The TensorFlow Agents environment design guided us in developing the modular components of our system, each of which was responsible for different parts of the overall simulation.

In our codebase we define an environment abstraction that requires the following be defined for every concrete instantiation:

from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class AbstractEnvironment(ABC):
    # Components that every concrete environment must supply.
    _user_model: AbstractUserModel = None
    _track_sampler: AbstractTrackSampler = None
    _episode_tracker: EpisodeTracker = None
    _episode_sampler: AbstractEpisodeSampler = None

    @abstractmethod
    def reset(self) -> List[float]:
        pass

    @abstractmethod
    def step(self, action: float) -> Tuple[List[float], float, bool]:
        pass

    @abstractmethod
    def observation_space(self) -> Dict:
        pass

    @abstractmethod
    def action_space(self) -> Dict:
        pass

Set-Up

At the start of Agent training we need to instantiate a simulation environment that has representations of hypothetical users and the content we’re looking to recommend to them. We base these instantiations on both real and hypothetical Spotify listening experiences. The critical information that defines these instantiations is passed to the environment via _episode_sampler. As mentioned, we also need to provide the simulator with a trained user model, in this case via _user_model.

Actions and Observations

Just like any Agent environment, our simulator requires that we specify the action_spec and observation_spec. Actions in our case may be continuous or discrete depending both on our Agent selection and how we propose to translate an Agent’s action into actual recommendations. We typically recommend ordered lists of items drawn from a pool of potential items. Formulating this action space directly would lead to it being combinatorially complex. We also assume the user will interact with multiple items, and as such previous work in this area that relies on single choice assumptions doesn’t apply.

In the absence of a discrete action space consisting of item collections we need to provide the simulator with a method for turning the Agent’s action into actual recommendations. This logic is contained in _track_sampler. The “example play modes” proposed by the episode sampler contain information on items that can be presented to the simulated user. The track sampler consumes these and the Agent’s action and returns actual item recommendations.

Flow chart of Agent actions_spec and observation_spec combining to create a recommendation

Termination and Reset

We also need to handle the episode termination dynamics. In our simulator, the reset rules are set by the model builder and based on empirical investigations of interaction data relevant to a specific music listening experience. As a hypothetical, we may determine that 92% of listening sessions terminate after 6 sequential track skips, and we’d construct our simulation termination logic to match. This also requires abstractions in our simulator that let us check, after each step, whether the episode should be terminated.

When the episode is reset the simulator will sample a new hypothetical user listening session pair and begin the next episode.
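A termination rule of the kind described in this section (end the session after a fixed number of consecutive skips) might look like the following sketch. The class and function names are hypothetical, and the threshold of 6 simply mirrors the hypothetical figure quoted above.

```python
from typing import List


class ConsecutiveSkipTermination:
    """Ends an episode after `max_consecutive_skips` skips in a row (illustrative rule)."""

    def __init__(self, max_consecutive_skips: int = 6):
        self._max = max_consecutive_skips
        self._streak = 0

    def reset(self) -> None:
        # Called when a new episode starts.
        self._streak = 0

    def update(self, skipped: bool) -> bool:
        """Records one simulated response; returns True if the episode should end."""
        self._streak = self._streak + 1 if skipped else 0
        return self._streak >= self._max


def episode_length(responses: List[bool], rule: ConsecutiveSkipTermination) -> int:
    """Number of steps taken before the rule terminates the episode."""
    rule.reset()
    for step, skipped in enumerate(responses, start=1):
        if rule.update(skipped):
            return step
    return len(responses)
```

Keeping the rule in its own small object makes it easy to call after every step and to swap in a different rule for a different listening experience.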

Episode Steps

As with standard TF-Agents Environments, we need to define the step dynamics for our simulation. Some dynamics of the simulation are optional, and we need to make sure they are enforced at each step when enabled. For example, we may require that the same item cannot be recommended more than once. If the Agent’s action indicates a recommendation of an item that was previously recommended, we need to build in the functionality to pick the next best item based on this action.

We also need to call the termination (and other supporting functions) mentioned above as needed at each step.
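The "no repeated items" constraint described in this section can be sketched as a fallback to the next best candidate. This assumes, purely for illustration, that the Agent's action has already been turned into one score per candidate item; `pick_next_best` is a hypothetical helper, not part of TF-Agents.

```python
from typing import Sequence, Set


def pick_next_best(scores: Sequence[float], already_recommended: Set[int]) -> int:
    """Returns the index of the highest-scoring item not yet recommended."""
    candidates = [i for i in range(len(scores)) if i not in already_recommended]
    if not candidates:
        raise ValueError("no items left to recommend")
    # Ties are resolved in favor of the lowest index.
    return max(candidates, key=lambda i: scores[i])
```

If the top-scoring item was already played, the function skips it and returns the best remaining index.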

Episode Storage and Replay

The functionality mentioned up until this point collectively created a very complex simulation setup. While the TF-Agents replay buffer provided us with the functionality required to store episodes for Agent training and evaluation, we quickly realized that we needed to store more episode data, both for debugging and for more detailed evaluations specific to our simulation and distinct from standard Agent performance measures.

We thus allowed for the inclusion of an expanded _episode_tracker that would store additional information about the user model predictions, information noting the sampled users/content pairs, and more.

Creating TF-Agent Environments

Our environment abstraction gives us a template that matches that of a standard TF-Agents Environment class. Some inputs to our environment need to be resolved before we can actually create the concrete TF-Agents environment instance. This happens in three steps.

First we define a specific simulation environment that conforms to our abstraction. For example:

class PlaylistEnvironment(AbstractEnvironment):
    def __init__(
        self,
        user_model: AbstractUserModel,
        track_sampler: AbstractTrackSampler,
        episode_tracker: EpisodeTracker,
        episode_sampler: AbstractEpisodeSampler,
        ....
    ):
        ...

Next we use an Environment Builder Class that takes as input a user model, track sampler, etc. and an environment class like PlaylistEnvironment. The builder creates a concrete instance of this environment:

self.playlist_env: PlaylistEnvironment = environment_ctor(
    user_model=user_model,
    track_sampler=track_sampler,
    episode_tracker=episode_tracker,
    episode_sampler=self._eps_sampler,
)

Lastly, we utilize a conversion class that constructs a TF-Agents Environment from a concrete instance of ours:

class TFAgtPyEnvironment(py_environment.PyEnvironment):
    def __init__(self, environment: AbstractEnvironment):
        super().__init__()
        self.env = environment

This is then executed internally to our Environment Builder:

class EnvironmentBuilder(AbstractEnvironmentBuilder):
    def __init__(self, ...):
        ...

    def get_tf_env(self):
        ...
        tf_env: TFAgtPyEnvironment = TFAgtPyEnvironment(self.playlist_env)
        return tf_env

The resulting TensorFlow Agents environment can then be used for Agent training.

This simulator design allows us to easily create and manage multiple environments with a variety of different configurations as needed.

We next discuss how we used our simulator to train RL Agents to generate Playlists.

A Customized Agent for Playlist Generation

As mentioned, Reinforcement Learning provides us with a set of methods that naturally accommodates the sequential nature of music listening, allowing us to adapt to users’ ever-evolving preferences as sessions progress.

One specific problem we can attempt to use RL to solve is that of automatic music playlist generation. Given a (large) set of tracks, we want to learn how to create one optimal playlist to recommend to the user in order to maximize satisfaction metrics. Our use case is different from standard slate recommendation tasks, where usually the target is to select at most one item in the sequence. In our case, we assume we have a user-generated response for multiple items in the slate, making slate recommendation systems not directly applicable. Another complication is that the set of tracks from which recommendations are drawn is ever changing.

We designed a DQN variant capable of handling these constraints that we called an Action-Head DQN (AH-DQN).

Moving image of AH-DQN network creating recommendations based on changing variables

The AH-DQN network takes as input the current state and an available action to produce a single Q value for the input action. This process is repeated for every possible item in the input. Finally, the item with the highest Q value is selected and added to the slate, and the process continues until the slate is full.
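The selection procedure described above (score every available item, take the highest Q value, repeat until the slate is full) can be sketched as a greedy loop. This is a simplified illustration, not the AH-DQN implementation: `build_slate` and `dot_q` are hypothetical, the Q-function is a plain dot product standing in for the network, the state is held fixed between picks, and removing a chosen item from the candidate pool is an assumption.

```python
from typing import Callable, Dict, List, Sequence

QFunction = Callable[[Sequence[float], Sequence[float]], float]


def build_slate(
    state: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    q_fn: QFunction,
    slate_size: int,
) -> List[str]:
    """Greedily fills a slate with the items that have the highest Q value."""
    remaining = dict(candidates)  # Copy so the caller's pool is not mutated.
    slate: List[str] = []
    while remaining and len(slate) < slate_size:
        # One Q value per (state, item) pair, as in the description above.
        best = max(remaining, key=lambda item: q_fn(state, remaining[item]))
        slate.append(best)
        del remaining[best]  # Assumption: an item appears at most once per slate.
    return slate


def dot_q(state: Sequence[float], item_features: Sequence[float]) -> float:
    # Stand-in for the learned network: a dot product of state and item features.
    return sum(s * f for s, f in zip(state, item_features))


example_slate = build_slate(
    state=[1.0, 0.0],
    candidates={"x": [0.2, 0.9], "y": [0.8, 0.1], "z": [0.5, 0.5]},
    q_fn=dot_q,
    slate_size=2,
)
```

Because each item is scored independently given the state, the candidate pool can change between calls without changing the network's output dimension, which matches the motivation given for handling an ever-changing set of tracks.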

Experiments In Brief

We tested our approach both offline and online at scale to assess the ability of the Agent to power our real-world recommender systems. In addition to testing the Agent itself we were also keen to assess the extent to which our offline performance estimates for various policies returned by our simulator matched (or at least directionally aligned) with our online results.

Graph measuring simulated performance assessment by scaled online reward for different policies

We observed this directional alignment for numerous naive, heuristic, model driven, and RL policies.

Please refer to our KDD paper for more information on the specifics of our model-based RL approach and Agent design.

Automatic Music Playlist Generation via Simulation-based Reinforcement Learning

Federico Tomasi, Joseph Cauteruccio, Surya Kanoria, Kamil Ciosek, Matteo Rinaldi, and Zhenwen Dai

KDD 2023

Acknowledgements

We’d like to thank all our Spotify teammates past and present who contributed to this work. Particularly, we’d like to thank Mehdi Ben Ayed for his early work in helping to develop our RL codebase. We’d also like to thank the TensorFlow Agents team for their support and encouragement throughout this project (and for the library that made it possible).

Defect detection in high-resolution imagery using two-stage Amazon Rekognition Custom Labels models

High-resolution imagery is very prevalent in today’s world, from satellite imagery to drones and DSLR cameras. From this imagery, we can capture damage due to natural disasters, anomalies in manufacturing equipment, or very small defects such as defects on printed circuit boards (PCBs) or semiconductors. Building anomaly detection models using high-resolution imagery can be challenging because modern computer vision models typically resize images to a lower resolution to fit into memory for training and running inference. Reducing the image resolution significantly means that visual information relating to the defect is degraded or completely lost.

One approach to overcome these challenges is to build two-stage models. Stage 1 models detect a region of interest, and Stage 2 models detect defects on the cropped region of interest, thereby maintaining sufficient resolution for small defects.

In this post, we go over how to build an effective two-stage defect detection system using Amazon Rekognition Custom Labels and compare results for this specific use case with one-stage models. Note that several one-stage models are effective even at lower or resized image resolutions, and others may accommodate large images in smaller batches.

Solution overview

For our use case, we use a dataset of images of PCBs with synthetically generated missing hole pins, as shown in the following example.

We use this dataset to demonstrate that a one-stage approach using object detection results in subpar detection performance for the missing hole pin defects. A two-step model is preferred, in which we use Rekognition Custom Labels first for object detection to identify the pins and then a second-stage model to classify cropped images of the pins into pins with missing holes or normal pins.

The training process for a Rekognition Custom Labels model consists of several steps, as illustrated in the following diagram.

First, we use Amazon Simple Storage Service (Amazon S3) to store the image data. The data is ingested in Amazon SageMaker Jupyter notebooks, where typically a data scientist will inspect the images and preprocess them, removing any images that are of poor quality (such as blurred images or images taken in poor lighting conditions) and resizing or cropping the rest. Then data is split into training and test sets, and Amazon SageMaker Ground Truth labeling jobs are run to label the sets of images and output a train and test manifest file. The manifest files are used by Rekognition Custom Labels for training.

One-stage model approach

The first approach we take to identifying missing holes on the PCB is to label the missing holes and train an object detection model to identify the missing holes. The following is an image example from the dataset.

We train a model with a dataset with 95 images used as training and 20 images used for testing. The following table summarizes our results.

Evaluation Results
F1 Score		Average Precision		Overall Recall
0.468		0.750		0.340
Training Time		Training Dataset		Testing Dataset
Trained in 1.791 hours		1 label, 95 images		1 label, 20 images
Per Label Performance
Label Name	F1 Score	Test Images	Precision	Recall	Assumed Threshold
`missing_hole`	0.468	20	0.750	0.340	0.053

The resulting model has high precision but low recall, meaning that when we localize a region for a missing hole, we’re usually correct, but we’re missing a lot of missing holes that are present on the PCB. To build an effective defect detection system, we need to improve recall. The low performance of this model may be because the defects are small relative to this high-resolution image of the PCB and because the model has no reference for a healthy pin.
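As a quick check on how the metrics in the table relate, the F1 score is the harmonic mean of precision and recall. The following sketch recomputes it from the reported precision (0.750) and recall (0.340); the function name is ours, and how the Rekognition console aggregates scores across labels is not covered here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the one-stage, full-image model.
full_board_f1 = round(f1_score(0.750, 0.340), 3)
```

The result agrees with the reported F1 score of 0.468 and shows why the low recall dominates: a harmonic mean is pulled toward the smaller of its two inputs.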

Next, we explore splitting the image into four or six crops depending on the PCB size and labeling both healthy and missing holes. The following is an example of the resulting cropped image.

We train a model with 524 images used as training and 106 images used for testing. We maintain the same PCBs used in train and test as the full board model. The results for cropped healthy pins vs. missing holes are shown in the following table.

Evaluation Results
F1 Score		Average Precision		Overall Recall
0.967		0.989		0.945
Training Time		Training Dataset		Testing Dataset
Trained in 2.118 hours		2 labels, 524 images		2 labels, 106 images
Per Label Performance
Label Name	F1 Score	Test Images	Precision	Recall	Assumed Threshold
`missing_hole`	0.949	42	0.980	0.920	0.536
`pin`	0.984	106	0.998	0.970	0.696

Both precision and recall have improved significantly. Training the model with zoomed-in cropped images and giving it a reference for healthy pins helped. However, recall for missing holes is still at 92%, meaning that we would still miss 8% of the missing holes and let defects go by unnoticed.
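The cropping step used in this experiment (splitting each board image into four or six crops) can be sketched as a simple grid split. The grid layouts (2×2 and 2×3) and the function name are assumptions for illustration; the post does not state how the crops were arranged.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_crop_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Splits an image of the given size into rows x cols non-overlapping crop boxes."""
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            # Integer arithmetic keeps neighboring crops flush, with no gaps or overlap.
            left = c * width // cols
            right = (c + 1) * width // cols
            top = r * height // rows
            bottom = (r + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


six_crops = grid_crop_boxes(width=3000, height=2000, rows=2, cols=3)
```

Each box uses the (left, top, right, bottom) order that common imaging libraries expect for cropping, so the output can be passed directly to a crop call.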

Next, we explore a two-stage model approach in which we can improve the model performance further.

Two-stage model approach

For the two-stage model, we train two models: one for detecting pins and one for detecting whether the pin has a missing hole on zoomed-in cropped images of the pin. The following is an image from the pin detection dataset.

The data is similar to our previous experiment, in which we cropped the PCB into four or six cropped images. This time, we label all pins and don’t make any distinctions if the pin has a missing hole or not. We train this model with 522 images and test with 108 images, maintaining the same train/test split as previous experiments. The results are shown in the following table.

Evaluation Results
F1 Score		Average Precision		Overall Recall
1.000		0.999		1.000
Training Time		Training Dataset		Testing Dataset
Trained in 1.581 hours		1 label, 522 images		1 label, 108 images
Per Label Performance
Label Name	F1 Score	Test Images	Precision	Recall	Assumed Threshold
`pin`	1.000	108	0.999	1.000	0.617

The model detects the pins perfectly on this synthetic dataset.

Next, we build the model to make the distinction for missing holes. We use cropped images of the holes to train the second stage of the model, as shown in the following examples. This model is separate from the previous models because it’s a classification model and will be focused on the narrow task of determining if the pin has a missing hole.

We train this second-stage model on 16,624 images and test on 3,266, maintaining the same train/test splits as the previous experiments. The following table summarizes our results.

Evaluation Results
F1 Score		Average Precision		Overall Recall
1.000		1.000		1.000
Training Time		Training Dataset		Testing Dataset
Trained in 6.660 hours		2 labels, 16,624 images		2 labels, 3,266 images
Per Label Performance
Label Name	F1 Score	Test Images	Precision	Recall	Assumed Threshold
`anomaly`	1.000	88	1.000	1.000	0.960
`normal`	1.000	3,178	1.000	1.000	0.996

Again, we obtain perfect precision and recall on this synthetic dataset. Combining the previous pin detection model with this second-stage missing hole classification model, we can build a model that outperforms any single-stage model.

The following table summarizes the experiments we conducted.

Experiment	Type	Description	F1 Score	Precision	Recall
1	One-stage model	Object detection model to detect missing holes on full images	0.468	0.75	0.34
2	One-stage model	Object detection model to detect healthy pins and missing holes on cropped images	0.967	0.989	0.945
3	Two-stage model	Stage 1: Object detection on all pins	1.000	0.999	1.000
		Stage 2: Image classification of healthy pin or missing holes	1.000	1.000	1.000
		End-to-end average	1.000	0.9995	1.000

Inference pipeline

You can use the following architecture to deploy the one-stage and two-stage models that we described in this post. The following main components are involved:

Amazon API Gateway
AWS Lambda
An Amazon Rekognition custom endpoint

For one-stage models, you can send an input image to the API Gateway endpoint, followed by Lambda for any basic image preprocessing, and route to the Rekognition Custom Labels trained model endpoint. In our experiments, we explored one-stage models that can detect only missing holes, and missing holes and healthy pins.

For two-stage models, you can similarly send an image to the API Gateway endpoint, followed by Lambda. Lambda acts as an orchestrator that first calls the object detection model (trained using Rekognition Custom Labels), which generates the region of interest. The original image is then cropped in the Lambda function, and sent to another Rekognition Custom Labels classification model for detecting defects in each cropped image.
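The orchestration logic inside Lambda for the two-stage flow could look roughly like the sketch below. It is an illustration under several assumptions: `two_stage_detect`, `to_pixel_box`, and the `crop` callable are hypothetical names; `client` is expected to behave like a boto3 Rekognition client (only its `detect_custom_labels` call is used); and the actual cropping, for example with an imaging library, is left to the caller.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def to_pixel_box(bounding_box: Dict[str, float], width: int, height: int) -> Box:
    """Converts a relative BoundingBox (Left/Top/Width/Height ratios) to pixels."""
    left = int(bounding_box["Left"] * width)
    top = int(bounding_box["Top"] * height)
    right = int((bounding_box["Left"] + bounding_box["Width"]) * width)
    bottom = int((bounding_box["Top"] + bounding_box["Height"]) * height)
    return left, top, right, bottom


def two_stage_detect(
    client,
    image_bytes: bytes,
    image_size: Tuple[int, int],
    crop: Callable[[bytes, Box], bytes],
    detector_arn: str,
    classifier_arn: str,
) -> List[Dict]:
    """Stage 1 finds regions of interest; stage 2 classifies each cropped region."""
    width, height = image_size
    stage1 = client.detect_custom_labels(
        ProjectVersionArn=detector_arn, Image={"Bytes": image_bytes}
    )
    results: List[Dict] = []
    for label in stage1["CustomLabels"]:
        if "Geometry" not in label:
            continue  # Image-level labels carry no bounding box to crop.
        box = to_pixel_box(label["Geometry"]["BoundingBox"], width, height)
        stage2 = client.detect_custom_labels(
            ProjectVersionArn=classifier_arn, Image={"Bytes": crop(image_bytes, box)}
        )
        # Keep the most confident classification for this crop, if any.
        best = max(stage2["CustomLabels"], key=lambda l: l["Confidence"], default=None)
        results.append({"box": box, "label": best["Name"] if best else None})
    return results
```

Passing the client and the crop function in as arguments keeps the orchestration easy to unit test with stand-ins before wiring it to real model endpoints.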

Conclusion

In this post, we trained one- and two-stage models to detect missing holes in PCBs using Rekognition Custom Labels. We reported results for various models; in our case, two-stage models outperformed other variants. We encourage customers with high-resolution imagery from other domains to test model performance with one- and two-stage models. Additionally, consider the following ways to expand the solution:

Sliding window crops for your actual datasets
Reusing your object detection models in the same pipeline
Pre-labeling workflows using bounding box predictions

About the authors

Andreas Karagounis is a Data Science Manager at Accenture. He holds a masters in Computer Science from Brown University. He has a background in computer vision and works with customers to solve their business challenges using data science and machine learning.

Yogesh Chaturvedi is a Principal Solutions Architect at AWS with a focus in computer vision. He works with customers to address their business challenges using cloud technologies. Outside of work, he enjoys hiking, traveling, and watching sports.

Shreyas Subramanian is a Principal Data Scientist, and helps customers by using machine learning to solve their business challenges using the AWS platform. Shreyas has a background in large-scale optimization and machine learning, and in the use of machine learning and reinforcement learning for accelerating optimization tasks.

Selimcan “Can” Sakar is a cloud-first developer and Solutions Architect at AWS Accenture Business Group with a focus on emerging technologies such as GenAI, ML, and blockchain. When he isn’t watching models converge, he can be seen biking or playing the clarinet.

Automatically redact PII for machine learning using Amazon SageMaker Data Wrangler

Customers increasingly want to use deep learning approaches such as large language models (LLMs) to automate the extraction of data and insights. For many industries, data that is useful for machine learning (ML) may contain personally identifiable information (PII). To ensure customer privacy and maintain regulatory compliance while training, fine-tuning, and using deep learning models, it’s often necessary to first redact PII from source data.

This post demonstrates how to use Amazon SageMaker Data Wrangler and Amazon Comprehend to automatically redact PII from tabular data as part of your machine learning operations (ML Ops) workflow.

Problem: ML data that contains PII

PII is defined as any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. PII is information that either directly identifies an individual (name, address, social security number or other identifying number or code, telephone number, email address, and so on) or information that an agency intends to use to identify specific individuals in conjunction with other data elements, namely, indirect identification.

Customers in business domains such as financial, retail, legal, and government deal with PII data on a regular basis. Due to various government regulations and rules, customers have to find a mechanism to handle this sensitive data with appropriate security measures to avoid regulatory fines, possible fraud, and defamation. PII redaction is the process of masking or removing sensitive information from a document so it can be used and distributed, while still protecting confidential information.

Businesses need to deliver delightful customer experiences and better business outcomes by using ML. Redaction of PII data is often a key first step to unlock the larger and richer data streams needed to use or fine-tune generative AI models, without worrying about whether their enterprise data (or that of their customers) will be compromised.

Solution overview

This solution uses Amazon Comprehend and SageMaker Data Wrangler to automatically redact PII data from a sample dataset.

Amazon Comprehend is a natural language processing (NLP) service that uses ML to uncover insights and relationships in unstructured data, with no managing infrastructure or ML experience required. It provides functionality to locate various PII entity types within text, for example names or credit card numbers. Although the latest generative AI models have demonstrated some PII redaction capability, they generally don’t provide a confidence score for PII identification or structured data describing what was redacted. The PII functionality of Amazon Comprehend returns both, enabling you to create redaction workflows that are fully auditable at scale. Additionally, using Amazon Comprehend with AWS PrivateLink means that customer data never leaves the AWS network and is continuously secured with the same data access and privacy controls as the rest of your applications.

Similar to Amazon Comprehend, Amazon Macie uses a rules-based engine to identify sensitive data (including PII) stored in Amazon Simple Storage Service (Amazon S3). However, its rules-based approach relies on having specific keywords that indicate sensitive data located close to that data (within 30 characters). In contrast, the NLP-based ML approach of Amazon Comprehend uses semantic understanding of longer chunks of text to identify PII, making it more useful for finding PII within unstructured data.

Additionally, for tabular data such as CSV or plain text files, Macie returns less detailed location information than Amazon Comprehend (either a row/column indicator or a line number, respectively, but not start and end character offsets). This makes Amazon Comprehend particularly helpful for redacting PII from unstructured text that may contain a mix of PII and non-PII words (for example, support tickets or LLM prompts) that is stored in a tabular format.
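To illustrate why start and end character offsets matter, the sketch below redacts text from a list of entities shaped like the `Entities` returned by the Amazon Comprehend DetectPiiEntities action (`Type`, `Score`, `BeginOffset`, `EndOffset`). The `redact` function, the `[TYPE]` placeholder format, and the 0.5 score threshold are illustrative choices, and the sample entities are hand-written rather than real service output.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict], min_score: float = 0.5) -> str:
    """Replaces each sufficiently confident PII span with a [TYPE] placeholder."""
    kept = [e for e in entities if e["Score"] >= min_score]
    # Work from the end of the string so earlier offsets stay valid.
    for entity in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = entity["BeginOffset"], entity["EndOffset"]
        text = text[:start] + f"[{entity['Type']}]" + text[end:]
    return text


sample_entities = [
    {"Type": "EMAIL", "Score": 0.99, "BeginOffset": 6, "EndOffset": 21},
    {"Type": "NAME", "Score": 0.10, "BeginOffset": 0, "EndOffset": 5},  # Below threshold.
]
redacted = redact("Email hboyd@gmail.com now", sample_entities)
```

Because each entity carries its own type and confidence score, the same structure can also be logged to record exactly what was redacted and why.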

Amazon SageMaker provides purpose-built tools for ML teams to automate and standardize processes across the ML lifecycle. With SageMaker MLOps tools, teams can easily prepare, train, test, troubleshoot, deploy, and govern ML models at scale, boosting productivity of data scientists and ML engineers while maintaining model performance in production. The following diagram illustrates the SageMaker MLOps workflow.

SageMaker Data Wrangler is a feature of Amazon SageMaker Studio that provides an end-to-end solution to import, prepare, transform, featurize, and analyze datasets stored in locations such as Amazon S3 or Amazon Athena, a common first step in the ML lifecycle. You can use SageMaker Data Wrangler to simplify and streamline dataset preprocessing and feature engineering by either using built-in, no-code transformations or customizing with your own Python scripts.

Using Amazon Comprehend to redact PII as part of a SageMaker Data Wrangler data preparation workflow keeps all downstream uses of the data, such as model training or inference, in alignment with your organization’s PII requirements. You can integrate SageMaker Data Wrangler with Amazon SageMaker Pipelines to automate end-to-end ML operations, including data preparation and PII redaction. For more details, refer to Integrating SageMaker Data Wrangler with SageMaker Pipelines. The rest of this post demonstrates a SageMaker Data Wrangler flow that uses Amazon Comprehend to redact PII from text stored in tabular data format.

This solution uses a public synthetic dataset along with a custom SageMaker Data Wrangler flow, available as a file in GitHub. The steps to use the SageMaker Data Wrangler flow to redact PII are as follows:

Open SageMaker Studio.
Download the SageMaker Data Wrangler flow.
Review the SageMaker Data Wrangler flow.
Add a destination node.
Create a SageMaker Data Wrangler export job.

This walkthrough, including running the export job, should take 20–25 minutes to complete.

Prerequisites

For this walkthrough, you should have the following:

An AWS account.
A SageMaker Studio domain and user. For details on setting these up, refer to Onboard to Amazon SageMaker Domain Using Quick setup. The SageMaker Studio execution role must have permission to call the Amazon Comprehend DetectPiiEntities action.
An S3 bucket for the redacted results.

Open SageMaker Studio

To open SageMaker Studio, complete the following steps:

On the SageMaker console, choose Studio in the navigation pane.
Choose the domain and user profile.
Choose Open Studio.

To get started with the new capabilities of SageMaker Data Wrangler, it’s recommended to upgrade to the latest release.

Download the SageMaker Data Wrangler flow

You first need to retrieve the SageMaker Data Wrangler flow file from GitHub and upload it to SageMaker Studio. Complete the following steps:

Navigate to the SageMaker Data Wrangler redact-pii.flow file on GitHub.
On GitHub, choose the download icon to download the flow file to your local computer.
In SageMaker Studio, choose the file icon in the navigation pane.
Choose the upload icon, then choose redact-pii.flow.

Review the SageMaker Data Wrangler flow

In SageMaker Studio, open redact-pii.flow. After a few minutes, the flow will finish loading and show the flow diagram (see the following screenshot). The flow contains six steps: an S3 Source step followed by five transformation steps.

On the flow diagram, choose the last step, Redact PII. The All Steps pane opens on the right and shows a list of the steps in the flow. You can expand each step to view details, change parameters, and potentially add custom code.

Let’s walk through each step in the flow.

Steps 1 (S3 Source) and 2 (Data types) are added by SageMaker Data Wrangler whenever data is imported for a new flow. In S3 Source, the S3 URI field points to the sample dataset, which is a CSV file stored in Amazon S3. The file contains roughly 116,000 rows, and the flow sets the value of the Sampling field to 1,000, which means that SageMaker Data Wrangler will sample 1,000 rows to display in the user interface. Data types sets the data type for each column of imported data.

Step 3 (Sampling) sets the number of rows SageMaker Data Wrangler will sample for an export job to 5,000, via the Approximate sample size field. Note that this is different from the number of rows sampled to display in the user interface (Step 1). To export data with more rows, you can increase this number or remove Step 3.

Steps 4, 5, and 6 use SageMaker Data Wrangler custom transforms. Custom transforms allow you to run your own Python or SQL code within a Data Wrangler flow. The custom code can be written in four ways:

In SQL, using PySpark SQL to modify the dataset
In Python, using a PySpark data frame and libraries to modify the dataset
In Python, using a pandas data frame and libraries to modify the dataset
In Python, using a user-defined function to modify a column of the dataset

The Python (pandas) approach requires your dataset to fit into memory and can only be run on a single instance, limiting its ability to scale efficiently. When working in Python with larger datasets, we recommend using either the Python (PySpark) or Python (user-defined function) approach. SageMaker Data Wrangler optimizes Python user-defined functions to provide performance similar to an Apache Spark plugin, so you don’t need to know PySpark or pandas. To make this solution as accessible as possible, this post uses a Python user-defined function written in pure Python.

Expand Step 4 (Make PII column) to see its details. This step combines different types of PII data from multiple columns into a single phrase that is saved in a new column, pii_col. The following table shows an example row containing data.

customer_name	customer_job	billing_address	customer_email
Katie	Journalist	19009 Vang Squares Suite 805	hboyd@gmail.com

This is combined into the phrase “Katie is a Journalist who lives at 19009 Vang Squares Suite 805 and can be emailed at hboyd@gmail.com”. The phrase is saved in pii_col, which this post uses as the target column to redact.
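
The combination in Step 4 can be sketched in a few lines of plain Python. This is an illustration only, not the code in the flow: the function name make_pii_phrase is hypothetical, and a dictionary stands in for a row of the dataset.

```python
def make_pii_phrase(row: dict) -> str:
    """Combine several PII fields from one row into a single sentence."""
    return (
        f"{row['customer_name']} is a {row['customer_job']} "
        f"who lives at {row['billing_address']} "
        f"and can be emailed at {row['customer_email']}"
    )


# The example row from the table above.
example_row = {
    "customer_name": "Katie",
    "customer_job": "Journalist",
    "billing_address": "19009 Vang Squares Suite 805",
    "customer_email": "hboyd@gmail.com",
}

pii_col = make_pii_phrase(example_row)
print(pii_col)
```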

Step 5 (Prep for redaction) takes a column to redact (pii_col) and creates a new column (pii_col_prep) that is ready for efficient redaction using Amazon Comprehend. To redact PII from a different column, you can change the Input column field of this step.

There are two factors to consider to efficiently redact data using Amazon Comprehend:

The cost to detect PII is defined on a per-unit basis, where 1 unit = 100 characters, with a 3-unit minimum charge for each document. Because tabular data often contains small amounts of text per cell, it’s generally more time- and cost-efficient to combine text from multiple cells into a single document to send to Amazon Comprehend. Doing this avoids the accumulation of overhead from many repeated function calls and ensures that the data sent is always greater than the 3-unit minimum.
Because we’re doing redaction as one step of a SageMaker Data Wrangler flow, we will be calling Amazon Comprehend synchronously. Amazon Comprehend sets a 100 KB (100,000 character) limit per synchronous function call, so we need to ensure that any text we send is under that limit.
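
To see why combining cells matters for cost, the following sketch compares the billed units for sending cells one at a time with sending them as a single document. It uses only the unit size and minimum stated above; the assumption that partial units round up, the function name billed_units, and the workload of 1,000 cells with 40 characters each are all hypothetical.

```python
import math

UNIT_CHARS = 100  # 1 unit = 100 characters
MIN_UNITS = 3     # 3-unit minimum charge per document


def billed_units(num_chars: int) -> int:
    """Units billed for one document, assuming partial units round up."""
    return max(MIN_UNITS, math.ceil(num_chars / UNIT_CHARS))


# Hypothetical workload: 1,000 cells of 40 characters each.
cell_lengths = [40] * 1000

per_cell = sum(billed_units(n) for n in cell_lengths)  # one document per cell
combined = billed_units(sum(cell_lengths))             # one 40,000-character document

print(per_cell, combined)  # 3000 400
```

In this hypothetical case the combined document also stays under the 100,000-character synchronous limit, so it fits in a single call.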

Given these factors, Step 5 prepares the data to send to Amazon Comprehend by appending a delimiter string to the end of the text in each cell. For the delimiter, you can use any string that doesn’t occur in the column being redacted (ideally one with as few characters as possible, because the delimiter characters count toward the Amazon Comprehend character total). Adding this cell delimiter allows us to optimize the call to Amazon Comprehend, and will be discussed further in Step 6.

Note that if the text in any individual cell is longer than the Amazon Comprehend limit, the code in this step truncates it to 100,000 characters (roughly equivalent to 15,000 words or 30 single-spaced pages). Although this amount of text is unlikely to be stored in a single cell, you can modify the transformation code to handle this edge case another way if needed.
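
The preparation described for Step 5 can be sketched as follows. This is not the code from the flow: the function name prep_for_redaction and the delimiter value are hypothetical, a list of strings stands in for the column, and the sketch assumes that truncation should leave room for the delimiter so that each prepared cell still fits within the limit.

```python
COMPREHEND_MAX_CHARS = 100_000  # synchronous call limit cited above
CELL_DELIM = "||"               # hypothetical delimiter; must not occur in the data


def prep_for_redaction(cells, delim=CELL_DELIM, max_chars=COMPREHEND_MAX_CHARS):
    """Truncate each cell so that cell plus delimiter fits the limit, then append the delimiter."""
    budget = max_chars - len(delim)
    return [str(cell)[:budget] + delim for cell in cells]


# Small limit used here only to make the truncation visible.
prepped = prep_for_redaction(["abc", "x" * 200], max_chars=10)
print(prepped)
```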

Step 6 (Redact PII) takes a column name to redact as input (pii_col_prep) and saves the redacted text to a new column (pii_redacted). When you use a Python custom function transform, SageMaker Data Wrangler defines an empty custom_func that takes a pandas series (a column of text) as input and returns a modified pandas series of the same length. The following screenshot shows part of the Redact PII step.

The function custom_func contains two helper (inner) functions:

make_text_chunks – This function does the work of concatenating text from individual cells in the series (including their delimiters) into longer strings (chunks) to send to Amazon Comprehend.
redact_pii – This function takes text as input, calls Amazon Comprehend to detect PII, redacts any that is found, and returns the redacted text. Redaction is done by replacing any PII text with the type of PII found in square brackets, for example John Smith would be replaced with [NAME]. You can modify this function to replace PII with any string, including the empty string (“”) to remove it. You also could modify the function to check the confidence score of each PII entity and only redact if it’s above a specific threshold.
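
The two helpers could look roughly like the sketch below. It is an illustration under stated assumptions, not the code shipped in the flow: apply_redactions and the min_score parameter are hypothetical names, chunking is done greedily on cell boundaries, and the language is assumed to be English. The Amazon Comprehend call uses the boto3 detect_pii_entities operation, whose response lists entities with Type, Score, BeginOffset, and EndOffset fields.

```python
def make_text_chunks(cells, max_chars):
    """Greedily concatenate delimited cells into chunks of at most max_chars."""
    chunks, current = [], ""
    for cell in cells:
        # Start a new chunk when adding this cell would exceed the limit.
        if current and len(current) + len(cell) > max_chars:
            chunks.append(current)
            current = ""
        current += cell
    if current:
        chunks.append(current)
    return chunks


def apply_redactions(text, entities, min_score=0.0):
    """Replace each detected span with its PII type in square brackets."""
    # Work from the end of the text so that earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if ent["Score"] >= min_score:
            text = text[: ent["BeginOffset"]] + f"[{ent['Type']}]" + text[ent["EndOffset"]:]
    return text


def redact_pii(text, min_score=0.0):
    """Detect PII with Amazon Comprehend and return the redacted text."""
    import boto3  # imported here so the helpers above run without AWS access

    client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode="en")
    return apply_redactions(text, response["Entities"], min_score)
```

Keeping the string replacement in its own function makes the confidence threshold easy to test locally, without calling the service.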

After the inner functions are defined, custom_func uses them to do the redaction, as shown in the following code excerpt. When the redaction is complete, it converts the chunks back into original cells, which it saves in the pii_redacted column.

# concatenate text from cells into longer chunks
chunks = make_text_chunks(series, COMPREHEND_MAX_CHARS)

redacted_chunks = []
# call Comprehend once for each chunk, and redact
for text in chunks:
  redacted_text = redact_pii(text)
  redacted_chunks.append(redacted_text)
  
# join all redacted chunks into one text string
redacted_text = ''.join(redacted_chunks)

# split back to list of the original rows
redacted_rows = redacted_text.split(CELL_DELIM)

Add a destination node

To see the result of your transformations, SageMaker Data Wrangler supports exporting to Amazon S3, SageMaker Pipelines, Amazon SageMaker Feature Store, and Python code. To export the redacted data to Amazon S3, we first need to create a destination node:

In the SageMaker Data Wrangler flow diagram, choose the plus sign next to the Redact PII step.
Choose Add destination, then choose Amazon S3.
Provide an output name for your transformed dataset.
Browse or enter the S3 location to store the redacted data file.
Choose Add destination.

You should now see the destination node at the end of your data flow.

Create a SageMaker Data Wrangler export job

Now that the destination node has been added, we can create the export job to process the dataset:

In SageMaker Data Wrangler, choose Create job.
The destination node you just added should already be selected. Choose Next.
Accept the defaults for all other options, then choose Run.

This creates a SageMaker Processing job. To view the status of the job, navigate to the SageMaker console. In the navigation pane, expand the Processing section and choose Processing jobs. Redacting all 116,000 cells in the target column using the default export job settings (two ml.m5.4xlarge instances) takes roughly 8 minutes and costs approximately $0.25. When the job is complete, download the output file with the redacted column from Amazon S3.

Clean up

The SageMaker Data Wrangler application runs on an ml.m5.4xlarge instance. To shut it down, in SageMaker Studio, choose Running Terminals and Kernels in the navigation pane. In the RUNNING INSTANCES section, find the instance labeled Data Wrangler and choose the shutdown icon next to it. This shuts down the SageMaker Data Wrangler application running on the instance.

Conclusion

In this post, we discussed how to use custom transformations in SageMaker Data Wrangler and Amazon Comprehend to redact PII data from your ML dataset. You can download the SageMaker Data Wrangler flow and start redacting PII from your tabular data today.

For other ways to enhance your MLOps workflow using SageMaker Data Wrangler custom transformations, check out Authoring custom transformations in Amazon SageMaker Data Wrangler using NLTK and SciPy. For more data preparation options, check out the blog post series that explains how to use Amazon Comprehend to redact, translate, and analyze text from either Amazon Athena or Amazon Redshift.

About the Authors

Tricia Jamison is a Senior Prototyping Architect on the AWS Prototyping and Cloud Acceleration (PACE) Team, where she helps AWS customers implement innovative solutions to challenging problems with machine learning, internet of things (IoT), and serverless technologies. She lives in New York City and enjoys basketball, long distance treks, and staying one step ahead of her children.

Neelam Koshiya is an Enterprise Solutions Architect at AWS. With a background in software engineering, she organically moved into an architecture role. Her current focus is helping enterprise customers with their cloud adoption journey for strategic business outcomes with the area of depth being AI/ML. She is passionate about innovation and inclusion. In her spare time, she enjoys reading and being outdoors.

Adeleke Coker is a Global Solutions Architect with AWS. He works with customers globally to provide guidance and technical assistance in deploying production workloads at scale on AWS. In his spare time, he enjoys learning, reading, gaming and watching sport events.

English learners can now practice speaking on Search

Posted by Christian Plagemann, Director, and Katya Cox, Product Manager, Google Research

Learning a language can open up new opportunities in a person’s life. It can help people connect with those from different cultures, travel the world, and advance their career. English alone is estimated to have 1.5 billion learners worldwide. Yet proficiency in a new language is difficult to achieve, and many learners cite a lack of opportunity to practice speaking actively and receiving actionable feedback as a barrier to learning.

We are excited to announce a new feature of Google Search that helps people practice speaking and improve their language skills. Within the next few days, Android users in Argentina, Colombia, India (Hindi), Indonesia, Mexico, and Venezuela can get even more language support from Google through interactive speaking practice in English — expanding to more countries and languages in the future. Google Search is already a valuable tool for language learners, providing translations, definitions, and other resources to improve vocabulary. Now, learners translating to or from English on their Android phones will find a new English speaking practice experience with personalized feedback.

A new feature of Google Search allows learners to practice speaking words in context.

Learners are presented with real-life prompts and then form their own spoken answers using a provided vocabulary word. They engage in practice sessions of 3-5 minutes, getting personalized feedback and the option to sign up for daily reminders to keep practicing. With only a smartphone and some quality time, learners can practice at their own pace, anytime, anywhere.

Activities with personalized feedback, to supplement existing learning tools

Designed to be used alongside other learning services and resources, like personal tutoring, mobile apps, and classes, the new speaking practice feature on Google Search is another tool to assist learners on their journey.

We have partnered with linguists, teachers, and ESL/EFL pedagogical experts to create a speaking practice experience that is effective and motivating. Learners practice vocabulary in authentic contexts, and material is repeated over dynamic intervals to increase retention — approaches that are known to be effective in helping learners become confident speakers. As one partner of ours shared:

“Speaking in a given context is a skill that language learners often lack the opportunity to practice. Therefore this tool is very useful to complement classes and other resources.” – Judit Kormos, Professor, Lancaster University

We are also excited to be working with several language learning partners to surface content they are helping create and to connect them with learners around the world. We look forward to expanding this program further and working with any interested partner.

Personalized real-time feedback

Every learner is different, so delivering personalized feedback in real time is a key part of effective practice. Responses are analyzed to provide helpful, real-time suggestions and corrections.

The system gives semantic feedback, indicating whether the learner’s response was relevant to the question and may be understood by a conversation partner. Grammar feedback provides insights into possible grammatical improvements, and a set of example answers at varying levels of language complexity gives concrete suggestions for alternative ways to respond in this context.

The feedback is composed of three elements: Semantic analysis, grammar correction, and example answers.

Contextual translation

Among the several new technologies we developed, contextual translation provides the ability to translate individual words and phrases in context. During practice sessions, learners can tap on any word they don’t understand to see the translation of that word considering its context.

Example of contextual translation feature.

This is a difficult technical task, since individual words in isolation often have multiple alternative meanings, and multiple words can form clusters of meaning that need to be translated in unison. Our novel approach translates the entire sentence, then estimates how the words in the original and the translated text relate to each other. This is commonly known as the word alignment problem.

Example of a translated sentence pair and its word alignment. A deep learning alignment model connects the different words that create the meaning to suggest a translation.

The key technology piece that enables this functionality is a novel deep learning model developed in collaboration with the Google Translate team, called Deep Aligner. The basic idea is to take a multilingual language model trained on hundreds of languages, then fine-tune a novel alignment model on a set of word alignment examples (see the figure above for an example) provided by human experts, for several language pairs. From this, the single model can then accurately align any language pair, reaching state-of-the-art alignment error rate (AER, a metric to measure the quality of word alignments, where lower is better). This single new model has led to dramatic improvements in alignment quality across all tested language pairs, reducing average AER from 25% to 5% compared to alignment approaches based on Hidden Markov models (HMMs).
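
For readers unfamiliar with the metric, the sketch below computes AER using the standard definition from the word alignment literature, AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), where A is the predicted set of links, S the "sure" reference links, and P the "possible" reference links (with S contained in P). The post does not say exactly how the reported figures were computed, so treat this as an illustration; the example links are hypothetical.

```python
def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), treating S as a subset of P."""
    a, s = set(predicted), set(sure)
    p = set(possible) | s
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical links, written as (source_index, target_index) pairs.
sure_links = {(0, 0), (1, 1), (2, 2)}
possible_links = {(3, 2)}
predicted_links = {(0, 0), (1, 1), (3, 2), (4, 4)}

aer = alignment_error_rate(predicted_links, sure_links, possible_links)
print(round(aer, 3))
```

A prediction that reproduces the sure links exactly scores 0, and spurious or missing links push the value toward 1.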

Alignment error rates (lower is better) between English (EN) and other languages.

This model is also incorporated into Google’s translation APIs, greatly improving, for example, the formatting of translated PDFs and websites in Chrome, the translation of YouTube captions, and enhancing Google Cloud’s translation API.

Grammar feedback

To enable grammar feedback for accented spoken language, our research teams adapted grammar correction models for written text (see the blog and paper) to work on automatic speech recognition (ASR) transcriptions, specifically for the case of accented speech. The key step was fine-tuning the written text model on a corpus of human and ASR transcripts of accented speech, with expert-provided grammar corrections. Furthermore, inspired by previous work, the teams developed a novel edit-based output representation that leverages the high overlap between the inputs and outputs and is particularly well-suited for the short input sentences common in language learning settings.

The edit representation can be explained using an example:

Input: I¹ am² so³ bad⁴ cooking⁵
Correction: I¹ am² so³ bad⁴ at⁵ cooking⁶
Edits: (‘at’, 4, PREPOSITION, 4)

In the above, “at” is the word that is inserted at position 4 and “PREPOSITION” denotes this is an error involving prepositions. We used the error tag to select tag-dependent acceptance thresholds that improved the model further. The model increased the recall of grammar problems from 4.6% to 35%.
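
As an illustration of how such an edit could be applied to the input, the sketch below treats the tuple as (text, start, tag, end), where start and end are token boundaries and an empty span (start equal to end) means a pure insertion. The field order, the boundary interpretation, and the function name apply_edit are assumptions made for this example, not details confirmed by the post.

```python
def apply_edit(tokens, edit):
    """Apply one (text, start, tag, end) edit; start == end means pure insertion."""
    text, start, _tag, end = edit
    replacement = text.split() if text else []
    return tokens[:start] + replacement + tokens[end:]


tokens = "I am so bad cooking".split()
corrected = apply_edit(tokens, ("at", 4, "PREPOSITION", 4))
print(" ".join(corrected))  # I am so bad at cooking
```

Because most tokens are copied unchanged, the model only has to predict the few edits rather than regenerate the whole sentence.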

Some example output from our model and a model trained on written corpora:

Text source	Example 1	Example 2
User input (transcribed speech)	I live of my profession.	I need a efficient card and reliable.
Text-based grammar model	I live by my profession.	I need an efficient card and a reliable.
New speech-optimized model	I live off my profession.	I need an efficient and reliable card.

Semantic analysis

A primary goal of conversation is to communicate one’s intent clearly. Thus, we designed a feature that visually communicates to the learner whether their response was relevant to the context and would be understood by a partner. This is a difficult technical problem, since early language learners’ spoken responses can be syntactically unconventional. We had to carefully balance this technology to focus on the clarity of intent rather than correctness of syntax.

Our system utilizes a combination of two approaches:

Sensibility classification: Large language models like LaMDA or PaLM are designed to give natural responses in a conversation, so it’s no surprise that they do well on the reverse: judging whether a given response is contextually sensible.
Similarity to good responses: We used an encoder architecture to compare the learner’s input to a set of known good responses in a semantic embedding space. This comparison provides another useful signal on semantic relevance, further improving the quality of feedback and suggestions we provide.
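
The second approach can be illustrated with a small similarity computation. The sketch below assumes cosine similarity as the comparison and uses made-up three-dimensional vectors in place of real encoder outputs; the post does not specify the similarity measure or the encoder, so both are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_match_score(response_vec, reference_vecs):
    """Highest similarity between the learner's response and any known good response."""
    return max(cosine_similarity(response_vec, ref) for ref in reference_vecs)


# Hypothetical embeddings; a real system would obtain these from an encoder model.
good_responses = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.1]]
learner_response = [0.8, 0.2, 0.0]
off_topic_response = [0.0, 0.1, 0.9]

print(best_match_score(learner_response, good_responses))
print(best_match_score(off_topic_response, good_responses))
```

The on-topic vector scores much higher than the off-topic one, which is the kind of signal that could be combined with the sensibility classifier.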

The system provides feedback about whether the response was relevant to the prompt, and would be understood by a communication partner.

ML-assisted content development

Our available practice activities present a mix of human-expert created content, and content that was created with AI assistance and human review. This includes speaking prompts, focus words, as well as sets of example answers that showcase meaningful and contextual responses.

A list of example answers is provided when the learner receives feedback and when they tap the help button.

Since learners have different levels of ability, the language complexity of the content has to be adjusted appropriately. Prior work on language complexity estimation focuses on text of paragraph length or longer, which differs significantly from the type of responses that our system processes. Thus, we developed novel models that can estimate the complexity of a single sentence, phrase, or even individual words. This is challenging because even a phrase composed of simple words can be hard for a language learner (e.g., “Let’s cut to the chase”). Our best model is based on BERT and achieves complexity predictions closest to human expert consensus. The model was pre-trained using a large set of LLM-labeled examples, and then fine-tuned using a human expert–labeled dataset.

Mean squared error of various approaches’ performance estimating content difficulty on a diverse corpus of ~450 conversational passages (text / transcriptions). Top row: Human raters labeled the items on a scale from 0.0 to 5.0, roughly aligned to the CEFR scale (from A1 to C2). Bottom four rows: Different models performed the same task, and we show the difference to the human expert consensus.
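
For reference, mean squared error against the expert consensus can be computed as in the sketch below. The scores are hypothetical values on the 0.0 to 5.0 scale, chosen only to make the arithmetic easy to follow; they are not results from the evaluation.

```python
def mean_squared_error(predicted, reference):
    """Average of the squared differences between two equal-length score lists."""
    if len(predicted) != len(reference):
        raise ValueError("score lists must have the same length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical difficulty scores on the 0.0 to 5.0 scale.
expert_consensus = [1.0, 2.5, 4.0]
model_estimates = [1.5, 2.0, 4.0]

mse = mean_squared_error(model_estimates, expert_consensus)
print(round(mse, 4))
```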

Using this model, we can evaluate the difficulty of text items, offer a diverse range of suggestions, and most importantly challenge learners appropriately for their ability levels. For example, using our model to label examples, we can fine-tune our system to generate speaking prompts at various language complexity levels.

Vocabulary focus words, to be elicited by the questions
Complexity	guitar	apple	lion
Simple	What do you like to play?	Do you like fruit?	Do you like big cats?
Intermediate	Do you play any musical instruments?	What is your favorite fruit?	What is your favorite animal?
Complex	What stringed instrument do you enjoy playing?	Which type of fruit do you enjoy eating for its crunchy texture and sweet flavor?	Do you enjoy watching large, powerful predators?

Furthermore, content difficulty estimation is used to gradually increase the task difficulty over time, adapting to the learner’s progress.

Conclusion

With these latest updates, which will roll out over the next few days, Google Search has become even more helpful. If you are an Android user in India (Hindi), Indonesia, Argentina, Colombia, Mexico, or Venezuela, give it a try by translating to or from English with Google.

We look forward to expanding to more countries and languages in the future, and to offering partner practice content soon.

Acknowledgements

Many people were involved in the development of this project. Among many others, we thank our external advisers in the language learning field: Jeffrey Davitz, Judit Kormos, Deborah Healey, Anita Bowles, Susan Gaer, Andrea Revesz, Bradley Opatz, and Anne Mcquade.

Evaluating social and ethical risks from generative AI

Introducing a context-based framework for comprehensively evaluating the social and ethical risks of AI systems

What’s Your Story: Ranveer Chandra

In this new Microsoft Research Podcast series What’s Your Story, Lab Director Johannes Gehrke explores the who behind the technical and scientific advancements helping to reshape the world. He talks to members of the research community at Microsoft about what motivates their work and how they got where they are today.

Ranveer Chandra is Managing Director of Research for Industry and CTO of Agri-Food. He is also head of Networking Research at Microsoft Research Redmond. His work in systems and networking is helping to bring more internet connectivity to more people and is yielding tools designed to help farmers increase food production more affordably and sustainably. In this episode, he shares what it was like growing up in Jamshedpur, India; why he focuses his efforts in the areas he does; and where the joy in his work comes from.

Learn more:

Ranveer Chandra at Microsoft Research

FarmBeats: AI, Edge & IoT for Agriculture

Project FarmVibes

6G | Space

Transcript

[TEASER]

[MUSIC PLAYS UNDER DIALOGUE]

RANVEER CHANDRA: If you’re a professional, one of the things I would say is try to go after your passion. If you give your work a bigger meaning than just making money, you’ll go beyond the 9-to-5 or 9-to-6 schedule. You’ll give it a lot more than just thinking about it as work.

[TEASER ENDS]

JOHANNES GEHRKE: Microsoft Research works at the cutting edge. But how much do we know about the people behind the science and technology that we create? This is What’s Your Story, and I’m Johannes Gehrke, Lab Director of Microsoft Research Redmond. I’m excited by the people I work with, and I’m curious about how they became the talented and passionate people they are today. So I sat down with some of them. Now, I’m sharing their stories with you. In this podcast series, you’ll hear from them about how they grew up, the critical choices that shaped their lives, and their advice to others looking to carve a similar path.

[MUSIC FADES]

In this episode, I’m talking with Ranveer Chandra. Ranveer is the Managing Director of Research for Industry and head of Networking Research in Redmond, and he’s been with the company for almost 20 years. His work in systems and networking is helping to bring more internet connectivity to more people and is yielding tools designed to help farmers increase food production more affordably and sustainably.

Here’s my conversation with Ranveer, beginning with his childhood in India and his experience applying to and studying at one of the prestigious Indian Institutes of Technology.

RANVEER CHANDRA: So I grew up in India. I grew up in a city called Jamshedpur in India. It’s a steel city. It’s the only city in India without a mayor, so the Tatas run …

JOHANNES GEHRKE: What does steel city mean?

CHANDRA: It’s the first steel plant in India …

GEHRKE: OK, uh-huh …

CHANDRA: … and a lot of the steel comes from there. The Tatas, which are the big industrialists in India, they run the city. So it’s … I grew up with 24 water, 24 electricity, trees on both sides. It looks like a mini Seattle or a mini Palo Alto in India. It’s a beautiful city. And I did my schooling there in, uh, in one of the schools in that city called Jamshedpur. I did my undergrad in IITs, one of the IITs in India, and then I came to the US, uh, to do my PhD at Cornell. So my childhood, we are three brothers and a sister. All three brothers went to … all four of us studied engineering. Uh, the three brothers all went to IITs, different IITs, and we studied hard, played hard. We did spend a lot of time in villages, though. Every summer and winter vacation, we would go to my grandparents’ place, which was in another state in India called Bihar, which is one of the poorest states. But my grandparents, they were farmers, and they had a lot of farmlands in those villages, so I did spend summer and winter vacations in those villages.

GEHRKE: And how did you end up to study engineering? How do you decide on that?

CHANDRA: Yeah. So in India, as it happens, these IITs are very competitive exams. So in … around … during our time, like close to half a million people gave the test, students gave the test, and the top 2,000 got into IIT. Those are tests of physics, chemistry, and math. Those are the only three subjects, and among those, the top few would try … typically pick computer science. So it was more I enjoyed math. That was my … that was what I really enjoyed. And then, uh, because I got selected into the IITs, that was of course a kind of a dream for many people, to go study there. The education level there is really high, really good. And that is how I ended up in IIT. It was kind of unplanned. It wasn’t, you know, when I got my … when I got through IIT … I wanted to go to another IIT because the one I went to, Kharagpur, it was close to home, but I wanted to go to Bombay because it’s a big city. Mumbai. It’s a big city. Bollywood runs out of Bombay. I thought I could get into Bollywood. Not, not really. [LAUGHS] But I did go to Kharagpur, which is closest to … this is the one where it has … it’s the oldest IIT, and it was very close to home. So I ended up going there and studying computer science.

GEHRKE: And why computer science?

CHANDRA: Yeah, so, uh, computer science because it was … it had a lot of math, so once I got … the way I got exposed to computers was I was … in high school, I studied the theory before I got to touch a computer. There was one computer in school.

GEHRKE: One computer in the whole …

CHANDRA: One computer, and everyone had to go there and see what a computer is. But we did get … we had books to teach you everything about what binary is, how computers were invented. That was around the time I enjoyed reading about computers …

GEHRKE: So you did like algorithms on like sheets of paper?

CHANDRA: So, yeah, so you draw the flow charts …

GEHRKE: Right.

CHANDRA: I enjoyed some of the flow charts. I remember some of the flow charts like how do you have the greatest common factor and things like that. I enjoyed doing those algorithms, and there was that similarity with math. You need to have a good math background to, to enjoy those things in computers. So I did a lot of programming on pen and paper, and someone would correct it. And then we got to start learning … BASIC was the first language that I learned. I really liked coding.

GEHRKE: What kind of computer was it, actually?

CHANDRA: So this was, um, you know, these, these dumb computers with one mainframe behind, so this was one of the Sun computers back then.

GEHRKE: Oh, wow. OK, uh-huh.

CHANDRA: And we had just these dumb terminals through which you would get access to these … [LAUGHS]

GEHRKE: And that was BASIC, not, not Pascal or anything? That’s interesting …

CHANDRA: It started with BASIC. Yeah, BASIC, and then FORTRAN was the next one, then C. So those were the languages that I learned. And computers because I just enjoyed … I would have picked either math or computers. Those were my two things, and computers was just fun. It was more … and that was just the time, you know, when you would reserve some time to play a computer game, Pac-Man and things like that, so those things were fun back then. This was, uh, late ’80s, early ’90s.

GEHRKE: And then what I’ve heard is that to get into the IITs is super competitive, so did you then study a lot or you played a lot, or what was the … ?

CHANDRA: That’s a funny story. So you know, when I went into IIT, I was … the interesting thing is once you go there, everyone who comes there is from all over India; these are the people who are top of their class. So everyone else is as good as you. So you, you then end up studying very hard because that’s the culture. Everyone is coming in there. And, uh, in the first semester at IIT, I was No. 1 in the IIT, all across IIT, and that was like, “Whoa, that was, that was easy.” I didn’t really put in a lot of effort. My elder brother was there, too. He was in the last year, and he was, he was more of the fun kind. I was more of the, you know, the studious kind. He came and told me don’t do Thing A, Thing B; don’t get into alcohol or, or party and all of that stuff. I ended up doing all of that. Don’t run for elections. I ran for elections and all that stuff.

GEHRKE: Oh really. Where did you run for elections? The student parliament?

CHANDRA: Within the institute, so I was the secretary of sports—volleyball and all of that stuff. So I did … a lot of fun, as well. So in the end I was like No. 3 graduating. But I did have a lot of fun, too. I did a lot of social, cultural things. I was on the volleyball team and things like that at IIT.

GEHRKE: And coming once more back, I mean so to get into the IIT … I mean, for, for me as a German, this is so, you know, unusual because we don’t have these centralized entrance exams, except for medicine.

CHANDRA: Yeah.

GEHRKE: But, um, I heard the test is really, really hard. And you actually in your last year of high school, you don’t really study for high school anymore. You just study for that test. How, how, how is that actually?

CHANDRA: Yeah. And now it has become even more competitive. During our time, it was … there were fewer seats. There were like 2,000 people from all across … there were five IITs, six IITs, back then. And yeah, studying towards the end … so you start studying just physics, chemistry, math. Back during our time, we didn’t have as much tuitions and stuff. I didn’t have many … anyone … like the last six months, I’ve had something. But now, people go to these, these other towns which are meant for coaching people for IITs, and they have these different sections …

GEHRKE: They live away from their parents?

CHANDRA: They go away from their parents; they live in a hostel. And all they’re preparing for is the IITs. We don’t have that … I didn’t have that during our time. But now it has become so much even more competitive. More students take it, and it’s like a centralized exam for, for studying. But it does … you know, in the end, the experience was worth it. If you ask me, “Hey, was all this studying worth it?” I think getting into IITs, of course, the professors are good, but the students are exceptional, the kind of people you’re interacting with, that ambience. And now when I look at my classmates, everyone’s doing well. And you find people doing different things. Not, not everyone is in, is in tech. They go do different things, and they excel in that field because of the kind of people that they select into these IITs. So I think in the end, it was stressful, but it was worth it.

GEHRKE: It was a great opportunity. Yeah, I mean, and, and then you made sort of the decision, not only after the IITs, to, to stay in India and to take probably a very good job …

CHANDRA: Yeah.

GEHRKE: … but to come to the US and, you know, learn even more. So what, what drove you to that decision?

CHANDRA: Yeah, that was kind of like the way I studied computer science. It was not … at least I had a passion for computer science. I didn’t want to do a PhD, by the way, when I was coming here. So you would ask why. So when, when I was graduating, I got the highest-paying job that year among all the undergrads. And that was a big deal. That was back then, Synopsys, one of the EDA companies, the CAD companies, right, the VLSI companies. So I would have taken that, but as it happens, usually the people who are at the top of the class, they would apply outside and they would come here to study. And that was the reason I had applied. But then the, the thing I really wanted to do in my career was to be in business. I wasn’t really looking to be an academic back then. I was like, you know, I’ll go study an MBA.

GEHRKE: And you studied to, to get a PhD instead?

CHANDRA: Yeah. No, so I was like, you know, I’ll apply to PhD programs, and they give a master’s anyways, and after that I’ll go do an MBA. I wanted to be the business guy. So that was the reason I applied, and … but the reason … the, the person who had convinced me to come here was a professor at Cornell. I had other top schools, but there was a professor at Cornell who was a networking guru at that time. I won’t name him. He’s still a very good friend of mine. So he convinced me to come there. I was a fan of his work, um, and I decided to come to Cornell for him. I said no to other schools. And then I land here; this was 1999. I send a message as soon as I get to Ithaca saying, “Hey, I’m here. I would love to meet you,” and he says, “Well, you know, I’m really sorry, but I left Cornell to do a startup.” And then I was a bit … I was very upset. For a few months, I didn’t know … what am I doing here. I gave everything up. I had other colleges where I could have gone. But to be here, I came to study computer science and the person I came here for is no longer here. It was disappointing. But then I was lucky that Professor Ken Birman adopted me. He was like, “Hey, you have a fellowship. You do what you want. I’m not going to interfere. You just do what you want.” And that’s what convinced me to do a PhD … that in the sense, the first few months were disappointing, but then once I got the freedom, I really was like I was getting paid some money for just learning. And that bit really got me very excited. The fact that I had all the independence to pick what I want, to work on the things that I want. And that’s what convinced me that I don’t want to do an MBA. I want to … I can do what I would do with an MBA after doing a PhD. So that’s what got me to do a PhD at Cornell.

GEHRKE: It’s, it’s super interesting because, I mean, if you hear that story, for many people, it would be kind of frightening, right. You come there … well, you have this person who you wanted to work with and maybe he … there was sort of a plan set up or so, and now, I mean you have to switch advisors. OK, that’s one thing. But the second thing is a PhD sounds so frightening to many people because it’s like a step into the unknown, right. So your PhD by definition, you don’t know whether you’re going to get there, right, because it’s research, and research sometimes leads you into the wrong path and sometimes you don’t get the result that you want. So how, how do you deal personally with that uncertainty?

CHANDRA: For me, it is more I like the unstructured part of it. I like the fact that I could take it in many directions and grow it, and I want that, that level of flexibility, and the more I realized … I think problem picking becomes important, and Ken helped me a bit with it. So initially, I told him I want to do wireless. This was back in 1999. It’s six months into a course, and I’m like I want to write a paper. This is what we want to do, on reliable multicast but for wireless systems. At that time, wireless was very new; people didn’t have cellphones, uh, and such. So … and he said go for it; it was worth it. And then I started exploring it with another grad student. We wrote a paper on it. And that was a good learning experience, which I really enjoyed. The fact that I’m venturing into the unknown, and Ken was explicit. He was like I’m not the expert in wireless; you have to learn it yourself. We did it ourselves. We wrote the paper. It got accepted. And all that was … really helped me … it gave me the confidence that it is possible to explore and do new things. And that’s what got me excited. And that’s what got me into the space of networking, as well. It’s all about wireless, and, um, and getting people more connected at low cost. How do you get everyone connected to the internet? And that’s the space. I think there is a passion within me around that, as well. And the fact that during my PhD, I got the opportunity to go explore, just try everything, and we just made … kept making the right bets, as well, with respect to papers and what got accepted. I did, I did an internship at Microsoft Research, as well, during my PhD. This was three years into my PhD. I came here a few times and that helped me, as well. That helped me further. I worked with Victor Bahl, who was my intern manager, but he was my mentor, and that helped me further go towards my career goals.

GEHRKE: And Victor is now a technical fellow in Azure, where he’s the CTO of our Azure for Operators efforts.

CHANDRA: That’s right. Yeah.

GEHRKE: And maybe, maybe one thought about networking, right. So networking seems to me like this field which is pretty hard because without the hardware, networking doesn’t work. But without the right kind of network protocols and the software, it doesn’t work. So you don’t only … you can’t only do one thing, right. You cannot do only just …

CHANDRA: The one layer …

GEHRKE: … the software … and you also have to do the hardware, and they have to sort of co-evolve. How, how does this work in networking research? Explain that a little bit to our audience. How does networking research actually make progress if both of these have to sort of work in lockstep?

CHANDRA: That’s a great point, Johannes. And that is one of the things with networking. Right when I was an undergrad, I started getting excited by this layered diagram of networking—the seven-layer diagram, the seven-layer OSI stack that we …

GEHRKE: Oh, yeah, I never understood that completely. [LAUGHTER]

CHANDRA: Yeah, it’s all the way from the physical layer, so if you think of the physical layer as one hop …

GEHRKE: Yeah.

CHANDRA: MAC layer … so networking is all about how do you send bits across two computers anywhere in the world. And at the lowest layer, it’s about how do you send the bits across. The layer above it makes it reliable over one hop. That’s the medium access layer. The layer above that ensures that you can communicate not just over one hop but anywhere on the internet using IP. The level above that, with TCP, you make sure that end-to-end communication is more reliable. So every step, every layer that you go above …

GEHRKE: Got it …

CHANDRA: … helps to make sure that your network is better. Now of course once you start layering things, it makes it harder to interoperate. It makes things inefficient because you’re adding headers per layer, which makes it … when you’re consuming bandwidth, you’re introducing extra latency. But that’s an opportunity. At the very least, what this layer diagram has done is that it has ensured innovation across different layers as long as there are good enough APIs for each layer to communicate with the next layer, so that is the key part of networking research, where over the years it has kept evolving. Every layer has changed. The hardware, we’ve seen Ethernet go from bits per second to kilobits to megabits, gigabits. Now it’s hundreds of gigabits. We’ll soon hit terabits, as well, which people are talking about with 6G, to every layer. When we think of the MAC layer, the TCP layer, all of those have been evolving and that has led to applications. A lot of times, a lot of people just worry about the applications: is my media application … can I watch things on Netflix? Well, underlying that is all the bandwidth that the network provides.
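The layering described above can be sketched in a few lines of Python. This is a toy illustration only: the layer names follow the conversation, but the header bytes (`TCP|`, `IP|`, `MAC|`) and the helper names are invented for the example and do not correspond to real protocol formats. It shows each layer prepending its own header on the way down, stripping it on the way up, and the bandwidth overhead those headers add.

```python
# Toy model of layered encapsulation. Header contents are hypothetical
# placeholders, not real TCP/IP/MAC header formats.

LAYERS = [
    ("TCP", b"TCP|"),  # end-to-end reliability
    ("IP", b"IP|"),    # delivery across many hops
    ("MAC", b"MAC|"),  # reliability over a single hop
]


def encapsulate(payload: bytes) -> bytes:
    """Send path: each layer, top to bottom, prepends its header."""
    frame = payload
    for _name, header in LAYERS:
        frame = header + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receive path: each layer, bottom to top, strips its own header."""
    for name, header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"missing {name} header")
        frame = frame[len(header):]
    return frame


def overhead(payload: bytes) -> float:
    """Fraction of the transmitted bytes that are headers, not payload."""
    frame = encapsulate(payload)
    return (len(frame) - len(payload)) / len(frame)


if __name__ == "__main__":
    wire = encapsulate(b"hello")
    print(wire)               # the payload wrapped in all three headers
    print(decapsulate(wire))  # the original payload again
    print(overhead(b"hello"))
```

Because each layer only inspects its own header, any one layer could be swapped for a faster implementation without touching the others, as long as the header interface between them stays the same.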

GEHRKE: Got it. So, so one way to think about this is that as long as I make my hardware have the same APIs, I can even go … I can sort of significantly evolve my hardware and all the other parts of the network stack will work?

CHANDRA: Exactly, exactly.

GEHRKE: I see.

CHANDRA: So you could be innovating on the radio—you make that radio faster—but as long as you keep the APIs the same, the TCP layer would work as is with the layers on the bottom.

GEHRKE: Got it. So, so I hear this magic word “6G” from you a lot these days. Can you just explain a little bit. What is 6G all about, and why is it interesting?

CHANDRA: Yeah, so every … over the network, we’ve seen these standards evolve over time. Every 10 years, we see from 2G to 3G to 4G. Now 5G.

GEHRKE: Why 10 years? Why 10 years?

CHANDRA: Ten years is usually the time it takes to come up with a new innovation, drive the standard, drive alignment across different stakeholders to see this is what the next standard should be. Because then once we finalize the standard is when you’ll have all the other vendors like people who build the hardware to base stations, to cellphones, to modems, everyone can then align and build something that is … you know, you have your Qualcomm modem talking with say an Ericsson hardware with the AT&T carrier, which is running on Azure cloud.

GEHRKE: Because everything has to interoperate. That’s why we have the standards.

CHANDRA: Yeah, so that’s why we have the standards, which evolve in a 10-year time frame. With 6G, we are looking at 10x more bandwidth. Your throughputs will go much higher. And one-tenth the latency. Can you get to sub millisecond latency? And the kind of scenarios that we are thinking of are … think of, uh, we can think of completing the feedback loop like robotics and so on, where you’re getting the information, you need to send all this to the cloud because this is huge amounts of information, you need to act on it using AI, and you need to send the feedback so that your robot can perform in time. This could be something in a racetrack, something in, uh, in the middle of … on the roads, or it could be in the middle of a farm. So this is what the vision is. And along with that, the other vision that we have with 6G is to bring internet all over the world. That is right now still around 40 percent of the population in the world—that’s close to 3 billion people in the world—doesn’t have internet access. They just cannot … they, they just don’t have access to the internet.

GEHRKE: And why does 6G help with that?

CHANDRA: 6G should make connectivity more affordable.

GEHRKE: So 6G is also cheaper, even though it’s faster and lower latency?

CHANDRA: It will be.

GEHRKE: That seems so contradictory. Why is that the case?

CHANDRA: No, so I think it will be high speed and low latency in areas where it is needed, but the other feature it should bring in is affordable connectivity in regions that are not connected. And these are in … a lot of it is in the emerging markets, where the people are not connected. And it is not just people. Now we’re also talking about people and things because, you know, if you think of the entire world’s surface, close to 80 percent of the world’s surface, which includes ocean and land, doesn’t have terrestrial internet. So how do you bring internet connectivity throughout the world? That’s one of the challenges that people are looking at with 6G, along with some of the other things around sustainability, security, trust. These are all issues, as well, but at an underlying layer, the fundamental thing we want is high speed, lower latency, and connectivity—affordable connectivity—everywhere. We can’t be leaving 3 billion people in the world behind with no internet when it is so central to the way we are. It defines everything we do, and yet there are so many people in the world who don’t have internet access.

GEHRKE: You mentioned one word, one word “farm,” and we’ll get to that in a second. I just wanted to ask one more question because it just sounds a little bit like magic to me that, you know, you get lower latency, higher bandwidth, and lower cost.

CHANDRA: Yeah.

GEHRKE: Why don’t I get this with 5G if I just push the hardware along?

CHANDRA: So this is where the research would come in, and I think it won’t be the … so when you think of a standard, we think of different components of the standard. One part of the standard is the spectrum. Which part of the spectrum do you operate on?

GEHRKE: Right.

CHANDRA: That could define the throughputs that you get. Now the high speed usually comes with a limited range, as well. Like, you know, like one of the technologies that people are talking about (we are investigating here at Microsoft Research, as well) is terahertz networks. This is a part of the spectrum where you get huge amounts of bandwidth. It’s still following Shannon’s law, but it is just in that part of the spectrum that until now, people said couldn’t be used for communication. But what we are showing is that well, you could, you could use it for communication in that part of the spectrum. Once you get that bandwidth, it also helps us reduce latency by a significant amount. So that’s one thing people are looking at. Along with that, another technology people are looking at to overcome this problem of short range, like 100 meters, to go beyond that, is smart surfaces. So one of the things we’re building is rather than just have these base stations, what do we … if we have smart surfaces, which are programmable and can then make sure that wherever people are, wherever things are, you can provide connectivity there by, by channeling the signals in that particular region. Along with that, people are also looking at affordability. People are looking at different forms, other forms of communication. Like previously, we’ve looked at other parts of the spectrum like lower … terahertz is going further closer to light, that part of the spectrum. The other part of the spectrum is lower in the TV spectrum, for example, a radio spectrum. Once you go lower in the spectrum, your connectivity can just, just go really long distances. So one of the innovations that we had done a lot at Microsoft Research was on using TV spectrum to send and receive information. The benefit is this spectrum is not being used in many places, and using that, you can provide very low-cost point-to-multipoint connectivity in different regions.
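For reference, the “Shannon’s law” mentioned here is usually written as the Shannon–Hartley capacity formula. The symbols below are not from the conversation and are introduced only for illustration: $C$ is the maximum error-free data rate in bits per second, $B$ is the channel bandwidth in hertz, and $S/N$ is the signal-to-noise ratio as a plain power ratio.

```latex
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```

Capacity grows linearly with $B$ but only logarithmically with $S/N$, which is one way to see why a band offering very wide channels, such as terahertz, can raise throughput so much even when signal quality is limited.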

GEHRKE: Makes sense. And so 6G encompasses all of those?

CHANDRA: 6G would encompass … right now, it’s still being defined. It’s still early. But as far as research goes, we are working with the community on all these aspects. The other aspect about 6G is AI-driven networks. So can you make your networks much more intelligent? Right now, you define these networks in standards, and the standard’s written, and that’s what is implemented. But you could adapt parts of it based on what’s happening around you, and you can use the spectrum better. You can use it to make sure that you’re getting much more efficiencies in your, in your system. You can prioritize things better. So that’s again one of the other themes that, uh, that we’re investing in and a few of the other, other research labs are investing in, as well.

GEHRKE: Super interesting. And I mean you mentioned this word “farm” before. You’re, of course, known for FarmVibes. And maybe just explain very briefly what FarmVibes is and then also explain … you know, you started out here doing this in Microsoft Research, but then you actually went to a product group, right. What made you, you know, what made you take that decision? And, you know, you actually now finally here in Microsoft Research again. So maybe tell us a little bit about that, that journey.

CHANDRA: Yeah. So I’ll start with why did I even pick agriculture, right. So as I said I did spend a lot of time in my grandparents’ farms in Bihar. This was in north India.

GEHRKE: What were they farming?

CHANDRA: So they used to farm wheat, sugar cane, rice, and, uh, they had farms there. And back then, I did not like anything to do with agriculture. So I used to go there with my brothers and sister and, you know, I, I did do … like I played kabaddi there. I learned how to ride a bicycle with the people there.

GEHRKE: What, what is kabaddi?

CHANDRA: Kabaddi is like, uh, it’s a funner form of rugby. Not, not … it’s, uh, it’s … there are two teams, and you essentially have to bring the other team down, so it’s … you play it in the sand; you get really dirty playing it. Growing up in those villages, it was fun. But spending time in those villages, I didn’t really look forward to them. The reason was that, you know, the rest of the year, you were in this city, which is maintained by the Tatas, which has water, electricity, clean roads, everything, and then, the rest of the three, four months, I was in this village, which did not have electricity. They didn’t have toilets. If you have to go to the bathroom, you have to go out in the fields in the middle of the night in the winter. It wasn’t what you’d look out … look forward to. But that’s how … that’s one of the things that I grew up with. But one of the things that really stuck with me was the poverty that exists in these villages. Like one of the times, my mom, she did some prayers, she had an offering, and she left it outside. And there were a group of kids, they hadn’t had anything to eat during the day. They were just there to grab something to eat. And that has been something that, um, has really … it’s been in my, my mind during my undergrad, and even over here, one of the, one of the things I always want with any project I’m working on is this bigger mission, things that can impact the people I grew up with, and be it with TV white spaces for providing internet connectivity to what I saw was very primitive forms of agriculture. These, these farmers, they would do hand-based seeding. Over here, you use tractors. They would go with the hand and put the seeds. They would do … use bullock-driven tractors. Like to till, they would just go with a bullock. They’d put this hitch on it and then go till the fields, and this, it’s very primitive. 
So what we, what we want to do is to enable data-driven agriculture, and the bigger goal here is to help address the world’s food problem. The world needs 50 percent more food compared to today’s levels. And in order to grow that 50 percent more food, we need to get there … not just food. We need to grow good food, nutritious food, and we need to get there without harming the planet. The soils are not getting any richer. The water levels are receding. So that’s the big picture of what we want to do with FarmBeats. Our goal is … one of the most promising approaches to get there is what we call data-driven agriculture. That is, can you, can farmers, use data and AI to remove guesswork as part of their farming operations? You know, the farmers that we work with, um, like even when I was doing networking here, I would actually go and volunteer in farms here, and I’d cold call various farmers. What I realized is that these farmers …

GEHRKE: Like here in, in Washington state, right?

CHANDRA: In Washington state. In fact, there’s a Starbucks right here. There was a barista who knew me. She said that, “Hey, I’m going to Spokane, Eastern Washington, this weekend.” If she’s in Eastern Washington, maybe farms. Who farms there? She says, “My grandfather.” I was like, can you connect me to him? So I would just cold call them. I talk to a lot of farmers, and what I realize is that these farmers, they know a lot about their farm. They’ve been farming there for a long time, yet a lot of decisions they make is based on guesswork.

GEHRKE: Right.

CHANDRA: That is where all the data-driven farming piece came in. So through FarmVibes and previously FarmBeats … with FarmBeats, we built, we built a data, data platform for agriculture. Then I had moved over to the product side; we shipped it as a product. We announced partnerships with Land O’ Lakes. We announced partnerships with … well, Land O’ Lakes, their agriculture platform is now running on that. We announced partnerships with Bayer Corporation, with USDA, and other organizations, as well. Then while I was there, what I realized is that the engineering team is now on track. They are delivering this product. But that’s not enough to help us address the world’s food problem. We need to add intelligence on top of what we are building. We need to bring all the innovations that we’re doing in AI for this field. And that was one of the reasons and of course working with you and the fact that the networking … networking is one of the key components that can help us—networking and AI—so that was one of the other reasons why I came back. And with FarmVibes, that’s the problem that we are addressing. With FarmVibes, it’s the, it’s the farm intelligence-speak; it’s the intelligence that sits … that can light up scenarios, these scenarios that we talk of when we think of data-driven agriculture, sustainable agriculture. The kind of things we want to do is help a farmer take the right decisions for what will make them more productive, what will make them reduce their emissions, what will help them sequester more carbon. These are the kind of questions we want to help a farmer, a farmer answer. And some of that is very fundamental research. We’ve come up with ways to see through the clouds, to do very hyperlocal microclimate prediction, to combine different models to make much more accurate predictions to help farmers. And that’s, that’s the kind of thing we are enabling as part of FarmVibes.

GEHRKE: Well, and so just curious, I mean here in Washington state, what is, what is grown on those farms, and how, how have you helped so far?

CHANDRA: Yeah, so there are farms here … there’s one farm in Eastern Washington. We work with this farmer Andrew Nelson, who is a fifth-generation wheat farmer. This is an hour east of Spokane, so if you go to Spokane, you have to drive another hour. It’s interesting. When you go to Andrew’s farm, like we are about 15, 20 minutes from his farm and you lose internet … cell connectivity.

GEHRKE: It’s completely gone.

CHANDRA: [LAUGHS] So you’re like off the grid. And then you reach his house, and then he set up this TV white spaces thing. He has some connectivity in his farm.

GEHRKE: Through satellite or TV white spaces?

CHANDRA: TV white spaces and a fiber to his home. So there’s a fiber that he’s paid to bring fiber to his home and then that lights up the area around his farm using some of the technologies that we have been inventing here. And with Andrew, this is just one use case, but you could replicate it across other farms, as well. He uses some of the techniques throughout his farming life cycle, all the way from planning what to farm to planting—what to plant, where to plant—to in production, like, for example, doing chemical application. Where do I apply herbicides? Do I need to spray pesticides? Where do I spray it? Rather than spraying it throughout. To harvest. That is, when should I harvest? How should I … what route should I take? To post-harvest. Monitoring things and deciding when and where to sell certain things to get more profit. So he uses it throughout his, um, his farming life cycle, and he’s seen a lot of benefit. Like Andrew’s talked about how in one part of the farm, he could double his yield.

GEHRKE: Double? I was just going to ask actually how much benefit he got from it.

CHANDRA: Yeah, double the yield. And in another part of the farm, he’s talked about 40 percent reduction in chemical costs. You know, for a farmer, one of the input costs is chemicals. And using these precision techniques that we built, Andrew’s been able to save 40 percent. That’s, that’s huge.

GEHRKE: It’s also good for the environment.

CHANDRA: It’s good for the environment, as well. He’s not putting in more chemicals than are needed. So these are real use cases with farmers in our, in Washington state. We’re also working, for example, we announced a partnership in India, in Maharashtra. This is one of the Centres of Excellence that’s being put up for FarmVibes. So this is again, they are building AI capable … this is across Oxford University; there’s an organization in India called Agriculture Development Trusts; and Microsoft. And working with, of course, Microsoft India, our sales team there. They’ve set up this Centre of Excellence in a village called Baramati in India, where they are going to be taking the same techniques we built and adapting it for smallholder farmers in, in, in that region. So really excited about the value it brings. Yeah.

GEHRKE: Ranveer, it seems like, you know, here at Microsoft, you had the amazing opportunity to really have huge impact. You know, you start on research, then deliver a product, now even extending the product to more use cases. Do you have any career advice for our listeners given where you are and where you’re going?

CHANDRA: Yeah. And as a student, if there are students listening to this, I would say consider going after a PhD. It gives you that exposure, uh, the opportunity to learn, to dig deep, to know a lot about, about a field. If you’re a professional, one of the things I would say is try to go after your passion. If you give your work a bigger meaning than just making money, you’ll go beyond the 9-to-5 or 9-to-6 schedule to make that happen. Like, you know, Johannes, one funny incident is over here, working at Microsoft, most people sit in front of the computer. When I had started working on FarmBeats, every day, I would be driving to this farm. There was another farm about 40 minutes’ drive from here. Every day, summer or winter, I would be driving there to do the experiments. And I would go there, and a few days, especially when it rains, it gets really gloomy [LAUGHS] and you have to go in boots to a farm that is muddy, half flooded. I’d be like, “Why am I doing this? I could be sitting there …” And then the, the way I would argue to myself is, you know, even if 1 percent of what I’m doing works, it will help the lives of so many farmers worldwide. And then that just gave me the extra energy to go even more, to, to just give it everything that I have to make that difference. So that’s something which I tell students, as well. Give whatever work you do—you’re working in AI, you’re working in systems, you’re working in, in building the next plane or building the next ship—give your work a bigger meaning. You will, you’ll enjoy it. It just, um, you’ll give it a lot more than, than just thinking about it as work. And, you’re right, that at Microsoft, we get that opportunity to make that wholesome impact, that is as you did, as well. We get to go to the products. If something ships as part of a Microsoft product, it touches the lives of so many people. Like one of the projects I was with was Xbox, for the Xbox, when I designed that Xbox wireless controller protocol. 
Now over 100 million people use it, and one of the most common congratulatory messages I get is still around the Xbox. When I’m giving a talk, someone will come and say, “My son said thanks to you because you helped make the Xbox successful.” So that’s one of the opportunities we get. But not just that. We get the opportunity to come do research and think bigger about the problem; take it to a different level and then influence the next generation of product. So this is, uh, thank you. I think this is an awesome place to work, to realize that mission, that, that vision of what we want to achieve in our lives.

GEHRKE: Yeah, I think, I mean it speaks so much to me because that’s something that I was also really excited about. Making the transition from university here to Microsoft, as well.

[OUTRO MUSIC]

GEHRKE: Thanks again, Ranveer, for the great conversation.

CHANDRA: Yeah, thank you, Johannes.

The post What’s Your Story: Ranveer Chandra appeared first on Microsoft Research.

Coming in Clutch: Stream ‘Counter-Strike 2’ From the Cloud for Highest Frame Rates

Rush to the cloud — stream Counter-Strike 2 on GeForce NOW for the highest frame rates. Members can play through the newest chapter of Valve’s elite, competitive, first-person shooter from the cloud.

It’s all part of an action-packed GFN Thursday, with 22 more games joining the cloud gaming platform’s library, including Hot Wheels Unleashed 2 – Turbocharged.

“Rush B! Rush B!”

Counter-Strike 2 is the long-awaited upgrade to one of the most recognizable competitive first-person shooters in the world.

Building on the legacy of Counter-Strike: Global Offensive, the latest iteration brings the action to Valve’s long-anticipated Source 2 video game engine, promising enhanced graphical fidelity with a physically based rendering system for more realistic textures and materials, dynamic lighting, reflections and more.

Smoke grenades are now dynamic volumetric objects that can interact with their surroundings by reacting to lighting and other environmental effects. And smoke particles work with the unified lighting system, allowing for more realistic light and color.

Even better: GeForce NOW Ultimate members can take full advantage of NVIDIA Reflex for ultra-low-latency gameplay streaming from the cloud. Rush the objective with the squad on Counter-Strike 2’s remastered maps at up to 240 frames per second — a first for cloud gaming. Upgrade today for the Ultimate Counter-Strike experience.

Vroom, Vroom!

There’s more action around every turn of the GeForce NOW library. Put the pedal to the metal in Hot Wheels Unleashed 2 – Turbocharged, one of 22 newly supported games joining this week:

Wizard With a Gun (New release on Steam, Oct. 17)
Alaskan Road Truckers (New release on Steam, Oct. 18)
Hellboy: Web of Wyrd (New release on Steam, Oct. 18)
AirportSim (New release on Steam, Oct. 19)
Eternal Threads (New release on Epic Games Store, Oct. 19)
Hot Wheels Unleashed 2 – Turbocharged (New release on Steam, Oct. 19)
Laika Aged Through Blood (New release on Steam, Oct. 19)
Battle Chasers: Nightwar (Xbox, available on Microsoft Store)
Black Skylands (Xbox, available on Microsoft Store)
Blair Witch (Xbox, available on Microsoft Store)
Chicory: A Colorful Tale (Xbox and available on PC Game Pass)
Dead by Daylight (Xbox and available on PC Game Pass)
Dune: Spice Wars (Xbox and available on PC Game Pass)
Everspace 2 (Xbox and available on PC Game Pass)
EXAPUNKS (Xbox and available on PC Game Pass)
Gungrave G.O.R.E (Xbox and available on PC Game Pass)
Railway Empire 2 (Xbox and available on PC Game Pass)
Techtonica (Xbox and available on PC Game Pass)
Teenage Mutant Ninja Turtles: Shredder’s Revenge (Xbox and available on PC Game Pass)
Torchlight III (Xbox and available on PC Game Pass)
Trine 5: A Clockwork Conspiracy (Epic Games Store)
Vampire Survivors (Xbox, available on PC Game Pass)

What are you planning to play this weekend? Let us know on Twitter or in the comments below.


DALL·E 3 is now available in ChatGPT Plus and Enterprise

We developed a safety mitigation stack to ready DALL·E 3 for wider release and are sharing updates on our provenance research.

OpenAI Blog

Evaluating social and ethical risks from generative AI

Generative AI systems are already being used to write books, create graphic designs, and assist medical practitioners, and they are becoming increasingly capable. Ensuring these systems are developed and deployed responsibly requires carefully evaluating the potential ethical and social risks they may pose. In our paper, we propose a three-layered framework for evaluating the social and ethical risks of AI systems. This framework includes evaluations of AI system capability, human interaction, and systemic impacts. We also map the current state of safety evaluations and find three main gaps: context, specific risks, and multimodality. To help close these gaps, we call for repurposing existing evaluation methods for generative AI and for implementing a comprehensive approach to evaluation, as in our case study on misinformation. This approach integrates findings like how likely the AI system is to provide factually incorrect information, with insights on how people use that system, and in what context. Multi-layered evaluations can draw conclusions beyond model capability and indicate whether harm (in this case, misinformation) actually occurs and spreads. To make any technology work as intended, both social and technical challenges must be solved. So to better assess AI system safety, these different layers of context must be taken into account. Here, we build upon earlier research identifying the potential risks of large-scale language models, such as privacy leaks, job automation, misinformation, and more, and introduce a way of comprehensively evaluating these risks going forward.

Measurement-induced entanglement phase transitions in a quantum circuit

Posted by Jesse Hoke, Student Researcher, and Pedram Roushan, Senior Research Scientist, Quantum AI Team

Quantum mechanics allows many phenomena that are classically impossible: a quantum particle can exist in a superposition of two states simultaneously or be entangled with another particle, such that anything you do to one seems to instantaneously also affect the other, regardless of the space between them. But perhaps no aspect of quantum theory is as striking as the act of measurement. In classical mechanics, a measurement need not affect the system being studied. But a measurement on a quantum system can profoundly influence its behavior. For example, when a quantum bit of information, called a qubit, that is in a superposition of both “0” and “1” is measured, its state will suddenly collapse to one of the two classically allowed states: it will be either “0” or “1,” but not both. This transition from the quantum to classical worlds seems to be facilitated by the act of measurement. How exactly it occurs is one of the fundamental unanswered questions in physics.
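The collapse described above can be illustrated with a minimal sketch in plain Python. It assumes an ideal, noiseless qubit measured in the computational basis; the function name `measure_qubit` is hypothetical and not taken from the experiment's software.

```python
import math
import random

def measure_qubit(amp0, amp1, rng):
    """Measure a qubit in state amp0|0> + amp1|1> in the computational basis.

    Returns (outcome, collapsed_state). The outcome is drawn with the
    Born-rule probabilities |amp0|^2 and |amp1|^2, after which the state
    collapses onto the observed basis state (up to a global phase).
    """
    p0 = abs(amp0) ** 2
    p1 = abs(amp1) ** 2
    if not math.isclose(p0 + p1, 1.0, rel_tol=1e-9):
        raise ValueError("state must be normalized")
    outcome = 0 if rng.random() < p0 else 1
    collapsed = (1.0, 0.0) if outcome == 0 else (0.0, 1.0)
    return outcome, collapsed

rng = random.Random(0)  # fixed seed so the sketch is repeatable
plus = (1 / math.sqrt(2), 1 / math.sqrt(2))  # equal superposition of "0" and "1"

# Each run yields exactly one classical bit, never both.
outcomes = [measure_qubit(*plus, rng)[0] for _ in range(10_000)]
fraction_ones = sum(outcomes) / len(outcomes)
print(f"fraction of '1' outcomes: {fraction_ones:.3f}")
```

Over many repetitions, roughly half of the runs return "1", while every individual run ends in a single definite state.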

In a large system comprising many qubits, the effect of measurements can cause new phases of quantum information to emerge. Similar to how changing parameters such as temperature and pressure can cause a phase transition in water from liquid to solid, tuning the strength of measurements can induce a phase transition in the entanglement of qubits.

Today in “Measurement-induced entanglement and teleportation on a noisy quantum processor”, published in Nature, we describe experimental observations of measurement-induced effects in a system of 70 qubits on our Sycamore quantum processor. This is, by far, the largest system in which such a phase transition has been observed. Additionally, we detected “quantum teleportation” — when a quantum state is transferred from one set of qubits to another, detectable even if the details of that state are unknown — which emerged from measurements of a random circuit. We achieved this breakthrough by implementing a few clever “tricks” to more readily see the signatures of measurement-induced effects in the system.

Background: Measurement-induced entanglement

Consider a system of qubits that start out independent and unentangled with one another. If they interact with one another, they will become entangled. You can imagine this as a web, where the strands represent the entanglement between qubits. As time progresses, this web grows larger and more intricate, connecting increasingly disparate points together.
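As a small illustration of how an interaction creates one "strand" of the web, the sketch below takes two independent qubits and applies a textbook Hadamard gate followed by a CNOT. These gates are assumed here for simplicity and are not necessarily the entangling operations used on the processor; the helper names are hypothetical.

```python
import math

def apply_hadamard_on_first(state):
    """Apply a Hadamard gate to the first qubit of a 2-qubit state.

    The state is a list of 4 amplitudes ordered |00>, |01>, |10>, |11>.
    """
    s = 1 / math.sqrt(2)
    a00, a01, a10, a11 = state
    return [s * (a00 + a10), s * (a01 + a11), s * (a00 - a10), s * (a01 - a11)]

def apply_cnot(state):
    """Apply CNOT with the first qubit as control: swaps |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

def is_product_state(state, tol=1e-12):
    """A 2-qubit pure state is unentangled iff a00*a11 - a01*a10 == 0."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol

start = [1.0, 0.0, 0.0, 0.0]            # both qubits in |0>, independent
after_h = apply_hadamard_on_first(start)  # single-qubit gate: still a product state
bell = apply_cnot(after_h)                # two-qubit interaction: now entangled

print("start unentangled:  ", is_product_state(start))
print("after H unentangled:", is_product_state(after_h))
print("after CNOT unentangled:", is_product_state(bell))
```

Only the two-qubit gate changes the answer: single-qubit operations alone never create entanglement.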

A full measurement of the system completely destroys this web, since every entangled superposition of qubits collapses when it’s measured. But what happens when we make a measurement on only a few of the qubits? Or if we wait a long time between measurements? During the intervening time, entanglement continues to grow. The web’s strands may not extend as vastly as before, but there are still patterns in the web.

There is a balancing point between the strength of interactions and measurements, which compete to affect the intricacy of the web. When interactions are strong and measurements are weak, entanglement remains robust and the web’s strands extend farther, but when measurements begin to dominate, the entanglement web is destroyed. We call the crossover between these two extremes the measurement-induced phase transition.

In our quantum processor, we observe this measurement-induced phase transition by varying the relative strengths between interactions and measurement. We induce interactions by performing entangling operations on pairs of qubits. But to actually see this web of entanglement in an experiment is notoriously challenging. First, we can never actually look at the strands connecting the qubits — we can only infer their existence by seeing statistical correlations between the measurement outcomes of the qubits. So, we need to repeat the same experiment many times to infer the pattern of the web. But there’s another complication: the web pattern is different for each possible measurement outcome. Simply averaging all of the experiments together without regard for their measurement outcomes would wash out the webs’ patterns. To address this, some previous experiments used “post-selection,” where only data with a particular measurement outcome is used and the rest is thrown away. This, however, causes an exponentially decaying bottleneck in the amount of “usable” data you can acquire. In addition, there are also practical challenges related to the difficulty of mid-circuit measurements with superconducting qubits and the presence of noise in the system.
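The post-selection bottleneck can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every measurement outcome string is equally likely, so keeping one particular string out of n measured qubits retains a fraction of 1/2^n of the runs. The function names are hypothetical.

```python
def post_selected_fraction(num_measured_qubits):
    """Fraction of runs kept when post-selecting on one fixed outcome string.

    Assumes (for illustration only) that all 2**n strings are equally likely.
    """
    return 0.5 ** num_measured_qubits

def runs_needed(num_measured_qubits, kept_runs):
    """Expected total runs needed to keep `kept_runs` post-selected samples."""
    return kept_runs / post_selected_fraction(num_measured_qubits)

for n in (5, 10, 20, 40):
    print(f"{n:>2} measured qubits: keep 1 in {2**n:,} runs; "
          f"~{runs_needed(n, 1000):.3g} runs for 1,000 usable samples")
```

Under this assumption, each additional measured qubit doubles the number of repetitions required, which is why post-selection does not scale to large systems.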

How we did it

To address these challenges, we introduced three novel tricks to the experiment that enabled us to observe measurement-induced dynamics in a system of up to 70 qubits.

Trick 1: Space and time are interchangeable

As counterintuitive as it may seem, interchanging the roles of space and time dramatically reduces the technical challenges of the experiment. Before this “space-time duality” transformation, we would have had to interleave measurements with other entangling operations, frequently checking the state of selected qubits. Instead, after the transformation, we can postpone all measurements until after all other operations, which greatly simplifies the experiment. As implemented here, this transformation turns the original 1-spatial-dimensional circuit we were interested in studying into a 2-dimensional one. Additionally, since all measurements are now at the end of the circuit, the relative strength of measurements and entangling interactions is tuned by varying the number of entangling operations performed in the circuit.

Exchanging space and time. To avoid the complication of interleaving measurements into our experiment (shown as gauges in the left panel), we utilize a space-time duality mapping to exchange the roles of space and time. This mapping transforms the 1D circuit (left) into a 2D circuit (right), where the circuit depth (T) now tunes the effective measurement rate.

Trick 2: Overcoming the post-selection bottleneck

Since each combination of measurement outcomes on all of the qubits results in a unique web pattern of entanglement, researchers often use post-selection to examine the details of a particular web. However, because this method is very inefficient, we developed a new “decoding” protocol that compares each instance of the real “web” of entanglement to the same instance in a classical simulation. This avoids post-selection and is sensitive to features that are common to all of the webs. This common feature manifests itself as a combined classical–quantum “order parameter”, akin to the cross-entropy benchmark used for random circuit sampling in our beyond-classical demonstration.

This order parameter is calculated by selecting one of the qubits in the system as the “probe” qubit, measuring it, and then using the measurement record of the nearby qubits to classically “decode” what the state of the probe qubit should be. By cross-correlating the measured state of the probe with this “decoded” prediction, we can obtain the entanglement between the probe qubit and the rest of the (unmeasured) qubits. This serves as an order parameter, which is a proxy for determining the entanglement characteristics of the entire web.

In the decoding procedure we choose a “probe” qubit (pink) and classically compute its expected value, conditional on the measurement record of the surrounding qubits (yellow). The order parameter is then calculated by the cross correlation between the measured probe bit and the classically computed value.
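The cross-correlation step can be sketched as follows. The exact estimator used in the experiment is not spelled out here, so this example assumes a simple stand-in: the measured probe bit is mapped to ±1 and averaged against the sign of the classically decoded expectation value. The function name and the sample data are hypothetical.

```python
def cross_correlation(measured_bits, decoded_values):
    """Average agreement between measured probe bits and decoded predictions.

    measured_bits:  one 0/1 probe outcome per experimental run.
    decoded_values: the classically computed expectation of the probe for the
                    same run, in [-1, 1] (positive means "0" is more likely).
    Returns a value in [-1, 1]; 1 means perfect agreement, 0 means none.
    This sign-agreement average is an assumed stand-in, not the paper's formula.
    """
    if len(measured_bits) != len(decoded_values) or not measured_bits:
        raise ValueError("need equal-length, non-empty inputs")
    total = 0.0
    for bit, decoded in zip(measured_bits, decoded_values):
        measured_z = 1 - 2 * bit                 # bit 0 -> +1, bit 1 -> -1
        predicted_z = 1 if decoded >= 0 else -1  # decoder's best guess
        total += measured_z * predicted_z
    return total / len(measured_bits)

decoded = [0.8, -0.6, 0.3, -0.9]  # made-up decoder outputs for four runs
agreeing = cross_correlation([0, 1, 0, 1], decoded)
unrelated = cross_correlation([0, 0, 1, 1], decoded)
print("probe follows the decoder:", agreeing)
print("probe ignores the decoder:", unrelated)
```

Because every run contributes to the average regardless of its measurement record, no data has to be discarded, which is the point of replacing post-selection with decoding.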

Trick 3: Using noise to our advantage

A key feature of the so-called “disentangling phase” — where measurements dominate and entanglement is less widespread — is its insensitivity to noise. We can therefore look at how the probe qubit is affected by noise in the system and use that to differentiate between the two phases. In the disentangling phase, the probe will be sensitive only to local noise that occurs within a particular area near the probe. On the other hand, in the entangling phase, any noise in the system can affect the probe qubit. In this way, we are turning something that is normally seen as a nuisance in experiments into a unique probe of the system.

What we saw

We first studied how the order parameter was affected by noise in each of the two phases. Since each of the qubits is noisy, adding more qubits to the system adds more noise. Remarkably, we indeed found that in the disentangling phase the order parameter is unaffected by adding more qubits to the system. This is because, in this phase, the strands of the web are very short, so the probe qubit is only sensitive to the noise of its nearest qubits. In contrast, we found that in the entangling phase, where the strands of the entanglement web stretch longer, the order parameter is very sensitive to the size of the system, or equivalently, the amount of noise in the system. The transition between these two sharply contrasting behaviors indicates a transition in the entanglement character of the system as the “strength” of measurement is increased.

Order parameter vs. gate density (number of entangling operations) for different numbers of qubits. When the number of entangling operations is low, measurements play a larger role in limiting the entanglement across the system. When the number of entangling operations is high, entanglement is widespread, which results in the dependence of the order parameter on system size (inset).
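The contrast between the two phases can be pictured with a deliberately simple toy model. It is an assumption made for illustration, not the analysis from the paper: each noisy qubit that can influence the probe multiplies the signal by (1 - p). In the disentangling phase only a fixed neighborhood contributes; in the entangling phase every qubit does. The names `toy_order_parameter` and `local_range` are hypothetical.

```python
def toy_order_parameter(num_qubits, noise_per_qubit, phase, local_range=4):
    """Toy signal strength seen by the probe qubit (illustrative model only).

    disentangling: only the nearest `local_range` qubits add noise.
    entangling:    every qubit in the system adds noise.
    """
    if phase == "disentangling":
        contributing = min(num_qubits, local_range)
    elif phase == "entangling":
        contributing = num_qubits
    else:
        raise ValueError("phase must be 'disentangling' or 'entangling'")
    return (1.0 - noise_per_qubit) ** contributing

for n in (12, 24, 40, 70):
    short = toy_order_parameter(n, 0.01, "disentangling")
    long_ = toy_order_parameter(n, 0.01, "entangling")
    print(f"{n:>2} qubits: disentangling {short:.3f}, entangling {long_:.3f}")
```

In this toy picture the disentangling value stays flat as qubits are added, while the entangling value keeps shrinking, mirroring the qualitative size dependence described above.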

In our experiment, we also demonstrated a novel form of quantum teleportation that arises in the entangling phase. Typically, a specific set of operations is necessary to implement quantum teleportation, but here, the teleportation emerges from the randomness of the non-unitary dynamics. When all qubits, except the probe and another system of faraway qubits, are measured, the remaining two systems are strongly entangled with each other. Without measurement, these two systems of qubits would be too far away from each other to know about the existence of each other. With measurements, however, entanglement can be generated faster than the limits typically imposed by locality and causality. This “measurement-induced entanglement” between the qubits (which must also be aided by a classical communication channel) is what allows quantum teleportation to occur.

Proxy entropy vs. gate density for two far-separated subsystems (pink and black qubits) when all other qubits are measured. There is a finite-size crossing at ~0.9. Above this gate density, the probe qubit is entangled with qubits on the opposite side of the system, which is a signature of the teleporting phase.

Conclusion

Our experiments demonstrate the effect of measurements on a quantum circuit. We show that by tuning the strength of measurements, we can induce transitions to new phases of quantum entanglement within the system and even generate an emergent form of quantum teleportation. This work could potentially have relevance to quantum computing schemes, where entanglement and measurements both play a role.

Acknowledgements

This work was done while Jesse Hoke was interning at Google from Stanford University. We would like to thank Katie McCormick, our Quantum Science Communicator, for helping to write this blog post.

Copyright © 2019-2025 Vedere AI. All Rights Reserved.