Optimal transport (OT) theory describes general principles to define and select, among many possible choices, the most efficient way to map one probability measure onto another. That theory has mostly been used to estimate, given a pair of source and target probability measures, a parameterized map that can efficiently map the source onto the target. In many applications, such as predicting cell responses to treatments, the data measures (features of untreated/treated cells) that define optimal transport problems do not arise in isolation but are associated with a context (the treatment). To account for and…
Apple Machine Learning Research