Two salient limitations have long hindered the relevance of optimal transport methods to machine learning. First, the computational cost of standard sample-based solvers (when used on batches of samples) is prohibitive. Second, the mass conservation constraint makes OT solvers too rigid in practice: because they must match textit{all} points from both measures, their output can be heavily influenced by outliers. A flurry of recent works has addressed these computational and modeling limitations. Still it has resulted in two separate strains of methods: While the computational outlook was…Apple Machine Learning Research