Consider the supervised learning setting where the goal is to learn to predict labels y given points x from a distribution. An omnipredictor for a class L of loss functions and a class C of hypotheses is a predictor whose predictions incur less expected loss than the best hypothesis in C for every loss in L. Since the work of [GKR+21] that introduced the notion, there has been a large body of work in the setting of binary labels where y∈{0,1}, but much less is known about the regression setting where y∈[0,1] can be continuous. Our main conceptual contribution is the notion of sufficient…Apple Machine Learning Research