Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an appropriate error metric for density estimation. For example, when estimating population densities in a geographic region, a small Wasserstein distance means that the estimate is able to capture roughly where the population mass is. In this work we study differentially private density estimation in the Wasserstein distance. We design and analyze instance-optimal algorithms for this problem that can adapt to easy instances.
For distributions…Apple Machine Learning Research
Samplable Anonymous Aggregation for Private Federated Data Analytics
We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Locally differentially private algorithms require little trust but are (provably) limited in their utility. Centrally differentially private algorithms can allow significantly better utility but require a trusted curator. This gap has led to significant interest in the design and implementation of simple cryptographic primitives, that can allow central-like utility guarantees without having to trust a central server.
Our first contribution is to…Apple Machine Learning Research
Towards Automated Accessibility Report Generation for Mobile Apps
Many apps have basic accessibility issues, like missing labels or low contrast. Automated tools can help app developers catch basic issues, but can be laborious to run or require writing dedicated tests. In this work, we developed a system to generate accessibility reports from mobile apps through a collaborative process with accessibility stakeholders at Apple. Our method combines varied data collection methods (e.g., app crawling, manual recording) with an existing accessibility scanner. Many such scanners are based on single-screen scanning, and a key problem in whole app accessibility…Apple Machine Learning Research
Improving GFlowNets for Text-to-Image Diffusion Alignment
This paper was accepted at the Foundation Models in the Wild workshop at ICML 2024.
Diffusion models have become the de-facto approach for generating visual data, which are trained to match the distribution of the training dataset. In addition, we also want to control generation to fulfill desired properties such as alignment to a text description, which can be specified with a black-box reward function. Prior works fine-tune pretrained diffusion models to achieve this goal through reinforcement learning-based algorithms. Nonetheless, they suffer from issues including slow credit assignment as…Apple Machine Learning Research
Projected Language Models: A Large Model Pre-Segmented Into Smaller Ones
This paper has been accepted at the Foundation Models in the Wild workshop at ICML 2024.
Large language models are versatile tools but are not suitable for small inference budgets. Small models have more efficient inference but their lower capacity means that their performance can be good only if one limits their scope to a specialized domain. This paper explores how to get a small language model with good specialized accuracy, even when specialization data is unknown during pretraining. We propose a novel architecture, projected networks (PN). PN is a high capacity network whose parameters…Apple Machine Learning Research
PINE: Efficient Norm-Bound Verification for Secret-Shared Vectors
Secure aggregation of high-dimensional vectors is a fundamental primitive in federated statistics and learning. A two-server system such as PRIO allows for scalable aggregation of secret-shared vectors. Adversarial clients might try to manipulate the aggregate, so it is important to ensure that each (secret-shared) contribution is well-formed. In this work, we focus on the important and well-studied goal of ensuring that each contribution vector has bounded Euclidean norm. Existing protocols for ensuring bounded-norm contributions either incur a large communication overhead, or only allow for…Apple Machine Learning Research
On a Neural Implementation of Brenier’s Polar Factorization
In 1991, Brenier proved a theorem that generalizes the polar decomposition for square matrices — factored as PSD ×times× unitary — to any vector field F:Rd→RdF:mathbb{R}^drightarrow mathbb{R}^dF:Rd→Rd. The theorem, known as the polar factorization theorem, states that any field FFF can be recovered as the composition of the gradient of a convex function uuu with a measure-preserving map MMM, namely F=∇u∘MF=nabla u circ MF=∇u∘M. We propose a practical implementation of this far-reaching theoretical result, and explore possible uses within machine learning. The theorem is closely related…Apple Machine Learning Research
Ferretv2: An Improved Baseline for Referring and Grounding
While Ferret seamlessly integrates regional understanding into the Large Language Model (LLM) to facilitate its referring and grounding capability, it poses certain limitations: constrained by the pre-trained fixed visual encoder and failed to perform well on broader tasks. In this work, we unveil Ferret-v2, a significant upgrade to Ferret, with three key designs. (1) Any resolution grounding and referring: A flexible approach that effortlessly handles higher image resolution, improving the model’s ability to process and understand images in greater detail. (2) Multi-granularity visual…Apple Machine Learning Research
Whispering Experts: Toxicity Mitigation in Pre-trained Language Models by Dampening Expert Neurons
An important issue with Large Language Models (LLMs) is their undesired ability to generate toxic language. In this work, we show that the neurons responsible for toxicity can be determined by their power to discriminate toxic sentences, and that toxic language can be mitigated by reducing their activation levels proportionally to this power. We propose AUROC adaptation (AURA), an intervention that can be applied to any pre-trained LLM to mitigate toxicity. As the intervention is proportional to the ability of each neuron to discriminate toxic content, it is free of any model-dependent…Apple Machine Learning Research