This paper was accepted at the Generative AI and Biology workshop at NeurIPS 2023.
In this paper we tackle the problem of generating a molecule conformation in 3D space given its 2D structure. We approach this problem through the lens of a diffusion model for functions on Riemannian manifolds. Our approach is simple and scalable, and obtains results that are on par with the state of the art while making no assumptions about the explicit structure of molecules.
Apple Machine Learning Research
Manifold Diffusion Fields
This paper was accepted at the Diffusion Models workshop at NeurIPS 2023.
Score-based models have quickly become the de facto choice for generative modeling of images, text and, more recently, molecules. However, to adapt score-based generative modeling to these domains, the score network needs to be carefully designed, hampering its applicability to arbitrary data domains. In this paper we tackle this problem by taking a functional view of data. This functional view allows us to cast seemingly different domains to a common shared representation. We then re-formulate the score function to…
Fast Optimal Locally Private Mean Estimation via Random Projections
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1+o(1)-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an…
FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Large language models’ inability to attribute their claims to external knowledge and their tendency to hallucinate make it difficult to trust their responses. Even humans are prone to factual errors in their writing. Therefore, verifying the factual accuracy of textual information, whether generated by large language models or curated by humans, is an important task. However, manually validating and correcting factual errors tends to be a tedious and labor-intensive process. In this paper, we propose FLEEK for automatic fact verification and correction. FLEEK automatically extracts factual…
4M: Massively Multimodal Masked Modeling
Current machine learning models for vision are often highly specialized and limited to a single modality and task. In contrast, recent large language models exhibit a wide range of capabilities, hinting at a possibility for similarly versatile models in computer vision. In this paper, we take a step in this direction and propose a multimodal training scheme called 4M. It consists of training a single unified Transformer encoder-decoder using a masked modeling objective across a wide range of input/output modalities – including text, images, geometric, and semantic…
Adaptive Weight Decay
We propose adaptive weight decay, which automatically tunes the hyper-parameter for weight decay during each training iteration. For classification problems, we propose changing the value of the weight decay hyper-parameter on the fly based on the strength of updates from the classification loss (i.e., gradient of cross-entropy), and the regularization loss (i.e., ℓ2-norm of the weights). We show that this simple modification can result in large improvements in adversarial robustness (an area which suffers from robust overfitting) without requiring extra data across various datasets and…
Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
Recent work in Natural Language Processing and Computer Vision has been using textual information – e.g., entity names and descriptions – available in knowledge graphs to ground neural models to high-quality structured data. However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English…
What Algorithms can Transformers Learn? A Study in Length Generalization
This paper was accepted at the MATH workshop at NeurIPS 2023.
Large language models exhibit surprising emergent generalization properties, yet also struggle on many simple reasoning tasks such as arithmetic and parity. This raises the question of whether and when Transformer models can learn the true algorithm for solving a task. We study the scope of Transformers’ abilities in the specific setting of length generalization on algorithmic tasks. Here, we propose a unifying framework to understand when and how Transformers can exhibit strong length generalization on a given task. Specifically, we…
TiC-CLIP: Continual Training of CLIP Models
This paper was accepted at the Workshop on Distribution Shifts at NeurIPS 2023.
Large-scale training of models has become increasingly expensive. In an ever-changing world where petabytes of new data are generated every day, we want to be able to continually train models. In this paper, we create a benchmark for continual large-scale training of CLIP models where the data distribution varies only by time. Compared with traditional continual learning literature, there is no hard separation of tasks, i.e., we assume an infinite stream of data in a canonical format arrives that exhibits…