Apple – Page 39 – Vedere AI

FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge

December 1, 2023

by Apple

Large language models’ inability to attribute their claims to external knowledge and their tendency to hallucinate make it difficult to trust their responses. Even humans are prone to factual errors in their writing. Therefore, verifying the factual accuracy of textual information, whether generated by large language models or curated by humans, is an important task. However, manually validating and correcting factual errors tends to be a tedious and labor-intensive process. In this paper, we propose FLEEK for automatic fact verification and correction. FLEEK automatically extracts factual…

Apple Machine Learning Research

Federated Learning for Speech Recognition: Revisiting Current Trends Towards Large-Scale ASR

November 30, 2023

by Apple

This paper was accepted at the Federated Learning in the Age of Foundation Models workshop at NeurIPS 2023.
While automatic speech recognition (ASR) has witnessed remarkable achievements in recent years, it has not garnered widespread focus within the federated learning (FL) and differential privacy (DP) communities. Meanwhile, ASR is also a well-suited benchmark for FL and DP as there is (i) a natural data split across users by using speaker information; (ii) heterogeneous data across speakers close to practical settings; (iii) interplay between acoustic and language modeling; (iv) and it…

Apple Machine Learning Research

Swap Agnostic Learning, or Characterizing Omniprediction via Multicalibration

November 30, 2023

by Apple

A recent line of work shows that notions of multigroup fairness imply surprisingly strong notions of omniprediction: loss minimization guarantees that apply not just for a specific loss function, but for any loss belonging to a large family of losses. While prior work has derived various notions of omniprediction from multigroup fairness guarantees of varying strength, it was unknown whether the connection goes in both directions. In this work, we answer this question in the affirmative, establishing equivalences between notions of multicalibration and omniprediction. The new definitions that…

Apple Machine Learning Research

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

November 30, 2023

by Apple

This paper was accepted at the UniReps Workshop at NeurIPS 2023.
The landscape of publicly available vision foundation models (VFMs), such as CLIP and Segment Anything Model (SAM), is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their pre-training objectives. For instance, CLIP excels in semantic understanding, while SAM specializes in spatial understanding for segmentation. In this work, we introduce a simple recipe to efficiently merge VFMs into a unified model that absorbs their expertise. Our method integrates techniques of multi-task learning, continual…

Apple Machine Learning Research

Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs

November 30, 2023

by Apple

Recent work in Natural Language Processing and Computer Vision has been using textual information – e.g., entity names and descriptions – available in knowledge graphs to ground neural models to high-quality structured data. However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English…

Apple Machine Learning Research

What Algorithms can Transformers Learn? A Study in Length Generalization

November 30, 2023

by Apple

This paper was accepted at the MATH workshop at NeurIPS 2023.
Large language models exhibit surprising emergent generalization properties, yet also struggle on many simple reasoning tasks such as arithmetic and parity. This raises the question of if and when Transformer models can learn the true algorithm for solving a task. We study the scope of Transformers’ abilities in the specific setting of length generalization on algorithmic tasks. Here, we propose a unifying framework to understand when and how Transformers can exhibit strong length generalization on a given task. Specifically, we…

Apple Machine Learning Research

TiC-CLIP: Continual Training of CLIP Models

November 30, 2023

by Apple

This paper was accepted at the workshop on Distribution Shifts at NeurIPS 2023.
Large-scale training of models has become increasingly expensive. In an ever-changing world where petabytes of new data are generated every day, we want to be able to continually train models. In this paper, we create a benchmark for continual large-scale training of CLIP models where the data distribution varies only by time. Compared with traditional continual learning literature, there is no hard separation of tasks, i.e., we assume an infinite stream of data in a canonical format arrives that exhibits…

Apple Machine Learning Research

Empirical Methods in Natural Language Processing (EMNLP) 2023

November 17, 2023

by Apple

Apple Machine Learning Research

How to Scale Your EMA

November 16, 2023

by Apple

Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule; for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functional copy of a target model whose parameters move towards those of its target model according to an Exponential Moving Average (EMA) at a rate parameterized by a momentum…

Apple Machine Learning Research
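
The two ingredients named in this excerpt, the linear learning-rate scaling rule for SGD and the EMA parameter update, can be sketched in a few lines. This is a minimal illustration only: the function names `scale_learning_rate` and `ema_update` and the flat-list parameter representation are assumptions made for the example, not the paper's code. The excerpt is cut off before it says how the momentum should change with batch size, so that part is not shown.

```python
from typing import List


def scale_learning_rate(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Linear scaling rule for SGD: learning rate grows in proportion to batch size."""
    return base_lr * batch_size / base_batch_size


def ema_update(ema_params: List[float], target_params: List[float], momentum: float) -> List[float]:
    """One EMA step: each EMA parameter moves toward its target parameter.

    A momentum close to 1 means the EMA copy changes slowly.
    """
    return [momentum * e + (1.0 - momentum) * t
            for e, t in zip(ema_params, target_params)]


if __name__ == "__main__":
    # Hypothetical values: quadrupling the batch size quadruples the learning rate.
    print(scale_learning_rate(base_lr=0.1, base_batch_size=256, batch_size=1024))
    # Hypothetical two-parameter model, one EMA step with momentum 0.9.
    print(ema_update([1.0, 2.0], [0.0, 4.0], momentum=0.9))
```

With a momentum of 0.9, each step keeps 90% of the current EMA value and takes 10% from the target model, so the EMA copy lags the target smoothly.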

Automating Behavioral Testing in Machine Translation

November 16, 2023

by Apple

Behavioral testing in NLP allows fine-grained evaluation of systems by examining their linguistic capabilities through the analysis of input-output behavior. Unfortunately, existing work on behavioral testing in Machine Translation (MT) is currently restricted to largely handcrafted tests covering a limited range of capabilities and languages. To address this limitation, we propose using Large Language Models (LLMs) to generate a diverse set of source sentences tailored to test the behavior of MT models in a range of situations. We can then verify whether the MT model exhibits the expected…

Apple Machine Learning Research
