ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices. Despite recent trends favoring alternative activation functions such as GELU or SiLU, which incur more computation, this study strongly advocates for reinstating ReLU activation in LLMs. We demonstrate that using the ReLU activation function has a negligible impact on convergence and performance while significantly reducing computation and weight transfer…Apple Machine Learning Research
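
As a concrete illustration of why ReLU helps at inference time, here is a minimal numpy sketch (toy sizes, not the paper's models) of exploiting activation sparsity in a feed-forward block: once ReLU zeroes an activation, the corresponding row of the down-projection weight never needs to be loaded or multiplied.

    import numpy as np

    # Toy FFN sizes (illustrative, not from the paper).
    d_model, d_ff = 8, 32
    rng = np.random.default_rng(0)
    W_up = rng.standard_normal((d_model, d_ff))
    W_down = rng.standard_normal((d_ff, d_model))
    x = rng.standard_normal(d_model)

    # Dense path: compute the full down-projection.
    h = np.maximum(x @ W_up, 0.0)   # ReLU leaves many entries exactly 0
    dense_out = h @ W_down

    # Sparse path: only rows of W_down whose activation is nonzero
    # contribute, so the rest can be skipped entirely.
    nz = np.flatnonzero(h)
    sparse_out = h[nz] @ W_down[nz]

    assert np.allclose(dense_out, sparse_out)
    print(f"active neurons: {nz.size}/{d_ff}")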

Guiding Instruction-based Image Editing via Multimodal Large Language Models

Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to capture and follow. Multimodal large language models (MLLMs) show promising capabilities in cross-modal understanding and visual-aware response generation via LMs. We investigate how MLLMs facilitate edit instructions and present MLLM-Guided Image Editing (MGIE). MGIE learns to derive expressive instructions and provides explicit guidance. The editing…Apple Machine Learning Research
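
A rough sketch of the pipeline the abstract describes, with purely hypothetical stand-in functions (mllm_expand and diffusion_edit are illustrative names, not the paper's API): the MLLM first rewrites a terse command into an expressive, visually grounded instruction, which then conditions the editing model.

    # Hypothetical stand-ins for MGIE's components; names are illustrative only.
    def mllm_expand(instruction: str, image_desc: str) -> str:
        # In MGIE, an MLLM turns a terse command into an expressive,
        # visually grounded edit instruction. Here we just template it.
        return (f"Edit the image ({image_desc}) so that: {instruction}, "
                f"keeping other content unchanged.")

    def diffusion_edit(image, guidance: str):
        # Placeholder for the editing model conditioned on derived guidance.
        return f"<edited image conditioned on: {guidance!r}>"

    brief = "make it healthier"
    image = "a photo of a pepperoni pizza"
    expressive = mllm_expand(brief, image)
    print(diffusion_edit(image, expressive))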

How Far Are We from Intelligent Visual Deductive Reasoning?

This paper was accepted at the How Far Are We from AGI? workshop at ICLR 2024.
Vision-Language Models (VLMs) such as GPT-4V have recently demonstrated incredible strides on diverse vision-language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blind spots in the current SOTA VLMs. Specifically, we leverage Raven’s Progressive Matrices (RPMs) to assess VLMs’ abilities to perform multi-hop relational and deductive reasoning relying solely on visual clues. We perform comprehensive evaluations of several popular VLMs…Apple Machine Learning Research
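
To make the evaluation setup concrete, here is a toy, text-only stand-in for an RPM-style probe. The real benchmark renders visual panels and queries the VLM; the solver below is a hand-coded heuristic, not a model.

    import random

    # Toy RPM-style probe (illustrative only): panels are (shape, count)
    # pairs and the hidden rule is "count increases left to right within a
    # row". The solver must choose the missing ninth panel from options.
    random.seed(0)
    rows = [[("circle", c) for c in (1, 2, 3)],
            [("square", c) for c in (1, 2, 3)],
            [("triangle", 1), ("triangle", 2)]]   # final panel hidden
    answer = ("triangle", 3)
    choices = [("triangle", 3), ("triangle", 1), ("circle", 3), ("square", 2)]
    random.shuffle(choices)

    def toy_solver(rows, choices):
        # Stand-in for a VLM call; a real evaluation renders the panels as
        # an image and asks the model to pick an option. This heuristic
        # applies the row rule directly.
        shape, count = rows[-1][-1]
        guess = (shape, count + 1)
        return choices.index(guess) if guess in choices else 0

    pred = toy_solver(rows, choices)
    print("correct" if choices[pred] == answer else "wrong")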

MOFI: Learning Image Representation from Noisy Entity Annotated Images

In this paper, we introduce a novel approach to automatically assign entity labels to images from existing noisy image-text pairs. The approach employs a named entity recognition model to extract entities from text, and uses a CLIP model to select the right entities as the labels of the paired image. The approach is simple and can be readily scaled up to billions of image-text pairs mined from the web, through which we have successfully created a dataset with 2 million distinct entities. We study new training approaches on the collected dataset with large-scale entity labels…Apple Machine Learning Research
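
A minimal sketch of the labeling pipeline described above, with stand-in components (extract_entities and clip_similarity are hypothetical stubs for a real NER model and CLIP embedding similarity): run NER over the alt-text, score each entity against the image, and keep the entities that match.

    def extract_entities(text: str) -> list[str]:
        # Stand-in for a real NER model (e.g., a spaCy pipeline).
        known = {"Eiffel Tower", "Paris"}
        return [e for e in known if e in text]

    def clip_similarity(image, entity: str) -> float:
        # Stand-in for cosine similarity of CLIP image and text embeddings.
        return {"Eiffel Tower": 0.31, "Paris": 0.12}.get(entity, 0.0)

    def entity_labels(image, alt_text: str, threshold: float = 0.2) -> list[str]:
        return [e for e in extract_entities(alt_text)
                if clip_similarity(image, e) >= threshold]

    print(entity_labels("<image>", "The Eiffel Tower at night, Paris"))
    # -> ['Eiffel Tower']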

Large Language Models as Generalizable Policies for Embodied Tasks

We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement Learning Policy (LLaRP), adapts a pre-trained frozen LLM to take as input text instructions and visual egocentric observations and output actions directly in the environment. Using reinforcement learning, we train LLaRP to see and act solely through environmental interactions. We show that LLaRP is robust to complex paraphrasings of task instructions and can generalize to new tasks that require novel optimal behavior. In…Apple Machine Learning Research
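
Schematically, the setup pairs a frozen LM backbone with trainable input and output adapters. The PyTorch sketch below uses toy sizes and a generic transformer encoder as a stand-in for the pre-trained LLM; only the visual adapter and action head train, matching the abstract's description of a frozen LM.

    import torch
    import torch.nn as nn

    # Schematic of a LLaRP-style setup: frozen LM backbone, trainable
    # visual adapter, trainable action head. Sizes are toy values.
    d_model, n_actions = 64, 6

    backbone = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2,
    )
    for p in backbone.parameters():      # the LM stays frozen
        p.requires_grad = False

    visual_adapter = nn.Linear(128, d_model)  # egocentric features -> LM space
    action_head = nn.Linear(d_model, n_actions)

    instr_tokens = torch.randn(1, 8, d_model)  # stand-in: embedded instruction
    obs_feats = torch.randn(1, 4, 128)         # stand-in: visual encoder output

    seq = torch.cat([instr_tokens, visual_adapter(obs_feats)], dim=1)
    logits = action_head(backbone(seq)[:, -1])  # policy over discrete actions
    action = torch.distributions.Categorical(logits=logits).sample()
    print(action.item())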

Compressing LLMs: The Truth is Rarely Pure and Never Simple

Despite their remarkable achievements, modern Large Language Models (LLMs) incur exorbitant computational and memory footprints. Recently, several works have shown significant success in training-free and data-free compression (pruning and quantization) of LLMs, achieving 50-60% sparsity and reducing the bit-width down to 3 or 4 bits per weight, with negligible perplexity degradation over the uncompressed baseline. As recent research efforts are focused on developing increasingly sophisticated compression methods, our work takes a step back and re-evaluates the effectiveness of existing…Apple Machine Learning Research
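
For reference, the two training-free operations the abstract mentions are simple to state; a numpy sketch of their most naive forms (magnitude pruning to 50% sparsity and symmetric round-to-nearest 4-bit quantization) follows. The paper's point is precisely that perplexity after such compression can look benign while harder capabilities degrade, so a sketch like this is a starting point, not an evaluation.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((256, 256)).astype(np.float32)

    # Magnitude pruning: zero out the 50% smallest-magnitude weights.
    thresh = np.quantile(np.abs(W), 0.5)
    W_pruned = np.where(np.abs(W) >= thresh, W, 0.0)

    # Symmetric round-to-nearest 4-bit quantization (integer levels -8..7).
    n_bits = 4
    scale = np.abs(W).max() / (2 ** (n_bits - 1) - 1)
    W_quant = np.clip(np.round(W / scale),
                      -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1) * scale

    for name, W_hat in [("50% sparsity", W_pruned), ("4-bit RTN", W_quant)]:
        err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
        print(f"{name}: relative weight error {err:.3f}")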

When can transformers reason with abstract symbols?

We investigate the capabilities of transformer models on relational reasoning tasks. In these tasks, models are trained on a set of strings encoding abstract relations, and are then tested out-of-distribution on data that contains symbols that did not appear in the training dataset. We prove that for any relational reasoning task in a large family of tasks, transformers learn the abstract relations and generalize to the test set when trained by gradient descent on sufficiently large quantities of training data. This is in contrast to classical fully-connected networks, which we prove fail to…Apple Machine Learning Research
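
The train/test split the abstract describes is easy to picture with a toy relational task: the relation ("are the two symbols equal?") stays fixed, but test-time symbols are disjoint from training symbols, so success requires learning the abstract relation rather than memorizing symbol identities. A small data-generation sketch (illustrative, not the paper's task family):

    import random

    random.seed(0)
    train_syms = list(range(100))
    test_syms = list(range(100, 120))   # never seen during training

    def sample(symbols):
        a = random.choice(symbols)
        b = a if random.random() < 0.5 else random.choice(symbols)
        return (a, b), int(a == b)      # pair of symbols, "same?" label

    train_set = [sample(train_syms) for _ in range(10_000)]
    test_set = [sample(test_syms) for _ in range(1_000)]
    print(train_set[0], test_set[0])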

Poly-View Contrastive Learning

Contrastive learning typically matches pairs of related views among a number of unrelated negative views.
Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views, which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics.
We show that with unlimited computation, one should maximize the number of related views, and with a fixed compute budget, it is beneficial to decrease the number of unique samples whilst increasing the number of views of…Apple Machine Learning Research
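
One simple way to use M > 2 related views, shown below, is to average a standard InfoNCE loss over all ordered pairs of views; this is an illustrative PyTorch baseline, not the information-maximization objectives the paper derives.

    import torch
    import torch.nn.functional as F

    def polyview_info_nce(z, temperature=0.1):
        # z: (M, N, D) = M related views of N samples, D-dim embeddings.
        # Other samples in the batch act as the unrelated negatives.
        M, N, _ = z.shape
        z = F.normalize(z, dim=-1)
        loss, labels = 0.0, torch.arange(N)
        for i in range(M):
            for j in range(M):
                if i == j:
                    continue
                logits = z[i] @ z[j].T / temperature   # (N, N) similarities
                loss = loss + F.cross_entropy(logits, labels)
        return loss / (M * (M - 1))

    z = torch.randn(4, 8, 16)   # M=4 views, N=8 samples, D=16
    print(polyview_info_nce(z).item())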

Label-Efficient Sleep Staging Using Transformers Pre-trained with Position Prediction

Sleep staging is a clinically important task for diagnosing various sleep disorders but remains challenging to deploy at scale because it requires clinical expertise, among other reasons. Deep learning models can perform the task but at the expense of large labeled datasets, which are infeasible to procure at scale. While self-supervised learning (SSL) can mitigate this need, recent studies on SSL for sleep staging have shown that performance gains saturate after training with labeled data from only tens of subjects, and hence cannot match the peak performance attained with larger datasets. We…Apple Machine Learning Research
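
A sketch of a position-prediction pretext task of the kind the title refers to: embed a sequence of signal epochs, shuffle them, and train the encoder to recover each epoch's original position. The sizes and the generic transformer encoder below are toy stand-ins, not the paper's architecture or preprocessing.

    import torch
    import torch.nn as nn

    seq_len, d_model = 10, 32
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2,
    )
    pos_head = nn.Linear(d_model, seq_len)    # classify position 0..seq_len-1

    epochs = torch.randn(1, seq_len, d_model)  # stand-in: epoch embeddings
    perm = torch.randperm(seq_len)             # shuffle the sequence
    logits = pos_head(encoder(epochs[:, perm]))  # (1, seq_len, seq_len)

    # Each shuffled slot must predict the original position of its epoch.
    loss = nn.functional.cross_entropy(logits.squeeze(0), perm)
    loss.backward()
    print(loss.item())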

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7× Faster Pre-training on Web-scale Image-Text Data

Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text embeddings. However, the pairwise similarity computation in the contrastive loss between image and text pairs poses computational challenges. This paper presents a novel weakly supervised pre-training method for vision models on web-scale image-text data. The proposed method reframes pre-training on image-text data as a classification task. Consequently, it eliminates the need for pairwise similarity computations in contrastive loss, achieving a remarkable 2.7…Apple Machine Learning Research
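
The reframing is straightforward to sketch: turn each caption into a multi-label target over a fixed vocabulary (CatLIP maps caption nouns to WordNet synsets; naive word matching stands in for that here) and train the image encoder with binary cross-entropy, so no pairwise image-text similarity matrix is ever formed.

    import torch
    import torch.nn as nn

    # Toy label vocabulary and encoder stand-ins; not the paper's setup.
    vocab = {"dog": 0, "frisbee": 1, "beach": 2, "car": 3}

    def caption_targets(caption: str) -> torch.Tensor:
        # Multi-hot target: which vocabulary entries the caption mentions.
        y = torch.zeros(len(vocab))
        for word in caption.lower().split():
            if word in vocab:
                y[vocab[word]] = 1.0
        return y

    image_feats = torch.randn(1, 128)          # stand-in: encoder output
    classifier = nn.Linear(128, len(vocab))
    target = caption_targets("a dog catching a frisbee on the beach").unsqueeze(0)

    # Multi-label BCE replaces the pairwise contrastive objective.
    loss = nn.functional.binary_cross_entropy_with_logits(
        classifier(image_feats), target)
    loss.backward()
    print(loss.item())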