Apple – Page 29 – Vedere AI

Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials

June 12, 2024

by Apple

In practice, training using federated learning can be orders of magnitude slower than standard centralized training. This severely limits the amount of experimentation and tuning that can be done, making it challenging to obtain good performance on a given task. Server-side proxy data can be used to run training simulations, for instance for hyperparameter tuning. This can greatly speed up the training pipeline by reducing the number of tuning runs to be performed overall on the true clients. However, it is challenging to ensure that these simulations accurately reflect the dynamics of the…

Apple Machine Learning Research
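
The excerpt above is cut off before it describes the method, so the following is only a minimal sketch of what the title's "mixture of Dirichlet-multinomials" could look like when used to generate simulated clients from proxy data. The function names, mixture weights, and concentration parameters are all hypothetical and are not taken from the paper.

```python
import random


def sample_dirichlet(alpha, rng):
    # Dirichlet draw obtained by normalising independent Gamma variates.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_client_counts(mixture_weights, alphas, n_samples, rng):
    """Draw one simulated client's per-class counts (hypothetical helper)."""
    # 1. Pick which mixture component this client belongs to.
    component = rng.choices(range(len(alphas)), weights=mixture_weights, k=1)[0]
    # 2. Draw the client's class proportions from that component's Dirichlet.
    proportions = sample_dirichlet(alphas[component], rng)
    # 3. Draw the class counts from a multinomial with those proportions.
    labels = rng.choices(range(len(proportions)), weights=proportions, k=n_samples)
    counts = [0] * len(proportions)
    for label in labels:
        counts[label] += 1
    return component, counts


rng = random.Random(0)
mixture_weights = [0.7, 0.3]                   # assumed, for illustration only
alphas = [[5.0, 5.0, 5.0], [0.2, 0.2, 5.0]]    # assumed concentration parameters
clients = [sample_client_counts(mixture_weights, alphas, 50, rng) for _ in range(4)]
for component, counts in clients:
    print(component, counts)
```

Clients drawn from the first component tend to have roughly balanced classes, while clients from the second are skewed towards the last class. That kind of between-client heterogeneity is what a simulation would need to reproduce.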

Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation

June 12, 2024

by Apple

Human evaluation is a critical component in machine translation system development and has received much attention in text translation research. However, little prior work exists on the topic of human evaluation for speech translation, which adds additional challenges such as noisy data and segmentation mismatches. We take first steps to fill this gap by conducting a comprehensive human evaluation of the results of several shared tasks from the last International Workshop on Spoken Language Translation (IWSLT 2023). We propose an effective evaluation strategy based on automatic resegmentation…

Apple Machine Learning Research

Introducing Apple’s On-Device and Server Foundation Models

June 10, 2024

by Apple

Apple Machine Learning Research

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

June 10, 2024

by Apple

Apple Machine Learning Research

AGRaME: Any Granularity Ranking with Multi-Vector Embeddings

June 4, 2024

by Apple

Ranking is a fundamental and popular problem in search. However, existing ranking algorithms usually restrict the granularity of ranking to full passages or require a specific dense index for each desired level of granularity. Such lack of flexibility in granularity negatively affects many applications that can benefit from more granular ranking, such as sentence-level ranking for open-domain question-answering, or proposition-level ranking for attribution. In this work, we introduce the idea of any-granularity ranking which leverages multi-vector approaches to rank at varying levels of…

Apple Machine Learning Research
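
To illustrate how a multi-vector representation can support ranking below the passage level, here is a minimal sketch. It assumes a late-interaction score (the sum over query vectors of the best-matching unit vector), which is one common multi-vector scheme. The excerpt is truncated and does not confirm that this is the paper's scoring function. The toy vectors and the `rank_units` helper are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def maxsim_score(query_vecs, unit_vecs):
    # Late-interaction score: each query vector is matched to its
    # most similar vector inside the unit being scored.
    return sum(max(dot(q, d) for d in unit_vecs) for q in query_vecs)


def rank_units(query_vecs, passage_vecs, spans):
    """Rank sub-units of one encoded passage; spans are (start, end) token ranges."""
    scored = [(maxsim_score(query_vecs, passage_vecs[s:e]), (s, e)) for s, e in spans]
    return sorted(scored, reverse=True)


# Toy token vectors for a four-token passage made of two "sentences".
passage_vecs = [normalise(v) for v in [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]]]
query_vecs = [normalise([0, 1])]
sentence_spans = [(0, 2), (2, 4)]

ranking = rank_units(query_vecs, passage_vecs, sentence_spans)
print(ranking[0][1])  # span of the best-scoring sentence
```

The same passage encoding is reused for every granularity. Only the spans change, so passing `[(0, 4)]` scores the whole passage and passing sentence or proposition spans scores finer units without building a separate index.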

Entity Disambiguation via Fusion Entity Decoding

June 4, 2024

by Apple

Entity disambiguation (ED), which links the mentions of ambiguous entities to their referent entities in a knowledge base, serves as a core component in entity linking (EL). Existing generative approaches demonstrate improved accuracy compared to classification approaches under the standardized ZELDA benchmark. Nevertheless, generative approaches suffer from the need for large-scale pre-training and inefficient generation. Most importantly, entity descriptions, which could contain crucial information to distinguish similar entities from each other, are often overlooked. We propose an…

Apple Machine Learning Research

Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation

May 31, 2024

by Apple

This paper presents the Embedding Pose Graph (EPG), an innovative method that combines the strengths of foundation models with a simple 3D representation suitable for robotics applications. Addressing the need for efficient spatial understanding in robotics, EPG provides a compact yet powerful approach by attaching foundation model features to the nodes of a pose graph. Unlike traditional methods that rely on bulky data formats like voxel grids or point clouds, EPG is lightweight and scalable. It facilitates a range of robotic tasks, including open-vocabulary querying, disambiguation…

Apple Machine Learning Research
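
The idea of attaching feature vectors to pose-graph nodes can be pictured with a small data structure. This is a sketch under stated assumptions, not the paper's implementation: the class names, the 2D `(x, y, yaw)` pose, the toy three-dimensional features, and the cosine-similarity query are all hypothetical. In practice the node features and the query vector would come from a foundation model.

```python
import math
from dataclasses import dataclass, field


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


@dataclass
class PoseNode:
    node_id: int
    pose: tuple      # (x, y, yaw); a simplified pose, assumed for illustration
    feature: list    # embedding attached to this node


@dataclass
class EmbeddingPoseGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, a, b):
        self.edges.append((a, b))

    def query(self, query_feature, top_k=1):
        # Return the nodes whose features are most similar to the query.
        ranked = sorted(
            self.nodes.values(),
            key=lambda n: cosine(query_feature, n.feature),
            reverse=True,
        )
        return ranked[:top_k]


graph = EmbeddingPoseGraph()
graph.add_node(PoseNode(0, (0.0, 0.0, 0.0), [1.0, 0.0, 0.0]))
graph.add_node(PoseNode(1, (2.0, 0.0, 0.0), [0.0, 1.0, 0.0]))
graph.add_node(PoseNode(2, (2.0, 3.0, 1.57), [0.7, 0.7, 0.0]))
graph.add_edge(0, 1)
graph.add_edge(1, 2)

matches = graph.query([0.1, 0.9, 0.0], top_k=2)
print([(n.node_id, n.pose) for n in matches])
```

Storage grows with the number of poses rather than with the volume of the scene, which is the sense in which such a graph is lighter than a voxel grid or point cloud.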

CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement

May 30, 2024

by Apple

Contrastive language image pretraining (CLIP) is a standard method for training vision-language models. While CLIP is scalable, promptable, and robust to distribution shifts on image classification tasks, it lacks object localization capabilities. This paper studies the following question: Can we augment CLIP training with task-specific vision models from model zoos to improve its visual representations? Towards this end, we leverage open-source task-specific vision models to generate pseudo-labels for an uncurated and noisy image-text dataset. Subsequently, we train CLIP models on these…

Apple Machine Learning Research

Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching

May 29, 2024

by Apple

This paper was accepted at the Image Matching: Local Features & Beyond workshop at CVPR 2024.
Identifying robust and accurate correspondences across images is a fundamental problem in computer vision that enables various downstream tasks. Recent semi-dense matching methods emphasize the effectiveness of fusing relevant cross-view information through Transformer. In this paper, we propose several improvements upon this paradigm. Firstly, we introduce affine-based local attention to model cross-view deformations. Secondly, we present selective fusion to merge local and global messages from…

Apple Machine Learning Research

KPConvX: Modernizing Kernel Point Convolution with Kernel Attention

May 28, 2024

by Apple

In the field of deep point cloud understanding, KPConv is a unique architecture that uses kernel points to locate convolutional weights in space, instead of relying on Multi-Layer Perceptron (MLP) encodings. While it initially achieved success, it has since been surpassed by recent MLP networks that employ updated designs and training strategies. Building upon the kernel point principle, we present two novel designs: KPConvD (depthwise KPConv), a lighter design that enables the use of deeper architectures, and KPConvX, an innovative design that scales the depthwise convolutional weights of…

Apple Machine Learning Research
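
The kernel point principle mentioned above can be sketched for a single output point. This is a rough illustration only: it assumes a linear influence function that falls from 1 at a kernel point to 0 at distance `sigma`, and depthwise (per-channel) weights. The function names and toy values are hypothetical, and the attention-based scaling of KPConvX is not shown because the excerpt is truncated before describing it.

```python
import math


def influence(offset, kernel_point, sigma):
    # Assumed linear influence: 1 at the kernel point, 0 beyond distance sigma.
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(offset, kernel_point)))
    return max(0.0, 1.0 - distance / sigma)


def depthwise_kpconv(center, neighbors, features, kernel_points, weights, sigma):
    """Depthwise kernel point convolution at one center point (sketch)."""
    channels = len(features[0])
    out = [0.0] * channels
    for position, feature in zip(neighbors, features):
        # Neighbor position relative to the point being convolved.
        offset = [p - c for p, c in zip(position, center)]
        for kernel_point, weight in zip(kernel_points, weights):
            h = influence(offset, kernel_point, sigma)
            if h == 0.0:
                continue
            # Depthwise: each channel has its own scalar weight per kernel point.
            for c in range(channels):
                out[c] += h * weight[c] * feature[c]
    return out


center = (0.0, 0.0, 0.0)
kernel_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights = [[1.0, 2.0], [0.5, 0.5]]              # one weight vector per kernel point
neighbors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
features = [[1.0, 1.0], [2.0, 4.0]]

out = depthwise_kpconv(center, neighbors, features, kernel_points, weights, sigma=1.0)
print(out)
```

Each neighbor's contribution depends on where it falls relative to the kernel points, so the weights are tied to positions in space rather than produced by an MLP encoding of the offset.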

