Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted its accuracy to a new level. The E2E systems implicitly model all conventional ASR components, such as the acoustic model (AM) and the language model (LM), in a single network trained on audio-text pairs. Despite this simpler system architecture, fusing a separate LM, trained exclusively on text corpora, into the E2E system has proven to be beneficial. However, the application of LM fusion presents certain drawbacks, such as its inability to address the domain…Apple Machine Learning Research
Hybrid Model Learning for Cardiovascular Biomarkers Inference
This paper was accepted at the workshop Deep Generative Models for Health at NeurIPS 2023.
Cardiovascular diseases (CVDs) are a major global health concern, making the longitudinal monitoring of cardiovascular biomarkers vital for early diagnosis and intervention. A core challenge is the inference of cardiac pulse parameters from pulse waves, especially when acquired from wearable sensors at peripheral body locations. Traditional machine learning (ML) approaches face hurdles in this context due to the scarcity of labeled data, primarily sourced from clinical settings. Simultaneously, physical…
One Wide Feedforward is All You Need
This paper was accepted at the WMT conference at EMNLP.
The Transformer architecture has two main non-embedding components: Attention and the Feed Forward Network (FFN). Attention captures interdependencies between words regardless of their position, while the FFN non-linearly transforms each input token independently. In this work, we explore the role of the FFN and find that, despite taking up a significant fraction of the model’s parameters, it is highly redundant. Concretely, we are able to substantially reduce the number of parameters with only a modest drop in accuracy by…
Bin Prediction for Better Conformal Prediction
This paper was accepted at the workshop on Regulatable ML at NeurIPS 2023.
Conformal Prediction (CP) is a method of estimating risk or uncertainty when using Machine Learning to help abide by common Risk Management regulations often seen in fields like healthcare and finance. CP for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of the issues can be addressed by estimating a distribution over the output, but in reality, such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent…
Simulation-based Inference for Cardiovascular Models
This paper was accepted at the workshop Machine Learning and the Physical Sciences at NeurIPS 2023.
Over the past decades, hemodynamics simulators have steadily evolved and have become tools of choice for studying cardiovascular systems in silico. This comes naturally at the cost of increasing complexity since state-of-the-art models are non-linear partial differential equations depending on many parameters. While such tools are routinely used to simulate hemodynamics given physiological parameters, solving the related inverse problems (mapping waveforms to physiological parameters) has…
FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline
Super-resolution (SR) techniques have recently been proposed to upscale the outputs of neural radiance fields (NeRF) and generate high-quality images with enhanced inference speeds. However, existing NeRF+SR methods increase training overhead by using extra input features, loss functions, and/or expensive training procedures such as knowledge distillation. In this paper, we aim to leverage SR for efficiency gains without costly training or architectural changes. Specifically, we build a simple NeRF+SR pipeline that directly combines existing modules, and we propose a lightweight augmentation…
Unbalanced Low-Rank Optimal Transport Solvers
Two salient limitations have long hindered the relevance of optimal transport methods to machine learning. First, the computational cost of standard sample-based solvers (when used on batches of samples) is prohibitive. Second, the mass conservation constraint makes OT solvers too rigid in practice: because they must match all points from both measures, their output can be heavily influenced by outliers. A flurry of recent works has addressed these computational and modeling limitations. Still, it has resulted in two separate strains of methods: while the computational outlook was…
Large-scale Training of Foundation Models for Wearable Biosignals
Tracking biosignals is crucial for monitoring wellness and preempting the development of severe medical conditions. Today, wearable devices can conveniently record various biosignals, creating the opportunity to monitor health status without disruption to one’s daily routine. Despite the widespread use of wearable devices and existing digital biomarkers, the absence of curated data with annotated medical labels hinders the development of new biomarkers to measure common health conditions. In fact, medical datasets are usually small in comparison to other domains, which is an obstacle for…
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Recent advances in deep learning and automatic speech recognition have boosted the accuracy of end-to-end speech recognition to a new level. However, recognition of personal content such as contact names remains a challenge. In this work, we present a personalization solution for an end-to-end system based on connectionist temporal classification. Our solution uses a class-based language model, in which a general language model provides modeling of the context for named entity classes, and personal named entities are compiled in a separate finite state transducer. We further introduce a…
Deploying Attention-Based Vision Transformers to Apple Neural Engine