A key challenge in many modern data analysis tasks is that user data is heterogeneous. Different users may possess vastly different numbers of data points. More importantly, it cannot be assumed that all users sample from the same underlying distribution. This is true, for example, of language data, where different speech styles result in data heterogeneity. In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We…
Apple Machine Learning Research
Two-Layer Bandit Optimization for Recommendations
Online commercial app marketplaces serve millions of apps to billions of users in an efficient manner. Bandit optimization algorithms are used to ensure that the recommendations are relevant, and converge to the best performing content over time. However, directly applying bandits to real-world systems, where the catalog of items is dynamic and continuously refreshed, is not straightforward. One of the challenges we face is the existence of several competing content surfacing components, a phenomenon not unusual in large-scale recommender systems. This often leads to challenging scenarios…
Toward Supporting Quality Alt Text in Computing Publications
While researchers have examined alternative (alt) text for social media and news contexts, few have studied the status and challenges of authoring alt text for figures in computing-related publications. These figures are distinct, often conveying dense visual information, and may necessitate unique accessibility solutions. Accordingly, we explored how to support authors in creating alt text in computing publications, specifically in the field of human-computer interaction (HCI). We conducted two studies: (1) an analysis of 300 recently published figures at a general HCI conference (ACM CHI)…
PhysioMTL: Personalizing Physiological Patterns using Optimal Transport Multi-Task Regression
Heart rate variability (HRV) is a practical and noninvasive measure of autonomic nervous system activity, which plays an essential role in cardiovascular health. However, using HRV to assess physiological status is challenging. Even in clinical settings, HRV is sensitive to acute stressors such as physical activity, mental stress, hydration, alcohol, and sleep. Wearable devices provide convenient HRV measurements, but the irregularity of measurements and uncaptured stressors can bias conventional analytical methods. To better interpret HRV measurements for downstream healthcare applications, we…
Providing Insights for Open-Response Surveys via End-to-End Context-Aware Clustering
Teachers often conduct surveys to collect data from a predefined group of students and gain insights into topics of interest. When analyzing surveys with open-ended textual responses, it is extremely time-consuming, labor-intensive, and difficult to manually process all the responses into an insightful and comprehensive report. In the analysis step, the teacher traditionally has to read each response and decide how to group them in order to extract insightful information. Even though it is possible to group the responses using only certain keywords, such an approach would…
Fusion-Id: A Photoplethysmography and Motion Sensor Fusion Biometric Authenticator With Few-Shot on-Boarding
The abundance of wrist-worn heart rate measuring devices enables long-term cardiovascular monitoring through photoplethysmography (PPG). Such signals contain unique identifiable information that can help in biometric authentication. In this work, we propose Fusion-ID, which uses wrist-worn PPG sensors fused with motion sensor data to perform biometric authentication on wrist-worn devices. We conducted a user study using a PPG and motion sensor enabled wrist-worn device and collected data from 247 users. We then propose a novel sensor fusion deep Siamese network architecture for feature embedding…
RGB-X Classification for Electronics Sorting
Effectively disassembling and recovering materials from waste electrical and electronic equipment (WEEE) is a critical step in moving global supply chains from carbon-intensive, mined materials to recycled and renewable ones. Conventional recycling processes rely on shredding and sorting waste streams, but for WEEE, which is composed of numerous dissimilar materials, we explore targeted disassembly of numerous objects for improved material recovery. Many WEEE objects share key features and therefore can look quite similar, but their material composition and internal component layout can…
ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer
Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher that is built on a hierarchical attention structure, adopting a novel attention operation which is capable of adjusting attention span in a self-adaptive manner. To achieve this goal, first, flow maps are regressed in each cross attention phase to locate the center of the search region. Next, a sampling grid is generated around the center, whose size, instead of…
Multi-objective Hyper-parameter Optimization of Behavioral Song Embeddings
Song embeddings are a key component of most music recommendation engines. In this work, we study the hyper-parameter optimization of behavioral song embeddings based on Word2Vec on a selection of downstream tasks, namely next-song recommendation, false neighbor rejection, and artist and genre clustering. We present new optimization objectives and metrics to monitor the effects of hyper-parameter optimization. We show that single-objective optimization can cause side effects on the non-optimized metrics and propose a simple multi-objective optimization to mitigate these effects. We find that…