Streaming On-Device Detection of Device Directed Speech from Voice and Touch-Based Invocation

When interacting with smart devices such as mobile phones or wearables, the user typically invokes a virtual assistant (VA) by saying a keyword or by pressing a button on the device. However, in many cases, the VA can be invoked accidentally by keyword-like speech or an accidental button press, which may have implications for user experience and privacy. To this end, we propose an acoustic false-trigger-mitigation (FTM) approach for on-device device-directed speech detection that simultaneously handles voice-trigger and touch-based invocation. To facilitate the model deployment…

Apple Machine Learning Research


