On-device Virtual Assistants powered by Automated Speech Recognition (ASR) require effective knowledge integration for the challenging entity-rich query recognition.
In this paper, we conduct an empirical study of modeling strategies for server-side rescoring of spoken information domain queries using various categories of Language Models (N-Gram word Language Models, sub-word neural LMs).
We investigate the combination of on-device and server-side signals, and demonstrate significant WER improvements of 23%-35% on various entity-centric query subpopulations
by integrating various server-side…Apple Machine Learning Research