Pre-Trained Foundation Model Representations to Uncover Breathing Patterns in Speech

Human speech production involves coordinated respiratory action to elicit acoustic speech signals. Typically, speech is produced when air is forced from the lungs and modulated by the vocal tract, with such actions interspersed by moments of inhalation that refill the lungs. Respiratory rate (RR) is a vital metric used to assess the overall health, fitness, and general well-being of an individual. Existing approaches to measuring RR (the number of breaths taken per minute) require specialized equipment or training. Studies…

Apple Machine Learning Research