Amazon Releases New Public Data Set to Help Address “Cocktail Party” Problem

Amazon today announced the public release of a new data set that will help speech scientists address the difficult problem of separating speech signals in reverberant rooms with multiple speakers. In the field of automatic speech recognition, this problem is known as the “cocktail party” or “dinner party” problem; accordingly, we call our data set the Dinner Party Corpus, or DiPCo.

Accelerating parallel training of neural nets

Earlier this year, we reported a speech recognition system trained on a million hours of data, a feat made possible by semi-supervised learning, in which training data is annotated by machines rather than by people. These sorts of massive machine learning projects are becoming more common, and they require distributing the training process across multiple processors. Otherwise, training becomes too time-consuming.

New Alexa Research on Task-Oriented Dialogue Systems

Earlier this year, at Amazon’s re:MARS conference, Alexa head scientist Rohit Prasad unveiled Alexa Conversations, a new service that allows Alexa skill developers to more easily integrate conversational elements into their skills. The announcement is an indicator of the next stage in Alexa’s evolution: more-natural, dialogue-based engagements that enable Alexa to aggregate data and refine requests to better meet customer needs.