Echo devices have already attracted tens of millions of customers, but in the Alexa AI group, we’re constantly working to make Alexa’s speech recognition systems even more accurate.
Alexa at Interspeech 2018: How interaction histories can improve speech understanding
Alexa’s ability to act on spoken requests depends on statistical models that translate speech to text and text to actions. Historically, the models’ decisions were one-size-fits-all: the same utterance would produce the same action, regardless of context.
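As a rough, hypothetical illustration of the idea (not Alexa’s actual models), the snippet below re-ranks a customer’s candidate interpretations by blending each hypothesis’s model score with a simple prior computed from that customer’s past choices. The interpretation labels, scores, and weights are all made up.

```python
# Minimal sketch (not Amazon's implementation): re-ranking candidate
# interpretations using a customer's interaction history as a simple prior.
from collections import Counter

def rerank(hypotheses, history, history_weight=0.3):
    """hypotheses: list of (interpretation, model_score) pairs.
    history: list of interpretations the customer chose in the past.
    Returns hypotheses re-ranked by a blend of model score and history prior."""
    counts = Counter(history)
    total = sum(counts.values()) or 1

    def blended(item):
        interpretation, score = item
        prior = counts[interpretation] / total
        return (1 - history_weight) * score + history_weight * prior

    return sorted(hypotheses, key=blended, reverse=True)

# Example: "play hunger games" is ambiguous between soundtrack and audiobook.
hyps = [("PlayMusic:hunger_games_soundtrack", 0.52),
        ("PlayBook:hunger_games_audiobook", 0.48)]
past = ["PlayBook:hunger_games_audiobook"] * 3 + ["PlayMusic:other"]
print(rerank(hyps, past))  # history tips the balance toward the audiobook
```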
How Alexa is learning to converse more naturally
To handle more-natural spoken interactions, Alexa must track references through several rounds of conversation. If, for instance, a customer asks, “How far is it to Redmond?” and, after hearing the answer, follows up with “Find good Indian restaurants there,” Alexa should be able to infer that “there” refers to Redmond.
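A toy sketch of the kind of reference tracking described above: a follow-up utterance is scanned for anaphors such as “there”, and any match is filled in from the slots of the previous turn. The anaphor table and slot names are illustrative, not Alexa’s.

```python
# Illustrative sketch only: carrying a slot value from the previous turn to
# resolve an anaphor ("there", "it", ...) in a follow-up request.
ANAPHORS = {"there": "Location", "it": "Entity", "them": "Entity"}

def resolve_references(utterance, previous_slots):
    """previous_slots: dict of slot name -> value from the prior turn."""
    resolved = {}
    for word in utterance.lower().split():
        slot = ANAPHORS.get(word.strip(",.?!"))
        if slot and slot in previous_slots:
            resolved[slot] = previous_slots[slot]
    return resolved

previous_turn = {"Location": "Redmond"}   # from "How far is it to Redmond?"
follow_up = "Find good Indian restaurants there"
print(resolve_references(follow_up, previous_turn))  # {'Location': 'Redmond'}
```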
3 questions about Interspeech 2018 with Björn Hoffmeister
This year’s Interspeech — the largest conference in speech technology — will take place in Hyderabad, India, the first week of September. More than 40 Amazon researchers will be attending, including Björn Hoffmeister, the senior manager for machine learning in the Alexa Automatic Speech Recognition group. He took a few minutes to answer three questions about this year’s conference.
Alexa, do I need to use your wake word? How about now?
Here’s a fairly common interaction with Alexa: “Alexa, set volume to five”; “Alexa, play music”. Even though the queries come in quick succession, the customer needs to repeat the wake word “Alexa”. To allow for more natural interactions, the device could immediately re-enter its listening state after the first query, without wake-word repetition; but that would require it to detect whether a follow-up speech input is indeed a query intended for the device (“device-directed”) or just background speech (“non-device-directed”).
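Below is a hedged, illustrative sketch of such a device-directed / non-device-directed decision: a logistic score over a few plausible features (ASR confidence, whether the transcript matches a known intent, signal-to-noise ratio). The features, weights, and threshold are invented for illustration and are not the production system.

```python
# Hedged sketch: scoring whether follow-up audio is device-directed, using a
# hand-weighted combination of plausible features. Weights and threshold are
# made up for illustration, not taken from Alexa.
import math

def device_directed_score(asr_confidence, intent_match, snr_db):
    # Logistic over a weighted feature sum.
    z = 3.0 * asr_confidence + 2.0 * intent_match + 0.1 * snr_db - 3.5
    return 1.0 / (1.0 + math.exp(-z))

def is_device_directed(asr_confidence, intent_match, snr_db, threshold=0.5):
    return device_directed_score(asr_confidence, intent_match, snr_db) > threshold

print(is_device_directed(0.92, 1.0, 18))  # likely a follow-up query -> True
print(is_device_directed(0.35, 0.0, 4))   # likely background speech -> False
```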
Public release of fact-checking dataset quickly begins to pay dividends
At the annual meeting of the North American chapter of the Association for Computational Linguistics in June, researchers at Amazon and the University of Sheffield released a new dataset that can be used to train machine-learning systems to determine the veracity of factual assertions online. The dataset is called FEVER, for fact extraction and verification.
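For readers who want to experiment, here is a minimal sketch of loading FEVER-style data. It assumes the JSONL layout of the public release (one JSON object per line with `claim`, `label`, and `evidence` fields), which should be verified against the actual download.

```python
# Sketch of reading FEVER-style data; field names below reflect the public
# release as I understand it (id, claim, label, evidence) and should be
# checked against the downloaded files.
import json
from collections import Counter

def load_fever(path):
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

claims = load_fever("train.jsonl")  # hypothetical local path to the dataset
print(Counter(c["label"] for c in claims))  # SUPPORTS / REFUTES / NOT ENOUGH INFO
print(claims[0]["claim"])
```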
Shrinking machine learning models for offline use
Last week, the Alexa Auto team announced the release of its new Alexa Auto Software Development Kit (SDK), enabling developers to bring Alexa functionality to in-vehicle infotainment systems.
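Running models offline raises the general question of how they are shrunk to fit on-device. One common technique, shown in the generic sketch below, is 8-bit post-training quantization; this is an illustration of the idea, not the specific method used for Alexa Auto.

```python
# Generic sketch of one common model-shrinking technique (8-bit post-training
# quantization), not the specific approach used for Alexa Auto.
import numpy as np

def quantize_int8(weights):
    """Map float32 weights to int8 plus a scale factor (~4x smaller)."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print("size reduction: %.1fx" % (w.nbytes / q.nbytes))      # 4.0x
print("max abs error:", np.abs(w - dequantize(q, s)).max())  # small
```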
Safety-first AI for autonomous data centre cooling and industrial control
Many of society’s most pressing problems have grown increasingly complex, so the search for solutions can feel overwhelming. At DeepMind and Google, we believe that if we can use AI as a tool to discover new knowledge, solutions will be easier to reach.

In 2016, we jointly developed an AI-powered recommendation system to improve the energy efficiency of Google’s already highly optimised data centres. Our thinking was simple: even minor improvements would provide significant energy savings and reduce CO2 emissions to help combat climate change.

Now we’re taking this system to the next level: instead of human-implemented recommendations, our AI system is directly controlling data centre cooling, while remaining under the expert supervision of our data centre operators. This first-of-its-kind cloud-based control system is now safely delivering energy savings in multiple Google data centres.
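As an illustration of the “safety-first” pattern described above (and only that; the real system’s constraints and interfaces are not detailed in this post), the sketch below vets every recommended action against operator-defined limits and falls back to a conservative setpoint when a recommendation is out of range. The setpoint names and limits are hypothetical.

```python
# Illustrative "safety-first" control pattern: every AI-recommended action is
# checked against operator-defined constraints before it is applied, with a
# conservative fallback otherwise. Names and limits are hypothetical.
SAFE_LIMITS = {"chiller_setpoint_c": (16.0, 24.0), "fan_speed_pct": (30.0, 90.0)}
FALLBACK = {"chiller_setpoint_c": 20.0, "fan_speed_pct": 70.0}

def vet_action(recommended):
    """Return a safe action: keep in-range values, fall back otherwise."""
    safe = {}
    for name, value in recommended.items():
        lo, hi = SAFE_LIMITS[name]
        safe[name] = value if lo <= value <= hi else FALLBACK[name]
    return safe

print(vet_action({"chiller_setpoint_c": 17.5, "fan_speed_pct": 110.0}))
# -> {'chiller_setpoint_c': 17.5, 'fan_speed_pct': 70.0}
```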
A major milestone for the treatment of eye disease
We are delighted to announce the results of the first phase of our joint research partnership with Moorfields Eye Hospital, which could potentially transform the management of sight-threatening eye disease.

The results, published online in Nature Medicine (open access full text, see end of blog), show that our AI system can quickly interpret eye scans from routine clinical practice with unprecedented accuracy. It can correctly recommend how patients should be referred for treatment for over 50 sight-threatening eye diseases as accurately as world-leading expert doctors.

These are early results, but they show that our system could handle the wide variety of patients found in routine clinical practice. In the long term, we hope this will help doctors quickly prioritise patients who need urgent treatment, which could ultimately save sight.

A more streamlined process

Currently, eyecare professionals use optical coherence tomography (OCT) scans to help diagnose eye conditions. These 3D images provide a detailed map of the back of the eye, but they are hard to read and need expert analysis to interpret.

The time it takes to analyse these scans, combined with the sheer number of scans that healthcare professionals have to go through (over 1,000 a day at Moorfields alone), can lead to lengthy delays between scan and treatment, even when someone needs urgent care.
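The triage step can be pictured with the toy sketch below: per-condition probabilities are mapped to a single referral recommendation by taking the most urgent condition that crosses a confidence threshold. The condition names, threshold, and urgency ordering are illustrative and not taken from the paper.

```python
# Toy sketch of a referral-triage step: pick the most urgent condition whose
# probability crosses a threshold. Conditions, threshold, and urgency ordering
# are illustrative, not the system described in the Nature Medicine paper.
URGENCY = ["observation only", "routine", "semi-urgent", "urgent"]
CONDITION_URGENCY = {
    "choroidal neovascularisation": "urgent",
    "macular oedema": "semi-urgent",
    "drusen": "routine",
    "normal": "observation only",
}

def recommend_referral(probabilities, threshold=0.5):
    flagged = [c for c, p in probabilities.items() if p >= threshold]
    if not flagged:
        return "observation only"
    return max(flagged, key=lambda c: URGENCY.index(CONDITION_URGENCY[c]))

print(recommend_referral({"choroidal neovascularisation": 0.10,
                          "macular oedema": 0.72,
                          "drusen": 0.65,
                          "normal": 0.05}))  # -> 'semi-urgent'
```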
Automatic Transliteration Can Help Alexa Find Data Across Language Barriers
As Alexa-enabled devices continue to expand into new countries, finding information across languages that use different scripts becomes a more pressing challenge. For example, a Japanese music catalogue may contain names written in English or the various scripts used in Japanese — Kanji, Katakana, or Hiragana. When an Alexa customer, from anywhere in the world, asks for a certain song, album, or artist, we could have a mismatch between Alexa’s transcription of the request and the script used in the corresponding catalogue.
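A toy sketch of why transliteration helps with matching: if both the request transcript and the catalogue entries are mapped into one canonical romanized form, a lookup succeeds regardless of the script each side was written in. The tiny character table and the `transliterate()` helper below cover only this example; a real system would use a learned transliteration model.

```python
# Toy script-agnostic catalogue matching: transliterate both the query and the
# catalogue entries into one canonical form, then compare. The table below is
# a hack that covers only this example (it maps the long-vowel mark to "i").
TOY_TABLE = {"ビ": "bi", "ー": "i", "ト": "to", "ル": "ru", "ズ": "zu"}

def transliterate(text):
    return "".join(TOY_TABLE.get(ch, ch) for ch in text.lower())

def find_in_catalogue(query, catalogue):
    normalized = {transliterate(name): name for name in catalogue}
    return normalized.get(transliterate(query))

catalogue = ["ビートルズ", "Queen", "Nirvana"]   # mixed-script music catalogue
print(find_in_catalogue("biitoruzu", catalogue))  # -> 'ビートルズ'
```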