Mastering the strategy, tactical understanding, and team play involved in multiplayer video games represents a critical challenge for AI research. Now, through new developments in reinforcement learning, our agents have achieved human-level performance in Quake III Arena Capture the Flag, a complex multi-agent environment and one of the canonical 3D first-person multiplayer games. These agents demonstrate the ability to team up with both artificial agents and human players.
Using adversarial training to recognize speakers’ emotions
A person’s tone of voice can tell you a lot about how they’re feeling. Not surprisingly, emotion recognition is an increasingly popular conversational-AI research topic.
Should Alexa read “2/3” as “two-thirds” or “February Third”?: The science of text normalization
Text normalization is an important process in conversational AI. If an Alexa customer says, “book me a table at 5:00 p.m.”, the automatic speech recognizer will transcribe the time as “five p m”. Before a skill can handle this request, “five p m” will need to be converted to “5:00PM”. Once Alexa has processed the request, it needs to synthesize the response (say, “Is 6:30 p.m. okay?”). Here, “6:30PM” will be converted to “six thirty p m” for the text-to-speech synthesizer. We call the process of converting “5:00PM” to “five p m” text normalization, and its counterpart, converting “five p m” to “5:00PM”, inverse text normalization.
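To make the two directions concrete, here is a minimal rule-based sketch in Python that handles only clock times such as “5:00PM”. It is an illustration under stated assumptions, not the approach the post describes: the function names `normalize_time` and `denormalize_time`, the word tables, and the choice to read single-digit minutes as “oh five” are all hypothetical.

```python
import re

# Hypothetical word tables for this sketch; a real system covers far more cases.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {20: "twenty", 30: "thirty", 40: "forty", 50: "fifty"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..59."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    word = TENS[tens * 10]
    return word if ones == 0 else f"{word} {ONES[ones]}"


# Reverse lookup used by inverse normalization, e.g. "forty five" -> 45.
WORD_TO_NUM = {number_to_words(n): n for n in range(60)}


def normalize_time(written: str) -> str:
    """Text normalization: written form -> spoken form ('6:30PM' -> 'six thirty p m')."""
    m = re.fullmatch(r"(\d{1,2}):(\d{2})\s*([AP])M", written.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a recognized time: {written!r}")
    hour, minute, half = int(m.group(1)), int(m.group(2)), m.group(3).lower()
    parts = [number_to_words(hour)]
    if minute != 0:
        if minute < 10:
            parts.append("oh")  # assumption: 6:05 is read "six oh five"
        parts.append(number_to_words(minute))
    parts += [half, "m"]
    return " ".join(parts)


def denormalize_time(spoken: str) -> str:
    """Inverse text normalization: spoken form -> written form ('five p m' -> '5:00PM')."""
    tokens = spoken.lower().split()
    if len(tokens) < 3 or tokens[-1] != "m" or tokens[-2] not in ("a", "p"):
        raise ValueError(f"not a recognized spoken time: {spoken!r}")
    half = tokens[-2].upper()
    body = tokens[:-2]
    hour = WORD_TO_NUM[body[0]]  # hours 1..12 are always a single word
    minute_words = [w for w in body[1:] if w != "oh"]
    minute = WORD_TO_NUM[" ".join(minute_words)] if minute_words else 0
    return f"{hour}:{minute:02d}{half}M"
```

With these definitions, `normalize_time("6:30PM")` yields the string handed to the synthesizer, and `denormalize_time("five p m")` recovers the written form a skill would consume. Hand-written rules like these do not scale to ambiguous inputs such as “2/3”, which is why the problem is treated as a research topic.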
Training a Machine Learning Model in English Improves Its Performance in Japanese
Recently, we published a paper showing that training a neural network to do language processing in English, then retraining it in German, drastically reduces the amount of German-language training data required to achieve a given level of performance.
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
At Uber, we apply neural networks to fundamentally improve how we understand the movement of people and things in cities. Among other use cases, we employ them to enable faster customer service response with natural language models and lower wait …
The post Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask appeared first on Uber Engineering Blog.
How We Add New Skills to Alexa’s Name-Free Skill Selector
In the past year, we’ve introduced what we call name-free skill interaction for Alexa. In countries where the service has rolled out, a customer who wants to, say, order a car can just say, “Alexa, get me a car”, instead of having to specify the name of a ride-sharing provider.
Introducing the Uber Research Publications Site
Zoubin Ghahramani is Uber’s Chief Scientist and the Head of AI.
The ease and simplicity of Uber’s platform is built on fundamental advances in science and technology. Teams across Uber are committed to developing the most advanced scientific techniques in …
The post Introducing the Uber Research Publications Site appeared first on Uber Engineering Blog.
“Alexa, Turn Down the Lights and Play Music”: The Science of Handling Compound Requests
Traditionally, Alexa has interpreted customer requests according to their intents and slots. If you say, “Alexa, play ‘What’s Going On?’ by Marvin Gaye,” the intent should be PlayMusic, and “‘What’s Going On?’” and “Marvin Gaye” should fill the slots SongName and ArtistName.
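As a rough illustration of the intent-and-slot representation, the Marvin Gaye example can be written down as a small Python data structure. This is a sketch only: the `Interpretation` class and the `describe` helper are hypothetical names introduced here, not part of any Alexa API. Only the intent and slot names come from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Interpretation:
    """Hypothetical container for one interpreted request: an intent plus its slots."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def describe(interp: Interpretation) -> str:
    """Render an interpretation as a compact, readable string."""
    slot_text = ", ".join(f"{name}={value!r}" for name, value in interp.slots.items())
    return f"{interp.intent}({slot_text})"


# The request "Alexa, play 'What's Going On?' by Marvin Gaye", written by hand.
play_request = Interpretation(
    intent="PlayMusic",
    slots={"SongName": "What's Going On?", "ArtistName": "Marvin Gaye"},
)
```

A single intent with a flat set of slots fits one action per utterance. A compound request such as “turn down the lights and play music” names two actions, which is what makes it harder to represent this way.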