DeepMind papers at ICLR 2018

Between 30 April and 3 May, hundreds of researchers and engineers will gather in Vancouver, Canada, for the Sixth International Conference on Learning Representations. Here you can read details of all DeepMind's accepted papers and find out where you can see the accompanying poster sessions and talks.

Maximum a posteriori policy optimisation
Authors: Abbas Abdolmaleki, Jost Tobias Springenberg, Nicolas Heess, Yuval Tassa, Remi Munos
We introduce a new algorithm for reinforcement learning called Maximum a posteriori Policy Optimisation (MPO), based on coordinate ascent on a relative entropy objective. We show that several existing methods can be directly related to our derivation. We develop two off-policy algorithms and demonstrate that they are competitive with the state of the art in deep reinforcement learning.

Our first COO Lila Ibrahim takes DeepMind to the next level

One of the greatest pleasures of coming to work every day at DeepMind is the chance to collaborate with brilliant researchers and engineers from so many different fields and perspectives, with machine learning experts alongside neuroscientists, physicists, mathematicians, roboticists, ethicists and more. This level of interdisciplinary collaboration is both challenging and unusual, and it requires a unique type of organisation. We built DeepMind to combine the rigour and long-term thinking of the world's best scientific institutions with the focus, pace and energy common to the best tech startups. I believe this is essential if we're to fulfil the scientific and social promise of AI, and I'm proud of all that we've achieved so far. But there's still a very long way to go! So I'm really pleased to welcome Lila Ibrahim to DeepMind as our first ever Chief Operating Officer, partnering with me to design, build and manage our next phase of growth. Having started out as a microprocessor designer and assembler programmer at Intel, Lila went on to lead the company's emerging markets product group, as well as working with Intel CEO Craig Barrett and then the legendary investor John Doerr at Kleiner Perkins as Chief of Staff.

Learning to navigate in cities without a map

How did you learn to navigate the neighborhood of your childhood, to go to a friend's house, to your school or to the grocery store? Probably without a map and simply by remembering the visual appearance of streets and turns along the way. As you gradually explored your neighborhood, you grew more confident, mastered your whereabouts and learned new and increasingly complex paths. You may have gotten briefly lost, but found your way again thanks to landmarks, or perhaps even by looking to the sun for an impromptu compass. Navigation is an important cognitive task that enables humans and animals to traverse, without maps, over long distances in a complex world. Such long-range navigation can simultaneously support self-localisation ("I am here") and a representation of the goal ("I am going there"). In Learning to Navigate in Cities Without a Map, we present an interactive navigation environment that uses first-person perspective photographs from Google Street View, approved for use by the StreetLearn project and academic research, and gamify that environment to train an AI. As standard with Street View images, faces and license plates have been blurred and are unrecognisable. We build a neural network-based artificial agent that learns to navigate multiple cities using visual information (pixels from a Street View image).

Retour à Paris / A return to Paris

When we established our headquarters in London in 2010, we wanted to make DeepMind the very best in cutting-edge artificial intelligence research. We also wanted to help the artificial intelligence community grow. We have therefore published papers at the most selective conferences and journals (more than 180 to date!) and shared our knowledge in this field; we have encouraged our experts to teach at local universities, and worked with schools and NGOs to train the next generation of scientists. Not only have we had the good fortune to contribute to the United Kingdom's scientific success, we have also benefited greatly from the openness and diversity of this city and from its cultural influence. Artificial intelligence must be developed with the greatest attention to society's different needs and, insofar as a single city can bring these conditions together on its own, a multicultural capital like London, my home town, is the ideal place in this respect. I am therefore very pleased to announce our decision to open our first laboratory in continental Europe, in another great cultural and scientific capital: Paris.

Learning to write programs that generate images

Through a human's eyes, the world is much more than just the images reflected in our corneas. For example, when we look at a building and admire the intricacies of its design, we can appreciate the craftsmanship it requires. This ability to interpret objects through the tools that created them gives us a richer understanding of the world and is an important aspect of our intelligence. We would like our systems to create similarly rich representations of the world. For example, when observing an image of a painting we would like them to understand the brush strokes used to create it and not just the pixels that represent it on a screen. In this work, we equipped artificial agents with the same tools that we use to generate images and demonstrate that they can reason about how digits, characters and portraits are constructed. Crucially, they learn to do this by themselves and without the need for human-labelled datasets. This contrasts with recent research, which has so far relied on learning from human demonstrations, which can be a time-intensive process.

Understanding deep learning through neuron deletion

Deep neural networks are composed of many individual neurons, which combine in complex and counterintuitive ways to solve a wide range of challenging tasks. This complexity grants neural networks their power but also earns them their reputation as confusing and opaque black boxes. Understanding how deep neural networks function is critical for explaining their decisions and enabling us to build more powerful systems. For instance, imagine the difficulty of trying to build a clock without understanding how individual gears fit together. One approach to understanding neural networks, both in neuroscience and deep learning, is to investigate the role of individual neurons, especially those which are easily interpretable. Our investigation into the importance of single directions for generalisation, soon to appear at the Sixth International Conference on Learning Representations (ICLR), uses an approach inspired by decades of experimental neuroscience, exploring the impact of damage, to determine: how important are small groups of neurons in deep neural networks? Are more easily interpretable neurons also more important to the network's computation?
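The idea of measuring importance by deletion can be sketched in a few lines of NumPy. This is a toy illustration only, not the code or the experimental setup from the paper: the two-layer network, its hand-picked weights, the four data points and every function name below are assumptions made for the example. Each hidden unit is "deleted" by clamping its activation to zero, and the resulting drop in accuracy is recorded.

```python
import numpy as np


def forward(x, w1, b1, w2, b2, ablate=None):
    """Tiny two-layer ReLU network; `ablate` lists hidden units to zero out."""
    hidden = np.maximum(0.0, x @ w1 + b1)
    if ablate is not None:
        hidden = hidden.copy()
        hidden[:, ablate] = 0.0  # "delete" the chosen units
    return hidden @ w2 + b2


def accuracy(logits, labels):
    """Fraction of rows whose highest logit matches the label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))


def ablation_impact(x, labels, params):
    """Return baseline accuracy and the accuracy drop for each deleted unit."""
    w1, b1, w2, b2 = params
    baseline = accuracy(forward(x, w1, b1, w2, b2), labels)
    drops = []
    for unit in range(w1.shape[1]):
        logits = forward(x, w1, b1, w2, b2, ablate=[unit])
        drops.append(baseline - accuracy(logits, labels))
    return baseline, np.array(drops)


# Hypothetical hand-built network (assumption): hidden unit 0 detects class 0,
# unit 1 detects class 1, and unit 2 has no outgoing weights at all.
w1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
b1 = np.zeros(3)
w2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0]])
b2 = np.array([0.0, 0.0, 0.5])  # class 2 acts as a fallback prediction

x = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
labels = np.array([0, 0, 1, 1])

baseline, drops = ablation_impact(x, labels, (w1, b1, w2, b2))
print("baseline accuracy:", baseline)
print("accuracy drop per deleted unit:", drops)
```

In this toy setup, deleting either of the two units that carry class information halves the accuracy, while deleting the unused third unit changes nothing. A real study would apply the same loop to a trained network and to groups of units rather than single ones.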

Stop, look and listen to the people you want to help

"I like to take things slow. Take it slowly and get it right first time," one participant said, but was quickly countered by someone else around the table: "But I'm impatient, I want to see the benefits now." This exchange neatly captures many of the conversations I heard at DeepMind Health's recent Collaborative Listening Summit. It also represents, in layman's terms, the debate that tech thinkers and policy-makers are having right now about the future of artificial intelligence. The Collaborative Listening Summit brought together members of the public, patient representatives and stakeholders, and was facilitated by Ipsos MORI. The objective of the Summit: to explore how principles, co-created in earlier events with the public, patients and stakeholders, should govern DeepMind Health's operating practices and engagement with the NHS. These principles ranged from the technical (for example, how evidence should inform DeepMind's practice) to the societal (for example, operating in the best interests of society). The challenge of how technology companies and the NHS should interact has left many of us, myself included, cautious about the risk of big technology firms leveraging their finance and power over an NHS that is under seemingly endless pressure.

Learning by playing

Getting children (and adults) to tidy up after themselves can be a challenge, but we face an even greater challenge trying to get our AI agents to do the same. Success depends on the mastery of several core visuo-motor skills: approaching an object, grasping and lifting it, opening a box and putting things inside it. To make matters more complicated, these skills must be applied in the right sequence. Control tasks, like tidying up a table or stacking objects, require an agent to determine how, when and where to coordinate the nine joints of its simulated arms and fingers to move correctly and achieve its objective. The sheer number of possible combinations of movements at any given time, along with the need to carry out a long sequence of correct actions, constitutes a serious exploration problem, making this a particularly interesting area for reinforcement learning research. Techniques like reward shaping, apprenticeship learning or learning from demonstrations can help with the exploration problem. However, these methods rely on a considerable amount of knowledge about the task; learning complex control problems from scratch with minimal prior knowledge is still an open challenge. Our new paper proposes a learning paradigm called Scheduled Auxiliary Control (SAC-X), which seeks to overcome this exploration issue.

Researching patient deterioration with the US Department of Veterans Affairs

We're excited to announce a medical research partnership with the US Department of Veterans Affairs (VA), one of the world's leading healthcare organisations, responsible for providing high-quality care to veterans and their families across the United States. This project will see us analyse patterns from historical, depersonalised medical records to predict patient deterioration. Patient deterioration is a significant global health problem that often has fatal consequences. Studies estimate that 11% of all in-hospital deaths are due to patient deterioration not being recognised early enough or acted on in the right way. Alongside world-renowned clinicians and researchers at the VA, we are analysing patterns from approximately 700,000 historical, depersonalised medical records in order to determine if machine learning can accurately identify the risk factors for patient deterioration and correctly predict its onset. We're focusing on Acute Kidney Injury (AKI), one of the most common conditions associated with patient deterioration, and an area where DeepMind and the VA both have expertise. This is a complex challenge, because predicting AKI is far from easy. Not only is the onset of AKI sudden and often asymptomatic, but the risk factors associated with it are commonplace throughout hospitals.

Scalable agent architecture for distributed training

Deep Reinforcement Learning (DeepRL) has achieved remarkable success in a range of tasks, from continuous control problems in robotics to playing games like Go and Atari. The improvements seen in these domains have so far been limited to individual tasks where a separate agent has been tuned and trained for each task. In our most recent work, we explore the challenge of training a single agent on many tasks. Today we are releasing DMLab-30, a set of new tasks that span a large variety of challenges in a visually unified environment with a common action space. Training an agent to perform well on many tasks requires massive throughput and making efficient use of every data point. To this end, we have developed a new, highly scalable agent architecture for distributed training called Importance Weighted Actor-Learner Architecture, which uses a new off-policy correction algorithm called V-trace.

DMLab-30

DMLab-30 is a collection of new levels designed using our open source RL environment DeepMind Lab. These environments enable any DeepRL researcher to test systems on a large spectrum of interesting tasks, either individually or in a multi-task setting.