The Atari57 suite of games is a long-standing benchmark to gauge agent performance across a wide range of tasks. We've developed Agent57, the first deep reinforcement learning agent to obtain a score that is above the human baseline on all 57 Atari 2600 games. Agent57 combines an algorithm for efficient exploration with a meta-controller that adapts the exploration and long- vs. short-term behaviour of the agent.
Visual Grounding in Video for Unsupervised Word Translation
Our goal is to use visual grounding to improve unsupervised word mapping between languages. The key idea is to establish a common visual representation between two languages by learning embeddings from unpaired instructional videos narrated in the native language.
A new model and dataset for long-range memory
This blog introduces a new long-range memory model, the Compressive Transformer, alongside a new benchmark for book-level language modelling, PG19. We provide the conceptual tools needed to understand this new research in the context of recent developments in memory models and language modelling.
AlphaFold: Using AI for scientific discovery
Our Nature paper describes AlphaFold, a system that generates 3D models of proteins that are far more accurate than any that have come before.
Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI
A recent development in computer science may provide a deep, parsimonious explanation for several previously unexplained features of reward learning in the brain.
Artificial Intelligence, Values and Alignment
This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.
International evaluation of an AI system for breast cancer screening
Screening mammography aims to identify breast cancer before symptoms appear, enabling earlier therapy for more treatable disease. Despite the existence of screening programs worldwide, interpretation of these images suffers from suboptimal rates of false positives and false negatives. Here we present an AI system capable of surpassing a single expert reader in breast cancer prediction performance.
Using WaveNet technology to reunite speech-impaired users with their original voices
We demonstrate an early proof of concept of how text-to-speech technologies can synthesise a high-quality, natural-sounding voice using minimal recorded speech data.
Learning human objectives by evaluating hypothetical behaviours
We present a new method for training reinforcement learning agents from human feedback in the presence of unknown unsafe states.
From unlikely start-up to major scientific organisation: Entering our tenth year at DeepMind
We've come a long way in building the organisation we need to achieve our long-term mission.