In new work, algorithmic advances and new training environments lead to agents which exhibit general heuristic behaviours.Read More
Generally capable agents emerge from open-ended play
In recent years, artificial intelligence agents have succeeded in a range of complex game environments. For instance, AlphaZero beat world-champion programs in chess, shogi, and Go after starting out with knowing no more than the basic rules of how to play. Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error. But AlphaZero still trained separately on each game — unable to simply learn another game or task without repeating the RL process from scratch. The same is true for other successes of RL, such as Atari, Capture the Flag, StarCraft II, Dota 2, and Hide-and-Seek. DeepMind’s mission of solving intelligence to advance science and humanity led us to explore how we could overcome this limitation to create AI agents with more general and adaptive behaviour. Instead of learning one game at a time, these agents would be able to react to completely new conditions and play a whole universe of games and tasks, including ones never seen before.Read More
Enabling high-accuracy protein structure prediction at the proteome scale
Many novel machine learning innovations contribute to AlphaFold’s current level of accuracy. We give a high-level overview of the system below; for a technical description of the network architecture see our AlphaFold methods paper and especially its extensive Supplementary Information.Read More
Putting the power of AlphaFold into the world’s hands
In partnership with EMBL-EBI, were incredibly proud to be launching the AlphaFold Protein Structure Database.Read More
Melting Pot: an evaluation suite for multi-agent reinforcement learning
Here we introduce Melting Pot, a scalable evaluation suite for multi-agent reinforcement learning. Melting Pot assesses generalisation to novel social situations involving both familiar and unfamiliar individuals, and has been designed to test a broad range of social interactions such as: cooperation, competition, deception, reciprocation, trust, stubbornness and so on. Melting Pot offers researchers a set of 21 MARL “substrates” (multi-agent games) on which to train agents, and over 85 unique test scenarios on which to evaluate these trained agents.Read More
An update on our racial justice efforts
To help combat racism and advance racial equity, we’ve made donations to organisations that support Black communities in the AI/ML space.Read More
Advancing sports analytics through AI research
Sports analytics is in the midst of a remarkably important era, offering interesting opportunities for AI researchers and sports leaders alike.Read More
Game theory as an engine for large-scale data analysis
Our research explored a new approach to an old problem: we reformulated principal component analysis (PCA), a type of eigenvalue problem, as a competitive multi-agent game we call EigenGame.Read More
Alchemy: A structured task distribution for meta-reinforcement learning
There has been rapidly growing interest in developing methods for meta-learning within deep RL. Although there has been substantive progress toward such ‘meta-reinforcement learning,’ research in this area has been held back by a shortage of benchmark tasks. In the present work, we aim to ease this problem by introducing (and open-sourcing) Alchemy, a useful new benchmark environment for meta-RL, along with a suite of analysis tools.Read More
Data, Architecture, or Losses: What Contributes Most to Multimodal Transformer Success?
In this work, we examine what aspects of multimodal transformers – attention, losses, and pretraining data – are important in their success at multimodal pretraining. We find that Multimodal attention, where both language and image transformers attend to each other, is crucial for these models’ success. Models with other types of attention (even with more depth or parameters) fail to achieve comparable results to shallower and smaller models with multimodal attention.Read More