Unsupervised learning: The curious pupil

One in a series of posts explaining the theories underpinning our research. Over the last decade, machine learning has made unprecedented progress in areas as diverse as image recognition, self-driving cars and playing complex games like Go. These successes have been largely realised by training deep neural networks with one of two learning paradigmssupervised learning and reinforcement learning. Both paradigms require training signals to be designed by a human and passed to the computer. In the case of supervised learning, these are the targets (such as the correct label for an image); in the case of reinforcement learning, they are the rewards for successful behaviour (such as getting a high score in an Atari game). The limits of learning are therefore defined by the human trainers. While some scientists contend that a sufficiently inclusive training regimefor example, the ability to complete a very wide variety of tasksshould be enough to give rise to general intelligence, others believe that true intelligence will require more independent learning strategies. Consider how a toddler learns, for instance. Her grandmother might sit with her and patiently point out examples of ducks (acting as the instructive signal in supervised learning), or reward her with applause for solving a woodblock puzzle (as in reinforcement learning).Read More

Capture the Flag: the emergence of complex cooperative agents

Mastering the strategy, tactical understanding, and team play involved in multiplayer video games represents a critical challenge for AI research. Now, through new developments in reinforcement learning, our agents have achieved human-level performance in Quake III Arena Capture the Flag, a complex multi-agent environment and one of the canonical 3D first-person multiplayer games. These agents demonstrate the ability to team up with both artificial agents and human players.Read More

Identifying and eliminating bugs in learned predictive models

One in a series of posts explaining the theories underpinning our research. Bugs and software have gone hand in hand since the beginning of computer programming. Over time, software developers have established a set of best practices for testing and debugging before deployment, but these practices are not suited for modern deep learning systems. Today, the prevailing practice in machine learning is to train a system on a training data set, and then test it on another set. While this reveals the average-case performance of models, it is also crucial to ensure robustness, or acceptably high performance even in the worst case. In this article, we describe three approaches for rigorously identifying and eliminating bugs in learned predictive models: adversarial testing, robust learning, and formal verification.Machine learning systems are not robust by default. Even systems that outperform humans in a particular domain can fail at solving simple problems if subtle differences are introduced. For example, consider the problem of image perturbations: a neural network that can classify images better than a human can be easily fooled into believing that sloth is a race car if a small amount of carefully calculated noise is added to the input image.Read More

TF-Replicator: Distributed Machine Learning for Researchers

At DeepMind, the Research Platform Team builds infrastructure to empower and accelerate our AI research. Today, we are excited to share how we developed TF-Replicator, a software library that helps researchers deploy their TensorFlow models on GPUs and Cloud TPUs with minimal effort and no previous experience with distributed systems. TF-Replicators programming model has now been open sourced as part of TensorFlows tf.distribute.Strategy. This blog post gives an overview of the ideas and technical challenges underlying TF-Replicator. For a more comprehensive description, please read our arXiv paper.A recurring theme in recent AI breakthroughs – from AlphaFold to BigGAN to AlphaStar – is the need for effortless and reliable scalability. Increasing amounts of computational capacity allow researchers to train ever-larger neural networks with new capabilities. To address this, the Research Platform Team developed TF-Replicator, which allows researchers to target different hardware accelerators for Machine Learning, scale up workloads to many devices, and seamlessly switch between different types of accelerators.Read More

Machine learning can boost the value of wind energy

Carbon-free technologies like renewable energy help combat climate change, but many of them have not reached their full potential. Consider wind power: over the past decade, wind farms have become an important source of carbon-free electricity as the cost of turbines has plummeted and adoption has surged. However, the variable nature of wind itself makes it an unpredictable energy sourceless useful than one that can reliably deliver power at a set time.In search of a solution to this problem, last year, DeepMind and Google started applying machine learning algorithms to 700 megawatts of wind power capacity in the central United States. These wind farmspart of Googles global fleet of renewable energy projectscollectively generate as much electricity as is needed by a medium-sized city.Using a neural network trained on widely available weather forecasts and historical turbine data, we configured the DeepMind system to predict wind power output 36 hours ahead of actual generation. Based on these predictions, our model recommends how to make optimal hourly delivery commitments to the power grid a full day in advance. This is important, because energy sources that can be scheduled (i.e.Read More

AlphaStar: Mastering the Real-Time Strategy Game StarCraft II

Games have been used for decades as an important way to test and evaluate the performance of artificial intelligence systems. As capabilities have increased, the research community has sought games with increasing complexity that capture different elements of intelligence required to solve scientific and real-world problems. In recent years, StarCraft, considered to be one of the most challenging Real-Time Strategy (RTS) games and one of the longest-played esports of all time, has emerged by consensus as a grand challenge for AI research.Read More

AlphaZero: Shedding new light on chess, shogi, and Go

In late 2017 we introduced AlphaZero, a single system that taught itself from scratch how to master the games of chess, shogi (Japanese chess), and Go, beating a world-champion program in each case. We were excited by the preliminary results and thrilled to see the response from members of the chess community, who saw in AlphaZeros games a ground-breaking, highly dynamic and unconventional style of play that differed from any chess playing engine that came before it.Today, we are delighted to introduce the full evaluation of AlphaZero, published in the journal Science(Open Access version here), that confirms and updates those preliminary results. It describes how AlphaZero quickly learns each game to become the strongest player in history for each, despite starting its training from random play, with no in-built domain knowledge but the basic rules of the game.Read More

Scaling Streams with Google

Were excited to announce that the team behind Streams our mobile app that supports doctors and nurses to deliver faster, better care to patientswill be joining Google.Its been a phenomenal journey to see Streams go from initial idea to live deployment, and to hear how its helped change the lives of patients and the nurses and doctors who treat them. The arrival of world-leading health expert Dr. David Feinberg at Google will accelerate these efforts, helping to make a difference to the lives of millions of patients around the world.This is a major milestone for DeepMind! One of the reasons for joining forces with Google in 2014 was the opportunity to use Googles scale and experience in building billion-user products to bring our breakthroughs more rapidly to the wider world. Its been amazing to put this into practice in data centre efficiency, Android battery life, text-to-speech applications, and now the work of our Streams team.Over the past three years weve built a team of experts in what it takes to deploy clinical tools in practice – engineers, clinicians, translational researchers and more.Read More