Acme is a framework for building readable, efficient, research-oriented RL algorithms. At its core Acme is designed to enable simple descriptions of RL agents that can be run at various scales of execution — including distributed agents. By releasing Acme, our aim is to make the results of various RL algorithms developed in academia and industrial labs easier to reproduce and extend for the machine learning community at large.Read More
Using AI to predict retinal disease progression
Vision loss among the elderly is a major healthcare issue: about one in three people have some vision-reducing disease by the age of 65. Age-related macular degeneration (AMD) is the most common cause of blindness in the developed world. In Europe, approximately 25% of those 60 and older have AMD. The dry form is relatively common among people over 65, and usually causes only mild sight loss. However, about 15% of patients with dry AMD go on to develop a more serious form of the disease exudative AMD, or exAMD which can result in rapid and permanent loss of sight. Fortunately, there are treatments that can slow further vision loss. Although there are no preventative therapies available at present, these are being explored in clinical trials. The period before the development of exAMD may therefore represent a critical window to target for therapeutic innovations: can we predict which patients will progress to exAMD, and help prevent sight loss before it even occurs?Read More
Simple Sensor Intentions for Exploration
In this paper we focus on a setting in which goal tasks are defined via simple sparse rewards, and exploration is facilitated via agent-internal auxiliary tasks. We introduce the idea of simple sensor intentions (SSIs) as a generic way to define auxiliary tasks. SSIs reduce the amount of prior knowledge that is required to define suitable rewards. They can further be computed directly from raw sensor streams and thus do not require expensive and possibly brittle state estimation on real systems.Read More
Learning to Segment Actions from Observation and Narration
We apply a generative segmental model of task structure, guided by narration, to action segmentation in video. We focus on unsupervised and weakly-supervised settings where no action labels are known during training. Despite its simplicity, our model performs competitively with previous work on a dataset of naturalistic instructional videos.Read More
Specification gaming: the flip side of AI ingenuity
Specification gaming is a behaviour that satisfies the literal specification of an objective without achieving the intended outcome. We have all had experiences with specification gaming, even if not by this name. Readers may have heard the myth of King Midas and the golden touch, in which the king asks that anything he touches be turned to gold – but soon finds that even food and drink turn to metal in his hands. In the real world, when rewarded for doing well on a homework assignment, a student might copy another student to get the right answers, rather than learning the material – and thus exploit a loophole in the task specification.Read More
Towards understanding glasses with graph neural networks
Under a microscope, a pane of window glass doesnt look like a collection of orderly molecules, as a crystal would, but rather a jumble with no discernable structure. Glass is made by starting with a glowing mixture of high-temperature melted sand and minerals. Once cooled, its viscosity (a measure of the friction in the fluid) increases a trillion-fold, and it becomes a solid, resisting tension from stretching or pulling. Yet the molecules in the glass remain in a seemingly disordered state, much like the original molten liquid almost as though the disordered liquid state had been flash-frozen in place. The glass transition, then, first appears to be a dramatic arrest in the movement of the glass molecules. Whether this process corresponds to a structural phase transition (as in water freezing, or the superconducting transition) is a major open question in the field. Understanding the nature of the dynamics of glass is fundamental to understanding how the atomic-scale properties define the visible features of many solid materials.Read More
Agent57: Outperforming the human Atari benchmark
The Atari57 suite of games is a long-standing benchmark to gauge agent performance across a wide range of tasks. Weve developed Agent57, the first deep reinforcement learning agent to obtain a score that is above the human baseline on all 57 Atari 2600 games. Agent57 combines an algorithm for efficient exploration with a meta-controller that adapts the exploration and long vs. short-term behaviour of the agent.Read More
Visual Grounding in Video for Unsupervised Word Translation
Our goal is to use visual grounding to improve unsupervised word mapping between languages. The key idea is to establish a common visual representation between two languages by learning embeddings from unpaired instructional videos narrated in the native language.Read More
A new model and dataset for long-range memory
This blog introduces a new long-range memory model, the Compressive Transformer, alongside a new benchmark for book-level language modelling, PG19. We provide the conceptual tools needed to understand this new research in the context of recent developments in memory models and language modelling.Read More
Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI
A recent development in computer science may provide a deep, parsimonious explanation for several previously unexplained features of reward learning in the brain.Read More