To make RL more applicable to real-world applications like robotics, we propose using an intelligent evaluation procedure to select the policy for deployment, called active offline policy selection (A-OPS). In A-OPS, we make use of the prerecorded dataset and allow limited interactions with the real environment to boost the selection quality.Read More
When a passion for bass and brass help build better tools
We caught up with Kevin Millikin, a software engineer on the DevTools team. He’s in Salt Lake City this week to present at PyCon US, the largest annual gathering for those using and developing the open-source Python programming language.Read More
When a passion for bass and brass help build better tools
We caught up with Kevin Millikin, a software engineer on the DevTools team. He’s in Salt Lake City this week to present at PyCon US, the largest annual gathering for those using and developing the open-source Python programming language.Read More
Tackling multiple tasks with a single visual language model
We introduce Flamingo, a single visual language model (VLM) that sets a new state of the art in few-shot learning on a wide range of open-ended multimodal tasks.Read More
Tackling multiple tasks with a single visual language model
We introduce Flamingo, a single visual language model (VLM) that sets a new state of the art in few-shot learning on a wide range of open-ended multimodal tasks.Read More
DeepMind’s latest research at ICLR 2022
Beyond supporting the event as sponsors and regular workshop organisers, our research teams are presenting 29 papers, including 10 collaborations this year. Here’s a brief glimpse into our upcoming oral, spotlight, and poster presentations.Read More
DeepMind’s latest research at ICLR 2022
Beyond supporting the event as sponsors and regular workshop organisers, our research teams are presenting 29 papers, including 10 collaborations this year. Here’s a brief glimpse into our upcoming oral, spotlight, and poster presentations.Read More
An empirical analysis of compute-optimal large language model training
We ask the question: “What is the optimal model size and number of training tokens for a given compute budget?” To answer this question, we train models of various sizes and with various numbers of tokens, and estimate this trade-off empirically. Our main finding is that the current large language models are far too large for their compute budget and are not being trained on enough data.Read More
An empirical analysis of compute-optimal large language model training
We ask the question: “What is the optimal model size and number of training tokens for a given compute budget?” To answer this question, we train models of various sizes and with various numbers of tokens, and estimate this trade-off empirically. Our main finding is that the current large language models are far too large for their compute budget and are not being trained on enough data.Read More
GopherCite: Teaching language models to support answers with verified quotes
Language models like Gopher can “hallucinate” facts that appear plausible but are actually fake. Those who are familiar with this problem know to do their own fact-checking, rather than trusting what language models say. Those who are not, may end up believing something that isn’t true. This paper describes GopherCite, a model which aims to address the problem of language model hallucination. GopherCite attempts to back up all of its factual claims with evidence from the web.Read More