DeepMind's Generalist Agent: Gato
A breakthrough step: a single neural transformer that performs many different tasks.
Imagine a single neural sequence transformer with a single set of weights that can caption images, chat, stack blocks with a robotic arm, outperform humans at Atari games, navigate simulated 3D environments, and more.
DeepMind's Gato is not just a significant step towards a generalist AI model but a giant leap toward intelligent machines performing intellectual tasks much like humans.
What is Gato?
Gato is a single generalist agent that works as a multi-modal, multi-task, multi-embodiment policy. It uses the same network with the same set of weights, around 1.2B parameters, to play Atari, caption images, chat, stack blocks with a robotic arm, and much more, choosing what to do based on its context.
Gato is trained on 604 distinct tasks with varying modalities, observations, and action specifications.
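The key idea behind such a single-weights policy is that every modality is serialized into one flat sequence of token ids that a single transformer can consume. The sketch below is only an illustration of that idea, not DeepMind's implementation: the vocabulary size, bin count, and helper names are assumptions introduced here, and the mu-law-style squashing of continuous values is one common way to discretize them.

import numpy as np

# Illustrative constants -- assumed values, not Gato's exact configuration.
TEXT_VOCAB_SIZE = 32_000   # assumed subword vocabulary size
NUM_VALUE_BINS = 1_024     # assumed number of bins for continuous values


def tokenize_continuous(values, mu=100.0, num_bins=NUM_VALUE_BINS):
    """Squash continuous values into [-1, 1] (mu-law style), then bin them.

    Discretizing continuous observations and actions lets them share the
    same integer token space as text, so one transformer handles both.
    """
    values = np.asarray(values, dtype=np.float64)
    squashed = np.sign(values) * np.log1p(mu * np.abs(values)) / np.log1p(mu)
    squashed = np.clip(squashed, -1.0, 1.0)
    bins = ((squashed + 1.0) / 2.0 * (num_bins - 1)).astype(int)
    # Offset past the text vocabulary so token ids never collide.
    return (TEXT_VOCAB_SIZE + bins).tolist()


def build_sequence(text_tokens, proprio, action):
    """Interleave tokens from several modalities into one flat sequence."""
    return list(text_tokens) + tokenize_continuous(proprio) + tokenize_continuous(action)


# Hypothetical example: an instruction, joint readings, and a motor command.
sequence = build_sequence(
    text_tokens=[101, 2054, 2003],   # pretend subword ids
    proprio=[0.12, -0.40, 0.88],     # pretend joint angles
    action=[0.05, -0.95],            # pretend robot action
)
print(sequence)  # one flat list of integer token ids for the transformer

Because every task reduces to "predict the next token in a sequence like this", the same weights can, in principle, serve Atari frames, robot states, and chat turns alike.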
Inspiration for Gato
Gato is inspired by works such as GPT-3 and Gopher to push the limits of generalist…