DeepMind’s Generalist Agent: Gato

A breakthrough step: a single neural transformer that performs many different tasks.

6 min read · May 18, 2022

Imagine a single neural sequence transformer model, with a single set of weights, that can perform tasks as varied as captioning images, chatting, stacking blocks with a robotic arm, outperforming humans at Atari games, navigating simulated 3D environments, and more.

DeepMind’s Gato is a significant step towards a generalist AI model, and potentially a giant leap towards intelligent machines that perform intellectual tasks much as humans do.

What is Gato?

Gato is a single generalist agent that works as a multi-modal, multi-task, multi-embodiment generalist policy. It uses the same network with the same weights, around 1.2B parameters, to play Atari, caption images, chat, stack blocks with a robotic arm, and much more, depending on its context.

Gato is trained on 604 distinct tasks with varying modalities, observations, and action specifications.
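
One way a single network can cope with such varied modalities and action specifications is to map everything into one shared token vocabulary, so that text and continuous control values end up in the same flat sequence. The sketch below only illustrates that idea: the vocabulary size, the number of bins, the value range, and every function name are assumptions made for this example, not details taken from this article.

```python
# Toy sketch: put text tokens and continuous values into one shared vocabulary.
# All constants and names are illustrative assumptions, not Gato's actual settings.

TEXT_VOCAB_SIZE = 1000  # assumed: text token ids occupy [0, 1000)
NUM_BINS = 16           # assumed: continuous values are discretized into 16 bins
LOW, HIGH = -1.0, 1.0   # assumed range of continuous observations and actions


def encode_text(token_ids):
    """Text tokens keep their ids, which must lie inside the text range."""
    for t in token_ids:
        if not 0 <= t < TEXT_VOCAB_SIZE:
            raise ValueError(f"text token {t} is outside the text vocabulary")
    return list(token_ids)


def encode_continuous(values):
    """Clip each value, bin it uniformly, and shift it past the text range."""
    tokens = []
    for v in values:
        clipped = min(max(v, LOW), HIGH)
        frac = (clipped - LOW) / (HIGH - LOW)
        bin_index = min(int(frac * NUM_BINS), NUM_BINS - 1)
        tokens.append(TEXT_VOCAB_SIZE + bin_index)
    return tokens


def decode_token(token):
    """Map a shared-vocabulary token back to (modality, value)."""
    if token < TEXT_VOCAB_SIZE:
        return ("text", token)
    bin_index = token - TEXT_VOCAB_SIZE
    bin_width = (HIGH - LOW) / NUM_BINS
    # A bin is decoded to its center, so decoding is lossy by design.
    return ("continuous", LOW + (bin_index + 0.5) * bin_width)


def build_sequence(prompt_token_ids, joint_values):
    """Concatenate both modalities into one flat sequence for a sequence model."""
    return encode_text(prompt_token_ids) + encode_continuous(joint_values)


if __name__ == "__main__":
    sequence = build_sequence([3, 7], [-1.0, 0.0, 1.0])
    print(sequence)
    print([decode_token(t) for t in sequence])
```

Because the two token ranges never overlap, a model that predicts the next token in such a sequence can emit either a word or an action value with the same output layer, and the context determines which kind of token is expected next.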

Inspiration for Gato

Gato is inspired by works such as GPT-3 and Gopher to push the limits of generalist…

Written by Renu Khandelwal