
DeepMind’s Generalist Agent: Gato

A breakthrough step: a single transformer network that performs many different tasks.

Renu Khandelwal
6 min read · May 18, 2022

Imagine a single neural sequence model, a transformer with one set of weights, that can handle tasks as varied as captioning images, chatting, stacking blocks with a robotic arm, outperforming humans at Atari games, navigating simulated 3D environments, and more.

DeepMind’s Gato is a significant step towards a generalist AI model, and possibly a giant leap towards intelligent machines that perform intellectual tasks much like humans.

What is Gato?

Gato is a single agent that works as a multi-modal, multi-task, multi-embodiment generalist policy. It uses the same network with the same weights, around 1.2B parameters, to play Atari, caption images, chat, stack blocks with a robotic arm, and much more, deciding what to output based on its context.
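To make the "same network, same weights, different context" idea concrete, here is a toy Python sketch. It is not Gato's implementation: the vocabulary size, the two tokenizers, and the `TinySharedPolicy` class are all hypothetical stand-ins, and the "model" is a random lookup table rather than a transformer. It only illustrates that inputs from different modalities can be turned into tokens from one shared vocabulary and passed to one set of weights.

```python
import random

# Hypothetical shared vocabulary: ids 0-31 for text, 32-63 for continuous values.
VOCAB_SIZE = 64
TEXT_IDS = 32


def tokenize_text(text):
    """Toy text tokenizer (assumption): fold each character into an id in [0, 32)."""
    return [ord(ch) % TEXT_IDS for ch in text]


def tokenize_continuous(values, bins=32, low=-1.0, high=1.0):
    """Toy tokenizer for continuous values such as joint angles (assumption):
    clip to [low, high], bucket uniformly, and shift into ids [32, 64)."""
    tokens = []
    for value in values:
        clipped = min(max(value, low), high)
        index = min(int((clipped - low) / (high - low) * bins), bins - 1)
        tokens.append(TEXT_IDS + index)
    return tokens


class TinySharedPolicy:
    """Stand-in for a sequence model: a single weight table used for every task."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(VOCAB_SIZE)]
            for _ in range(VOCAB_SIZE)
        ]

    def next_token(self, context):
        """Score every vocabulary id from the context tokens and return the best one."""
        scores = [0.0] * VOCAB_SIZE
        for token in context:
            for candidate, weight in enumerate(self.weights[token]):
                scores[candidate] += weight
        return max(range(VOCAB_SIZE), key=scores.__getitem__)


policy = TinySharedPolicy()

# Two very different "tasks", both served by the same weights.
chat_context = tokenize_text("hello")
robot_context = tokenize_continuous([0.1, -0.5, 0.9])

print(policy.next_token(chat_context))
print(policy.next_token(robot_context))
```

In this sketch, the only thing that differs between the chat call and the robot call is the token context. The weights never change, which is the property the paragraph above attributes to Gato.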

Gato is trained on 604 distinct tasks with varying modalities, observations, and action specifications.

Source: A Generalist Agent

Inspiration for Gato

Gato is inspired by works such as GPT-3 and Gopher to push the limits of generalist
