The various datasets mentioned in the Gato paper are not all publicly available, and some (like 'Playroom') are not even documented. Here is what we could find on each of them.
Environment | Tasks | Episodes | Approx Tokens | Sample Weight | Agent used | Open-Source Repo | Additional information |
---|---|---|---|---|---|---|---|
DM Lab | 254 | 16.4M | 194B | | IMPALA | DM Lab | Appendix F.5 of the Gato paper mentions that they trained an IMPALA agent on a set of 18 parent DM Lab levels: “Data was collected by executing the agent on these 18 levels, as well as an additional set of 237 levels handcrafted to test a diverse set of skills”. We don’t have much information on how those 18 “parent levels” and 237 “handcrafted levels” are defined, but many levels are listed here: https://github.com/deepmind/lab/tree/master/game_scripts/levels. Also check out this paper, which claims SOTA with an IMPALA agent on DMLab-30: https://arxiv.org/pdf/1809.04474v1.pdf |
ALE Atari | 51 | 63.4K | 1.26B | | Muesli agent for 200M steps per environment | ALE Atari | |
ALE Atari Extended | 28 | 28.4K | 565M | | Muesli agent for 200M steps per environment | ALE Atari | |
Sokoban | 1 | 27.2K | 298M | | Muesli agent | Sokoban | |
Baby AI | 46 | 4.61M | 22.8B | | Built-in BabyAI bot, 100,000 episodes for each level | Baby AI | |
DM Control Suite | 30 | 395K | 22.5B | | D4PG (state features) | DM Control | In Appendix F.4 of the Gato paper, the authors mention that for each task in the Control Suite they collect “two disjoint sets of data, one using only state features and another using only pixels”. They use a D4PG agent to collect data from tasks with state features, and an MPO-based agent to collect data with pixels. They also collect data for randomized versions of the Control Suite tasks with a D4PG agent, randomizing the actuator gear, joint range, stiffness, damping, geom size, and density, sampled from either a small or a large interval. Some SOTA agents are listed here: https://paperswithcode.com/dataset/deepmind-control-suite |
DM Control Suite Pixels | 28 | 485K | 35.5B | | MPO (pixels) | DM Control | |
DM Control Suite Random Small | 26 | 10.6M | 313B | | D4PG | DM Control | |
DM Control Suite Random Large | 26 | 26.1M | 791B | | D4PG | DM Control | |
Meta-World | 45 | 94.6K | 3.39B | | MPO agent | Meta-World | Appendix F.9 of the Gato paper mentions that they collected data from all train and test tasks in the MT50 mode by training an MPO agent with unlimited environment seeds and access to the state of the MuJoCo physics engine. The collected data also contains the MuJoCo physics engine state. |
Procgen Benchmark | 16 | 1.6M | 4.46B | | R2D2 agent | Procgen | Appendix F.6 of the Gato paper mentions that they trained an R2D2 agent on the 16 environments at the hard difficulty setting, except for maze and heist, which were set to easy. Open RL Benchmark has some runs here: https://wandb.ai/openrlbenchmark/openrlbenchmark/reportlist |
RGB Stacking Simulator | 1 | 387K | 24.4B | | | RGB Stacking | The repo contains the specialist agents. |
RGB Stacking real robot | 1 | 15.7K | 980M | | | | |
Modular RL | 38 | 843K | 69.6B | | D4PG | Modular RL | Appendix F.7 of the Gato paper mentions that the authors trained a D4PG agent on each variant for a total of 140M actor steps with 30 random seeds per variant. |
DM Manipulation Playground | 4 | 286K | 6.58B | | | | The Gato paper mentions it contains 4 tasks with a simulated Kinova Jaco arm, but we can’t find any specific repo or source for the “DM Manipulation Playground”. Searching for ‘jaco’ in the DM Control Suite repo yields multiple results, so maybe it is included there. |
Playroom | 1 | 829K | 118B | | | | The word “Playroom” appears only once in the paper. We found a reference to a “Playroom” environment in a repo from Google Research: https://github.com/google-research/google-research/tree/master/playrooms |
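As a quick sanity check on the figures above, dividing approximate tokens by episodes gives a rough tokens-per-episode count for each environment. The numbers come straight from the table; the `parse` helper and the `datasets` selection below are our own:

```python
# Rough tokens-per-episode, using (Episodes, Approx Tokens) pairs
# copied from the table above.

def parse(qty: str) -> float:
    """Parse shorthand like '16.4M' or '194B' into a float."""
    units = {"K": 1e3, "M": 1e6, "B": 1e9}
    return float(qty[:-1]) * units[qty[-1]]

datasets = {
    "DM Lab": ("16.4M", "194B"),
    "ALE Atari": ("63.4K", "1.26B"),
    "DM Control Suite": ("395K", "22.5B"),
    "Playroom": ("829K", "118B"),
}

for name, (episodes, tokens) in datasets.items():
    per_episode = parse(tokens) / parse(episodes)
    print(f"{name}: ~{per_episode:,.0f} tokens/episode")
```

This makes the differences between environments more tangible: an Atari episode yields roughly 20K tokens, while a Playroom episode yields roughly 140K.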
Dataset | Sample Weight | Open-Source? | Repo | Open-Source equivalent | Additional info |
---|---|---|---|---|---|
MassiveText | | No | | The Pile | Web pages, books, news articles, and code. See https://vaclavkosar.com/ml/massivetext-dataset-pretraining-deepminds-gopher |
MultiModal MassiveWeb (M3W) | | No | | Maybe the Big Interleaved Dataset | Introduced in the Flamingo paper: https://openreview.net/pdf?id=EbMuimAbPbs |
ALIGN | | No | | Can’t find any | Introduced by Google: https://ai.googleblog.com/2021/05/align-scaling-up-visual-and-vision.html |
MS-COCO Captions | | Yes | | MS-COCO | Pretty sure the captions are included in MS-COCO. |
Conceptual Captions | | Yes | | | |
LTIP | | No | | | Proprietary from DeepMind, introduced in the Flamingo paper. |
OKVQA | | Yes | OK-VQA | | |
VQAv2 | | Yes | VisualQA | | |
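Gato mixes all of these datasets during training according to per-dataset sample weights (the "Sample Weight" columns above, which we have not been able to fill in). As an illustration of how such a mixture could be drawn, here is a minimal sketch using the stdlib `random.choices`; the weights below are placeholders, not the paper's values:

```python
import random
from collections import Counter

# Hypothetical mixture weights -- NOT the values from the Gato paper.
weights = {
    "MassiveText": 0.60,
    "M3W": 0.20,
    "ALIGN": 0.10,
    "MS-COCO Captions": 0.10,
}

def sample_batch_sources(weights: dict, batch_size: int, seed: int = 0) -> list:
    """Pick which dataset each batch element is drawn from."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=batch_size)

sources = sample_batch_sources(weights, batch_size=10_000)
print(Counter(sources).most_common())
```

In expectation, each dataset contributes a share of the batch proportional to its weight, which is all a sample weight means here.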