NEKO
In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks
A note from the meeting with Daniel today. For an example of temperature calculation, see: https://github.com/kzl/decision-transformer/blob/e2d82e68f330c00f763507b3b01d774740bee53f/atari/mingpt/utils.py#L47 Right now we're either entirely deterministic (temperature = 0) or entirely stochastically sampling from...
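To make the temperature idea above concrete, here is a minimal sketch of temperature-scaled sampling over a vector of logits. It is not NEKO's code and not a copy of the linked utility; the function name `sample_with_temperature` and its arguments are hypothetical, and it uses only the Python standard library. A temperature of 0 is treated as greedy (argmax) selection, and larger values flatten the distribution.

```python
import math
import random


def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Return an index into `logits`, sampled from softmax(logits / temperature).

    Hypothetical helper for illustration only.
    temperature <= 0 is treated as fully deterministic (argmax).
    """
    if temperature <= 0.0:
        # Deterministic case: pick the highest-scoring index.
        return max(range(len(logits)), key=lambda i: logits[i])

    if rng is None:
        rng = random.Random()

    # Scale the logits, then subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalizes the weights internally.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [0.1, 2.0, 0.3]
    rng = random.Random(0)
    print("greedy:", sample_with_temperature(logits, temperature=0.0))
    print("t=1.0 :", [sample_with_temperature(logits, 1.0, rng) for _ in range(10)])
    print("t=5.0 :", [sample_with_temperature(logits, 5.0, rng) for _ in range(10)])
```

Values between the two extremes trade off exploitation and diversity, so exposing the temperature as a single float argument would cover both current behaviors and everything in between.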
The env.yml file has host-specific builds for some dependencies (e.g. `numpy=1.24.3=py310h5f9d8c6_1`), which are valid for Linux only as per https://anaconda.org/anaconda/numpy/files?version=1.24.3&page=0&channel=main. This is causing `conda env create -f env.yml` to...
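One way to make such a file portable is to drop the build string from each conda spec, keeping only `name=version`. The sketch below shows that transformation on a list of dependency entries; the helper names `strip_build` and `strip_builds` are hypothetical, and the sample list is illustrative rather than the project's actual env.yml. It assumes conda specs of the form `name=version=build` and leaves pip-style `==` pins and nested entries untouched. Exporting with `conda env export --no-builds` should produce a similar result directly.

```python
def strip_build(spec):
    """Turn 'name=version=build' into 'name=version'; leave other specs alone."""
    if "==" in spec:
        # pip-style pin such as 'torch==2.0.1': not a conda build string.
        return spec
    parts = spec.split("=")
    if len(parts) == 3:
        return "=".join(parts[:2])
    return spec


def strip_builds(dependencies):
    """Apply strip_build to every string entry of a conda dependency list."""
    cleaned = []
    for dep in dependencies:
        if isinstance(dep, str):
            cleaned.append(strip_build(dep))
        else:
            # Nested entries (e.g. a {'pip': [...]} mapping) are kept as-is.
            cleaned.append(dep)
    return cleaned


if __name__ == "__main__":
    # Illustrative input, not the real env.yml contents.
    deps = [
        "python=3.10",
        "numpy=1.24.3=py310h5f9d8c6_1",
        {"pip": ["torch==2.0.1"]},
    ]
    print(strip_builds(deps))
```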
Part of a larger task to acquire as many of the GATO control datasets as we can (listed in the image below). [Image] Our existing control datasets come from [D4RL](https://github.com/takuseno/d4rl-atari/blob/master/d4rl_atari/offline_env.py)...
Context: @snat-s has done great work with the analysis of various data that may be relevant to NEKO. We should now, with the input of the team, finalize our proposed...
Context: @snat-s has made significant progress in reviewing and updating the planned multimodal dataset (a combination of many datasets) for the NEKO model. There are numerous older issues that we want...
## Background

[BabyAI](https://arxiv.org/abs/1810.08272) is a "gridworld environment whose levels consist of instruction-following tasks that are described by a synthetic language". Gato generates its dataset using the built-in BabyAI bot, with...
Here are the steps to recreate: First, to show that it's _specifically_ related to the `--pretrained-lm` argument, run this train/eval pair once _without_ `--pretrained-lm=gpt2` in the training arguments, then run the...
This is a really small issue. I think it would be nice if I could quickly and easily determine which of my training runs was the most recent training run....
Rather than training from scratch, we might use pretrained weights to serve as a basis for our model. We want to understand which models might serve as the basis for our...
I'm still trying to track down exactly what's causing this. But in the meantime, here's some data. Training command:

```
python -m pdb train.py \
  --embed_dim=768 \
  --layers=4 \
  --heads=12...
```