NEKO
Training run with certain non-default values causes evaluation run to fail.
I'm still trying to track down exactly what's causing this, but in the meantime, here are the commands I used.
Training command:
python -m pdb train.py \
--embed_dim=768 \
--layers=4 \
--heads=12 \
--training_steps=12 \
--log_eval_freq=4 \
--warmup_steps=1 \
--batch_size=4 \
--sequence_length=512 \
--eval_episodes=1 \
--activation_fn=gelu \
--save_model \
--save_mode=checkpoint \
--text_prop=1.0 \
--eval_text_log_examples \
--text_datasets=wikitext-2-v1 \
--text_datasets_paths=wikitext \
--pretrained_lm=gpt2 \
--disable_cosine_decay
Evaluation command:
python -m pdb eval.py \
--model_path=./models/neko-gato-<your-id-here>/checkpoint_12.pt \
--eval_episodes=1
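To make it easier to bisect which non-default flag triggers the failure, the two commands can be generated from one flag table. This is only a minimal sketch: `build_command` and `eval_flags` are hypothetical helpers, not part of NEKO, and the assumption that the checkpoint file is named after `--training_steps` is inferred only from `checkpoint_12.pt` above. The `-m pdb` wrapper is left out and the `<your-id-here>` placeholder is kept as-is.

```python
import shlex

# Non-default training flags copied from the training command above.
# A value of True stands for a bare switch such as --save_model.
TRAIN_FLAGS = {
    "embed_dim": 768,
    "layers": 4,
    "heads": 12,
    "training_steps": 12,
    "log_eval_freq": 4,
    "warmup_steps": 1,
    "batch_size": 4,
    "sequence_length": 512,
    "eval_episodes": 1,
    "activation_fn": "gelu",
    "save_model": True,
    "save_mode": "checkpoint",
    "text_prop": 1.0,
    "eval_text_log_examples": True,
    "text_datasets": "wikitext-2-v1",
    "text_datasets_paths": "wikitext",
    "pretrained_lm": "gpt2",
    "disable_cosine_decay": True,
}


def build_command(script, flags):
    """Hypothetical helper: turn a flag dict into an argv list."""
    argv = ["python", script]
    for name, value in flags.items():
        if value is True:
            argv.append(f"--{name}")
        else:
            argv.append(f"--{name}={value}")
    return argv


def eval_flags(run_dir, train_flags):
    """Hypothetical helper: derive the eval flags from the training flags."""
    # Assumption: the checkpoint is named after the final training step,
    # as checkpoint_12.pt is for --training_steps=12 above.
    steps = train_flags["training_steps"]
    return {
        "model_path": f"{run_dir}/checkpoint_{steps}.pt",
        "eval_episodes": train_flags["eval_episodes"],
    }


# The run directory keeps the placeholder from the report; substitute your own id.
RUN_DIR = "./models/neko-gato-<your-id-here>"

train_cmd = build_command("train.py", TRAIN_FLAGS)
eval_cmd = build_command("eval.py", eval_flags(RUN_DIR, TRAIN_FLAGS))

# Print shell-quoted commands so they can be pasted into a terminal.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

Removing or resetting one entry in `TRAIN_FLAGS` at a time and re-running both printed commands is one way to narrow down which value the evaluation run disagrees with.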