ramen
Reproducing baseline results
I have some trouble reproducing the baseline results with RN and UpDn on CLEVR. Can you share the scripts for reproducing the baseline results (RN, UpDn, etc.) with the precise hyperparameters used (batch size, learning rate, etc.)?
I uploaded the original scripts for RN and UpDn to: https://drive.google.com/drive/folders/1-lD4wDWNSh3n1DsSLA8zAPNB0US7jg8U?usp=sharing
Note that these scripts are no longer well documented or maintained, but they should hopefully help you reproduce the results.
For the relation network (RN) on CLEVR, this was the script specifying the hyperparameters (see rn_CLEVR.sh):
CUDA_VISIBLE_DEVICES=0 python -u train_rn.py --root $ROOT --data_set $DATA_SET --model 'original-fbuf' \
--batch-size 128 \
--test-batch-size 128 \
--num_objects 15 \
--lr 5e-6 \
--lr-step 10 \
--lr-gamma 2 \
--lr-max 0.0005 \
--epochs 63 \
--invert-questions \
--rl_in_size 5120 \
--feature_subdir faster-rcnn \
--expt_name expt_CLEVR
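To make the learning-rate flags above easier to read, here is a rough sketch of the schedule they appear to describe. It assumes that the rate starts at `--lr`, is multiplied by `--lr-gamma` every `--lr-step` epochs, and is capped at `--lr-max`. That interpretation is a guess based on the flag names; `train_rn.py` may behave differently (for example, it may simply stop increasing the rate once the cap is passed instead of clamping it). The function name `warmup_schedule` is hypothetical and not part of the shared scripts.

```python
def warmup_schedule(base_lr, step, gamma, max_lr, epochs):
    """Return the assumed learning rate for each epoch (0-indexed).

    Assumption (not confirmed by the original scripts): the rate is
    multiplied by `gamma` once every `step` epochs and clamped at `max_lr`.
    """
    rates = []
    for epoch in range(epochs):
        # Number of increases applied so far is epoch // step.
        lr = base_lr * gamma ** (epoch // step)
        # Never exceed the cap.
        rates.append(min(lr, max_lr))
    return rates


if __name__ == "__main__":
    # Values taken from the RN command line above.
    rates = warmup_schedule(base_lr=5e-6, step=10, gamma=2, max_lr=0.0005, epochs=63)
    for epoch in range(0, 63, 10):
        print(f"epoch {epoch:2d}: lr = {rates[epoch]:.2e}")
```

Under this reading, the rate doubles six times over the 63 epochs and ends at 3.2e-4, so the 5e-4 cap would only come into play in a longer run.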
For UpDn on CLEVR, I used the following script (see butd_CLEVR.sh). The learning rate was not tuned, i.e., the Adamax optimizer was used with its default lr (2e-3):
CUDA_VISIBLE_DEVICES=0 python -u butd_vqa.py --root $ROOT \
--dataset $DATASET \
--spatial_feature_type mesh \
--spatial_feature_length 16 \
--batch_size 64 \
--num_objects 15 \
--h5_prefix use_split \
--expt_name expt_${DATASET}_new \
--feature_subdir faster-rcnn \
--resume latest