reproducing your results
Hi folks, thanks for your help with understanding unlimiformer so far. My team and I are trying to reproduce the training results from the paper using the following command:
```
python src/run.py \
src/configs/training/base_training_args.json \
src/configs/data/gov_report.json \
--output_dir output_train_bart_base_local/ \
--learning_rate 1e-5 \
--model_name_or_path facebook/bart-base \
--eval_steps 1000 --save_steps 1000 \
--per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
--extra_metrics bertscore \
--unlimiformer_training \
--max_source_length 16384 \
--test_unlimiformer --eval_max_source_length 999999 --do_eval=True \
> output/output${SLURM_JOB_ID}.txt
```
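As a quick sanity check on the effective hyperparameters, we print the base config with our CLI overrides applied (a minimal sketch, assuming the CLI flags simply override the values in base_training_args.json; we haven't verified exactly how run.py merges them):

```python
import json

# Base training arguments shipped with the repo (path relative to the repo root).
with open("src/configs/training/base_training_args.json") as f:
    args = json.load(f)

# CLI overrides from the command above (assumption: CLI flags win over the JSON values).
args.update({
    "output_dir": "output_train_bart_base_local/",
    "learning_rate": 1e-5,
    "model_name_or_path": "facebook/bart-base",
    "eval_steps": 1000,
    "save_steps": 1000,
    "per_device_eval_batch_size": 1,
    "per_device_train_batch_size": 2,
    "max_source_length": 16384,
    "eval_max_source_length": 999999,
})

print(json.dumps(args, indent=2, sort_keys=True))
```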
My understanding is that we should be able to reproduce the Table 4 numbers: 56.6 / 26.3 / 27.6 / 68.2 for ROUGE-1 / ROUGE-2 / ROUGE-L / BERTScore. Here is a link to a wandb report of a full run we have produced (it took about 11 hours): https://api.wandb.ai/links/unlimiformer-kg/y29tbk1n
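For scoring, we compute the same four metrics with the `evaluate` library (a rough sketch; `predictions` and `references` are placeholders, and the ROUGE/BERTScore settings here may not match exactly what the paper used):

```python
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["..."]  # generated summaries (placeholder)
references = ["..."]   # gold summaries (placeholder)

# ROUGE returns aggregated F-measures as floats in [0, 1].
r = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# BERTScore returns per-example precision/recall/F1 lists.
b = bertscore.compute(predictions=predictions, references=references, lang="en")

print("ROUGE-1 / ROUGE-2 / ROUGE-L:",
      round(100 * r["rouge1"], 1), round(100 * r["rouge2"], 1), round(100 * r["rougeL"], 1))
print("BERTScore F1:", round(100 * sum(b["f1"]) / len(b["f1"]), 1))
```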
The `--max_source_length 16384` setting is a concern, given that the training set contains some enormous documents. The length distribution has a very long tail: plenty of documents exceed 50k tokens, and one reaches about 250k.
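This is roughly how we looked at the length distribution (a sketch, assuming the `ccdv/govreport-summarization` dataset with a `report` column as a stand-in for whatever `src/configs/data/gov_report.json` actually points to, tokenized with `facebook/bart-base`):

```python
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
# Assumption: this HF dataset approximates the one configured in src/configs/data/gov_report.json.
train = load_dataset("ccdv/govreport-summarization", split="train")

# Token counts of the full, untruncated source documents (this pass over the split is slow).
lengths = [len(tokenizer(ex["report"], truncation=False)["input_ids"]) for ex in train]

p50, p90, p99 = np.percentile(lengths, [50, 90, 99]).astype(int)
print(f"p50={p50}  p90={p90}  p99={p99}  max={max(lengths)}")
print("documents over 50k tokens:", sum(l > 50_000 for l in lengths))
```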
I'll let you know how a second run goes overnight. I've just cloned your repo and, just to be sure, here is a screenshot of my SLURM job: