AllenShow issues

Results: 9 issues of AllenShow

What's the meaning of the parameter 'load_step'?

Thanks

Cannot reproduce baseline tasks?

Hi! Thanks for your great work. I tried to reproduce the baseline tasks, but the results were low compared to the paper. So I am not sure whether I used...

About PopQA

Hi! Thanks for the great work. When reproducing the inference for PopQA using Self-RAG, I got the same score for adaptive_retrieval and always_retrieve. In theory, the adaptive_retrieval result should be...

max_depth argument in retrieval_lm/run_short_form.py

Hello, I'm trying to reproduce the paper's numbers on PopQA by running the following command: Question Answering python run_short_form.py \ --model_name selfrag/selfrag_llama2_7b \ --input_file eval_data/popqa_longtail_w_gs.jsonl \ --mode MODE --max_new_tokens 100...

For ASQA, how to reproduce the baseline?

Hi! Thanks for your great work. I tried to reproduce the baseline for ASQA using Llama-2-7b-hf, like this: python run_baseline_lm.py \ --model_name meta-llama/Llama-2-7b-hf \ --input_file eval_data/asqa_eval_gtr_top100.json \ --max_new_tokens 300 --metric...

How to fix the bug about 'local variable 'pred' referenced before assignment'?

About baseline's parameter 'task'

Could model_type support QWEN models?

When eval_packing=False, training failed with KeyError: 'eval_loss'.