
How to run evaluation on the validation set?

Open anirudh-chakravarthy opened this issue 1 year ago • 1 comment

Hi,

Is it possible to provide a set of instructions for running evaluation on the validation set?

From the README:

test_eval.json is used for evaluation. test_llama.json is used for training

However, when I run:

python demo.py --llama_dir /path/to/llama_model_weights --checkpoint /path/to/pre-trained/checkpoint.pth --data ../test_llama.json  --output ../output.json --batch_size 4 --num_processes 8

python evaluation.py --root_path1 ./output.json --root_path2 ./test_eval.json

I run into a UUID error. I followed the exact instructions, so I'm not sure why this doesn't work.

For further diagnosis, following the FAQ, which says I should run inference on the validation set, I ran:

python convert2llama.py

and changed this line to read v1_1_val_nus_q_only.json, with the output written to val_llama.json.
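For clarity, the edit in convert2llama.py amounts to something like the following (paraphrasing; the actual variable names in the script may differ):

```python
# Hypothetical sketch of the change in convert2llama.py.
# The real script's variable names may differ; only the two paths were changed.
input_path = "v1_1_val_nus_q_only.json"  # validation questions instead of the training split
output_path = "val_llama.json"           # converted output used as --data for demo.py
```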

And then did:

python demo.py --llama_dir /path/to/llama_model_weights --checkpoint /path/to/pre-trained/checkpoint.pth --data ../val_llama.json  --output ../output_val.json --batch_size 4 --num_processes 8

python evaluation.py --root_path1 ./output_val.json --root_path2 ./v1_1_val_nus_q_only.json

But this doesn't work either, and shows the same UUID error.
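For what it's worth, my guess is that evaluation.py matches predictions to ground truth by question ID, so a mismatch between the two files' ID sets could be the cause. A quick sanity check I tried (a rough sketch; the assumed file shapes — a dict keyed by ID, or a list of records with an "id" field — may not match the actual format):

```python
import json

def load_ids(data):
    """Return the set of IDs, whether the JSON is a dict keyed by ID
    or a list of records carrying an 'id' field."""
    if isinstance(data, dict):
        return set(data)
    return {rec.get("id") for rec in data if isinstance(rec, dict)}

# In practice: preds = json.load(open("output_val.json")),
#              gts   = json.load(open("v1_1_val_nus_q_only.json"))
preds = json.loads('{"abc-123": {"answer": "stop"}}')
gts = json.loads('[{"id": "abc-123"}, {"id": "def-456"}]')

pred_ids, gt_ids = load_ids(preds), load_ids(gts)
print("only in predictions:", sorted(pred_ids - gt_ids))   # []
print("only in ground truth:", sorted(gt_ids - pred_ids))  # ['def-456']
```

If either set difference is non-empty, the evaluation script presumably fails when it looks up an ID that only exists on one side.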

anirudh-chakravarthy avatar Nov 12 '24 17:11 anirudh-chakravarthy

Could you post the UUID error here? Are you running eval on your local env or our test server?

ChonghaoSima avatar Nov 18 '24 05:11 ChonghaoSima