Unable to run pre-trained (1.5B) model on test set
I'm trying to run the pre-trained 1.5B model linked in the README on the APPS test set. I downloaded the dataset, ran train/apps_create_split.py on it, and then ran the model with:
python generate_gpt_codes.py -t ~/Code/APPS/test.json --load ~/Code/APPS/models/1.5B --save ~/Code/APPS/output/15B
Note that I didn't do any training beforehand; the directory models/1.5B is exactly as downloaded. I assume this is fine, since the README says the models are already fine-tuned.
When I look at the contents of all_codes.json, at first it looks okay, but pretty soon all I see are empty entries like this:
... "9": "", "10": "", "11": "", "12": "", "13": "", "14": "" ...
I see several messages in the script output that seem like potential errors:
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input length of input_ids is 1052, but `max_length` is set to 1023. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [207,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
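If it helps, the last two warnings seem to point at tokenizer setup for batched decoder-only generation. Here is a minimal sketch of what I believe they are asking for, using the stock gpt2 checkpoint as a stand-in for the 1.5B model (my guess, not the repo's actual code):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # stand-in checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")

tokenizer.padding_side = "left"            # decoder-only models need left padding
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Passing the attention mask explicitly is what the first warning asks for.
enc = tokenizer(["first prompt", "a second, longer prompt"],
                return_tensors="pt", padding=True)
out = model.generate(input_ids=enc["input_ids"],
                     attention_mask=enc["attention_mask"],
                     pad_token_id=tokenizer.eos_token_id,
                     max_new_tokens=64)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```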
Many of those errors are printed over and over, and the end of the log is just this message repeated thousands of times:
Unexpected exception in generating solution
Batch dimension of `input_ids` should be 5, but is 4.
This is mostly being addressed in #22. We recommend using the GPT-Neo model.
I'll leave this open as a reminder to update the README.
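For anyone landing here, loading a GPT-Neo checkpoint through Hugging Face looks roughly like the sketch below. The model id is the public EleutherAI checkpoint, used here only as an example; substitute the path or id of the fine-tuned weights:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint only; swap in the path/id of the fine-tuned APPS model.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
```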
The published model has a maximum length of 1024 tokens, and we pad and truncate the input to that length, so if the input itself is 1024 tokens there is no room left for the model's output.
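As an illustration, one way to leave room is to truncate the prompt to less than the full context window before generating. This is a sketch with illustrative numbers (the 256-token reserve is an arbitrary choice), not the repo's exact code:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # stand-in for the 1.5B checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")

MAX_LEN = 1024   # the model's context window
RESERVED = 256   # tokens reserved for the generated solution (illustrative choice)

prompt = "QUESTION:\nPrint the sum of two integers read from stdin.\nANSWER:\n"
# Truncate the prompt so prompt + generation fits inside the context window.
enc = tokenizer(prompt, return_tensors="pt",
                truncation=True, max_length=MAX_LEN - RESERVED)
out = model.generate(enc["input_ids"],
                     attention_mask=enc["attention_mask"],
                     max_new_tokens=RESERVED,
                     pad_token_id=tokenizer.eos_token_id)
```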
Closing due to staleness, but I also updated the repo to run with Hugging Face, so hopefully this issue is addressed.