Using `preprocess_phi_3_new` in `LAVIS/open_flamingo/train/sft_data_utils.py` yields labels that are all -100.

Open JHW5981 opened this issue 1 year ago • 4 comments

Hello, thank you for your wonderful work.

I have a problem re-implementing LazySupervisedDataset and am stuck at retrieving the training labels: all labels are -100.

image

Below is a screenshot of my dataset:

image

I reuse your LazySupervisedDataset as is. Initializing it with data_path, tokenizer, image_processor, and args runs without any issues. However, when I check the labels it generates, the tensor is entirely -100.

I debugged this strange behavior and found that the issue occurs because of the following piece of code:

image

First, once the “user round” has been processed, `cur_len` is never equal to `total_len` when the if-clause above is reached, so the line `target[:] = IGNORE_INDEX` is always executed.

image

Second, the code at line 226 does not skip the BOS token; it skips the "<|user|>" token instead. I don’t understand the reasoning behind this behavior.

image
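To make the two observations above concrete, here is a minimal sketch of the masking pattern as I understand it. It is not the actual code from `sft_data_utils.py`: the function `mask_labels`, its parameters, and the toy token ids are hypothetical, and plain Python lists stand in for tensors. It assumes the length bookkeeping starts one token in (as if a BOS token had been prepended) while the tokenized sequence contains no BOS token.

```python
IGNORE_INDEX = -100


def _mask(target, start, stop):
    """Set target[start:stop] to IGNORE_INDEX without growing the list."""
    for i in range(start, min(stop, len(target))):
        target[i] = IGNORE_INDEX


def mask_labels(input_ids, rounds, start_offset):
    """Hypothetical re-creation of the label-masking loop.

    input_ids:    token ids of the whole conversation
    rounds:       list of (instruction_len, response_len) token counts
    start_offset: leading tokens skipped before round 1 (1 if a BOS is assumed)
    """
    target = list(input_ids)
    total_len = len(input_ids)

    cur_len = start_offset
    _mask(target, 0, cur_len)  # skip the assumed BOS token(s)

    for instruction_len, response_len in rounds:
        # Only the response part of each round should be supervised.
        _mask(target, cur_len, cur_len + instruction_len)
        cur_len += instruction_len + response_len

    _mask(target, cur_len, total_len)  # mask anything after the last round

    # Sanity check: any length mismatch discards the whole sample.
    if cur_len != total_len:
        _mask(target, 0, total_len)
    return target


ids = [11, 12, 13, 14, 15, 16]  # 4 instruction tokens + 2 response tokens, no BOS
print(mask_labels(ids, [(4, 2)], start_offset=0))  # [-100, -100, -100, -100, 15, 16]
print(mask_labels(ids, [(4, 2)], start_offset=1))  # [-100, -100, -100, -100, -100, -100]
```

With `start_offset=1` and no BOS token in the sequence, the first real token (in my case "<|user|>") is the one that gets skipped, `cur_len` ends one token past `total_len`, and the sanity check masks every label. Whether this is exactly what happens in the repository code is my assumption.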

JHW5981, Dec 26 '24 17:12