missing_aware_prompts

About the hatememes_dataset.py

Open · herkerser opened this issue 1 year ago · 1 comment

I noticed that the `HateMemesDataset` class sets `text_column_name` to 'plots', as follows:

```python
class HateMemesDataset(BaseDataset):
    def __init__(self, *args, split="", missing_info={}, **kwargs):
        assert split in ["train", "val", "test"]
        self.split = split

        # Map the split name to the corresponding arrow file name.
        if split == "train":
            names = ["hatememes_train"]
        elif split == "val":
            names = ["hatememes_dev"]
        elif split == "test":
            names = ["hatememes_test"]

        super().__init__(
            *args,
            **kwargs,
            names=names,
            text_column_name="plots",
            remove_duplicate=False,
        )
```

However, `make_arrow` in write_hatememes.py does not define a column named 'plots'. The dataframe is built as `pd.DataFrame(data_list, columns=["image", "text", "label", "split"])`. This may cause the error `KeyError: 'Field "plots" does not exist in schema'` during training. Is this a mistake, or have I misunderstood something?
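To illustrate the mismatch without launching training, here is a minimal sketch that compares the column name the dataset asks for against the columns `make_arrow` writes. `find_missing_columns` is a hypothetical helper I wrote for this check, not part of the repository, and the column list is copied from the `pd.DataFrame` call quoted above.

```python
# Columns written by make_arrow, copied from the snippet quoted in this issue.
WRITTEN_COLUMNS = ["image", "text", "label", "split"]


def find_missing_columns(required, available):
    """Return the names in `required` that are absent from `available`.

    Hypothetical helper for illustration only; it is not part of the repo.
    """
    available_set = set(available)
    return [name for name in required if name not in available_set]


# "plots" is the text_column_name that HateMemesDataset passes to BaseDataset.
print(find_missing_columns(["plots"], WRITTEN_COLUMNS))

# "text" is the column that make_arrow actually writes.
print(find_missing_columns(["text"], WRITTEN_COLUMNS))
```

If the first call reports 'plots' as missing, the two files disagree on the text column name. I am not sure which side is the intended one.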

herkerser, Dec 01 '23 02:12