
Order inconsistency of output candidate file with original test.json when testing bertSum Extractive

Open · cece00 opened this issue 2 years ago · 1 comment

In "test" mode, two files are written: xxx.candidate and xxx.gold. The texts in these two files are in the same order as each other, but that order is not consistent with the original test.json. I have checked that "shuffle=False" is set in the dataloader. So what is going wrong? Has anyone encountered the same problem? Can anyone help?

cece00, Jul 21 '22 07:07

@cece00 Modify line 89 of src/model/data_loader.py. The following code fixed a similar issue for me:

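    # Needed at the top of the file if not already imported
    import glob
    import re

    def atoi(text):
        # Convert digit runs to int so they compare numerically
        return int(text) if text.isdigit() else text

    def natural_keys(text):
        # Split into alternating text/number chunks, e.g. 'test.10.' -> ['test.', 10, '.']
        return [atoi(c) for c in re.split(r'(\d+)', text)]

    pts = sorted(glob.glob(args.bert_data_path + 'cnndm.' + corpus_type + '.[0-9]*.bert.pt'))
    # Re-sort so that shard 2 comes before shard 10
    pts.sort(key=natural_keys)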
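For anyone wondering why this helps, here is a minimal sketch of the likely cause. It assumes the shard files are numbered like cnndm.test.0.bert.pt, cnndm.test.1.bert.pt, and so on; the file names below are made up for illustration and nothing is read from disk. A plain sorted() compares the names as strings, so shard 10 lands before shard 2, while the natural key compares the numbers as integers.

```python
import re

def atoi(text):
    # Digit runs become ints; everything else stays a string
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # re.split with a capture group keeps the digit runs in the result
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Hypothetical shard names, for illustration only
names = ['cnndm.test.%d.bert.pt' % i for i in (0, 1, 2, 10, 11)]

lexicographic = sorted(names)                 # string comparison
natural = sorted(names, key=natural_keys)     # numeric comparison of the index

print([n.split('.')[2] for n in lexicographic])
print([n.split('.')[2] for n in natural])
```

If the shards are loaded in the first order, the candidate and gold files still match each other, since both come from the same batches, but neither follows the order of the original test.json. That would fit the symptom described above, though I have not verified it against the repository.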

ashokurlana, Jul 29 '22 17:07