BertSum Order inconsistency of output candidate file with original test.json when testing bertSum Extractive


Open cece00 opened this issue 2 years ago • 1 comments

In "test" mode, two files are output: xxx.candidate and xxx.gold. The texts in these two files are in the same order as each other, but that order is not consistent with the original test.json. I have checked that "shuffle=False" is set in the dataloader. So what is going wrong? Has anyone encountered the same problem? Can anyone help?

Jul 21 '22 07:07 cece00

@cece00 Modify line 89 of src/model/data_loader.py. The following code fixed a similar issue for me:

import glob
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split into digit and non-digit chunks so that '10' sorts after '2'.
    return [atoi(c) for c in re.split(r'(\d+)', text)]

pts = sorted(glob.glob(args.bert_data_path + 'cnndm.' + corpus_type + '.[0-9]*.bert.pt'))
pts.sort(key=natural_keys)

Jul 29 '22 17:07 ashokurlana
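A minimal sketch of why the natural sort above may restore the original order, assuming the test set is split across several numbered shard files that match the glob pattern in the fix. The shard names below are hypothetical and only for illustration. A plain sorted() compares file names character by character, so shard 10 is read before shard 2 even with shuffle=False:

```python
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Digit runs become ints, so they compare numerically rather than as text.
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Hypothetical shard names, chosen only to match the glob pattern in the fix.
shards = [f"cnndm.test.{i}.bert.pt" for i in (0, 1, 2, 10, 11)]

lexicographic = sorted(shards)                # plain string order
natural = sorted(shards, key=natural_keys)    # numeric order of shard index

print(lexicographic)
print(natural)
```

If the shards are loaded in the lexicographic order, the candidate and gold files would still agree with each other (both follow the loading order) but would not follow test.json, which matches the symptom described in the question.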
