BertSum
Order inconsistency of output candidate file with original test.json when testing bertSum Extractive
In "test" mode, two files are output: xxx.candidate and xxx.gold. The texts in these two files are in the same order as each other, but that order is not consistent with the original test.json. I have checked that "shuffle=False" is set in the dataloader. So what is going wrong? Has anyone encountered the same problem? Can anyone help?
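@cece00 Modify line 89 of src/model/data_loader.py. As far as I can tell, the shard files are sorted as plain strings there, so a shard numbered 10 is loaded before the shard numbered 2. The following code, which sorts the files by their numeric index instead, fixed a similar issue for me: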
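import glob
import re

def atoi(text):
    # Turn digit runs into ints so they compare numerically
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split into digit and non-digit chunks, e.g. 'test.10.' -> ['test.', 10, '.']
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Sort the shard files by natural (numeric) order instead of plain string order
pts = sorted(glob.glob(args.bert_data_path + 'cnndm.' + corpus_type + '.[0-9]*.bert.pt'), key=natural_keys)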
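For anyone who wants to check the ordering difference without touching the data, here is a minimal sketch. The shard names are hypothetical, made up to follow the `cnndm.<corpus_type>.<n>.bert.pt` pattern from the glob above, and no real files are read.

```python
import re

def atoi(text):
    # Digit runs become ints; everything else stays a string
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # 'cnndm.test.10.bert.pt' -> ['cnndm.test.', 10, '.bert.pt']
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Hypothetical shard names (assumption: they follow the glob pattern above)
shards = [
    'cnndm.test.0.bert.pt',
    'cnndm.test.1.bert.pt',
    'cnndm.test.2.bert.pt',
    'cnndm.test.10.bert.pt',
]

# Plain string sort: '10' sorts before '2' because '1' < '2' character-wise
lexicographic = sorted(shards)

# Natural sort: the shard index is compared as a number
natural = sorted(shards, key=natural_keys)

print(lexicographic)
print(natural)
```

With plain `sorted`, shard 10 ends up between shards 1 and 2, so the examples in xxx.candidate and xxx.gold would come out in a different order than in test.json once there are ten or more shards. With `natural_keys` the shards stay in numeric order.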