StackOverflowNER
KeyError in /code/BERT_NER/utils_fine_tune/labels_seg.txt
Hi,
I'm trying to run E2E_SoftNER.py. I think I have resolved the references to the locations of most of the models and files associated with the repo. However, I'm getting an error. Here's the traceback:
Exception has occurred: KeyError
8
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 298, in evaluate
preds_list[i].append(label_map[preds[i][j]])
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/softner_segmenter_preditct_from_file.py", line 638, in predict_segments
result, predictions = evaluate(args, model, tokenizer, labels, pad_token_label_id, mode="", path=input_file)
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 186, in Extract_NER
softner_segmenter_preditct_from_file.predict_segments(segmenter_input_file, segmenter_output_file)
File "/Users/clemente/src/python/github/StackOverflowNER/code/BERT_NER/E2E_SoftNER.py", line 206, in <module>
Extract_NER(input_file)
It looks like there might be something off with the format this code expects for './utils_fine_tune/labels_seg.txt'. Looking at label_map here, it is just a dictionary that doesn't have a key for 8:
> label_map
{0: 'B-Name', 1: 'O', 2: 'CTC_PRED:0', 3: 'CTC_PRED:1', 4: 'md_label:O', 5: 'md_label:Name'}
whereas preds here seems to be an array containing label ids well outside that range (up to at least 13):
> preds
array([[ 0, 8, 13, ..., 10, 1, 0],
[ 4, 13, 1, ..., 7, 3, 9],
[ 9, 2, 0, ..., 9, 1, 9],
...,
[ 0, 2, 13, ..., 0, 12, 0],
[ 4, 2, 5, ..., 10, 5, 1],
[ 4, 2, 6, ..., 9, 9, 9]])
Everything in the utils_fine_tune directory came from the megaupload link you provided, so it is possible that there was some issue with either the archive or the data.
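For anyone hitting the same KeyError, here is a minimal sketch of a check that lists which predicted ids have no entry in label_map. The helper name find_unmapped_label_ids is hypothetical (it is not part of the repo), and the sample preds only uses the values visible in the first two truncated rows above. If ids are missing, one possible cause (an assumption on my part, not confirmed) is that the labels file yields fewer labels than the checkpoint was trained with.

```python
import numpy as np


def find_unmapped_label_ids(preds, label_map):
    """Return the sorted predicted ids that have no entry in label_map.

    Hypothetical helper for debugging, not part of StackOverflowNER.
    """
    predicted_ids = np.unique(np.asarray(preds))
    return sorted(int(i) for i in predicted_ids if int(i) not in label_map)


# label_map exactly as printed in the debugger above.
label_map = {
    0: "B-Name",
    1: "O",
    2: "CTC_PRED:0",
    3: "CTC_PRED:1",
    4: "md_label:O",
    5: "md_label:Name",
}

# Only the visible values from the first two rows of preds (the elided
# middle of each row is unknown and left out).
preds = np.array([
    [0, 8, 13, 10, 1, 0],
    [4, 13, 1, 7, 3, 9],
])

missing = find_unmapped_label_ids(preds, label_map)
print("ids predicted by the model but absent from label_map:", missing)
```

Running a check like this right before the label_map lookup in evaluate would show whether the mismatch affects a few ids or most of the label set.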
If you find the time to take a look at this issue, please let me know if there is anything else I can provide to help debug or further understand it. Thanks very much for contributing this code to the community. Hopefully it's just some misunderstanding on my end.
Hi Jeniyat,
Thank you for the contribution! I'm facing the same error as @cuevasclemente:
BERT_NER/softner_segmenter_preditct_from_file.py", line 298, in evaluate
preds_list[i].append(label_map[preds[i][j]])
KeyError: 11
If you could help understand this issue, it would be really helpful!
@cuevasclemente Hi, I am also trying to run E2E_SoftNER.py. You mentioned that you were able to resolve the references to the locations for many models and files. I may be misunderstanding something, but I'm getting this error:
ValueError: /data/jeniya/STACKOVERFLOW_DATA/POST_PROCESSED/fasttext_model/fasttext.bin cannot be opened for loading!
The model is not present at that path when it is loaded from "StackOverflowNER-master/code/BERT_NER/utils_ctc/prediction_ctc.py", line 30:
fasttext_model = fasttext.load_model('/data/jeniya/STACKOVERFLOW_DATA/POST_PROCESSED/fasttext_model/fasttext.bin')
I tried to load the other models in the StackOverflowNER-master/resources/pretrained_word_vectors/ folder, but I get a "... wrong file format" error. What models did you use, and how did you load them?
Hi, I think the specific error you're referencing indicates that you need to change the file locations this code is looking for. You probably don't have a /data/jeniya/STACKOVERFLOW_DATA directory on your computer, so you would need to change those paths in prediction_ctc.py to point to where the files are on your local machine.
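As a rough sketch of one way to do that without editing the hard-coded string every time: the helper below reads the path from an environment variable and fails early with a clearer message if the file is missing. Both the function name resolve_model_path and the variable name SOFTNER_FASTTEXT_MODEL are made up for illustration; nothing in the repo defines them.

```python
import os


def resolve_model_path(default_path, env_var="SOFTNER_FASTTEXT_MODEL"):
    """Return the model path from env_var if set, else default_path.

    Hypothetical helper: the environment variable name is an assumption,
    not something the repo reads. Raises FileNotFoundError early so the
    failure points at the path rather than at the model loader.
    """
    path = os.environ.get(env_var, default_path)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Model file not found: {path!r}. "
            f"Set {env_var} or edit the path in prediction_ctc.py."
        )
    return path


# Possible use in prediction_ctc.py (left commented out because it needs
# the fasttext package and a local copy of the model):
# fasttext_model = fasttext.load_model(
#     resolve_model_path(
#         "/data/jeniya/STACKOVERFLOW_DATA/POST_PROCESSED/fasttext_model/fasttext.bin"
#     )
# )
```

You would then export SOFTNER_FASTTEXT_MODEL with your local path to fasttext.bin before running E2E_SoftNER.py.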