Can't load local NER model
Hi,
I trained an NER model following your tutorial here, which btw is very nice and clear, thank you so much. However, when I try to load the trained tagger, SequenceTagger doesn't seem to recognize the local path and keeps trying to download the model from the Hugging Face repo. Here's the error I got:
Traceback (most recent call last):
  File "flair_evaluate.py", line 9, in <module>
    model = SequenceTagger.load('resources/tagger/wikiner_model/final-model.pt')
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/flair/nn/model.py", line 134, in load
    model_file = cls._fetch_model(str(model_path))
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/flair/models/sequence_tagger_model.py", line 924, in _fetch_model
    model_path = cached_download(
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 665, in cached_download
    _raise_for_status(r)
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/huggingface_hub/utils/_errors.py", line 169, in _raise_for_status
    raise e
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/huggingface_hub/utils/_errors.py", line 131, in _raise_for_status
    response.raise_for_status()
  File "/share/home/cao/.conda/envs/bilstm_crf_Study/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/resources/tagger/wikiner_model/final-model.pt/resolve/main/pytorch_model.bin (Request ID: XH-Tc3TQnKyPEKgmNWpJE)
I have checked the paths; the files and directories are all there with the right names. I tried using absolute paths and changing the cache root parameter, but nothing seems to work. Do you have any idea how I can fix this?
Thank you in advance.
Environment:
- OS : Linux debian 5.10.140-1
- Version : flair 0.11.3
Hello @DanrunFR, if the loader does not find a file at the specified path, it falls back to a lookup on the Hugging Face repo. So the most likely reason is that the provided path is somehow not correct.
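One way to rule out a path problem is to check the file from inside the script before handing it to the loader. The 404 URL in the traceback contains the local path itself, which suggests the string was treated as a repo ID rather than a file. The sketch below is only an illustration: `resolve_model_path` and `load_local_tagger` are hypothetical helper names, not part of flair, and it assumes `SequenceTagger.load` accepts a path string as in the original call.

```python
from pathlib import Path


def resolve_model_path(model_path: str) -> Path:
    """Return the absolute path to a local model file, or raise if it is missing.

    Hypothetical helper, not part of flair.
    """
    # Relative paths are resolved against the current working directory,
    # which may differ from the directory that contains the script.
    path = Path(model_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"No model file at {path} (current working directory: {Path.cwd()})"
        )
    return path


def load_local_tagger(model_path: str):
    """Load a tagger only after confirming that the file exists locally."""
    # Imported lazily so the path check can be used without flair installed.
    from flair.models import SequenceTagger

    return SequenceTagger.load(str(resolve_model_path(model_path)))
```

Calling `load_local_tagger('resources/tagger/wikiner_model/final-model.pt')` would then fail early with a `FileNotFoundError` that names the resolved path and the working directory, instead of falling through to a download attempt.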
I had the same problem when I was loading a model (pytorch_model.bin) that I had downloaded manually from HF. You can move your model to your project root and use a file path like './xxx/xxx.bin'. (Please check the code at L823-L831 of 'flair/models/sequence_tagger_model.py'.) After rerunning 'SequenceTagger.load(....)', the terminal displayed the output below. It downloaded some necessary configuration files:
2023-01-05 15:26:28,612 loading file ./models/ner-english-ontonotes-large/pytorch_model.bin
2023-01-05 15:26:31,487: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:26:32,540: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/tokenizer_config.json HTTP/1.1" 404 0
2023-01-05 15:26:32,553: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:26:43,637: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/config.json HTTP/1.1" 200 0
2023-01-05 15:26:43,650: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:26:46,509: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/tokenizer_config.json HTTP/1.1" 404 0
2023-01-05 15:26:46,517: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:26:47,563: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/sentencepiece.bpe.model HTTP/1.1" 200 0
2023-01-05 15:26:47,566: DEBUG: Attempting to acquire lock 2009445452864 on C:\Users\EDY/.cache\huggingface\hub\models--xlm-roberta-large\blobs\db9af13bf09fd3028ca32be90d3fb66d5e470399.lock
2023-01-05 15:26:47,567: DEBUG: Lock 2009445452864 acquired on C:\Users\EDY/.cache\huggingface\hub\models--xlm-roberta-large\blobs\db9af13bf09fd3028ca32be90d3fb66d5e470399.lock
2023-01-05 15:26:47,573: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:26:48,773: DEBUG: https://huggingface.co:443 "GET /xlm-roberta-large/resolve/main/sentencepiece.bpe.model HTTP/1.1" 200 5069051
Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5.07M/5.07M [00:11<00:00, 458kB/s]
2023-01-05 15:26:59,861: DEBUG: Attempting to release lock 2009445452864 on C:\Users\EDY/.cache\huggingface\hub\models--xlm-roberta-large\blobs\db9af13bf09fd3028ca32be90d3fb66d5e470399.lock
2023-01-05 15:26:59,862: DEBUG: Lock 2009445452864 released on C:\Users\EDY/.cache\huggingface\hub\models--xlm-roberta-large\blobs\db9af13bf09fd3028ca32be90d3fb66d5e470399.lock
2023-01-05 15:27:45,420: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/added_tokens.json HTTP/1.1" 404 0
2023-01-05 15:27:45,428: DEBUG: Starting new HTTPS connection (1): huggingface.co:443
2023-01-05 15:27:46,478: DEBUG: https://huggingface.co:443 "HEAD /xlm-roberta-large/resolve/main/special_tokens_map.json HTTP/1.1" 404 0
2023-01-05 15:27:53,657 SequenceTagger predicts: Dictionary with 76 tags: <unk>, O, B-CARDINAL, E-CARDINAL, S-PERSON, S-CARDINAL, S-PRODUCT, B-PRODUCT, I-PRODUCT, E-PRODUCT, B-WORK_OF_ART, I-WORK_OF_ART, E-WORK_OF_ART, B-PERSON, E-PERSON, S-GPE, B-DATE, I-DATE, E-DATE, S-ORDINAL, S-LANGUAGE, I-PERSON, S-EVENT, S-DATE, B-QUANTITY, E-QUANTITY, S-TIME, B-TIME, I-TIME, E-TIME, B-GPE, E-GPE,
S-ORG, I-GPE, S-NORP, B-FAC, I-FAC, E-FAC, B-NORP, E-NORP, S-PERCENT, B-ORG, E-ORG, B-LANGUAGE, E-LANGUAGE, I-CARDINAL, I-ORG, S-WORK_OF_ART, I-QUANTITY, B-MONEY