
Unable to fine-tune a previously trained transformer-based spaCy NER model

Open · jlustgarten opened this issue 1 year ago • 1 comment

How to reproduce the behaviour

Use spaCy to fine-tune a base model that uses a transformer from Hugging Face: python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy

Collect newly tagged entries into new train and dev sets, set the model location in a new config to output/model-last, and train again: python -m spacy train fine_tune_config.cfg --output ./fine_tune_output --paths.train ./newtrain.spacy --paths.dev ./newdev.spacy

The second run fails with an error about a missing config.json. Even after supplying that file, training fails again with an error about a missing tokenizer.
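
For context, the issue does not include fine_tune_config.cfg, so exactly where output/model-last was set is unknown. The fragment below is only a sketch of one documented way to reuse trained components in a spaCy v3 config, namely the source key on a component block. The component names transformer and ner are assumptions about the pipeline, and the rest of the config (paths, training, corpora and so on) is omitted here.

```ini
# Sketch only: component names "transformer" and "ner" are assumed,
# and the path is the output directory from the first training run.
[nlp]
lang = "en"
pipeline = ["transformer","ner"]

# Copy the already trained transformer from the saved spaCy pipeline
# rather than building a new one from a Hugging Face model name.
[components.transformer]
source = "./output/model-last"

# Copy the trained NER component from the same pipeline so it stays
# paired with the transformer it was trained with.
[components.ner]
source = "./output/model-last"
```

Whether this avoids the config.json and tokenizer errors described above has not been verified against the reporter's setup.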

Your Environment

  • Operating System: Windows 11
  • spaCy version: 3.7.2
  • Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.10.13

jlustgarten, Dec 06 '24 04:12

Apologies, I didn't mean to open this issue twice!

jlustgarten, Dec 06 '24 04:12