doccano-transformer
Not compatible with spacy 3.x
When running the sample code I get the following error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-82-777c977a1d80> in <module>
1 import doccano_transformer
2
----> 3 from doccano_transformer.datasets import NERDataset
4 from doccano_transformer.utils import read_jsonl
5
~/miniconda3/envs/base/lib/python3.7/site-packages/doccano_transformer/datasets.py in <module>
3 from typing import Any, Callable, Iterable, Iterator, List, Optional, TextIO
4
----> 5 from doccano_transformer.examples import Example, NERExample
6
7
~/miniconda3/envs/base/lib/python3.7/site-packages/doccano_transformer/examples.py in <module>
2 from typing import Callable, Iterator, List, Optional
3
----> 4 from spacy.gold import biluo_tags_from_offsets
5
6 from doccano_transformer import utils
ModuleNotFoundError: No module named 'spacy.gold'
It seems the spacy.gold module was removed in spaCy v3.x: https://github.com/explosion/spaCy/releases
Just change the line
from spacy.gold import biluo_tags_from_offsets
to
from spacy.training import offsets_to_biluo_tags
and update the corresponding function call in doccano_transformer/examples.py, and it should be fine.
[EDIT] This works only if you want to use the dataset.to_conll2003 method.
For dataset.to_spacy, it still throws an error, since the token object (created in utils) doesn't seem to be spaCy-compatible.
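If the same code has to import under both spaCy 2.x and 3.x, the import change described above can be wrapped in a fallback. This is only a minimal sketch: `resolve_biluo_converter` is a hypothetical helper name, not part of doccano-transformer, and it addresses the import alone, not the dataset.to_spacy problem.

```python
def resolve_biluo_converter():
    """Return spaCy's offsets-to-BILUO function, or None if it is unavailable.

    Hypothetical helper for illustration; not part of doccano-transformer.
    """
    try:
        # spaCy 3.x location and name.
        from spacy.training import offsets_to_biluo_tags
        return offsets_to_biluo_tags
    except ImportError:
        pass
    try:
        # spaCy 2.x location and name.
        from spacy.gold import biluo_tags_from_offsets
        return biluo_tags_from_offsets
    except ImportError:
        # spaCy is not installed, or neither location exists.
        return None


converter = resolve_biluo_converter()
if converter is None:
    print("No spaCy BILUO converter found")
else:
    print("Using", converter.__module__ + "." + converter.__name__)
```

Both functions take the Doc and the list of (start, end, label) offsets as their first two arguments, so the call site can stay the same once the name is resolved.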
- If you don't have other constraints, downgrading spaCy should work:
pip install spacy==2.3.2
Or
conda install -c conda-forge spacy==2.3.2
Having the same issue.
How do you let a bug like this just sit? Unusable after spacy 3
@mirfan899 @Matt-Payne if my approach doesn't work, please explain clearly: if you followed it, what errors did you get? I did mention that there might be other constraints :).
What does the issue say? He is not asking about constraints you came up with. It clearly states that doccano-transformer is not compatible with spaCy 3.
I've written a script to convert the doccano output JSONL to BILOU-format JSON that can be directly converted and used for spaCy training. Check it out here: https://github.com/abtExp/doccano_to_bilou
My script doesn't rely on spacy, thus no compatibility issues.
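To illustrate what a spaCy-free conversion can look like, here is a rough sketch. It is not the code from the linked repository. It assumes each doccano JSONL line has a "text" field and a "labels" field holding [start, end, label] triples, and it tokenizes on whitespace only; `biluo_tags` and `convert_line` are hypothetical names.

```python
import json
import re
from typing import List, Tuple

Span = Tuple[int, int, str]


def biluo_tags(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Tag whitespace-separated tokens with BILUO labels from character spans."""
    # Each token keeps its character offsets so it can be matched to spans.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        # Indices of tokens that lie entirely inside this span.
        inside = [i for i, (_, s, e) in enumerate(tokens) if s >= start and e <= end]
        if not inside:
            continue  # span does not align with any whole token
        if len(inside) == 1:
            tags[inside[0]] = f"U-{label}"
        else:
            tags[inside[0]] = f"B-{label}"
            for i in inside[1:-1]:
                tags[i] = f"I-{label}"
            tags[inside[-1]] = f"L-{label}"
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


def convert_line(line: str) -> List[Tuple[str, str]]:
    """Convert one doccano JSONL line (assumed 'text' + 'labels' keys)."""
    record = json.loads(line)
    spans = [tuple(s) for s in record.get("labels", [])]
    return biluo_tags(record["text"], spans)


sample = '{"text": "Barack Obama visited Paris", "labels": [[0, 12, "PER"], [21, 26, "LOC"]]}'
for token, tag in convert_line(sample):
    print(token, tag)
```

Because tokens are split on whitespace, punctuation stays attached to words, so a span that ends before a trailing period will not align with a whole token and is skipped. A real converter would need a proper tokenizer or alignment handling.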