pythainlp
ModuleNotFoundError when calling `crfcut` engine in `sent_tokenize` function
I tried the `crfcut` engine of the `sent_tokenize` function in the stable release of PyThaiNLP, installed via:
pip install --upgrade pythainlp
This is what I expected:
sent_tokenize(sentence_1, engine="crfcut")
# output: ['ฉันไปประชุมเมื่อวันที่ 11 มีนาคม']
However, I got this instead:
sent_tokenize(sentence_1, engine="crfcut")
# ModuleNotFoundError: No module named 'pycrfsuite'
Since this is a missing-package problem, it can be solved with `pip install python-crfsuite`. However, would it be better to fix this so that users don't need the extra step of installing python-crfsuite whenever they want to use this engine, or should we leave it as it is? What do you think?
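One middle ground, sketched below, is to keep python-crfsuite optional but make the failure tell the user exactly what to install. This is only an illustration, not PyThaiNLP's actual code: the helper name `require_module` is hypothetical, and it relies only on the standard-library `importlib`.

```python
import importlib
from types import ModuleType


def require_module(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    `require_module` is a hypothetical helper for illustration; it is
    not part of PyThaiNLP.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Re-raise with the pip package name, which can differ from the
        # import name (pycrfsuite vs. python-crfsuite).
        raise ModuleNotFoundError(
            f"No module named '{module_name}'. "
            f"Install it with: pip install {pip_name}"
        ) from exc
```

An engine that depends on the package could then call `require_module("pycrfsuite", "python-crfsuite")` at the point of use, so users who never select that engine are unaffected.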
python-crfsuite often has problems when a new version of Python is released; see #655. We don't add python-crfsuite to the dependencies list.
I have been looking for new models to replace all the crfsuite models, but the current models are quite efficient, so they are not worth replacing. Deep learning models are not much better.