contextualSpellCheck
French (doc add)
As requested in #41, here is how I succeeded in running contextualSpellCheck for French.
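Use a French spaCy model:
import spacy
nlp = spacy.load("fr_core_news_sm")
Add the spell checker with camembert/camembert-base-ccnet as the model:
import contextualSpellCheck  # registers the "contextual spellchecker" pipe
nlp.add_pipe("contextual spellchecker", config={"max_edit_dist": 4, "model_name": "camembert/camembert-base-ccnet"})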
These extra dependencies are needed:
pip install sentencepiece
pip install protobuf==3.20
Remark: spaces are lost in the resulting text, so post-processing is needed to restore them properly.
PS: the flaubert/flaubert_large_cased model needs this additional dependency:
pip install sacremoses
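For anyone who wants the steps above in one place, here is a minimal sketch. The helper names (`french_spellcheck_config`, `build_french_pipeline`) are made up for illustration; only the spaCy model name, the pipe name and the two config keys come from the steps above. It assumes `spacy`, `contextualSpellCheck`, the `fr_core_news_sm` model and the dependencies listed above are already installed.

```python
FRENCH_SPACY_MODEL = "fr_core_news_sm"


def french_spellcheck_config(max_edit_dist=4,
                             model_name="camembert/camembert-base-ccnet"):
    """Return the config dict passed to the spell checker pipe (hypothetical helper)."""
    return {"max_edit_dist": max_edit_dist, "model_name": model_name}


def build_french_pipeline():
    """Load the French spaCy model and attach the contextual spell checker."""
    # Imported here so that defining the helpers does not require spaCy.
    import spacy
    import contextualSpellCheck  # noqa: F401  (importing registers the pipe factory)

    nlp = spacy.load(FRENCH_SPACY_MODEL)
    nlp.add_pipe("contextual spellchecker", config=french_spellcheck_config())
    return nlp
```

Usage would then look like `nlp = build_french_pipeline()` followed by `doc = nlp("Le chat dort sur le canpé.")`; `doc._.performed_spellCheck` tells whether a correction was attempted and `doc._.outcome_spellCheck` holds the corrected text (with the space issue mentioned above).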
Hey @EtienneAb3d, thank you for raising this. It is excellent to know you were able to use it for French!
Would you like to raise a PR adding a French example, similar to the other examples? I would be happy to merge it, as it would be a great addition for people using the library for French!
If you have any suggestions or other feedback, feel free to highlight them.
Hi @R1j1t, perhaps later I will find the time to build such a PR. But if someone on the team has direct edit access, it's only a few lines to add to the doc. ;-)
No worries!
Also note that, in addition to @EtienneAb3d's steps, in a Jupyter Notebook you need to restart the kernel after installing protobuf:
!pip uninstall -y protobuf
!pip install protobuf==3.20
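To check whether the restart is still pending, a small sketch like the following may help. It compares the protobuf version installed on disk with the one already imported in the running kernel, if any. The function names are hypothetical, and the "3.20" pin is just the one used in this thread.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Version of the package installed on disk, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def loaded_protobuf_version():
    """Version of protobuf already imported in this process, or None if not imported."""
    module = sys.modules.get("google.protobuf")
    return getattr(module, "__version__", None)


def matches_pin(installed, pin="3.20"):
    """True if the version equals the pin or is a patch release of it (e.g. 3.20.0)."""
    if installed is None:
        return False
    return installed == pin or installed.startswith(pin + ".")


on_disk = installed_version("protobuf")
in_memory = loaded_protobuf_version()
print("installed:", on_disk, "| loaded in this kernel:", in_memory)
if in_memory is not None and in_memory != on_disk:
    print("The kernel still holds the old protobuf: restart it.")
```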
Also, @EtienneAb3d, how did you handle the lost spaces issue?
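One possible way to restore the spaces (not necessarily what was done above, just a sketch) is to ignore the corrected string and rebuild the text from the original tokens, swapping in the suggested replacements and reusing each token's original trailing whitespace. The function name `rebuild_text` and its input shapes are assumptions made for this example.

```python
def rebuild_text(tokens, replacements):
    """Rebuild a sentence while keeping the original spacing.

    tokens: list of (text, trailing_whitespace) pairs, in document order.
    replacements: dict mapping a token index to its corrected text.
    """
    parts = []
    for index, (text, whitespace) in enumerate(tokens):
        # Use the correction when there is one, otherwise keep the original token.
        parts.append(replacements.get(index, text) + whitespace)
    return "".join(parts)


# Hand-written example data; with spaCy these would come from the processed doc.
tokens = [("Le", " "), ("chat", " "), ("dort", " "), ("sur", " "),
          ("le", " "), ("canpé", ""), (".", "")]
replacements = {5: "canapé"}
print(rebuild_text(tokens, replacements))
```

With a processed `doc`, the inputs could be built as `[(t.text, t.whitespace_) for t in doc]` and `{t.i: s for t, s in doc._.suggestions_spellCheck.items()}`, assuming the suggestions extension maps spaCy tokens to replacement strings as in the project README.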