Tyler Barrus
Closing for now. If this is still an issue, please re-open!
I am not sure how well this solution works with non-latin scripts. You can see the [documentation](https://github.com/barrust/pyspellchecker/discussions/90) on how to make a dictionary. If you have success in building a...
You are correct, they are "prioritized" based on the number of instances that are found as the more common words are *more likely* to be the correct word (hence why...
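To illustrate the idea of prioritizing candidates by how often they occur, here is a minimal sketch. It is not the library's actual code: the `word_counts` data and the `rank_candidates` helper are hypothetical and only stand in for a real frequency dictionary.

```python
from collections import Counter

# Hypothetical word counts standing in for a real frequency dictionary.
word_counts = Counter({"the": 500, "then": 120, "than": 90, "thew": 1})


def rank_candidates(candidates, counts):
    """Return the candidates ordered from most to least frequent."""
    return sorted(candidates, key=lambda word: counts[word], reverse=True)


# Candidate corrections for some misspelled input (made up for this example).
ranked = rank_candidates({"then", "than", "thew"}, word_counts)
best = ranked[0]  # the most frequent candidate is treated as the most likely
print(ranked, best)
```

A rare word such as "thew" can still be the intended one, which is why frequency only makes a candidate more likely rather than certain.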
The data used to build the dictionaries are pulled from the [opensubtitles](http://www.opensubtitles.org/) project. The build process is automated using [scripts/build_dictionary.py](https://github.com/barrust/pyspellchecker/blob/master/scripts/build_dictionary.py). I don't know of a good method to automatically find...
Thank you for this patch. I am wondering, though, why would this be better than using the `local_dictionary` parameter?

```python
from spellchecker import SpellChecker

spell = SpellChecker(language=None, local_dictionary=filepath)
```
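For anyone who does not yet have a dictionary file to point `local_dictionary` at, the following is a rough sketch of producing one with only the standard library. It assumes the file is a JSON object mapping each word to its count. The `build_word_counts` helper, the sample text, and the file name are all hypothetical.

```python
import json
import re
import tempfile
from collections import Counter
from pathlib import Path


def build_word_counts(text):
    """Count lowercase alphabetic words in the given text."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)


# Made-up sample corpus; a real dictionary needs far more text than this.
sample_text = "The cat sat. The cat ran."
counts = build_word_counts(sample_text)

# Write the word-to-count mapping as plain JSON (assumed file format).
filepath = Path(tempfile.gettempdir()) / "my_dictionary.json"
filepath.write_text(json.dumps(counts, ensure_ascii=False), encoding="utf-8")
```

The resulting path (as a string, `str(filepath)`) could then be passed as `local_dictionary`, provided the assumed format matches what your installed version expects.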
It is likely an issue with the dictionary based on the data source used to build the dictionaries. I am not a French speaker and am not really able to...
You are correct, this library does not take part of speech into account. It is possible to add those entities into the dictionary that you are leveraging so that it...
The data was originally pulled from [opensubtitles.org](https://www.opensubtitles.org/en/search/subs) but was heavily modified and the dictionary itself is part of this code. I did not pull the dictionaries directly from any other...
The `scripts/build_dictionary.py` script lists each item used but the data can be found here: https://opus.nlpl.eu/OpenSubtitles2018.php Per this page, the requirements to use are to: 1) Add the url to http://www.opensubtitles.org/...
You are correct that it would be more complicated but possible. It may make more sense for that to be outside the library itself and more of an as needed...