[feature request] Chinese support
This is the list of supported languages: https://github.com/deepset-ai/haystack/blob/3e6def7e03097021c8efd1b5c277bec6e541c162/haystack/preprocessor/preprocessor.py#L17
Chinese is missing.
@so2liu Thank you for bringing up this feature request. Let's see whether somebody in our open source community with expertise in the Chinese language wants to contribute here.
The line of code where we run `iso639_to_nltk.get(language, language)` shouldn't cause a big problem, though. The dict `iso639_to_nltk` doesn't contain `cn` or `chinese`, but if the key is not found, `.get(language, language)` will return `language`. Have you tried setting `language` to `chinese`?
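To illustrate the fallback, here is a minimal sketch of how `dict.get` behaves with a default. The dict below is an abbreviated stand-in (the real `iso639_to_nltk` has more entries), and `resolve_language` is a hypothetical helper name used only for this example, not something from the Haystack code base.

```python
# Abbreviated stand-in for the preprocessor's mapping; entries are illustrative only.
iso639_to_nltk = {"en": "english", "de": "german", "fr": "french"}


def resolve_language(language: str) -> str:
    # Hypothetical helper: mirrors the `.get(language, language)` lookup.
    # If the key is missing, the input string is returned unchanged.
    return iso639_to_nltk.get(language, language)


print(resolve_language("en"))       # english
print(resolve_language("chinese"))  # chinese (no matching key, passed through as-is)
```

So the lookup itself won't raise for an unknown language. Whether the tokenizer further down accepts the passed-through name is a separate question that this sketch doesn't cover.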