How can I add a new language to snips nlu?

Open gosailing opened this issue 6 years ago • 13 comments

Appreciate this open source nlu.

Do you support Chinese, or do you plan to? If not, what should I do to add support for a new language? I don't see this covered in the docs.

gosailing avatar Mar 12 '18 07:03 gosailing

Hi @gosailing,

For now, adding a language is handled internally: we work with linguists to ensure advanced support for each language. We plan to open language additions to the community, but it's a lot of work to document, so we need some time ;). We've been working on Chinese but have not had time to finish.

ClemDoum avatar Mar 12 '18 18:03 ClemDoum

Are there any plans for Portuguese support?

Vitorbnc avatar Mar 27 '18 22:03 Vitorbnc

For now we have no plan for Portuguese.

@gosailing @Vitorbnc if you want to use the NLU for a language we don't support yet, the main blocker is going to be the builtin entities.

For now we use Rustling and our builtin entities ontology. Both tools are used for on-device inference in Rust.

If you're not interested in the on-device part, you can still bypass the ontology and use your own. Other tools, such as Duckling, can handle builtin entities:

  • replace every import of the form "from snips_nlu_ontology import x, y, z" with "from my_nlu_ontology import x, y, z", where my_nlu_ontology is your own module
  • reimplement the x, y, z functions on your side
  • for instance, you could use Duckling as the BuiltinEntityParser

If you're successful, we'll try to see how we can make the ontology module configurable, so that no one is bound to someone else's ontology.
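
As a rough illustration of the steps above (not taken from the Snips codebase), a replacement module could look like the sketch below. The module name my_nlu_ontology comes from the suggestion above; everything else is an assumption: the parse(text, scope=None) method, the shape of the returned dicts, and the "my/number" label are hypothetical and would have to match whatever interface your version of the NLU actually calls. A regular expression stands in for a real backend such as Duckling.

```python
import re

# Hypothetical contents of my_nlu_ontology.py, a stand-in for the names
# that would otherwise be imported from snips_nlu_ontology.

# Illustrative entity label; NOT a real Snips builtin entity name.
NUMBER_ENTITY = "my/number"

# One regex per entity kind. A real implementation could delegate to
# Duckling (or another parser) instead of using regexes.
_PATTERNS = {
    NUMBER_ENTITY: re.compile(r"\d+(?:\.\d+)?"),
}


class BuiltinEntityParser:
    """Toy replacement parser. Method name and result shape are assumptions."""

    def __init__(self, language):
        self.language = language

    def parse(self, text, scope=None):
        """Return one dict per match, optionally restricted to `scope`."""
        results = []
        for entity_kind, pattern in _PATTERNS.items():
            if scope is not None and entity_kind not in scope:
                continue
            for match in pattern.finditer(text):
                results.append({
                    "value": match.group(),
                    "entity_kind": entity_kind,
                    "range": {"start": match.start(), "end": match.end()},
                })
        return results


# Usage: callers only change their import line to
#   from my_nlu_ontology import BuiltinEntityParser
parser = BuiltinEntityParser("pt")
entities = parser.parse("reserve 2 mesas para 20.5 euros")
```

Keeping the class name identical to the one being replaced means only the import line changes at each call site, which is the point of the first bullet.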

ClemDoum avatar Mar 28 '18 08:03 ClemDoum

Is there a roadmap for supporting new languages?

bellaj avatar Apr 28 '18 09:04 bellaj

Hi @bellaj, as said here, we might add basic support for Mandarin Chinese. For the moment we don't have any new language on our short-term roadmap. Contributions are welcome though, and we're thinking of a way to make them easier ;)

ClemDoum avatar May 02 '18 12:05 ClemDoum

Hello @ClemDoum ... have you looked into the upcoming spaCy 2.1 and its associated Prodigy offering? Perhaps there is a way to include them in the processing pipeline?

bradjonesca avatar May 02 '18 12:05 bradjonesca

I am interested in using Snips NLU with the Sinhala language. Is there any possibility of doing so? With some guidance, I am ready to contribute.

uthpala-era avatar Nov 02 '18 14:11 uthpala-era

> Hi @gosailing,
>
> For now, adding a language is handled internally: we work with linguists to ensure advanced support for each language. We plan to open language additions to the community, but it's a lot of work to document, so we need some time ;).

Could you give an estimate of how long this change might take? I am waiting for it so that I can use Snips NLU for my research.

uthpala-era avatar Nov 10 '18 23:11 uthpala-era

Hi @uthpala-era, I think that the main blocker for the integration of Sinhala in the NLU will be the handling of builtin entities. For now Rustling (our builtin entity parser) doesn't support Sinhala. So unless you can implement it for Sinhala, the support will be limited.

ClemDoum avatar Nov 12 '18 09:11 ClemDoum

Hi @ClemDoum, I'm unable to install the English language resources on Windows 7. After installation, I get the following error: "MissingResource: Language resource 'en' not found. This may be solved by running 'python -m snips_nlu download en'". Could you help me out?

lohitk27 avatar Jan 23 '19 11:01 lohitk27

Hi, I am also interested in adding a new language to Snips NLU. My mother tongue is Czech, and I don't expect it to be a priority for you to implement, so I am ready to try implementing it on my own. Is there any guidance on how to do it? I've already looked over the source code of Rustling and other parts of the NLU here on GitHub, but any documentation, or at least some hints, would be great.

gregorij89 avatar Sep 21 '19 17:09 gregorij89

Is there any possibility of adding the Greek language in the near future?

malamasn avatar Nov 10 '20 15:11 malamasn

Hello there, are there any plans to share documentation on how to add a custom language? I'm mostly interested in the classification side and have no need for the builtin entities. I see some files referring to the top 10,000 words, stemming, and Brown clusters, but it is hard to guess exactly which steps to follow. Thanks in advance, Hicham

hicham17 avatar Mar 19 '21 11:03 hicham17