scattertext
Chinese scattertext
Your Environment
- Operating System:
- Python Version Used:
- Scattertext Version Used:
- Environment Information:
- Browser used (if an HTML error):

Hi,
It seems that in your demo code, developers can use "chinese_nlp" directly from the scattertext package. For plotting Chinese scattertext, I am wondering whether we could supply a list of user-defined stopwords, and perhaps a user-defined dictionary specific to a particular Chinese context, then use jieba to do the word segmentation and feed the cleaned results into your demo program?
Thanks
You could apply a stop list after tokenization by calling corpus.remove_terms(...). Otherwise, feel free to modify AsianNLP.py to fit your use case. It just duck-types spaCy's interface.
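For illustration, here is a minimal sketch of what a duck-typed tokenizer along those lines might look like. It is not the code in AsianNLP.py: the class names (Tok, Sentence, Doc), the helper make_chinese_nlp, and the set of attributes exposed are all assumptions modeled on spaCy's token attributes, so check AsianNLP.py for what scattertext actually reads. The segmenter is passed in as a plain function, so jieba (with a user dictionary loaded) can be plugged in without changing the sketch.

```python
import re

class Tok:
    """Hypothetical spaCy-like token; attribute names follow spaCy's Token."""
    def __init__(self, text):
        self.orth_ = text
        self.lower_ = text.lower()
        self.lemma_ = text.lower()
        self.pos_ = ''        # no POS tagging in this sketch
        self.tag_ = ''
        self.ent_type_ = ''

    def __str__(self):
        return self.orth_

class Sentence:
    """Hypothetical sentence wrapper: iterable over its tokens."""
    def __init__(self, toks, raw):
        self.toks = toks
        self.text = raw

    def __iter__(self):
        return iter(self.toks)

    def __str__(self):
        return self.text

class Doc:
    """Hypothetical document wrapper exposing .sents like a spaCy Doc."""
    def __init__(self, sents, raw):
        self.sents = sents
        self.text = raw

    def __iter__(self):
        for sent in self.sents:
            yield from sent

    def __str__(self):
        return self.text

# A run of non-terminal characters followed by any sentence-final punctuation.
_SENTENCE_RE = re.compile(r'[^。！？!?]+[。！？!?]*')

def make_chinese_nlp(segment, stopwords=()):
    """Build an nlp(text) callable from a segmenter such as jieba.lcut.

    Stopwords are dropped at tokenization time, which also affects any
    n-grams built later; use corpus.remove_terms(...) instead if you
    prefer to filter after the corpus is built.
    """
    stop = set(stopwords)

    def nlp(text):
        sents = []
        for raw in _SENTENCE_RE.findall(text):
            raw = raw.strip()
            if not raw:
                continue
            toks = [Tok(w) for w in segment(raw)
                    if w.strip() and w not in stop]
            sents.append(Sentence(toks, raw))
        return Doc(sents, text)

    return nlp

# Possible usage (not run here; the file path and column names are made up):
# import jieba, scattertext as st
# jieba.load_userdict('my_dict.txt')
# nlp = make_chinese_nlp(jieba.lcut, stopwords={'的', '了'})
# corpus = st.CorpusFromPandas(df, category_col='category',
#                              text_col='text', nlp=nlp).build()
```

Whether this object is enough for every scattertext feature extractor is untested; if something complains about a missing attribute, add it to Tok or Doc in the same style.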