stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
I am trying to convert the pre-trained models into ONNX format. I am following the explanation at https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html and created a fork of stanza for this experiment here...
I am trying to find an equivalent functionality to the CoreferenceResolution() function that is part of pycorenlp in the stanza library. Is stanza capable of coreference resolution, beyond simply the...
I need to save in a **JSON** file the analysis of a text and then read it from the file as a [Document](https://github.com/stanfordnlp/stanza/blob/f91ca215e175d4f7b202259fe789374db7829395/stanza/models/common/doc.py#L62) object. I see there are some methods...
Hi, I use CoreNLP's NER pipeline through Stanza's CoreNLPClient. I found that CoreNLP can provide the confidence of an entity, as described [here](https://stanfordnlp.github.io/CoreNLP/ner.html#ner-pipeline-overview). But it does not show up in `client.annotate(text).mentions`. So...
It doesn't appear that MWT runs properly when supplied with pretokenized text!
**Describe the bug** In German, ordinal numbers have a dot after the number: * On the 1st day --> Am 1. Tag * The 23rd item --> Der 23. Eintrag...
There is currently a recommendation to concatenate documents to improve speed, but there is almost no information about how to determine the optimal size of the input depending on the...
I am incredibly excited by NLP tools working together and integrating. HuggingFace's model hub is a nice central environment to keep track of models of all kinds - not even...
I have a use case where I need to know, for every token, whether it has a space before or after it. I cannot find such information in the documentation and from...
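One possible way to approach the whitespace question, sketched under assumptions: if each token carries character offsets into the original text (Stanza tokens expose `start_char` and `end_char`), the surrounding whitespace can be read straight from the raw string. The helper name `whitespace_flags` and the hand-written spans below are hypothetical and only for illustration; this is not part of the stanza API.

```python
from typing import List, Tuple


def whitespace_flags(text: str, spans: List[Tuple[int, int]]) -> List[Tuple[bool, bool]]:
    """For each (start, end) character span, return (space_before, space_after).

    Hypothetical helper: the spans are assumed to be character offsets
    into `text`, with `end` exclusive.
    """
    flags = []
    for start, end in spans:
        # Whitespace directly before the token (never true at the start of the text).
        space_before = start > 0 and text[start - 1].isspace()
        # Whitespace directly after the token (never true at the end of the text).
        space_after = end < len(text) and text[end].isspace()
        flags.append((space_before, space_after))
    return flags


# Hand-written offsets standing in for tokenizer output (an assumption).
text = "Hello, world!"
spans = [(0, 5), (5, 6), (7, 12), (12, 13)]
flags = whitespace_flags(text, spans)
```

With a real pipeline, the spans would presumably be built from each token's `start_char` and `end_char` instead of being typed by hand.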
Hi, first of all, thanks for this awesome work! I want to get a constituency parse tree for Hindi sentences, and I am not able to find any resource for that...