python-gatenlp
Python text processing, pattern matching, and NLP framework
Consider:
* AnnotationSet is an abstract base class
* DetachedAnnotationSet is the class used for detached sets
* AttachedAnnotationSet is the class used for in-document sets

We could just use...
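One way the three-class split could look is sketched below. This is only an illustration of the proposal, not the library's current code: the constructor arguments, the `_spans` storage, and the `add`, `detach`, and `is_detached` members are all hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class AnnotationSet(ABC):
    """Hypothetical abstract base holding behaviour shared by both kinds of set."""

    def __init__(self, name=""):
        self.name = name
        # Illustration only: annotations stored as (start, end, type) tuples.
        self._spans = []

    def __len__(self):
        return len(self._spans)

    def __iter__(self):
        return iter(self._spans)

    @property
    @abstractmethod
    def is_detached(self):
        """True if the set is not owned by a document."""


class DetachedAnnotationSet(AnnotationSet):
    """Immutable snapshot that is not tied to a document (assumed semantics)."""

    def __init__(self, spans, name=""):
        super().__init__(name)
        self._spans = list(spans)

    @property
    def is_detached(self):
        return True


class AttachedAnnotationSet(AnnotationSet):
    """Mutable set that lives inside a document (assumed semantics)."""

    def __init__(self, owner, name=""):
        super().__init__(name)
        self.owner = owner  # the owning document; any object in this sketch

    @property
    def is_detached(self):
        return False

    def add(self, start, end, anntype):
        self._spans.append((start, end, anntype))

    def detach(self):
        # Copy the current content into a set without an owner.
        return DetachedAnnotationSet(self._spans, self.name)
```

With this layout, code that only reads annotations can accept any `AnnotationSet`, while mutating methods exist only on the attached class.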
Check if we could return the same match objects for all gazetteers (Token, String, Regex). Also check if the match object could just contain the fields start, end, data, type,...
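A minimal sketch of what a shared match object might look like, assuming only the four fields named above (the rest of the list is cut off, so nothing further is guessed). `GazetteerMatch`, `regex_matches`, and `string_matches` are hypothetical names for this example, not existing gatenlp API.

```python
import re
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GazetteerMatch:
    """Hypothetical match object shared by all gazetteer kinds."""
    start: int
    end: int
    data: Any = None
    type: str = ""


def regex_matches(pattern, text, anntype="Lookup"):
    """Wrap regex hits in the shared match type; data is the matched string."""
    return [
        GazetteerMatch(m.start(), m.end(), m.group(0), anntype)
        for m in re.finditer(pattern, text)
    ]


def string_matches(entries, text, anntype="Lookup"):
    """Naive substring lookup; `entries` maps each string to its data."""
    found = []
    for entry, data in entries.items():
        if not entry:
            continue  # skip empty strings, which would match everywhere
        pos = text.find(entry)
        while pos != -1:
            found.append(GazetteerMatch(pos, pos + len(entry), data, anntype))
            pos = text.find(entry, pos + 1)
    return sorted(found, key=lambda m: (m.start, m.end))
```

Because both functions return the same type, downstream code such as an action that creates annotations would not need to know which gazetteer produced the match.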
Similar to GATE's segment processing PR.
Currently a fixed result or match must be specified, but sometimes it would be useful to perform the action for all results, all matches, or a selection of results...
The constructor is very complex right now. We need some way to specify/do all the things that can be done or decided at init time in a way that is...
As in the stringannotation plugin, but more flexible: one or more texts, based on split annotations; optionally insert separator characters, insert conditionally, or insert a value computed by a lambda.
This has been partly done for old-format files. Need to figure out how this changes for new-style result JSON. See also https://github.com/twitterdev/Twitter-API-v2-sample-code
(See https://github.com/opensearch-project/OpenSearch) Make sure the Python elasticsearch packages are fully compatible with OpenSearch. Destination:
* handles how Document instances are stored and indexed.
* always store the actual document...
This is currently not implemented in the same way for all annotators and there should be a standard way for how to configure and do this:
* each annotator (and...
If the ConllU format does not have space hinting in the misc column and no original text comment, we could still use some simple heuristics to make the resulting text...
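As a rough illustration of such a heuristic, the sketch below joins token strings and suppresses the space before common closing punctuation and after opening brackets. The function name and the two character sets are assumptions made for this example; a real implementation would need language-specific rules (quotes, clitics, and so on).

```python
# Characters that normally attach to the preceding token (assumed set).
NO_SPACE_BEFORE = set(".,;:!?)]}%")
# Characters that normally attach to the following token (assumed set).
NO_SPACE_AFTER = set("([{")


def join_tokens(tokens):
    """Rebuild text from token strings when no spacing hints are available."""
    parts = []
    prev = None
    for tok in tokens:
        needs_space = (
            prev is not None
            and tok not in NO_SPACE_BEFORE
            and prev not in NO_SPACE_AFTER
        )
        if needs_space:
            parts.append(" ")
        parts.append(tok)
        prev = tok
    return "".join(parts)


print(join_tokens(["Hello", ",", "world", "(", "again", ")", "!"]))
# prints: Hello, world (again)!
```

Token offsets can be tracked while joining, so the annotations created from the file still line up with the reconstructed text.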