Mark Sammons comments

Results: 25 comments


Mark Sammons

Sentence annotator don't work when Period don't follow by a space.

This is really a tokenization/sentence-splitter issue: the sentence annotator relies on the boundaries that the tokenizer provides.
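To illustrate the dependency described above, here is a minimal sketch of a whitespace-based tokenizer feeding a sentence splitter. `NaiveSplitter` and its methods are hypothetical names written for this illustration; they are not the cogcomp-nlp tokenizer or annotator, and the real components are more sophisticated. The point is only that when the tokenizer never emits a separate period token, the splitter has no boundary to work with.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT the cogcomp-nlp tokenizer or sentence annotator.
final class NaiveSplitter {
    private NaiveSplitter() {}

    // Splits on whitespace and detaches a period only when it ends a chunk.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String chunk : text.trim().split("\\s+")) {
            if (chunk.isEmpty()) {
                continue; // empty or all-whitespace input
            }
            if (chunk.length() > 1 && chunk.endsWith(".")) {
                tokens.add(chunk.substring(0, chunk.length() - 1));
                tokens.add(".");
            } else {
                tokens.add(chunk);
            }
        }
        return tokens;
    }

    // Closes a sentence after every standalone "." token.
    static List<List<String>> sentences(List<String> tokens) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            current.add(token);
            if (token.equals(".")) {
                result.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            result.add(current); // trailing text without a final period
        }
        return result;
    }
}
```

With this sketch, "It ended. Next one." yields two sentences, while "It ended.Next one." yields one, because "ended.Next" stays a single token and the period inside it is never seen by the splitter.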

Pipeline (Tokenizer) has issues with non-UTF-8 characters

We have some cleanup code for this kind of problem:
https://github.com/CogComp/cogcomp-nlp/blob/master/core-utilities/src/main/java/edu/illinois/cs/cogcomp/core/utilities/TextCleanerStringTransformation.java
https://github.com/CogComp/cogcomp-nlp/blob/master/core-utilities/src/main/java/edu/illinois/cs/cogcomp/core/utilities/StringTransformationCleanup.java
If these don't cover such cases, this is where the fixes should be added. We could, by default,...
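As a rough sketch of the kind of pre-tokenization cleanup being discussed, the following uses only the Java standard library. `InputSanitizer` and its methods are hypothetical and are not the API of the two classes linked above; which characters to replace is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; NOT part of cogcomp-nlp.
final class InputSanitizer {
    private InputSanitizer() {}

    // Decodes bytes as UTF-8, substituting a space for each malformed sequence
    // instead of failing.
    static String decodeLenient(byte[] raw) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE)
                    .replaceWith(" ")
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but decode() declares the checked exception.
            throw new IllegalStateException(e);
        }
    }

    // Replaces ASCII control characters (except tab, CR, LF) with a space,
    // one for one, so character offsets into the string are unchanged.
    static String stripControl(String text) {
        return text.replaceAll("[\\p{Cntrl}&&[^\\r\\n\\t]]", " ");
    }
}
```

Replacing each offending character with a single space, rather than deleting it, keeps the cleaned string the same length as the input, which matters if later views store character offsets.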

Mark Sammons

Sentence annotator don't work when Period don't follow by a space.

Pipeline (Tokenizer) has issues with non-UTF-8 characters

How to pass parameter [on runtime] to a specific annotator of the AnnotatorService?

Why two ViewNames in json files?

Issue 665

String.join includes an extra copy of separator at the end of the string.

no helpful error message if model file not found