dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
Source: https://github.com/dkpro/dkpro-core/issues/619#issuecomment-280954963 Having in mind that your Croatian example is bad Croatian in the first place, the correct sentence would be something like this: > moramo odraditi vrlo kompliciran primjer...
For some sentences, CoreNlpParser throws an exception: ``` java.lang.RuntimeException: No roots in graph: dep reln gov --- ---- --- Find where this graph was created and make sure you're adding...
The TokenCaseTransformer seems to have a performance issue when a large number of token annotations have been changed. My case for reproduction: comparing lowercasing on two files (1) 1,000...
TigerXmlReader produces wrong begin and end index for target (SemPred) of a semantic frame when the target is noncontiguous. For instance in the following sentence: `w1 w2 w3 w4 w5...
The MateParser and MatePosTagger modules record internal tags in the tagset. Those tags are never actually produced by the parser, so they should not be recorded.
The English models for MaltParser use an input POS tag `PRT` which does not exist as a POS tag in the [Penn Treebank Tagset](http://www.clips.ua.ac.be/pages/mbsp-tags). `PRT` is actually a chunk tag...
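To make the mismatch concrete, here is a minimal sketch that checks a tag against the 36 word-level POS tags of the Penn Treebank tagset. The class name `PennTreebankTags` and the method `isPosTag` are hypothetical helpers for illustration and are not part of DKPro Core or MaltParser. In the Penn Treebank tagset, particles are tagged `RP`, which is why `PRT` fails the check.

```java
import java.util.Set;

// Hypothetical helper for illustration; not part of DKPro Core or MaltParser.
class PennTreebankTags {
    // The 36 word-level POS tags of the Penn Treebank tagset (punctuation tags omitted).
    static final Set<String> POS_TAGS = Set.of(
            "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "LS",
            "MD", "NN", "NNS", "NNP", "NNPS", "PDT", "POS", "PRP", "PRP$",
            "RB", "RBR", "RBS", "RP", "SYM", "TO", "UH", "VB", "VBD", "VBG",
            "VBN", "VBP", "VBZ", "WDT", "WP", "WP$", "WRB");

    // Returns true if the given tag is one of the word-level Penn Treebank POS tags.
    static boolean isPosTag(String tag) {
        return POS_TAGS.contains(tag);
    }

    public static void main(String[] args) {
        System.out.println("PRT is a PTB POS tag: " + isPosTag("PRT"));
        System.out.println("RP is a PTB POS tag: " + isPosTag("RP"));
    }
}
```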
LanguageToolSegmenter chokes on "丁肇中": ``` Caused by: java.lang.IllegalStateException: Token [丁中] not found in sentence [丁肇中] at de.tudarmstadt.ukp.dkpro.core.languagetool.LanguageToolSegmenter.process(LanguageToolSegmenter.java:90) ```
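The error message suggests that the token text lost a character (`肇`) before it was matched against the sentence. The sketch below reproduces that failure mode under the assumption that tokens are located by searching for their text in the sentence from left to right. The class `TokenAligner` and its `align` method are hypothetical; the actual code at `LanguageToolSegmenter.java:90` is not shown in the report and may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the DKPro Core implementation.
class TokenAligner {
    // Returns the [begin, end) offsets of each token, searching the sentence left to right.
    static List<int[]> align(String sentence, List<String> tokens) {
        List<int[]> spans = new ArrayList<>();
        int cursor = 0;
        for (String token : tokens) {
            int begin = sentence.indexOf(token, cursor);
            if (begin < 0) {
                // Same shape of message as in the reported stack trace.
                throw new IllegalStateException(
                        "Token [" + token + "] not found in sentence [" + sentence + "]");
            }
            int end = begin + token.length();
            spans.add(new int[] { begin, end });
            cursor = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        // Tokens whose text matches the sentence align without problems.
        for (int[] span : align("丁肇中", List.of("丁", "肇", "中"))) {
            System.out.println(span[0] + "-" + span[1]);
        }
        // A token that has lost a character cannot be found in the sentence.
        try {
            align("丁肇中", List.of("丁中"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this assumption holds, the lookup itself behaves correctly and the defect lies upstream, where the token text is altered before alignment.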
Some problem here: ``` Caused by: java.lang.NullPointerException at de.tudarmstadt.ukp.dkpro.core.stanfordnlp.util.TreeWithTokens.setTree(TreeWithTokens.java:54) at de.tudarmstadt.ukp.dkpro.core.stanfordnlp.util.TreeWithTokens.<init>(TreeWithTokens.java:48) at de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordParser.process(StanfordParser.java:407) at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385) ... 13 more ``` Apparently the tree object returned by the parser can...
Since the Stanford CoreNLP parser now supports dependency conversion using either the original Stanford Dependencies or the Universal Dependencies, we must check if the dependency tagset recording for English still properly...
The binaries for mecab are packaged as models, but they should be packaged as binaries and have a corresponding artifactId etc. Cf. other packages that use native binaries, e.g. TreeTagger,...