dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
We want to be able to add further JCas annotations (e.g. lemmas and named entities) as specified in the annotated gigaword documents.
The current LIF reader/writer not only converts between CAS and LIF, but also performs a schema mapping between the DKPro Core type system and the LAPPS WSEV. It would be...
- [ ] Support attributes (see https://github.com/pubannotation/pubannotation/issues/8)
- [ ] Support exporting/importing relation layers
While DKPro has UIMA types for Cardinal and Ordinal, it seems there are no annotators that can produce them. So I implemented my own CardOrdAnnotator for English based on the...
Swedish POS tagger that uses the Stockholm-Umeå Corpus and tag set (http://www.ling.su.se/english/nlp/tools/stagger). It could then be used with MaltParser and a model that uses the same corpus and tag set. Mostly...
Since we lack non-GPL lemmatizers (ok, we have the one from ClearNLP now), it may be a good idea to integrate the BioLemmatizer (http://biolemmatizer.sourceforge.net). The license appears to...
http://www.llc.manchester.ac.uk/research/projects/germanc/files/
- The "rawcorpus" version could be added as a dataset
- The "TEIcorpus" version could be added as a dataset, but we'd have to check if our TeiReader actually...
https://github.com/EuropeanaNewspapers/ner-corpora
If you run ReadBrat.java in the attached Maven project, you get this error: `Exception in thread "main" java.lang.IllegalStateException: Unknown annotation format: [N1 Reference T1 Wikipedia:Q95 Google]` Yet, the annotation in...
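For context on the rejected line: in the brat standoff format, an ID starting with `N` marks a normalization annotation, which links a target annotation (here `T1`) to an entry in an external resource (here `Wikipedia:Q95`). The following is a minimal sketch of how such a line could be split into its fields. It is not the DKPro Core brat reader's code; the class name `BratNormalizationSketch`, the `parse` helper, and the regular expression are hypothetical and assume whitespace-separated fields as shown in the error message.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of DKPro Core.
public class BratNormalizationSketch {
    // Assumed layout: ID, type ("Reference"), target ID, RESOURCE:ENTRY, optional text.
    private static final Pattern NORMALIZATION = Pattern.compile(
            "(N\\d+)\\s+(\\S+)\\s+(\\S+)\\s+([^:\\s]+):(\\S+)(?:\\s+(.*))?");

    // Returns {id, type, target, resource, entry, text}; text is empty if absent.
    public static String[] parse(String line) {
        Matcher m = NORMALIZATION.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalStateException("Unknown annotation format: [" + line + "]");
        }
        String text = m.group(6) == null ? "" : m.group(6);
        return new String[] {
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5), text
        };
    }

    public static void main(String[] args) {
        // The line from the error message, with a tab after the ID and before the text.
        String[] fields = parse("N1\tReference T1 Wikipedia:Q95\tGoogle");
        System.out.println(String.join(" | ", fields));
    }
}
```

A reader that recognises only text-bound, relation, event, attribute, and note lines would have no branch for the `N` prefix, which is one plausible reason for the exception above.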
Change explicit handling of `ROOT` element in constituent and dependency parser components if mapping is disabled. This is a follow-up to #1317