
Format extension: incorporating annotator notes?

Open · nschneid opened this issue 5 years ago · 1 comment

The version of STREUSLE in Xposition contains some annotator notes on P tokens that are not included in the official release. The notes can help clarify the interpretation of the text, provide the annotator's rationale, or help cluster different usages at a finer level of granularity than the supersenses.

Should the .conllulex format have a place for these? An extra column? Or maybe a sentence header row, as they are rare?
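To make the sentence-header option concrete, here is a minimal sketch of how a reader might pull token notes out of header rows. The `# tnote[<token id>] = <text>` syntax is purely hypothetical (it is not part of any released .conllulex format), as are the function name and the example sentence; the token rows are truncated to a few columns for illustration.

```python
import re

# Hypothetical header syntax: "# tnote[<token id>] = <note text>".
# This is an assumption for illustration, not an existing convention.
NOTE_HEADER = re.compile(r"^#\s*tnote\[(\d+)\]\s*=\s*(.*)$")


def read_token_notes(sentence_block: str) -> dict:
    """Map token id -> note text for one sentence block."""
    notes = {}
    for line in sentence_block.splitlines():
        match = NOTE_HEADER.match(line)
        if match:
            notes[int(match.group(1))] = match.group(2).strip()
    return notes


# Invented example sentence; real rows would carry all the format's columns.
example = "\n".join([
    "# sent_id = example-001",
    "# text = She went to the store",
    "# tnote[3] = goal reading; compare 'went into'",
    "1\tShe\tshe\tPRON",
    "2\twent\tgo\tVERB",
    "3\tto\tto\tADP",
])

token_notes = read_token_notes(example)
```

A header row keeps the token rows unchanged, so existing readers that ignore unknown `#` lines would presumably keep working, whereas an extra column would change the width of every row for the sake of a rare annotation.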

Should there also be a standard for releasing rich annotation history metadata (such as who annotated which token, original vs. adjudicated annotations, timestamps, ...)?

nschneid, Aug 25 '19 03:08

Maybe notes should be in a standoff TSV format (similar to tquery.py output) that gets ingested into the JSON?

Distinguish token notes (tnote), lexical expression notes (lnote), sentence notes (snote)?

Allow notes for arbitrary subsets of a sentence's tokens (e.g. "this was considered but rejected as an MWE")?
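A rough sketch of what the standoff option could look like, assuming a four-column TSV (sentence ID, note type, comma-separated token IDs, note text) and assuming the JSON is a list of sentence objects that each carry a `sent_id`. The column layout, the `notes` key, and the function name are all hypothetical. A list of token IDs covers single tokens, lexical expressions, and arbitrary subsets alike; a sentence note leaves the list empty.

```python
import csv
import io

NOTE_TYPES = {"tnote", "lnote", "snote"}


def ingest_notes(sentences, tsv_text):
    """Attach standoff notes to sentence dicts under a hypothetical "notes" key."""
    by_id = {sent["sent_id"]: sent for sent in sentences}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        sent_id, note_type, tok_ids, text = row
        if note_type not in NOTE_TYPES:
            raise ValueError(f"unknown note type: {note_type}")
        # An empty token field (sentence note) yields an empty list.
        toks = [int(t) for t in tok_ids.split(",") if t]
        by_id[sent_id].setdefault("notes", []).append(
            {"type": note_type, "toks": toks, "text": text}
        )
    return sentences


# Invented data for illustration only.
sentences = [{"sent_id": "example-001", "toks": []}]
standoff = (
    "example-001\ttnote\t3\tgoal reading\n"
    "example-001\tlnote\t2,3\tthis was considered but rejected as an MWE\n"
    "example-001\tsnote\t\tsentence is ironic\n"
)
annotated = ingest_notes(sentences, standoff)
```

Keeping the notes in a separate file means the .conllulex columns stay as they are, and the notes only surface in the JSON for consumers that want them.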

nschneid, Sep 01 '19 13:09