
Annotation of whitespace

Open rbawden opened this issue 3 years ago • 4 comments

Is your feature request related to a problem? Please describe.

I would like to do text normalisation by annotating spans of text with the normalised version and a label/comment. I would like to be able to annotate whitespace too, either as part of a span or by itself, but I am not currently able to do this, even though I have selected the annotation layer to apply to characters rather than tokens (I have checked and I can annotate parts of words and parts of several words in a single span).

Describe the solution you'd like

I would like to be able to select whitespace for annotation, or select a span of text containing whitespace at the beginning or the end: e.g. in "Gonna take awhile ,", I would like to be able to select the space after 'awhile' on its own, as well as the space together with the following comma.

Describe alternatives you've considered

There are not really any suitable alternatives.

Additional context

When I try to select the comma and the preceding whitespace, only the comma is selected.

rbawden, Dec 17 '21 11:12

Allowing inter-token annotations should in principle be doable, e.g. for the HTML-based editor. But documents annotated in such a way would, I believe, not work with the brat-based editor. Also, the annotations could only be exported as CAS XMI and not in any other format.

reckart, Dec 17 '21 11:12

Thanks for your speedy reply! That's good to know!

I have imported my document as if it were HTML and was able to select adjoining whitespace successfully. I was also able to select whitespace on its own for annotation. The only problem is that such a span is not highlighted like the rest of the text, so it is not very visible to the annotators. You can still edit it by using the arrows to click through the annotations. Is this something that could be easily changed?

Thank you again for your help!

rbawden, Dec 17 '21 11:12

When I tried selecting whitespace only in HTML mode, I got a message like "Cannot create span: no tokens at position XXX-YYY". Which version are you using?
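As a rough illustration of why a whitespace-only selection might produce a message like this, the Python sketch below snaps a character selection to the tokens it overlaps. It is not INCEpTION's actual code: the function names `tokenize` and `snap_to_tokens` and the whitespace-based tokenization are assumptions made purely for this example.

```python
import re
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (begin, end) character offsets, end exclusive


def tokenize(text: str) -> List[Span]:
    """Return offsets of maximal non-whitespace runs (an assumed tokenization)."""
    return [m.span() for m in re.finditer(r"\S+", text)]


def snap_to_tokens(tokens: List[Span], begin: int, end: int) -> Optional[Span]:
    """Adjust a selection to the tokens it overlaps; None if it overlaps none."""
    covered = [(b, e) for b, e in tokens if b < end and e > begin]
    if not covered:
        return None  # comparable to "no tokens at position begin-end"
    return covered[0][0], covered[-1][1]


text = "Gonna take awhile ,"
tokens = tokenize(text)

# Selecting the space plus the comma (offsets 17-19) keeps only the comma.
space_and_comma = snap_to_tokens(tokens, 17, 19)

# Selecting only the space (offsets 17-18) overlaps no token at all.
space_only = snap_to_tokens(tokens, 17, 18)
```

Under these assumptions, the first selection shrinks to the comma alone, which matches the behaviour reported in the issue, and the second yields no span because no token lies inside it. An editor that keeps the raw character offsets instead of snapping would preserve the whitespace in both cases.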

reckart, Dec 17 '21 15:12

Sorry for my delayed reply! I am using the following version: 21.4 (2021-12-16 12:06:42, build 357dfa0c)

rbawden, Jan 05 '22 10:01