AnnotatedString displays as a regular String, annotations aren't shown
AnnotatedStrings, to be released in 1.11, are going to be convenient for various NLP tasks that involve annotating strings! What surprised me is that such "annotated strings" are displayed as regular Strings, showing neither the annotations nor any indication that they are present:
julia> Base.AnnotatedString("good bad", [(1:4, :sentiment => +1), (6:8, :sentiment => -1)])
"good bad"
Presumably, if someone adds annotations to a string as part of the processing, they want to see those annotations...
Any suggestions on how to print them nicely (or at least somehow) automatically? Maybe the default display should include them?
My main concern with printing them by default is that it could be a bit messy, but I'm very much happy to explore some options.
Currently, you can get the string + annotations printed with 2-arg show, e.g.:
julia> str = Base.AnnotatedString("good bad", [(1:4, :sentiment => +1), (6:8, :sentiment => -1)])
"good bad"
julia> show(str)
Base.AnnotatedString{String}("good bad", [(1:4, :sentiment => 1), (6:8, :sentiment => -1)])
For reference, this is what can happen with a big AnnotatedString (156 annotations):
I don't have a completely thought-out suggestion, but I imagined an ideal display to look like:
julia> Base.AnnotatedString("good bad", [(1:4, :sentiment => +1), (6:8, :sentiment => -1), (3:6, :something => 123)])
"good bad"
Annotated with:
"good" (1:4): sentiment => 1
"od b" (3:6): something => 123
"bad" (6:8): sentiment => -1
Of course, there should be some limits on the number of displayed annotations – like when displaying large arrays Julia doesn't print them fully.
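A minimal sketch of the display imagined above, including a cap on the number of annotations printed. It deliberately takes a plain string and a vector of `(range, label => value)` tuples, the same shape used in the examples in this thread, rather than calling any AnnotatedString accessor, since the annotation API may differ between Julia versions. The name `describe_annotations` and its `limit` keyword are hypothetical, not part of Base.

```julia
# Hypothetical helper (not part of Base): print a string followed by a
# capped list of its annotations, one per line.
function describe_annotations(io::IO, text::AbstractString, annots; limit::Integer = 10)
    show(io, text)                       # the quoted string itself
    isempty(annots) && return nothing
    print(io, "\nAnnotated with:")
    # Sort by region start, then end, so overlapping regions appear in reading order.
    ordered = sort(collect(annots); by = a -> (first(a[1]), last(a[1])))
    for (region, (label, value)) in Iterators.take(ordered, limit)
        print(io, "\n")
        show(io, text[region])           # assumes `region` holds valid string indices
        print(io, " (", region, "): ", label, " => ", value)
    end
    hidden = length(ordered) - limit
    hidden > 0 && print(io, "\n(", hidden, " more)")
    return nothing
end

annots = [(1:4, :sentiment => +1), (6:8, :sentiment => -1), (3:6, :something => 123)]
describe_annotations(stdout, "good bad", annots)
```

Sorting by region is one possible design choice; keeping the original insertion order would be just as defensible if annotation order carries meaning.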
What surprised me the most is that the default display doesn't indicate that there are any annotations whatsoever, while they are presumably important to the user.
We could potentially implement a 3-arg show that changes the printing to indicate that a string is annotated. Perhaps we could/should also show some annotations in a non-compact 3-arg show too? Not sure, I'd be interested in canvassing some more opinions.
Not too familiar with the 2/3-arg show() difference. What exactly would it mean for users? When would they see it?
My main concern is that the current default display doesn't indicate the presence of annotations in any way. It can be very confusing for beginners: imagine "why does your package's annotate_parts_of_speech(string) just give me my string back?" kind of questions.
The default display users see for annotated strings should at least hint that annotations are present, right?
I'm onboard with this, the question is just how to best do so.
The simplest way would just be to add any hint that annotations are present:
julia> AnnotatedString("abc", [...])
AnnotatedString: "abc"
I understand that showing the annotations' content opens more questions: how many to show, under what conditions, etc. But even just a note that the string is annotated would help users understand that it's not just a regular string but contains more data. It would definitely make "why does your package's annotate_parts_of_speech(string) just give me my string back?" kind of questions less likely to arise :)
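
To make the 2-arg vs 3-arg show distinction discussed in this thread concrete, here is a minimal sketch of the hint-only option. It uses a hypothetical wrapper type, `TaggedString`, as a stand-in, so it does not touch Base.AnnotatedString or claim anything about how Base would implement this. The 2-arg method is what `show(x)`, `repr(x)`, and container elements go through; the 3-arg `MIME"text/plain"` method is what the REPL uses when it displays a top-level value.

```julia
# Hypothetical wrapper type standing in for an annotated string. It is not
# Base.AnnotatedString; it only illustrates how the two show methods differ.
struct TaggedString
    text::String
    tags::Vector{Pair{Symbol,Any}}
end

# 2-arg show: used by `show(x)`, `repr(x)`, and for elements inside containers.
function Base.show(io::IO, s::TaggedString)
    print(io, "TaggedString(")
    show(io, s.text)
    print(io, ", ")
    show(io, s.tags)
    print(io, ")")
end

# 3-arg show: used when the REPL displays a value on its own.
# Here it prints only a hint that extra data is attached.
function Base.show(io::IO, ::MIME"text/plain", s::TaggedString)
    print(io, "TaggedString: ")
    show(io, s.text)
    n = length(s.tags)
    print(io, " (", n, n == 1 ? " tag" : " tags", ")")
end

s = TaggedString("abc", Pair{Symbol,Any}[:sentiment => 1])
full = repr(s)                        # goes through the 2-arg method
hint = repr(MIME("text/plain"), s)    # goes through the 3-arg method
```

With both methods defined, a user evaluating `s` at the REPL would see the short hinted form, while `show(s)` would still print the full constructor-like form.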