AnnotatedStrings: unclear what operations support/preserve annotations

Open aplavin opened this issue 1 year ago • 3 comments

Not sure whether I really understand the logic: where annotations are supported and where they aren't. Would be nice to document it... Some of these may be bugs, but hard to tell without knowing the intentions.

julia> a = Base.AnnotatedString("abc", [(2:3, :x => 123)])
julia> av = view(a, 1:2)

1:

# annotations present in both a and av...
julia> Base.annotations(a)
1-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (2:3, :x => 123)

julia> Base.annotations(av)
1-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (2:2, :x => 123)

# ... but not when putting them into an array
julia> Base.annotations.([a,av])
2-element Vector{Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}}:
 [(2:3, :x => 123)]
 []

2:

# regex matching returns an annotated match (no annotations here, since the first match "a" lies outside 2:3)...
julia> first(eachmatch(r"\w", a)).match |> Base.annotations
Tuple{UnitRange{Int64}, Pair{Symbol, Any}}[]

# ... but not always
julia> first(eachmatch(r"\w", av)).match |> Base.annotations
ERROR: MethodError: no method matching annotations(::SubString{String})

# and sometimes matching doesn't work at all:
julia> match(r"\w", av)
ERROR: ArgumentError: regex matching is only available for the String and AnnotatedString types; use String(s) to convert
# match() actually supports substrings, so I guess the error message is wrong?

3:

# replace() works, but silently drops annotations:
julia> replace(a, a=>a) |> Base.annotations
ERROR: MethodError: no method matching annotations(::String)

4:

# join() preserves annotations ...
julia> join([a, av]) |> Base.annotations
1-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (2:3, :x => 123)

# ... but string() concatenation doesn't, while being conceptually the same operation
julia> string(a, av) |> Base.annotations
ERROR: MethodError: no method matching annotations(::String)

# interpolation also drops annotations:
julia> "x $a $av" |> Base.annotations
ERROR: MethodError: no method matching annotations(::String)

Aug 07 '24 06:08 aplavin

Regarding join, see also https://github.com/JuliaLang/julia/issues/55382

Aug 08 '24 00:08 ararslan

Perhaps with functions that have explicitly added support, it may be nice to add a comment like "This function preserves any annotations (see StyledStrings.AnnotatedString)"?

Aug 11 '24 14:08 tecosaur

This would probably help with the replace case (3 in the first post), where annotation support is simply not implemented. But for cases 1, 2, and 4, I don't see the logic behind the current behavior. E.g., eachmatch sometimes preserves annotations and sometimes drops them...

Aug 27 '24 10:08 aplavin

The behavior is the same in the actual 1.11 release. At least some of these appear to be bugs and not just incomplete docstrings – correct me if I'm wrong. E.g., from the first post:

  • #1 – gathering annotated strings into an array affects their annotations
  • #2 – the error message says "ArgumentError: regex matching is only available for the String and AnnotatedString types", which definitely isn't the case. match is applicable to SubStrings as well as to AbstractStrings from packages (e.g. StringViews). It's just not applicable to annotated SubStrings.
  • #4 – an inconsistency, but a pretty major and surprising one: the first recommended way to concatenate strings (string(...)) drops annotations, while the second (*(...)) preserves them.

Oct 08 '24 17:10 aplavin

Just quickly:

  1. This is weird. I had a glance at it and it wasn't obvious what was going on. A deeper look will happen when I have time.
  2. I think an extra method here might make this work consistently for substrings of String|AnnotatedString{String}.
  3. (Re #4) Yea, this came up in a triage call a while back. It's possible to adjust string(...) so that annotations are preserved, but IIRC the sticking point is this: *(s::String...) requires all arguments to be strings, whereas string(...) can take a mixture of types, so one can argue that annotated-ness then becomes an attribute of all the objects, not just strings.

Oct 08 '24 17:10 tecosaur