julia icon indicating copy to clipboard operation
julia copied to clipboard

Rework annotation ordering/optimisations

Open tecosaur opened this issue 3 months ago • 3 comments

There are some annoyances/discrepancies with annotation handling that I originally tried to fix in #53794, #53800 and #53801.

@LilithHafner was kind enough to have a look at these, and after a long chat we came to the conclusion that this isn't the right approach to take, and that the order with which annotations are applied should be given primacy over annotation ranges.

This PR supersedes the aforementioned PRs, and should make the way annotations are ordered more sensible.

See the commit messages for more details.

tecosaur avatar Apr 28 '24 04:04 tecosaur

Ah, tests are failing because I missed a use of Base.annotatedstring_optimize! in StyledStrings' tests. I'll fix that and re-bump the stdlib shortly.

tecosaur avatar Apr 28 '24 08:04 tecosaur

What part of annotations are semantically meaningful/visible?

IIUC the annotations

[(1:1, :a => 1), (2:2, :a => 1)] and [(1:2, :a => 1)] are semantically equivalent [(1:1, :a => 1), (2:2, :a => 1)] and [ (2:2, :a => 1), (1:1, :a => 1)] are semantically equivalent [(1:1, :a => 1), (1:1, :a => 2)] and [ (1:1, :a => 2), (1:1, :a => 1)] are semantically different [(1:1, :a => 1), (1:1, :b => 2)] and [ (1:1, :a => 2), (1:1, :b => 1)] are semantically different

This should be documented. Specifically, a simple specification that describes what is and is not semantically visible from which one can derive the above (or a different result).

Ideally everything returned by annotations(::AnnotatedString) would be semantically meaningful, that's an easy semantic to document. (and https://www.hyrumslaw.com/)

(I can't review the correctness of an optimization without knowing what "correct" means)

LilithHafner avatar Apr 29 '24 13:04 LilithHafner

Ah yep, we did also discuss a need for more info in the docstrings. I'll see about adding some of the elaboration needed.

tecosaur avatar Apr 29 '24 13:04 tecosaur

I've just adjusted repeat to be consistent with the new semantic,

fullregion = firstindex(str):lastindex(str)
if allequal(first, str.annotations) && first(first(str.annotations)) == fullregion
    newfullregion = firstindex(unannot):lastindex(unannot)
    for (_, annot) in str.annotations
        push!(annotations, (newfullregion, annot))
    end
else
    for offset in 0:len:(r-1)*len
        for (region, annot) in str.annotations
            push!(annotations, (region .+ offset, annot))
        end
    end
end

this change has been folded into 2656884453 * Remove strong ordering of annotation ranges

tecosaur avatar Apr 30 '24 02:04 tecosaur

Now that #54308 has been merged, we need to resolve the merge conflict and adjust the test introduced there.

tecosaur avatar Apr 30 '24 11:04 tecosaur

After CI, I think this should be good to merge if that also sounds good by @LilithHafner :slightly_smiling_face:.

tecosaur avatar Apr 30 '24 11:04 tecosaur

1bed84f4ae6fe6fc9bb39612e274fd1630a51349 LGTM. I have not reviewed https://github.com/JuliaLang/julia/compare/1bed84f4ae6fe6fc9bb39612e274fd1630a51349..4c18472b3e56998f2784ecd5d9deea4cf5371850, but am open to suggestions to how I could do that without re-reviewing this PR from scratch.

LilithHafner avatar Apr 30 '24 12:04 LilithHafner

The only change made here in the diff you link is

    @test chopprefix(sprint(show, str), "Base.") ==
-        "AnnotatedString{String}(\"some string\", [(1:4, :thing => 0x01), (1:11, :all => 0x03), (6:11, :other => 0x02)])"
+        "AnnotatedString{String}(\"some string\", [(1:4, :thing => 0x01), (6:11, :other => 0x02), (1:11, :all => 0x03)])"

i.e. correctly resolving the change introduced by #54308.

HTH.

tecosaur avatar Apr 30 '24 12:04 tecosaur

Okay, the diff in your comment also LGTM.

LilithHafner avatar Apr 30 '24 13:04 LilithHafner

@tecosaur I know you have a preference to not squash but for things that will get backported, it introduces unnecessary stumbling blocks in that automated process that would be better to avoid.

IanButterworth avatar May 01 '24 14:05 IanButterworth

Thanks for mentioning that Ian, I do like to preserve more granular information when it seems sensible, but I wasn't aware that complicated the backporting process. I'll keep that in mind with the few remaining PRs I'm hoping to get backported :slightly_smiling_face:.

(As an aside, is the automated backporting doing something other than cherry picking? Or is the hassle having to supply -m?)

tecosaur avatar May 01 '24 14:05 tecosaur

Yeah the automatic script doesn't handle adding -m. If you wanted to help with adding that it's here https://github.com/KristofferC/Backporter/blob/master/backporter.jl

IanButterworth avatar May 01 '24 14:05 IanButterworth