marimo icon indicating copy to clipboard operation
marimo copied to clipboard

Headings with the same name are highlighted multiple time in the table of contents

Open dmadisetti opened this issue 1 year ago • 6 comments

Describe the bug

I think headings are done on string match. Minor bug, but this means multiple headings with the same name (in my case a repeated "Results" subsection) are incorrectly highlighted

image

Environment

Head

Code to reproduce

Any notebook with repeated headers will give you this:

https://marimo.app/l/tgpmk9

dmadisetti avatar Aug 22 '24 19:08 dmadisetti

To expand on the details of the string match, using the example notebook provided, the following are the ids for the headings:

  • Cell 1
    • H1#Hello → “hello”
    • H2#Hello → “hello_2”
    • H3#Hello → “hello_3”
  • Cell 2
    • H1#Hello → “hello”
  • Cell 3
    • H1#Other → “other”
    • H2#Hello → “hello”

I played around in this branch and took an approach of deriving the index of the headings in the cells to locate the headings in the table of contents (outline panel). The changes are only for the outline panel (not the floating outline). This is probably a naive approach, because now it completely ignores the string matching (id or xpath expression).

Another idea could be to customize the heading ids. This could possibly be achieved with a markdown plugin. I haven’t ventured down that path, yet. The idea is to provide unique ids so the items can be uniquely identified.

Are there any initial thoughts on this approach, or better suggestions?

ontowhee avatar Feb 06 '25 00:02 ontowhee

@ontowhee, thanks for taking a look at this! The headers may move and I always try to find stable IDs over indexes. I haven't though too closely about what bugs might appear, but maybe as new headers are added.

I was thinking of another approach:

  1. activeHeaderId goes from string -> {id: string, occurrence: number}. It still has somewhat an index, but maybe slightly better.
  2. when we do setActiveHeaderId, we do
setActiveHeaderId({
  id: id
  index: document.querySelectorAll(`[id="${CSS.escape(item.by.id)}"]`).indexOf(topmostHeader.current)
})
  1. Then we once we have the index, we can activeHeader.id === identifier && occurrences === activeHeader.index && "text-accent-foreground"

This might be similar to your approach in complexity. Don't know if would has similar bugs. Let me know if I can help explain that approach more.

Also, there is a fair amount of duplication between the outline and floating panel, so maybe there is some codesharing and testing that could be improved if you'd want to take a stab at that.

mscolnick avatar Feb 06 '25 01:02 mscolnick

The ideal spec would prevent dupe IDs right? Wouldn't tweaking IDs at the output level trickle down to outlines?

If we go down the route of keeping of an ID registry, then just want to flag that processing html output for IDs can also address scrolling to an ID for when the output is rendered: https://github.com/marimo-team/marimo/issues/3346

Also, tangentially related- but the citation system could be made a bit better down this path too (right now citations do not work cross cell, and citations can be duplicated between cells). Keeping track of IDs and doing citations in FE could smooth out some of the issues + start to tie to external reference sources (saw this as a feature request in roadmap discussion)

dmadisetti avatar Feb 06 '25 04:02 dmadisetti

Thanks for the suggestions! Unique IDs does sound like a better path. I'm not too familiar with how the IDs are generated, but I can look into it later today and try something out.

ontowhee avatar Feb 06 '25 16:02 ontowhee

without looking into the code yet, my approach would be to simply go with uuids instead of deriving them somehow. @ontowhee are you still working on this?

jo-soft avatar Apr 10 '25 07:04 jo-soft

@jo-soft I'm not working on it. Feel free to pick up on the work!

ontowhee avatar Apr 10 '25 15:04 ontowhee