Make markdown header ID and anchor ID unique
Currently, two headlines with the same text (say "Plots") get the same ID for the appended anchor. This is a problem if you use repetitive headlines (think "Analysis 1" -> "Plots", "Analysis 2" -> "Plots", ...) together with @minrk's TOC script, which links to headers: links to the second "Plots" go to the first one, because that is the first element with that ID.
The code that adds the links is here: https://github.com/jupyter/notebook/blob/master/jupyter_notebook/static/notebook/js/textcell.js#L281
I replaced that locally with:
html.find(":header").addBack(":header").each(function (i, h) {
h = $(h);
- var hash = h.text().replace(/ /g, '-');
+ var hash = h.text().replace(/ /g, '-') + "_" + that.cell_id;
h.attr('id', hash);
Now my header IDs and anchor IDs are unique (and the TOC works!), but they are not the same across notebook reloads, so links to a specific headline do not work :-(
Is there any "ID" for a cell which could be used to make the IDs unique?
Any news here? I can put up a PR for this if that's a better place to discuss the issue.
If someone has a better idea (i.e. one that is more stable across notebook reloads), that would be great! It would probably also need an issue for nbviewer, as links to a specific headline should be stable there, too.
If you want persistent IDs for cells, you need a change to the notebook format, which won't happen soon. See, among others, https://github.com/ipython/ipython/issues/3777. It raises complexity: when you duplicate a notebook, or copy, paste, or split cells, you have to ensure the IDs stay unique. So it will probably not "just work". It probably needs an IPEP.
It will probably not be in 4.0, as 4.0 will be "no new features".
We have talked about using the heading levels as structure. If we implemented that, you could create a more unique & stable ID by putting its 'heading path' in the hash - e.g. #Analysis-1/Plots. Not perfect, but better.
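A minimal sketch of the "heading path" idea, assuming the headings are already available as a flat list of `{ level, text }` objects in document order. That input shape and the function names are hypothetical; the notebook does not currently expose such a structure.

```javascript
// Sketch: build IDs from the chain of ancestor headings ("heading path").
function slugify(text) {
  // Same substitution as the snippet in this issue: spaces become dashes.
  return text.replace(/ /g, '-');
}

function headingPathIds(headings) {
  const stack = []; // currently open ancestors, outermost first
  return headings.map(({ level, text }) => {
    // Headings at the same or a deeper level are not ancestors: drop them.
    while (stack.length > 0 && stack[stack.length - 1].level >= level) {
      stack.pop();
    }
    stack.push({ level, slug: slugify(text) });
    return stack.map((entry) => entry.slug).join('/');
  });
}

const pathIds = headingPathIds([
  { level: 1, text: 'Analysis 1' },
  { level: 2, text: 'Plots' },
  { level: 1, text: 'Analysis 2' },
  { level: 2, text: 'Plots' },
]);
console.log(pathIds);
// [ 'Analysis-1', 'Analysis-1/Plots', 'Analysis-2', 'Analysis-2/Plots' ]
```

Two sibling headings with the same text under the same parent would still collide, so this would likely need to be combined with another scheme.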
Short of that, you could also sequentially number headings with the same text - #Plots, #Plots-2, etc. That's annoying if you insert one near the top, though, and all the later ones are renumbered.
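The sequential numbering could look roughly like this. It is a sketch that works on a plain list of heading texts rather than on the DOM, and `uniqueIds` is a made-up name.

```javascript
// Sketch: the first occurrence keeps the plain slug, later ones get -2, -3, ...
function uniqueIds(texts) {
  const taken = new Set();
  return texts.map((text) => {
    const base = text.replace(/ /g, '-');
    let id = base;
    // Try increasing suffixes until a free ID is found.
    for (let n = 2; taken.has(id); n += 1) {
      id = `${base}-${n}`;
    }
    taken.add(id);
    return id;
  });
}

console.log(uniqueIds(['Plots', 'Summary', 'Plots', 'Plots']));
// [ 'Plots', 'Summary', 'Plots-2', 'Plots-3' ]
```

Prepending another 'Plots' to that list shifts every later suffix by one, which is the renumbering problem described above.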
As Matthias says, we don't store any ID for cells - but then even that wouldn't completely solve the issue, because you could repeat the same heading in one long Markdown cell.
OK, the "heading path" or "numbering" solutions seem doable for me (the "change the notebook format" one does not: I don't think I have a deep enough understanding of what it would mean for the rest of the functionality).
Numbering
- easiest, as it's probably just a jQuery call for the new ID and, if it's taken, trying again with a number until one is not taken
- will break permanent URLs (manual links into the document; links into an nbviewer one) if you insert a new heading above an old one and reload (or renumber on the fly)
Heading path
- will be harder to implement as the current document/ header structure has to be parsed
- will break permanent URLs if a heading from the path is changed (and the nb is reloaded)
Add a metadata hash for each md cell and append it to the IDs in that cell (current ID + metadata hash)
- IDs will end up like #plots-lkjdsf435lkjh4235lkjh
- will be stable over reloads / changes to other md cells
- will be stable as long as the heading itself is not changed
- will still break if the heading is changed
- will still break if in that md cell exists a second heading with the same name (+ numbering?)
- not sure what happens to c/p, splitting and merging of cells -> is metadata copied/merged/split as well? If split will duplicate metadata this would be bad :-(
If adding to metadata isn't a problem (i.e. splitting does not copy metadata) I would try that first, but if you think that's a problem I would go with simple numbering.
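A rough sketch of the metadata variant, under several assumptions: cells are modelled as plain objects with `metadata` and `headings` fields, `anchor_token` is an invented metadata key that is not part of the notebook format, and the tokens come from a counter only to keep the example reproducible.

```javascript
// Sketch: append a per-cell token, stored in cell metadata, to each heading ID.
function ensureToken(cell, makeToken) {
  // Reuse a stored token so IDs survive reloads; create one only if missing.
  if (!cell.metadata.anchor_token) {
    cell.metadata.anchor_token = makeToken();
  }
  return cell.metadata.anchor_token;
}

function cellHeadingIds(cell, makeToken) {
  const token = ensureToken(cell, makeToken);
  return cell.headings.map((text) => `${text.replace(/ /g, '-')}-${token}`);
}

// Counter-based tokens; a real implementation would more likely use a random string.
let counter = 0;
const nextToken = () => `t${(counter += 1)}`;

const first = { metadata: {}, headings: ['Plots'] };
const second = { metadata: {}, headings: ['Plots'] };
console.log(cellHeadingIds(first, nextToken));  // [ 'Plots-t1' ]
console.log(cellHeadingIds(second, nextToken)); // [ 'Plots-t2' ]

// Assumption: pasting copies the metadata too, which brings the collision back.
const pasted = { metadata: { ...first.metadata }, headings: [...first.headings] };
console.log(cellHeadingIds(pasted, nextToken)); // [ 'Plots-t1' ]
```

The last call illustrates the open question: if copy or split duplicates the metadata, the token is no longer unique and something has to regenerate it.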
I'm not sure about splitting, but I would expect copying cells to copy their metadata.
Even metadata on copy/paste is not obvious. Let's say a metadata field is supposed to be unique. Do you keep it on copy/paste? (No, it might be duplicated.) And on cut/paste? (Maybe.) But then what about cut/paste/paste? You get a duplicate.
Hi @janschulz : We're tidying up our issue log and closing or progressing issues older than 1 year. Could you please tell us if this issue has been resolved to your satisfaction?
This should probably remain open: it relates to a lot of the work that I've been doing around auto-header specification and is a useful resource for me to touch back on.
My solution to something like this would be to make manually specified IDs available through an extension of the Markdown syntax. Specifically, I think the pandoc approach of putting {#your_header_id} after an ATX-style header (i.e., headers denoted with # my header content) is really straightforward and would actually lead to tractable improvements on this issue.
That said, it's not a small undertaking to make something like that available given our current header-id implementation.
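To make the manual-ID idea concrete, here is a small sketch that reads a `{#id}` suffix from a single ATX header line and otherwise falls back to the auto-generated slug. `parseAtxHeader` is a hypothetical helper, not an existing notebook or pandoc API, and it handles only the bare `{#id}` form rather than pandoc's full attribute syntax.

```javascript
// Sketch: parse "## Plots {#analysis-1-plots}" into level, text and ID.
function parseAtxHeader(line) {
  const match = /^(#{1,6})\s+(.*?)\s*$/.exec(line);
  if (!match) {
    return null; // not an ATX header
  }
  const level = match[1].length;
  let text = match[2];
  let id;
  // Look for a trailing "{#some-id}" attribute block.
  const attr = /\s*\{#([^\s{}]+)\}$/.exec(text);
  if (attr) {
    id = attr[1];
    text = text.slice(0, attr.index); // strip the attribute from the visible text
  } else {
    id = text.replace(/ /g, '-'); // fall back to the auto-generated slug
  }
  return { level, text, id };
}

console.log(parseAtxHeader('## Plots {#analysis-1-plots}'));
// { level: 2, text: 'Plots', id: 'analysis-1-plots' }
console.log(parseAtxHeader('# my header content'));
// { level: 1, text: 'my header content', id: 'my-header-content' }
```

Because the ID is written by the author, it stays the same when headings are inserted or reordered, as long as the author keeps it unique.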
Regardless, thank you @JamiesHQ for bringing this to my attention; it had escaped my notice until now.
We mitigated it a bit in https://github.com/ipython-contrib/jupyter_contrib_nbextensions/blob/master/src/jupyter_contrib_nbextensions/nbextensions/toc2/toc2.js#L7, but the problem of preserving links from outside to headers when they have the same name still exists.
Sounds good, thanks @mpacer and @janschulz ! I've added mpacer as an assignee. cheerio.
but the problem of preserving links from outside to headers when they have the same name still exists.
That's what I think manual specification will be able to solve. Otherwise this is always going to be an issue: no matter how clever we are about updating links within a notebook, outside links can't be updated and will fail if there's any change in the header positioning (i.e., even if you were to do this in pandoc you would run into this problem). But if we encourage manual header specification, then permanent links can be guaranteed (since they aren't autogenerated).
Is there any work in progress to implement this feature?