Nested block footnotes rendered in wrong order
(Python Markdown 3.9; mkdocs 1.6.1 with footnotes extension; Python 3.12)
While using mkdocs I stumbled upon this problem:
Footnote anchors and footnotes themselves are rendered out of order when, in the source and in addition to other document-level footnotes, one ore more footnotes are being referenced from within a nested block, such as a list element or a quote block.
If Python Markdown is downgraded to version 3.8.2, footnotes will be rendered correctly.
Here's a reproducible Markdown case:
# Heading
The following footnote anchor is the first one.[^1]
## Subheading
The following footnote anchor
1. should be the second one, but mkdocs renders it in fourth place.[^2]
1. Note that every anchor still points to its correct, matching footnote.
The following footnote anchor is the third one, but will be rendered in second place. [^3]
> Now in a quote block, possibly reproducible in any other nested block. Fourth yet shown as fifth footnote anchor.[^4]
Last, fifth footnote anchor.[^5]
### Footnotes rendered out of order:
[^1]: First
[^2]: Second
[^3]: Third
[^4]: Fourth
[^5]: Fifth
For reference, here is what I see via mkdocs serve:
I imagine https://github.com/Python-Markdown/markdown/pull/1546 is likely responsible for any unexpected, new changes related to footnotes.
So, setting USE_DEFINITION_ORDER to True seems to render more sane results.
This seems to use the previous approach.
With that said, it seems the new approach is a little broken. I assume the testing done in the implementation was not enough to expose some weird ordering cases.
My suspicion is that this is not a bug, at least not based on what I think the new change to footnotes means to implement. I was only minorly involved in the review, but IIRC, the new change was meant to order footnotes in the order they were processed. I believe there was the assumption that it would be in the logical order that the footnotes are applied to the content within paragraphs, but I believe the parser may parse blocks in an unexpected order in some cases, leading to what is viewed in the example given in the opening post. The tests added to the feature were likely not sufficient to expose that the order can be unexpected in some cases.
If the footnotes are truly processed in the order demonstrated in the final output, then there is technically no bug. But it would seem illogical to those who expected they would be processed in the order from top to bottom on the page.
Currently, setting USE_DEFINITION_ORDER to True may be the best answer right now to get the old behavior.
I guess, moving forward, we should evaluate what order the footnote references are actually parsed in, and if they are correct, we either accept that this is the way things are, rework the logic to order the references in logical order, or back out the recent change.
Running some tests, it seems that the way the tree is crawled for inline content, the root level paragraphs are processed first, and then later the content nested under lists and then the blockquote.
We would have to adjust how the tree is walked for inline content in order to ensure an expected logical order, or rewrite the IDs in a logical order in a footnotes treeprocessor.
I'll let @waylan weigh in.
I don't think this new feature does what the original submitter thought it would do, and I'm not sure if it is generally better. The good news is that USE_DEFINITION_ORDER should give you the old behavior.
I'm not surprised by this. In fact, I was very resistant at first to #1546 because I expected this. I only moved forward when it seemed that tests confirmed it worked correctly. I never really dug in to confirm that the tests were complete enough like I should have. However, I am glad that I insisted on having the USE_DENFINITION_ORDER option.
I think probably the real question here is if we should try to fix or roll back #1546. I'm concerned that a fix would require a major refactor of the core, which is a no-go.
@aeskildsen do you have any thoughts on this?
I think probably the real question here is if we should try to fix or roll back https://github.com/Python-Markdown/markdown/pull/1546. I'm concerned that a fix would require a major refactor of the core, which is a no-go.
Yep, I agree. You either have to refactor how inline stuff is processed to be in a logical order, put all of that logic in footnotes treeprocessor to correct the logical order, or just revert the change. I think having this new behavior, as it currently is, being the default, is not ideal.
If we roll back #1546, then how do we handle the USE_DENFINITION_ORDER option. That option didn't exist previously. However, some users could have enabled it since the last release. Removing support for it would raise an error for them. I suppose we could keep it as a do-nothing option.
Yeah, a "do-nothing" may be what we need.
My suspicion is that this is not a bug, at least not based on what I think the new change to footnotes means to implement. I was only minorly involved in the review, but IIRC, the new change was meant to order footnotes in the order they were processed. I believe there was the assumption that it would be in the logical order that the footnotes are applied to the content within paragraphs, but I believe the parser may parse blocks in an unexpected order in some cases, leading to what is viewed in the example given in the opening post. The tests added to the feature were likely not sufficient to expose that the order can be unexpected in some cases.
Running some tests, it seems that the way the tree is crawled for inline content, the root level paragraphs are processed first, and then later the content nested under lists and then the blockquote.
I don't have time to investigate at the moment, but I think the above is accurate. The reordering of footnotes introduced in #1546 relies on footnote references being processed in document order (top to bottom). But the tests unfortunately didn't cover the parser behavior that we see here. So I have to agree, while there is technically no bug in #1546, the previous behavior (as with USE_DEFINITION_ORDER) is preferable.
I did however play around with a few different implementations before settling on #1546 - and I would be happy to attempt a reimplementation that doesn't rely on the processing order. I would add this example as a new test case. It would take a few weeks before I could look at it though. @waylan, would you like to explore this option of reimplementing #1546 before rolling it back?
Right now, because this is the default, it's likely providing very undesirable results for many, at the very least, the new behavior should be turned off by default. Personally, I'm not against a rework attempt per se, but it would need a lot more testing to prove itself if it were to be accepted. Anyway, that is my current opinion at least.
Agreed.
My inclination is to do a "bug fix" release which changes the default of USE_DEFINITION_ORDER to True which would have the effect of reverting the change. That will provide the time to fully explore other options. Then if/when we have a working solution, we can release it as an option which can be turned on (default remains definition order). If over time, we find that the new behavior is popular and works in all sorts of various documents, then we can consider changing the default. This is probably the approach we should have taken with #1546.
I'm fine with this change in the default. We can add a note that disabling the flag is experimental.
I have reverted the default behavior as discussed. However, the new behavior is still not working consistently as reported here. Therefore, I am leaving this open until that issue is resolved.
Good call on this and the future process. I'll get back when I get the chance.