
Select Extension priority

Open · calculuschild opened this issue 11 months ago · 2 comments

Describe the feature Custom extensions always take priority over default tokens. There should be a way to specify where in the parsing sequence an extension runs, so that certain default tokens don't get overridden.

Why is this feature necessary? Sometimes there are ambiguous cases in Markdown where the result depends on which token gets parsed first. For example, a table without a starting pipe:

```
<div> header
|:---:|
cell
```

This is not parsed as a table, because HTML is parsed first. However, if someone writes a custom table extension, tables suddenly take priority and this is rendered as a table. The user would want the extension to execute before the default table tokenizer but after the HTML tokenizer.

Describe alternatives you've considered It is possible to work around this by pre-parsing the string inside the extension tokenizer and checking whether any higher-priority tokens appear. However, this adds extra steps and can result in the same line being parsed multiple times.

I'm picturing something like breaking the lexer.blockTokens() and lexer.inlineTokens() functions into an array of tokenizers called in sequence, where an extension author could inject their extension at a chosen position in that array (added to the front by default). I imagine calling each tokenizer from an array would cause some slowdown, though.
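The idea above could look something like the following self-contained sketch. This is not marked's actual internals; `MiniLexer`, its toy regexes, and the `register({ before })` option are all hypothetical, just to show ordered dispatch with positional registration:

```javascript
// Hypothetical lexer that walks an ordered array of tokenizers.
class MiniLexer {
  constructor() {
    // Default ordering mirrors the issue's example: html before table.
    this.blockTokenizers = [
      { name: 'html',  tokenize: src => /^<div>/.test(src) ? { type: 'html' } : null },
      { name: 'table', tokenize: src => /\|:?-+:?\|/.test(src) ? { type: 'table' } : null },
    ];
  }
  // Insert a tokenizer before a named default instead of always at the front.
  register(tokenizer, { before } = {}) {
    const i = before ? this.blockTokenizers.findIndex(t => t.name === before) : 0;
    this.blockTokenizers.splice(i === -1 ? 0 : i, 0, tokenizer);
  }
  // Return the first tokenizer that claims the source, in array order.
  firstMatch(src) {
    for (const t of this.blockTokenizers) {
      const token = t.tokenize(src);
      if (token) return { by: t.name, token };
    }
    return null;
  }
}

const lexer = new MiniLexer();
// The extension slots in after html but before the default table tokenizer.
lexer.register(
  { name: 'customTable', tokenize: src => /\|:?-+:?\|/.test(src) ? { type: 'customTable' } : null },
  { before: 'table' }
);
console.log(lexer.firstMatch('<div> header\n|:---:|\ncell').by); // 'html' still wins
```

With this shape, the `<div>` example from above stays HTML even though a custom table tokenizer is installed, because the extension chose to sit below the html rule.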

In fact, I think that was the original approach to the extensions feature when I was building it, some kind of list of tokenizers, but the slowdown was too much. Now that it's been out for a while, though, I am running up against this limitation more and more.

calculuschild · Jan 15 '25 18:01

We could try doing something like that now that most of the lexer's work has been moved to the tokenizers. I would still want to make sure it isn't slowed down for anyone not using extensions.

One other way this could be handled now is with the provideLexer hook: we could create a different lexer that has more functionality but is slower, and swap it in through the hook.

UziTech · Jan 16 '25 20:01

#3594 is one way I can see this being done. Initial checks suggest it doesn't slow things down too much, but I think it can be improved a lot.

UziTech · Jan 18 '25 19:01