TiddlyWiki5 Revamp markdown plugin

Instead of merging markdown parse tree into TW tree, this PR feeds markdown-rendered HTML into TiddlyWiki for a final parse tree. This approach allows raw html inside the markdown text to be interpreted properly. To demo this PR or view test samples, visit new-markdown-plugin demo wiki; for comparison, here's the same wiki with the current markdown plugin

Other details

Replace remarkable with markdown-it parser for more robust parsing.
Extend syntax: strikethrough, insertion, mark, superscript, subscript, footnotes, definition lists.
Prevent accidental katex in expressions such as $5 and $6, <$list>...</$list>, $:/config%/PageControlButtons/Visibility/$:/core/ui/Buttons/home.
Ensure links to tiddlers work properly under markdown image and link syntax.
Add "syntax" section to user documentation.
Add support for "text/markdown" MIME type.

Mar 13 '22 11:03 cdruan

I think most of the the "test tiddlers" from: https://cdruan.github.io/TiddlyWiki5 should go to the TW markdown edition.

Mar 13 '22 13:03 pmario

Replace remarkable with markdown-it parser for more robust parsing.

What did make you to switch to markdown-it from remarkable.

Mar 13 '22 13:03 pmario

As Jeremy pointed out the checklist plugin only adds visual "bells" without real functionality and it fails to print. see screenshot

There is a tw5-checklist plugin, ~~which I can't find at the moment~~, which actually allows users to change the state. It modifies the tiddler content.

grafik

Mar 14 '22 12:03 pmario

Hi @pmario and @Jermolene

Let me start by explaining why I made the switch to markdown-it. The decision was only made (reluctantly) after I found a series of problems with remarkable:

Remarkable does not handle raw <pre> and <style> blocks correctly when the blocks contain blank lines. This does not meet CommonMark's spec. See example #169:
```
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
```
The example rendered by remarkable:
```
<pre language="haskell"><code>
import Text.HTML.TagSoup
<p>main :: IO ()
main = print $ parseTags tags
</code></pre>
okay</p>
```
Notice the extra <p> inside <pre> block and the spacing is no longer preserved.
In its rules_block/deflist.js module, the variable at line 98 should have been placed outside the loop. Checking with markdown-it-deflist confirms my finding. The consequence of that is the entire formatting of <dl> list is determined by the last term's spacing (tight or loose).
In its rules_block/htmlblock.js module, its HTML_TAG regular expressions (line 6 & 7) do not take into account of digits. So <h1>..<h6> are not recognized as block tags. So if you enter this:
```
<h4>heading</h4><ul><li>item 1
</li><li>item2</li></ul>
```
Remarkable yields this:
```
<p><h4>heading</h4><ul><li>item 1</p>
</li><li>item2</li></ul>
```
Not only is the <p> undesirable, its end tag cuts off the the <li> expression.

These problems altogether present a worrisome picture of a no longer well maintained project. Then, on markdown-it's github page:

markdown-it is the result of the decision of the authors who contributed to 99% of the Remarkable code to move to a project with the same authorship but new leadership (Vitaly and Alex). It's not a fork.

Having worked with both libraries, I can report that markdown-it has extra routines (such as Tokens and default token rendering) that make writing extensions less tedious.

As for the motivation of this PR, if you haven't done so already, please visit the demo using CURRENT markdown plugin and examine the tiddlers under Issues Fixed section. You'll see the issues I uncovered.

This PR loads successfully on both IE and ipad3—after I removed the let. There are some math expressions that could not be rendered in IE. On the other hand, the current plugin does not load successfully on either platforms. I suspect it may have to do with the remarkable-katex.js module not being es5 compliant.

Mar 14 '22 23:03 cdruan

As for the motivation of this PR, if you haven't done so already, please visit the demo using CURRENT markdown plugin and examine the tiddlers under Issues Fixed section. You'll see the issues I uncovered.

I did see the fixes and I'm with you. There are definitely some problems with the remarkable library. Since I don't use markdown regularly, I don't have a problem to switch.

Mar 15 '22 00:03 pmario

checklist plugin removed

Mar 15 '22 00:03 cdruan

One more thing, the immediate backward incompatibility I can see is that I removed the image parse rule from the renderWikiPragma. There was a debate whether prettylink should be included in the pragma or not. Since prettylink is still not in there and image rule provides the equivalent syntax for images, I took it out of the default setting... Or maybe both should be IN together?

Mar 15 '22 01:03 cdruan

@cdruan ... Why did you close this one? ... I think it would be worth to be included.

Mar 30 '22 08:03 pmario

Hi @cdruan

@cdruan ... Why did you close this one? ... I think it would be worth to be included.

See https://github.com/Jermolene/TiddlyWiki5/pull/6528#discussion_r837977350

Mar 30 '22 19:03 Jermolene

@pmario and @Jermolene

I've rollbacked changes to parseutils.js. The entity decoding (of attribute values) is now performed on the TW parse-tree element and widget nodes—with the exception of <$wikify> and <$entity>—during the post-processing stage.

Other notable updates:

Add tw tag block parsing (similar to html blocks) to properly parse tw tags separated by blank lines.
Remove katex block parsing—I think inline display mode is sufficient. Block parsing only adds ability to include blank lines between $$...$$, but this feature is actually undesirable.
Add Extending Markdown demo tiddler to demonstrate the ability to extend the Markdown plugin.
Update/add few demo tiddlers.
Update user doc, mainly to specify inline math parsing rules.

Please let me know if you have any concerns. Thanks.

Apr 06 '22 09:04 cdruan

Thanks @cdruan much appreciated.

I've rollbacked changes to parseutils.js. The entity decoding (of attribute values) is now performed on the TW parse-tree element and widget nodes—with the exception of <$wikify> and <$entity>—during the post-processing stage.

I'm sorry I didn't feedback quick enough on your earlier comments. The decoding of attributes must be applied consistently across all widgets. Arbitrary exceptions add significantly to the cognitive load of the system and future updates.

Is there another approach that doesn't affect wikitext parsing?

Apr 06 '22 10:04 Jermolene

Hi @Jermolene

Thanks for your feedback. I was hoping the disruption is deemed minimal because decoding is applied only to the string type of attribute values in markdown tiddlers. The likelihood of users issuing widget calls inside the markdown text is probably low.

Besides allowing consistent grammar across html elements and widgets, decoding the attributes allows me to pass LaTex and fence block markdown contents to TW via <$codeblock> and <$latex> widgets. (I can go into details on why I ultimately chose this strategy.) I also translate markdown links and images to <$link> and <$image> calls when appropriate and it'd be nice to have tooltips properly rendered when there are entities in the text. So I'm quite dependent on having a mechanism to decode entities in attribute string values.

I believe I found another widget, <$macrocall>, that requires delayed entity decoding... There are probably other exceptions. Would it be acceptable if the behavior is clearly documented? Alternatively, I could apply decoding only to the attributes of the aforementioned widgets (and html elements) that this PR depends on. Would that work? Personally, it would be nice to have a sure way of escaping quotes in the attribute values. Although delimiting with """ minimizes conflicts, theoretically, you can still have a string that contains """.

By the way, I just realized that current markdown plugin may be indirectly decoding entities in attribute values anyway. See example. For your reference, same markdown tiddlers rendered by this PR.

Apr 07 '22 06:04 cdruan

@Jermolene ... This should also be considered for v5.2.3

Jul 08 '22 18:07 pmario

Thanks for your patience @cdruan

Jul 09 '22 07:07 Jermolene

Hi @cdruan - apologies for not noticing sooner – please could you resolve the merge conflict? Many thanks

Jul 09 '22 07:07 Jermolene

I've rollbacked changes to parseutils.js. The entity decoding (of attribute values) is now performed on the TW parse-tree element and widget nodes—with the exception of <$wikify> and <$entity>—during the post-processing stage.

I'm sorry I didn't feedback quick enough on your earlier comments. The decoding of attributes must be applied consistently across all widgets. Arbitrary exceptions add significantly to the cognitive load of the system and future updates.

Is there another approach that doesn't affect wikitext parsing?

@cdruan @Jermolene has this issue been resolved?

Jul 09 '22 09:07 saqimtiaz

--cdruan wrote-- I've rollbacked changes to parseutils.js. The entity decoding (of attribute values) is now performed on the TW parse-tree element and widget nodes—with the exception of <$wikify> and <$entity>—during the post-processing stage.

--Jermolene wrote-- I'm sorry I didn't feedback quick enough on your earlier comments. The decoding of attributes must be applied consistently across all widgets. Arbitrary exceptions add significantly to the cognitive load of the system and future updates.

Is there another approach that doesn't affect wikitext parsing?

--saqimtias wrote-- @cdruan @Jermolene has this issue been resolved?

I think there has been a misunderstanding in the first place. @cdruan did modify code in the parseutils.js which is part of the TW core. Mixing PRs in that way shouldn't happen. That's why the "offending" code was removed from this PR.

As far as I can see, this PR only contains code that is needed to handle markdown if the plugin is installed. It shouldn't interfere with the TW core code anymore.

There may still be a problem in parseutils.js, but this would be a different issue and a different PR.

Jul 18 '22 15:07 pmario

Sorry, I just saw the recent discussions.

I think we have yet to resolve the issue of decoding HTML entities inside widget and html tag's attribute string values. My current implementation is to decode entities inside attribute values of all html tags and widget tags, except <$wikify> and <$entity> and <$macrocall>. This only applies to markdown tiddlers.

However, I realized there's another approach if we could allow another string literal syntax, say: e"....string content...", similar to a raw string syntax in python, etc. Any strings of this form will have their entities decoded during parsing. This way, we could maintain backward compatibility while allowing content like this:

<$codeblock code=e"var s = &quot;foobar&quot;;"/>

And there won't be any seemingly arbitrary "exceptions" as before.

Would this new approach be acceptable?

Jul 19 '22 09:07 cdruan

I'm not sure. Why is <$codeblock code="""var s = "foobar";;"""/> not good enough? It works out of the box.

Jul 19 '22 11:07 pmario

@pmario

Why is <$codeblock code="""var s = "foobar";;"""/> not good enough? It works out of the box.

This PR needs to be able to programmatically pass any content of a fenced code block to <$codeblock> widget. So it needs to assume that """ could be in the content.

Jul 19 '22 12:07 cdruan

I could temporarily extend parseStringLiteral() to parse and decode entities inside e"...." strings and then restore the function to its original functionality once markdown parsing is done. (A version of this approach has been tested already.) So the extended string syntax will be an unpublished feature that only works within this markdown parser. This would be less drastic and no need to update TW5 doc.

Jul 19 '22 12:07 cdruan

I've updated this PR with the approach described in this comment. I also made few additional updates to support the new "text/markdown" MIME type (commit 8e687eb).

The demo pages new vs. old have been updated as well, powered by 5.2.3-prerelease version.

Jul 21 '22 11:07 cdruan

Extend syntax: strikethrough, insertion, mark, superscript, subscript, footnotes, definition lists.

Very appreciated update, thanks!

I see that a checklist plugin has been removed.

Is there any plan to support markdown checklists (TODOs), like here on GitHub?

- [ ] to be done
- [x] task is done

Jul 27 '22 08:07 nicolnt

Will the revamp also allow the plug-in to store meta file data in markdown front matter instead of needing two files per markdown tiddler?

Jul 27 '22 08:07 kravlost

Will the revamp also allow the plug-in to store meta file data in markdown front matter instead of needing two files per markdown tiddler?

The markdown plugin handles how markdown tiddlers are rendered and displayed in TiddlyWiki, and is not related to loading/save files in the node.js configuration which is a core behaviour.

Jul 27 '22 08:07 saqimtiaz

Will the revamp also allow the plug-in to store meta file data in markdown front matter instead of needing two files per markdown tiddler?

The markdown plugin handles how markdown tiddlers are rendered and displayed in TiddlyWiki, and is not related to loading/save files in the node.js configuration which is a core behaviour.

#4702 says it would be possible with the new markdown plugin by adding the markdown-it-frontmatter plugin to the tw5 plugin. It would be great if this was supported.

Jul 27 '22 08:07 kravlost

#4702 says it would be possible with the new markdown plugin by adding the markdown-it-frontmatter plugin to the tw5 plugin. It would be great if this was supported.

You are mistaken. As described in that comment, the markdown-it-frontmatter plugin can hide the frontmatter but this would have nothing to do with how the tiddlers are saved to disk and the .meta files created on node.js.

Jul 27 '22 09:07 saqimtiaz

#4702 says it would be possible with the new markdown plugin by adding the markdown-it-frontmatter plugin to the tw5 plugin. It would be great if this was supported.

You are mistaken. As described in that comment, the markdown-it-frontmatter plugin can hide the frontmatter but this would have nothing to do with how the tiddlers are saved to disk and the .meta files created on node.js.

Ah, OK. My mistake.

Jul 27 '22 09:07 kravlost

After playing around with markdown tiddlers, I can tell that this PR fixes #6916 and #6919. Are there any blockers?

Aug 25 '22 19:08 MaxGyver83

Thanks @MaxGyver83.

@cdruan is this PR ready for another review?

Aug 26 '22 14:08 Jermolene

TiddlyWiki5 TiddlyWiki5 copied to clipboard

Revamp markdown plugin

Other details

TiddlyWiki5
TiddlyWiki5 copied to clipboard