Nuemark2.0 issues
Doing a quick code review of your dev branch, I am not sure whether Blocks or Inline is scanned first. If Blocks, then it seems this code:
*** Bold Italic ***
would render as an <hr>, since you are only scanning the first few characters?
Also be aware that CommonMark is VERY flexible for <hr>. I personally don't think you need to cover every edge case, but it could help to document "Only the common Markdown/CommonMark rules are covered."
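To illustrate the point, here is a minimal sketch of a whole-line check for the core CommonMark thematic break rule (three or more matching `*`, `-` or `_` characters, optionally separated by spaces, and nothing else on the line). `isThematicBreak` is a hypothetical helper name, not something taken from the Nuemark source, and the sketch ignores the setext heading case where `---` directly follows a paragraph.

```javascript
// Hypothetical helper, not part of Nuemark: decides whether a whole line
// is a thematic break (<hr>) under the core CommonMark rule.
function isThematicBreak(line) {
  // Up to three spaces of indentation are allowed before the first marker.
  const m = /^ {0,3}([*\-_])/.exec(line)
  if (!m) return false

  const marker = m[1]
  let count = 0

  // Every remaining character must be the same marker or whitespace.
  for (const ch of line.trim()) {
    if (ch === marker) count++
    else if (ch !== ' ' && ch !== '\t') return false
  }
  return count >= 3
}

console.log(isThematicBreak('***'))                 // true
console.log(isThematicBreak('* * *'))               // true
console.log(isThematicBreak('*** Bold Italic ***')) // false
console.log(isThematicBreak('--'))                  // false
```

Checking the whole line instead of only the first few characters is what keeps `*** Bold Italic ***` out of the `<hr>` branch.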
@tipiirai
[](somelink) in nuemark2 breaks the nue server and generator.
I used [](/) on a random page in the docs, and the server just hangs indefinitely
I know that this is possible using Markdown extensions (e.g. [! /img/blog-build.png href="/"]), but the default Markdown variant should still work, imo.
The next ones generate and don't let the server hang, but still generate unexpected output:
| Markdown | HTML |
|---|---|
| `[[! yo.svg]](/)` | `<p><[! custom="[!"><script type="application/json">{"_":"yo.svg"}</script>](/)</p>` |
| `[[my-tag]](/)` | `<p><[my-tag custom="[my-tag">](/)</p>` |
Btw I tried your example and this *** Bold Italic *** renders to nothing (maybe a section split??? idk) in my test on a doc page.
I also tested *** Bold Italic *** and it generates <p><strong>* Bold Italic</strong>*</p> but I think it should generate <p>*** Bold Italic ***</p>.
@tipiirai just so you know, dev branch fails to build the docs (when there's no current .dist) since table changes:
Files where it fails on [table]:
- packages/nuejs.org/docs/syntax-highlighting.md
- packages/nuejs.org/blog/rethinking-reactivity/index.md
Error message:
227 |
228 | const html = rows.map((row, i) => {
229 | const is_head = head && i == 0
230 | const is_foot = table.foot && i > 1 && i == rows.length - 1
231 |
232 | const cells = row.map(td => elem(is_head || is_foot ? 'th' : 'td', renderInline(td, md_opts)))
^
TypeError: row.map is not a function. (In 'row.map((td) => elem(is_head || is_foot ? "th" : "td", renderInline(td, md_opts)))', 'row.map' is undefined)
at .../nue/packages/nuemark/src/render-tag.js:232:23
at map (1:11)
at renderTable (.../nue/packages/nuemark/src/render-tag.js:228:21)
at table (.../nue/packages/nuemark/src/render-tag.js:98:18)
at map (1:11)
at renderBlocks (.../nue/packages/nuemark/src/render-blocks.js:18:17)
at renderPage (.../nue/packages/nuekit/src/layout/page.js:128:23)
at .../nue/packages/nuekit/src/nuekit.js:125:22
@tomByrer @nobkd fixed all issues mentioned on this article
Thank youu! I'll check later today, if I find more unexpected results :)
| Input | Expected (Commonmark Spec Implementation) | Nuemark 2 |
|---|---|---|
|
|
|
The first half (before hr) is correct, the second half is wrong (e.g. position of * in first after hr).
PS: I started building a test using the commonmark-spec test suite. See https://github.com/nuejs/nue/compare/dev...nobkd:nue:test/cmark-spec. Maybe you want to try using it.
Oh, I just ran the cmark tests once without expect, and the following Markdown tests make Nuemark hang (this completely ignores all the other failing tests):
Hanging tests by test number: [573, 576, 577, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 629]
(for now excluded in my branch: https://github.com/nuejs/nue/compare/dev...nobkd:nue:test/cmark-spec)
show hanging tests: category; test number; md
Images; 573
![foo *bar*]
[foo *bar*]: train.jpg "train & tracks"
Images; 576
![foo *bar*][]
[foo *bar*]: train.jpg "train & tracks"
Images; 577
![foo *bar*][foobar]
[FOOBAR]: train.jpg "train & tracks"
Images; 582
![foo][bar]
[bar]: /url
Images; 583
![foo][bar]
[BAR]: /url
Images; 584
![foo][]
[foo]: /url "title"
Images; 585
![*foo* bar][]
[*foo* bar]: /url "title"
Images; 586
![Foo][]
[foo]: /url "title"
Images; 587
![foo]
[]
[foo]: /url "title"
Images; 588
![foo]
[foo]: /url "title"
Images; 589
![*foo* bar]
[*foo* bar]: /url "title"
Images; 590
![[foo]]
[[foo]]: /url "title"
Images; 591
![Foo]
[foo]: /url "title"
Raw HTML; 629
foo <![CDATA[>&<]]>
Edit: there are probably more...
No more loops with unclosed image tags such as ![foo *bar*]
Are image reflinks like ![foo][bar] supported in Markdown?
btw: Nuemark will not support raw HTML, because that violates the separation of concerns principle
> Are image reflinks like ![foo][bar] supported in Markdown?
The commonmark reference implementation does support it: https://spec.commonmark.org/dingus/?text=!%5Bimg%5D%5Btag%5D%0A%0A%5Btag%5D%3A%20%2Fimg.png%0A#result
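For reference, resolving these could look roughly like the sketch below. All function names are hypothetical and not taken from the Nuemark source. It only handles one-line definitions without nested brackets in the label, and it uses `toLowerCase()` where the spec asks for Unicode case folding. Returning `null` for an unknown label lets the caller fall back to literal text instead of retrying.

```javascript
// Hypothetical helpers for illustration; not Nuemark's API.

// Labels match case-insensitively with inner whitespace collapsed.
function normalizeLabel(label) {
  return label.trim().replace(/\s+/g, ' ').toLowerCase()
}

// Collect simple one-line definitions such as: [bar]: /url "title"
function collectDefinitions(lines) {
  const defs = new Map()
  const re = /^ {0,3}\[([^\]]+)\]:\s*(\S+)(?:\s+"([^"]*)")?\s*$/

  for (const line of lines) {
    const m = re.exec(line)
    if (!m) continue
    const key = normalizeLabel(m[1])
    // When a label is defined more than once, the first definition wins.
    if (!defs.has(key)) defs.set(key, { url: m[2], title: m[3] })
  }
  return defs
}

// Resolve full (![alt][label]), collapsed (![alt][]) and
// shortcut (![alt]) image references.
function resolveImageRef(text, defs) {
  const m = /^!\[([^\]]*)\](?:\[([^\]]*)\])?/.exec(text)
  if (!m) return null
  // An empty or missing second bracket means the alt text is the label.
  const label = m[2] ? m[2] : m[1]
  return defs.get(normalizeLabel(label)) || null
}

const defs = collectDefinitions(['[BAR]: /url', '[foo]: /img.png "title"'])
console.log(resolveImageRef('![foo][bar]', defs)) // { url: '/url', title: undefined }
console.log(resolveImageRef('![Foo][]', defs))    // { url: '/img.png', title: 'title' }
console.log(resolveImageRef('![nope]', defs))     // null
```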
(PS: did you do the strikethrough implementation with one tilde (~) or with two (~~)? ~strike~ or ~~strike~~?)
Also, may I know why you used the pipe symbol |marked text| for <mark> and not the (in my opinion) more commonly used ==marked text==? (To be more similar to glow?)
(PPS: do we have a simple way to write <details> with <summary>?)
Edit: btw, you can remove marked from this tour image: https://github.com/nuejs/nue/blob/dev/packages%2Fnuejs.org%2Ftour%2Fimg%2Fnpm-nue.png (maybe remove node modules and do a clean install on dev, to get all the dependency changes)
Found more problems:
1. **bold** not bold **bold**
2. **test**:
expected:
<ol>
<li><p><strong>bold</strong> not bold <strong>bold</strong></p></li>
<li><p><strong>test</strong>:</p></li>
</ol>
reality:
<ol>
<li><p><strong>bold** not bold **bold</strong></p></li>
<li><p>**test**:</p></li>
</ol>
You can see this issue e.g. on the docs index page on the dev branch: http://localhost:8080/docs/
Edit: maybe the <p>s should not be inside the list items?
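The output above looks like the first `**` gets paired with the last `**` on the line. I have not checked how the parser actually does it, so the following is only a sketch of the difference a lazy match makes. `renderStrong` is a made-up name, and a regex like this is a simplification, not CommonMark's delimiter-run algorithm.

```javascript
// Hypothetical helper for illustration only.
function renderStrong(text) {
  // `.+?` is lazy, so each opening ** pairs with the nearest closing **
  // instead of the last one on the line.
  return text.replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
}

console.log(renderStrong('**bold** not bold **bold**'))
// <strong>bold</strong> not bold <strong>bold</strong>

console.log(renderStrong('**test**:'))
// <strong>test</strong>:
```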
Nuemark also doesn't support <br> line breaks using two or more spaces at the end of a line or a backslash at the end of a line:
foo
bar
foo\
bar
expected:
<p>foo<br />
bar</p>
<p>foo<br />
bar</p>
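A minimal sketch of that rule, assuming the paragraph has already been split into lines (`renderParagraph` is a hypothetical name, and inline formatting and HTML escaping are left out): a line that ends in two or more spaces or in a backslash gets a `<br />`, unless it is the last line of the paragraph.

```javascript
// Hypothetical helper: renders the lines of a single paragraph.
function renderParagraph(lines) {
  const hardBreak = /( {2,}|\\)$/

  const out = lines.map((line, i) => {
    const isLast = i === lines.length - 1
    // A hard break needs a following line, so the last line never gets one.
    if (!isLast && hardBreak.test(line)) {
      return line.replace(hardBreak, '') + '<br />'
    }
    return line.trimEnd()
  })

  return `<p>${out.join('\n')}</p>`
}

console.log(renderParagraph(['foo  ', 'bar']))
console.log(renderParagraph(['foo\\', 'bar']))
// both print:
// <p>foo<br />
// bar</p>
```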
Collapsed part fixed by #420
Escaping does not seem to work properly, e.g. https://nuejs.org/docs/content-syntax.html#code-blocks
or
I also tried using code blocks with more than three backticks to wrap the md code block, but that didn't work. I tried this:
````md
```md
// here is a CSS code block
:root {
--base-100: #f3f4f6;
--base-200: #e5e7eb;
--base-300: #d1d5db;
--base-400: #6b7280;
}
```
````
But it results in this:
<pre></pre>
<p>:root { }</p>
<pre></pre>
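In CommonMark the closing fence has to use the same character as the opening fence and be at least as long, which is what makes the nesting above work. A rough sketch of that rule follows. `parseFences` is a hypothetical name, not the Nuemark implementation, and it skips details such as stripping the fence indentation from the content.

```javascript
// Hypothetical helper: splits lines into plain text lines and fenced code blocks.
function parseFences(lines) {
  const blocks = []
  let open = null // { char, length, info, code }

  for (const line of lines) {
    if (open) {
      // A closing fence has no info string, uses the same character and
      // is at least as long as the opening fence.
      const close = /^ {0,3}(`{3,}|~{3,})\s*$/.exec(line)
      if (close && close[1][0] === open.char && close[1].length >= open.length) {
        blocks.push({ type: 'code', info: open.info, code: open.code.join('\n') })
        open = null
      } else {
        open.code.push(line) // shorter fences are just content
      }
      continue
    }

    const start = /^ {0,3}(`{3,}|~{3,})\s*([^`]*)$/.exec(line)
    if (start) {
      open = { char: start[1][0], length: start[1].length, info: start[2].trim(), code: [] }
    } else {
      blocks.push({ type: 'text', text: line })
    }
  }

  // An unclosed fence runs to the end of the input.
  if (open) blocks.push({ type: 'code', info: open.info, code: open.code.join('\n') })
  return blocks
}

const tick = '`'
const blocks = parseFences([tick.repeat(4) + 'md', tick.repeat(3) + 'md', ':root {', '}', tick.repeat(3), tick.repeat(4)])
console.log(blocks.length)  // 1
console.log(blocks[0].info) // md
```

With this rule the inner three-backtick lines stay part of the code, and only the four-backtick line closes the block.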
@tipiirai I am wondering what the original reasoning was for writing your own parser rather than extending/adopting an existing one?
It seems to me that micromark could be a great foundation: it supports CommonMark out of the box and hence does not show the behaviour mentioned in https://github.com/nuejs/nue/issues/379#issuecomment-2426528771
> Are image reflinks like ![foo][bar] supported in Markdown?
Yes, image reflinks are supported by CommonMark. You may test it directly on GitHub.
> what was the original decision to write own parser in contrast to extending/adopting an existing one?
It was explained here: https://nuejs.org/blog/nue-release-candidate/#new-markdown-parser
I don't really know if the same applies to micromark
> I don't really know if the same applies to micromark
Thank you for the provided extra information. No, none of the listed problems applies to micromark (remark, unified project). Moreover, the unified project was started to solve the listed problems and provide a unified AST that can be easily transformed on any step.
I asked about the reasoning for writing another parser because of the following observations:
- I used to use setext headings and found that Nuemark does not support those;
- I have discovered from the #414 that Nuemark renders lists differently from the standard.
I assume that adopting an already existing, powerful foundation would take away a lot of the pain of maintaining another parser and would provide a match to the existing standard. I highly recommend looking into the unified ecosystem. Here are a few benefits:
- remark is widely adopted, as it is already the foundation of MDX;
- it is well documented, well typed and has an actual AST specification;
- plugins are very easy to write, which would even give Nue users the option to write their own extensions.
> I have discovered from the https://github.com/nuejs/nue/issues/414 that Nuemark renders lists differently from the standard
I'm here for the same reason 😅
I honestly don't care much what is used; I just want the thing to work well and without issues in the end. Hope it'll come to that!
Users write other Markdown much more than Nue's Markdown. This means that users would prefer Nuemark work in the same way as CommonMark (or GitHub Flavoured Markdown). — (paraphrased) Jakob's Law
Coming from https://github.com/nuejs/nue/issues/414#issuecomment-2506687179 and continuing the previous discussion, a few remarks on that point below.
1. Markdown by definition is a superset of HTML, i.e. it empowers users to write a simpler form of HTML, but it is still HTML code. Forbidding HTML in Markdown can be acceptable for security reasons, but it still violates this original intention.

   That said, I support the Nue team's intention to invent a syntax that is simple yet more powerful than CommonMark currently is (providing blocks etc.). There is a relevant directive proposal and mdast-util-directive.

   However, while forbidding HTML seems to be a technically practical decision and a good way to push users towards nicer-looking markup, it takes power away from users by preventing them from doing things they might want in some exceptional cases. And it's important to remember: "you are not the user". As an example, there might be a highly technical post explaining some bits of HTML that needs live examples in place, or some small bits where providing ARIA would be necessary. These are off the top of my head, but I am sure there will be real use cases that nobody predicted.

2. Markdown in 2024 is well specified. CommonMark is the most popular and best-adopted standard of Markdown syntax. The website explains why it exists and provides a link to the versioned specification.

   There is another popular standard that we all use, GitHub Flavored Markdown (GFM), which is an extension of CommonMark and has its own specification.

   There is also MultiMarkdown, which has its own specification and toolkit but is less adopted. It aims to provide a more powerful markup toolset: footnotes, citations, abbreviations etc. MultiMarkdown still supports all basic Markdown features (I am not sure about HTML), but it diverges a lot and therefore uses its own file extension, .mmd.

   There might be others that can be found on the Internet and Wikipedia, but I didn't dig into the topic further.
Following from point 2, Nuemark can head in one of the following directions:
- diverge from Markdown: provide a (significantly) different markup and an independent toolkit (parser, compiler, syntax highlighters, editor extensions) for it, and use a file extension that is not associated with Markdown, like .nm;
- adopt Markdown (CommonMark) fully: either extend an existing toolkit or provide an independent one, and keep using the common file extension .md, offering an optional one, .nmd, if a proper distinction is necessary.
In any of those cases, Nuemark requires a proper specification, in my opinion. Currently, it is vaguely defined.
I personally think, that nuemark should be commonmark compliant. Maybe not in the area of indented code blocks (maybe clashing with multiline components), but if they were supported, it would be great.
I think, we should not reuse the --- for section splitting, because it is an already used syntax in cmark, and might confuse people. Maybe adding a new syntax, like +++ which isn't used yet would be a good alternative.
What do you think?
Markdown has a lot of issues and despite it being the most popular, it is worth considering alternatives sometimes. Especially because switching dialects isn't as difficult for the users as switching programming languages. My personal favorite is Typst (https://typst.app/docs/tutorial/advanced-styling/), but ASCIIDOC is also very good (https://docs.asciidoctor.org/asciidoc/latest/text/)
@yerlaser can you please uncover what many issues there are with Markdown? I am very curious to learn
> @yerlaser can you please uncover what many issues there are with Markdown? I am very curious to learn
Well, one of the biggest issues you described yourself - it doesn't have a common standard.
And the language features too few formatting primitives, so any heavy usage requires creating a new variant.
Aside from that, the language syntax is not orthogonal; my biggest gripe is that **bold** and __bold__, as well as *italic* and _italic_, mean the same thing.
On the same note, - can start an unordered list and mark a <h2> title.
Another issue is manual line breaks, which are made in Markdown with two trailing spaces.