mdast-util-to-markdown

roundtripping emphasis in emphasis edge case changes document structure

Open ChristianMurphy opened this issue 4 years ago • 6 comments

Subject of the issue

***emphasis*in emphasis*

is stringified as:

\***emphasis*in emphasis*

which changes the structure

Your environment

  • OS: Ubuntu
  • Packages: mdast-util-to-markdown 0.6.2
  • Env: node v15.5.1, npm 7.3.0

Steps to reproduce

parse

***emphasis*in emphasis*

which has the structure

{
    "type": "root",
    "children": [
        {
            "type": "paragraph",
            "children": [
                {
                    "type": "text"
                },
                {
                    "type": "emphasis",
                    "children": [
                        {
                            "type": "emphasis",
                            "children": [
                                {
                                    "type": "text"
                                }
                            ]
                        },
                        {
                            "type": "text"
                        }
                    ]
                }
            ]
        }
    ]
}

and stringify it:

\***emphasis*in emphasis*

the resulting markdown has a different structure than the original

{
    "type": "root",
    "children": [
        {
            "type": "paragraph",
            "children": [
                {
                    "type": "text"
                },
                {
                    "type": "emphasis",
                    "children": [
                        {
                            "type": "text"
                        }
                    ]
                }
            ]
        }
    ]
}

📓 Comparing how the two pieces of markdown are parsed with https://spec.commonmark.org/dingus, both appear to be parsed as expected.

Expected behavior

structure is the same

Actual behavior

structure is different
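
To make the mismatch easy to see at a glance, here is a minimal sketch that reduces each of the two trees above to its shape (node types and nesting only) and compares them. The helper name `shape` is hypothetical and not part of mdast-util-to-markdown; the two tree literals are copied from the report.

```javascript
// Hypothetical helper: reduce an mdast-like node to its types and nesting.
function shape(node) {
  const children = Array.isArray(node.children) ? node.children.map(shape) : [];
  return children.length > 0 ? `${node.type}(${children.join(', ')})` : node.type;
}

// Tree parsed from the original markdown, as reported above.
const original = {
  type: 'root',
  children: [{
    type: 'paragraph',
    children: [
      {type: 'text'},
      {type: 'emphasis', children: [
        {type: 'emphasis', children: [{type: 'text'}]},
        {type: 'text'}
      ]}
    ]
  }]
};

// Tree parsed from the stringified markdown, as reported above.
const roundtripped = {
  type: 'root',
  children: [{
    type: 'paragraph',
    children: [
      {type: 'text'},
      {type: 'emphasis', children: [{type: 'text'}]}
    ]
  }]
};

console.log(shape(original));     // root(paragraph(text, emphasis(emphasis(text), text)))
console.log(shape(roundtripped)); // root(paragraph(text, emphasis(text)))
console.log(shape(original) === shape(roundtripped)); // false
```

A check like this ignores values and positions on purpose, so it only flags structural changes such as the lost nested emphasis.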

ChristianMurphy • Feb 03 '21 23:02

Here are some more examples:

a ***b*c d*

a \***b*c d*

a ***b* d*

a \***b* d*

Yields (CM dingus):

<p>a *<em><em>b</em>c d</em></p>
<p>a ***b<em>c d</em></p>
<p>a *<em><em>b</em> d</em></p>
<p>a *<em><em>b</em> d</em></p>

So whether that escape “works” also depends on what comes after the “run”.

I don’t really see an (easy) fix 🤔
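
One possible reading of the examples above, based on CommonMark's flanking rules and its "multiple of 3" rule for delimiter runs, can be sketched as follows. This is a simplified illustration, not how mdast-util-to-markdown or any parser implements it: the names `classifyRun`, `mayPair`, and `pairAroundB` are hypothetical, only `*` is handled, and whitespace and punctuation checks are limited to ASCII.

```javascript
// Simplified character classes (ASCII only, an assumption of this sketch).
// The start and end of the line count as whitespace.
const isWhitespace = (ch) => ch === undefined || /\s/.test(ch);
const isPunctuation = (ch) => ch !== undefined && /[!-\/:-@\[-`{-~]/.test(ch);

// Classify the run of `*` that starts at index `start` in `text`.
function classifyRun(text, start) {
  let end = start;
  while (text[end] === '*') end++;
  const before = text[start - 1];
  const after = text[end];
  const leftFlanking = !isWhitespace(after) &&
    (!isPunctuation(after) || isWhitespace(before) || isPunctuation(before));
  const rightFlanking = !isWhitespace(before) &&
    (!isPunctuation(before) || isWhitespace(after) || isPunctuation(after));
  // For `*`, a run can open if left-flanking and close if right-flanking.
  return {length: end - start, canOpen: leftFlanking, canClose: rightFlanking};
}

// "Multiple of 3" rule: if either run can both open and close, the pair is
// rejected when the summed lengths are a multiple of 3, unless both are.
function mayPair(opener, closer) {
  if (!opener.canOpen || !closer.canClose) return false;
  const eitherIsBoth = (opener.canOpen && opener.canClose) ||
    (closer.canOpen && closer.canClose);
  const sum = opener.length + closer.length;
  const bothMultiples = opener.length % 3 === 0 && closer.length % 3 === 0;
  return !(eitherIsBoth && sum % 3 === 0 && !bothMultiples);
}

// Can the run starting at `openerStart` pair with the `*` right after "b"?
function pairAroundB(text, openerStart) {
  const opener = classifyRun(text, openerStart);
  const closer = classifyRun(text, text.indexOf('b') + 1);
  return mayPair(opener, closer);
}

console.log(pairAroundB('a ***b*c d*', 2));   // true
console.log(pairAroundB('a \\***b*c d*', 4)); // false
console.log(pairAroundB('a ***b* d*', 2));    // true
console.log(pairAroundB('a \\***b* d*', 4));  // true
```

Under these assumptions, escaping the first `*` shortens the opening run from 3 to 2. When the `*` after `b` is followed by `c`, it can both open and close, the lengths sum to 3, and the pairing is rejected. When it is followed by a space, it can only close, so the rule does not apply and the structure survives.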

wooorm • Feb 04 '21 09:02

A related edge case, which can happen at the tail of emphasis next to emphasis:

*a*_b__

ChristianMurphy • Feb 05 '21 15:02

Hi! This was marked as ready to be worked on! Note that while this is ready to be worked on, nothing is said about priority: it may take a while for this to be solved.

Is this something you can and want to work on?

Team: please use the area/* (to describe the scope of the change), platform/* (if this is related to a specific one), and semver/* and type/* labels to annotate this. If this is first-timers friendly, add good first issue and if this could use help, add help wanted.

github-actions[bot] • Aug 21 '21 15:08

I came up with a way to solve this, I think: https://github.com/syntax-tree/unist/discussions/60#discussioncomment-2111096.

wooorm • Feb 04 '22 13:02