mdx icon indicating copy to clipboard operation
mdx copied to clipboard

Fix invalid syntax output

Open calebeby opened this issue 3 years ago • 10 comments

Closes #2112

Previously, mdx generated const ;, which is not valid syntax. If you comment out the fix, the test fails.

In the test I didn't add an assertion for the HTML output since that is more tied to remark-shiki-twoslash, but if you would prefer me to add that I can.

@bholmesdev helped me narrow this down, thanks!

calebeby avatar Sep 02 '22 20:09 calebeby

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
mdx ✅ Ready (Inspect) Visit Preview Oct 11, 2022 at 2:49PM (UTC)

vercel[bot] avatar Sep 02 '22 20:09 vercel[bot]

Codecov Report

Merging #2123 (b1fc010) into main (996771a) will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #2123   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           22        22           
  Lines         2044      2046    +2     
=========================================
+ Hits          2044      2046    +2     
Impacted Files Coverage Δ
packages/mdx/lib/plugin/recma-jsx-rewrite.js 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov-commenter avatar Sep 02 '22 20:09 codecov-commenter

Additionally, this might just be due to remark-shiki-twoslash doing weird things: a) remark is for markdown, not for HTML. rehype is for HTML. b) Injecting strings of HTML into an AST is an bad idea, yes it can be fixed with rehype-raw but that essentially means parsing a file twice? c) it has several long open issues

If possible, I’d suggest a different syntax highlighter or sending PRs to shiki twoslash to improve the situation

I definitely agree with some of your concerns here, the double-parsing is definitely less-than-ideal. I'd like to improve the situation with twoslash, but that is a big project and for now I just wanted to get this bugfix through.

I would like to hear more about a and c if you are willing to clarify. Which long-open issues are more concerning to you? And I think it makes sense that remark-shiki-twoslash operates at the markdown phase, so it can process the pre-escaping code text, and also so that it can read from the metadata next to the language id after the opening backticks.

calebeby avatar Sep 05 '22 16:09 calebeby

Also, I am not familiar with the type-coverage check. Is it related to my casting to any to deal with TypeScript's CJS/ESM types interop? Is there a better way to do what I'm doing?

calebeby avatar Sep 05 '22 16:09 calebeby

(should pass when my code change is included, should fail with unexpected token ; when my code change is not included).

🤔🤔🤔 It still all just works for me with your test case without your code change, nothing broken 🤷‍♂️ Here’s my patch (excluding package lock changes):

+++ b/package.json
@@ -244,5 +244,8 @@
         false
       ]
     ]
+  },
+  "dependencies": {
+    "remark-shiki-twoslash": "^3.1.0"
   }
 }
diff --git a/packages/mdx/test/compile.js b/packages/mdx/test/compile.js
index d5a10ee0..aa8a5366 100644
--- a/packages/mdx/test/compile.js
+++ b/packages/mdx/test/compile.js
@@ -21,6 +21,7 @@ import remarkGfm from 'remark-gfm'
 import remarkMath from 'remark-math'
 import {VFile} from 'vfile'
 import {SourceMapGenerator} from 'source-map'
+import * as remarkShikiTwoslash from 'remark-shiki-twoslash'
 import {compile, compileSync, createProcessor, nodeTypes} from '../index.js'
 // @ts-expect-error: make sure a single react is used.
 import {renderToStaticMarkup as renderToStaticMarkup_} from '../../react/node_modules/react-dom/server.js'
@@ -1173,6 +1174,28 @@ test('remark-rehype options', async () => {
   )
 })
 
+test('remark-shiki-twoslash with layout function', async () => {
+  // Just test that the output can be generated correctly without JS syntax errors
+  renderToStaticMarkup(
+    React.createElement(
+      await run(
+        await compile(
+          `
+export default function ({ children }) { return <div>{children}</div>; }
+
+\`\`\`js twoslash
+console.log('hi')
+\`\`\`
+`,
+          {
+            remarkPlugins: [/** @type any */ (remarkShikiTwoslash).default],
+            rehypePlugins: [[rehypeRaw, {passThrough: nodeTypes}]]
+          }
+        )
+      )
+    )
+  )
+})
 test('MDX (JSX)', async () => {
   assert.equal(
     renderToStaticMarkup(

wooorm avatar Sep 06 '22 18:09 wooorm

double-parsing is definitely less-than-ideal. I'd like to improve the situation with twoslash, but that is a big project and for now I just wanted to get this bugfix through.

The main offenders are:

  • https://github.com/shikijs/twoslash/blob/7b124d5e1240ff36af9e633d49e30b1abbe50849/packages/remark-shiki-twoslash/src/index.ts#L46-L55
  • https://github.com/shikijs/twoslash/blob/7b124d5e1240ff36af9e633d49e30b1abbe50849/packages/remark-shiki-twoslash/src/exceptionMessageDOM.ts#L9-L137

These are injecting raw strings of HTML into an AST. Shiki can build an AST as far as I am aware. The alternative in remark space for that is https://github.com/syntax-tree/mdast-util-to-hast#fields-on-nodes. But the best place for dealing with HTML is in hast/rehype. That can also access code, language, meta, etc.

I would like to hear more about a and c if you are willing to clarify.

For a), see:

  • https://github.com/remarkjs/remark-rehype#what-is-this
  • https://mdxjs.com/packages/mdx/#architecture

For c), (your more specific questions I asked in the previous quote answer), for specific issues, quickly looking through them these two come up:

  • https://github.com/shikijs/twoslash/issues/125
  • https://github.com/shikijs/twoslash/issues/129

Also, I am not familiar with the type-coverage check. Is it related to my casting to any to deal with TypeScript's CJS/ESM types interop? Is there a better way to do what I'm doing?

Correct, type-coverage prevents the use of any. There are a couple ways around this, but I’d prefer to narrow down the problem and make a MVP without twoslash if possible. Let’s try and get that first and then hopefully this won’t be needed!

wooorm avatar Sep 06 '22 18:09 wooorm

It still all just works for me with your test case without your code change, nothing broken

I commented out the code fix and now it is failing as expected in CI. If you pull the latest commit it should fail locally in the same way hopefully?

https://github.com/mdx-js/mdx/runs/8217507997?check_suite_focus=true#step:5:40

calebeby avatar Sep 06 '22 22:09 calebeby

@wooorm can you help me understand why there is a distinction between a mdxJsxFlowElement and an element in the hast tree/rehype phase? Is it just because a mdxJsxFlowElement might be a component instead of an element? Or because its subtree might contain expressions/etc that is not valid html?

If I was making a rehype plugin that for example did something to h1 elements, and I wanted it to work as a rehypePlugin for mdx, or for rehype outside of mdx, should I look in the tree for both mdxJsxFlowElement and element with name/tagName of h1?

I'm asking this because I'm wondering whether rehype-raw is maybe not intended for use with MDX. Are nodes with type of element intended to be in a hast mdx tree?

calebeby avatar Sep 06 '22 23:09 calebeby

I think I have a reproduction that doesn't rely on shiki-twoslash!

If I put <custom-element/> in my MDX code, MDX deals with it just fine and the relevant node is like: { type: 'mdxJsxFlowElement', name: 'custom-element' }

If I inject <custom-element/> as { type: 'element', tagName: 'custom-element' } via a plugin (a raw node that rehype-raw parses) then the const ; bug happens.

So, depending on the answer to the question "are element nodes valid in MDX hast trees?":

If "no, mdxJsxFlowElement nodes should be used instead in MDX hast trees": Maybe there should be an option of rehype-raw to output mdxJsxFlowElement/etc instead of element nodes. Or a separate plugin that is essentially the same but with that change.

If "yes, element is valid in a MDX hast tree":

Then I can push a test like this that reproduces the bug without relying on remark-shiki-twoslash:

{
  remarkPlugins: [
    () => {
      const transform = async (markdownAST) => {
        markdownAST.children.push({
          type: 'html',
          value: `<custom-element />`,
          position: [],
          children: []
        })
      }
      return transform
    }
  ],
  rehypePlugins: [
    [rehypeRaw, {passThrough: nodeTypes}],
  ]
}

(or something similar that directly injects the element instead of injecting a raw node and pushing it through rehype-raw)

Does that sound right to you?

Thanks for your patience with me!

calebeby avatar Sep 06 '22 23:09 calebeby

Sweet!

The answer is: yes, `element` is valid in a MDX hast tree!

Hmm, I still have can’t reproduce an error with a test like that:

  renderToStaticMarkup(
    React.createElement(
      await run(
        await compile('', {
          remarkPlugins: [
            () => {
              const transform = async (markdownAST) => {
                markdownAST.children.push({
                  type: 'html',
                  value: '<custom-element />',
                  position: [],
                  children: []
                })
              }

              return transform
            }
          ],
          rehypePlugins: [[rehypeRaw, {passThrough: nodeTypes}]]
        })
      )
    )
  )

Note: the /> part on a custom element is not actually valid HTML, so if that’s what Shiki is injecting, that’s a bug there?

Do you have a more complete example that reproduces your problem?

wooorm avatar Sep 08 '22 18:09 wooorm

Friendly ping! :)

wooorm avatar Oct 07 '22 18:10 wooorm

Hey @wooorm, thanks for the ping. I'd really like to get back to this, it's on my list but school comes first right now :)

calebeby avatar Oct 09 '22 01:10 calebeby

@wooorm @calebeby No worries! I'll take it from here :) I was able to reproduce Caleb's bug on my end as well when using Shiki Twoslash. Working on a test case that's more minimal!

bholmesdev avatar Oct 10 '22 18:10 bholmesdev

~~@wooorm Hm, I'm getting a test-coverage threshold failure after rebasing that I didn't see before. Even reverting the code change, I still get a coverage error locally. Any idea what's causing this?~~ Saw your message! Confirmed it's the test itself. Not sure what I'm missing...

bholmesdev avatar Oct 11 '22 14:10 bholmesdev

Alright @wooorm, should be ready to review 👍

bholmesdev avatar Oct 11 '22 14:10 bholmesdev

Thanks everyone, particularly @calebeby and @bholmesdev, the reduced case helped a lot. I reduced it some more, uncovered the underlying problem, which I also fixed. And kept this length check in because it doesn’t hurt to have it!

See the linked commit message for much more information on what was going on and how it was solved.

wooorm avatar Oct 11 '22 17:10 wooorm

Released in 2.1.5!

wooorm avatar Oct 11 '22 17:10 wooorm