
Support MDX as a separate language from Markdown

Open bates64 opened this issue 1 year ago • 1 comments

Check for existing issues

  • [X] Completed

Describe the feature

Currently, people using Vale on MDX files tell Vale that it's just Markdown.

[formats]
mdx = md

But it isn't! MDX has clear differences from Markdown:

  • Comments are different https://github.com/errata-ai/vale/issues/762
  • Vale must ignore lines that are actually JavaScript, i.e. those starting with import
  • Vale must ignore JSX expressions, e.g. {1 + 1}
  • Vale must not treat HTML as prose, e.g. <div foo="bar"> should not generate warnings (this applies to Markdown too)

Syntax reference
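
To make the list above concrete, here is a rough Python sketch of the kind of filtering it asks for. This is not how Vale works internally; `strip_mdx_syntax` and the regular expressions are hypothetical names made up for illustration. A regex pass like this is deliberately naive (it won't cope with nested braces, multi-line imports, or code fences), which is part of why proper MDX support needs a real parser.

```python
import re

# Hypothetical helper, not part of Vale: removes MDX-only syntax so that
# only prose remains for a prose linter to look at.
ESM_LINE = re.compile(r"^(import|export)\s")            # JavaScript lines
JSX_COMMENT = re.compile(r"\{/\*.*?\*/\}", re.DOTALL)   # {/* ... */}
JSX_EXPRESSION = re.compile(r"\{[^{}]*\}")              # {1 + 1}
HTML_TAG = re.compile(r"</?[A-Za-z][^<>]*>")            # <div foo="bar">


def strip_mdx_syntax(text: str) -> str:
    """Return text with import/export lines, JSX and HTML tags removed."""
    # Drop whole lines that are JavaScript rather than prose.
    lines = [line for line in text.splitlines() if not ESM_LINE.match(line)]
    body = "\n".join(lines)
    # Remove comments first so the expression pattern never sees them.
    body = JSX_COMMENT.sub("", body)
    body = JSX_EXPRESSION.sub("", body)
    # Remove the tags themselves but keep the text between them.
    body = HTML_TAG.sub("", body)
    return body


if __name__ == "__main__":
    sample = (
        'import Foo from "./foo"\n'
        "\n"
        "Total: {1 + 1}\n"
        "\n"
        "{/* hidden note */}\n"
        '<div foo="bar">Hello</div>\n'
    )
    print(strip_mdx_syntax(sample))
```

Only the prose ("Total:" and "Hello") should survive; everything else in the sample is syntax that a linter treating the file as plain Markdown would wrongly flag.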

bates64 avatar May 28 '24 14:05 bates64

+1

discdiver avatar Jul 09 '24 19:07 discdiver

I can't see anything in the recent release history to suggest otherwise, but the ConfigGenerator has an MDX checkbox that uses a package. That would suggest MDX is officially supported, though I gather it's still a workaround at the moment. I've tried with and without the package and can't get Vale to read .mdx files at all, so I wanted to check whether the old [formats] approach still works (my docs don't contain any comments, so the approach referred to above should work for me) with:

StylesPath = .github/styles
MinAlertLevel = error
IgnoredScopes = code, tt, img, url, a
SkippedScopes = script, style, pre, figure, code
# Ignore code surrounded by backticks or plus sign, parameters defaults, URLs, and angle brackets.
Packages     = MDX

[formats]
mdx = md

[*.{md,mdx}]
BasedOnStyles = MyStyles

[*.mdx]
CommentDelimiters = {/*, */}

Thanks IA

ml4 avatar Feb 03 '25 13:02 ml4

This is going to be available in the next release through an external tool, mdx2vast, which I've created.

I've been testing it and the results have been good, including being able to handle the entire Docusaurus docs without any syntax-related false positives.

jdkato avatar Mar 12 '25 00:03 jdkato

Is there expected to be a performance hit between Markdown and MDX parsing in the new version? Especially when run in a container, it feels pretty big. Some numbers:


Using the brew 3.10 binary:

$ git stash && time vale . 
Saved working directory and index state WIP on {branch}: 97e2e5a chore(deps): update {snip}/vale docker tag to v3.10.0
✔ 0 errors, 0 warnings and 0 suggestions in 163 files.
vale .  20.05s user 0.59s system 414% cpu 4.974 total
$ git stash pop -q && time vale .
✔ 0 errors, 0 warnings and 0 suggestions in 163 files.
vale .  61.84s user 10.96s system 480% cpu 15.149 total

Using 3.10 docker image:

$ git stash && time docker run --rm -v /$(pwd)/:/docs/ -w //docs {snip}/vale:v3.10.0 . --no-wrap --minAlertLevel=warning
Saved working directory and index state WIP on {branch}: 97e2e5a chore(deps): update {snip}/vale docker tag to v3.10.0
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

{snip}

✖ 8 errors, 0 warnings and 0 suggestions in 163 files.
docker run --rm -v /$(pwd)/:/docs/ -w //docs  . --no-wrap   0.02s user 0.02s system 0% cpu 8.867 total
$ git stash pop -q && time docker run --rm -v /$(pwd)/:/docs/ -w //docs {snip}/vale:v3.10.0 . --no-wrap --minAlertLevel=warning
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
✔ 0 errors, 0 warnings and 0 suggestions in 163 files.
docker run --rm -v /$(pwd)/:/docs/ -w //docs  . --no-wrap   0.02s user 0.03s system 0% cpu 39.297 total

Using 3.9.4 docker image:

$ git stash && time docker run --rm -v /$(pwd)/:/docs/ -w //docs {snip}/vale:v3.9.4 . --no-wrap --minAlertLevel=warning
Saved working directory and index state WIP on {branch}: 97e2e5a chore(deps): update {snip}/vale docker tag to v3.10.0
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
✔ 0 errors, 0 warnings and 0 suggestions in 163 files.
docker run --rm -v /$(pwd)/:/docs/ -w //docs  . --no-wrap   0.02s user 0.03s system 0% cpu 9.042 total
$ git stash pop -q && time docker run --rm -v /$(pwd)/:/docs/ -w //docs {snip}/vale:v3.9.4 . --no-wrap --minAlertLevel=error&> out
docker run --rm -v /$(pwd)/:/docs/ -w //docs  . --no-wrap  &> out  0.03s user 0.07s system 1% cpu 5.726 total

(redirecting output due to nearly 3k reported errors)
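
For a quick sense of scale, the wall-clock totals above can be turned into ratios. This small Python sketch assumes the stashed run in each pair is the old Markdown-style setup and the restored run is the new MDX parsing; the thread doesn't say that outright, so treat it as a reading of the numbers. The `slowdown` helper is just an illustrative name. The Docker runs were also emulated (amd64 image on an arm64 host) and the two 3.10 Docker runs reported different error counts, so the pairs aren't perfectly like-for-like.

```python
# Wall-clock totals in seconds, copied from the `time` output above.
# Assumption: first value = changes stashed, second = changes restored.
runs = {
    "brew 3.10": (4.974, 15.149),
    "docker 3.10": (8.867, 39.297),
}


def slowdown(before: float, after: float) -> float:
    """Return how many times longer the second run took."""
    return after / before


for label, (stashed, restored) in runs.items():
    print(f"{label}: {slowdown(stashed, restored):.1f}x slower")

# prints:
# brew 3.10: 3.0x slower
# docker 3.10: 4.4x slower
```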

rainecheck avatar Mar 26 '25 02:03 rainecheck

Yes, there's going to be overhead associated with calling out to an external JavaScript dependency.

This can likely be optimized some (I haven't spent any time looking into this), but it will still be slower than Markdown.

jdkato avatar Mar 26 '25 02:03 jdkato