
Markdown strikethrough is not limited to two tilde wrapping

Open denizeren opened this issue 6 years ago • 5 comments

Describe the bug According to the GitHub Flavored Markdown specification, strikethrough text is any text wrapped in two tildes (~). However, the marked library also applies strikethrough to text wrapped in single tildes, which causes unexpected behavior.

The GFM spec was updated to restrict strikethrough to two-tilde wrapping.

To Reproduce Steps to reproduce the behavior:

  1. Wrap any text in single tildes, one on each side (e.g. ~test~).
  2. Paste it into the Marked Demo to view the parsed HTML. The text is rendered as strikethrough when it should not be.

Expected behavior Only text wrapped in two tildes should be rendered as strikethrough.

See example.

denizeren avatar Nov 01 '19 15:11 denizeren

This is what ~test~ looks like on GitHub: ~test~ (rendered with a strikethrough)

It looks like GitHub allows single tildes to be used for strikethrough even though the spec says otherwise.

UziTech avatar Nov 01 '19 15:11 UziTech

A little more research turns up tests in cmark-gfm that treat single tildes as strikethrough.

https://github.com/github/cmark-gfm/blob/36c1553d2a1f04dc1628e76b18490edeff78b8d0/test/extensions.txt#L502-L510

It looks like this is a legacy issue. The spec was originally supposed to support only two tildes but, because of a long-standing bug, a lot of strikethroughs on github.com use single tildes, and GitHub doesn't want to confuse its users.

Seems we have two options:

  1. Make single tildes do nothing and allow strikethrough only on double tildes.
  2. Let single tildes produce strikethrough, as on github.com.

Since the GFM spec doesn't explicitly disallow single tildes for strikethrough, and the gfm option is supposed to copy what markdown looks like on github.com, I say we leave it the way it is.

It looks like the reason strikethrough is meant to be two tildes is that markdown-it uses single tildes for subscript, but until that is actually in a spec I don't see any reason to change it.

UziTech avatar Nov 01 '19 16:11 UziTech

For anyone who's looking for a quick fix: new marked.InlineLexer({ gfm: true }).rules.del = /^~~+(?=\S)([\s\S]*?\S)~~+/
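To see what the quoted pattern accepts, here is a small standalone sketch that runs the same regular expression against a few inputs without involving marked at all. The helper name matchDel is hypothetical and exists only for this illustration.

```javascript
// The quick-fix pattern quoted above: two or more tildes on each side.
const doubleTildeDel = /^~~+(?=\S)([\s\S]*?\S)~~+/;

// Hypothetical helper: returns the struck-through text, or null when
// the input does not start with a double-tilde strikethrough.
function matchDel(src) {
  const cap = doubleTildeDel.exec(src);
  // The pattern has one capture group, so the inner text is cap[1].
  return cap ? cap[1] : null;
}

console.log(matchDel("~~test~~"));   // "test"
console.log(matchDel("~test~"));     // null
console.log(matchDel("~~~test~~~")); // "test"
```

Note that because the pattern uses ~~+, runs of three or more tildes are also accepted as delimiters, not just exactly two.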

bladeSk avatar Mar 22 '20 12:03 bladeSk

For anyone who's looking for a quick fix: new marked.InlineLexer({ gfm: true }).rules.del = /^~~+(?=\S)([\s\S]*?\S)~~+/

Thanks, but can you elaborate on how to override this lexer rule on an existing marked instance? Thanks in advance.

zhao-chong avatar May 23 '21 04:05 zhao-chong

This is how you could use an extension to override the current behavior:

const marked = require("marked");

marked.use({
  tokenizer: {
    del(src) {
      const cap = /^~~+(?=\S)([\s\S]*?\S)~~+/.exec(src);
      if (cap) {
        return {
          type: 'del',
          raw: cap[0],
          text: cap[1] // the pattern has a single capture group, so the inner text is cap[1]
        };
      }
    }
  }
});

UziTech avatar May 23 '21 14:05 UziTech

The spec now states that text wrapped in one or two tildes is strikethrough, which matches the implementation.

Strikethrough text is any text wrapped in a matching pair of one or two tildes (~).
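One way to picture the "matching pair" wording is a pattern that captures the opening run of one or two tildes and requires the same run to close it. The sketch below is an illustration only, not the rule that marked or cmark-gfm actually ships; the names matchingPairDel and matchPair are hypothetical.

```javascript
// Illustrative pattern: opener is one or two tildes, the closer must be
// the same run (\1), and the closer must not be followed by another tilde.
const matchingPairDel = /^(~~?)(?=[^\s~])([\s\S]*?[^\s~])\1(?=[^~]|$)/;

// Hypothetical helper: returns the inner text, or null when there is
// no matching pair at the start of the input.
function matchPair(src) {
  const cap = matchingPairDel.exec(src);
  // cap[1] is the delimiter run, cap[2] is the text between the pair.
  return cap ? cap[2] : null;
}

console.log(matchPair("~test~"));     // "test"
console.log(matchPair("~~test~~"));   // "test"
console.log(matchPair("~~test~"));    // null (opener and closer differ)
console.log(matchPair("~~~test~~~")); // null (three tildes)
```

The backreference is what enforces the pairing: a one-tilde opener can only be closed by one tilde, and a two-tilde opener only by two.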

UziTech avatar Dec 17 '22 04:12 UziTech