
[bug] ASCII hearts get removed from markdown posts </3

Open autumnull opened this issue 2 years ago • 6 comments

Describe the bug with a clear and concise description of what the bug is.

Including either <3 or </3 in a markdown post causes text to be dropped from the rendered HTML.

What's your GoToSocial Version?

v0.6.0

GoToSocial Arch

armv7 Binary

Browser version

No response

What happened?

Input text:

hello <3 old friend <3 i loved u </3 :(( you stole my heart

Content HTML:

<p>hello old friend i loved u

Note that the closing </p> tag is missing, not just the latter half of the sentence.

To summarize:

  • <3 is removed
  • </3 is removed along with everything following it, including the closing </p> tag.

What you expected to happen?

Expected HTML:

<p>hello &lt;3 old friend &lt;3 i loved u &lt;/3 :(( you stole my heart</p>

Taken from the CommonMark dingus :)
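For reference, the expected escaping can be sketched with only Go's standard library. This is not a Markdown renderer: `html.EscapeString` is used as a stand-in that escapes every `<`, whereas a CommonMark renderer escapes only a `<` that does not start valid inline HTML. The helper name `expectedParagraph` is hypothetical.

```go
package main

import (
	"fmt"
	"html"
)

// expectedParagraph is a hypothetical helper: it wraps plain text in <p>...</p>
// and escapes HTML-special characters, so "<3" becomes "&lt;3".
// Assumption: the input contains no markup that should survive as HTML.
func expectedParagraph(text string) string {
	return "<p>" + html.EscapeString(text) + "</p>"
}

func main() {
	input := "hello <3 old friend <3 i loved u </3 :(( you stole my heart"
	fmt.Println(expectedParagraph(input))
}
```

For the input above this prints the same line as the Expected HTML.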

How to reproduce it?

Provide input as above.

Anything else we need to know?

No response

autumnull avatar Dec 09 '22 13:12 autumnull

Upon inspection, the issue is in blackfriday. It converts the above markdown to

<p>hello  old friend  i loved u </3 :(( you stole my heart</p>

which is then sanitized (pretty reasonably) by bluemonday to remove the unescaped < and everything after it.

The secondary issue is that blackfriday has had no commits since 2020, and has a big pile of unresolved issues, so it is unlikely that the issue in blackfriday will be fixed.

So I tested a few other markdown libraries. The second most popular on GitHub, gomarkdown, has the same issue. However, golang-commonmark, the CommonMark implementation in Go, does work.

My suggestion is to switch to golang-commonmark. Unfortunately it comes with a handful of dependencies, but it is the best suggestion I have: they seem more intent on sticking to the spec, which I think is desirable.

autumnull avatar Dec 09 '22 19:12 autumnull

Scratch that suggestion, actually: I just found goldmark, which has no dependencies, is more popular and well maintained, and is also CommonMark spec compliant. It seems ideal.

autumnull avatar Dec 09 '22 19:12 autumnull

@autumnull does it work on goldmark?

Fastidious avatar Dec 09 '22 19:12 Fastidious

yes.

autumnull avatar Dec 09 '22 20:12 autumnull

Oooh good find 👀 I'm open to switching from blackfriday. Perhaps it's worth making a separate issue for doing the switch so we can write up what's required? Or do you want to do it here?

tsmethurst avatar Dec 09 '22 22:12 tsmethurst

I'm currently working on switching it over. I don't mind if you want to make a new issue if that's tidier for project management etc. <3

autumnull avatar Dec 09 '22 22:12 autumnull