sidebar: possible issues with HTML rendering
Is this an expected result?
I would think promnesia shouldn't render any HTML it finds in the context, seems dangerous
Had a discord message which had a block of HTML in it as the context:

seems that it gets rendered/executed:

Probably want to escape HTML so that this doesn't happen?
The text used, for reference:
curl 'https://myanimelist.net/profile/purplepinapples' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://myanimelist.net/' -H 'Connection: keep-alive' -H 'Cookie: MALSESSIONID=MALHLOGSESSID=; m_gdpr_mdl_2=1; is_logged_in=1; anime_update_advanced=1; clubcomments=a%3A6%3A%7Bi%3A77624%3Bi%3A1588205393%3Bi%3A29693%3Bi%3A1588528890%3Bi%3A18421%3Bi%3A1588906736%3Bi%3A7367%3Bi%3A1588919526%3Bi%3A19736%3Bi%3A1588955723%3Bi%3A72940%3Bi%3A1589744327%3B%7D' -H 'Upgrade-Insecure-Requests: 1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache'
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
</body>
</html>
A related issue might be rendering markdown in the context, since GitHub/Reddit/Discord all follow a similar syntax.
That still wouldn't address the security concern though, as it currently seems like arbitrary HTML from the context gets rendered/executed.
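For illustration, here is a minimal sketch of what escaping would do to the context above. It uses Python's standard `html.escape`; promnesia's actual rendering happens in the extension (JavaScript), so this only shows the idea and is not the project's code. The `context` variable is a name made up for this example.

```python
import html

# The 403 page from the message above, as it would arrive in the context.
context = """<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
</body>
</html>"""

# html.escape replaces &, < and > (and, by default, quotes) with entities,
# so a browser displays the markup as text instead of interpreting it.
escaped = html.escape(context)

print(escaped.splitlines()[0])  # &lt;html&gt;
```

With this, the sidebar would show the literal `<html>...` text rather than a rendered "403 Forbidden" heading.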
Hmm so this should be the relevant bit of code that displays it: https://github.com/karlicoss/promnesia/blob/0e1e9a1ccd1f07b2a64336c18c7f41ca24fcbcd4/extension/src/display.js#L160-L196
The reason HTML is used at all is that the markdown parser only keeps track of the block's HTML: https://github.com/karlicoss/promnesia/blob/0e1e9a1ccd1f07b2a64336c18c7f41ca24fcbcd4/src/promnesia/sources/markdown.py#L63-L64
So the extension uses the safeSetInnerHTML function, which AFAIK is a safe way of converting HTML text to actual DOM: https://github.com/karlicoss/promnesia/blob/0e1e9a1ccd1f07b2a64336c18c7f41ca24fcbcd4/extension/src/common.js#L226-L235 E.g. DOMParser won't arbitrarily execute scripts etc.: https://stackoverflow.com/questions/22945884/domparser-appending-script-tags-to-head-body-but-not-executing
In fact, the Chrome/Firefox extension stores' security policies wouldn't allow arbitrary eval-like stuff, so to my knowledge it should be safe (maybe except for potentially spoofing the parent page?).
But maybe it's a good point that there should be a setting to opt in/out of it? I'm certainly thinking about the security implications, but I'm not an expert in this stuff, so it might be better to make the defaults safe just in case.
I guess either way, even if it's safe, it seems that the markdown library's HTML renderer has an issue here? Presumably this stuff quoted in '```' should be rendered as <pre> or something like that?
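To make the `<pre>` idea concrete, here is a rough sketch, assuming the context is plain text that may contain fenced blocks. It is not how promnesia or any markdown library handles this; `render_fences` and `FENCE_MARK` are hypothetical names, and a real markdown parser would handle many more cases.

```python
import html
import re

# Three backticks, built programmatically so this example nests cleanly
# inside a fenced block itself.
FENCE_MARK = "`" * 3

# Opening fence (with an optional language tag), the body, the closing fence.
FENCE = re.compile(FENCE_MARK + r"[^\n]*\n(.*?)" + FENCE_MARK, re.DOTALL)


def render_fences(text: str) -> str:
    """Escape all text, wrapping fenced blocks in <pre> elements."""
    out = []
    pos = 0
    for m in FENCE.finditer(text):
        # Text before the fence: escaped, so no tag from the input survives.
        out.append(html.escape(text[pos:m.start()]))
        # Fence body: also escaped, but shown preformatted.
        out.append("<pre>" + html.escape(m.group(1)) + "</pre>")
        pos = m.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


sample = "output:\n" + FENCE_MARK + "\n<h1>403 Forbidden</h1>\n" + FENCE_MARK + "\ndone"
print(render_fences(sample))
```

The only tags in the result are the `<pre>` pairs the function adds itself; everything from the input is entity-escaped.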
Alright, just in case, here's the content of that sidebar node: https://gist.githubusercontent.com/seanbreckenridge/fedd7db7f111bd97edbfd84722fe0441/raw/dce9cba7b29216bb25ac3f9cb310ee64485453f0/promnesia.html
Also, not sure if it just auto-detects markdown -- I assume I have to feed it through mistletoe in my discord source?
Perhaps the same should be done for github body here, and for reddit as well?
Also not sure on the security implications, should probably just link to #14; perhaps someone who knows more about this can comment in the future
If this is expected behavior, feel free to close the issue
Ah, sorry! I read 'markdown' and looked in the markdown source :man_facepalming: No, it doesn't autodetect. So that means in the extension it would go via the 'anchorme' path https://github.com/karlicoss/promnesia/blob/0e1e9a1ccd1f07b2a64336c18c7f41ca24fcbcd4/extension/src/display.js#L163-L187 (to try to detect the URLs and make them clickable), and then it just sets the HTML as is, so there is a possibility of such artifacts... IIRC the problem is that it's tricky to keep the <a> tags while also cleaning up everything else (ideally other tags shouldn't seep through at all?). Not sure if there is some easy way to solve it, but it would be nice. Ideally I guess the result should simply be a sequence of <a> and <pre> tags or something like that?
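One way to get "only `<a>` tags survive" is to escape first and build the links yourself, instead of cleaning up afterwards. Below is a rough sketch of that idea in Python. It is not what the extension or anchorme does; `linkify` and the deliberately simple `URL` pattern are assumptions made up for this example.

```python
import html
import re

# Deliberately simple URL pattern (an assumption for this sketch): it stops at
# whitespace and at characters that could open or close a tag or an attribute.
URL = re.compile(r"https?://[^\s<>\"']+")


def linkify(text: str) -> str:
    """Return HTML in which detected URLs are the only real tags."""
    parts = []
    pos = 0
    for m in URL.finditer(text):
        # Everything between URLs is escaped, so input tags become plain text.
        parts.append(html.escape(text[pos:m.start()]))
        # The URL is escaped too, both as the attribute value and as link text.
        url = html.escape(m.group(0), quote=True)
        parts.append(f'<a href="{url}">{url}</a>')
        pos = m.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)


print(linkify('<b>see</b> https://beepb00p.xyz for details'))
```

Here the `<b>` pair from the input comes out as `&lt;b&gt;...&lt;/b&gt;`, and the only element in the output is the generated link.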
And yeah -- I think ideally sources returning markdown would add the HTML hint, didn't have time to try it so far though -- let me know if/how it works!
Whoops -- meant 'markdown hint', but there isn't one yet. I guess that would either be HTML hint + rendering markdown in the backend, or new 'markdown hint' + actually rendering markdown in the frontend. Not sure which way is the best, but the former is probably easier to quickly test
partially solved by #234, since I can now render the discord markdown on the backend using the helper functions added there
It doesn't address the possible security issue, though I'm not sure whether that is actually an issue.
Up to you if you want to leave this open or close it; personally, my issue's been resolved.
Yeah let's keep it, I think there is some useful context
just an example that reproduces it (cause I already forgot at this point :sweat_smile: )
# link = '<a style="font-size: 2rem; line-height: var(--menu-bar-height);" href="https://beepb00p.xyz">back to blog</a>'
after indexing, this ends up in promnesia as

Even though the indexer isn't enforcing HTML, anchorme detects the HTML, so it doesn't do anything with the tag.