nimforum
nimforum copied to clipboard
Formatting of search results is poor
Surely this is not a reasonable result for static statement:

The code block is broken, and a random sentence is printed in a huge font while the title's weight is weaker than the author's.
I just tried to take a look at this out of curiosity and received this output in the forum:

That seems reasonable enough to me. I'm not 100%, but was the issue maybe fixed in the meantime and can thus be closed? Or am I overlooking anything?
it does indeed seems it is fixed but looking at past commits it is unclear what fixed it (I do remember the issue in the past, do not remember recent occurences). and unclear whether or not the forum has been recently deployed. https://github.com/nim-lang/nimforum/issues/317 would be useful :)
https://forum.nim-lang.org/t/7969#50770 seems to be broken: "Couldn't render historic post in #50770"

And it cannot be rendered properly in the search results.
mmh, if this screenshot is to be trusted we do have a last deploy last october:

https://forum.nim-lang.org/t/7969#50770 seems to be broken: "Couldn't render historic post in #50770"
this is related to this (still open) issue: https://github.com/nim-lang/nimforum/issues/330
I think there are some cases still not working
@pietroppeter and @PhilippMDoerner Search the forum for option.sepBy. You should get this:
The issue is only present when the code block gets cut off and doesn't have an ending and/or beginning ```
I can see the issue in the forum, however, I can't seem to replicate it when I run this locally.
I pretty much just copy pasted jack's post above to make a thread out of it, but no matter what I search, this is the kind of stuff I get:

This is with the forum compiled under 1.9.1, which shouldn't make a difference here.
Ohhhhh it's explicitly the comment ## match --<name>:<option> that is making an issue here.
The combination of the comment being ## and no indication that this should be a codeblock being present.
So basically it just sees the "##" in normal text and assumes "That's normal markdown, so that's a headline" and then we have our problem.
Now the next question that follows this is: If you have a code-comment in a search-result code-block and the indicators for the code-block get cut off... how do you treat this then? That seems like a pretty ugly issue to fix. To prevent this kind of thing you'd kind of have to parse the text to ensure if a code-block occurs that it is always closed at the very least at the end of the comment before ellipsing out.
And that is only if the snippet you want to show is near the start or the end of a codeblock. If you're inside of a code-block you also need to scan this and put the entire snippet in between codeblock markings. I'll need to investigate whether that is even possible since I know this makes use of sqlite FTS, though I'm more familiar with FTS5 than FTS4
@PhilippMDoerner You are correct. This issue would likely be fairly difficult to solve based on the reasons you mentioned. With the current structure of the code, I'm also not sure how to approach it. FTS with the snippet extension only returns the partial text from each match, so you'd probably need to get the full text from sqlite to compare so you'd be able to see where the start/end code block markers are. An ugly problem indeed.
I was curious how Discourse solves this, and it looks like they just strip all formatting and render the plain text: https://discourse.codecombat.com/search?q=penPositions%20%3D. Maybe we could do that as well.
they just strip all formatting and render the plain text
It seems a very reasonable thing to do indeed. Maybe highlighting (bold?) the search term as they do
they just strip all formatting and render the plain text
It seems a very reasonable thing to do indeed. Maybe highlighting (bold?) the search term as they do
That actually runs into problems for exactly the example we were looking at. Say the comment looks like this:
## [<p1> (<p2> <p1>)*]
And you have a searchquery of <p1>.
Now naturally you can just go and do a "replace" on the HTML string to insert <strong> <p1> </strong> for any hit to<p1> in the string.
The issue comes in rendering.
How do you render this int he frontend correctly now, because you want the <strong> tags to render as HTML, while you want <p1> to render as text?
If you try to render this as HTML, you'll get the <p1> bold but you also get
Demonstration:

As you can see, there is no
Now we can of course circumvent that, we can say to just render the text as is. But that means it'll also render the <strong> tags for text instead of as HTML.
Demonstration:

So either we just render plaintext, or we accept that any HTML-strings in the content itself will not be rendered as text.
As an additional issue though, I haven't found an easy way in karax so far to retain the whitespacing, so it all looks like the unreadable messy textblobs you see above.
@PhilippMDoerner If you want to render a search of <p1>, you will have to escape the characters that HTML considers special characters. See here for a complete table of conversions. I believe you should only have to escape &, <, >, ", and '. See the lodash escape table.
So in your case, you would replace <p1> by <strong> <p1> </strong>
It does turn out we're overcomplicating things. I just throw together a bit of a solution for multiple issues:
- The snippets that get rendered (using SQLITE FTS4-SQL feature called SNIPPET) sometimes fail to render properly. You instead get to actually see the
**that should get rendered into strong tags, because there's no whitespace after the match. I added those, that get rid of an entire class of rendering issues - To prevent
##and#(the leading whitespace is important) to be rendered into a headline, we might as well do a simple replace of#in general to\\#. Most of the time that fixes the issue, sometimes that leads to\#being rendered instead of#. Which I can live with.
Mysteriously that even takes care of the HTML issue

That fixes the bolded text issue, but now you have more issues because the markdown is parsing proc sepBy[T, T2](p1: Parser[T], p2: Parser[P2]): Parser[seq[T]] = as proc sepBy T, T2: Parser[seq[T]]. You're losing information in the text because markdown sees [X](Y) and parses it as a link Y with text X. I still think you're better off using the plaintext and escaping the HTML special characters as I mentioned above. Otherwise you're going to have to keep finding hacks to workaround markdown parsing issues.
The alternative is plain-text and hacks that work around making things sometimes bold and sometimes not, we're running into hacks either way.
Because you'll have to escape HTML-characters first and then replace whatever placeholder you inserted around search-matches with <strong> and </strong> tags, Neither feels particularly clean. The benefit this one provides is that at the very least it provides some more readable formatting.
But they aren't hacks. HTML has special characters that need to be escaped if you want to display them normally. Replacing &, <, >, ", and ' with their respective escape codes is how you insert text into an HTML element in JS.
I get the formatting will be cleaner in the code samples, but that isn't really an issue when you can just click into the post and read the full code. There is a good reason that Discourse doesn't render text with formatting. It's tricky business and broken text gets rendered like it currently does in the Nim forum.
Also, if you start escaping Markdown characters, the formatting in the regular, non-code text will be broken and won't display headers and links correctly.