Post content doesn't save properly
We've had two separate reports of this now, same issue both times: meta:293657 and meta:293622.
I've checked the logs and the __html attribute that should contain the HTML rendered by our JS was sent with no content in it. I haven't been able to work out why.
One workaround may be to check on the server side that the generated HTML is longer than the Markdown sent (I can't think of any cases where the Markdown would be longer than the HTML), and render the Markdown server-side if it's not.
Markdown is pretty compact; I can't think of a case where it would be longer than the HTML either. We do want to find the root cause, but that sounds like a reasonable workaround in the meantime. (Let's be sure to leave a code comment so future-us doesn't wonder why we're doing this extra check and remove it...)
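A minimal sketch of what that interim check could look like, in Python purely for illustration (the thread doesn't show the actual server-side code). `choose_html` and `render_markdown` are hypothetical names, and the renderer is passed in as a parameter so the sketch doesn't assume any particular Markdown library.

```python
from typing import Callable


def choose_html(markdown: str, client_html: str,
                render_markdown: Callable[[str], str]) -> str:
    """Return the HTML to store for a post (hypothetical helper)."""
    # WORKAROUND: the client has occasionally sent rendered HTML with no
    # real content (see the Meta reports). Until the root cause is found,
    # treat HTML that is not longer than its Markdown source as suspect
    # and re-render on the server. Don't remove this check without
    # confirming that the client-side bug is fixed.
    if len(client_html) <= len(markdown):
        return render_markdown(markdown)
    return client_html
```

With a stand-in renderer such as `lambda md: "<p>" + md + "</p>"`, an empty `client_html` falls back to the server render, while HTML longer than its Markdown is stored unchanged.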
Would a check that the HTML is not empty be sufficient - were both cases completely empty?
Unusual but possible case where Markdown could be longer than its resulting HTML:
Markdown (231 characters):
Fairly Long Title | Measurement Before (with notes) | Measurement After (with notes)
----------------- | ------------------------------- | ------------------------------
Content           | Content                         | Content
Resulting HTML (223 characters):
<thead>
<tr>
<th>Fairly Long Title</th>
<th>Measurement Before (with notes)</th>
<th>Measurement After (with notes)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Content</td>
<td>Content</td>
<td>Content</td>
</tr>
</tbody>
</table>
This only happens because the --- | --- | --- separating table headings from table body can optionally be extended to make the Markdown readable as a table even when not rendered as HTML. This means arbitrarily many Markdown characters can be added without adding anything to the HTML.
Not everyone aligns their tables nicely in Markdown like this (it only benefits future editors, not readers), so some look more like this:
Markdown (124 characters):
Fairly Long Title | Measurement Before (with notes) | Measurement After (with notes)
---|---|---
Content | Content | Content
However, I've seen at least some aligned Markdown tables (and I sometimes do this myself).
For a single-row table like this, the HTML adds 99 characters, which is easily matched by adding hyphens and spaces to align the columns in the Markdown. Adding more rows increases the 99-character cost of converting to HTML, but also introduces more opportunity for extra spaces in the extra rows in the Markdown, so a fairly natural-looking table with arbitrarily many rows could easily have longer Markdown than HTML.
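The character counts quoted above can be reproduced with a short Python script. This is only a sketch: it assumes the body row of the aligned table is padded to the column widths (that padding is what makes the 231 figure work out), and it counts the HTML exactly as quoted in this thread, one tag per line.

```python
headers = [
    "Fairly Long Title",
    "Measurement Before (with notes)",
    "Measurement After (with notes)",
]
cells = ["Content", "Content", "Content"]
widths = [len(h) for h in headers]


def aligned_row(values):
    # Pad every cell except the last to its column width.
    padded = [v.ljust(w) for v, w in zip(values[:-1], widths)] + [values[-1]]
    return " | ".join(padded)


# Aligned table: separator hyphens and body cells stretched to column width.
aligned_md = "\n".join([
    " | ".join(headers),
    " | ".join("-" * w for w in widths),
    aligned_row(cells),
])

# Compact table: minimal separator, no padding.
compact_md = "\n".join([
    " | ".join(headers),
    "---|---|---",
    " | ".join(cells),
])

# HTML lines as quoted in the thread, joined with newlines.
html = "\n".join(
    ["<thead>", "<tr>"]
    + [f"<th>{h}</th>" for h in headers]
    + ["</tr>", "</thead>", "<tbody>", "<tr>"]
    + [f"<td>{c}</td>" for c in cells]
    + ["</tr>", "</tbody>", "</table>"]
)

print(len(aligned_md), len(compact_md), len(html))  # 231 124 223
print(len(html) - len(compact_md))                  # 99
```

The aligned Markdown (231) is longer than the HTML (223), so a plain "HTML must be longer than Markdown" check would wrongly flag this post.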
Is the case where JS is not enabled already handled in a different way?
> Would a check that the HTML is not empty be sufficient - were both cases completely empty?
No - the JS, when it runs, adds an HTML comment at the start of the post as a marker that the HTML was generated from JS. Those markers are present, but nothing else, which indicates that the renderer was called but returned nothing - presumably because it was fed nothing.
> Is the case where JS is not enabled already handled in a different way?
It should be, but I have a sneaking suspicion that server-side rendering is only triggered when the HTML is completely empty, which - as above - it wasn't.
How about a check that the HTML is longer than the standard comment about being generated by JS?
That would work for the workaround.
If that comment is always identical could we just check for equality?
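A rough sketch of the equality variant, in Python for illustration only. The marker text below is hypothetical (the real comment isn't quoted in this thread), and the sketch assumes the marker is byte-for-byte identical every time, apart from surrounding whitespace.

```python
# Hypothetical marker; the real comment text isn't given in this thread.
JS_MARKER = "<!-- rendered-by-js -->"


def needs_server_render(html: str, marker: str = JS_MARKER) -> bool:
    """True when the client HTML is empty or contains only the marker."""
    stripped = html.strip()
    return stripped == "" or stripped == marker
```

Unlike a plain emptiness check, this also flags a payload that holds the marker and nothing else, which is the case described in the logs. It would not catch HTML that is only partially missing.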
I'm wondering if the workaround may end up being long term, if the root cause turns out to be a particular browser's JS implementation. Hence wanting to cover even rare edge cases like the Markdown table.
Possible complication: The initial Meta post has now been edited to mention a second occurrence, this time only missing part of the content (not to be confused with the second report, which was a post from a different user). It's been edited since being reported, and now looks fine, so I'm not sure what was missing (I've commented under the Meta post to ask). Can you see partial HTML in the logs?
Not sure if this is a separate problem. I guess it's still worth having the check for having only the HTML comment where the HTML is supposed to be, but we might need some other check if the HTML is sometimes incomplete but not completely missing. I hope this doesn't lead to having to stop trusting the client to generate the HTML.
Does the log show which browsers were used? Do we need to just stop trusting client-generated HTML for Brave...?
I've had a reply to my comment to say the missing part was the image. This suggests it is not related to this bug and I'll raise it separately.
I've tested in development and adding an image doesn't update the preview until you click in the post body field again. This is fine when you add further text after uploading the image, but if the image is the last thing you add before posting/saving then it won't be added to the HTML. It's still in the Markdown so will appear in the HTML next time you make an edit (which is what happened in the post commented about).
So it's a rare bug that only happens to users who add an image as the last thing in their post, and don't add meaningful alt text.
Turns out I'd already raised on Meta that the edit preview doesn't update after mouse interactions: "Editor preview left outdated after adding content without the keyboard". I've edited that to mention that this can also affect the saved post.
That fully explains the second example in the earlier post, so the only remaining examples should be fixed by #1592.