[Bug]: Sometimes lemmy returns the wrong rich link preview
Requirements
- [X] Is this a bug report? For questions or discussions use https://lemmy.ml/c/lemmy_support
- [X] Did you check to see if this issue already exists?
- [X] Is this only a single bug? Do not put multiple bugs in one issue.
- [X] Do you agree to follow the rules in our Code of Conduct?
- [X] Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.
Summary
I noticed this over the last week. Sometimes when linking to a Lemmy post, the wrong preview image and blurb are returned. Initially I thought it was just a fluke when I linked to this post and got a different result in the preview, but today it happened a second time from Mastodon, with pretty catastrophic results. I posted one link to mastodon, and the preview was a completely different and very NSFW preview instead. The only way to "fix" it was to edit my post and add a random query at the end of the URL.
I asked on Matrix and it seems I'm not the only one affected: https://mstdn.social/@DJxSpeedy/113443024738078089
I checked manually on my post and the preview meta tags are all currently correct.
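For anyone needing the same workaround, here is a minimal sketch of appending a random query parameter to a link before posting it. It assumes the consuming site caches previews by full URL; the helper name `add_cache_buster` and the parameter name `cb` are hypothetical, not anything Lemmy or Mastodon defines.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "cb") -> str:
    """Return `url` with an extra random query parameter appended.

    `param` is an arbitrary name chosen for illustration only.
    """
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # 4 random bytes -> 8 hex characters, different on every call.
    query.append((param, secrets.token_hex(4)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(add_cache_buster("https://lemmy.dbzer0.com/post/31115200"))
```

This only sidesteps a stale or wrong cached preview on the receiving side; it does not address why the wrong metadata was served in the first place.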
Not sure if this is a UI issue or a backend issue. Feel free to move.
Steps to Reproduce
- Post a lemmy link somewhere which fetches link previews.
- Randomly get the wrong one
Technical Details
I'm on Lemmy 0.19.6, but this seems to have also affected previous versions.
Version
0.19.6
Lemmy Instance URL
lemmy.dbzer0.com
Do you happen to be able to reproduce this reliably? Does this seem to only happen when posting Lemmy links to Mastodon or did you see this anywhere else as well? Are the steps for "posting to Mastodon" to create a toot and put the link in the body and that leads to this or something else?
I missed the link to the Lemmy post originally. The lemmy.zip post looks normal when viewed from Lemmy, while it seems broken on Mastodon.
https://lemmy.dbzer0.com/post/31168360, which links to https://lemmy.dbzer0.com/post/31115200, shows wrong metadata even on Lemmy:
{
  "id": 31168360,
  "name": "Will upgrade to 0.19.6 sometime today",
  "url": "https://lemmy.dbzer0.com/post/31115200",
  // ...
  "published": "2024-11-09T11:05:01.327669Z",
  "updated": "2024-11-09T11:05:30.492140Z",
  // ...
  "embed_title": "Ambulance hits Oregon cyclist, rushes him to hospital, then sticks him with $1,800 bill, lawsuit says - Divisions by zero",
  "embed_description": "The cyclist, who suffered a broken nose, was initially treated at the scene by\nthe ambulance driver.",
  "thumbnail_url": "https://lemmy.dbzer0.com/pictrs/image/3a3d2376-11a9-4fd1-bb85-dd58ed314a45.webp",
  "ap_id": "https://lemmy.dbzer0.com/post/31168360",
  // ...
  "url_content_type": "text/html; charset=utf-8"
}
Given that both of these cases are about metadata when posting links to Lemmy posts I'm inclined to say that this is likely a UI issue, which is where opengraph metadata comes from.
I don't think this can be a lemmy-ui issue, because the only metadata fetching it does is to fetch a page title when creating a post. The thumbnails, embed title, and embed description are all fetched in the back-end, from the URL.
I've seen these cases only when you create a post with the wrong url, then correct it afterwards.
The example above is pretty strange, and must have been done not using the cross-post button, since instead of using the https://join-lemmy.org/news/2024-11-08_-_Lemmy_Release_v0.19.6 url, it used a different lemmy link. My guess is that the original URL was the ambulance one, then you tried to correct it but used a different internal lemmy link instead of the join-lemmy.org link.
It's tough to follow, but I'd need a reproducible example.
The ones that happened to me are certainly not the wrong url and then edited, as they are things I've never even seen myself.
I'll post here when it happens to me again.
Seems like the problem is that the message created by Lemmy's "share" button contains the post's body, and the link to the post is added below the body's text. If the body contains any other URL, apps like WhatsApp and Telegram will show a preview of that URL, and not the post's URL.
> I posted one link to mastodon, and the preview was a completely different and very NSFW preview instead.
This probably happens because the URL in the body can get truncated, resulting in a different URL.
> Seems like the problem is that the message created by Lemmy's "share" button contains the post's body, and the link to the post is added below the body's text. If the body contains any other URL, apps like WhatsApp and Telegram will show a preview of that URL, and not the post's URL.
A solution to this would be adding the post's body text AFTER the post's URL.
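The proposed ordering can be sketched roughly as follows. This is not lemmy-ui's actual share code; `build_share_text` and `first_url` are hypothetical helpers, and the sketch assumes a messaging app previews whichever URL appears first in the message.

```python
import re

# Loose URL matcher, good enough for illustration only.
URL_PATTERN = re.compile(r"https?://\S+")


def build_share_text(title: str, post_url: str, body: str = "") -> str:
    """Build a share message with the post URL placed before the body."""
    parts = [title, post_url]
    if body:
        parts.append(body)
    return "\n\n".join(parts)


def first_url(text: str):
    """Return the first URL in `text`, or None if there is none.

    Stands in for the assumed behaviour of a link previewer.
    """
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    message = build_share_text(
        "Some post title",
        "https://feddit.uk/post/21568696",
        "Body text that links to https://example.org/other",
    )
    print(first_url(message))
```

With the URL ahead of the body, the first link in the message is always the post itself, regardless of what the body contains.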
I just saw this happening on a feddit.uk (Lemmy 0.19.7) post that I was linking elsewhere:
https://feddit.uk/post/21568696
This post actually got multiple different rich previews within a few seconds of each other and it is not edited.
The first place where I linked this showed a text preview for what appears to have been a 10-month-old post, https://feddit.uk/post/8913602; at least the title and description match. There was also an image preview that I don't see on that post.
When I noticed the wrong preview I opened the post in my browser, looked at the page source, and saw metadata for an entirely different post (screenshot above). That time it returned metadata of a 2-month-old post, https://feddit.uk/post/19003214. This is also confirmed by the HTML containing <link data-inferno-helmet="true" rel="canonical" href="https://feddit.uk/post/19003214">.
When I used curl to retrieve the page again separately or reloaded again in my browser it returned correct metadata.
The HTML did not have any relevant changes other than title, meta tags and the canonical link.
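Since the canonical link gave the mismatch away here, one way to catch it when it happens again is to compare the canonical link in the returned HTML with the URL that was requested. A minimal sketch using only the Python standard library follows; the helper names are hypothetical, and it assumes the page carries a `<link rel="canonical">` tag like the one quoted above.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("rel") or "").lower() == "canonical":
            self.canonical = attr_map.get("href")


def canonical_mismatch(requested_url: str, html: str):
    """Return the canonical URL if it differs from `requested_url`.

    Returns None when the canonical link matches or is absent.
    """
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return None
    # Ignore a trailing slash when comparing.
    if parser.canonical.rstrip("/") != requested_url.rstrip("/"):
        return parser.canonical
    return None


if __name__ == "__main__":
    sample = (
        '<head><link data-inferno-helmet="true" rel="canonical" '
        'href="https://feddit.uk/post/19003214"></head>'
    )
    print(canonical_mismatch("https://feddit.uk/post/21568696", sample))
```

Feeding this the body of repeated requests for the same post URL and logging any non-None result, together with the response headers, might help narrow down when the wrong metadata is served.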
I think this sounds similar to something I sometimes see when viewing Lemmy posts from Mastodon accounts. As an added wrinkle, it doesn't appear that way on all Mastodon accounts.
Sometimes, it displays a false preview on Mastodon Server A, but not Mastodon Server B.
Another post might look correct on Server A, and wrong on Server B.
And often, posts look completely fine on both.
Here's a screenshot of a post that is displaying correctly on one Mastodon server:
And here's that same post seen on a different Mastodon server, with a false link preview attached:
If you click on the "false" link (in this example, the Gundam thing), you will be redirected to the correct post (in this case, the episode discussion).
I unfortunately don't know how to repro this consistently.
It could be a caching issue, solved by #3248