bridgy-fed
Web feed polling: if a post was already fetched, metaformats aren't used
jeejeebhoy.ca is bridged from the web to Bluesky at https://bsky.app/profile/did:plc:sg5uen22rksnwygsonuicm4h . We successfully found and used an OGP image for https://jeejeebhoy.ca/2024/07/26/paris-2024-olympics-opening-ceremony/ , but we didn't for https://jeejeebhoy.ca/2024/08/14/cover-reveals-title/ , I think because we'd fetched it earlier as part of processing https://bsky.app/profile/did:plc:4wtl4ihomtq4zax5cz2y62us/post/3kzrgvc7ybx2n , which links to it. We then later polled jeejeebhoy.ca's feed, saw it, saw that it didn't have an mf2 image, and loaded it with metaformats:
https://github.com/snarfed/bridgy-fed/blob/e776ff944b61120a7bea96e0baa6b31db696dfe8/web.py#L760-L763
...but the metaformats code is in Web.fetch:
https://github.com/snarfed/bridgy-fed/blob/e776ff944b61120a7bea96e0baa6b31db696dfe8/web.py#L476-L484
...and this post was already in the datastore, so we loaded the stored copy without fetching and never reached that code. So we need to handle metaformats somewhere else, somewhere that runs even when we load an object but don't need to fetch it.
I think this is the root cause, but I'm only about 90% sure.
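To make the suspected sequence concrete, here is a minimal, self-contained sketch. It is not bridgy-fed's actual code: `StoredObject`, `DATASTORE`, `FETCH_LOG`, `fetch`, `load`, `load_refetching`, and the example URL are all hypothetical stand-ins. It assumes only what is described above: the metaformats fallback lives in the fetch path, and a load that hits the datastore skips the fetch. `load_refetching` is just one possible shape for a fix, not a decided approach.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical stand-in for a stored post; not bridgy-fed's real model."""
    url: str
    props: dict = field(default_factory=dict)


DATASTORE = {}  # url -> StoredObject, stands in for the real datastore
FETCH_LOG = []  # records every simulated network fetch


def fetch(url, metaformats=False):
    """Simulated fetch. The metaformats fallback lives only here."""
    FETCH_LOG.append((url, metaformats))
    props = {"name": "a post"}  # pretend the mf2 parse found no image
    if metaformats:
        props["image"] = url + "#ogp-image"  # pretend the OGP fallback found one
    return StoredObject(url, props)


def load(url, metaformats=False):
    """Return the stored object if present; fetch only on a miss."""
    obj = DATASTORE.get(url)
    if obj is None:
        obj = fetch(url, metaformats=metaformats)
        DATASTORE[url] = obj
    return obj  # on a hit, the metaformats flag is silently ignored


def load_refetching(url, metaformats=False):
    """One possible workaround (an assumption, not a decided fix): refetch when
    the caller wants metaformats and the stored copy has no image."""
    obj = load(url, metaformats=metaformats)
    if metaformats and "image" not in obj.props:
        obj = fetch(url, metaformats=True)
        DATASTORE[url] = obj
    return obj


def demo():
    """Replay the suspected sequence against a clean in-memory datastore."""
    DATASTORE.clear()
    FETCH_LOG.clear()
    url = "https://example.com/post"
    load(url)  # first fetched while processing another post that links to it
    polled = load(url, metaformats=True)  # later feed poll: datastore hit, no fetch
    fixed = load_refetching(url, metaformats=True)
    return polled, fixed
```

Calling `demo()` returns the object the feed poll would have seen (no `image` property, because the second `load` never reached `fetch`) and the object after the hypothetical refetch. Refetching costs an extra HTTP request per affected post; applying metaformats to already-stored HTML, if it is kept around, would be another option.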