pinafore
pinafore copied to clipboard
Support oEmbed audio/video players
I couldn't find an issue about this, so I thought I'd ask. Is it intentional that the html
portion of oEmbed-backed card previews are not considered? I can see privacy and UI elegance concerns justifying this.
If so, perhaps the embed (typically an iframe, possibly enforced to avoid weird stuff) could be inserted after some kind of user interaction. E.g. similar to the play button for non-autoplay GIFs, it could then render after or instead of the default card.
Examples:
Bandcamp | PeerTube |
---|---|
![]() |
![]() |
The default Mastodon web app handles this by rendering a fixed-height placeholder using the cover image and a button group to either play inline or open the link externally:

I didn't even know that there was an html
attribute that could render an iframe. In any case this would be blocked by Pinafore's CSP. It doesn't allow any cross-origin iframes or cross-origin scripts, so the above example would be totally blocked.
There is definitely a usability vs performance/security tradeoff here, but given my design for Pinafore I prefer to err on the side of performance/security. Pinafore currently doesn't run any code except from pinafore.social
and it has no iframes at all, which is great for performance. I'd be disappointed if the timeline slowed down or a vulnerability were introduced because of a media player in the timeline (which the user may not even interact with).
OTOH sanitizing the HTML and displaying it as-is may be okay... assuming the third-party source was just using it to bold text or something. But again that potentially introduces security risks, or at the very least just the risk that the layout could be messed up. (CSP would block third-party scripts, but I'm not currently blocking third-party CSS, so they could embed CSS that just wrecks the whole page visually.)
What if we 1) don't render it by default (only on click, like Mastodon web), and 2) even then limit it to cases where the html has been verified beforehand by Pinafore to be a string with a single "simple" <iframe>
element, with fixed width and height attributes. Thus not allowing it to be used to display inline content, which is indeed not what this feature is supposed to be used for. (It exists primarily for media playback)
If we find that the string isn't just an iframe, or somehow seems too complicated to validate simply, we err on the side of caution and act like today with just an external link from an OG card.
We could parse the string statically, see what the element tagname and attributes are, and if we like what we see, create a similar string to render upon click. (By roundtripping instead of merely pass/fail, we further reduce risk by ensuring we produce only what we allow even, if our checks were somehow fooled).
Right, but doing so would also mean weakening the current CSP to allow for third-party iframes. That potentially leaves us open to clickjacking attacks. I'm not sure it's worth the tradeoff.