bridgy-fed
discard stylesheets etc in HTML snippets
My user page shows this right now:
That post, https://snarfed.org/2023-06-02_bridgy-stats-update-8 , has HTML content in its stored Object, which is fine, except it starts with an inline <style>, and the snippet is including its text. We should sniff for HTML in snippet generation and drop <style> tags, and maybe others.
https://github.com/snarfed/bridgy-fed/blob/d9cd5d14b9a8156b48574c9246a44ea053fa04ea/pages.py#L216-L218
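A rough sketch of what that could look like, using only the standard library. Everything here is hypothetical (`DROP_TAGS`, `looks_like_html`, `snippet`, and the `max_len` default are made up for illustration) and is not the existing code in pages.py or granary. It assumes a crude regex check is good enough to sniff for HTML, and that `<script>` should be dropped along with `<style>`.

```python
import re
from html.parser import HTMLParser

# Hypothetical names; not bridgy-fed's or granary's actual code.
DROP_TAGS = {'style', 'script'}  # assumption: script is one of the "maybe others"
TAG_RE = re.compile(r'<\s*/?\s*[a-zA-Z][^>]*>')


class _TextExtractor(HTMLParser):
    """Collects text content, skipping everything inside DROP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in DROP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def looks_like_html(content):
    """Crude sniff: does the content contain anything shaped like a tag?"""
    return bool(TAG_RE.search(content))


def snippet(content, max_len=80):
    """Returns a plain text snippet, dropping <style>/<script> contents."""
    if looks_like_html(content):
        parser = _TextExtractor()
        parser.feed(content)
        parser.close()
        # Text from adjacent elements is joined with no separator.
        content = ''.join(parser.parts)
    text = ' '.join(content.split())  # collapse runs of whitespace
    if len(text) > max_len:
        text = text[:max_len - 1].rstrip() + '…'
    return text
```

For example, `snippet('<style>p { color: red }</style><p>Bridgy stats update</p>')` gives `'Bridgy stats update'`, and plain text input passes through unchanged apart from whitespace collapsing and truncation.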
The only general purpose sniffing code I see in granary right now is in _content_for_create:
https://github.com/snarfed/granary/blob/7724cb29457b74a7ae961eaaedf6cf0326fc4362/granary/source.py#L805-L815