Empty tag attributes are not parsed correctly
iex(4)> Floki.parse_document("<a href></a>")
{:ok, [{"a", [{"href", "href"}], []}]}
Floki interprets this example as if it were <a href="href">, which is wrong. I would expect Floki either to represent the empty attribute as an empty string or to omit it altogether.
This does not seem to affect fast_html.
This is a limitation of the default parser, mochiweb_html. Please try to use FastHTML or HTML5ever, as the README suggests.
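For anyone landing here, switching parsers might look roughly like the sketch below. It assumes fast_html is added as a dependency and follows the parser options described in Floki's README; the version requirements are placeholders, not recommendations, and the three parts belong to different files as the comments indicate.

```elixir
# mix.exs: add the alternative parser next to Floki.
# The version requirements here are assumptions; check Hex for current releases.
defp deps do
  [
    {:floki, "~> 0.36"},
    {:fast_html, "~> 2.0"}
  ]
end

# config/config.exs: make FastHtml the parser for every Floki call.
import Config
config :floki, :html_parser, Floki.HTMLParser.FastHtml

# Alternatively, pick the parser for a single call and leave the default untouched.
Floki.parse_document("<a href></a>", html_parser: Floki.HTMLParser.FastHtml)
```

The per-call option is handy for comparing how each parser handles the empty `href` attribute without changing application-wide configuration.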
That's worth documenting, rather than closing this bug as "completed", no? What's the point of shipping with a broken parser?
On Thu, 6 Jun 2024, at 17:28, Philip Sampaio wrote:
> This is a limitation of the default parser, mochiweb_html. Please try to use FastHTML or HTML5ever, as the README suggests.
@1player sorry, I didn't mean to sound rude. I only wanted to point out that this is documented in our README: https://github.com/philss/floki?tab=readme-ov-file#alternative-html-parsers