
Make async ergonomics nicer.

Open. b7-7b opened this issue 2 years ago • 1 comment

Using html5ever in async code is painful, because of Cell in StrTendril. As far as I can tell, that choice is for performance reasons (the documentation claims that the use of non-thread-safe primitives is explained in README.md, but no such explanation exists anywhere I can find) but in my use case I'm quite happy to trade a bit of performance for massively enhanced ergonomics with async.

To that end, I'd like to see something which swaps out StrTendril (aka Tendril<UTF8, NonAtomic>) for Tendril<UTF8, Atomic>. A feature to enable atomicity seems reasonable, but there might be a better option. As far as I can tell that's a completely source-compatible change, so this can probably be implemented with something like:

// Default: the existing non-atomic tendril.
#[cfg(not(feature = "atomic"))]
type H5ETendril = tendril::StrTendril;
// Opt-in: atomically reference-counted tendril.
#[cfg(feature = "atomic")]
type H5ETendril = tendril::Tendril<tendril::UTF8, tendril::Atomic>;

and then replacing references to tendril::StrTendril with crate::H5ETendril.
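To illustrate why the swap matters for async, here is a minimal std-only sketch. It does not use html5ever or tendril at all: Rc<str> and Arc<str> are hypothetical stand-ins for the non-atomic and atomic tendril flavours, and the names NonAtomicBuf, AtomicBuf, assert_send and text_len_on_worker are made up for this example. The point is only that a non-atomically refcounted buffer is not Send, while an atomically refcounted one is, and multi-threaded executors generally need values held across an .await to be Send.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Stand-ins only (assumption): Rc<str> plays the role of the non-atomic
// StrTendril, Arc<str> the role of Tendril<UTF8, Atomic>.
type NonAtomicBuf = Rc<str>;
type AtomicBuf = Arc<str>;

/// Compiles only for values whose type may be moved to another thread.
fn assert_send<T: Send>(_: &T) {}

/// Moves the buffer into a worker thread and returns its length in bytes.
/// This needs `AtomicBuf: Send`, the same bound a spawned future would need.
fn text_len_on_worker(buf: AtomicBuf) -> usize {
    thread::spawn(move || buf.len())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let shared: AtomicBuf = Arc::from("<p>hello</p>");
    assert_send(&shared);

    let local: NonAtomicBuf = Rc::from("<p>hello</p>");
    // Uncommenting the next line fails to compile, because Rc<str> is not Send:
    // assert_send(&local);

    println!("local: {} bytes, worker: {} bytes", local.len(), text_len_on_worker(shared));
}
```

With a feature-gated alias like the one above, html5ever's own types would fall on the Arc side of this picture when the feature is on, at the cost of atomic refcount operations.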

Obviously the names can be bikeshedded. Those are the first ones I came up with.

Whether this belongs in this crate or tendril is also up for debate, but in my opinion: people using tendril directly already have plenty of ways to give themselves atomicity, and changing defaults through a feature there would just lead to dependency hell.

I can put in a PR, if it does turn out to be that simple.

b7-7b, Jul 06 '22 21:07

I just started messing with html5ever today (through scraper, FWIW) and immediately ran into this. Painful means possible, though, yes? If so, I would love some help. If the ergonomics work doesn't happen, an example or some documentation could go a long way. I'd be willing to offer my time doing that if I can get something working in the first place.

dhduvall, Sep 06 '22 06:09