lol-html icon indicating copy to clipboard operation
lol-html copied to clipboard

Try implement async content handlers

Open inikulin opened this issue 5 years ago • 9 comments

Figure out the performance consequences of making content handlers async

inikulin avatar Nov 13 '19 14:11 inikulin

Hey 👋 Wanted to chime in with a use case in case you're looking for one to drive this. In CloudFlare we have a need for the ability to suspend the streaming rewriter, so we can do various async work: fetching, scheduling microtasks, etc. It's difficult to describe the why we need this without explaining the full infrastructure and product but the tl;dr is that we're running arbitrary customer provided JavaScript inside these handlers that ultimately inserts dynamic HTML--don't worry, customers aren't co-located in the same Workers, so there's no trust issues between them. Their code might have async stuff and how we load their code is already async, so we need a way to "pause" the rewriting while we generate the content for a matching handler.

We'd be flexible on the exact semantics, but it seems like in the JS world, returning a Promise might be doable. Then you could do something like this:

class ElementHandler {
  async element(node) {
    const first = await fetchFirst();
    node.append(first, { html: true });

    const second = await fetchSecond();
    node.append(second, { html: true });
  }
}

Because this doesn't work, we haven't yet tested whether you can append partial HTML chunks or if they need to be complete well-formed HTML. If it ends up being possible to append partials, we'd likely "stream" them in as well, manually reading since the append() api doesn't support Readables or anything. That all is somewhat separate from the async support, just mentioning in case it impacts it. We can append well-formed HTML with matching open/close tags instead of chunking.

Of course, this library is agnostic to JS so this library itself would probably instead use Rust Future's and Workers wrap that in Promise logic, though I could see some potential interopt challenges depending on how Cloudflare integrates this library with V8.

-- Edit: it might be interesting to note we also need to run lol-html in other non-Worker environments.

jayphelps avatar Nov 29 '19 20:11 jayphelps

Thanks for your input! We have similar motiviations, fetch() or cache operations are common use-cases for HTMLRewriter's handlers.

@inikulin knows more about Rust's Futures and await than I do, but the plan would be to implement it on the Cloudflare workers' side and probably leave the library as is for now.

Appending a stream sounds like a good idea to me.

xtuc avatar Nov 29 '19 21:11 xtuc

Hello! We wanted to inline Google Fonts like in this Cloudflare blog post. It was suggested in a comment to use HTMLRewriter instead using a Regex to search through the stream like what's done in the blog post.

We need to stream the HTML response to obtain the href attribute of the stylesheet link, so we look for the stylesheet link. Then, in the handler, we want to fetch the CSS using the href to obtain the stylesheet content and replace the stylesheet link with that content. However, this happens asynchronously which doesn't currently seem to be supported. Wasn't sure if there was another way to approach this.

chlao avatar Jun 02 '20 21:06 chlao

@chlao support of async handlers for HTMLRewriter is work in progress, we will post in this thread once it will be available.

sejoker avatar Jun 18 '20 12:06 sejoker

I just saw the 2020/6/18 Workers Runtime Release Notes about HTMLRewriter API now supports asynchronous handlers in the forum

And I tried and it worked 🎉

Check this playground link

See the Testing tab you will find GoogleFonts link is replaced by inlined CSS.

Thanks, Cloudflare team for implementing this, it will be super useful for my use cases.

vinaypuppal avatar Jun 18 '20 19:06 vinaypuppal

@vinaypuppal indeed, it unblocks so many possibilities of using HTMLRewriter to the full potential, enjoy.

sejoker avatar Jun 19 '20 09:06 sejoker

Can we use this inside tokio? Thanks!

edo888 avatar Jun 25 '20 23:06 edo888

@sejoker can you clarify: the async support was only for Cloudflare Workers, not for this library, is that correct? I recall hearing that in the end it was implemented using fibers in Cap'n Proto, rather than providing direct support in lol-html? I imagine that means there's no plans to support it natively in lol-html and consumers should use their own fiber-like abstraction or similar?

jayphelps avatar Nov 01 '20 19:11 jayphelps

@jayphelps that is correct for the time being. Though, we have some scenarios in mind that will force us to implement async handlers for this lib. Though, no promises for now.

inikulin avatar Nov 01 '20 23:11 inikulin

@inikulin is this actually done? I can't see any trace of implementing async content handlers

methyl avatar Jan 24 '24 13:01 methyl