
WASI P3 HTTP middleware

Open itowlson opened this issue 3 months ago • 11 comments

As part of WASI P3 work, we want to explore implementing HTTP middleware using component composition. The purpose is to allow application developers to use off-the-shelf or custom components in a pipeline to add or customise behaviour without needing to bake it into the core application component.

For example, the following site requires a CORS check and authentication, and enriches the request with geolocation information. None of these are core application concerns, and they could ideally be pulled off the shelf rather than custom coded.

Image

Manifest

Spin cannot naturally do this in the current dependency model, because dependencies model only imports of the application component. (It can do one level of middleware unnaturally, by making the main component the middleware and having it depend on the application, but 1. that currently allows only one level and 2. that's unnatural, ew.)

So to support middleware we have to decide:

  1. Do we try to wedge it into the current dependencies section?
  2. If not, where do we want to put it?

The trouble with using the dependencies section is that it's a map, whereas the middleware pipeline is a sequence. All middleware components use the same interface, and the order of plugging them together is significant. I'm sure some creative soul has an idea for how to overcome this mismatch, but it defeated me. Ideas welcome!
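To make the mismatch concrete, here is a minimal sketch. It assumes a dependencies-style table is keyed by import name, and uses `wasi:http/handler` as a purely illustrative key; the functions are hypothetical stand-ins, not Spin's real manifest types. Three middleware entries that share one interface collapse into a single map entry, whereas a list keeps all three in order.

```rust
use std::collections::BTreeMap;

/// Models a dependencies-style table: import name -> component providing it.
/// (Hypothetical stand-in, not Spin's real manifest types.)
fn as_dependency_map(entries: &[(&str, &str)]) -> BTreeMap<String, String> {
    entries
        .iter()
        .map(|(import, component)| (import.to_string(), component.to_string()))
        .collect() // later entries with the same key overwrite earlier ones
}

/// Models a pipeline-style list: every entry is kept, in declaration order.
fn as_pipeline(entries: &[(&str, &str)]) -> Vec<String> {
    entries
        .iter()
        .map(|(_, component)| component.to_string())
        .collect()
}
```

Feeding both functions the same three entries (cors, auth, geo, all keyed by the same interface) leaves one entry in the map and three in the pipeline.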

If we can't use the dependencies section, where do we put it? The two options appear to be the trigger or the component:

# Trigger?
[[trigger.http]]
route = "/..."
component = "cat-videos"
pipeline = ["cors", "auth", "geo"]

# Or alternative trigger?
[[trigger.http]]
route = "/..."
component = ["cors", "auth", "geo", "cat-videos"]

# Or component?
[component.cat-videos]
source = "catvids.wasm"
pipeline = ["cors", "auth", "geo"]

I'm attracted to putting it on the trigger, because:

  1. It's HTTP specific - it can be part of the HTTP trigger schema rather than the common component schema
  2. It frames middleware as part of the serving pipeline rather than as inherent to the component

(okay mostly 1 I admit)

However, all our composition and Wasm loading infrastructure is component centric. During prototyping (https://github.com/itowlson/spin/tree/http-middleware) I found it really hard to see how the relevant subsystems could reach into trigger-specific config this way - for example how would spin registry push know that this field on the HTTP trigger required binaries to be resolved and packaged?

So for prototyping I've been putting it on the component. But I'd value discussion and feedback on this.

Interface

Middleware needs to be able to pass a request on to the next entry in the chain. In principle this could be done by composing each component's wasi:http/handler import onto the next one's export, then the middleware code calling handle on the incoming request to pass it on. But many middleware items will want to do their own HTTP stuff, e.g. authenticating using OIDC or whatever.

https://github.com/WebAssembly/WASI/issues/793 proposes adding an origin interface to the WASI-HTTP proxy world. This would be an additional interface identical to handler, but with the semantics of "pass it on along the chain" rather than "fling it across the network." My prototype called this next and had it in a Spin-specific package, but the concept is the same. If origin lands soon enough then we can use that, otherwise we use our own and adopt the spec one when it arrives.

Unlike dependencies, Spin (or the HTTP trigger, or whatever) needs special knowledge of the middleware origin/next interface, because it's not amenable to plug composition (and we don't want devs to have to specify the mapping on every pipeline entry!).
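As a rough illustration of the "pass it on along the chain" semantics, here is a minimal synchronous sketch in plain Rust. Everything in it is an assumption made for illustration: the `Request`, `Response` and `Middleware` types, the `Auth` and `Geo` entries, and the `x-geo` header are hypothetical, and real WASI P3 handlers would be async and use the WASI HTTP types rather than these stand-ins.

```rust
/// Simplified stand-ins for HTTP request/response types (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Request {
    headers: Vec<(String, String)>,
}

#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// What each pipeline entry does: handle a request, optionally calling `next`.
trait Middleware {
    fn handle(&self, req: Request, next: &dyn Fn(Request) -> Response) -> Response;
}

/// Rejects requests without an authorization header; otherwise passes them on.
struct Auth;
impl Middleware for Auth {
    fn handle(&self, req: Request, next: &dyn Fn(Request) -> Response) -> Response {
        let authorised = req.headers.iter().any(|(k, _)| k == "authorization");
        if authorised {
            next(req)
        } else {
            Response { status: 401, body: String::new() }
        }
    }
}

/// Enriches the request with a (placeholder) geolocation header, then passes it on.
struct Geo;
impl Middleware for Geo {
    fn handle(&self, mut req: Request, next: &dyn Fn(Request) -> Response) -> Response {
        req.headers.push(("x-geo".into(), "unknown".into()));
        next(req)
    }
}

/// Runs the chain in order; the application is the final `next`.
fn run_pipeline(
    chain: &[&dyn Middleware],
    app: &dyn Fn(Request) -> Response,
    req: Request,
) -> Response {
    match chain.split_first() {
        None => app(req),
        Some((first, rest)) => {
            first.handle(req, &|r: Request| run_pipeline(rest, app, r))
        }
    }
}
```

The point of the sketch is only that `next` is distinct from any outbound HTTP the middleware does for its own purposes; how the host wires `next` to the following entry is exactly the special knowledge discussed above.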

Permissions

Currently, dependencies get the choice of "main component permissions or nothing," governed by the dependencies_inherit_configuration flag. We should not expect fine-grained permissions in the middleware timeframe, so middleware will likely inherit the same permissions behaviour, governed by the same flag. This fits nicely with "middleware is an attribute of the component" but not so nicely with it being on the trigger (because if the component needs to allow-list the OIDC provider then it is not middleware-agnostic after all).

This is a bit of a brain dump and I am sure other brains have thought about this a lot more than mine so please weigh in.

itowlson, Oct 07 '25 01:10

https://github.com/spinframework/spin/blob/e4a160066d43f88023466f3940dbc9f31bb5dc94/crates/manifest/src/schema/v2.rs#L133

🙂

I think my theory was something like:

[[trigger.http]]
component = "cat-video"
components.middleware = ["cors", "auth", "geo"]

lann, Oct 07 '25 01:10

The worry with that is still that the meaning of the middleware key and the stuff it's meant to do is specific to the HTTP trigger. But good catch as it does provide a home for "hey loader you're gonna need these things for Wasming."

I will have an investigate of this - thanks!

itowlson, Oct 07 '25 01:10

Another option we have is to implement middleware more like service chaining with the "linking" implemented in the host rather than composition. That has its own tradeoffs but seems better than the watered-down functionality of dependencies.

This would also give us a solution to the "origin" problem, where we could use another magic hostname like e.g. next.middleware.spin.alt or what have you.
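A minimal sketch of how host-side linking might route a middleware's outbound requests, assuming the magic hostname suggested above. The `Destination` enum, the `route_outbound` function and the position-based chain model are hypothetical illustrations, not existing Spin APIs.

```rust
/// Where the host sends an outbound request made by a pipeline entry.
/// (Hypothetical model for illustration.)
#[derive(Debug, PartialEq)]
enum Destination {
    /// Hand the request to the middleware at this index in the chain.
    Middleware(usize),
    /// The chain is exhausted: hand the request to the application component.
    Application,
    /// An ordinary outbound request, e.g. to an OIDC provider.
    Network(String),
}

/// Magic hostname floated in the discussion; the exact name is undecided.
const NEXT_HOST: &str = "next.middleware.spin.alt";

/// Decides the destination for a request made by the entry at `position`
/// in a chain of `chain_len` middleware components.
fn route_outbound(host: &str, position: usize, chain_len: usize) -> Destination {
    if host != NEXT_HOST {
        return Destination::Network(host.to_string());
    }
    if position + 1 < chain_len {
        Destination::Middleware(position + 1)
    } else {
        Destination::Application
    }
}
```

Under this model the middleware keeps normal outbound HTTP for its own needs, and only requests to the magic host continue along the chain.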

lann, Oct 07 '25 02:10

> Another option we have is to implement middleware more like service chaining with the "linking" implemented in the host

@lann does that mean you would have to explicitly call the middleware in your code handler over HTTP?

radu-matei, Oct 07 '25 12:10

> The worry with that is still that the meaning of the middleware key and the stuff it's meant to do is specific to the HTTP trigger.

@itowlson -- I am not convinced I follow the above given the very first sentence of this issue:

> As part of WASI P3 work, we want to explore implementing HTTP middleware using component composition.

Yes, I imagine we could have "middleware" for non-HTTP components as well in the future. But I worry that a) I don't see the feature requests for that (not saying it doesn't exist, just that overwhelmingly, we've been talking HTTP specific middleware), and b) in my opinion it would be a loss delaying HTTP middleware longer given a).

What do you think?

radu-matei, Oct 07 '25 12:10

@radu-matei The concern is not non-HTTP middleware. Such a thing is totally out of scope unless someone asks for it.

The concern is how the composition subsystem is to understand the trigger-specific semantics of the middleware sub-key. I.e. I think this solves the "how does spin registry push know that these are components that need to be included as binaries in the OCI artifact" problem, but I haven't yet convinced myself it solves the "the thing that understands the trigger-specific key has access to the right things at the right time" problem. It's definitely worth exploring though: the HTTP trigger would presumably have specific code, I just need to assure myself that it can be made to line up with the composition stuff without putting specific knowledge of the middleware sub-key into agnostic code.

itowlson, Oct 07 '25 18:10

A specific concern is precomposition in OCI artifacts. By default, Spin performs all composition at registry push time (for compatibility with hosts that don't do composition). But how does spin registry push know how to interpret the components.middleware table (or any other trigger-config equivalent), when that table and its semantics are specific to the HTTP trigger?

(Lann's suggestion of service-chaining-a-like would bypass this although possibly at the expense of borking hosts that don't support service chaining. Not sure, we can certainly look into this too.)

(Also, to be clear, as an interim fix we could give spin registry push special knowledge of HTTP trigger keys: it would kinda be kicking the can down the road, but it's quite possible we'd kick it far enough that we would never need to think about it ever again.)

itowlson, Oct 07 '25 19:10

Okay, I started looking at doing it in the HTTP trigger and there's another little nasty: we have an assumption that we can look up InstancePres by component ID. Per-trigger middleware means we can no longer rely on that assumption: two triggers might point to the same component ID, but have different middleware resulting in different compositions. I'll investigate how easy it is to modify the current assumption: I think the current mapping is RouteMatch -> component ID -> InstancePre in which case we might be able to get away with switching to something richer in the middle.
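One way to picture "something richer in the middle" is to key the lookup on the component ID plus its ordered middleware list. This is a sketch under that assumption only: `CompositionKey`, `Trigger` and this `InstancePre` are hypothetical stand-ins, not the types in the Spin codebase or in the prototype branch.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pre-instantiated composition.
#[derive(Debug, Clone, PartialEq)]
struct InstancePre {
    /// Component IDs composed together, outermost first.
    composed_from: Vec<String>,
}

/// Richer lookup key: the component ID plus the ordered middleware in front of it.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CompositionKey {
    component_id: String,
    middleware: Vec<String>,
}

/// Minimal trigger model: which component it targets, and through which middleware.
struct Trigger {
    component_id: String,
    middleware: Vec<String>,
}

fn key_for(trigger: &Trigger) -> CompositionKey {
    CompositionKey {
        component_id: trigger.component_id.clone(),
        middleware: trigger.middleware.clone(),
    }
}

/// Builds one InstancePre per distinct (component, middleware) combination,
/// so triggers that share both also share the composition.
fn build_instance_pres(triggers: &[Trigger]) -> HashMap<CompositionKey, InstancePre> {
    let mut pres = HashMap::new();
    for trigger in triggers {
        pres.entry(key_for(trigger)).or_insert_with(|| {
            let mut parts = trigger.middleware.clone();
            parts.push(trigger.component_id.clone());
            InstancePre { composed_from: parts }
        });
    }
    pres
}
```

With this shape, two triggers pointing at the same component with different middleware get separate entries, while identical combinations are deduplicated.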

itowlson, Oct 07 '25 22:10

All right! I have it working using the trigger:

https://github.com/itowlson/spin-middleware-terrifying-nonsense/blob/17e1bf4e23710524fe51ee05689b2ed3f5165609/spin.toml#L14

Please note that "I have it working" is, as ever, code for "it has worked once, in a trivial test case, on my machine." I had to violate enough fairly foundational assumptions, and did so in a sufficiently cavalier manner, that I'd have little confidence of it working right now in more complicated examples or in other hosts!

Some notes / learnings:

  • I have not yet tested if middleware components are correctly included in OCIs. My guess is "it depends."
  • I do not have a solution for SpinKube. It's going to require either terrifying changes to deep bits of SpinKube or OCI having knowledge of trigger-specific keys in the components map.
  • The trigger.components thing is currently defined as a map from strings (e.g. middleware) to a ComponentSpec or an array of ComponentSpec. So with the current interpretation of ComponentSpec we can't write something like middleware = ["github:[email protected]"] or even middleware = [{ package = "github:[email protected]" }], because members have to be either strings, which are interpreted as IDs in the manifest, or inline components, which require the source = stuff. Fortunately, we have no back compat obligations in regard to trigger.components (it is marked as "reserved") so we can rejig this if we want.

The WIP branch is https://github.com/itowlson/spin/tree/http-middleware-get-triggy-with-it-2 but I would advise against looking at the diffs unless you're really into Pier Paolo Pasolini.

itowlson, Nov 03 '25 02:11

Updated the spin-middleware-terrifying-nonsense testbed to show it modifying the message body (with streaming both in the middleware and at the service). You'll need the https://github.com/dicej/spin/tree/async-fixes branch of Spin to run it, though (which is the triggy branch but with some upstream fixes).

(note that the testbed is currently deliberately slowed down to make it easier to verify streaming behaviour by eye)

itowlson, Nov 05 '25 02:11

New version: https://github.com/itowlson/spin/tree/http-middleware-take-2 awaits your gasps of dismay.

What's changed:

  • I did OCI. I don't love it, I will never love it, but it works. Has worked. Once. On my machine.

  • I violated fewer fundamental assumptions, because in shock news violating fundamental assumptions 1. turned out to cause some elusive bugs that looked hard to track down and 2. didn't work for the OCI case anyway. The OCI solution which I hated, ironically, worked perfectly, so I adopted that throughout.

    • This also makes the changes far less invasive to the engine/instance plumbing.
  • Many, but not all, things have slightly less stupid names.

What's the same:

  • For now, middlewares must still be manifest components, and can still be identified only by component ID references (e.g. not by file paths or registry package refs).

  • I am still using my kludgy special snowflake middleware interface instead of the forthcoming WASI RC.

The big fundamental change from the first approach is that I now move middleware specifications from the triggers to the components during lockfile creation. This means synthesising extra manifest components if more than one trigger points to the same component and at least one of those triggers has middleware in play. (This may turn out to violate other fundamental assumptions, in fact I'm pretty sure I can think of at least two bugs with what I've done so far, but if there are only two bugs then it'll be a bloody miracle.) The component splitting and OCI stuff should be pretty generic should a need ever arise: I think all interface-specific composition logic is restricted to the trigger.
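For anyone trying to picture the component splitting, here is a minimal sketch of the idea. It is not the code in the branch: the `Trigger` and `LockedComponent` types and the synthesised ID naming scheme are assumptions made purely for illustration.

```rust
use std::collections::BTreeMap;

/// Manifest-level trigger: middleware is declared here (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
struct Trigger {
    route: String,
    component: String,
    middleware: Vec<String>,
}

/// Lockfile-level component: middleware has been moved onto it.
#[derive(Debug, Clone, PartialEq)]
struct LockedComponent {
    id: String,
    middleware: Vec<String>,
}

/// Moves middleware from triggers to components. If several triggers share a
/// component and at least one of them uses middleware, each trigger gets its
/// own synthesised component (the ID format is an assumption).
fn split_components(triggers: &[Trigger]) -> (Vec<Trigger>, Vec<LockedComponent>) {
    // Group trigger indices by the component they point at.
    let mut by_component: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    for (i, t) in triggers.iter().enumerate() {
        by_component.entry(t.component.as_str()).or_default().push(i);
    }

    let mut locked_triggers = triggers.to_vec();
    let mut components = Vec::new();
    for (id, indices) in &by_component {
        let needs_split = indices.len() > 1
            && indices.iter().any(|&i| !triggers[i].middleware.is_empty());
        if !needs_split {
            // Single trigger, or no middleware anywhere: keep one component.
            components.push(LockedComponent {
                id: id.to_string(),
                middleware: triggers[indices[0]].middleware.clone(),
            });
            for &i in indices {
                locked_triggers[i].middleware.clear();
            }
            continue;
        }
        for (n, &i) in indices.iter().enumerate() {
            let new_id = format!("{}-for-trigger-{}", id, n);
            components.push(LockedComponent {
                id: new_id.clone(),
                middleware: triggers[i].middleware.clone(),
            });
            locked_triggers[i].component = new_id;
            locked_triggers[i].middleware.clear();
        }
    }
    (locked_triggers, components)
}
```

After the split, every locked trigger points at exactly one component and carries no middleware of its own, which is what lets the component-centric composition and OCI code stay unaware of trigger-specific keys.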

Anyway please feel free to kick tyres. The middleware-terrifying-nonsense sample still works and should be usable as a model.

itowlson, Nov 12 '25 22:11