feat: create trait definitions for model and streamable model

Open parzivale opened this issue 3 months ago • 14 comments

The main objective of this PR is to simplify working with streaming responses from text generation models. The design should, however, be flexible enough for other applications.

For now, Model and StreamableModel are left to the user to implement for their use case, but the ideal end goal for this PR is to define types for all the I/O interfaces workers-ai uses and then seal Model and StreamableModel.

parzivale avatar Sep 28 '25 21:09 parzivale
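
To illustrate what "sealing" the two traits could look like once the I/O types are defined, here is a minimal sketch using the common private-supertrait pattern. The trait shapes (`NAME`, `Input`, `Output`, `Chunk`), the `ExampleTextModel` type, and its identifier are all assumptions made for illustration; they are not taken from the PR.

```rust
// Hypothetical sketch: none of these names or signatures come from the PR.
mod sealed {
    // Public trait inside a private module: code outside this crate cannot
    // name it, so it cannot implement `Model` for its own types.
    pub trait Sealed {}
}

/// A model with a fixed identifier and typed input/output (assumed shape).
pub trait Model: sealed::Sealed {
    const NAME: &'static str;
    type Input;
    type Output;
}

/// A model that can also deliver its reply as a sequence of chunks.
pub trait StreamableModel: Model {
    type Chunk;
}

/// Illustrative text generation model with a made-up identifier.
pub struct ExampleTextModel;

impl sealed::Sealed for ExampleTextModel {}

impl Model for ExampleTextModel {
    const NAME: &'static str = "example/text-generation";
    type Input = String;
    type Output = String;
}

impl StreamableModel for ExampleTextModel {
    type Chunk = String;
}

/// Generic helpers can read the identifier from the type alone.
pub fn model_name<M: Model>() -> &'static str {
    M::NAME
}
```

With this layout, downstream crates can still write code that is generic over `M: StreamableModel`, but only the defining crate can add new model types.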

Very interesting approach. It would be great to see some tests for the sorts of workflows you're seeking to enable here, and that would help with review too.

guybedford avatar Oct 02 '25 02:10 guybedford

Sure! Should I create a new entry in the tests folder for this, or is there a better place to put the tests?

parzivale avatar Oct 02 '25 18:10 parzivale

Yes, in the tests folder. We have one big test app that is built all together and tested as one.

guybedford avatar Oct 02 '25 18:10 guybedford

There isn't currently an AI binding in the tests' wrangler.toml. I can test locally on my account via a new binding, but this will break CI, right?

parzivale avatar Oct 02 '25 19:10 parzivale

You should just be able to add a new binding there. If there are issues with that, I'm happy to look into it further.

guybedford avatar Oct 02 '25 19:10 guybedford

@guybedford I've added new test cases to show a very simple example implementation

parzivale avatar Oct 02 '25 19:10 parzivale

Hey! I'm eager to merge this. Do you have any feedback on what needs to be changed?

parzivale avatar Oct 28 '25 17:10 parzivale

> I'm still not quite sure I follow why we need a custom Stream type for AI specifically over just improving the lower-level streaming primitives to support the use case.

The idea was to split out the different return types. For streaming replies we don't expect a single T that implements Deserialize/Serialize, as we normally would for a model response; instead we expect a stream of T. I would be happy to break out the streaming implementation into something else, but I do think it is important to distinguish between streaming and non-streaming responses.

parzivale avatar Oct 31 '25 10:10 parzivale

Would a typed ReadableStream work as an alternative, if we made that type generic?

guybedford avatar Oct 31 '25 18:10 guybedford

No, there still needs to be an additional wrapper, because the Workers AI stream returns SSE events rather than just the T we need, so a bit of postprocessing is required for things like tool calls when streaming.

parzivale avatar Oct 31 '25 19:10 parzivale
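
The kind of postprocessing described above can be sketched as follows. This is a simplified illustration, not the PR's implementation: it assumes the body arrives as server-sent events whose `data:` fields carry one payload each and that a `data: [DONE]` line marks the end of the stream. The `SseItem` and `parse_sse` names are hypothetical, and the input is treated as fully buffered text, whereas a real stream would have to handle events split across reads and would go on to deserialize each payload into T.

```rust
/// One decoded item from the event stream (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum SseItem {
    /// Raw payload of a `data:` field, still to be deserialized into `T`.
    Data(String),
    /// End-of-stream sentinel (assumed to be sent as `data: [DONE]`).
    Done,
}

/// Splits a fully buffered SSE body into items.
pub fn parse_sse(body: &str) -> Vec<SseItem> {
    let mut items = Vec::new();
    for line in body.lines() {
        // Only `data:` fields matter here; blank separator lines, comments,
        // and other fields are skipped.
        let Some(payload) = line.strip_prefix("data:") else {
            continue;
        };
        let payload = payload.trim();
        if payload == "[DONE]" {
            items.push(SseItem::Done);
            // Nothing after the sentinel is part of the reply.
            break;
        }
        items.push(SseItem::Data(payload.to_string()));
    }
    items
}
```

A caller would map each `SseItem::Data` payload through its deserializer and stop at `SseItem::Done`, which is the step a plain typed stream of T would not cover by itself.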

Could this be interpreted as a transform stream?

guybedford avatar Oct 31 '25 19:10 guybedford

Yes, it could! That would be a pretty good solution. The only problem is documentation: could we add a type alias for it so that the docs can describe usage?

parzivale avatar Oct 31 '25 19:10 parzivale

My usual first stop when dealing with return types is go-to-definition, and using a transform stream would make that harder.

parzivale avatar Oct 31 '25 19:10 parzivale
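
One way to read the type-alias suggestion: give the composed transform a named, documented alias so that go-to-definition lands on a doc comment rather than on a nested adapter type. The sketch below uses standard-library iterators as a stand-in for streams, purely to show the aliasing idea; `TextChunk`, `TextChunks`, `to_chunk`, and `transform` are hypothetical names, and nothing here reflects the actual workers-rs stream types.

```rust
use std::iter::Map;
use std::vec::IntoIter;

/// Hypothetical chunk type produced after postprocessing.
#[derive(Debug, PartialEq)]
pub struct TextChunk(pub String);

/// Sequence of postprocessed chunks.
///
/// The alias hides the adapter plumbing, and this doc comment is what a
/// reader sees on go-to-definition, so usage notes can live here.
pub type TextChunks = Map<IntoIter<String>, fn(String) -> TextChunk>;

/// The transform step: trims a raw payload and wraps it in `TextChunk`.
fn to_chunk(raw: String) -> TextChunk {
    TextChunk(raw.trim().to_string())
}

/// Applies the transform lazily and returns the documented alias.
pub fn transform(raw: Vec<String>) -> TextChunks {
    // The cast turns the function item into a plain `fn` pointer so that the
    // resulting adapter type can be written out in the alias.
    raw.into_iter().map(to_chunk as fn(String) -> TextChunk)
}
```

The trade-off is that an alias documents usage but still exposes the underlying adapter type in signatures, whereas a dedicated wrapper type would hide it at the cost of more code.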

I'm still happy to review this for the release tomorrow if there is interest in finishing up the last change requests.

guybedford avatar Nov 20 '25 22:11 guybedford