feat: create trait definitions for model and streamable model
The main objective behind this pr is to simplify working with streaming responses from text generation models, however the design should be flexible enough for other applications.
For now the Model and StreamableModel are left to the user to implement for their use case, but the ideal end goal for this pr is to define types for all the I/O interfaces workers-ai uses and then seal Model and StreamableModel.
Very interesting approach. It would be great to see some tests for the sorts of workflows you're seeking to enable here, and that would also help for review too.
Sure! should I create a new entry in the test folder for this? would there be a better place to put the tests?
Yes, in the tests folder, we have one big test app that is built altogether and tested as one.
There isn't currently an AI binding in the tests wrangler.toml, I can test locally on my account via a new binding but this will break CI right?
You should just be able to add a new binding there, if there's issues with that happy to look into it further.
@guybedford I've added new test cases to show a very simple example implementation
Hey! I'm eager to merge this, do you have any feedback on what needs to be changed?
I'm still not quite sure I follow why we need a custom Stream type for AI specifically over just improving the lower-level streaming primitives to support the use case.
The idea behind that was to split out the different return types, for streaming replies we don't expect a type T that implements Deserialize/Serialize like we would normally do with a model response. Instead we expect a stream of T. I would be happy to break out the streaming implementation into something else, but I do think it is important to distinguish between streaming and non-streaming responses.
The idea behind that was to split out the different return types, for streaming replies we don't expect a type T that implements Deserialize/Serialize like we would normally do with a model response. Instead we expect a stream of T. I would be happy to break out the streaming implementation into something else, but I do think it is important to distinguish between streaming and non-streaming responses.
Would typed ReadableStream solve this as an alternative if we had that generic?
The idea behind that was to split out the different return types, for streaming replies we don't expect a type T that implements Deserialize/Serialize like we would normally do with a model response. Instead we expect a stream of T. I would be happy to break out the streaming implementation into something else, but I do think it is important to distinguish between streaming and non-streaming responses.
Would typed ReadableStream solve this as an alternative if we had that generic?
No as there needs to be an additional wrapper as workers ai stream returns see events rather than just the T we need, so there needs to be a bit of postprocessing for things like tool calls when streaming.
No as there needs to be an additional wrapper as workers ai stream returns see events rather than just the T we need, so there needs to be a bit of postprocessing for things like tool calls when streaming.
Could this be interpreted as a transform stream?
Yes it could! That would be a pretty good solution, the only problem is documentation, could we type alias it so in docs we can describe usage?
My normal first stop when dealing with return types is to go to definition and using a transform stream would make that harder
I'm still happy to review this for the release tomorrow if there is interest in finishing up the last change requests.