
Streaming server support?

Open kevintanhongann opened this issue 1 year ago • 6 comments

Is there a way to run and expose a streaming API server that is compatible with the OpenAI API specification?

kevintanhongann avatar Feb 23 '24 12:02 kevintanhongann

Probably. Here's the current API call for chat:

https://github.com/tjake/Jlama/blob/main/jlama-cli/src/main/java/com/github/tjake/jlama/cli/serve/GenerateResource.java

tjake avatar Feb 23 '24 13:02 tjake

I want this feature too

phact avatar Feb 23 '24 14:02 phact

I am pretty sure that this would (at least) require Generator#generate to be enhanced with a callback that is called when the generation is complete.
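
To make the idea concrete, here is a minimal sketch of a generate method that takes a per-token callback plus a completion callback. Everything in it is an assumption for illustration: `CallbackGeneratorSketch`, `CompletionListener`, and the canned token list are hypothetical and do not reflect the actual signature of Generator#generate.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a text generator; not Jlama's actual Generator API.
public class CallbackGeneratorSketch {

    /** Receives the full text once generation has finished. */
    public interface CompletionListener {
        void onComplete(String fullText);
    }

    /**
     * Emits each token to onToken as it is produced, then calls onComplete
     * exactly once with the concatenated output.
     */
    public static void generate(List<String> tokens,
                                Consumer<String> onToken,
                                CompletionListener onComplete) {
        StringBuilder full = new StringBuilder();
        for (String token : tokens) {
            full.append(token);
            onToken.accept(token);
        }
        onComplete.onComplete(full.toString());
    }

    public static void main(String[] args) {
        // Canned tokens stand in for real model output.
        generate(List.of("Hello", ",", " world"),
                token -> System.out.print(token),
                full -> System.out.println("\n[done] " + full.length() + " chars"));
    }
}
```

With a completion hook like this, the caller knows when it is safe to close the HTTP response, which is the part a plain per-token consumer cannot signal on its own.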

geoand avatar Mar 01 '24 08:03 geoand

You mean for stream=false?

phact avatar Mar 07 '24 15:03 phact

For both :)
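
A rough sketch of why both modes need the completion signal, assuming a hypothetical `generate` with a token callback and a completion callback (the names `StreamModesSketch` and `respond` are made up, not Jlama's API). With stream=true the completion callback writes the `data: [DONE]` terminator that OpenAI-style clients wait for; with stream=false it is the only point at which the single JSON response can be written. The JSON is trimmed to the `choices` field, and the body is buffered into a string here only to keep the example short; a real server would flush each event as it is produced.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; class and method names are illustrative, not Jlama's API.
public class StreamModesSketch {

    /** Stand-in generator: emits tokens one by one, then signals completion. */
    static void generate(List<String> tokens, Consumer<String> onToken, Runnable onComplete) {
        for (String t : tokens) onToken.accept(t);
        onComplete.run();
    }

    /** Minimal JSON string escaping, enough for this sketch only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Builds the response body for either mode from the same two callbacks. */
    public static String respond(List<String> tokens, boolean stream) {
        StringBuilder body = new StringBuilder();
        StringBuilder text = new StringBuilder();
        generate(tokens,
                token -> {
                    text.append(token);
                    if (stream) {
                        // One server-sent event per token, carrying a content delta.
                        body.append("data: {\"choices\":[{\"delta\":{\"content\":\"")
                            .append(escape(token)).append("\"}}]}\n\n");
                    }
                },
                () -> {
                    if (stream) {
                        // Terminator the client waits for before closing the stream.
                        body.append("data: [DONE]\n\n");
                    } else {
                        // Single JSON document, written only once generation is complete.
                        body.append("{\"choices\":[{\"message\":{\"role\":\"assistant\",\"content\":\"")
                            .append(escape(text.toString())).append("\"}}]}");
                    }
                });
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.println(respond(List.of("Hi", "!"), true));
        System.out.println(respond(List.of("Hi", "!"), false));
    }
}
```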

geoand avatar Mar 07 '24 16:03 geoand

Working PR here: https://github.com/tjake/Jlama/pull/23

phact avatar Mar 07 '24 21:03 phact