[completionEndpoint] adds support for `Stream: true`

Open samm81 opened this issue 1 year ago • 5 comments

Description

This PR fixes #416

Notes for Reviewers

Sorry for stepping on your toes, @krishnaduttPanchagnula (pr here), but I wanted to get it working for myself, so I took a stab at it. Once I figured it out, I supposed I should commit it back :)

Signed commits

  • [X] Yes, I signed my commits.

samm81 avatar May 31 '23 22:05 samm81

working on implementing a test, but wanted to get this out first

samm81 avatar May 31 '23 22:05 samm81

I don't know why the commit is showing up as unverified :thinking: I don't see where in the github settings to tell github that I used my ssh key...

samm81 avatar May 31 '23 22:05 samm81

> I don't know why the commit is showing up as unverified :thinking: I don't see where in the github settings to tell github that I used my ssh key...

figured it out!

samm81 avatar May 31 '23 22:05 samm81

Fantastic, thanks @samm81! I think we are almost there :+1:

mudler avatar Jun 01 '23 09:06 mudler

updated, ptal!

samm81 avatar Jun 01 '23 17:06 samm81

should be good to go!

samm81 avatar Jun 02 '23 18:06 samm81

testing:

$ curl -N http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": "a long time ago in a galaxy far, far away",
     "max_tokens": 32
   }'
{"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" | we made it to the Chinese Garden! | and I was so happy 😁 | I wish they had more food options but still it'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}%
$ curl -N http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": ["a long time ago in a galaxy far, far away", "went to a crazy party | "],
     "max_tokens": 32,
     "temperature": 0.7
   }'
{"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" | I was getting ready to go to bed | and Sharon came upstairs | she said something about the carpet that was on the floor of our"},{"text":"爽亮饭店 | met some people | but didn't feel like talking to them much, they were all too loud and"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}%
$ curl -N http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": ["a long time ago in a galaxy far, far away", "went to a crazy party"],
     "max_tokens": 32,
     "temperature": 0.7,
     "stream": true
   }'
{"error":{"code":500,"message":"cannot handle more than 1 `PromptStrings` when `Stream`ing","type":""}}%
$ curl -N http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": "a long time ago in a galaxy far, far away",
     "max_tokens": 32,
     "temperature": 0.7,
     "stream": true
   }'
data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" I"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" found"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" my"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" way"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" to"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" the"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" train"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" station"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"!"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" But"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" it"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"s"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" really"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" small"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"..."}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" On"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" the"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" train"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" now"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" "}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" Train"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" is"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" full"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"model":"ggml-model-q4_0.bin","choices":[{"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: [DONE]

samm81 avatar Jun 02 '23 19:06 samm81
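
For anyone consuming the streamed response from a client, here is a minimal sketch of reassembling the completion from the `data:` lines shown in the test output above. It is not part of this PR: the helper name `iter_completion_text` is hypothetical, and it assumes every chunk has the shape seen above (a JSON object with a `choices` list whose entries may carry `text`), with `data: [DONE]` marking the end of the stream.

```python
import json
from typing import Iterable, Iterator


def iter_completion_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragment of each streamed chunk.

    Hypothetical helper: `lines` is any iterable of decoded lines from the
    response body, in the `data: {...}` format shown in the output above.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("text")
            if text:  # the final chunk only carries finish_reason
                yield text


# A few lines in the same shape as the output above (usage field omitted).
sample = [
    'data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}]}',
    "",
    'data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" I"}]}',
    "",
    'data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" found"}]}',
    "",
    'data: {"model":"ggml-model-q4_0.bin","choices":[{"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

completion = "".join(iter_completion_text(sample))
print(repr(completion))
```

Against a running server, the lines could come from something like `requests.post(..., stream=True).iter_lines(decode_unicode=True)`, but that wiring is an assumption and has not been tested against this branch.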