Alpaca-LoRA-Serve

Streaming response

Open alexanderfrey opened this issue 1 year ago • 7 comments

Hi,

thank you very much for this work. Do you plan to support streaming responses any time soon, like text-generation-webui does?

Best, Alexander

alexanderfrey avatar Mar 20 '23 20:03 alexanderfrey

I am planning to add the streaming feature soon! :) In that case, the batch request handling feature will be disabled (well, it will remain as an option).

deep-diver avatar Mar 21 '23 04:03 deep-diver

Can you elaborate on how you would implement it? Maybe I can help. From what I understand, it reduces latency in the sense that the user won't have to wait until the full response is returned.

alexanderfrey avatar Mar 21 '23 06:03 alexanderfrey

Thanks @alexanderfrey

The Hugging Face library does not support streaming out of the box, so I thought I would need a sort of monkey patch. Luckily, I found one in a different open-source project: https://github.com/hyperonym/basaran/issues/57

If it turns out to be hard for me to implement the streaming feature in the current version, I will let you know!
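To clarify the idea being discussed: streaming means yielding tokens to the client as they are generated instead of returning only the finished sequence. A minimal, framework-agnostic sketch of that loop (the `step` callback, standing in for one forward pass of the model, is a hypothetical placeholder for illustration, not code from this repository):

```python
from typing import Callable, Iterator, List


def stream_generate(
    step: Callable[[List[int]], int],  # hypothetical: returns the next token id given the context
    prompt_ids: List[int],
    eos_id: int,
    max_new_tokens: int = 32,
) -> Iterator[int]:
    """Yield generated token ids one at a time instead of waiting for the full response."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = step(ids)
        if next_id == eos_id:
            break
        ids.append(next_id)
        # The caller can decode and forward this token to the client immediately.
        yield next_id
```

A web server would iterate over this generator and flush each decoded token to the response, which is where the latency reduction comes from.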

deep-diver avatar Mar 21 '23 07:03 deep-diver

check this out : https://twitter.com/algo_diver/status/1638079375085305856?s=20

deep-diver avatar Mar 21 '23 08:03 deep-diver

just updated the repository and experimentally running here : https://notebookse.jarvislabs.ai/BuOu_VbEuUHb09VEVHhfnFq4-PMhBRVCcfHBRCOrq7c4O9GI4dIGoidvNf76UsRL

deep-diver avatar Mar 22 '23 02:03 deep-diver

@alexanderfrey

In streaming mode, most of the parameters in GenerationConfig are not supported. Do you think you can make that happen?
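For context on what supporting those parameters would involve: sampling controls such as temperature and top-p can be applied per token inside the streaming loop, since each step produces raw logits before a token is chosen. A standalone, hypothetical sketch of that per-token sampling (not the repository's code; the function and parameter names are assumptions for illustration):

```python
import math
import random
from typing import List, Optional


def sample_next(
    logits: List[float],
    temperature: float = 0.7,
    top_p: float = 0.9,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick one token id from raw logits using temperature and nucleus (top-p) filtering."""
    rng = rng or random.Random(0)
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus filtering: keep the smallest set of tokens whose cumulative mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Sample proportionally from the kept tokens.
    r = rng.random() * sum(probs[i] for i in kept)
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

Running a function like this once per step inside the streaming loop is one way the GenerationConfig sampling parameters could be honored without waiting for the full sequence.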

deep-diver avatar Mar 22 '23 03:03 deep-diver

Let me have a look tonight. I can't promise anything, but I'm definitely interested in contributing.

alexanderfrey avatar Mar 22 '23 16:03 alexanderfrey