FastChat
FastChat copied to clipboard
Python function instead of CLI
is there any way I can use the quantized Vicuna 13b on GPU, but not use the CLI? Let me know if you have any pointers
@iRanadheer : Could you describe your use case? If neither CLI nor web server works, what kind of interface do you expect?
Recently @suquark adds a feature for openAI-like API, does that suffice for your use case?
See #389
openAI-like APIs have been added. Now you can use python's request library to send post to the openai-like endpoint. Closing.