lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
### System Info
When using LoRAX with a LoRA adapter via the /v1/chat/completions endpoint, the adapter works as expected when "stream": false. However, when I set "stream": true, the response...
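For reference, a minimal sketch of the two request shapes being compared, assuming a LoRAX deployment reachable at a placeholder URL and the adapter ID passed via the OpenAI-compatible `model` field (host, port, and adapter name below are illustrative, not from the report):

```python
import json
import requests

BASE_URL = "http://localhost:8080"          # placeholder: your LoRAX deployment URL
ADAPTER_ID = "my-org/my-lora-adapter"       # placeholder: the LoRA adapter ID

payload = {
    "model": ADAPTER_ID,  # adapter ID goes in the OpenAI-compatible "model" field
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Non-streaming request: the case reported to work as expected.
resp = requests.post(f"{BASE_URL}/v1/chat/completions", json={**payload, "stream": False})
print(resp.json()["choices"][0]["message"]["content"])

# Streaming request: the case where the reported behavior differs.
with requests.post(
    f"{BASE_URL}/v1/chat/completions", json={**payload, "stream": True}, stream=True
) as resp:
    for line in resp.iter_lines():
        # OpenAI-style SSE lines look like `data: {...}` and end with `data: [DONE]`
        if line and line.startswith(b"data: ") and line != b"data: [DONE]":
            chunk = json.loads(line[len(b"data: "):])
            delta = chunk["choices"][0]["delta"].get("content", "")
            print(delta, end="", flush=True)
```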
Hello Predibase Team, First, thank you. After a quick but careful review of the code and documentation, I'd like to ask for clarification and raise some points regarding the claims...
### System Info
In https://github.com/predibase/lorax/tree/main/clients/python#predibase-inference-endpoints, shouldn't the `endpoint_url` be updated? Instead of `endpoint_url = f"https://api.app.predibase.com/v1/llms/{llm_deployment_name}"`, shouldn't it be as described in https://loraexchange.ai/reference/python_client/#predibase-inference-endpoints?
```
# You can get your Predibase API...
```
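For context, a minimal sketch of the client setup both documents describe; the exact `endpoint_url` format is precisely what this issue asks to confirm, so the URL, token, and deployment name below are placeholders, not a statement of the correct value:

```python
from lorax import Client

# Placeholder values; the correct endpoint_url format for Predibase-hosted
# deployments is the open question in this issue.
api_token = "<YOUR PREDIBASE API TOKEN>"
llm_deployment_name = "<YOUR DEPLOYMENT NAME>"
endpoint_url = f"https://api.app.predibase.com/v1/llms/{llm_deployment_name}"

client = Client(endpoint_url, headers={"Authorization": f"Bearer {api_token}"})
print(client.generate("Why is the sky blue?", max_new_tokens=64).generated_text)
```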