lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
### System Info This error occurs when more than one user sends requests to the server (the previous image version still works fine) ``` ID not found...
# What does this PR do? Fixes #550 Reference: https://github.com/huggingface/text-generation-inference/pull/2179/files ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other...
### System Info Using two official Docker images (latest and main). ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command...
# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...
Trying to figure out why we can't see the traces from requests being used internally
### Feature request LoRAX currently only supports text generation models (e.g., causal language models). It lacks support for sequence classification models like `AutoModelForSequenceClassification`. This issue proposes adding support for `AutoModelForSequenceClassification`...
Using `--compile` option on the main branch is currently broken. I've fixed the first issue, but this just leads to the next issue, which we need to debug, and changes...
Introduces function-calling capabilities with JSON schema enforcement on output. Example using `Mistral-7B-Instruct-v0.3`: ``` curl 127.0.0.1:8080/generate -X POST -d '{ "inputs": "WHat is the current temperature of New York, San Francisco...
### System Info ```shell docker run ghcr.io/predibase/lorax:ea5d74b --gpus all -it --rm -e HF_TOKEN=... lorax-launcher --source hub --model-id meta-llama/Llama-2-7b-chat-hf --default-adapter-source local --revision f5db02db724555f92da89c216ac04704f23d4590 ``` We notice ``` 2024-08-01T13:25:45.718148Z INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55:...
# What does this PR do? Fixes #541 (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...