lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
### Feature request Recent Mistral models, including Mistral 7B v0.3 Instruct, ship a consolidated.safetensors file whose weight key names differ from what LoRAX expects. There are also keys like lm_head,...
### Feature request If LoRAX is based on Punica kernels, will it be able to support LoRA adapters for Mistral NeMo 12B, which has a vocab size > 130k? Currently...
### System Info ghcr.io/predibase/lorax:24cb494 ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications ###...
### System Info When using Predibase serverless, I see stop words included in the stream. I assumed it is supposed to stop and not include them. ### Information - [...
### System Info We are using the streaming v1 chat completions API. After some number of requests, or a request with a large enough context, the LoRAX server fails to respond. And all...
This should prevent some nasty illegal memory access errors:
1. Consolidate individual list comprehensions into a single for loop
2. Distinct code to create the lora weight pointers tensor
3. ...
### System Info I am trying to run Qwen2-7B-Instruct with AWQ quantization in a Kubernetes environment. The GPU is a single T4 (16 GB VRAM). I see that it is unable...
Previously, when loading a base model from s3: `--source s3 --model-id s3://bucket/model` The model would be downloaded to the cache path `/data/models--model`. However, when the base model is first loaded,...
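The cache-path mapping described above can be sketched as follows. This is a hypothetical illustration, assuming an HF-style `models--{name}` directory convention under `/data`; the helper name `cache_path_for` and the exact translation rule are assumptions, not LoRAX's actual code:

```python
# Hypothetical sketch: translate an s3 model id like "s3://bucket/model"
# into a local cache directory such as "/data/models--model".
# The helper name and convention are assumptions for illustration only.
def cache_path_for(model_id: str, cache_root: str = "/data") -> str:
    # Strip the "s3://" scheme and the bucket name, keep the model name.
    name = model_id.removeprefix("s3://").split("/", 1)[1]
    # HF-style cache directories replace "/" with "--" in the name.
    return f"{cache_root}/models--{name.replace('/', '--')}"

print(cache_path_for("s3://bucket/model"))  # → /data/models--model
```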
# What does this PR do? 1. Re-organize the code in BatchLoraWeights.load. This function was a bit hard to understand as there were multiple list comprehensions with almost the same looping...
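The refactor the PR describes, consolidating several list comprehensions that each iterate the same sequence into one loop, can be sketched as below. The names (`adapters`, `ranks`, `a_ptrs`, `b_ptrs`) are illustrative assumptions, not LoRAX's actual internals:

```python
# Hypothetical sketch of the refactor: instead of several list
# comprehensions that each iterate the same adapter list, a single
# for loop builds all the collections in one pass.
def load_batched(adapters):
    # Before (multiple comprehensions, each looping over `adapters`):
    #   ranks  = [a["rank"] for a in adapters]
    #   a_ptrs = [a["lora_a"] for a in adapters]
    #   b_ptrs = [a["lora_b"] for a in adapters]
    # After: one loop builds everything together.
    ranks, a_ptrs, b_ptrs = [], [], []
    for a in adapters:
        ranks.append(a["rank"])
        a_ptrs.append(a["lora_a"])
        b_ptrs.append(a["lora_b"])
    return ranks, a_ptrs, b_ptrs

adapters = [
    {"rank": 8, "lora_a": 0x100, "lora_b": 0x200},
    {"rank": 16, "lora_a": 0x300, "lora_b": 0x400},
]
print(load_batched(adapters))  # → ([8, 16], [256, 768], [512, 1024])
```

A single loop also makes it easier to validate each adapter once per iteration, rather than repeating checks in every comprehension.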
### Model description OpenAI Whisper model, e.g. medium.en ### Open source status - [x] The model implementation is available - [X] The model weights are available ### Provide useful links...