lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...
### Feature request LoRAX should support generating multiple completions for the same prompt via a single API call. ### Motivation It is helpful because when...
### System Info amazon linux 2 Running it in l40s ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command -...
### System Info AWS EC2 G6e.xlarge instance (1 L40S GPU), Linux Machine Latest lorax version as of today, using the docker command Python 3.12.6 PyTorch version: 2.4.0+cu121 CUDA version: 12.1...
### System Info `ghcr.io/predibase/lorax:f1ef0ee` ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications ###...
### System Info OS version: Ubuntu 22.04 Rust version (if self-compiling, cargo version): Cargo 1.75.0 Model being used (curl 127.0.0.1:8080/info | jq): If local model, please specify the kind of...
### System Info Hi, I'm currently working on using Lorax to test, benchmark, and serve my LoRA adapters, but I keep facing the same problem. The issue is that after...
## Problem: Addressing "Frozen Pain" in On-Premise LoRAX Adoption This Pull Request introduces a comprehensive **LoRAX Deployment Playbook** designed to drastically improve the on-premise adoption experience, directly addressing documented pain...