aphrodite-engine

Large-scale LLM inference engine

Results: 201 aphrodite-engine issues

### Your current environment ```text PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Manjaro Linux (x86_64) GCC version:...

bug

This PR adds a custom floating-point quantization method powered by [TorchAO](https://github.com/pytorch/ao), which achieves high throughput thanks to the optimized [fp6_llm](https://github.com/usyd-fsalab/fp6_llm) kernel. Use `-q torchao --torchao-fp-bits 6` to load...

### Your current environment ```python env.py Collecting environment information... PyTorch version: 2.3.0 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu...

bug

### Your current environment I have a server with 4x 3090 Ti. I can run Llama 3 70B with vLLM in Docker with the command: `sudo docker run --shm-size=32g --log-opt max-size=10m --log-opt max-file=1 --rm...

### Your current environment Collecting environment information... PyTorch version: N/A Is debug build: N/A CUDA used to build PyTorch: N/A ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.4 LTS...

bug

### Your current environment ```text PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.3 LTS (x86_64) GCC...

bug

Seems like our current implementation has an issue: ``` dynatemp_logits = logits[dynatemp_mask] ERROR: | ~~~~~~^^^^^^^^^^^^^^^ ERROR: | IndexError: The shape of the mask [1] at index 0 does not match...

Syncs the kobold lite embed and disables certain features that aphrodite cannot currently use. KoboldCPP impersonation version has not been incremented as no new features need to be enabled.

### Your current environment ```text The output of `python env.py` ``` ### How would you like to use Aphrodite? I want to get some AMD Navi based GPUs, but I...

### Your current environment ```text PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.4 LTS (x86_64) GCC...

bug