Results 35 issues of Srinivas Billa

Hi, do you know roughly if or when the fp6 implementation will be released? Thanks

Hi, first of all, thank you for this library! It is very clean, and I appreciate that all I need is PyTorch! I wanted to make an issue for the integration of...

### 🚀 The feature, motivation and pitch Fp6 allows models such as Llama 70B to fit on a single A100 GPU. Also, 6-bit is often the sweet spot between performance...

feature request

### System Info N/A ### Who can help? _No response_ ### Information - [x] The official example scripts - [ ] My own modified scripts ### Tasks - [x] An...

bug
triaged

I am trying to build an engine for Llama 3 70B and get `KeyError : "Architecture"`

need more info

Hi, I just ran your code and fixed a few things (attrdict and espeak-ng were missing from the setup). I also made a Colab notebook for LJSpeech model inference. Thanks

Hi, it gets to 7000 steps, then outputs ```^C```. It doesn't save a point cloud either.

Hi, I've been testing the openai-like-server with [Ollama-webui](https://github.com/ollama-webui/ollama-webui), and when using the RAG pipeline, the code returns a list, not a string, so I get this error from the...

Hey, amazing work! Can we expect a follow-up with larger models (13B, 30B, and 65B)? Also, I think combining your method with https://arxiv.org/abs//2306.15595 would be amazing to get an open source...

### 🚀 The feature, motivation and pitch The paper claims major improvements over vLLM. Unfortunately, no code is available, only the paper. arxiv.org/abs/2405.04437 ### Alternatives _No response_ ### Additional context _No response_

feature request