Results: 24 issues of oobabooga

### Describe the bug

The LaTeX font is dark instead of light in the dark theme, rendering it unreadable.

### Is there an existing issue for this?

- [X] I...

bug

> RWKV is an RNN with Transformer-level LLM performance
> https://github.com/BlinkDL/RWKV-LM

This is a work in progress and still broken for now.

I have expressed my interest in having RWKV officially implemented in Hugging Face in https://github.com/huggingface/transformers/issues/17230. In the meantime, I have a distilled set of suggestions for how this library could be made...

It uses a trick to create an iterator from `stopping_criteria`. This way, it is not necessary to call `model.generate` multiple times, making things a lot (really, a lot) faster. Currently...
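The stream-as-iterator trick can be sketched generically with the standard library (names here are illustrative, not the web UI's actual code): a callback invoked from inside generation — e.g. from a custom `stopping_criteria` — pushes each new token onto a queue, and a generator on the main thread drains that queue, so a single blocking `model.generate` call is consumed incrementally.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream

def iterate_callback(func):
    """Run `func(callback)` in a background thread and yield every value
    it passes to `callback`, as soon as it is produced.

    `func` stands in for a blocking call (like `model.generate`) that can
    report partial results through a callback -- for instance, a custom
    StoppingCriteria that forwards each newly generated token.
    """
    q = queue.Queue()

    def callback(value):
        q.put(value)

    def run():
        try:
            func(callback)
        finally:
            q.put(SENTINEL)  # always signal completion, even on error

    threading.Thread(target=run, daemon=True).start()
    while True:
        item = q.get()
        if item is SENTINEL:
            return
        yield item

# Stand-in for a blocking generate() that emits tokens one at a time.
def fake_generate(callback):
    for token in ["Hello", " world", "!"]:
        callback(token)

print(list(iterate_callback(fake_generate)))  # -> ['Hello', ' world', '!']
```

Because generation runs once in the background, the consumer sees tokens as they arrive instead of re-invoking `model.generate` for every chunk.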

### Feature request I would like the ability to lazy load models to the GPU using `AutoModelForCausalLM.from_pretrained`. At the moment, it is possible to reduce the RAM usage using the...
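For context on the request, a hedged sketch of the existing RAM-reduction options in `transformers` (the model name is a placeholder; `low_cpu_mem_usage` and `device_map` are the relevant keyword arguments, the latter via `accelerate`):

```python
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage=True avoids materializing a full set of random
# weights in RAM before the checkpoint is loaded; device_map="auto"
# (requires accelerate) dispatches layers to available devices as
# they are loaded instead of loading everything on CPU first.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",      # placeholder model id
    low_cpu_mem_usage=True,
    device_map="auto",
)
```

The feature request goes a step further: streaming weights lazily to the GPU rather than merely reducing peak CPU RAM.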

bug
Big Model Inference

As new features and functionalities become available, additional GUI elements will need to be added, and the current layout is already cramped. The interface needs to have some kind of...

enhancement

I have tried quantizing galactica-30b with this command:

```
CUDA_VISIBLE_DEVICES=0 python opt.py /models/galactica-30b --wbits 4 --save galactica-30b-4bit.pt c4
```

And then using it in the [web UI](https://github.com/oobabooga/text-generation-webui) with this one:...

Example: https://huggingface.co/ozcur/alpaca-native-4bit

Usage:

```
python server.py --model alpaca-native-4bit --wbits 4 --groupsize 128
```