Fares Abawi
I was able to run 7B on two 1080 Tis (inference only). Next, I'll try 13B and 33B. It still needs refining, but it works! I forked LLaMA here: https://github.com/modular-ml/wrapyfi-examples_llama...
> No chance

You can do it with Wrapyfi:

# LLaMA with Wrapyfi

Wrapyfi enables distributing LLaMA (inference only) on multiple GPUs/machines, each with less than 16GB VRAM. **currently distributes...
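Roughly, the split is pipeline-style: each worker owns a contiguous run of transformer blocks, and only the hidden-state activations cross the GPU/machine boundary. A minimal sketch of that partitioning idea, with hypothetical names (this is not the fork's actual code):

```python
# Illustrative sketch of the splitting idea (hypothetical names, not the
# fork's actual code): assign contiguous transformer blocks to each worker.
import torch.nn as nn

def partition_blocks(blocks: nn.ModuleList, n_workers: int, rank: int) -> nn.ModuleList:
    """Return the contiguous slice of transformer blocks owned by `rank`."""
    per_worker = (len(blocks) + n_workers - 1) // n_workers  # ceil division
    start = rank * per_worker
    return nn.ModuleList(blocks[start:start + per_worker])

# e.g. LLaMA-7B has 32 blocks; with 2 workers, rank 0 runs blocks 0-15 on
# one 1080 Ti and rank 1 runs blocks 16-31 on the other, so only the
# hidden states need to be exchanged between them.
```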
@Jehuty-ML this might have to do with their recent update to the sequence length (1024 to 2048). Also, try changing the batch size to 2 and reducing the example prompts to...
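For reference, a hedged sketch of those knobs, assuming the argument names from the stock facebookresearch/llama example.py (verify in your copy; names may have changed with the update):

```python
# Hedged sketch: the knobs referred to above, assuming the stock
# facebookresearch/llama example.py (verify the names in your copy).
from example import load  # hypothetical import; example.py sits at the repo root

generator = load(
    ckpt_dir="ckpts/7B",                      # path to the downloaded weights
    tokenizer_path="ckpts/tokenizer.model",
    local_rank=0,
    world_size=1,
    max_seq_len=1024,     # the repo recently moved this toward 2048
    max_batch_size=2,     # a smaller batch eases VRAM pressure
)

prompts = ["I believe the meaning of life is"]  # trim the prompt list too
results = generator.generate(prompts, max_gen_len=256, temperature=0.8, top_p=0.95)
```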
You can distribute the model across two machines or GPUs and transmit the activations over ZeroMQ. Follow these instructions:

# LLaMA with Wrapyfi

Wrapyfi enables distributing LLaMA (inference only) on...
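To make the ZeroMQ part concrete, here is a minimal sketch of shipping activations between two processes with pyzmq; the helper names are hypothetical, and this is not Wrapyfi's actual API:

```python
# Minimal sketch (not Wrapyfi's actual API): moving a layer's activations
# between two pipeline stages over ZeroMQ with pyzmq.
import io

import torch
import zmq

def send_tensor(socket: zmq.Socket, tensor: torch.Tensor) -> None:
    """Serialize a tensor and push its bytes over the socket."""
    buffer = io.BytesIO()
    torch.save(tensor.cpu(), buffer)  # move off-GPU before serializing
    socket.send(buffer.getvalue())

def recv_tensor(socket: zmq.Socket, device: str = "cuda:0") -> torch.Tensor:
    """Receive serialized bytes and rebuild the tensor on `device`."""
    data = socket.recv()
    return torch.load(io.BytesIO(data)).to(device)

# Stage 0 (first half of the model) pushes activations:
#   ctx = zmq.Context(); sock = ctx.socket(zmq.PUSH); sock.bind("tcp://*:5555")
#   send_tensor(sock, hidden_states)
# Stage 1 (second half) pulls them and keeps going:
#   ctx = zmq.Context(); sock = ctx.socket(zmq.PULL); sock.connect("tcp://host0:5555")
#   hidden_states = recv_tensor(sock)
```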
> Same for me, but in my case I have 2x RTX 2070 (8 GB each), 16 GB in total. How could we use multiple GPUs?
>
> ```
> # |...