S. Neuhaus

111 comments by S. Neuhaus

> See [my fork](https://github.com/bjoernpl/llama_gradio_interface) for the code for rolling generation and the Gradio interface.

@bjoernpl Works great, thanks! Have you tried changing the Gradio interface to use the [gradio chatbot...

That would interfere with the operation of existing USB/networking devices, wouldn't it?

The README currently has a large headline "Original GPT4All Model (based on GPL Licensed LLaMa)" which is misleading and should be changed.

While you are at it, why not fix the misleading headline "Original GPT4All Model (based on GPL Licensed LLaMa)"?

> Done. @neuhaus. Llama is not GPL licensed

Hello @mnabimd, you are confusing the LLaMA software (which is GPL licensed) with the LLaMA model. The license covering the LLaMA model weights, which is unfortunately much more restrictive, can...

The first download links were distributed by Facebook earlier today. I've already seen a torrent on Twitter - oh my!

The [Llama-INT8 fork](https://github.com/tloen/llama-int8) works with the 13B model on a single GPU with approx. 18GB RAM.
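A rough back-of-the-envelope sketch of why INT8 makes this fit (my own hypothetical helper, not code from the fork): quantizing to one byte per weight roughly halves the fp16 footprint of the 13B weights, leaving headroom for activations and the KV cache within ~18GB.

```python
# Hypothetical estimate of GPU memory for model weights alone,
# ignoring activations, KV cache, and framework overhead.
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GiB for a model with n_params parameters."""
    return n_params * bytes_per_param / 2**30

fp16 = weight_memory_gb(13e9, 2)  # ~24 GiB: too large for a single consumer GPU
int8 = weight_memory_gb(13e9, 1)  # ~12 GiB: fits, with room left for the
                                  # KV cache and activations (~18 GiB observed)
print(f"13B fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
```

The gap between the ~12 GiB of INT8 weights and the ~18GB observed is the runtime overhead (KV cache, activations, CUDA context), which quantization does not shrink.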

Ryzen 7 3700X, 128GB RAM @ 3200, llama.cpp numbers:

```
$ ./main -m models/7B/ggml-model-q4_0.bin -t 8 -n 128
main: mem per token = 14434244 bytes
main: load time = 1270.15...
```

> i have seen someone in this issues Message area said that 7B model just needs 8.5G VRAM. but why i ran the example.py returns out of memory on a...