MillionthOdin16

85 comments by MillionthOdin16

Awesome! Worked for me too. I completely forgot to rebuild the kernel -_-

Yes, that's the issue. It needs to be the cuda ones with no act order.

I'm also getting this. 64GB of system RAM and 24GB of VRAM. The model is only about 18GB, so I have more than enough memory of both types to handle it.

If you're familiar with mixing LoRAs, I think it would be helpful for a lot of people here if you could link some resources on it. I've heard...

Is this changing the default number of threads from the intended 4 to 8? I can't tell from a quick read, but if that's the case, eight seems a bit...
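For context, llama.cpp has historically capped its default thread count at a small number (around 4) rather than using every available core, since token generation tends to saturate memory bandwidth before it saturates the CPU. A minimal Python sketch of that kind of heuristic (the exact formula in the actual C++ source may differ; the cap values here are assumptions for illustration):

```python
import os

def default_thread_count(cap: int = 4) -> int:
    """Pick a conservative default thread count.

    Caps the default rather than using every core, mirroring the idea
    (not the exact code) behind llama.cpp's conservative default.
    """
    cores = os.cpu_count() or 1
    return min(cap, cores)

# A cap of 4 never exceeds 4 threads; raising the cap to 8 only helps
# on machines that actually have more than 4 cores.
print(default_thread_count())
print(default_thread_count(cap=8))
```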

> @Zetaphor Correct, llama.cpp has set the default token context window at 512 for performance, which is also the default `n_ctx` value in [langchain](https://github.com/hwchase17/langchain/blob/master/langchain/llms/llamacpp.py#L30). You can set it at 2048...
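To illustrate what the `n_ctx` limit means in practice: the context window bounds how many tokens the model can attend to, so anything beyond it effectively falls out of scope. A toy sketch of that sliding-window idea (real tokenization and context handling in llama.cpp are more involved; this only shows why bumping 512 to 2048 matters):

```python
def fit_to_context(tokens: list, n_ctx: int = 512) -> list:
    """Keep only the most recent n_ctx tokens, mimicking how a fixed
    context window bounds what the model can see."""
    if len(tokens) <= n_ctx:
        return tokens
    return tokens[-n_ctx:]

# A 600-token prompt overflows the default 512-token window and loses
# its oldest 88 tokens; an n_ctx of 2048 keeps the whole prompt.
prompt = list(range(600))
print(len(fit_to_context(prompt, 512)))
print(len(fit_to_context(prompt, 2048)))
```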

No, it's broken. It works on Hugging Face now, but it can't download LoRAs xD.

To be honest I'm trying to help you, but the effort put into this issue to explain what's going on is so low... You took the time to dump the...

Pretty sure the default version of the code uses like 4. Or at least the initial examples.

Oh damn. That's why people are complaining when they use all their threads 😂 I guess that's the one bonus Windows has in this case. On Thu, Apr 13, 2023,...