Timon Käch

72 comments by Timon Käch

Yeah, AITemplate seems good too.

Thank you very much! Did I read correctly that I have to wait 20 minutes for inference?

Splitting half onto the GPU and the other half into RAM doesn't work? Because GPT-J was quite fast (1-2 tokens/sec).
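For reference, a minimal sketch of what GPU/RAM splitting can look like with the Hugging Face transformers + accelerate stack (`device_map` plus `max_memory`); the model name and memory caps below are placeholder assumptions, not taken from this thread.

```python
# Sketch: put as many layers as fit on the GPU, spill the rest into system RAM.
# Model name and memory limits are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6b"  # assumed checkpoint, adjust to your setup

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",                         # place layers on GPU first
    max_memory={0: "10GiB", "cpu": "24GiB"},   # overflow goes to CPU RAM
    torch_dtype=torch.float16,
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```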

Closing, as text-generation-webui does all of this and even GPTQ 4-bit.

Hello @anzz1, I also have an i5-13600K and I think I could speed up the generation with this code. Since I'm a beginner, where do I have to put...

Doesn't this just remove the error? Emojis still don't work.
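For context, the emoji breakage usually comes from multi-byte UTF-8 characters being split across streamed token chunks, so decoding each chunk on its own fails. Below is a small illustrative sketch of that failure mode using Python's incremental decoder; the byte chunks are made up, and this is not the fix discussed in the thread.

```python
# Multi-byte characters can arrive split across chunks; an incremental decoder
# buffers the partial bytes instead of raising or emitting replacement chars.
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()

# "👍" is 4 bytes (f0 9f 91 8d); pretend the stream emits it in two pieces.
chunks = [b"thumbs up \xf0\x9f", b"\x91\x8d done"]

text = ""
for chunk in chunks:
    text += decoder.decode(chunk)    # incomplete sequences are held internally
text += decoder.decode(b"", final=True)
print(text)  # -> "thumbs up 👍 done"
```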

Can we install the older version again until this is fixed? How?

Just checked it. llama.cpp is 40 ms per token for me and the Python bindings are 200 ms per token, so they're much slower. Sadly, downgrading to version 0.1.27 is still...
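For reference, a rough way to reproduce the per-token number with the llama-cpp-python bindings; the model path and parameters are assumptions, and this times a whole completion rather than instrumenting each token.

```python
# Rough ms-per-token measurement for the Python bindings. Model path,
# thread count, and prompt are placeholder assumptions.
import time
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_threads=6)

prompt = "The quick brown fox"
start = time.perf_counter()
result = llm(prompt, max_tokens=64)
elapsed = time.perf_counter() - start

n_tokens = result["usage"]["completion_tokens"]
print(f"{elapsed / n_tokens * 1000:.1f} ms per token over {n_tokens} tokens")
```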

I would also like to know how to do this. I have 2x 3060 12 GB, so I could load the 13B model, but it doesn't seem to be implemented.
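For reference, a hypothetical sketch of splitting a 13B model across two 12 GB GPUs with Hugging Face accelerate's `device_map`; this is not the loader this issue is about, and the model id, 8-bit quantization, and memory caps are assumptions.

```python
# Sketch: shard a 13B model across two 12 GB GPUs. In fp16 a 13B model
# (~26 GB) slightly overflows 2x12 GB, so this loads 8-bit weights via
# bitsandbytes. Model id and memory caps are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-13b"  # placeholder 13B checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,                     # ~13 GB of weights instead of ~26 GB
    device_map="auto",                     # split layers across both GPUs
    max_memory={0: "11GiB", 1: "11GiB"},   # leave headroom on each 3060
)

inputs = tokenizer("Once upon a time", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```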