exllama
openllama support
Hi, really nice work here! I really appreciate that you've brought Llama inference to consumer-grade GPUs!! There is an ongoing project, https://github.com/openlm-research/open_llama, which seems to have a lot of potential. Do you think it will be supported in the future? Thanks!
Afaik it should be supported natively, have you tried it? The underlying architecture is the same as the Llama models.
Okay, good to know @nikshepsvn, I will try that tomorrow. Will update here!
I've always assumed as much but just decided I'd look into it when they release a 33B model. I'm an elitist.
Confirming that open_llama_13b, quantized with GPTQ-for-LLaMa to 4 bits with group size 32, works well.
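For anyone else who wants to try it, here is a rough sketch of loading a quantized OpenLLaMA checkpoint through exllama's Python classes, modeled on the repo's basic example script. The model directory path and sampling settings are placeholders, not the exact ones I used, so adjust them for your own setup.

```python
import os, glob

# Classes from the exllama repo (run this from the repo root so the modules are importable)
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

# Hypothetical path to a GPTQ-quantized open_llama_13b (4-bit, group size 32) directory
model_directory = "/models/open_llama_13b-4bit-32g/"

tokenizer_path = os.path.join(model_directory, "tokenizer.model")
model_config_path = os.path.join(model_directory, "config.json")
model_path = glob.glob(os.path.join(model_directory, "*.safetensors"))[0]

# Build the model, tokenizer, cache and generator the same way as for a regular Llama model
config = ExLlamaConfig(model_config_path)
config.model_path = model_path
model = ExLlama(config)
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

# Example sampling settings (placeholders)
generator.settings.temperature = 0.95
generator.settings.top_p = 0.65

print(generator.generate_simple("Once upon a time,", max_new_tokens=200))
```

Since OpenLLaMA uses the same architecture and tokenizer format as Llama, nothing OpenLLaMA-specific is needed here; the only differences are the checkpoint files you point it at.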