
openllama support

cnut1648 opened this issue 1 year ago · 4 comments

Hi, really nice work here! I really appreciate that you've brought llama inference to consumer-grade GPUs!! There is an ongoing project, https://github.com/openlm-research/open_llama, which seems to have a lot of potential. Do you think it will be supported in the future? Thanks!

cnut1648 · Jun 26 '23 20:06

AFAIK it should be supported natively, have you tried? The underlying architecture is the same as the LLaMA models.
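
Untested with OpenLLaMA specifically, but since the architecture is identical, loading should look the same as for any GPTQ-quantized LLaMA checkpoint. A minimal sketch adapted from exllama's example_basic.py; the model directory path is just a placeholder:

```python
import os, glob

from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

# Placeholder path to a GPTQ-quantized OpenLLaMA checkpoint
model_directory = "/models/open_llama_13b-4bit-32g/"

# Locate the files exllama needs inside that directory
tokenizer_path = os.path.join(model_directory, "tokenizer.model")
model_config_path = os.path.join(model_directory, "config.json")
model_path = glob.glob(os.path.join(model_directory, "*.safetensors"))[0]

# Same setup as for a regular LLaMA model: config -> model -> cache -> generator
config = ExLlamaConfig(model_config_path)
config.model_path = model_path
model = ExLlama(config)
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

prompt = "Once upon a time,"
print(generator.generate_simple(prompt, max_new_tokens = 100))
```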

nikshepsvn · Jun 26 '23 21:06

Okay, good to know @nikshepsvn, I will try that tomorrow. Will update here!

cnut1648 · Jun 26 '23 23:06

I've always assumed as much but just decided I'd look into it when they release a 33B model. I'm an elitist.

turboderp · Jun 26 '23 23:06

Confirming that open_llama_13b, quantized with GPTQ-for-LLaMa to 4 bits / group size 32, works well.
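
For anyone reproducing this, the quantization step with GPTQ-for-LLaMa looks roughly like the command below. Paths, the calibration dataset (c4), and the output filename are illustrative, and flags other than --wbits 4 --groupsize 32 are optional choices, not something confirmed above:

```bash
python llama.py /models/open_llama_13b c4 \
    --wbits 4 --groupsize 32 --true-sequential --act-order \
    --save_safetensors open_llama_13b-4bit-32g.safetensors
```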

alain40 · Jul 03 '23 04:07