Question on quantized LLAMA3 versions for use with EAGLE
Hi,
This is a question for anyone who has tried EAGLE with LLAMA3: which LLAMA3 model exactly were you using? I assume a quantized version, since the original one from Meta is huge. Which quantization gave a good ratio of quality to performance in combination with EAGLE? I would also appreciate it if someone could point me to a quantized LLAMA3 model repo that is known to work with EAGLE. So far I have only found GGUF versions, which are not supported, and it seems I am not able to quantize the original LLAMA3 myself due to insufficient RAM. Any hints would be greatly appreciated, thank you!
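For context, what I had in mind was loading the target model quantized on the fly rather than finding a pre-quantized checkpoint. A minimal sketch of that idea, assuming the usual transformers + bitsandbytes route; the model id and the NF4 settings are just placeholders, and I have not confirmed that EAGLE accepts a bitsandbytes-quantized target model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes -- these settings are a common
# starting point, not something verified against EAGLE.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed target model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
target_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # quantizes while loading, so the full fp16 weights never sit in RAM at once
)
```

Whether speculative decoding still gives a good quality/performance trade-off on top of a 4-bit target like this is exactly what I'm hoping someone here can confirm.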
Any progress? I want to train a draft model with a quantized target model. Thanks~
@UltramanKuz @jin-eld
Hi, I'm just wondering if you both had any progress. Thank you in advance!
@brayden-hai I gave up on it; LLAMA3 was just too huge for my local system to quantize.
I also gave up on it