Junyang Lin

173 comments of Junyang Lin

No, just click the link — it goes to the 72B demo. We use it as the default one.

@ggerganov feel free to take a look at this small code change 😃

You mean 1.5 consumes more memory than 1? I did test the inference costs; see the doc here: https://qwen.readthedocs.io/en/latest/benchmark/hf_infer.html . Maybe I should add training costs for you guys.

Context length might be the issue. Are you using the official script, LLaMA-Factory, or Axolotl?

> > Context length might be the issue. Are you using the official script, LLaMA-Factory, or Axolotl?
>
> The LLaMA-Factory framework uses normal VRAM for LoRA, but...

#573 — here is the solution. Tons of issues are related to HuggingFaceEmbedding...

> For
>
> > check_code_quality
>
> let's resolve it toward the end of the PR (before we merge to main)

I think we only have a code quality issue...

> Hi @JustinLin610 Thank you for the update again.

OK, I will take care of them, but could you share which GPU you ran the tests on? (You ran with...

Hey, I think our update to the tests solves the issues mentioned. Could you take another look?

#573 — this problem happens a lot with llama_index. I advise you to load the embedding in llama_index and see if it works. Or you can just use the solution in the...