Running LoRA fine-tuned versions of Llama 3.2 using transformers.js
Hello,
I am wondering if it’s possible to run a LoRA fine-tuned version of Llama 3.2 in the browser using transformers.js. Ideally, I would like to load the base model once and then dynamically load and swap between different LoRA adapters at runtime based on the current task, without reloading the base model each time.
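For context, here is roughly how I load the base model today with the transformers.js v3 pipeline API. This is a minimal sketch: the onnx-community export, WebGPU device, and q4f16 dtype are choices I made for this example, not requirements.

```js
import { pipeline } from '@huggingface/transformers';

// Load the base model once (onnx-community export of Llama 3.2 1B, quantized).
const generator = await pipeline(
  'text-generation',
  'onnx-community/Llama-3.2-1B-Instruct',
  { device: 'webgpu', dtype: 'q4f16' },
);

// What I'd like: point this same base model at a different LoRA adapter here,
// per task, without re-downloading the weights loaded above.
const messages = [{ role: 'user', content: 'Write a haiku about adapters.' }];
const output = await generator(messages, { max_new_tokens: 64 });
console.log(output[0].generated_text.at(-1).content);
```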
Is this supported in transformers.js? If so, are there any tutorials or examples illustrating how to set this up in a browser environment?
Any guidance or documentation on this would be greatly appreciated. Thank you!
Since transformers.js uses ONNX Runtime under the hood, perhaps you can build on the multi-LoRA functionality they have already implemented there:
https://onnxruntime.ai/blogs/multilora
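As far as I know, onnxruntime-web does not yet expose the dedicated adapter API described in that post, but the multi-LoRA design exports adapter weights as graph inputs to a single base model. So you could approximate the swap in the browser by feeding different adapter tensors per run. Below is a rough sketch only: the model file name, input names, and shapes are all made-up assumptions, and it presumes a base model exported with its LoRA A/B matrices declared as inputs.

```js
import * as ort from 'onnxruntime-web';

// Assumption: the base model was exported with LoRA A/B matrices as graph
// inputs (mirroring how ORT's multi-LoRA support wires adapters in).
// The file name and input names below are hypothetical.
const session = await ort.InferenceSession.create('model-with-lora-inputs.onnx');

// Hypothetical helper: fetch one adapter's weights and wrap them as tensors.
// A real adapter file would carry its own names/shapes; these are placeholders.
async function loadAdapter(url, rank = 16, hidden = 2048) {
  const buf = await (await fetch(url)).arrayBuffer();
  const floats = new Float32Array(buf);
  return {
    'lora_A.weight': new ort.Tensor('float32', floats.subarray(0, rank * hidden), [rank, hidden]),
    'lora_B.weight': new ort.Tensor('float32', floats.subarray(rank * hidden), [hidden, rank]),
  };
}

// Load two adapters up front; the base session stays loaded the whole time.
const mathAdapter = await loadAdapter('/adapters/math.bin');
const codeAdapter = await loadAdapter('/adapters/code.bin');

// Swapping is then just a matter of which tensors go into the feeds.
// (Token ids are placeholders; a real Llama graph also needs attention_mask,
// position_ids, KV-cache inputs, etc.)
const inputIds = new ort.Tensor('int64', BigInt64Array.from([1n, 2n, 3n]), [1, 3]);
const results = await session.run({ input_ids: inputIds, ...mathAdapter });
```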