
Plans for Different Sized Models - 1.5B or 3B

Open AmanPriyanshu opened this issue 10 months ago • 4 comments

Thank you for this amazing work!

Is there a plan to train and release smaller (1.5B to 3B) or larger (14B to 70B) sized models for LLaDA?

AmanPriyanshu avatar Feb 26 '25 16:02 AmanPriyanshu

Thank you for your interest! Currently, we don't have a specific plan or timeline.

nieshenx avatar Feb 27 '25 02:02 nieshenx

> Thank you for your interest! Currently, we don't have a specific plan or timeline.

I'd also like to ask whether general model quantization techniques are applicable here. LLaDA seems to perform well on coding tasks, and if quantization can reduce the inference burden, I believe it could see widespread use :D

Wenbobobo avatar Feb 28 '25 14:02 Wenbobobo

Thank you for your attention!

Currently, we are a very small team and don't have the capacity to carry out model quantization work at the moment. I've noticed that some people in the community are attempting this; perhaps you could refer to their work.
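Since the thread asks about general quantization techniques: the core idea, independent of LLaDA's diffusion objective, is to store weights as low-precision integers plus a floating-point scale factor. Below is a minimal, illustrative sketch of symmetric per-tensor int8 quantization in NumPy; the function names are hypothetical and this is not LLaDA's actual deployment path.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

# Toy example: quantize a random weight matrix and check reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error is bounded by half a quantization step (scale / 2).
assert np.abs(w - w_hat).max() <= scale / 2 + 1e-6
```

Real quantization libraries (e.g. bitsandbytes or GPTQ-style methods) use per-channel or block-wise scales and calibration data, but the storage-and-rescale principle is the same.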

nieshenx avatar Mar 04 '25 05:03 nieshenx

Is it public who's working on this? And if so, do they have a public page or something?

Angular-Angel avatar Mar 12 '25 05:03 Angular-Angel