Lion
Is it possible to distill Llama3 into Qwen2-1.5B, and how would I do it?
As a test case, I want to distill Llama3 into a small LLM, for example Qwen2-1.5B. Is this possible, and would the loss function need to be changed?
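
For context on the loss-function part of the question, here is a minimal sketch of the usual temperature-scaled KL distillation loss for a single token position, in plain Python. It assumes the teacher and student logits are defined over the same vocabulary; that does not hold for Llama3 and Qwen2 out of the box, since they use different tokenizers, so this is only the baseline that might need changing. The names `log_softmax` and `kd_loss` and the default temperature are illustrative, not taken from any library.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) for one token position, scaled by T^2.

    Assumption: both logit vectors index the same vocabulary.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student vocabularies differ in size")
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2
```

The length check makes the vocabulary mismatch explicit: with different tokenizers the two logit vectors cannot be compared position by position, which is the case where a different loss or a vocabulary alignment step would come in.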
thanks~