Sting-Scorpion

1 issue by Sting-Scorpion

Thank you for your solid work. I would like to ask whether the code is suitable for Qwen3 models, which add q_norm and k_norm in self_attention and would lead...