Tieyang Yu

2 comments by Tieyang Yu

> it seems that the first problem can be solved by the following code in function `def get_accelerate_model(args, checkpoint_dir):`
>
> ```
> for name, module in model.named_modules():
>     if...
> ```

The non-equivalence has to be removed because the RMSNorm weight cannot be applied after XQ. The elementwise weight would then sit between Q and Q.T, so Q.T could not be canceled.
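A small numerical check may make this clearer. The sketch below is not taken from the original code base. It uses NumPy, and the names `rms_norm`, `w` (the RMSNorm weight), and `W` (the linear layer that follows the norm) are assumptions chosen for illustration. It compares folding the weight into the next linear layer against leaving it inside the norm after the rotation.

```python
import numpy as np

def rms_norm(x, weight=None, eps=1e-6):
    # Divide each row by its root-mean-square, then optionally apply the elementwise weight.
    y = x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return y if weight is None else y * weight

rng = np.random.default_rng(0)
d = 8
X = rng.standard_normal((4, d))                    # activations
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal matrix
w = rng.uniform(0.5, 1.5, size=d)                  # hypothetical RMSNorm weight
W = rng.standard_normal((d, d))                    # hypothetical following linear layer

# Original computation: RMSNorm with its weight, followed by the linear layer.
reference = rms_norm(X, w) @ W

# Weight folded into the linear layer: the norm has no weight, so
# rms_norm(X @ Q) == rms_norm(X) @ Q and Q @ Q.T cancels.
fused = rms_norm(X @ Q) @ (Q.T @ (np.diag(w) @ W))

# Weight left in the norm after the rotation: diag(w) sits between Q and Q.T.
unfused = rms_norm(X @ Q, w) @ (Q.T @ W)

fused_matches = np.allclose(reference, fused)
unfused_matches = np.allclose(reference, unfused)
print(fused_matches, unfused_matches)
```

In this sketch only the fused variant reproduces the reference output, since an orthogonal Q preserves each row's RMS but does not commute with a non-uniform diagonal scale.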