Results: 2 comments by Tieyang Yu
> it seems that the first problem can be solved by the following code in function `def get_accelerate_model(args, checkpoint_dir):` > > ``` > for name, module in model.named_modules(): > if...
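The non-equivalence has to be removed because the RMSNorm weight cannot be multiplied onto the end of XQ. The weight is a per-channel (diagonal) scale, and a diagonal scale does not generally commute with Q, so if it sits between Q and Q.T, Q.T can no longer be canceled.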
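To make the cancellation argument concrete, here is a minimal numeric sketch in plain Python. It is only an illustration under assumptions: the helper names (`rms_normalize`, `rotate`, `scale`), the 2x2 rotation used as Q, and the sample values of `x` and `w` are all hypothetical and do not come from the code discussed in the thread. The epsilon that real RMSNorm layers add is also left out for simplicity.

```python
import math

def rms_normalize(x):
    # Weight-free RMS normalization of one row vector (epsilon omitted: assumption).
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]

def rotate(x, q):
    # Row vector times matrix: (xQ)_j = sum_i x_i * Q_ij.
    return [sum(x[i] * q[i][j] for i in range(len(x))) for j in range(len(q[0]))]

def transpose(q):
    return [list(row) for row in zip(*q)]

def scale(x, w):
    # Elementwise multiply by the RMSNorm weight, i.e. x @ diag(w).
    return [a * b for a, b in zip(x, w)]

def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

# Hypothetical inputs: a 2x2 rotation as the orthogonal Q, one activation row, one weight.
theta = 0.7
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
x = [1.0, 2.0]
w = [0.5, 3.0]

# Case 1: no weight. Q preserves the RMS, so norm(xQ) Q.T equals norm(x).
unweighted = rotate(rms_normalize(rotate(x, Q)), transpose(Q))
err_unweighted = max_abs_diff(unweighted, rms_normalize(x))

# Case 2: weight multiplied onto the end of XQ. diag(w) now sits between
# Q and Q.T, so Q.T does not cancel and the result differs from RMSNorm(x).
reference = scale(rms_normalize(x), w)
weighted = rotate(scale(rms_normalize(rotate(x, Q)), w), transpose(Q))
err_weighted = max_abs_diff(weighted, reference)

print("no weight, max error:   ", err_unweighted)
print("weight after XQ, error: ", err_weighted)
```

In this sketch the first error is at floating-point rounding level, while the second is clearly nonzero whenever the entries of `w` are not all equal.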