DoRA: I found some confusing code in PEFT

Open guokan987 opened this issue 11 months ago • 3 comments

Code:

    result_dora = (mag_norm_scale - 1) * (F.linear(x, transpose(weight, self.fan_in_fan_out))) + mag_norm_scale * lora_B(lora_A(x)) * scaling

Question: What is the purpose of the factors (mag_norm_scale - 1) and mag_norm_scale? Also, because of the (mag_norm_scale - 1) factor, result_dora cannot equal F.linear(x, transpose(weight, self.fan_in_fan_out)) at initialization.

Mar 15 '24 06:03 guokan987
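
One way to probe the question numerically is the small dependency-free sketch below. It rests on assumptions that are not stated in the post: that result_dora is added to the base layer's output rather than used on its own, that mag_norm_scale is the learned magnitude divided by the per-output-row L2 norm of the merged weight, and that at initialization lora_B is zero and the magnitude equals the row norms of the base weight. The helper names (matvec, row_norms, dora_forward) and the toy matrices are hypothetical, chosen only for illustration. Under those assumptions the (mag_norm_scale - 1) factor compensates for the base output that has already been counted once.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def row_norms(M):
    """L2 norm of each row of M."""
    return [math.sqrt(sum(m * m for m in row)) for row in M]

def dora_forward(x, W, A, B, magnitude, scaling):
    """Hypothetical re-implementation of the expression from the post.

    W: (out x in) base weight, A: (r x in), B: (out x r),
    magnitude: one learned value per output row.
    """
    rank = len(A)
    # Merged weight W + scaling * B @ A (assumed definition, see lead-in).
    merged = [
        [W[i][j] + scaling * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(len(W[0]))]
        for i in range(len(W))
    ]
    mag_norm_scale = [m / n for m, n in zip(magnitude, row_norms(merged))]
    base_out = matvec(W, x)              # plays the role of F.linear(x, W)
    lora_out = matvec(B, matvec(A, x))   # plays the role of lora_B(lora_A(x))
    # The expression quoted in the question, applied per output element.
    result_dora = [
        (s - 1) * b + s * l * scaling
        for s, b, l in zip(mag_norm_scale, base_out, lora_out)
    ]
    # Assumption: the caller adds result_dora to the base output.
    total = [b + d for b, d in zip(base_out, result_dora)]
    return base_out, result_dora, total, mag_norm_scale

W = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -0.5]]
x = [1.0, 2.0]
scaling = 2.0

# Assumed initialization: lora_B is zero, magnitude equals the row norms of W.
B_init = [[0.0], [0.0]]
base0, dora0, total0, scale0 = dora_forward(x, W, A, B_init, row_norms(W), scaling)

# Some arbitrary state after training: non-zero lora_B, changed magnitude.
B_trained = [[1.0], [-1.0]]
base1, dora1, total1, scale1 = dora_forward(x, W, A, B_trained, [2.0, 3.0], scaling)

print("init:   ", scale0, dora0, total0)
print("trained:", scale1, dora1, total1)
```

If these assumptions hold, mag_norm_scale is 1 at initialization, so result_dora is zero and the sum equals the base output. Later on, base_out + result_dora simplifies to mag_norm_scale * (base_out + scaling * lora_out), so result_dora is a correction term and is not expected to equal F.linear(x, weight) by itself.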
