SVD Initialisation

Open HichTala opened this issue 1 year ago • 3 comments

Hi,

First of all, thank you for your work, it's really appreciated.

I have a question about replacing full rank with low rank in this section. https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L60C1-L63C8

While for convolution modules there is no problem, https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L117C1-L125C63

It seems to me that for Linear modules the U, S and V matrices are randomly initialized. My understanding of the paper (I may be wrong) is that this should not be the case, and that the module should instead be initialized from an SVD of the pre-trained weights.

Thank you in advance for your clarification.

HichTala avatar May 05 '24 18:05 HichTala

I have the same problem with the Linear module. When applying torch.svd() to the Linear weight, S can be NaN. Did you solve this problem?

DUT-CSJ avatar Jun 03 '24 10:06 DUT-CSJ

I seem to have solved this problem by fixing the code.

DUT-CSJ avatar Jun 04 '24 01:06 DUT-CSJ

Hi, I solved the problem too, by initializing the U, S and V matrices from the SVD of the pre-trained weights instead of using the code's random initialization.
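
For illustration, a minimal sketch of what such an initialization could look like. This is not the repository's code: the class name `SVDLinear` and its attribute names are hypothetical, and the choices to decompose in float64 and to reject non-finite weights are assumptions aimed at the NaN issue mentioned above, not a confirmed diagnosis. It uses `torch.linalg.svd`, which PyTorch recommends over the deprecated `torch.svd`.

```python
import torch
import torch.nn as nn


class SVDLinear(nn.Module):
    """Hypothetical replacement for nn.Linear whose factors come from an SVD
    of the pre-trained weight (illustrative names, not the SVF repo's API)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        weight = linear.weight.detach()
        # Assumption: fail early if the pre-trained weight is already corrupted.
        if not torch.isfinite(weight).all():
            raise ValueError("weight contains NaN or Inf")
        # Assumption: decompose in float64, then cast back to the original dtype.
        U, S, Vh = torch.linalg.svd(weight.double(), full_matrices=False)
        dtype = weight.dtype
        # All three factors are registered as parameters; which of them to
        # freeze during fine-tuning is left to the caller.
        self.U = nn.Parameter(U.to(dtype))
        self.S = nn.Parameter(S.to(dtype))
        self.Vh = nn.Parameter(Vh.to(dtype))
        self.bias = (
            None if linear.bias is None
            else nn.Parameter(linear.bias.detach().clone())
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rebuild the weight as U @ diag(S) @ Vh and apply it like nn.Linear.
        weight = self.U @ torch.diag(self.S) @ self.Vh
        return nn.functional.linear(x, weight, self.bias)
```

Right after construction, such a module should reproduce the output of the original `nn.Linear` up to floating-point error, which is an easy sanity check that the factors were not randomly initialized.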

HichTala avatar Jun 04 '24 14:06 HichTala