SVF
SVD Initialisation
Hi,
First of all, thank you for your work, it's really appreciated.
I have a question about how full-rank modules are replaced with low-rank ones in this section: https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L60C1-L63C8
For convolution modules there is no problem: https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L117C1-L125C63
For Linear modules, however, it seems to me that the U, S and V matrices are randomly initialized. My understanding from the paper (I may be wrong) is that the module should instead be initialized from an SVD of the pre-trained weights.
Thank you in advance for your clarification.
I have the same problem with the Linear module. When applying torch.svd() to the Linear weight, S can be NaN. Have you solved this problem?
I seem to have solved this problem by fixing the code.
Hi, I solved the problem too, by initializing the U, S and V matrices from the SVD of the pre-trained weights instead of using the random initialization in the code.
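For anyone landing here later, this is a minimal sketch of what such a fix might look like. It is not the repository's code: the class name `SVDLinear` is hypothetical, freezing U and V while training only S is an assumption about the intended setup, and running the decomposition in float64 is just a precaution that may or may not help with the NaN values mentioned above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SVDLinear(nn.Module):
    """Hypothetical replacement for nn.Linear, initialised from an SVD
    of the pre-trained weight rather than from random values."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        weight = linear.weight.detach()
        # Decompose in float64, then cast back to the original dtype.
        # (Assumption: extra precision is a cheap precaution, not a proven fix.)
        U, S, Vh = torch.linalg.svd(weight.double(), full_matrices=False)
        dtype = weight.dtype
        # Assumption: U and Vh stay frozen, only the singular values train.
        self.U = nn.Parameter(U.to(dtype), requires_grad=False)
        self.S = nn.Parameter(S.to(dtype))
        self.Vh = nn.Parameter(Vh.to(dtype), requires_grad=False)
        if linear.bias is None:
            self.bias = None
        else:
            self.bias = nn.Parameter(linear.bias.detach().clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rebuild W = U diag(S) Vh; scaling the columns of U by S avoids
        # materialising the diagonal matrix.
        weight = (self.U * self.S) @ self.Vh
        return F.linear(x, weight, self.bias)


# Usage: wrap a pre-trained layer and compare outputs before any fine-tuning.
pretrained = nn.Linear(8, 4)
wrapped = SVDLinear(pretrained)
x = torch.randn(3, 8)
max_diff = (wrapped(x) - pretrained(x)).abs().max().item()
```

With this initialisation the wrapped layer should reproduce the pre-trained layer's output up to floating-point error before fine-tuning starts, which is an easy sanity check that the weights were not randomly re-initialised.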