pytorch-NMF
The returned H of torchnmf.nmf.NMFD() seems wrong
I have a matrix $V$ whose dimensions are $513 \times 15$. I want to find 15 $W_t$ matrices and an $H$ matrix such that

$$V \approx \hat{V} = \sum_{t=0}^{T-1} W_t \cdot \overset{t\rightarrow}{H}$$

This is equation 4 from the NMFD paper, where $\overset{t\rightarrow}{H}$ denotes $H$ with its columns shifted $t$ positions to the right.
I understand that I need to add an extra dimension to the $V$ matrix for PyTorch. So, after reshaping my $V$ to `torch.Size([1, 513, 15])`, I called `net = torchnmf.nmf.NMFD(V_tensor.shape, rank=2, T=15)` and then `net.fit(V_tensor)`. (Note: `V_tensor` is the tensor version of my $V$.)
However, `torchnmf.nmf.NMFD()` returns an `H` whose dimensions are only $1 \times 2 \times 1$. I think the dimensions of `H` should be $1 \times 2 \times 15$.
It seems that in the `reconstruct(H, W)` function of the `NMFD` class you calculate the 1-D convolution between `H` and `W`:

```python
return F.conv1d(H, W.flip(2), padding=pad_size)
```
In my opinion, this `reconstruct` is not the same as equation 4 of the original paper. If you expand the summation of equation 4, you can see that all but the first column of $W_0 H$ (i.e. `W_0H[:, 1:]`) contribute to the estimated $\hat{V}$, and all but the first two columns of $W_1 H$ (i.e. `W_1H[:, 2:]`) contribute to the estimated $\hat{V}$. Therefore, the returned `H` should not be padded with zeros on the left and right inside the `reconstruct` function.
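To make the comparison concrete, here is a toy 1-D sketch (pure Python, my own illustration, not library code) of the two readings: equation 4's shift-and-sum, where `H` has the same length `M` as `V`, versus a full convolution of a shorter `H`. For simplicity each $W_t$ is a scalar rather than a matrix.

```python
def shift_right(h, t):
    """Shift sequence h right by t positions, zero-padding on the left and
    truncating to the original length (the equation-4 column shift)."""
    return ([0.0] * t + h[:len(h) - t]) if t else list(h)

def reconstruct_eq4(W, h):
    """V_hat[m] = sum over t of W[t] * (h shifted right by t); len(h) == M."""
    M = len(h)
    out = [0.0] * M
    for t, w_t in enumerate(W):
        shifted = shift_right(h, t)
        for m in range(M):
            out[m] += w_t * shifted[m]
    return out

def reconstruct_full_conv(W, h):
    """Plain full convolution: output length is len(h) + len(W) - 1."""
    out = [0.0] * (len(h) + len(W) - 1)
    for t, w_t in enumerate(W):
        for m, h_m in enumerate(h):
            out[t + m] += w_t * h_m
    return out

W = [1.0, 0.5]       # T = 2 templates (scalars here for simplicity)
h = [1.0, 2.0, 3.0]  # M = 3 activations

print(reconstruct_eq4(W, h))        # → [1.0, 2.5, 4.0]  (length M)
print(reconstruct_full_conv(W, h))  # → [1.0, 2.5, 4.0, 1.5]  (length M + T - 1)
```

Note that the equation-4 result equals the first `M` entries of the full convolution, which is exactly where the two interpretations diverge: they agree inside the frame and differ only in the trailing tail.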
Please let me know if I misunderstood it, or if it is actually a bug. I am trying to use `torchnmf.nmf.NMFD()` for some audio analysis, and I would be happy to discuss this further. Thank you!
Hi @ZhimaoLin, good question! You're right. It's not the same as in the paper. It's just a different interpretation of how convolution works.
In the original definition of convolution, given two sequences of lengths `A` and `B` respectively, their convolution results in a sequence of length `A + B - 1`. So here you can see that I actually make `H` with a size of `M - T + 1` and implement the original convolution inside the `reconstruct` function.
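The length bookkeeping above can be checked in a couple of lines (my own arithmetic sketch, not library code): with `len(H) = M - T + 1`, a full convolution against a template spanning `T` frames reconstructs exactly `M` columns.

```python
def full_conv_length(a: int, b: int) -> int:
    """Output length of a full convolution of sequences of lengths a and b."""
    return a + b - 1

M, T = 15, 15        # the dimensions from the question above
H_len = M - T + 1    # = 1, which matches the observed H of size 1 x 2 x 1
print(H_len, full_conv_length(H_len, T))  # → 1 15
```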
A temporary workaround would be padding `T - 1` extra zeros on the right side of your `V` before passing it to the constructor, so that you get an `H` of the size you want.
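A minimal sketch of that workaround, assuming `torch` is installed (the `NMFD` construction at the end mirrors the call from the question above and is left as a comment):

```python
import torch
import torch.nn.functional as F

M, T = 15, 15
V = torch.rand(1, 513, M)      # stand-in for the (1, 513, 15) tensor above

# Pad T - 1 zeros on the right of the time axis; the fitted H then has
# length (M + T - 1) - T + 1 = M = 15.
V_padded = F.pad(V, (0, T - 1))
print(V_padded.shape)          # → torch.Size([1, 513, 29])

# Then construct and fit as in the question:
# net = torchnmf.nmf.NMFD(V_padded.shape, rank=2, T=T)
# net.fit(V_padded)
```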
I plan to add padding options for users to choose which type of convolution they want, probably following the format from scipy, where you have three options: `full`, `valid`, and `same`. The current implementation is `full`, while your requested behaviour is `same`. For `valid`, the length of `H` would be `M + T - 1`. Let me know whether it works for you or if you have other suggestions.
Anyway, thanks for bringing up the issue!