vae_cf
Question about l2 normalization of input
Hi, I have a question about the L2 normalization of the input.
In the q_graph method of Multi_VAE and the forward_pass method of Multi_DAE, why do you apply L2 normalization to the input vector? I don't understand the purpose of that normalization.
Sorry to bother you, and thank you!
I wonder about that too. Since it is a DAE and I don't see any noise being added to the input, is the L2 normalization itself the noise?
The noise is added by dropout. I think the L2 normalization is there to normalize the feedback data, to deal with the problem of accommodating users versus fastidious users.
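A rough NumPy sketch of the two steps discussed here (not the repository's TensorFlow code): row-wise L2 normalization followed by dropout acting as the noise. The helper names `l2_normalize_rows` and `dropout`, the toy click matrix, and the keep probability of 0.5 are all assumptions for illustration.

```python
import numpy as np

def l2_normalize_rows(x, eps=1e-12):
    """Scale each row (one user) to unit L2 norm; eps guards empty rows."""
    norms = np.sqrt(np.sum(x * x, axis=1, keepdims=True))
    return x / np.maximum(norms, eps)

def dropout(x, keep_prob, rng):
    """Inverted dropout: zero each entry with probability 1 - keep_prob
    and rescale the survivors by 1 / keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)

# Hypothetical binary click matrix: user 0 clicked 2 items, user 1 clicked 6.
x = np.array([[1, 1, 0, 0, 0, 0],
              [1, 1, 1, 1, 1, 1]], dtype=float)

h = l2_normalize_rows(x)       # every user now has norm 1
noisy = dropout(h, 0.5, rng)   # Bernoulli mask corrupts the input
```

In this sketch the normalization is deterministic, so the only randomness (the "noise") comes from the dropout mask.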
Can you explain the L2 normalization in other words, please? I didn't understand. And why L2 norm instead of layer/batch norm? By the way, I implemented it on my own on another dataset and the normalization didn't seem to help.
Here's what I think. It comes from a problem you face with explicit feedback. For example, accommodating users rate items they don't like 3/5 stars and items they like 5/5 stars, while fastidious users rate items they don't like 1/5 stars and items they like 3/5 stars. So the feedback data can be distorted by some users, and you have to normalize each user's feedback. That's why the L2 norm is applied per user. I think you will face this problem with implicit feedback as well. And I think when your feedback data is normally distributed, or close to it, you don't need normalization.
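A small sketch of what per-user L2 normalization does and does not remove, under the assumption that users differ in rating scale. The rating vectors and the helper `l2_normalize` are made up for illustration; they are not taken from the repository or the paper.

```python
import numpy as np

def l2_normalize(v):
    """Divide a single user's feedback vector by its L2 norm."""
    return v / np.linalg.norm(v)

# Two hypothetical users with the same preferences on different scales.
generous = np.array([4.0, 2.0, 4.0])
strict = np.array([2.0, 1.0, 2.0])

# A pure scale difference disappears after normalization.
same_after_norm = np.allclose(l2_normalize(generous), l2_normalize(strict))

# A shift (3 vs 5 stars against 1 vs 3 stars) is not removed by L2 norm.
shifted_a = l2_normalize(np.array([3.0, 5.0]))
shifted_b = l2_normalize(np.array([1.0, 3.0]))
shift_removed = np.allclose(shifted_a, shifted_b)
```

So in this toy case the normalization equalizes the overall magnitude of each user's vector, but users whose ratings differ by an offset still look different afterwards.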
I have an unrelated question. With the Gaussian likelihood, the log-likelihood contains a confidence weight c_ij, but the multinomial log-likelihood does not. Can you explain why the multinomial doesn't need c_ij?
Your explanation makes sense to me. In this case, where the inputs are binary, normalization does not help.
Regarding the other issue, can you point to it in the code?
In the VAE CF paper, the Gaussian log-likelihood (Eq 3) contains a confidence weight c_ui, but the multinomial log-likelihood (Eq 2) does not. Can you explain why the multinomial doesn't need a confidence weight?
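For reference, the two log-likelihoods being compared can be written roughly as below, as I read them from the paper (up to additive constants). The notation is an assumption on my part: x_ui is the feedback of user u on item i, f_ui is the decoder output, and π_i(z_u) is the softmax of the decoder output over items.

```latex
% Multinomial log-likelihood (Eq 2): no per-item confidence weight
\log p_\theta(\mathbf{x}_u \mid \mathbf{z}_u) \overset{c}{=} \sum_i x_{ui} \log \pi_i(\mathbf{z}_u)

% Gaussian log-likelihood (Eq 3): each squared error is weighted by c_{ui}
\log p_\theta(\mathbf{x}_u \mid \mathbf{z}_u) \overset{c}{=} -\sum_i \frac{c_{ui}}{2} \left( x_{ui} - f_{ui} \right)^2
```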
And I think that even with binary inputs, your feedback can still be distorted.
@jin530 In my opinion, for both the DAE and the VAE the author used a denoising structure, that is, L2 normalization and dropout, which adds Bernoulli noise to the input data.