deep-limits
Question for the distribution of singular values
Hi, I ran your code in the file deep_singular_value_dist.m, and I have a question about the variable scale in that file. It turns out this variable is very important: if it is set to 1, figure 1 and figure 2 show no obvious difference. Why does this happen? What is the role of the variable scale? Could you clarify it for me? I could not find an explanation in your PhD thesis. Thank you very much!
It's been a while, but I think scale is meant to reflect the ratio of the lengthscale of the kernel to its output variance. I think you're right that if it's exactly 1 then nothing blows up. The unstated assumption is that hitting exactly 1 is hard to achieve in practice with neural nets (kind of like with RNNs).
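To make the RNN analogy in the reply above concrete, here is a minimal toy sketch. It is not the code from deep_singular_value_dist.m and it does not model the kernel. It assumes a purely linear network with random Gaussian layers, and the function name log_norm_after_layers and the parameters depth, width, and seed are hypothetical, chosen only for illustration. It shows how a per-layer multiplier compounds with depth: the log-norm of a signal shifts by depth * log(scale), so only a multiplier close to 1 avoids exponential growth or decay.

```python
import math
import random


def log_norm_after_layers(scale, depth=50, width=30, seed=0):
    """Log of the norm of a unit vector after `depth` random linear layers.

    Toy assumption: each layer is a width x width Gaussian matrix whose
    entries have standard deviation scale / sqrt(width).
    """
    rng = random.Random(seed)  # same seed -> same random draws for every scale
    x = [1.0 / math.sqrt(width)] * width  # unit-norm input vector
    std = scale / math.sqrt(width)
    log_norm = 0.0
    for _ in range(depth):
        # One layer: y = W x, with W drawn entry by entry.
        y = [sum(rng.gauss(0.0, 1.0) * std * xj for xj in x) for _ in range(width)]
        norm = math.sqrt(sum(v * v for v in y))
        log_norm += math.log(norm)
        # Renormalise so the running vector never overflows or underflows;
        # the accumulated log keeps track of the true norm.
        x = [v / norm for v in y]
    return log_norm


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0):
        print(s, log_norm_after_layers(s))
```

With a shared seed, the value for any scale differs from the value at scale 1 by depth * log(scale), which is why small deviations from 1 become large after many layers. Whether the same mechanism explains the figures produced by the original script is an assumption, not something this sketch verifies.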