Boltzmann.jl
CRBM momentum only applied to W
I noticed that your function only applies momentum to dW, the gradient for the weights, and not to any of the other parameters. Is that what you intended?
    function grad_apply_momentum!(crbm::ConditionalRBM{T}, X::Mat{T},
                                  dtheta::Tuple, ctx::Dict) where T
        dW, dA, dB, db, dc = dtheta
        momentum = @get(ctx, :momentum, 0.9)
        dW_prev = @get_array(ctx, :dW_prev, size(dW), zeros(T, size(dW)))
        # same as: dW += momentum * dW_prev
        axpy!(momentum, dW_prev, dW)
    end
I think momentum for the other parameters didn't make things any better for my use case at the time, so I decided not to include an unchecked feature. However, if it improves things in your case, it makes sense to update the code.
Out of curiosity, what are you using RBMs for? I thought everybody had moved to variational autoencoders, which are much faster to train and more numerically stable. Although I'm not sure there's a direct counterpart to the conditional RBM among VAEs.
Yeah, I'm pretty sure I was always setting momentum to 0.0 in my experiments... which is probably why I forgot to implement that for the autoregressive weights :) I'm not sure how much it'll help, but it's probably worth adding that in for consistency.
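For reference, here is a rough sketch of what applying momentum to every gradient in `dtheta` might look like. It is standalone and is not the library's code: the function name `apply_momentum_all!`, the `names` keyword, and the `:dA_prev`, `:dB_prev`, `:db_prev`, `:dc_prev` context keys are hypothetical, and it uses plain `get`/`get!` rather than the `@get`/`@get_array` macros from the snippet above. It also assumes the `*_prev` buffers are filled in somewhere else, as with `dW_prev`.

```julia
using LinearAlgebra: axpy!

# Hypothetical helper, not part of Boltzmann.jl: adds momentum * previous
# gradient to every gradient array in dtheta, not only to dW.
function apply_momentum_all!(dtheta::Tuple, ctx::Dict;
                             names=(:dW, :dA, :dB, :db, :dc))
    momentum = get(ctx, :momentum, 0.9)
    for (name, d) in zip(names, dtheta)
        key = Symbol(name, :_prev)          # e.g. :dW_prev (assumed naming)
        # create a zero buffer the first time a key is requested
        prev = get!(ctx, key) do
            zeros(eltype(d), size(d))
        end
        axpy!(momentum, prev, d)            # same as: d += momentum * prev
    end
    return dtheta
end

# Small usage example with made-up shapes
ctx = Dict{Symbol,Any}(:momentum => 0.5, :dW_prev => ones(2, 2))
dtheta = (zeros(2, 2), zeros(2, 2), zeros(2, 2), zeros(2), zeros(2))
apply_momentum_all!(dtheta, ctx)
```

In this example only `dW` changes, because the other `*_prev` buffers start out as zeros. With momentum set to 0.0, as in the experiments mentioned above, the call leaves every gradient unchanged.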