Boltzmann.jl

CRBM momentum only applied to W

Open davidbp opened this issue 6 years ago • 2 comments

I noticed that this function only applies momentum to dW, the gradient for the weights, and not to the gradients of the other parameters (dA, dB, db, dc). Is that what you intended?

function grad_apply_momentum!(crbm::ConditionalRBM{T}, X::Mat{T},
                              dtheta::Tuple, ctx::Dict) where T
    dW, dA, dB, db, dc = dtheta
    momentum = @get(ctx, :momentum, 0.9)
    dW_prev = @get_array(ctx, :dW_prev, size(dW), zeros(T, size(dW)))
    # same as: dW += momentum * dW_prev
    axpy!(momentum, dW_prev, dW)
end

davidbp, Feb 27 '18 14:02

I think momentum for the other parameters didn't improve results for my use case at the time, so I decided not to include an untested feature. However, if it improves things in your case, it makes sense to update the code.

Out of curiosity, what are you using RBMs for? I thought everybody had moved to variational autoencoders, which are much faster to train and more numerically stable. That said, I'm not sure there's a direct counterpart to the conditional RBM among VAEs.

dfdx, Feb 27 '18 22:02

Yeah, I'm pretty sure I was always setting momentum to 0.0 in my experiments... which is probably why I forgot to implement it for the autoregressive weights :) I'm not sure how much it'll help, but it's probably worth adding for consistency.

rofinn, Feb 27 '18 23:02
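
For reference, here is a minimal sketch of what applying momentum to every gradient in the tuple might look like. It is not the Boltzmann.jl implementation: the helper name `apply_momentum_all!`, the `prev_keys` argument, and the cache keys other than `:dW_prev` are hypothetical, and it uses a plain `Dict{Symbol,Any}` with `get`/`get!` instead of the package's `@get` and `@get_array` macros. Like the function quoted in the issue, it only adds the momentum term; storing the previous update back into `ctx` after each step is assumed to happen elsewhere.

```julia
using LinearAlgebra: axpy!

# Hypothetical helper (not part of Boltzmann.jl): add a momentum term to
# every gradient in `dtheta`, reading previous updates from `ctx`.
# The key names in `prev_keys` are illustrative assumptions.
function apply_momentum_all!(dtheta::Tuple, ctx::Dict{Symbol,Any};
                             prev_keys=(:dW_prev, :dA_prev, :dB_prev,
                                        :db_prev, :dc_prev))
    momentum = get(ctx, :momentum, 0.9)
    for (key, d) in zip(prev_keys, dtheta)
        # Fetch the cached previous update, or create zeros on first use.
        d_prev = get!(() -> zeros(eltype(d), size(d)), ctx, key)
        # In-place equivalent of: d += momentum * d_prev
        axpy!(momentum, d_prev, d)
    end
    return dtheta
end

# Usage with two toy gradients instead of the full (dW, dA, dB, db, dc) tuple.
ctx = Dict{Symbol,Any}(:momentum => 0.5, :dW_prev => ones(2, 2))
dW, db = fill(1.0, 2, 2), zeros(2)
apply_momentum_all!((dW, db), ctx; prev_keys=(:dW_prev, :db_prev))
```

After this call every entry of `dW` is 1.5 (1.0 + 0.5 * 1.0), while `db` stays at zero because its cached previous update is initialised to zeros. Looping over the tuple keeps the per-parameter logic identical, so momentum could also be disabled for individual parameters by passing a shorter `prev_keys`.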