
Codebook embedding does not update

zhxgj opened this issue Apr 24 '20 · 4 comments

I found that `ctx.needs_input_grad[1]` is False while training the VQ-VAE. Is this correct, and does it mean the codebook embedding does not update during training?

https://github.com/ritheshkumar95/pytorch-vqvae/blob/8d123c0d043bebc8734d37785dd13dd20e7e5e0e/functions.py#L53
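
For reference, here is a minimal, hypothetical sketch (not this repo's code) of how `ctx.needs_input_grad` behaves when a custom `Function` receives a detached codebook weight, which is what I believe the model passes to `vq_st`:

```python
import torch
from torch.autograd import Function

class ToyStraightThrough(Function):
    @staticmethod
    def forward(ctx, inputs, codebook):
        # Pretend quantization: pass the inputs through unchanged.
        return inputs.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # needs_input_grad[i] is True only if forward's i-th input
        # is part of a graph that requires gradients.
        print("needs_input_grad:", ctx.needs_input_grad)
        grad_inputs = grad_output if ctx.needs_input_grad[0] else None
        return grad_inputs, None

x = torch.randn(4, 8, requires_grad=True)
codebook = torch.nn.Embedding(16, 8)

# A detached weight does not require grad, so needs_input_grad[1] is False.
out = ToyStraightThrough.apply(x, codebook.weight.detach())
out.sum().backward()  # prints: needs_input_grad: (True, False)
```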

zhxgj avatar Apr 24 '20 22:04 zhxgj

I agree with the comment above. It is strange.

zhangbo2008 avatar Sep 10 '21 23:09 zhangbo2008

> I found that `ctx.needs_input_grad[1]` is False while training the VQ-VAE. Is this correct, and does it mean the codebook embedding does not update during training?
>
> https://github.com/ritheshkumar95/pytorch-vqvae/blob/8d123c0d043bebc8734d37785dd13dd20e7e5e0e/functions.py#L53

That part of the code is never executed! But I printed `model.codebook.embedding.weight.data` and found that the codebook is still updated!
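
A quick way to reproduce that check (a sketch only; `model`, `batch`, `loss_fn`, and `optimizer` are placeholders for your own training loop, not this repo's exact code):

```python
import torch

# Snapshot the codebook weight, take one optimizer step, and compare.
before = model.codebook.embedding.weight.detach().clone()

loss = loss_fn(model(batch), batch)  # placeholder loss computation
optimizer.zero_grad()
loss.backward()
optimizer.step()

after = model.codebook.embedding.weight.detach().clone()
print("codebook changed:", not torch.equal(before, after))
```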

chenaoxuan avatar Dec 26 '22 09:12 chenaoxuan

Actually, `ctx.needs_input_grad[0]` and `ctx.needs_input_grad[1]` are set to true and false alternately: on the 1st step, `ctx.needs_input_grad[0]` is true and `ctx.needs_input_grad[1]` is false; on the 2nd step, `ctx.needs_input_grad[0]` is false and `ctx.needs_input_grad[1]` is true; on the 3rd step, `ctx.needs_input_grad[0]` is true and `ctx.needs_input_grad[1]` is false again; and so on.

This setting is reasonable because there are two "agents", namely the codebook and the autoencoder, each updating w.r.t. a different part of the loss function.
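
For concreteness, here is a sketch of the standard VQ-VAE objective that this split refers to (variable names `x`, `x_tilde`, `z_e_x`, `z_q_x`, and `beta` are placeholders; I believe this repo's training script uses essentially these terms):

```python
import torch.nn.functional as F

# Reconstruction term: gradients reach the encoder and decoder.
loss_recons = F.mse_loss(x_tilde, x)
# Codebook term: z_e_x is detached, so only the codebook is updated.
loss_vq = F.mse_loss(z_q_x, z_e_x.detach())
# Commitment term: z_q_x is detached, so only the encoder is updated.
loss_commit = F.mse_loss(z_e_x, z_q_x.detach())

loss = loss_recons + loss_vq + beta * loss_commit
```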

Roller44 avatar Jul 12 '23 07:07 Roller44

> Actually, `ctx.needs_input_grad[0]` and `ctx.needs_input_grad[1]` are set to true and false alternately: on the 1st step, `ctx.needs_input_grad[0]` is true and `ctx.needs_input_grad[1]` is false; on the 2nd step, `ctx.needs_input_grad[0]` is false and `ctx.needs_input_grad[1]` is true; on the 3rd step, `ctx.needs_input_grad[0]` is true and `ctx.needs_input_grad[1]` is false again; and so on.
>
> This setting is reasonable because there are two "agents", namely the codebook and the autoencoder, each updating w.r.t. a different part of the loss function.

I debugged the code and found that `ctx.needs_input_grad[1]` is always false, rather than being set to true and false alternately. A basic fact: if a variable $A$ does not require a gradient, that does not mean it will never be updated during optimization. The `requires_grad` attribute only describes whether a gradient should be computed through that use of $A$. In other words, it controls whether gradients are propagated through that use of $A$ to other variables, not whether $A$ itself can be updated by some other path!

Therefore, even though `ctx.needs_input_grad[1]` is always False, the codebook can still be updated, because it receives its gradient through the separate codebook-loss path.
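
A minimal sketch of this point (hypothetical stand-ins, not the repo's code): the straight-through path only ever sees a detached copy of the weight, while the codebook loss uses an ordinary embedding lookup, and that second path is what updates the weight:

```python
import torch
import torch.nn.functional as F

emb = torch.nn.Embedding(16, 8)          # stand-in for the codebook
opt = torch.optim.SGD(emb.parameters(), lr=0.1)
z_e = torch.randn(4, 8)                  # stand-in for encoder outputs
idx = torch.randint(0, 16, (4,))         # stand-in for nearest-code indices

# Path 1: the straight-through estimator sees only a detached copy,
# so no gradient for emb.weight is requested through it.
detached_codes = emb.weight.detach()[idx]

# Path 2: an ordinary lookup in the codebook loss; this is where
# emb.weight actually receives its gradient.
loss_vq = F.mse_loss(emb(idx), z_e.detach())

before = emb.weight.detach().clone()
loss_vq.backward()
opt.step()
print("codebook updated:", not torch.equal(before, emb.weight))  # True
```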

RipeMangoBox avatar May 10 '24 11:05 RipeMangoBox