ViP
Fails on zero grad
In instances where a neuron does not factor into the loss (e.g., a component of the loss is disabled for a specific experiment, leaving a neuron or set of neurons unused), autograd returns None for the gradients of the unused connections. This results in a crash at the line:
param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])
With the error:
TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'
This can be remedied by inserting:
if param.grad is not None:
prior to the line in question, but I'm unsure of any upstream consequences.
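For reference, a minimal sketch of the guarded update, assuming the scaling is applied inside a loop over model.parameters() (only the None check is new relative to the quoted line):

for param in model.parameters():
    # Parameters that never contributed to the loss still have grad == None,
    # so skip them instead of crashing on the in-place multiply.
    if param.grad is not None:
        param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])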
That should have been fixed by issue #7, with the following line: https://github.com/MichiganCOG/ViP/blob/dev/train.py#L182.
Do you have this version from dev pulled?
I'm using an older version (after pulling from master, I immediately made local changes that left train.py unmergeable). My mistake for missing that issue.
I came back to this --- it appears the modification in the dev branch resolves a different problem. That is, the weights causing an issue for me are not frozen; they simply have no gradient because they do not contribute to the loss.
Consider three regression nodes --- yaw, pitch, and roll. I modify training to regress only yaw by performing backpropagation on that node directly. The weights leading into the pitch and roll nodes are left as None by autograd after loss.backward(), and thus fail at the cited line (a minimal sketch of this is below).
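To make this concrete, here is a minimal, self-contained sketch (the Head module, layer sizes, and tensor shapes are hypothetical and not taken from ViP) showing that parameters outside the loss graph keep grad == None after backward(), which is exactly what trips the in-place multiply:

import torch
import torch.nn as nn

class Head(nn.Module):
    # Hypothetical three-output regression head: yaw, pitch, roll.
    def __init__(self):
        super().__init__()
        self.yaw = nn.Linear(8, 1)
        self.pitch = nn.Linear(8, 1)
        self.roll = nn.Linear(8, 1)

    def forward(self, x):
        return self.yaw(x), self.pitch(x), self.roll(x)

model = Head()
yaw, pitch, roll = model(torch.randn(4, 8))
loss = yaw.mean()   # only the yaw output enters the loss
loss.backward()

print(model.yaw.weight.grad is None)    # False: yaw received a gradient
print(model.pitch.weight.grad is None)  # True: pitch never touched the loss
print(model.roll.weight.grad is None)   # True: same for roll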
Can you post your code? The training script and the relevant loss and model files. A GitHub link would work.