MNIST experiments creating qpth issues

Open guptakartik opened this issue 6 years ago • 5 comments

Hi,

I was running the optnet code for MNIST classification with the default configuration for only 10 epochs. In the first couple of epochs I get the warning "qpth warning: Returning an inaccurate and potentially incorrect solutino", and in subsequent iterations the loss becomes NaN. Is there something obviously wrong with my configuration?

guptakartik avatar Oct 23 '17 01:10 guptakartik

Hi, I just tried running the MNIST experiment and am hitting NaNs there too. It's been a while since I've run that example, and I've changed the qpth library since the MNIST experiment was last working. It looks like the solver is hitting some NaNs internally, causing the precision issue and bad gradients. For now you can try reverting to an older commit of qpth, one from around the time I last updated the MNIST example. I'll try to look into the internal solver issues soon.
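To narrow down when the NaNs first appear, it can help to check each iteration's loss and stop as soon as it stops being finite. The following is only a minimal sketch in plain Python: the helper name `first_non_finite` and the example loss values are hypothetical and are not part of optnet or qpth.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Hypothetical per-iteration loss values, not taken from a real run.
example_losses = [2.31, 1.87, float("nan"), float("nan")]

bad_step = first_non_finite(example_losses)
if bad_step is not None:
    print(f"loss became non-finite at iteration {bad_step}; stopping early")
```

In a real training loop the same check would be applied to the scalar loss of each iteration, so that the run stops at the first bad step instead of continuing with bad gradients.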

-Brandon.

bamos avatar Oct 23 '17 11:10 bamos

Thanks for the quick reply! I will try working with the older commit of qpth.

guptakartik avatar Oct 23 '17 16:10 guptakartik

Hi, I tried most of the early versions of qpth, but none of them work. They fail in various ways, mostly inside qpth. Could you check which version works?

Xingyu-Lin avatar Oct 23 '17 19:10 Xingyu-Lin

Hi Brandon, it would be really helpful if you could point us to the right version of qpth, since we have been unable to get it to work.

guptakartik avatar Nov 06 '17 19:11 guptakartik

Hi, the NaNs were coming up in the backward pass in qpth, and I've pushed a fix here: https://github.com/locuslab/qpth/commit/e2cac495909159aae12461262d0ee540ddf9abd6
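For anyone who wants to pin qpth to exactly that commit, one option is pip's `git+<url>@<commit>` syntax. This is a hedged sketch, not an official install procedure: the helper names `pip_install_command` and `install_pinned` are hypothetical, and it assumes `git` and `pip` are available in the environment.

```python
import subprocess
import sys

QPTH_REPO = "https://github.com/locuslab/qpth"
# Commit hash taken from the fix linked above.
FIX_COMMIT = "e2cac495909159aae12461262d0ee540ddf9abd6"


def pip_install_command(repo_url, commit):
    """Build a pip command that installs a git repository at a specific commit."""
    return [sys.executable, "-m", "pip", "install", f"git+{repo_url}@{commit}"]


def install_pinned(repo_url, commit):
    """Run the pip command; raises CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(repo_url, commit))


if __name__ == "__main__":
    # Only print the command here; call install_pinned(...) to actually install.
    print(" ".join(pip_install_command(QPTH_REPO, FIX_COMMIT)))
```

Pinning to a commit hash rather than a branch keeps the installed version fixed even if the repository changes later.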

Here's the convergence of one of my new runs. (I modified z0 and s0 to be fixed; pull this change from the latest version of this repo.) The loss is quite jumpy, so the LR should probably be lowered:

[image: convergence plot]

Can you try running the training again with the latest versions of this repo and qpth?

-Brandon.

bamos avatar Nov 09 '17 21:11 bamos