ysjakking
Results: 2 issues of ysjakking
From lines 84-85 and 97-98 of optimizer.py, we can see that b1 and b2 here correspond to '1-b1' and '1-b2', respectively, of the original Adam paper, i.e.,...
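The relationship described in this issue can be checked with a small sketch. The code in optimizer.py is not shown here, so the "flipped" update below is an assumed form in which the coefficient weights the new gradient instead of the running moment; the function names and sample gradients are hypothetical and serve only as illustration.

```python
import math


def moment_update_paper(m, g, beta):
    """Adam paper convention: beta weights the previous moment estimate."""
    return beta * m + (1.0 - beta) * g


def moment_update_flipped(m, g, b):
    """Assumed flipped convention: b weights the new gradient."""
    return b * g + (1.0 - b) * m


beta1 = 0.9          # value in the paper's convention
b1 = 1.0 - beta1     # equivalent value in the flipped convention

grads = [0.5, -1.0, 2.0]  # made-up gradients, for illustration only
m_paper = 0.0
m_flipped = 0.0
for g in grads:
    m_paper = moment_update_paper(m_paper, g, beta1)
    m_flipped = moment_update_flipped(m_flipped, g, b1)

# Both conventions should produce the same first-moment estimate.
print(math.isclose(m_paper, m_flipped, rel_tol=1e-9))
```

If the two conventions are related as the issue states, a default such as 0.9 in the paper would have to be passed as 0.1 to code written in the flipped form. The same reasoning applies to b2 and the second-moment estimate.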
I work on a GPU: Tesla K40c (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007). It takes me 10+ minutes to train one epoch....