umap
How to look at the loss value?
Is there a possibility to print the loss value during the optimization process? Or at least after the last epoch?
It would make sense to add this feature, as it would help to set an appropriate number of epochs when one tunes the algorithm for specific data.
This probably won't be easy because of negative sampling: it is an endless source of gradient, proportional to the current learning rate amplitude.
@vseledkin is correct: negative sampling is used precisely to avoid ever computing the full loss value, which is remarkably expensive. I did have a function that could be used to compute the loss, but in practice it only scaled to datasets of a few thousand points, so I never included it.
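For readers who want to see why the full loss is expensive, here is a minimal sketch of the fuzzy-set cross entropy evaluated over every pair of points. It is not the function mentioned above, which was never published. The name `umap_cross_entropy`, the dense `graph` argument, and the `eps` clipping are assumptions made for illustration, and the value it returns is not necessarily identical to what the sampled SGD updates effectively minimize. Memory and time grow as O(n²), which matches the "few thousand points" limit.

```python
import numpy as np


def umap_cross_entropy(graph, embedding, a, b, eps=1e-12):
    """Fuzzy-set cross entropy summed over all unordered pairs of points.

    graph     : dense (n, n) symmetric array of membership strengths in [0, 1]
    embedding : (n, d) array of low-dimensional coordinates
    a, b      : parameters of the low-dimensional curve 1 / (1 + a * dist**(2b))
    eps       : clipping constant (an assumption) that keeps the logs finite
    """
    v = np.asarray(graph, dtype=float)
    y = np.asarray(embedding, dtype=float)

    # All pairwise squared distances: this is the O(n^2) part.
    diff = y[:, None, :] - y[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Low-dimensional membership strength for every pair.
    w = 1.0 / (1.0 + a * d2 ** b)

    # Count each unordered pair once and skip the diagonal.
    iu = np.triu_indices(v.shape[0], k=1)
    v, w = v[iu], w[iu]

    attract = v * np.log(np.clip(v, eps, 1.0) / np.clip(w, eps, 1.0))
    repel = (1.0 - v) * np.log(
        np.clip(1.0 - v, eps, 1.0) / np.clip(1.0 - w, eps, 1.0)
    )
    return float(np.sum(attract + repel))
```

With a fitted `umap.UMAP` model, the inputs would presumably be `model.graph_.toarray()` and `model.embedding_`. Where the fitted `a` and `b` values live may depend on the library version, so check that before relying on it.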
@lmcinnes but how can we estimate that the method has converged?
Convergence is not checked; instead, the optimization is run for a specified number of epochs.
But how can I make sure that 300 epochs are not better than 200 epochs? If we had the loss value, or at least the gradient norms, then we could infer how many epochs are enough.
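Without a loss value, one rough heuristic is to compare the embeddings produced by two epoch counts directly and see how much the extra epochs changed the layout. The sketch below computes a Procrustes disparity, which is 0 when the two embeddings match up to translation, uniform scaling, and rotation or reflection. The helper name `embedding_disparity` is hypothetical, and a small disparity only suggests that more epochs change little. It is not a convergence check.

```python
import numpy as np


def embedding_disparity(emb_a, emb_b):
    """Procrustes disparity between two embeddings of the same points.

    Returns a value in [0, 1]; 0 means the embeddings are identical up to
    translation, uniform scaling, and rotation/reflection.
    """
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)

    # Remove translation and scale.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)

    # The optimal rotation and scaling follow from the singular values
    # of the cross-covariance matrix.
    s = np.linalg.svd(a.T @ b, compute_uv=False)
    return float(1.0 - s.sum() ** 2)


# Hypothetical usage with umap-learn (X is your data; not run here):
# import umap
# emb_200 = umap.UMAP(n_epochs=200, random_state=42).fit_transform(X)
# emb_300 = umap.UMAP(n_epochs=300, random_state=42).fit_transform(X)
# print(embedding_disparity(emb_200, emb_300))
```

Fixing `random_state` keeps the two runs comparable. Otherwise differences caused by random sampling are mixed in with differences caused by the epoch count.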
Contributions welcome :)
> @vseledkin is correct, the negative sampling is a method used to avoid ever actually computing the full loss value which is remarkably expensive. I did have a function that could be used to compute loss, but in practice it only scaled to datasets of a few thousand points, so I never included it.
I don't suppose you would be willing to share it? I'm looking at applications of UMAP on reasonably small data, where having the loss function explicitly is very useful.
+1
The Parametric UMAP submodule has a non-parametric module that saves loss. See the bottom figure in this notebook.
https://github.com/lmcinnes/umap/blob/master/notebooks/Parametric_UMAP/06.0-nonparametric-umap.ipynb