Cholesky decomposition unsuccessful

Open lanyiyun opened this issue 6 years ago • 5 comments

Hi,

This has been asked before. I keep running into the Cholesky issue even with a large batch size. What has your experience been in resolving it? Any tips would help. Thank you in advance!

lanyiyun avatar Aug 22 '18 14:08 lanyiyun

Lowering the learning rate helps as well. This occurs because the problem is a constrained convex optimization. If the steps are too large, the iterates can fly off the constraint surface and you get singularities.

kstant0725 avatar Sep 03 '18 21:09 kstant0725

Another tip is reducing the number of clusters, if possible.

One requirement of SpectralNet is that the orthonormalization layer is of rank equal to the number of clusters you set. Each minibatch must have enough variety / structure to have a full rank orthonormalization matrix. Thus, the dual to increasing the minibatch size is decreasing the cluster number. If your clusters are relatively balanced, and the number of clusters is on the order of a dozen or so, you're probably fine as is. But if it's much larger you might have problems. We have a few ideas in mind for loosening this restriction but there are no concrete plans yet.
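To make the rank requirement concrete, here is a minimal NumPy sketch of a Cholesky-based orthonormalization step. It is an illustration, not the repository's code: the function names, the use of NumPy instead of the training framework, and the idealized one-hot "network outputs" are all assumptions. It shows that a minibatch containing every cluster factors cleanly, while a minibatch that never samples one cluster produces a singular Gram matrix and the factorization fails.

```python
import numpy as np


def orthonormalize(outputs, epsilon=0.0):
    """Rescale a batch of outputs so that its Gram matrix becomes m * I.

    outputs: array of shape (m, k), one row per sample in the minibatch.
    Raises np.linalg.LinAlgError when the (jittered) Gram matrix is not
    positive definite, i.e. when the batch does not span all k directions.
    """
    m, k = outputs.shape
    gram = outputs.T @ outputs + epsilon * np.eye(k)
    chol = np.linalg.cholesky(gram)  # gram = chol @ chol.T
    weights = np.linalg.inv(chol).T * np.sqrt(m)
    return outputs @ weights


def one_hot(labels, k):
    """Idealized (hypothetical) network output: one-hot cluster indicators."""
    return np.eye(k)[np.asarray(labels)]


k = 4
full_batch = one_hot([0, 1, 2, 3, 0, 1, 2, 3], k)    # every cluster present
skewed_batch = one_hot([0, 0, 1, 1, 2, 2, 0, 1], k)  # cluster 3 never sampled

ortho = orthonormalize(full_batch)
print("orthonormal:", np.allclose(ortho.T @ ortho / len(full_batch), np.eye(k)))

try:
    orthonormalize(skewed_batch)
except np.linalg.LinAlgError as err:
    print("Cholesky failed:", err)
```

With many clusters or unbalanced classes, a randomly drawn minibatch is more likely to resemble `skewed_batch`, which is consistent with the advice to increase the batch size or reduce the number of clusters.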

lihenryhfl avatar Sep 20 '18 22:09 lihenryhfl

Thank you for your input, that makes a lot of sense. I was trying to get 30+ clusters in a fairly large dataset, and most likely it is not balanced.

lanyiyun avatar Sep 20 '18 23:09 lanyiyun

I see. Yeah, this could be the reason why you had problems, especially if the classes are not balanced, unfortunately.

lihenryhfl avatar Sep 21 '18 01:09 lihenryhfl

I ran into this problem on some datasets, such as FRGC. It went away when I changed epsilon (core/layers.py, line 11) from 1e-7 to 1e-5 and reduced spec_lr from 1e-3 to 1e-5.
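One plausible reason the larger epsilon helps, assuming the computation runs in 32-bit floats and that epsilon is added to the diagonal of the matrix being factored: a value of 1e-7 sits right at float32 precision, so it can be rounded away entirely once the diagonal entries are larger than about 2. The sketch below is a hypothetical check in NumPy, not code from the repository, and the diagonal magnitudes are made up for illustration.

```python
import numpy as np


def jitter_survives(diagonal_value, epsilon, dtype=np.float32):
    """Return True if adding epsilon changes diagonal_value at this precision."""
    d = dtype(diagonal_value)
    return bool(d + dtype(epsilon) != d)


# Hypothetical diagonal magnitudes of the matrix being factored.
for value in (4.0, 100.0):
    print(
        f"diagonal={value}: "
        f"1e-7 survives={jitter_survives(value, 1e-7)}, "
        f"1e-5 survives={jitter_survives(value, 1e-5)}"
    )
```

If the jitter is absorbed by rounding, it cannot keep a nearly singular matrix positive definite, which would be consistent with 1e-5 working where 1e-7 did not.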

spdj2271 avatar Jan 05 '22 08:01 spdj2271