                        Numerical Instability of GMM
There are a few problems with the current GMM implementation.
- [ ] Using `det` and `inverse` is numerically unstable; should use `cholesky` instead.
- [x] The regularization constant is added horribly incorrectly at the moment. It should be added only to the diagonal, and after the nested loop - not during it.
- [ ] Should compute probabilities in log-space and work there wherever possible.
Scikit-learn's implementation is a good reference; a rough sketch of the Cholesky/log-space idea is below.
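Roughly what the checklist amounts to, as a minimal, dependency-free sketch (not rusty-machine's actual code - the function names are just illustrative): the regularization constant is added to the diagonal only, the covariance is factored once with Cholesky, and the Gaussian density is evaluated entirely in log-space, so no explicit `det` or `inverse` is needed.

```rust
/// Cholesky factor L (lower triangular, row-major) of a symmetric
/// positive-definite n x n matrix `a`. Returns None if `a` is not
/// positive definite.
fn cholesky(a: &[f64], n: usize) -> Option<Vec<f64>> {
    let mut l = vec![0.0; n * n];
    for i in 0..n {
        for j in 0..=i {
            let mut sum = a[i * n + j];
            for k in 0..j {
                sum -= l[i * n + k] * l[j * n + k];
            }
            if i == j {
                if sum <= 0.0 {
                    return None; // not positive definite
                }
                l[i * n + j] = sum.sqrt();
            } else {
                l[i * n + j] = sum / l[j * n + j];
            }
        }
    }
    Some(l)
}

/// log N(x | mean, cov + reg * I), computed from the Cholesky factor
/// instead of det/inverse. `reg` is added to the diagonal only, once,
/// before factoring.
fn gaussian_log_pdf(x: &[f64], mean: &[f64], cov: &[f64], reg: f64) -> Option<f64> {
    let n = x.len();
    let mut c = cov.to_vec();
    for i in 0..n {
        c[i * n + i] += reg; // diagonal regularization
    }
    let l = cholesky(&c, n)?;

    // Solve L z = (x - mean) by forward substitution; then
    // (x - mean)^T C^{-1} (x - mean) = z^T z.
    let mut z = vec![0.0; n];
    for i in 0..n {
        let mut s = x[i] - mean[i];
        for k in 0..i {
            s -= l[i * n + k] * z[k];
        }
        z[i] = s / l[i * n + i];
    }
    let maha: f64 = z.iter().map(|v| v * v).sum();

    // log det C = 2 * sum(log L_ii), which avoids computing the
    // determinant itself (prone to overflow/underflow).
    let log_det: f64 = (0..n).map(|i| l[i * n + i].ln()).sum::<f64>() * 2.0;
    let ln_2pi = (2.0 * std::f64::consts::PI).ln();

    Some(-0.5 * (n as f64 * ln_2pi + log_det + maha))
}
```

The two identities doing the work here are `(x - mu)^T C^{-1} (x - mu) = z^T z` with `L z = x - mu`, and `log det C = 2 * sum(log L_ii)`.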
Can you break this down a little bit more, or cite reference literature or a reference implementation?
I'm very invested in the stability of GMM, as it's the standard way to cluster spectral frames in audio segmentation problems. Currently I've found that if I initialize too many clusters it totally goes off the rails.
Added an implementation for reference. I haven't got any literature references yet; I'll try to find some.
I've started working on this already but it is a sizable task and so progress is a little slow.
Okay, do you want to push stuff to a separate branch on this project? I can also pull down changes and work on it then. I spent an hour or so digging into sklearn last night.
I haven't had much time to work on it today and I'm away from my PC, so I can't push the changes right now. The only part I had really started was switching to log-space for the probabilities - this part here.
I hadn't pushed anything to a branch yet as it was broken :D . Feel free to start pushing forward with it and if anything I would be happy to put a PR into your branch later (if I can indeed contribute anything).
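For the record, here's a rough sketch of the kind of log-space normalization being discussed (illustrative names, not the code on any branch): the per-sample responsibilities are normalized with log-sum-exp, so small component densities don't underflow to zero before the M-step.

```rust
/// Numerically stable log(sum(exp(v))).
fn log_sum_exp(v: &[f64]) -> f64 {
    let max = v.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    if max.is_infinite() {
        return max;
    }
    max + v.iter().map(|x| (x - max).exp()).sum::<f64>().ln()
}

/// For one sample: log r_k = log w_k + log N(x | mu_k, C_k) - log_sum_exp(...).
/// `log_weighted` holds log w_k + log N(x | mu_k, C_k) for each component k.
fn log_responsibilities(log_weighted: &[f64]) -> Vec<f64> {
    let norm = log_sum_exp(log_weighted);
    log_weighted.iter().map(|lp| lp - norm).collect()
}
```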
Sounds great. I'll work on the cholesky implementation first. It seems that's where most of the numerical instability comes from.
Ok, sure! I would agree that's the worst part; I just thought doing log-space probabilities would be easier...
Cool. Why don't I take a crack at it first? I'm finding it pretty complicated to try to imagine things in non-log-space while looking at the sklearn code, so I'm just doing it all at once and sticking to log-space while using cholesky.
I see quite a few rusty improvements we could make to the sklearn implementation, but I'm saving those for later. At the moment it's just a bad translation.
That sounds perfect, thank you for your help with this!
I've taken a first crack at it with #155. I can't seem to figure out why the means aren't separating though. They do drift in the direction of the points, but they all drift basically the same way. I think this has something to do with the log responsibilities of the components, but I can't find a bug anywhere.
Lots of println!s, too.
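For reference, a minimal sketch of the M-step mean update from normalized log responsibilities (illustrative only, not the code in #155). One possible cause of the symptom above: if the responsibilities aren't normalized per sample (or are normalized over the wrong axis), every component ends up with nearly identical weights and all the means drift the same way.

```rust
/// Recompute component means as responsibility-weighted averages,
/// exponentiating the normalized log responsibilities at the last moment.
///
/// `log_resp[i][k]` = log responsibility of component k for sample i,
/// already normalized per sample; `data[i]` is the i-th sample.
fn update_means(log_resp: &[Vec<f64>], data: &[Vec<f64>], n_components: usize) -> Vec<Vec<f64>> {
    let dim = data[0].len();
    let mut means = vec![vec![0.0; dim]; n_components];
    let mut weights = vec![0.0; n_components];

    for (i, x) in data.iter().enumerate() {
        for k in 0..n_components {
            let r = log_resp[i][k].exp();
            weights[k] += r;
            for d in 0..dim {
                means[k][d] += r * x[d];
            }
        }
    }
    for k in 0..n_components {
        for d in 0..dim {
            // Guard against empty clusters with a tiny positive weight.
            means[k][d] /= weights[k].max(f64::MIN_POSITIVE);
        }
    }
    means
}
```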