GaussianMixtures.jl
                                
                                 GaussianMixtures.jl copied to clipboard
                                
                                    GaussianMixtures.jl copied to clipboard
                            
                            
                            
                        PosDefException: matrix is not positive definite.
In src/train.jl using the kinit:kmeans I get a
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed. as the covariance matrix using cov(xx[sel, :]) at (row 130ish) seems not to fulfill the criteria for my data. It works for gmm.n<20 and also if kinit=:split for large nr mixtures.
I solve it using a simple isposdef check followed by adding a small regularization term in main.jl.
C = cov(xx[sel, :])
if isposdef(C)
    return cholinv(C)
else
    return cholinv(C+I*eps(eltype(C)))
end
I tried to also to add the check in scr/gmms.jl in the function
function cholinv(Σ::Matrix{T}) where {T}
    if isposdef(Σ)
        return cholesky(inv(cholesky(0.5(Σ + Σ')))).U
    else
        return cholesky(inv(cholesky(0.5(Σ + Σ') + I*eps(T)))).U
    end
end
and it works, but it resulted in roughly a 10% speed penalty (tested for 3x3 and 30x30), so perhaps the check might be best in train.jl?
This looks like a numerical instability, indeed, I'd say something which this the output of cov() should be positive definite.
So, in your case you could look at the y = xx[sel, :], and report this to the LinearAlgebra people that for this matrix, isposdef(cov(y)) == false.
The regulatization would make the algorithm more robust, which is a good thing. But I wonder if the data results in such numerical instabilities if the modelling of the data actually gives results that make sense.