ProDy
Add support for weighted PCA and ICA/tICA?
I think it is worth considering adding support for weighted PCA and ICA/tICA in ProDy.
The former should be fairly easy, since it already exists to an extent in the current PCA class: https://github.com/prody/ProDy/blob/master/prody/dynamics/pca.py#L180
but only when the input data is an Ensemble with weights. A similar treatment should be added to https://github.com/prody/ProDy/blob/master/prody/dynamics/pca.py#L166
so that one can pass a weight vector (one weight per sample) or a weight matrix (one weight per sample and atom) as a parameter to PCA.buildCovariance.
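To make the proposal concrete, here is a minimal NumPy sketch of a weighted covariance that accepts either kind of weights. It is not ProDy code: the function name weighted_covariance, the flattened (n_samples, n_coords) layout, and the normalization used for matrix weights are all assumptions chosen for illustration. Per-atom weights would have to be repeated three times (x, y, z) to match the flattened coordinates.

```python
import numpy as np

def weighted_covariance(coords, weights):
    """Hypothetical helper: weighted covariance of flattened coordinates.

    coords  : (n_samples, n_coords) array
    weights : (n_samples,) vector, or (n_samples, n_coords) matrix
              (assumed strictly positive)
    """
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)

    if weights.ndim == 1:
        # One weight per sample: standard weighted mean and covariance.
        mean = (weights[:, None] * coords).sum(axis=0) / weights.sum()
        dev = coords - mean
        return (dev * weights[:, None]).T @ dev / weights.sum()

    # One weight per sample and coordinate. Assumed convention: each
    # deviation is scaled by its weight, and each covariance element is
    # normalized by the summed product of the two weight columns involved.
    mean = (weights * coords).sum(axis=0) / weights.sum(axis=0)
    weighted_dev = (coords - mean) * weights
    return weighted_dev.T @ weighted_dev / (weights.T @ weights)

rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 6))              # 50 samples, 2 atoms x 3 coords
sample_weights = rng.uniform(0.1, 1.0, size=50)
cov = weighted_covariance(coords, sample_weights)
```

With a weight vector this agrees with np.cov(coords.T, aweights=sample_weights, ddof=0), and with a matrix of ones it reduces to the unweighted covariance.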
ICA is trickier to implement, but the covariance matrix is the same; only the decomposition step differs. A good reference to follow is probably scikit-learn's FastICA: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.FastICA.html
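For reference, a rough sketch of what calling scikit-learn's FastICA on coordinate data might look like. The synthetic data, the flattened (n_conformations, n_coords) layout, and the variable names are assumptions for illustration only; nothing here is existing ProDy API.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_conformations, n_coords, n_components = 500, 6, 2

# Synthetic stand-in for an ensemble: two non-Gaussian sources mixed
# linearly into six coordinates, plus a little noise.
sources = rng.laplace(size=(n_conformations, n_components))
mixing = rng.normal(size=(n_components, n_coords))
coords = sources @ mixing + 0.01 * rng.normal(size=(n_conformations, n_coords))

ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
projections = ica.fit_transform(coords)  # (n_conformations, n_components)
modes = ica.mixing_                      # (n_coords, n_components)
```

The columns of mixing_ play a role analogous to PCA modes, although they are not orthogonal and carry no eigenvalue ordering, so a ProDy wrapper would need to decide how to rank them.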
Sounds good to me.
We may also want to consider offering scikit-learn's PCA as an option, since it seems to be faster.
Great! I can take the WPCA for a spin if you'd like.
I wonder if their speed-up comes from the fact that they are using SVD instead of the regular eigensolver, which is provided as an option in ProDy already: https://github.com/prody/ProDy/blob/master/prody/dynamics/pca.py#L230 (Although I think the API point performSVD should be integrated into calcModes and turned on by a switch).
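As a quick illustration of why the two routes are interchangeable, the sketch below computes PCA modes both ways with plain NumPy. The function names and the n - 1 normalization are assumptions for illustration and do not mirror ProDy's calcModes or performSVD internals.

```python
import numpy as np

def modes_from_eigh(deviations):
    """Eigendecomposition of the covariance matrix, largest variance first."""
    n_samples = deviations.shape[0]
    cov = deviations.T @ deviations / (n_samples - 1)
    variances, vectors = np.linalg.eigh(cov)   # ascending order
    return variances[::-1], vectors[:, ::-1]

def modes_from_svd(deviations):
    """SVD of the deviations; the covariance matrix is never formed."""
    n_samples = deviations.shape[0]
    _, singular_values, vt = np.linalg.svd(deviations, full_matrices=False)
    return singular_values**2 / (n_samples - 1), vt.T

rng = np.random.default_rng(0)
coords = rng.normal(size=(40, 9))       # 40 conformations, 3 atoms x 3 coords
deviations = coords - coords.mean(axis=0)

eig_variances, eig_vectors = modes_from_eigh(deviations)
svd_variances, svd_vectors = modes_from_svd(deviations)
```

Both give the same variances and the same modes up to sign. The SVD route avoids building the n_coords x n_coords covariance matrix, which may account for part of any speed difference when there are far more coordinates than conformations.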
Yes, go ahead!
I'm not sure. Could be. I haven't yet got round to systematically comparing it.
There's an implementation in https://github.com/scipion-em/scipion-em-continuousflex/blob/rv_pdb_dimred/continuousflex/protocols/protocol_pdb_dimred.py that I'd be comparing with.
They also have UMAP, which looks quite similar, so it may be worth adapting into ProDy too.