cca_zoo
                                
                        nan in transformed matrix
Hi, I ran basic CCA on some dataset and it works well, but sometimes I get NaNs in the final columns of one of the transformed matrices, for example:
basic_cca = CCA(latent_dims=768)
basic_cca.fit((layer, second_layer))
U, V = basic_cca.transform((layer, second_layer))
Here layer and second_layer are both matrices of shape [N x 768], and sometimes the last column of U (i.e. U[:, 767]) is all NaNs. Just to mention: if I change latent_dims to 767 or lower, everything works (but I need it to be 768). Any idea how to solve this, or what I could change in order to solve it? Thanks
Another problem: sometimes, for totally identical inputs, I don't get the same transformed matrices, i.e.:
basic_cca = CCA(latent_dims=768)
basic_cca.fit((layer, layer))
U, V = basic_cca.transform((layer, layer))
and U is not equal to V. Is there a way to fix this?
My guess is that this is a situation where the number of samples equals the number of features in one or both views?
If so (this is the situation in which I was able to reproduce your problems), there are always likely to be some numerical instabilities, and I will add a warning for this case. I'll have a look at why the last eigenvalue/eigenvector in particular is the main problem. For that reason I'm a little wary of suggesting a hacky solution.
One option is to use MCCA rather than CCA, since MCCA solves a different (but, for 2 views, equivalent) eigenvalue problem. This seems to be slightly more stable for the first k-1 eigenvectors but still looks unstable for the last one.
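For reference, the equivalence of the two formulations for 2 views can be sketched in plain numpy. This is only an illustration of the underlying math, not cca_zoo's actual solver: the MCCA-style generalized eigenproblem C w = λ D w (full joint covariance C against its block-diagonal part D) has eigenvalues 1 ± ρ_i, where the ρ_i are the canonical correlations that standard CCA recovers.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.standard_normal((n, p))
Y = X + 0.5 * rng.standard_normal((n, p))
Xc, Yc = X - X.mean(0), Y - Y.mean(0)

# Joint covariance C of the stacked views, and its block-diagonal part D
Z = np.hstack([Xc, Yc])
C = Z.T @ Z / n
D = C.copy()
D[:p, p:] = 0
D[p:, :p] = 0

# MCCA-style generalized eigenproblem C w = lam D w, solved via the
# symmetric whitened form D^{-1/2} C D^{-1/2}
Dl, DV = np.linalg.eigh(D)
D_inv_sqrt = DV @ np.diag(1.0 / np.sqrt(Dl)) @ DV.T
lams = np.linalg.eigvalsh(D_inv_sqrt @ C @ D_inv_sqrt)

# For 2 views the eigenvalues are 1 +/- rho_i; recover the top correlation
rho_mcca = lams.max() - 1

# Compare with the top canonical correlation from the standard CCA route:
# singular values of Qx^T Qy, with Qx, Qy orthonormal bases of the views
Qx, _ = np.linalg.qr(Xc)
Qy, _ = np.linalg.qr(Yc)
rho_cca = np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]
```

Both routes recover the same ρ up to floating-point error, which is why the two solvers agree in exact arithmetic but can differ in their numerical behaviour near zero eigenvalues.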
No, the number of samples is 6 times larger than the number of features in both views. Most of the time it works well, but sometimes I get this error.
So the gist of how the solver works is that it gets the PCA components of the original data and runs the algorithm on the reduced data. This is mathematically equivalent to running CCA on the original data.
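That equivalence can be checked with a small numpy sketch (illustrative only; the `canonical_correlations` helper below is made up for the demo and is not a cca_zoo internal). Canonical correlations are the singular values of Qx^T Qy, where Qx and Qy are orthonormal bases for the centered views, and PCA-rotating one view leaves its column space, and hence the correlations, unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 60, 5
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal((p, p)) + 0.1 * rng.standard_normal((n, p))

def canonical_correlations(X, Y):
    # Canonical correlations = singular values of Qx^T Qy, where
    # Qx, Qy are orthonormal bases for the centered data matrices
    Qx, _ = np.linalg.qr(X - X.mean(0))
    Qy, _ = np.linalg.qr(Y - Y.mean(0))
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# PCA-reduce X (keeping all components): project onto the right
# singular vectors, which spans the same column space as X
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt.T
```

`canonical_correlations(X_pca, Y)` matches `canonical_correlations(X, Y)`, confirming that running the algorithm on the PCA-reduced data changes nothing mathematically.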
This is related to the reason why you can't get more than min(p,q) CCA components from p and q dimensional inputs.
My guess (related to the n=p case, and I suspect nonetheless true in your n=6p case) is that your data is not full rank, i.e. has fewer than 768 principal components with non-zero eigenvalues.
If this is the case then there isn't much that can be done to 'solve' your problem, but on my end it is something I can check for, i.e. if any eigenvalues are zero then limit the number of possible CCA components.
Is this the case for your inputs? (You can check by running an SVD on the input data matrices and looking for zero singular values.)
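A quick way to run that check, sketched in numpy (the `effective_rank` helper is hypothetical, not part of cca_zoo): count the singular values of the centered matrix that sit above a small tolerance. A single redundant column makes the effective rank drop below the feature count even when n = 6p:

```python
import numpy as np

def effective_rank(A, tol=None):
    """Count singular values of the centered matrix above a tolerance."""
    s = np.linalg.svd(A - A.mean(axis=0), compute_uv=False)
    if tol is None:
        # tolerance in the spirit of numpy.linalg.matrix_rank's default
        tol = max(A.shape) * np.finfo(s.dtype).eps * s[0]
    return int((s > tol).sum())

rng = np.random.default_rng(0)
X_full = rng.standard_normal((600, 100))  # n = 6p, generically full rank
X_deficient = X_full.copy()
X_deficient[:, -1] = X_deficient[:, 0] + X_deficient[:, 1]  # redundant column
```

If `effective_rank(layer)` or `effective_rank(second_layer)` comes back below 768, that rank deficiency is the likely cause of the NaN column.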