velocyto-notebooks
Typos and running Jupyter notebooks
Hello, looking at the Jupyter notebook code, I had errors with both notebooks when calling the following line:
vlm.set_clusters(vlm.ca["ClusterName"], cluter_colors_dict=colors_dict)
Changing "cluter_colors_dict" to "cluster_colors_dict" solved the issue.
However, running through the rest of the Jupyter notebooks, I am unable to proceed after calling vlm.perform_PCA().
The problem seems to be that the kernel goes dead and attempts to restart. I am not sure what I haven't set up correctly, but help would be appreciated.
This is very weird. I tried to understand the problem, but I don't have enough information. Do you think you could give me some extra info? For example, what is the shape of the attribute S_norm before that call, and are there any NaNs?
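A quick way to check both, sketched here with a random stand-in array (in the notebook you would inspect vlm.S_norm directly):

```python
import numpy as np

# Random stand-in for vlm.S_norm (genes x cells); use the real attribute instead
S_norm = np.random.random((2000, 500))

print("shape:", S_norm.shape)
print("any NaN:", np.isnan(S_norm).any())
print("any inf:", np.isinf(S_norm).any())
```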
I am running the notebook through Anaconda Navigator's web-based Jupyter Notebook application. I figured that since it was a Jupyter notebook it would play nicely and I should be able to walk through the example code. Perhaps I just need to run it more locally through the command line and/or another IDE... this would also make investigating values and attributes easier.
At any rate, below is the printed matrix and shape:
Does this help? How can I better assess the NaNs, and what other information can I provide?
Everything looks all right. Are you running on a powerful enough machine? Could this be a memory issue? Could you monitor memory while running? Can you check which versions of velocyto and loompy you are using?
Really, I don't understand where the problem might be... perform_PCA only calls PCA from scikit-learn.
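To check the versions, one option is the standard library's importlib.metadata (Python 3.8+; a generic sketch, not specific to velocyto):

```python
from importlib import metadata

# Print the installed version of each package, if it is present
for pkg in ("velocyto", "loompy", "scikit-learn"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```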
It is basically doing just the following:
self.pca = PCA()
self.pcs = self.pca.fit_transform(self.S_norm.T)
That's why the fact that it crashes is so puzzling to me.
I have 24 GB of RAM on this Mac; would you recommend more? velocyto is version 0.13.1, loompy 1.10.
And running the code I wrote above crashes the notebook as well?
I am out of suggestions; try running the notebook code in a script instead. Let's hope it throws an error instead of crashing... so I can understand where the problem is.
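One thing that may help even before moving to a script: the standard library's faulthandler module can print a Python traceback when the interpreter dies with a hard crash (a sketch; the commented lines stand for whatever call crashes for you):

```python
import faulthandler

# Make the interpreter dump a traceback on SIGSEGV and similar hard crashes
faulthandler.enable()

# ...then run the suspect code, e.g.:
# from sklearn.decomposition import PCA
# pcs = PCA().fit_transform(vlm.S_norm.T)

print(faulthandler.is_enabled())  # True
```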
I'm testing the above code right now. I am also thinking it is just something weird with the Jupyter web app method of running this.
It says PCA is not defined. I guess scikit-learn isn't imported right, then? I've got the latest release: 0.19.1.
I didn't mean you should run the code literally as is. I assumed you would have added the required from sklearn.decomposition import PCA. Sorry for not being more clear.
That's my bad, I've not programmed for a little bit. I tried that and got the same kernel crash. I am going to try the code locally tomorrow through a script rather than the online NotebookApp. Thanks for your help.
OK, so running in the IPython console actually gives useful error messages. I am going to check the other issues for any similar problems, but see the following:
OK, now try my code above again, but give X as input instead of self.S_norm.T, where X is:
X = np.random.normal(size=self.S_norm.T.shape)
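Spelled out as a self-contained snippet (the shape here is an arbitrary stand-in for self.S_norm.T.shape):

```python
import numpy as np
from sklearn.decomposition import PCA

# Random data with a made-up cells x genes shape, standing in for self.S_norm.T
X = np.random.normal(size=(200, 300))

pca = PCA()
pcs = pca.fit_transform(X)
print(pcs.shape)  # (200, 200): n_components defaults to min(n_samples, n_features)
```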
Could it be there are NaNs? Is this the correct call?
> Could it be there are NaNs?

No.

> Is this the correct call?

Yes.
I think this last test proved that the problem is your Python installation. The reason is that basically not a single line of my code runs in the snippet above, and what is failing is the call to scikit-learn's PCA.
Something is broken in your installation. Please reinstall a conda environment from scratch, starting from Miniconda, and then follow the installation guide in the docs.
Alright, I'll try reinstalling. Thanks
I reinstalled and followed the installation instructions in the docs. Running the code in the dentate gyrus notebook and the adjusted call we discussed leads to a Segmentation fault: 11 error, both from IPython and from the command line as a script. So reinstalling Python doesn't seem to fix the problem.
I can try running this through our computing cluster, and therefore on a more powerful and perhaps more stable architecture. Do you have any other ideas for solving this issue?
I don't have other suggestions right now. I can only promise that, in the next few weeks, I will test the notebook and the installation again in a couple of different environments in an attempt to replicate your problem. But as long as this is an isolated problem that I cannot trace back to a putative cause, I cannot make it my number-one priority.
Running the script on the cluster returns no Segmentation fault: 11 error. This confirms that the issue is related to installation/architecture. While not truly resolved, you may consider this issue closed.
Hello,
I have been having the same Segmentation fault: 11 issue while running perform_PCA(). I have used velocyto before on my laptop (MacBook Pro, 2018, 16 GB memory) without this error occurring. However, after having to update and reinstall certain packages, in addition to the Mojave update, I have begun to get this error. I can also reproduce it using fit_transform(X) from the sklearn library, for my dataset as well as random data.
Reading around, these issues seem to be similar to this post: https://github.com/scikit-learn/scikit-learn/issues/8236
In brief, the error was suggested to come from incompatibilities between Scipy and XGBoost: https://github.com/scikit-learn/scikit-learn/issues/8236#issuecomment-395141179
The alternative they have suggested is to use the NumPy implementation of SVD. I can reproduce the error on my laptop using the SciPy implementation, scipy.linalg.svd(X), which I am guessing is also used in velocyto. I am able to avoid it using the NumPy implementation, np.linalg.svd(X). Hence, the segfaults might not be due to a memory issue but rather an incompatibility issue.
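For reference, the NumPy call in question, run here on random data and checked by reconstructing the input from the factors:

```python
import numpy as np

X = np.random.normal(size=(200, 50))

# NumPy's SVD, the suggested workaround for the crashing SciPy call
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The factors should reconstruct X up to floating-point error
X_rec = (U * s) @ Vt
print(np.allclose(X, X_rec))  # True
```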
Is there a possible way to incorporate the NumPy implementation in velocyto?