NanoSim
No module named `sklearn.neighbors.kde`
I did a fresh install of conda (via miniconda3) and installed nanosim.
When running, I get this error:
Traceback (most recent call last):
File "/home/philae/.local/share/miniconda3/bin/simulator.py", line 2400, in <module>
main()
File "/home/philae/.local/share/miniconda3/bin/simulator.py", line 2161, in main
read_profile(ref_g, number, model_prefix, perfect, args.mode, strandness, dna_type=dna_type, chimeric=chimeric)
File "/home/philae/.local/share/miniconda3/bin/simulator.py", line 523, in read_profile
kde_ht = joblib.load(model_prefix + "_ht_length.pkl")
File "/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 587, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "/home/philae/.local/share/miniconda3/lib/python3.9/pickle.py", line 1212, in load
dispatch[key[0]](self)
File "/home/philae/.local/share/miniconda3/lib/python3.9/pickle.py", line 1528, in load_global
klass = self.find_class(module, name)
File "/home/philae/.local/share/miniconda3/lib/python3.9/pickle.py", line 1579, in find_class
__import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.neighbors.kde'
It looks similar to https://github.com/bcgsc/NanoSim/issues/61, so I may be able to solve it, but the package should work in a fresh install anyway.
I got it working after doing
conda install pip
conda install cython
pip install scikit-learn==0.22.1
but, as in the linked issue, I now get these deprecation and incompatibility warnings:
/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.neighbors.kde module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
warnings.warn(message, FutureWarning)
/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.neighbors.kd_tree module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
warnings.warn(message, FutureWarning)
/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.neighbors.dist_metrics module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
warnings.warn(message, FutureWarning)
/home/philae/.local/share/miniconda3/lib/python3.9/site-packages/sklearn/base.py:313: UserWarning: Trying to unpickle estimator KernelDensity from version 0.21.3 when using version 0.22.1. This might lead to breaking code or invalid results. Use at your own risk.
warnings.warn(
Hey Ragnar,
This is a known issue caused by a scikit-learn version incompatibility. Please refer to #131 for some tips from @kmnip and myself. Installing from bioconda is less likely to cause installation issues. We think that requirements.txt is overly restrictive and should be updated to avoid these issues. We will take care of that shortly. Thanks for your interest in using NanoSim. Cheers.
For completeness: As far as I'm aware (this is my first time using conda), I added the bioconda and conda-forge channels and then installed it, so that would mean this problem does also happen when installing from bioconda.
Hi, I'm having the same issue, and I noticed that even if I create an env with python3.7, the error keeps pointing to a directory named python3.9 inside the conda directory, the same as in your case: https://github.com/bcgsc/NanoSim/issues/165#issuecomment-1105058924 .
In my case:
/opt/conda/envs/mms/lib/python3.9
Of course it doesn't exist, but /opt/conda/envs/mms/lib/python3.7 does.
I was wondering whether a hard-coded path may be making NanoSim look for sklearn in a different Python version than the one in which it is actually installed.
When I launch python (3.7), sklearn.neighbors.kde (scikit-learn==0.23) is available:
$ python
Python 3.7.0 | packaged by conda-forge | (default, Nov 12 2018, 20:15:55)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn.neighbors.kde
/opt/conda/envs/mms/lib/python3.7/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
warnings.warn(message, FutureWarning)
>>> quit()
But when I run it I get:
Traceback (most recent call last):
File "/opt/MMs/libs/../../NanoSim/src/simulator.py", line 2400, in <module>
main()
File "/opt/MMs/libs/../../NanoSim/src/simulator.py", line 2359, in main
read_profile(genome_list, [], model_prefix, perfect, args.mode, strandness, dna_type=dna_type_list, abun=abun,
File "/opt/MMs/libs/../../NanoSim/src/simulator.py", line 523, in read_profile
kde_ht = joblib.load(model_prefix + "_ht_length.pkl")
File "/opt/conda/envs/mms/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 587, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/opt/conda/envs/mms/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "/opt/conda/envs/mms/lib/python3.9/pickle.py", line 1212, in load
dispatch[key[0]](self)
File "/opt/conda/envs/mms/lib/python3.9/pickle.py", line 1528, in load_global
klass = self.find_class(module, name)
File "/opt/conda/envs/mms/lib/python3.9/pickle.py", line 1579, in find_class
__import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.neighbors.kde'
I don't know, maybe this tells you something.
PS. I'm installing and running everything in fresh singularity containers.
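One way to narrow this down is to print which interpreter actually executes a script and where it would load a module from. The following is only a minimal sketch using the standard library; `describe_environment` is a hypothetical helper name, not part of NanoSim. Running it with the same command used to launch simulator.py should show whether python3.7 or python3.9 is being picked up.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and where a module would be loaded from."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (e.g. sklearn itself) cannot be imported.
        spec = None
    return {
        "executable": sys.executable,
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "module": module_name,
        # None means this interpreter cannot find the module at all.
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("sklearn.neighbors.kde").items():
        print(f"{key}: {value}")
```

If `executable` points at a different Python than the one in the activated environment, the script is probably being started through another interpreter (for example via its shebang line or PATH), which is an assumption worth checking before reinstalling anything.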
The pretrained models in NanoSim were made using an older version of scikit-learn (e.g. <=0.22.1).
If you have to use these models (instead of creating your own models), then you must use scikit-learn=0.22.1 and not a newer version. If you have a newer version of scikit-learn installed, then you will get the error No module named 'sklearn.neighbors.kde'.
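A quick check along these lines can tell you in advance whether the installed scikit-learn falls in the range described above. This is a rough sketch, not NanoSim code: the names `MAX_PRETRAINED_SKLEARN`, `version_tuple`, and `supports_pretrained_models` are made up for illustration, and the 0.22.1 cutoff is simply the one quoted in this thread.

```python
import re

# Assumption taken from this thread: the bundled pretrained models
# need scikit-learn no newer than 0.22.1.
MAX_PRETRAINED_SKLEARN = (0, 22, 1)


def version_tuple(version):
    """Turn '0.22.1' into (0, 22, 1); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '0.23'
    return tuple(parts)


def supports_pretrained_models(sklearn_version):
    """True if the version is within the range the pretrained models expect."""
    return version_tuple(sklearn_version) <= MAX_PRETRAINED_SKLEARN


if __name__ == "__main__":
    try:
        import sklearn
    except ImportError:
        print("scikit-learn is not installed in this environment")
    else:
        if supports_pretrained_models(sklearn.__version__):
            print(f"scikit-learn {sklearn.__version__}: pretrained models should load")
        else:
            print(f"scikit-learn {sklearn.__version__}: pretrained models will likely fail to load")
```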
If you would like to create your own models (instead of using the pretrained models), then NanoSim should work just fine with scikit-learn=1.0.2, in my own experience.
On top of this incompatibility issue, some users also have difficulty installing all the dependencies with conda.
I strongly recommend that you create a dedicated environment for running NanoSim. If conda install gets stuck indefinitely, use mamba instead of conda to install your conda packages: https://github.com/mamba-org/mamba .
So, integrating all these together:
conda create -n nanosim_pretrained
conda activate nanosim_pretrained
mamba install scikit-learn=0.22.1 six samtools pysam pybedtools minimap2 joblib htseq genometools-genometools
Note that here I only specified the version for scikit-learn but not for the other packages. mamba should be able to pick appropriate versions for the specified packages, python, numpy, etc.
Hope this helps whoever stumbles upon this issue in the future!
pip install scikit-learn==0.22.1
solved my problem.
Hi!
I also ran into this issue recently. Since the pretrained models require scikit-learn <= 0.22.1, wouldn't it be adequate to pin this version in the bioconda recipe?
Best, Hadrien
Hi @HadrienG ,
We will make a new release that includes updated pretrained models. For the existing models, this specific environment works for me:
requirements.txt
genometools-genometools
htseq=0.11.3
joblib=1.1.0
last
minimap2=2.17
numpy=1.21.5
pybedtools=0.8.1
pysam=0.15.3
samtools
scikit-learn=0.22.1
scipy=1.7.3
six=1.16.0
conda create -n nanosim
conda activate nanosim
mamba install --file requirements.txt -c conda-forge -c bioconda
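After installing, the pins in a file like this can be compared against what the environment actually contains. The sketch below is an illustration under a few assumptions: it needs Python 3.8+ for `importlib.metadata`, it only sees Python distributions (tools such as samtools or minimap2 will show up as not found), and the helper names `parse_conda_pins`, `installed_version`, and `check_pins` are hypothetical.

```python
import os
from importlib import metadata  # standard library in Python 3.8+


def parse_conda_pins(text):
    """Map package name -> pinned version (or None) from conda-style lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, version = line.partition("=")
        pins[name.strip()] = version.lstrip("=").strip() or None
    return pins


def installed_version(name):
    """Return the installed version of a Python distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Yield (name, wanted, found) for every pinned package that differs."""
    for name, wanted in pins.items():
        if wanted is None:
            continue  # unpinned entries accept any version
        found = lookup(name)
        if found != wanted:
            yield name, wanted, found


if __name__ == "__main__":
    if os.path.exists("requirements.txt"):
        with open("requirements.txt") as handle:
            pins = parse_conda_pins(handle.read())
        for name, wanted, found in check_pins(pins):
            print(f"{name}: pinned {wanted}, found {found}")
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.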
For those installing this via bioconda I've now patched the repodata to force scikit-learn >=0.20.0,<=0.22.1. That should hopefully resolve the issue there.