dtype error
Hi!
I tried to run the peakachu score_genome command as follows:
`peakachu score_genome -r 5000 --balance -p /workdir/nf/inter_30.hic -O nf-peakachu-5kb-scores.bedpe -m /model/high-confidence.600million.5kb.w6.pkl`
However, I got an error that seems to be related to a dtype:
```
/share/home/lhl_zhulin/miniconda3/envs/juicer/lib/python3.8/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator DecisionTreeClassifier from version 1.1.2 when using version 1.3.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
Traceback (most recent call last):
File "/share/home/lhl_zhulin/miniconda3/envs/juicer/bin/peakachu", line 91, in
- expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
- got : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]
```
It seems to be an incompatible dtype error. My .hic file was produced by juicer1.6, so I did not expect it to have an unusual format.
How did this error happen, and what can I do to change the format so that it works with the software?
Thank you very much!
Thank you for reporting this.
Based on the error message and traceback, my first guess for the source of this issue is that it is caused by differences in sklearn versions. My second guess is that something is causing the Tree `__setstate__` input to become malformed. My third guess is common Python environment issues such as using the appropriate binary for your machine -- M1 MacBook users know this pain.
In any case, the dtype that Tree's `__setstate__` expected is printed as a dict with four keys: names, formats, offsets, and itemsize. What it got definitely looks like it came from a tree, but it is printed as a list of (name, format) tuples rather than a dict, and it has no missing_go_to_left field, which the expected dtype includes.
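To make the mismatch easier to see, here is a minimal sketch that compares the two field layouts copied from the error message. It uses plain Python only, and the helper name `diff_fields` is made up for illustration; it is not part of sklearn or peakachu.

```python
# Field layouts copied verbatim from the "expected" and "got" lines of the error.
expected = {
    "names": ["left_child", "right_child", "feature", "threshold", "impurity",
              "n_node_samples", "weighted_n_node_samples", "missing_go_to_left"],
    "formats": ["<i8", "<i8", "<i8", "<f8", "<f8", "<i8", "<f8", "u1"],
    "offsets": [0, 8, 16, 24, 32, 40, 48, 56],
    "itemsize": 64,
}
got = [
    ("left_child", "<i8"), ("right_child", "<i8"), ("feature", "<i8"),
    ("threshold", "<f8"), ("impurity", "<f8"), ("n_node_samples", "<i8"),
    ("weighted_n_node_samples", "<f8"),
]


def diff_fields(expected, got):
    """Hypothetical helper: compare field names and formats of the two layouts.

    Returns three lists: fields only in `expected`, fields only in `got`,
    and fields present in both whose format string differs.
    """
    exp = dict(zip(expected["names"], expected["formats"]))
    have = dict(got)
    missing = [name for name in exp if name not in have]
    extra = [name for name in have if name not in exp]
    changed = [name for name in exp if name in have and exp[name] != have[name]]
    return missing, extra, changed


missing, extra, changed = diff_fields(expected, got)
print("fields missing from the pickled model:", missing)
print("fields only in the pickled model:", extra)
print("fields whose format changed:", changed)
```

If the only difference is a field that the installed sklearn expects and the pickled model lacks, that would be consistent with the version-mismatch guess rather than with a problem in the .hic file.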
I hope @XiaoTaoWang can think of a simple solution. I will take a look through recent changelogs related to pickle/sklearn if I get the time for it. But no promises.
Best next steps:
- Run `conda list` from within your activated environment, and post the result. Perhaps we can solve this by pinning something.
- Describe your install process. Using conda? pip? Something else?
- Tell us the type of machine you are running on.
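If it helps with collecting that information, the following is a rough sketch of a small report script that uses only the Python standard library. The package list is my own guess at what is relevant here; edit it as needed.

```python
import platform
import sys
from importlib import metadata

# Assumed list of packages worth reporting for this issue; adjust freely.
PACKAGES = ["scikit-learn", "numpy", "scipy", "joblib", "peakachu"]


def environment_report(packages=PACKAGES):
    """Collect interpreter, machine, and installed package versions in a dict."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,   # shows which environment is active
        "system": platform.system(),    # e.g. Linux or Darwin
        "machine": platform.machine(),  # CPU architecture
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it with the Python interpreter of the activated environment so that the versions it reports are the ones peakachu actually sees.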
Thanks again
Thanks for your thoughtful suggestions! Here is the output of running conda list:
# packages in environment at /share/home/lhl_zhulin/miniconda3/envs/juicer:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
_openmp_mutex 4.5 2_kmp_llvm https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
asciitree 0.3.3 pypi_0 pypi
bioframe 0.5.0 pypi_0 pypi
bwa 0.7.17 h7132678_9 https://mirrors.bfsu.edu.cn/anaconda/cloud/bioconda
bwa-mem2 2.2.1 hd03093a_2 https://mirrors.bfsu.edu.cn/anaconda/cloud/bioconda
bzip2 1.0.8 h7b6447c_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
c-ares 1.19.0 h5eee18b_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
ca-certificates 2023.7.22 hbcca054_0 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
certifi 2023.7.22 pypi_0 pypi
charset-normalizer 3.3.0 pypi_0 pypi
click 8.1.7 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
cooler 0.9.3 pypi_0 pypi
cooltools 0.5.4 pypi_0 pypi
curl 7.88.1 h5eee18b_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
cycler 0.12.1 pypi_0 pypi
cython 3.0.4 pypi_0 pypi
cytoolz 0.12.2 pypi_0 pypi
dill 0.3.7 pypi_0 pypi
fonttools 4.43.1 pypi_0 pypi
gdbm 1.18 hd4cb3f1_4 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
h5py 3.10.0 pypi_0 pypi
hic-straw 0.0.6 pypi_0 pypi
idna 3.4 pypi_0 pypi
imageio 2.31.5 pypi_0 pypi
importlib-metadata 6.8.0 pypi_0 pypi
importlib-resources 6.1.0 pypi_0 pypi
joblib 1.3.2 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
krb5 1.19.4 h568e23c_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
lazy-loader 0.3 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libcurl 7.88.1 h91b91d3_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libedit 3.1.20221030 h5eee18b_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libev 4.33 h7f8727e_1 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libffi 3.3 he6710b0_2 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libgcc-ng 12.2.0 h65d4601_19 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
libnghttp2 1.46.0 hce63b2e_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libssh2 1.10.0 h8f2d780_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
libstdcxx-ng 13.2.0 h7e041cc_2 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
libzlib 1.2.13 h166bdaf_4 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
llvm-openmp 14.0.6 h9e868ea_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
llvmlite 0.41.0 pypi_0 pypi
matplotlib 3.7.3 pypi_0 pypi
multiprocess 0.70.15 pypi_0 pypi
ncurses 6.4 h6a678d5_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
networkx 3.1 pypi_0 pypi
numba 0.58.0 pypi_0 pypi
numpy 1.24.4 pypi_0 pypi
openssl 1.1.1w hd590300_0 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
packaging 23.2 pypi_0 pypi
pandas 1.5.3 pypi_0 pypi
peakachu 2.2.post1 pypi_0 pypi
perl 5.34.0 h5eee18b_2 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
pillow 10.1.0 pypi_0 pypi
pip 23.2.1 py38h06a4308_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
pybind11 2.11.1 py38h7f3f72f_2 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
pybind11-global 2.11.1 py38h7f3f72f_2 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
pyfaidx 0.7.2.2 pypi_0 pypi
pyparsing 3.1.1 pypi_0 pypi
python 3.8.8 hdb3f193_5 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
python-dateutil 2.8.2 pypi_0 pypi
python_abi 3.8 2_cp38 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
pytz 2023.3.post1 pypi_0 pypi
pywavelets 1.4.1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
requests 2.31.0 pypi_0 pypi
samtools 1.6 hcd7b337_9 https://mirrors.bfsu.edu.cn/anaconda/cloud/bioconda
scikit-image 0.21.0 pypi_0 pypi
scikit-learn 1.3.1 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 68.0.0 py38h06a4308_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
simplejson 3.19.2 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sqlite 3.41.2 h5eee18b_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
threadpoolctl 3.2.0 pypi_0 pypi
tifffile 2023.7.10 pypi_0 pypi
tk 8.6.12 h1ccaba5_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
toolz 0.12.0 pypi_0 pypi
typing-extensions 4.8.0 pypi_0 pypi
urllib3 2.0.7 pypi_0 pypi
wheel 0.41.2 py38h06a4308_0 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
xz 5.2.10 h5eee18b_1 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
zipp 3.17.0 pypi_0 pypi
zlib 1.2.13 h166bdaf_4 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
I installed peakachu with conda, and I am working on a Linux server managed by the Slurm system. As you said, the Python environment may have caused this problem, so I will create a new environment and try again. Thank you again for your thoughtful consideration!
No problem :)
Your output shows installs from several channels -- pypi, anaconda, etc. While conflicts between pip and conda are better managed now than in the past, conda's dependency management still gets confused sometimes. Usually something like this happens:
I make a conda env. I pip install something. pip upgrades matplotlib or whatever. conda gets confused. It turns out something installed by conda breaks if matplotlib updates. Not sure if this is really what happens, but close enough.
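To get a quick picture of how mixed an environment is, here is a small sketch that tallies packages per channel from pasted `conda list` text. The function name `count_channels` is made up, and treating an empty channel column as "defaults" is an assumption on my part; the sample rows are taken from the listing above.

```python
from collections import Counter


def count_channels(conda_list_text):
    """Count packages per channel in whitespace-separated `conda list` output."""
    counts = Counter()
    for line in conda_list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        # Columns are: name, version, build, and optionally the channel.
        # Assumption: a missing channel column means the default channel.
        channel = fields[3] if len(fields) > 3 else "defaults"
        # Mirror URLs end in the channel name, so keep the last path component.
        counts[channel.rstrip("/").split("/")[-1]] += 1
    return counts


sample = """\
# Name Version Build Channel
numpy 1.24.4 pypi_0 pypi
scikit-learn 1.3.1 pypi_0 pypi
libgcc-ng 12.2.0 h65d4601_19 https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge
bwa 0.7.17 h7132678_9 https://mirrors.bfsu.edu.cn/anaconda/cloud/bioconda
python 3.8.8 hdb3f193_5 https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
"""

for channel, n in count_channels(sample).most_common():
    print(f"{channel}: {n}")
```

A large pypi count next to conda-managed core packages is the kind of mix described above.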
The other issue is that anaconda's default channel does not always have architecture-specific binaries for a package. The preferred channel is conda-forge, which is better maintained and more reliable. Your channel priority should be conda-forge > bioconda > defaults.
miniconda-forge on GitHub explains things more. I like mamba / micromamba, but editing your regular miniconda config should work just as well.
To avoid most python env gotchas, take this advice: build your conda env all in one go, as in, specify all the libraries you need at creation time. Their dependencies will all get resolved together. Ideally, you never modify the env. If you need to install something new, prefer conda install over pip. If you pip install something, then use only pip after that.
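After building a fresh environment, one way to confirm that it ended up with the versions you intended is a small pin check like the sketch below. The helper `check_pins` is hypothetical, and the scikit-learn 1.1.2 pin is simply the version named in the warning above; whether pinning it resolves the error is an untested assumption.

```python
from importlib import metadata


def check_pins(pins):
    """Return {package: (wanted, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Assumed pin: the version the model was pickled with, per the warning.
    problems = check_pins({"scikit-learn": "1.1.2"})
    if problems:
        for name, (wanted, installed) in problems.items():
            print(f"{name}: wanted {wanted}, found {installed}")
    else:
        print("all pins satisfied")
```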
We'll try this first and look for another solution if not resolved.
Good luck!
Checking in: were you able to get things working?