
[BUG] “pyscenic ctx” cannot load “.feather” correctly

Open JRZL123 opened this issue 3 years ago • 7 comments

Dear pySCENIC development team, I encountered a problem when using “pyscenic ctx”.

Describe the bug

When I run “pyscenic ctx”, it fails with the error: "mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format. The ".feather" files were downloaded via zsync_curl, and the checksums match.

Reproduce the behavior

  1. Command run when the error occurred:
pyscenic ctx Astrocyte.tsv mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather mm9-tss-centered-10kb-10species.mc9nr.genes_vs_motifs.rankings.feather --annotations_fname motifs-v9-nr.mgi-m0.001-o0.0.tbl --expression_mtx_fname Astrocyte.loom --mode "dask_multiprocessing" --output Astrocyte_reg.csv --num_workers 4 --mask_dropouts
  2. Error encountered:
2022-08-29 11:11:22,243 - pyscenic.cli.pyscenic - INFO - Creating modules.

2022-08-29 11:11:22,613 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2022-08-29 11:11:22,867 - pyscenic.utils - INFO - Calculating Pearson correlations.

2022-08-29 11:11:22,987 - pyscenic.utils - WARNING - Note on correlation calculation: the default behaviour for calculating the correlations has changed after pySCENIC verion 0.9.16. Previously, the default was to calculate the correlation between a TF and target gene using only cells with non-zero expression values (mask_dropouts=True). The current default is now to use all cells to match the behavior of the R verision of SCENIC. The original settings can be retained by setting 'rho_mask_dropouts=True' in the modules_from_adjacencies function, or '--mask_dropouts' from the CLI.
        Dropout masking is currently set to [True].

2022-08-29 11:11:25,281 - pyscenic.utils - INFO - Creating modules.

2022-08-29 11:11:50,503 - pyscenic.cli.pyscenic - INFO - Loading databases.
Traceback (most recent call last):
  File "/home/lhz197104/miniconda3/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 677, in main
    args.func(args)
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 215, in prune_targets_command
    dbs = _load_dbs(args.database_fname)
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 176, in _load_dbs
    return [opendb(fname=fname.name, name=get_name(fname.name)) for fname in fnames]
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 176, in <listcomp>
    return [opendb(fname=fname.name, name=get_name(fname.name)) for fname in fnames]
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/rnkdb.py", line 180, in opendb
    return FeatherRankingDatabase(fname, name=name)
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/rnkdb.py", line 109, in __init__
    self.ct_db = CisTargetDatabase.init_ct_db(
  File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/ctdb.py", line 170, in init_ct_db
    raise ValueError(
ValueError: "mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format.

Expected behavior

Did I do something wrong? How can I solve this problem?

Please complete the following information:

  • pySCENIC version: [0.12.0]
  • Installation method: [Conda]
  • Run environment: [CLI ]
  • OS: [WSL2 Ubuntu 20.04.4]
  • Package versions:
aiohttp==3.8.1
aiosignal==1.2.0
arboreto==0.1.6
async-timeout==4.0.2
attrs==22.1.0
bokeh==2.4.3
boltons==21.0.0
Bottleneck @ file:///tmp/build/80754af9/bottleneck_1648028895253/work
brotlipy==0.7.0
certifi @ file:///opt/conda/conda-bld/certifi_1655968806487/work/certifi
cffi @ file:///opt/conda/conda-bld/cffi_1642701102775/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click @ file:///tmp/build/80754af9/click_1646038465422/work
cloudpickle==2.1.0
colorama @ file:///tmp/build/80754af9/colorama_1607707115595/work
conda==4.14.0
conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1649087926789/work
cryptography @ file:///tmp/build/80754af9/cryptography_1639400846433/work
ctxcore==0.2.0
cycler @ file:///tmp/build/80754af9/cycler_1637851556182/work
Cython @ file:///tmp/build/80754af9/cython_1647832478439/work
cytoolz==0.11.0
dask==2022.8.1
dill==0.3.5.1
distributed==2022.8.1
fonttools==4.25.0
frozendict==2.3.4
frozenlist==1.3.1
fsspec==2022.7.1
h5py==2.10.0
HeapDict==1.0.1
idna @ file:///tmp/build/80754af9/idna_1637925883363/work
interlap==0.2.7
Jinja2==3.1.2
joblib @ file:///tmp/build/80754af9/joblib_1635411271373/work
kiwisolver @ file:///opt/conda/conda-bld/kiwisolver_1653292039266/work
llvmlite==0.38.0
locket==1.0.0
loompy==3.0.7
MarkupSafe==2.1.1
matplotlib @ file:///tmp/build/80754af9/matplotlib-suite_1647441664166/work
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mock @ file:///tmp/build/80754af9/mock_1607622725907/work
msgpack==1.0.4
multidict==6.0.2
multiprocessing-on-dill==3.5.0a4
networkx==2.8.6
numba @ file:///opt/conda/conda-bld/numba_1648040517072/work
numexpr @ file:///tmp/build/80754af9/numexpr_1640704208950/work
numpy @ file:///opt/conda/conda-bld/numpy_and_numpy_base_1651563629415/work
numpy-groupies==0.9.19
packaging @ file:///tmp/build/80754af9/packaging_1637314298585/work
pandas==1.4.3
partd==1.3.0
patsy==0.5.2
Pillow==9.0.1
psutil==5.9.1
pyarrow==9.0.0
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pynndescent==0.5.7
pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
pyparsing @ file:///opt/conda/conda-bld/pyparsing_1661452539315/work
pysam==0.19.1
pyscenic==0.12.0
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
pytz==2022.2.1
PyYAML==6.0
requests @ file:///opt/conda/conda-bld/requests_1641824580448/work
ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016699510/work
scikit-learn @ file:///tmp/build/80754af9/scikit-learn_1642617107864/work
scipy @ file:///tmp/build/80754af9/scipy_1641555001653/work
seaborn @ file:///tmp/build/80754af9/seaborn_1629307859561/work
sip==4.19.13
six @ file:///tmp/build/80754af9/six_1644875935023/work
sortedcontainers==2.4.0
statsmodels @ file:///tmp/build/80754af9/statsmodels_1648033297787/work
tables==3.6.1
tblib==1.7.0
threadpoolctl @ file:///Users/ktietz/demo/mc3/conda-bld/threadpoolctl_1629802263681/work
toolz @ file:///tmp/build/80754af9/toolz_1636545406491/work
tornado @ file:///tmp/build/80754af9/tornado_1606942300299/work
tqdm @ file:///opt/conda/conda-bld/tqdm_1647339053476/work
typing_extensions==4.3.0
umap-learn==0.5.3
urllib3 @ file:///opt/conda/conda-bld/urllib3_1643638302206/work
velocyto==0.17.17
yarl==1.8.1
zict==2.2.0 

JRZL123 • Aug 29 '22 03:08

Install pySCENIC 0.12.0: https://pypi.org/project/pyscenic/

ghuls • Sep 02 '22 12:09

Same problem, nothing changes (ㄒoㄒ)

JRZL123 • Sep 04 '22 09:09

Pretty sure the mm9 ".feather" file is the problem. I used "mm10_10kbp_up_10kbp_down_full_tx_clustered.genes_vs_motifs.rankings.feather" and "mm10_500bp_up_100bp_down_full_tx_clustered.genes_vs_motifs.rankings.feather", and they ran perfectly fine! Now the only question is: if I use the mm10 ".feather" databases to analyze a count matrix obtained with mm9, will there be serious consequences?

JRZL123 • Sep 04 '22 13:09

Normally not a lot. It might even be better as the mm9 gene annotation used in the old database was quite old (so you might recover more genes from your count matrix).

ghuls • Sep 14 '22 09:09
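
To get a rough feel for how many genes from a count matrix a given database would cover, the gene names on both sides can be compared directly. This is only a sketch: `gene_coverage` is a hypothetical helper, not part of pySCENIC, and it assumes the database gene symbols are available as a list of column names with one extra index column (assumed here to be called "motifs"). Matching is case-sensitive.

```python
def gene_coverage(expr_genes, db_columns, index_column="motifs"):
    """Return (shared gene names, fraction of expression genes covered).

    expr_genes   : gene names from the count matrix
    db_columns   : column names of the ranking database (assumption:
                   one column per gene plus one index column)
    index_column : name of the non-gene column (assumed, adjust as needed)
    """
    expr = set(expr_genes)
    db = set(db_columns) - {index_column}
    shared = expr & db
    # Avoid dividing by zero when the expression gene list is empty.
    fraction = len(shared) / len(expr) if expr else 0.0
    return sorted(shared), fraction


if __name__ == "__main__":
    # Toy gene lists, not taken from any real database.
    shared, fraction = gene_coverage(
        ["Gfap", "Aqp4", "Xyz1"],
        ["motifs", "Gfap", "Aqp4", "Sox9"],
    )
    print(shared, fraction)
```

Running the same comparison against two candidate databases gives a concrete number for how many genes each one would recover.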

Hi, I'm having the same issue with the mm10 files, running pySCENIC 0.12.0.

ValueError: "mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format.

Any more ideas?

AdiRavid • Oct 20 '22 10:10

Can you redownload the file? The file didn't exist before (different name).

ghuls • Nov 21 '22 14:11
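
Before or after redownloading, a quick look at the file's magic bytes can tell a truncated or non-Feather download apart from a real Feather file. The sketch below relies only on the documented file signatures (Feather v1 files start and end with `FEA1`; Feather v2 / Arrow IPC files start and end with `ARROW1`). `sniff_feather` is a hypothetical helper, not part of pySCENIC or ctxcore, and it does not reproduce ctxcore's own validation, so a file that passes here may still be rejected for other reasons.

```python
import os

FEATHER_V1_MAGIC = b"FEA1"
ARROW_IPC_MAGIC = b"ARROW1"  # Feather v2 is the Arrow IPC file format


def sniff_feather(path):
    """Return (version, has_trailing_magic) from the file's first/last bytes.

    version is 1, 2, or None if neither signature is found at the start.
    has_trailing_magic is False when the closing signature is missing,
    which usually points to an incomplete download.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        head = fh.read(6)
        fh.seek(max(size - 6, 0))
        tail = fh.read(6)
    for version, magic in ((2, ARROW_IPC_MAGIC), (1, FEATHER_V1_MAGIC)):
        if head.startswith(magic):
            complete = size >= 2 * len(magic) and tail.endswith(magic)
            return version, complete
    return None, False


if __name__ == "__main__":
    import sys

    for fname in sys.argv[1:]:
        print(fname, sniff_feather(fname))
```

A result of `(None, False)` suggests the download is not a Feather file at all (for example an HTML error page saved under the expected name).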

Trying again, but the checksum file doesn't exist. sha256sum.txt

AdiRavid • Dec 13 '22 08:12
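
Even without a published sha256sum.txt, the digest of the downloaded file can be computed locally and compared with another copy of the same file or with a checksum published later. A minimal sketch using Python's standard `hashlib`; `file_sha256` is a hypothetical helper name, and the chunk size is an arbitrary choice.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so multi-gigabyte databases fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    for fname in sys.argv[1:]:
        # Same "digest  filename" layout that sha256sum prints.
        print(f"{file_sha256(fname)}  {fname}")
```

Two downloads of the same database that produce different digests mean at least one of them is corrupted or incomplete.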