scenicplus
run_scenicplus stops running midway
Describe the bug
Hello, I am trying to run run_scenicplus. It gets part of the way through and then simply stops, and I don't know why. I'm fairly new to Python, so I'm not sure whether it writes error output anywhere, but I don't see anything that indicates why it stops.
To Reproduce
import os
import dill
from scenicplus.wrappers.run_scenicplus import run_scenicplus

# scplus_obj, projDir and biomart_host are assumed to be defined earlier in the session
try:
    #sys.stderr = open(os.devnull, "w") # silence stderr
    run_scenicplus(
        scplus_obj = scplus_obj,
        variable = ['Dev.Stage'],
        species = 'hsapiens',
        assembly = 'hg38',
        tf_file = os.path.join(projDir, 'TF_names_v_1.01.txt'),
        save_path = os.path.join(projDir, 'scenicplus'),
        biomart_host = biomart_host,
        upstream = [1000, 150000],
        downstream = [1000, 150000],
        calculate_TF_eGRN_correlation = True,
        calculate_DEGs_DARs = True,
        export_to_loom_file = True,
        export_to_UCSC_file = True,
        path_bedToBigBed = projDir,
        n_cpu = 12,
        _temp_dir = os.path.join(projDir, 'ray_spill'),
        save_patrial = True)
    #sys.stderr = sys.__stderr__ # unsilence stderr
except Exception as e:
    #in case of failure, still save the object
    dill.dump(scplus_obj, open(os.path.join(projDir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
    raise(e)
Error output
2024-02-16 09:13:22,128 SCENIC+_wrapper INFO Created folder : /u/scratch/c/chienp/scenicplus/scenicplus
2024-02-16 09:13:22,128 SCENIC+_wrapper INFO Merging cistromes
2024-02-16 09:13:33,539 SCENIC+_wrapper INFO Getting search space
2024-02-16 09:13:34,813 R2G INFO Downloading gene annotation from biomart dataset: hsapiens_gene_ensembl
2024-02-16 09:13:48,102 R2G INFO Downloading chromosome sizes from: http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
2024-02-16 09:13:48,673 R2G INFO Extending promoter annotation to 10 bp upstream and 10 downstream
Warning! Start and End columns now have different dtypes: int32 and int64
Warning! Start and End columns now have different dtypes: int32 and int64
2024-02-16 09:13:50,397 R2G INFO Extending search space to:
150000 bp downstream of the end of the gene.
150000 bp upstream of the start of the gene.
Warning! Start and End columns now have different dtypes: int32 and int64
Warning! Start and End columns now have different dtypes: int32 and int64
2024-02-16 09:13:57,875 R2G INFO Intersecting with regions.
Warning! Start and End columns now have different dtypes: int32 and int64
2024-02-16 09:13:58,754 R2G INFO Calculating distances from region to gene
2024-02-16 09:15:21,819 R2G INFO Imploding multiple entries per region and gene
2024-02-16 09:17:56,569 R2G INFO Done!
2024-02-16 09:17:56,922 SCENIC+_wrapper INFO Inferring region to gene relationships
2024-02-16 09:17:57,078 R2G INFO Calculating region to gene importances, using GBM method
>>>
The output just stops here.
Expected behavior
run_scenicplus should continue processing.
Version (please complete the following information):
- Python 3.8.18
- SCENIC+: '1.0.1.dev6+ge5ba6fc'
Hi @peggiechien
When a Python process suddenly stops without any error message, it is usually a sign that you are running out of memory. How much memory do you have available?
Also, I would recommend using the development version of the code; it's almost a complete rewrite and is a lot more memory efficient. Please see this discussion for more information: https://github.com/aertslab/scenicplus/discussions/202
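To answer the memory question, one rough way to check available memory on a Linux machine is to read /proc/meminfo. The sketch below parses text in that format; the sample values are made up for illustration, and on a cluster the scheduler's per-job limit (not the node total) is what actually applies, so treat this as a first check only.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_gib(text):
    """Return the MemAvailable field converted from kB to GiB."""
    return parse_meminfo(text)["MemAvailable"] / (1024 ** 2)


# Illustrative sample only (hypothetical numbers, not from this issue).
sample = "MemTotal:       131072000 kB\nMemAvailable:    65536000 kB\n"
sample_available = available_gib(sample)
```

On a real Linux node the same function could be called as available_gib(open("/proc/meminfo").read()).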
All the best,
Seppe
I requested 64G and then 128G of memory, and in both cases the process stops at the same spot. Does it need more memory than that?
I'll try the development version, though I'm a little confused about how to set it up. The link in that post about modifying the config.yaml file doesn't work: https://github.com/aertslab/scenicplus/blob/development/Snakemake/config/config.yaml . Do you have a different link instead?
Also, am I supposed to see a Snakemake directory after installing SCENIC+, and is that where I'll find the config.yaml file? I downloaded the development version using this:
git clone -b development https://github.com/aertslab/scenicplus
And in the directory I see this:
$ ls
cytoscape_styles Dockerfile Dockerfile_meme docs notebooks README.md requirements.txt resources setup.py src
This is the version I installed:
>>> import scenicplus
>>> scenicplus.__version__
'1.0a1'
Hi @peggiechien
You might need more memory than that. It depends on the number of cells and regions.
Once you have installed the development version of SCENIC+, you can initialize the snakemake directory using the following command:
$ scenicplus init_snakemake --help
usage: scenicplus init_snakemake [-h] --out_dir OUT_DIR
Initialize snakemake pipeline
optional arguments:
-h, --help show this help message and exit
--out_dir OUT_DIR Path to out dir.
This will create the snakemake directory, in which you will find the config.yaml file. After modifying this file for your data, you can run the snakemake pipeline by running the following inside the snakemake directory:
$ snakemake --cores <NUMBER_OF_CORES>
I hope this helps.
Let me know if you get stuck somewhere.
All the best,
Seppe
Hi @SeppeDeWinter
I'm running into this issue:
$ scenicplus init_snakemake --help
Traceback (most recent call last):
File "/u/home/c/chienp/.local/bin/scenicplus", line 33, in <module>
sys.exit(load_entry_point('scenicplus', 'console_scripts', 'scenicplus')())
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 1123, in main
parser, subparsers = create_argument_parser()
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 1110, in create_argument_parser
add_parser_for_motif_enrichment_cistarget(inference_subparsers)
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 404, in add_parser_for_motif_enrichment_cistarget
from pycistarget.cli.pycistarget import CISTARGET_DEFAULTS
ModuleNotFoundError: No module named 'pycistarget.cli'
Not sure if this is the right track, but I installed pycistarget separately using pip install -e . and that didn't fix the issue.
On a separate note, I saw in the other thread that for the config.yaml file I would need to have bed files for the "region_set_folder". Is there a way I can generate the bed files from somewhere in the pipeline leading up to using SCENIC+?
Hi @peggiechien
Regarding the pycistarget.cli error: the development version of pycistarget should be installed. I fixed this in the code last week (see: https://github.com/aertslab/scenicplus/commit/2f8891cdd68981cb56c38596c13392079c8d76b2). If you reinstall SCENIC+ now, this issue should be fixed.
Indeed you need bed files. Typically we use regions in topics and differentially accessible regions (DARs). See "Inferring candidate enhancer regions" in the PBMC tutorial for example: https://scenicplus.readthedocs.io/en/latest/pbmc_multiome_tutorial.html#Inferring-candidate-enhancer-regions.
These regions can be saved to bed files. For example for binarized topics:
import os
from pycistarget.utils import region_names_to_coordinates

OUTDIR = "<OUTPUT_DIRECTORY>"  # replace with the path to your output directory
# region_bin_topics_otsu: the binarized topics from the tutorial section linked above
for topic in region_bin_topics_otsu.keys():
    regions = region_bin_topics_otsu[topic].index
    region_names_to_coordinates(regions).to_csv(
        os.path.join(OUTDIR, f"{topic}_otsu.bed"),
        sep = "\t", header = False, index = False
    )
The same can be done for the DARs.
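As a rough illustration for the DARs, here is a dependency-free sketch that writes one BED file per group. It assumes the DARs are available as a mapping from group name to region names of the form chr:start-end (for instance the index of each per-group table); the variable dars_by_group, the helper names, and the coordinates are hypothetical and only there to make the example runnable.

```python
import os
import tempfile


def region_name_to_bed_fields(name):
    """Split a 'chr:start-end' region name into (chrom, start, end)."""
    chrom, coords = name.rsplit(":", 1)
    start, end = coords.split("-")
    return chrom, int(start), int(end)


def write_region_sets(region_sets, out_dir, suffix="DARs"):
    """Write one tab-separated, header-less BED file per region set."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for group, names in region_sets.items():
        path = os.path.join(out_dir, f"{group}_{suffix}.bed")
        with open(path, "w") as handle:
            for name in names:
                chrom, start, end = region_name_to_bed_fields(name)
                handle.write(f"{chrom}\t{start}\t{end}\n")
        written.append(path)
    return written


# Hypothetical input: group names follow this thread, coordinates are made up.
dars_by_group = {
    "Stage_3": ["chr1:100-600", "chr2:5000-5500"],
    "Stage_4": ["chrX:42-542"],
}
out_dir = tempfile.mkdtemp()
written_paths = write_region_sets(dars_by_group, out_dir)
```

With the actual pycistarget helper shown above, the inner loop could be replaced by region_names_to_coordinates(...).to_csv(...), exactly as in the topics example.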
I hope this helps,
All the best.
Seppe
Hi @SeppeDeWinter
Thank you! I was able to generate the bed files.
I reinstalled SCENIC+ and am now running into this:
$ scenicplus init_snakemake --help
usage: scenicplus init_snakemake [-h] --out_dir OUT_DIR
Initialize snakemake pipeline
options:
-h, --help show this help message and exit
--out_dir OUT_DIR Path to out dir.
$ scenicplus init_snakemake --out_dir out_dir
Traceback (most recent call last):
File "/u/home/c/chienp/.local/bin/scenicplus", line 33, in <module>
sys.exit(load_entry_point('scenicplus', 'console_scripts', 'scenicplus')())
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 1137, in main
args.func(args)
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 21, in init_snakemake
from scenicplus.cli.commands import init_snakemake_folder
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/commands.py", line 15, in <module>
from importlib_resources import files
ModuleNotFoundError: No module named 'importlib_resources'
Hi @peggiechien
Can you try rerunning after installing importlib_resources?
pip install importlib_resources
I have now also added this package as a requirement: https://github.com/aertslab/scenicplus/commit/6527e3c3bf6bcd293335570f06000b5949a0770a.
I hope this solves your issue.
All the best, Seppe
Hi @SeppeDeWinter
That solved the issue! Thank you! I got snakemake to work now, but am running into a new error in the pipeline. I've updated the config.yaml file and am trying to run this from the Snakemake folder:
$ snakemake --cores 12
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Job stats:
job count
---------------------------- -------
AUCell_direct 1
AUCell_extended 1
all 1
download_genome_annotations 1
eGRN_direct 1
eGRN_extended 1
get_search_space 1
motif_enrichment_cistarget 1
motif_enrichment_dem 1
prepare_GEX_ACC_non_multiome 1
prepare_menr 1
region_to_gene 1
scplus_mudata 1
tf_to_gene 1
total 14
Select jobs to execute...
[Sat Mar 9 16:54:13 2024]
rule motif_enrichment_cistarget:
input: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather, /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: ctx_results.hdf5, ctx_results.html
jobid: 9
reason: Missing output files: ctx_results.hdf5
threads: 12
resources: tmpdir=/tmp
usage: scenicplus grn_inference motif_enrichment_cistarget [-h] --region_set_folder REGION_SET_FOLDER --cistarget_db_fname CISTARGET_DB_FNAME
--output_fname_cistarget_result OUTPUT_FNAME_CISTARGET_RESULT --temp_dir TEMP_DIR --species
SPECIES [--fr_overlap_w_ctx_db FRACTION_OVERLAP_W_CISTARGET_DATABASE]
[--auc_threshold AUC_THRESHOLD] [--nes_threshold NES_THRESHOLD]
[--rank_threshold RANK_THRESHOLD] [--path_to_motif_annotations PATH_TO_MOTIF_ANNOTATIONS]
[--annotation_version ANNOTATION_VERSION] [--motif_similarity_fdr MOTIF_SIMILARITY_FDR]
[--orthologous_identity_threshold ORTHOLOGOUS_IDENTITY_THRESHOLD]
[--annotations_to_use [ANNOTATIONS_TO_USE ...]] [--write_html]
[--output_fname_cistarget_html OUTPUT_FNAME_CISTARGET_HTML] [--n_cpu N_CPU]
scenicplus grn_inference motif_enrichment_cistarget: error: argument --temp_dir: expected one argument
[Sat Mar 9 16:54:15 2024]
Error in rule motif_enrichment_cistarget:
jobid: 9
input: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather, /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: ctx_results.hdf5, ctx_results.html
shell:
scenicplus grn_inference motif_enrichment_cistarget --region_set_folder /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed --cistarget_db_fname /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather --output_fname_cistarget_result ctx_results.hdf5 --temp_dir --species homo_sapiens --fr_overlap_w_ctx_db 0.4 --auc_threshold 0.005 --nes_threshold 3.0 --rank_threshold 0.05 --path_to_motif_annotations /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl --annotation_version v10nr_clust --motif_similarity_fdr 0.001 --orthologous_identity_threshold 0.0 --annotations_to_use Direct_annot Orthology_annot --write_html --output_fname_cistarget_html ctx_results.html --n_cpu 12
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-09T165410.853333.snakemake.log
Sorry to keep bothering you with all the errors I've been getting. I appreciate all your help so far!
Hi @peggiechien
No worries. Did you specify a temporary directory in config.yaml, under params_general (line 42)?
...
41 params_general:
42 temp_dir: ""
43 n_cpu: 40
44 seed: 666
...
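A minimal sketch of why an empty temp_dir produces "expected one argument": with temp_dir set to "", the generated shell command contains --temp_dir --species ..., so the argument parser finds another option where the value should be. The parser below is a stand-in built with argparse for illustration, not the real scenicplus parser; only the two flag names are taken from the usage message above.

```python
import argparse


def build_parser():
    """Stand-in parser with two required options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--temp_dir", required=True)
    parser.add_argument("--species", required=True)
    return parser


def exit_code_for(argv):
    """Return 0 if argv parses, else the exit code argparse aborts with."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0


# Empty temp_dir: the value is missing, so "--species" directly follows "--temp_dir".
empty_temp_dir = exit_code_for(["--temp_dir", "--species", "homo_sapiens"])

# Non-empty temp_dir: the hypothetical path /tmp/scenicplus fills the slot.
with_temp_dir = exit_code_for(
    ["--temp_dir", "/tmp/scenicplus", "--species", "homo_sapiens"]
)
```

In the first call argparse prints a usage message and exits with a non-zero status, which is what Snakemake then reports as a failed job.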
All the best,
Seppe
Hi @SeppeDeWinter
I fixed the temp_dir and am now running into this:
$ snakemake --cores 12
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Job stats:
job count
---------------------------- -------
AUCell_direct 1
AUCell_extended 1
all 1
download_genome_annotations 1
eGRN_direct 1
eGRN_extended 1
get_search_space 1
motif_enrichment_cistarget 1
motif_enrichment_dem 1
prepare_GEX_ACC_non_multiome 1
prepare_menr 1
region_to_gene 1
scplus_mudata 1
tf_to_gene 1
total 14
Select jobs to execute...
[Tue Mar 12 10:14:50 2024]
rule motif_enrichment_cistarget:
input: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather, /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: ctx_results.hdf5, ctx_results.html
jobid: 9
reason: Missing output files: ctx_results.hdf5
threads: 12
resources: tmpdir=/tmp
2024-03-12 10:15:01,011 SCENIC+ INFO Reading region sets from: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed
2024-03-12 10:15:01,033 SCENIC+ INFO Writing html to: ctx_results.html
Traceback (most recent call last):
File "/u/home/c/chienp/miniforge3/envs/scenicplus/bin/scenicplus", line 33, in <module>
sys.exit(load_entry_point('scenicplus', 'console_scripts', 'scenicplus')())
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 1137, in main
args.func(args)
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/scenicplus.py", line 386, in motif_enrichment_cistarget
run_motif_enrichment_cistarget(
File "/u/home/c/chienp/scenicplus/src/scenicplus/cli/commands.py", line 182, in run_motif_enrichment_cistarget
all_motif_enrichment_df = pd.concat(
File "/u/home/c/chienp/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 317, in wrapper
return func(*args, **kwargs)
File "/u/home/c/chienp/.local/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 369, in concat
op = _Concatenator(
File "/u/home/c/chienp/.local/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 426, in __init__
raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
[Tue Mar 12 10:15:01 2024]
Error in rule motif_enrichment_cistarget:
jobid: 9
input: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather, /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: ctx_results.hdf5, ctx_results.html
shell:
scenicplus grn_inference motif_enrichment_cistarget --region_set_folder /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed --cistarget_db_fname /u/scratch/c/chienp/scenicplus/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather --output_fname_cistarget_result ctx_results.hdf5 --temp_dir temp_dir --species homo_sapiens --fr_overlap_w_ctx_db 0.4 --auc_threshold 0.005 --nes_threshold 3.0 --rank_threshold 0.05 --path_to_motif_annotations /u/scratch/c/chienp/scenicplus_invivo/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl --annotation_version v10nr_clust --motif_similarity_fdr 0.001 --orthologous_identity_threshold 0.0 --annotations_to_use Direct_annot Orthology_annot --write_html --output_fname_cistarget_html ctx_results.html --n_cpu 12
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-12T101448.880650.snakemake.log
Hi @peggiechien
This error can only occur when not a single motif is enriched, which is very unlikely.
Can you show the folder structure of: /u/scratch/c/chienp/scenicplus_invivo/candidate_enhancers_bed
Can you also show the head of some bed files?
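One possible way to collect both pieces of information at once is sketched below: it walks the region set folder and returns the first few lines of every .bed file, keyed by its path relative to the folder. The helper name and the demo layout (a DARs_Dev.Stage subfolder with made-up coordinates) are hypothetical and only serve to make the example runnable.

```python
import itertools
import os
import tempfile


def head_of_bed_files(folder, n_lines=3):
    """Map each .bed file (path relative to folder) to its first n_lines lines."""
    heads = {}
    for root, _dirs, files in os.walk(folder):
        for fname in sorted(files):
            if not fname.endswith(".bed"):
                continue
            path = os.path.join(root, fname)
            with open(path) as handle:
                lines = [line.rstrip("\n") for line in itertools.islice(handle, n_lines)]
            heads[os.path.relpath(path, folder)] = lines
    return heads


# Hypothetical demo layout: one subfolder holding one BED file.
demo_root = tempfile.mkdtemp()
demo_sub = os.path.join(demo_root, "DARs_Dev.Stage")
os.makedirs(demo_sub)
with open(os.path.join(demo_sub, "Stage_3.bed"), "w") as fh:
    fh.write("chr1\t100\t600\nchr2\t5000\t5500\n")

demo_heads = head_of_bed_files(demo_root, n_lines=1)
```

Printing the returned dictionary shows at a glance whether the BED files sit directly in the folder or inside subfolders, and whether their columns look like chromosome, start, end.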
All the best,
Seppe
Hi @SeppeDeWinter
I figured out my issue was the folder structure of the bed files folder - I forgot to put the bed files in a DARs folder. I got farther in the pipeline now! But I'm stuck at the motif_enrichment_dem step now:
$ snakemake --cores 12
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Job stats:
job count
---------------------------- -------
AUCell_direct 1
AUCell_extended 1
all 1
download_genome_annotations 1
eGRN_direct 1
eGRN_extended 1
get_search_space 1
motif_enrichment_cistarget 1
motif_enrichment_dem 1
prepare_GEX_ACC_non_multiome 1
prepare_menr 1
region_to_gene 1
scplus_mudata 1
tf_to_gene 1
total 14
Select jobs to execute...
[Fri Mar 15 12:09:15 2024]
rule motif_enrichment_cistarget:
input: /u/scratch/c/chienp/scenicplus_invivo/data/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/screen_databases/hg38_screen_v10_clust.regions_vs_motifs.rankings.feather, /u/scratch/c/chienp/scenicplus/screen_databases/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: /u/scratch/c/chienp/scenicplus_invivo/outs/ctx_results.hdf5, /u/scratch/c/chienp/scenicplus_invivo/outs/ctx_results.html
jobid: 9
reason: Missing output files: /u/scratch/c/chienp/scenicplus_invivo/outs/ctx_results.hdf5; Set of input files has changed since last execution
threads: 12
resources: tmpdir=/tmp
2024-03-15 12:09:30,003 SCENIC+ INFO Reading region sets from: /u/scratch/c/chienp/scenicplus_invivo/data/candidate_enhancers_bed
2024-03-15 12:09:30,005 SCENIC+ INFO Reading all .bed files in: DARS_Dev.Stage
2024-03-15 12:09:38,201 cisTarget INFO Reading cisTarget database
2024-03-15 12:09:38,356 cisTarget INFO Reading cisTarget database
2024-03-15 12:09:38,484 cisTarget INFO Reading cisTarget database
2024-03-15 12:11:07,993 cisTarget INFO Running cisTarget for DARS_Dev.Stage_Stage_4 which has 20590 regions
2024-03-15 12:11:08,580 cisTarget INFO Running cisTarget for DARS_Dev.Stage_Stage_5 which has 29873 regions
2024-03-15 12:11:14,706 cisTarget INFO Running cisTarget for DARS_Dev.Stage_Stage_3 which has 28582 regions
2024-03-15 12:11:39,577 cisTarget INFO Annotating motifs for DARS_Dev.Stage_Stage_4
2024-03-15 12:11:42,359 cisTarget INFO Getting cistromes for DARS_Dev.Stage_Stage_4
2024-03-15 12:11:44,387 cisTarget INFO Annotating motifs for DARS_Dev.Stage_Stage_5
2024-03-15 12:11:47,644 cisTarget INFO Getting cistromes for DARS_Dev.Stage_Stage_5
2024-03-15 12:11:50,234 cisTarget INFO Annotating motifs for DARS_Dev.Stage_Stage_3
2024-03-15 12:11:53,707 cisTarget INFO Getting cistromes for DARS_Dev.Stage_Stage_3
2024-03-15 12:11:56,990 SCENIC+ INFO Writing html to: /u/scratch/c/chienp/scenicplus_invivo/outs/ctx_results.html
2024-03-15 12:11:57,063 SCENIC+ INFO Writing output to: /u/scratch/c/chienp/scenicplus_invivo/outs/ctx_results.hdf5
[Fri Mar 15 12:12:07 2024]
Finished job 9.
1 of 14 steps (7%) done
Select jobs to execute...
[Fri Mar 15 12:12:07 2024]
rule download_genome_annotations:
output: /u/scratch/c/chienp/scenicplus_invivo/outs/genome_annotation.tsv, /u/scratch/c/chienp/scenicplus_invivo/outs/chromsizes.tsv
jobid: 8
reason: Missing output files: /u/scratch/c/chienp/scenicplus_invivo/outs/genome_annotation.tsv, /u/scratch/c/chienp/scenicplus_invivo/outs/chromsizes.tsv
resources: tmpdir=/tmp
[Fri Mar 15 12:12:07 2024]
rule prepare_GEX_ACC_non_multiome:
input: /u/scratch/c/chienp/scenicplus_invivo/data/cistopic_obj.pkl, /u/scratch/c/chienp/scenicplus_invivo/data/Mix_AP_07_MP_ad.h5ad
output: /u/scratch/c/chienp/scenicplus_invivo/outs/ACC_GEX.h5mu
jobid: 2
reason: Missing output files: /u/scratch/c/chienp/scenicplus_invivo/outs/ACC_GEX.h5mu
resources: tmpdir=/tmp
2024-03-15 12:12:23,910 SCENIC+ INFO Reading cisTopic object.
2024-03-15 12:12:24,460 SCENIC+ INFO Reading gene expression AnnData.
2024-03-15 12:12:24,878 cisTopic INFO Imputing region accessibility
2024-03-15 12:12:24,878 cisTopic INFO Impute region accessibility for regions 0-20000
2024-03-15 12:12:26,274 cisTopic INFO Impute region accessibility for regions 20000-40000
2024-03-15 12:12:27,850 cisTopic INFO Impute region accessibility for regions 40000-60000
2024-03-15 12:12:29,415 cisTopic INFO Impute region accessibility for regions 60000-80000
2024-03-15 12:12:30,893 cisTopic INFO Impute region accessibility for regions 80000-100000
2024-03-15 12:12:32,352 cisTopic INFO Impute region accessibility for regions 100000-120000
2024-03-15 12:12:33,809 cisTopic INFO Impute region accessibility for regions 120000-140000
2024-03-15 12:12:35,282 cisTopic INFO Impute region accessibility for regions 140000-160000
2024-03-15 12:12:36,819 cisTopic INFO Impute region accessibility for regions 160000-180000
2024-03-15 12:12:38,355 cisTopic INFO Impute region accessibility for regions 180000-200000
2024-03-15 12:12:39,945 cisTopic INFO Impute region accessibility for regions 200000-220000
2024-03-15 12:12:41,683 cisTopic INFO Impute region accessibility for regions 220000-240000
2024-03-15 12:12:43,166 cisTopic INFO Impute region accessibility for regions 240000-260000
2024-03-15 12:12:43,768 cisTopic INFO Done!
2024-03-15 12:12:43,772 Ingesting non-multiome data INFO Following annotations were found in both assays under key Dev.Stage:
Stage_3, Stage_4, Stage_5.
Keeping 2407 cells for RNA and 5561 for ATAC.
2024-03-15 12:13:15,205 Download gene annotation INFO Using genome: GRCh38.p14
2024-03-15 12:13:15,216 Download gene annotation INFO Found corresponding genome Id 51 on NCBI
2024-03-15 12:13:15,723 Download gene annotation INFO Found corresponding assembly Id 11968211 on NCBI
2024-03-15 12:13:16,234 Download gene annotation INFO Downloading assembly information from: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.40_GRCh38.p14/GCF_000001405.40_GRCh38.p14_assembly_report.txt
2024-03-15 12:13:19,455 Download gene annotation INFO Found following assembled molecules (chromosomes):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
X
Y
MT
2024-03-15 12:13:19,480 Download gene annotation INFO Converting chromosomes names to UCSC style as follows:
Original UCSC
1 chr1
2 chr2
3 chr3
4 chr4
5 chr5
6 chr6
7 chr7
8 chr8
9 chr9
10 chr10
11 chr11
12 chr12
13 chr13
14 chr14
15 chr15
16 chr16
17 chr17
18 chr18
19 chr19
20 chr20
21 chr21
22 chr22
X chrX
Y chrY
MT chrM
2024-03-15 12:13:19,525 SCENIC+ INFO Saving chromosome sizes to: /u/scratch/c/chienp/scenicplus_invivo/outs/chromsizes.tsv
2024-03-15 12:13:19,536 SCENIC+ INFO Saving genome annotation to: /u/scratch/c/chienp/scenicplus_invivo/outs/genome_annotation.tsv
[Fri Mar 15 12:13:20 2024]
Finished job 8.
2 of 14 steps (14%) done
Select jobs to execute...
2024-03-15 12:13:33,260 Ingesting non-multiome data INFO Automatically set `nr_metacells` to: Stage_3: 232, Stage_4: 158, Stage_5: 92
2024-03-15 12:13:33,260 Ingesting non-multiome data INFO Generating pseudo multi-ome data
... storing 'Dev.Stage' as categorical
... storing 'Dev.Stage' as categorical
... storing 'Chromosome' as categorical
[Fri Mar 15 12:14:12 2024]
Finished job 2.
3 of 14 steps (21%) done
[Fri Mar 15 12:14:12 2024]
rule motif_enrichment_dem:
input: /u/scratch/c/chienp/scenicplus_invivo/data/candidate_enhancers_bed, /u/scratch/c/chienp/scenicplus/screen_databases/hg38_screen_v10_clust.regions_vs_motifs.scores.feather, /u/scratch/c/chienp/scenicplus_invivo/outs/genome_annotation.tsv, /u/scratch/c/chienp/scenicplus/screen_databases/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
output: /u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.hdf5, /u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.html
jobid: 7
reason: Missing output files: /u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.hdf5; Input files updated by another job: /u/scratch/c/chienp/scenicplus_invivo/outs/genome_annotation.tsv
threads: 12
resources: tmpdir=/tmp
2024-03-15 12:14:27,326 SCENIC+ INFO Reading region sets from: /u/scratch/c/chienp/scenicplus_invivo/data/candidate_enhancers_bed
2024-03-15 12:14:27,328 SCENIC+ INFO Reading all .bed files in: DARS_Dev.Stage
2024-03-15 12:14:53,588 DEM INFO Running DEM for DARS_Dev.Stage_Stage_5_vs_all
2024-03-15 12:14:53,973 DEM INFO Running DEM for DARS_Dev.Stage_Stage_4_vs_all
2024-03-15 12:14:54,128 DEM INFO Running DEM for DARS_Dev.Stage_Stage_3_vs_all
2024-03-15 12:15:23,032 DEM INFO Adding motif-to-TF annotation
2024-03-15 12:15:29,442 DEM INFO Adding motif-to-TF annotation
2024-03-15 12:15:29,546 DEM INFO Adding motif-to-TF annotation
2024-03-15 12:15:31,928 SCENIC+ INFO Writing html to: /u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.html
2024-03-15 12:15:31,956 SCENIC+ INFO Writing output to: /u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.hdf5
Waiting at most 5 seconds for missing files.
MissingOutputException in rule motif_enrichment_dem in file /u/scratch/c/chienp/scenicplus_invivo/Snakemake/workflow/Snakefile, line 93:
Job 7 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
/u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.hdf5
Removing output files of failed job motif_enrichment_dem since they might be corrupted:
/u/scratch/c/chienp/scenicplus_invivo/outs/dem_results.html
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-15T120911.705817.snakemake.log
Hi @peggiechien
Can you try rerunning the workflow in an empty output directory?
As the error message suggests, you can also try increasing the wait time of Snakemake (--latency-wait) if you suspect that your file system is on the slow side.
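To illustrate the idea behind a latency wait (this is a conceptual sketch, not Snakemake's implementation), the helper below polls for a file until a timeout expires. The function name and polling interval are assumptions made for the example.

```python
import os
import tempfile
import time


def wait_for_file(path, timeout_s, poll_s=0.05):
    """Poll until path exists or timeout_s seconds have passed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# A file that already exists is found immediately.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    existing_path = tmp.name
found_existing = wait_for_file(existing_path, timeout_s=1.0)

# A file that is never written is still missing after the timeout.
missing_path = os.path.join(tempfile.mkdtemp(), "dem_results.hdf5")
found_missing = wait_for_file(missing_path, timeout_s=0.2)
```

A longer wait only helps when the file eventually shows up; if the output is never written, no timeout will be long enough and the cause lies elsewhere.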
All the best,
Seppe
Hello, it seems to be a Snakemake problem, because I had the same issue with the final object not being saved. I was able to run the command using the cli and it wrote the file.
(I installed scenic using the instructions, but I had to manually bump dask to dask==2024.05)
Setting --latency-wait did not help, as the file did not even appear.
Ubuntu 22.04.4 LTS (Jammy Jellyfish) 5.15.0-105-generic
My conda environment
# packages in environment at /my/env/scenicplus2:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
adjusttext 1.0.4 pypi_0 pypi
aiohttp 3.9.3 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
anndata 0.10.5.post1 pypi_0 pypi
annoy 1.17.3 pypi_0 pypi
anyio 4.3.0 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
arboreto 0.1.6 pypi_0 pypi
argon2-cffi 23.1.0 pypi_0 pypi
argon2-cffi-bindings 21.2.0 pypi_0 pypi
argparse-dataclass 2.0.0 pypi_0 pypi
array-api-compat 1.5.1 pypi_0 pypi
arrow 1.3.0 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
async-lru 2.0.4 pypi_0 pypi
attr 0.3.2 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
babel 2.15.0 pypi_0 pypi
bbknn 1.6.0 pypi_0 pypi
beautifulsoup4 4.12.3 pypi_0 pypi
bidict 0.23.1 pypi_0 pypi
bioservices 1.11.2 pypi_0 pypi
bleach 6.1.0 pypi_0 pypi
blosc2 2.5.1 pypi_0 pypi
bokeh 3.4.0 pypi_0 pypi
boltons 23.1.1 pypi_0 pypi
bs4 0.0.2 pypi_0 pypi
bzip2 1.0.8 hd590300_5 conda-forge
ca-certificates 2024.2.2 hbcca054_0 conda-forge
cattrs 23.2.3 pypi_0 pypi
certifi 2024.2.2 pypi_0 pypi
cffi 1.16.0 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
cloudpickle 3.0.0 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
colorlog 6.8.2 pypi_0 pypi
comm 0.2.2 pypi_0 pypi
conda-inject 1.3.1 pypi_0 pypi
configargparse 1.7 pypi_0 pypi
connection-pool 0.0.3 pypi_0 pypi
contourpy 1.2.0 pypi_0 pypi
ctxcore 0.2.0 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
cython 0.29.37 pypi_0 pypi
cytoolz 0.12.3 pypi_0 pypi
dask 2024.5.0 pypi_0 pypi
dataclasses-json 0.6.4 pypi_0 pypi
datrie 0.8.2 pypi_0 pypi
debugpy 1.8.1 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
dill 0.3.8 pypi_0 pypi
distributed 2024.2.1 pypi_0 pypi
docutils 0.20.1 pypi_0 pypi
dpath 2.1.6 pypi_0 pypi
easydev 0.13.1 pypi_0 pypi
et-xmlfile 1.1.0 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
fastjsonschema 2.19.1 pypi_0 pypi
fbpca 1.0 pypi_0 pypi
filelock 3.13.1 pypi_0 pypi
fonttools 4.50.0 pypi_0 pypi
fqdn 1.5.1 pypi_0 pypi
frozendict 2.4.0 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2024.3.1 pypi_0 pypi
future 1.0.0 pypi_0 pypi
gensim 4.3.2 pypi_0 pypi
geosketch 1.2 pypi_0 pypi
gevent 24.2.1 pypi_0 pypi
gitdb 4.0.11 pypi_0 pypi
gitpython 3.1.42 pypi_0 pypi
globre 0.1.5 pypi_0 pypi
greenlet 3.0.3 pypi_0 pypi
grequests 0.7.0 pypi_0 pypi
gseapy 0.10.8 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
h5py 3.10.0 pypi_0 pypi
harmonypy 0.0.9 pypi_0 pypi
httpcore 1.0.5 pypi_0 pypi
httpx 0.27.0 pypi_0 pypi
humanfriendly 10.0 pypi_0 pypi
idna 3.6 pypi_0 pypi
igraph 0.11.4 pypi_0 pypi
imageio 2.34.0 pypi_0 pypi
immutables 0.20 pypi_0 pypi
importlib-metadata 7.0.1 pypi_0 pypi
importlib-resources 6.1.2 pypi_0 pypi
interlap 0.2.7 pypi_0 pypi
intervaltree 3.1.0 pypi_0 pypi
ipykernel 6.29.4 pypi_0 pypi
ipython 8.22.2 pypi_0 pypi
ipywidgets 8.1.2 pypi_0 pypi
isoduration 20.11.0 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.1.3 pypi_0 pypi
joblib 1.3.2 pypi_0 pypi
json5 0.9.25 pypi_0 pypi
jsonpickle 3.0.3 pypi_0 pypi
jsonpointer 2.4 pypi_0 pypi
jsonschema 4.21.1 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
jupyter 1.0.0 pypi_0 pypi
jupyter-client 8.6.1 pypi_0 pypi
jupyter-console 6.6.3 pypi_0 pypi
jupyter-core 5.7.2 pypi_0 pypi
jupyter-events 0.10.0 pypi_0 pypi
jupyter-lsp 2.2.5 pypi_0 pypi
jupyter-server 2.14.0 pypi_0 pypi
jupyter-server-terminals 0.5.3 pypi_0 pypi
jupyterlab 4.1.8 pypi_0 pypi
jupyterlab-pygments 0.3.0 pypi_0 pypi
jupyterlab-server 2.27.1 pypi_0 pypi
jupyterlab-widgets 3.0.10 pypi_0 pypi
kaleido 0.2.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
lazy-loader 0.3 pypi_0 pypi
ld_impl_linux-64 2.40 h55db66e_0 conda-forge
lda 3.0.0 pypi_0 pypi
leidenalg 0.10.2 pypi_0 pypi
libexpat 2.6.2 h59595ed_0 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.2.0 h77fa898_7 conda-forge
libgomp 13.2.0 h77fa898_7 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libsqlite 3.45.3 h2797004_0 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
line-profiler 4.1.2 pypi_0 pypi
llvmlite 0.42.0 pypi_0 pypi
locket 1.0.0 pypi_0 pypi
loompy 3.0.7 pypi_0 pypi
loomxpy 0.4.2 pypi_0 pypi
lxml 5.1.0 pypi_0 pypi
lz4 4.3.3 pypi_0 pypi
macs2 2.2.9.1 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
marshmallow 3.21.1 pypi_0 pypi
matplotlib 3.6.3 pypi_0 pypi
matplotlib-inline 0.1.6 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mistune 3.0.2 pypi_0 pypi
mizani 0.9.3 pypi_0 pypi
msgpack 1.0.8 pypi_0 pypi
mudata 0.2.3 pypi_0 pypi
multidict 6.0.5 pypi_0 pypi
multiprocessing-on-dill 3.5.0a4 pypi_0 pypi
mypy-extensions 1.0.0 pypi_0 pypi
natsort 8.4.0 pypi_0 pypi
nbclient 0.10.0 pypi_0 pypi
nbconvert 7.16.4 pypi_0 pypi
nbformat 5.10.3 pypi_0 pypi
ncls 0.0.68 pypi_0 pypi
ncurses 6.5 h59595ed_0 conda-forge
ndindex 1.8 pypi_0 pypi
nest-asyncio 1.6.0 pypi_0 pypi
networkx 3.2.1 pypi_0 pypi
notebook 7.1.3 pypi_0 pypi
notebook-shim 0.2.4 pypi_0 pypi
numba 0.59.0 pypi_0 pypi
numexpr 2.9.0 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
numpy-groupies 0.10.2 pypi_0 pypi
openpyxl 3.1.2 pypi_0 pypi
openssl 3.3.0 hd590300_0 conda-forge
overrides 7.7.0 pypi_0 pypi
packaging 24.0 pypi_0 pypi
pandas 1.5.0 pypi_0 pypi
pandocfilters 1.5.1 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
partd 1.4.1 pypi_0 pypi
patsy 0.5.6 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pillow 10.2.0 pypi_0 pypi
pip 24.0 pyhd8ed1ab_0 conda-forge
plac 1.4.3 pypi_0 pypi
platformdirs 4.2.0 pypi_0 pypi
plotly 5.19.0 pypi_0 pypi
plotnine 0.12.4 pypi_0 pypi
polars 0.20.13 pypi_0 pypi
progressbar2 4.4.2 pypi_0 pypi
prometheus-client 0.20.0 pypi_0 pypi
prompt-toolkit 3.0.43 pypi_0 pypi
protobuf 5.26.0 pypi_0 pypi
psutil 5.9.8 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pulp 2.8.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
py-cpuinfo 9.0.0 pypi_0 pypi
pyarrow 15.0.0 pypi_0 pypi
pyarrow-hotfix 0.6 pypi_0 pypi
pybedtools 0.9.1 pypi_0 pypi
pybigtools 0.1.2 pypi_0 pypi
pybigwig 0.3.22 pypi_0 pypi
pybiomart 0.2.0 pypi_0 pypi
pycistarget 1.0a2 pypi_0 pypi
pycistopic 2.0a0 pypi_0 pypi
pycparser 2.22 pypi_0 pypi
pyfasta 0.5.2 pypi_0 pypi
pygam 0.9.0 pypi_0 pypi
pygments 2.17.2 pypi_0 pypi
pynndescent 0.5.11 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
pyranges 0.0.111 pypi_0 pypi
pyrle 0.0.39 pypi_0 pypi
pysam 0.22.0 pypi_0 pypi
pyscenic 0.12.1+8.gd2309fe pypi_0 pypi
python 3.11.9 hb806964_0_cpython conda-forge
python-dateutil 2.9.0.post0 pypi_0 pypi
python-json-logger 2.0.7 pypi_0 pypi
python-utils 3.8.2 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyvis 0.3.2 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
pyzmq 26.0.3 pypi_0 pypi
qtconsole 5.5.2 pypi_0 pypi
qtpy 2.4.1 pypi_0 pypi
ray 2.9.3 pypi_0 pypi
readline 8.2 h8228510_1 conda-forge
referencing 0.34.0 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
requests-cache 1.2.0 pypi_0 pypi
reretry 0.11.8 pypi_0 pypi
rfc3339-validator 0.1.4 pypi_0 pypi
rfc3986-validator 0.1.1 pypi_0 pypi
rich 13.7.1 pypi_0 pypi
rich-argparse 1.4.0 pypi_0 pypi
rpds-py 0.18.0 pypi_0 pypi
scanorama 1.7.4 pypi_0 pypi
scanpy 1.8.2 pypi_0 pypi
scatac-fragment-tools 0.1.0 pypi_0 pypi
scenicplus 1.0a1 pypi_0 pypi
scikit-image 0.22.0 pypi_0 pypi
scikit-learn 1.3.2 pypi_0 pypi
scipy 1.12.0 pypi_0 pypi
scrublet 0.2.3 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
send2trash 1.8.3 pypi_0 pypi
setuptools 69.5.1 pyhd8ed1ab_0 conda-forge
sinfo 0.3.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
smart-open 6.4.0 pypi_0 pypi
smmap 5.0.1 pypi_0 pypi
snakemake 8.5.5 pypi_0 pypi
snakemake-interface-common 1.17.1 pypi_0 pypi
snakemake-interface-executor-plugins 8.2.0 pypi_0 pypi
snakemake-interface-report-plugins 1.0.0 pypi_0 pypi
snakemake-interface-storage-plugins 3.1.1 pypi_0 pypi
sniffio 1.3.1 pypi_0 pypi
sorted-nearest 0.0.39 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
soupsieve 2.5 pypi_0 pypi
stack-data 0.6.3 pypi_0 pypi
statistics 1.0.3.5 pypi_0 pypi
statsmodels 0.14.1 pypi_0 pypi
stdlib-list 0.10.0 pypi_0 pypi
stopit 1.1.2 pypi_0 pypi
suds-community 1.1.2 pypi_0 pypi
tables 3.9.2 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tblib 3.0.0 pypi_0 pypi
tenacity 8.2.3 pypi_0 pypi
terminado 0.18.1 pypi_0 pypi
texttable 1.7.0 pypi_0 pypi
threadpoolctl 3.4.0 pypi_0 pypi
throttler 1.2.2 pypi_0 pypi
tifffile 2024.2.12 pypi_0 pypi
tinycss2 1.3.0 pypi_0 pypi
tk 8.6.13 noxft_h4845f30_101 conda-forge
tmtoolkit 0.12.0 pypi_0 pypi
toolz 0.12.1 pypi_0 pypi
toposort 1.10 pypi_0 pypi
tornado 6.4 pypi_0 pypi
tqdm 4.66.2 pypi_0 pypi
traitlets 5.14.2 pypi_0 pypi
tspex 0.6.3 pypi_0 pypi
types-python-dateutil 2.9.0.20240316 pypi_0 pypi
typing 3.7.4.3 pypi_0 pypi
typing-extensions 4.10.0 pypi_0 pypi
typing-inspect 0.9.0 pypi_0 pypi
tzdata 2024a h0c530f3_0 conda-forge
umap-learn 0.5.5 pypi_0 pypi
uri-template 1.3.0 pypi_0 pypi
url-normalize 1.4.3 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
webcolors 1.13 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
websocket-client 1.8.0 pypi_0 pypi
wheel 0.43.0 pyhd8ed1ab_1 conda-forge
widgetsnbextension 4.0.10 pypi_0 pypi
wrapt 1.16.0 pypi_0 pypi
xlrd 2.0.1 pypi_0 pypi
xmltodict 0.13.0 pypi_0 pypi
xyzservices 2023.10.1 pypi_0 pypi
xz 5.2.6 h166bdaf_0 conda-forge
yarl 1.9.4 pypi_0 pypi
yte 1.5.4 pypi_0 pypi
zict 3.0.0 pypi_0 pypi
zipp 3.18.1 pypi_0 pypi
zope-event 5.0 pypi_0 pypi
zope-interface 6.2 pypi_0 pypi
Hello, thank you for the fantastic tool! I'm getting exactly the same error as in the previous comment:
Waiting at most 5 seconds for missing files.
MissingOutputException in rule motif_enrichment_dem in file /work/leticia/sds/sd17d003/Leti/SCENIC+/scplus_pipeline2/Snakemake/workflow/Snakefile, line 93:
Job 7 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
dem_results.hdf5
Removing output files of failed job motif_enrichment_dem since they might be corrupted:
dem_results.html
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-09-04T160741.698427.snakemake.log
Somehow the "Writing output to: dem_results.hdf5" step seems to be really slow (or something else is failing without reporting an error), and running with --latency-wait 60 didn't help. How could this be fixed? Thank you!
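One way to tell filesystem latency apart from the file never being written at all is to poll for it directly while the job runs. Below is a minimal sketch, assuming the job writes dem_results.hdf5 relative to the Snakemake working directory; wait_for_file is a hypothetical helper written for this check, not part of scenicplus or Snakemake:

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll=1.0):
    """Poll `path` until its size is unchanged between two polls, or until timeout.

    Returns the last observed size in bytes, or None if the file
    was not present at the last check.
    """
    deadline = time.monotonic() + timeout
    last_size = None
    while True:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = None  # file does not exist (yet) or is not accessible
        if size is not None and size == last_size:
            return size  # same size on two consecutive polls: writing has likely finished
        last_size = size
        if time.monotonic() >= deadline:
            return last_size
        time.sleep(poll)


if __name__ == "__main__":
    # Assumed location: adjust to wherever the rule is expected to write its output.
    print(wait_for_file("dem_results.hdf5", timeout=120.0, poll=2.0))
```

If this prints None, the file never showed up at that path within the timeout, which would suggest the output is being written somewhere else or not at all, rather than arriving late. A size that keeps growing would instead point to a slow write.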