scenicplus
An error related to "peak_calling"
Describe the bug
When I attempt to run "peak_calling", I encounter an error, which I believe is related to numpy. I installed scenicplus using Python 3.11 following the instructions on the website: https://github.com/aertslab/scenicplus/tree/development. Based on my understanding, I am using "MACS2-2.2.7.1-py3.8-linux-x86_64". Could the error be caused by the difference in Python versions? I would appreciate any advice you can provide. Thank you so much for your help.
To Reproduce
narrow_peaks_dict = peak_calling(macs_path, bed_paths, os.path.join(work_dir, 'scATAC/consensus_peak_calling/MACS/'), genome_size='hs', n_cpu=2, input_format='BEDPE', shift=73, ext_size=146, keep_dup='all', q_value=0.05)
Version (please complete the following information):
- Python 3.11.8
- SCENIC+: '1.0a1'
- MACS2-2.2.7.1-py3.8-linux-x86_64?
- numpy: 1.26.4
Error output
2024-03-31 00:27:23,775 INFO worker.py:1724 -- Started a local Ray instance.
E0331 00:27:25.916030700 3019 socket_utils_common_posix.cc:224] check for SO_REUSEPORT: UNKNOWN:Protocol not available {created_time:"2024-03-31T00:27:25.915182686+09:00", errno:92, os_error:"Protocol not available", syscall:"getsockopt(SO_REUSEPORT)"}
(macs_call_peak_ray pid=3242) 2024-03-31 00:27:30,145 cisTopic INFO Calling peaks for Non-PE with macs2 callpeak --treatment scATAC/consensus_peak_calling/pseudobulk_bed_files/Non-PE.fragments.tsv.gz --name Non-PE --outdir scATAC/consensus_peak_calling/MACS/ --format BEDPE --gsize hs --qvalue 0.05 --nomodel --shift 73 --extsize 146 --keep-dup all --call-summits --nolambda
(macs_call_peak_ray pid=3241) 2024-03-31 00:27:30,150 cisTopic INFO Calling peaks for EMT1B with macs2 callpeak --treatment scATAC/consensus_peak_calling/pseudobulk_bed_files/EMT1B.fragments.tsv.gz --name EMT1B --outdir scATAC/consensus_peak_calling/MACS/ --format BEDPE --gsize hs --qvalue 0.05 --nomodel --shift 73 --extsize 146 --keep-dup all --call-summits --nolambda
---------------------------------------------------------------------------
RayTaskError(RuntimeError) Traceback (most recent call last)
Cell In[10], line 8
6 macs_path='macs2'
7 # Run peak calling
----> 8 narrow_peaks_dict = peak_calling(macs_path,
9 bed_paths,
10 os.path.join(work_dir, 'scATAC/consensus_peak_calling/MACS/'),
11 genome_size='hs',
12 n_cpu=2,
13 input_format='BEDPE',
14 shift=73,
15 ext_size=146,
16 keep_dup = 'all',
17 q_value = 0.05)
File ~/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py:286, in peak_calling(macs_path, bed_paths, outdir, genome_size, n_cpu, input_format, shift, ext_size, keep_dup, q_value, nolambda, skip_empty_peaks, **kwargs)
284 except Exception as e:
285 ray.shutdown()
--> 286 raise(e)
287 ray.shutdown()
288 else:
File ~/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py:264, in peak_calling(macs_path, bed_paths, outdir, genome_size, n_cpu, input_format, shift, ext_size, keep_dup, q_value, nolambda, skip_empty_peaks, **kwargs)
262 ray.init(num_cpus=n_cpu, **kwargs)
263 try:
--> 264 narrow_peaks = ray.get(
265 [
266 macs_call_peak_ray.remote(
267 macs_path,
268 bed_paths[name],
269 name,
270 outdir,
271 genome_size,
272 input_format,
273 shift,
274 ext_size,
275 keep_dup,
276 q_value,
277 nolambda,
278 skip_empty_peaks
279
280 )
281 for name in list(bed_paths.keys())
282 ]
283 )
284 except Exception as e:
285 ray.shutdown()
File ~/anaconda3/envs/scenicplus/lib/python3.11/site-packages/ray/_private/auto_init_hook.py:22, in wrap_auto_init.<locals>.auto_init_wrapper(*args, **kwargs)
19 @wraps(fn)
20 def auto_init_wrapper(*args, **kwargs):
21 auto_init_ray()
---> 22 return fn(*args, **kwargs)
File ~/anaconda3/envs/scenicplus/lib/python3.11/site-packages/ray/_private/client_mode_hook.py:103, in client_mode_hook.<locals>.wrapper(*args, **kwargs)
101 if func.__name__ != "init" or is_client_mode_enabled_by_default:
102 return getattr(ray, func.__name__)(*args, **kwargs)
--> 103 return func(*args, **kwargs)
File ~/anaconda3/envs/scenicplus/lib/python3.11/site-packages/ray/_private/worker.py:2624, in get(object_refs, timeout)
2622 worker.core_worker.dump_object_store_memory_usage()
2623 if isinstance(value, RayTaskError):
-> 2624 raise value.as_instanceof_cause()
2625 else:
2626 raise value
RayTaskError(RuntimeError): ray::macs_call_peak_ray() (pid=3241, ip=192.168.0.8)
File "/home/m03077yh/anaconda3/envs/scenicplus/lib/python3.11/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/m03077yh/anaconda3/envs/scenicplus/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'macs2 callpeak --treatment scATAC/consensus_peak_calling/pseudobulk_bed_files/EMT1B.fragments.tsv.gz --name EMT1B --outdir scATAC/consensus_peak_calling/MACS/ --format BEDPE --gsize hs --qvalue 0.05 --nomodel --shift 73 --extsize 146 --keep-dup all --call-summits --nolambda' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
ray::macs_call_peak_ray() (pid=3241, ip=192.168.0.8)
File "/home/m03077yh/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py", line 445, in macs_call_peak_ray
MACS_peak_calling = MACSCallPeak(
^^^^^^^^^^^^^
File "/home/m03077yh/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py", line 523, in __init__
self.call_peak()
File "/home/m03077yh/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py", line 564, in call_peak
raise RuntimeError(
RuntimeError: command 'macs2 callpeak --treatment scATAC/consensus_peak_calling/pseudobulk_bed_files/EMT1B.fragments.tsv.gz --name EMT1B --outdir scATAC/consensus_peak_calling/MACS/ --format BEDPE --gsize hs --qvalue 0.05 --nomodel --shift 73 --extsize 146 --keep-dup all --call-summits --nolambda' return with error (code 1): b'Traceback (most recent call last):\n File "/home/m03077yh/.local/bin/macs2", line 4, in <module>\n __import__(\'pkg_resources\').run_script(\'MACS2==2.2.7.1\', \'macs2\')\n File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 667, in run_script\n self.require(requires)[0].run_script(script_name, ns)\n File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1463, in run_script\n exec(code, namespace, namespace)\n File "/home/m03077yh/.local/lib/python3.8/site-packages/MACS2-2.2.7.1-py3.8-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 653, in <module>\n main()\n File "/home/m03077yh/.local/lib/python3.8/site-packages/MACS2-2.2.7.1-py3.8-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 49, in main\n from MACS2.callpeak_cmd import run\n File "/home/m03077yh/.local/lib/python3.8/site-packages/MACS2-2.2.7.1-py3.8-linux-x86_64.egg/MACS2/callpeak_cmd.py", line 23, in <module>\n from MACS2.OptValidator import opt_validate\n File "/home/m03077yh/.local/lib/python3.8/site-packages/MACS2-2.2.7.1-py3.8-linux-x86_64.egg/MACS2/OptValidator.py", line 20, in <module>\n from MACS2.IO.Parser import BEDParser, ELANDResultParser, ELANDMultiParser, \\\n File "__init__.pxd", line 242, in init MACS2.IO.Parser\nValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject\n'
Hi @m03077yhtnt
Can you try again after reinstalling numpy
pip install --upgrade numpy --force-reinstall
Best, Seppe
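A note for anyone reading the traceback above: the failing `macs2` script is launched from `/home/m03077yh/.local/bin/macs2` and imports MACS2 from a `python3.8` site-packages directory, not from the `scenicplus` conda environment. Reinstalling numpy inside the conda environment may therefore leave the numpy that `macs2` actually loads untouched. The following is a minimal sketch for checking which `macs2` is picked up from the notebook, using only the standard library; `describe_executable` is a hypothetical helper name and is not part of pycisTopic.

```python
import shutil
import sys


def describe_executable(name: str) -> dict:
    """Report where `name` resolves on PATH and which interpreter its launcher names.

    Hypothetical helper for illustration; not part of pycisTopic or MACS2.
    """
    resolved = shutil.which(name)
    info = {"name": name, "path": resolved, "shebang": None}
    if resolved is None:
        return info
    try:
        # Launcher scripts installed by pip usually start with a "#!" line
        # naming the Python interpreter they were installed for.
        with open(resolved, "rb") as handle:
            first_line = handle.readline(256)
    except OSError:
        return info
    if first_line.startswith(b"#!"):
        info["shebang"] = first_line[2:].decode("utf-8", errors="replace").strip()
    return info


if __name__ == "__main__":
    print("notebook interpreter:", sys.executable)
    print(describe_executable("macs2"))
```

If the reported path or shebang points outside the conda environment, passing the full path of an environment-local `macs2` as `macs_path` may be one way to avoid mixing the two Python installations.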
I have the same error. After upgrading numpy, there are more errors related to package incompatibility.
---------------------------------------------------------------------------
WorkerCrashedError Traceback (most recent call last)
Cell In[27], line 6
2 macs_path = "macs2"
4 os.makedirs(os.path.join(out_dir, "consensus_peak_calling/MACS"), exist_ok = True)
----> 6 narrow_peak_dict = peak_calling(
7 macs_path = macs_path,
8 bed_paths = bed_paths,
9 outdir = os.path.join(os.path.join(out_dir, "consensus_peak_calling/MACS")),
10 genome_size = 'hs',
11 n_cpu = 8,
12 input_format = 'BEDPE',
13 shift = 73,
14 ext_size = 146,
15 keep_dup = 'all',
16 q_value = 0.05,
17 _temp_dir = temp_dir
18 )
File ~/.conda/envs/scenicplus3.11/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py:286, in peak_calling(macs_path, bed_paths, outdir, genome_size, n_cpu, input_format, shift, ext_size, keep_dup, q_value, nolambda, skip_empty_peaks, **kwargs)
284 except Exception as e:
285 ray.shutdown()
--> 286 raise(e)
287 ray.shutdown()
288 else:
File ~/.conda/envs/scenicplus3.11/lib/python3.11/site-packages/pycisTopic/pseudobulk_peak_calling.py:264, in peak_calling(macs_path, bed_paths, outdir, genome_size, n_cpu, input_format, shift, ext_size, keep_dup, q_value, nolambda, skip_empty_peaks, **kwargs)
262 ray.init(num_cpus=n_cpu, **kwargs)
263 try:
--> 264 narrow_peaks = ray.get(
265 [
266 macs_call_peak_ray.remote(
267 macs_path,
268 bed_paths[name],
269 name,
270 outdir,
271 genome_size,
272 input_format,
273 shift,
274 ext_size,
275 keep_dup,
276 q_value,
277 nolambda,
278 skip_empty_peaks
279
280 )
281 for name in list(bed_paths.keys())
282 ]
283 )
284 except Exception as e:
285 ray.shutdown()
File ~/.conda/envs/scenicplus3.11/lib/python3.11/site-packages/ray/_private/auto_init_hook.py:22, in wrap_auto_init.<locals>.auto_init_wrapper(*args, **kwargs)
19 @wraps(fn)
20 def auto_init_wrapper(*args, **kwargs):
21 auto_init_ray()
---> 22 return fn(*args, **kwargs)
File ~/.conda/envs/scenicplus3.11/lib/python3.11/site-packages/ray/_private/client_mode_hook.py:103, in client_mode_hook.<locals>.wrapper(*args, **kwargs)
101 if func.__name__ != "init" or is_client_mode_enabled_by_default:
102 return getattr(ray, func.__name__)(*args, **kwargs)
--> 103 return func(*args, **kwargs)
File ~/.conda/envs/scenicplus3.11/lib/python3.11/site-packages/ray/_private/worker.py:2626, in get(object_refs, timeout)
2624 raise value.as_instanceof_cause()
2625 else:
-> 2626 raise value
2628 if is_individual_id:
2629 values = values[0]
WorkerCrashedError: The worker died unexpectedly while executing this task. Check python-core-worker-*.log files for more information.
Hi @cjiang310437
Is it feasible to try to run this code with one core?
Best,
Seppe
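A `WorkerCrashedError` from Ray hides whatever the worker printed before it died. One way to see the underlying message is to run the same `macs2` command outside Ray and read its stderr directly. The sketch below assumes only the standard library; `run_and_show_stderr` is a hypothetical helper name, and the command string should be the one shown in the "Calling peaks for ..." log line of your own run.

```python
import shlex
import subprocess
import sys


def run_and_show_stderr(command: str) -> int:
    """Run a shell-style command string without Ray and print its stderr on failure.

    Hypothetical helper for illustration; returns the process exit code,
    or 127 if the executable cannot be found.
    """
    try:
        completed = subprocess.run(
            shlex.split(command),
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError as error:
        print(f"executable not found: {error}", file=sys.stderr)
        return 127
    if completed.returncode != 0:
        # The real cause (for example an ImportError raised inside macs2)
        # normally appears here.
        print(completed.stderr, file=sys.stderr)
    return completed.returncode
```

For example, `run_and_show_stderr("macs2 callpeak --help")` should return 0 when `macs2` can import all of its compiled modules.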
Hi @SeppeDeWinter, thanks for replying. I tried upgrading numpy and running on one core, but neither worked, so the problem does not seem to be numpy or job parallelism. Here is the situation: I have been running scenicplus on our HPC cluster, and the code always worked fine. Recently, however, the cluster machines were upgraded from CentOS7 to RHEL9, and the code started failing with 'undefined_symbol: log_finite' errors on the RHEL9 machines. The error occurs in both the peak_calling (macs2) and perturbation (velocyto) steps. Do you have any idea how to resolve this?
The error from peak calling, using the scenicplus version downloaded from the main branch:
The error from plot_perturbation_effect_in_embedding, using the scenicplus version downloaded from the old branch:
Any suggestions would be appreciated. Thank you!
Best, Cheng
Hi @cjiang310437
Did you try reinstalling all packages after the OS change? Some packages might need to be recompiled for the new OS.
All the best,
Seppe
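"Undefined symbol" errors of this kind usually come from compiled extension modules that were built against the old system's libraries, so pure-Python packages are unlikely to be the cause. As a rough way to narrow down what to reinstall, the sketch below lists the installed distributions that ship shared-object files. It uses only `importlib.metadata` from the standard library; `distributions_with_compiled_extensions` is a hypothetical helper name, and the list it produces is a starting point, not a definitive diagnosis.

```python
from importlib import metadata


def distributions_with_compiled_extensions() -> dict:
    """Map distribution name -> number of shared-object files it installed.

    Hypothetical helper for illustration. Distributions that do not record
    their file list are skipped.
    """
    compiled = {}
    for dist in metadata.distributions():
        files = dist.files or []
        count = sum(
            1
            for path in files
            if path.suffix in (".so", ".pyd") or ".so." in path.name
        )
        if count:
            name = dist.metadata.get("Name", "unknown")
            compiled[name] = compiled.get(name, 0) + count
    return compiled


if __name__ == "__main__":
    for name, count in sorted(distributions_with_compiled_extensions().items()):
        print(f"{name}: {count} compiled file(s)")
```

Packages on this list can then be reinstalled in the environment on the new OS, for example with `pip install --force-reinstall --no-cache-dir <package>`, so that pip does not reuse wheels it built and cached on the old system.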