Error running atacworks train
Hi there,
I'm running atacworks on some Drosophila ATAC-seq data we had generated, and it's throwing an error at me!
atacworks train \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanbw ClkZT14_1.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanpeakfile ClkZT14_1.sort.bam.peaks.bed \
    --genome dm6.chrom.sizes \
    --val_chrom chr2R \
    --holdout_chrom chr3L \
    --out_home "./" \
    --exp_name "856_train" \
    --distributed
INFO:2022-05-22 17:50:19,291:AtacWorks-peak2bw] Reading input file
INFO:2022-05-22 17:50:19,295:AtacWorks-peak2bw] Read 15265 peaks.
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Adding score
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Writing peaks to bedGraph file
Discarding 0 entries outside sizes file.
INFO:2022-05-22 17:50:19,335:AtacWorks-peak2bw] Writing peaks to bigWig file ./856_train_2022.05.22_17.50/bigwig_peakfiles/ClkZT14_1.sort.bam.peaks.bed.bw
INFO:2022-05-22 17:50:19,364:AtacWorks-peak2bw] Done!
INFO:2022-05-22 17:50:19,367:AtacWorks-intervals] Generating training intervals
INFO:2022-05-22 17:50:20,831:AtacWorks-intervals] Generating val intervals
INFO:2022-05-22 17:50:20,840:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Read 1691 intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Selecting intervals with nonzero coverage
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in
Any thoughts on what could be going wrong?
@albertdyu, it looks like there's a problem reading the values for the 1680th training interval. If you look at the output files generated by this command, you should find an intervals/ folder containing a file with "train" in its name; that file should have 1691 rows. Check the 1680th row. It could be that this chromosome name is not included in your bigWig file, or that your bigWig file was produced from a genome assembly different from dm6.chrom.sizes.
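A quick way to run that cross-check is with pyBigWig (a hedged sketch, using the file names from this thread; adjust the paths to your own data):

import pyBigWig

sizes_file = "dm6.chrom.sizes"
bw_file = "ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw"

# chrom.sizes is two whitespace-separated columns: chromosome name, length
with open(sizes_file) as f:
    sizes = {name: int(length) for name, length in (line.split() for line in f if line.strip())}

bw = pyBigWig.open(bw_file)
bw_chroms = bw.chroms()  # dict of chromosome name -> length stored in the bigWig header

for chrom, length in sizes.items():
    if chrom not in bw_chroms:
        print(chrom, "is in the sizes file but missing from the bigWig")
    elif bw_chroms[chrom] != length:
        print(chrom, "sizes file says", length, "but the bigWig says", bw_chroms[chrom])
bw.close()

Any mismatch printed there suggests the sizes file and the bigWig came from different assemblies.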
That did the trick! I accidentally used a sizes file with slightly different chromosome sizes relative to the assembly I mapped to - good catch! Thank you!
I was able to successfully train a model, but now I'm having some trouble with denoising, haha...
atacworks denoise \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --genome gfachrome.sizes \
    --weights_path ./856_train_latest/model_best.pth.tar \
    --out_home "./" \
    --exp_name "856_ZT14_2_denoise" \
    --distributed \
    --num_workers 0
INFO:2022-05-22 19:19:49,827:AtacWorks-intervals] Generating intervals tiling across all chromosomes in sizes file: gfachrome.sizes
INFO:2022-05-22 19:19:49,841:AtacWorks-intervals] Done!
INFO:2022-05-22 19:19:49,841:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Read 2747 intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Writing data in 3 batches.
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Extracting data for each batch and writing to h5 file
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] batch 0 of 3
INFO:2022-05-22 19:19:58,719:AtacWorks-bw2h5] Done! Saved to ./856_ZT14_2_denoise_2022.05.22_19.19/bw2h5/ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw.denoise.h5
INFO:2022-05-22 19:19:58,719:AtacWorks-main] Checking input files for compatibility
Building model: resnet ...
Loading model weights from ./856_train_latest/model_best.pth.tar...
Finished loading.
Finished building.
Inference -------------------- [ 0/2747]
Inference -------------------- [ 50/2747]
Inference #------------------- [ 100/2747]
Inference #------------------- [ 150/2747]
Inference #------------------- [ 200/2747]
Inference ##------------------ [ 250/2747]
Inference ##------------------ [ 300/2747]
Inference ###----------------- [ 350/2747]
Inference ###----------------- [ 400/2747]
Inference ###----------------- [ 450/2747]
Inference ####---------------- [ 500/2747]
Inference ####---------------- [ 550/2747]
Inference ####---------------- [ 600/2747]
Inference #####--------------- [ 650/2747]
Inference #####--------------- [ 700/2747]
Inference #####--------------- [ 750/2747]
Inference ######-------------- [ 800/2747]
Inference ######-------------- [ 850/2747]
Inference #######------------- [ 900/2747]
Inference #######------------- [ 950/2747]
Inference #######------------- [1000/2747]
Inference ########------------ [1050/2747]
Inference ########------------ [1100/2747]
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in
sys.exit(main())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 569, in main
worker(args.gpu_idx, ngpus_per_node, args, res_queue)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/worker.py", line 290, in infer_worker
pad=args.pad)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/dl4atac/infer.py", line 80, in infer
res_queue.put((idxes, batch_res))
File "", line 2, in put
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 772, in _callmethod
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 228, in serve_client
request = recv()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
fd = df.detach()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
return reduction.recv_handle(conn)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
return recvfds(s, 1)[0]
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 161, in recvfds
len(ancdata))
RuntimeError: received 0 items of ancdata
Process Process-2:
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 217, in writer
if not res_queue.empty():
File "
It writes a half-finished bedGraph file before crashing.
I'm running CUDA 11.6, PyTorch 1.7.1, and Python 3.6.7.
Any thoughts? I appreciate your help!
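For anyone else hitting the same wall: the RemoteError above is raised while torch.multiprocessing passes result tensors through the queue, and "RuntimeError: received 0 items of ancdata" is a known PyTorch symptom of exhausting shared file descriptors. This is an assumption on my part rather than a confirmed fix for AtacWorks, but the usual workarounds are to raise the shell's open-file limit (ulimit -n) before launching, or to switch PyTorch's tensor sharing strategy, which looks like this:

# Hedged workaround sketch for the torch.multiprocessing file-descriptor
# exhaustion symptom; not confirmed for AtacWorks specifically. This makes
# PyTorch share tensors through the filesystem instead of file descriptors,
# and would need to run in the process before inference starts.
import torch.multiprocessing as mp
mp.set_sharing_strategy('file_system')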