CHEUI
CHEUI_preprocess_m5C.py gives an error
I successfully ran CHEUI_preprocess_m6A.py, but when I ran the CHEUI_preprocess_m5C.py script the error below occurred. Can you please help resolve this issue? Thanks in advance.
Here is the command which I used:
nohup python3 /home/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py -i /Drive7/CHEUI/nanopolish_out.txt -m /home/apps/CHEUI/kmer_models/model_kmer.csv -o out_C_signals -n 35 &
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 482, in parse_nanopolish
counter)
File "/home/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 160, in _parse_kmers
samples = [float(i) for i in checked_line[samples_idx].split(',')]
File "/home/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 160, in
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 560, in
Hi,
Sorry about the issue. Can you please share a few lines of the nanopolish input file you used? Also, we recommend using the C++ version for faster preprocessing.
Thanks, Akanksha
Here is the input file for your reference:
contig position reference_kmer read_name strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx samples
gene1 452 TTTAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 2 81.02 1.173 0.00398 TTTAA 84.44 2.46 -1.18 20956 20968 81.951,82.6334,82.0875,79.085,79.9039,80.3133,81.2686,82.0875,80.1768,79.2215,81.4051,82.0875
gene1 452 TTTAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 3 83.74 1.112 0.00498 TTTAA 84.44 2.46 -0.24 20941 20956 83.8617,81.8145,84.1346,81.8145,82.9063,85.0899,83.8617,82.3604,83.4522,83.1793,84.1346,85.0899,84.544,85.4994,84.4076
gene1 452 TTTAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 4 85.62 1.945 0.02590 TTTAA 84.44 2.46 0.41 20863 20941 87.2736,84.817,86.5912,87.1371,86.1817,86.1817,84.1346,85.6358,85.6358,85.6358,85.7723,85.6358,85.9088,85.6358,87.0006,88.5018,86.0453,85.2264,86.4547,86.5912,84.9535,84.9535,85.4994,84.4076,86.3182,94.9162,82.7699,85.9088,83.3158,83.5887,87.0006,88.2289,86.1817,85.9088,76.0826,84.6805,85.7723,85.2264,86.3182,84.1346,86.7276,84.9535,83.9981,85.2264,83.8617,85.4994,85.4994,83.7252,83.9981,83.5887,85.4994,86.5912,84.4076,84.1346,86.8641,84.9535,83.8617,83.9981,87.9559,85.9088,85.9088,86.5912,84.6805,86.8641,86.5912,85.7723,85.4994,86.8641,85.7723,85.9088,85.0899,87.5465,86.4547,88.6383,86.0453,83.1793,86.4547,85.2264
gene1 452 TTTAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 5 84.54 0.925 0.00232 TTTAA 84.44 2.46 0.04 20856 20863 85.3629,83.9981,82.9063,85.6358,83.8617,84.6805,85.3629
gene1 453 TTAAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 6 94.67 3.645 0.01029 TTAAA 92.05 3.04 0.73 20825 20856 95.735,95.735,93.4149,93.5514,94.0973,95.735,96.5539,97.3727,95.735,96.8268,98.1916,95.8715,95.3256,95.3256,98.1916,76.492,94.5067,93.9609,91.9137,95.4621,94.3703,95.4621,96.6904,95.1891,92.869,95.1891,97.7822,92.5961,94.7797,95.0527,94.7797
gene1 453 TTAAA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 7 96.60 1.775 0.00365 TTAAA 92.05 3.04 1.27 20814 20825 95.0527,95.0527,94.7797,95.735,98.874,98.3281,95.4621,94.0973,98.601,98.601,98.0551
gene1 454 TAAAT 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 8 97.39 1.404 0.00432 TAAAT 108.51 2.68 -3.53 20801 20814 97.5092,96.1445,97.7822,95.4621,95.1891,97.2363,99.4199,99.0104,97.0998,98.0551,96.9633,96.2809,99.9658
gene1 455 AAATG 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 9 110.97 3.286 0.02722 AAATG 110.67 3.11 0.08 20719 20801 104.06,97.2363,114.705,108.427,111.839,113.886,112.112,115.66,110.747,113.75,113.477,112.931,115.933,113.75,111.293,111.976,111.976,112.522,110.201,101.058,113.067,105.971,112.931,111.43,116.206,102.695,116.889,110.611,108.837,108.837,115.66,113.75,111.703,108.837,113.204,114.705,110.065,111.703,109.519,109.519,103.378,113.34,112.385,107.881,114.159,111.839,113.477,109.11,111.02,110.747,112.794,114.296,111.43,108.973,110.065,111.293,109.792,110.884,112.249,111.43,113.75,112.794,112.249,108.291,114.023,110.474,110.884,113.204,106.653,114.432,110.884,112.658,109.792,109.792,109.656,109.519,109.656,109.792,110.338,109.519,108.291,108.837
gene1 456 AATGC 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 10 80.66 7.228 0.00764 AATGC 81.80 3.87 -0.25 20696 20719 72.8071,78.9485,75.6731,80.7227,72.5342,69.1223,78.8121,86.7276,69.6682,79.7674,75.9461,94.2338,77.0379,81.1321,79.6309,75.8096,88.7748,82.3604,84.2711,99.8293,85.4994,79.4944,86.4547
gene1 456 AATGC 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 11 82.82 1.951 0.00432 AATGC 81.80 3.87 0.22 20683 20696 82.7699,85.9088,81.5416,85.0899,82.9063,85.4994,83.7252,82.6334,81.2686,79.2215,83.8617,79.9039,82.3604
gene1 456 AATGC 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 12 79.87 3.572 0.00299 AATGC 81.80 3.87 -0.42 20674 20683 78.5391,86.0453,73.7625,79.9039,77.3108,82.7699,76.3555,81.5416,82.6334
gene1 457 ATGCA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 13 84.56 2.839 0.00598 ATGCA 84.45 3.13 0.03 20656 20674 85.9088,83.3158,80.9957,89.3207,89.5936,84.6805,87.5465,83.4522,77.3108,85.0899,84.1346,86.3182,81.951,86.1817,83.7252,82.4969,84.817,85.2264
gene1 457 ATGCA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 14 82.56 3.086 0.01029 ATGCA 84.45 3.13 -0.51 20625 20656 80.4498,84.2711,84.6805,85.6358,79.7674,79.4944,76.492,77.4473,83.3158,80.7227,85.4994,86.1817,83.0428,84.6805,81.5416,82.3604,75.9461,86.1817,82.0875,82.2239,84.2711,79.7674,86.5912,90.276,84.817,81.6781,82.7699,84.4076,81.6781,80.1768,80.8592
gene1 457 ATGCA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 15 87.31 2.366 0.00332 ATGCA 84.45 3.13 0.78 20615 20625 88.3653,89.1842,91.7772,88.2289,83.8617,85.2264,87.0006,83.7252,87.2736,88.5018
gene1 457 ATGCA 3a40d16c-e4fe-4e28-bc3c-7d31eabfdeb7 t 16 79.71 3.939 0.00232 ATGCA 84.45 3.13 -1.29 20608 20615 72.124
--More--(0%)
Thanks in advance for resolving the issue.
Thanks, the format of the nanopolish file looks OK. Can you please try two things:
- Run the preprocessing code on the test file (https://github.com/comprna/CHEUI/blob/master/test/nanopolish_output_test.txt) to check that the installation is correct.
- Try the C++ version of the preprocessing.
Otherwise, we might have to find a way to share the nanopolish file to debug.
Thanks, Akanksha
1- Run the preprocessing code on the test file:
The test run completed successfully with no errors. I used the command:
nohup ./CHEUI -i /home/aclab/apps/CHEUI/test/nanopolish_output_test.txt -o /home/aclab/apps/CHEUI/scripts/preprocessing_CPP/test/out_C_signals+IDs.p/ -m /home/aclab/apps/CHEUI/kmer_models/model_kmer.csv -n 50 --m5C &
2- Try the C++ version of the preprocessing on my own file:
nohup ./CHEUI -i /Drive7/nanopolish_out.txt -o /Drive7/out_C_signals+IDs.p/ -m /home/aclab/apps/CHEUI/kmer_models/model_kmer.csv -n 50 --m5C &
But it also gives the following error: terminate called after throwing an instance of 'std::invalid_argument' what(): stof
Please sort out this issue if possible; I want to run this pipeline. Thanks in advance.
Thank you so much for running the two tests. Do you get the error message after some temp files are created, or straight away? We can look into it at our end if you are happy to share the nanopolish file, though I know it can be huge. Email: [email protected] Thanks, Akanksha
Do you get the error message after some temp files are created or is it straight away?
The temp files (51 in number) were created, and the main output file (nanopolish_out_signals+IDS.p) also contains some data.
Below is the output printed before the error occurs:
2500000 processed lines
3500000 processed lines
4000000 processed lines
5500000 processed lines
500000 processed lines
500000 processed lines
1000000 processed lines
5000000 processed lines
2000000 processed lines
3500000 processed lines
3500000 processed lines
5000000 processed lines
6000000 processed lines
6500000 processed lines
2000000 processed lines
2500000 processed lines
3500000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
5500000 processed lines
6000000 processed lines
500000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
7000000 processed lines
2000000 processed lines
2500000 processed lines
5500000 processed lines
6500000 processed lines
2000000 processed lines
2500000 processed lines
3500000 processed lines
5500000 processed lines
7000000 processed lines
1000000 processed lines
2000000 processed lines
3500000 processed lines
4500000 processed lines
6000000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
4000000 processed lines
1500000 processed lines
2000000 processed lines
4000000 processed lines
4500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
3000000 processed lines
3500000 processed lines
5500000 processed lines
6000000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
6500000 processed lines
500000 processed lines
1000000 processed lines
3000000 processed lines
3500000 processed lines
7000000 processed lines
1500000 processed lines
2000000 processed lines
6000000 processed lines
4500000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
7000000 processed lines
500000 processed lines
1500000 processed lines
2000000 processed lines
4000000 processed lines
7000000 processed lines
1500000 processed lines
3000000 processed lines
6000000 processed lines
1000000 processed lines
2000000 processed lines
4500000 processed lines
5500000 processed lines
6000000 processed lines
500000 processed lines
2500000 processed lines
5000000 processed lines
500000 processed lines
1000000 processed lines
2500000 processed lines
3500000 processed lines
5000000 processed lines
5000000 processed lines
6000000 processed lines
4000000 processed lines
5000000 processed lines
7000000 processed lines
1500000 processed lines
2500000 processed lines
3500000 processed lines
4000000 processed lines
6000000 processed lines
6500000 processed lines
7000000 processed lines
1500000 processed lines
2000000 processed lines
4500000 processed lines
6500000 processed lines
1500000 processed lines
2000000 processed lines
2500000 processed lines
3500000 processed lines
4000000 processed lines
5000000 processed lines
5500000 processed lines
6500000 processed lines
2500000 processed lines
3500000 processed lines
4000000 processed lines
6500000 processed lines
1000000 processed lines
1500000 processed lines
2500000 processed lines
4000000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1500000 processed lines
2500000 processed lines
7000000 processed lines
500000 processed lines
2000000 processed lines
2500000 processed lines
5000000 processed lines
5500000 processed lines
5500000 processed lines
6500000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
4500000 processed lines
5000000 processed lines
6000000 processed lines
1500000 processed lines
4000000 processed lines
6500000 processed lines
2500000 processed lines
6000000 processed lines
7000000 processed lines
500000 processed lines
2000000 processed lines
3000000 processed lines
4000000 processed lines
7000000 processed lines
1500000 processed lines
3000000 processed lines
3500000 processed lines
5000000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
3500000 processed lines
4500000 processed lines
5000000 processed lines
6000000 processed lines
1500000 processed lines
1500000 processed lines
2500000 processed lines
3000000 processed lines
4000000 processed lines
4500000 processed lines
5000000 processed lines
5500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
6000000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
4000000 processed lines
4500000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
7000000 processed lines
3000000 processed lines
3500000 processed lines
5000000 processed lines
5500000 processed lines
6500000 processed lines
500000 processed lines
1500000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
4500000 processed lines
5500000 processed lines
6000000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1500000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
4000000 processed lines
5000000 processed lines
6000000 processed lines
500000 processed lines
1500000 processed lines
3500000 processed lines
500000 processed lines
1000000 processed lines
1500000 processed lines
1500000 processed lines
3500000 processed lines
4000000 processed lines
5500000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
3000000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
500000 processed lines
2000000 processed lines
3000000 processed lines
3500000 processed lines
4000000 processed lines
6000000 processed lines
6500000 processed lines
1000000 processed lines
1500000 processed lines
2500000 processed lines
3000000 processed lines
5500000 processed lines
500000 processed lines
1500000 processed lines
4500000 processed lines
6000000 processed lines
6500000 processed lines
7000000 processed lines
1000000 processed lines
1500000 processed lines
4500000 processed lines
6000000 processed lines
500000 processed lines
1000000 processed lines
1500000 processed lines
3000000 processed lines
3500000 processed lines
5500000 processed lines
6500000 processed lines
1500000 processed lines
7000000 processed lines
500000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
6500000 processed lines
500000 processed lines
1000000 processed lines
4000000 processed lines
6000000 processed lines
2500000 processed lines
3500000 processed lines
4000000 processed lines
5000000 processed lines
5500000 processed lines
6500000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
4000000 processed lines
5500000 processed lines
5500000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
1500000 processed lines
3000000 processed lines
3500000 processed lines
4000000 processed lines
5000000 processed lines
5500000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
4000000 processed lines
4500000 processed lines
5000000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
4000000 processed lines
4500000 processed lines
6000000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
5500000 processed lines
6500000 processed lines
3500000 processed lines
4000000 processed lines
5500000 processed lines
6000000 processed lines
6500000 processed lines
1000000 processed lines
2000000 processed lines
2500000 processed lines
6000000 processed lines
6500000 processed lines
2500000 processed lines
3500000 processed lines
5000000 processed lines
5500000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
6000000 processed lines
6500000 processed lines
7000000 processed lines
1500000 processed lines
2000000 processed lines
3000000 processed lines
3500000 processed lines
5500000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
4500000 processed lines
5000000 processed lines
6000000 processed lines
7000000 processed lines
1000000 processed lines
2000000 processed lines
3500000 processed lines
4000000 processed lines
500000 processed lines
1000000 processed lines
1500000 processed lines
2000000 processed lines
2500000 processed lines
6000000 processed lines
6500000 processed lines
2000000 processed lines
2500000 processed lines
3000000 processed lines
3500000 processed lines
4000000 processed lines
4500000 processed lines
5000000 processed lines
6000000 processed lines
500000 processed lines
3000000 processed lines
4500000 processed lines
5500000 processed lines
6500000 processed lines
7000000 processed lines
500000 processed lines
1000000 processed lines
4000000 processed lines
4500000 processed lines
7000000 processed lines
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/aclab/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 482, in parse_nanopolish
counter)
File "/home/aclab/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 160, in _parse_kmers
samples = [float(i) for i in checked_line[samples_idx].split(',')]
File "/home/aclab/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 160, in
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/aclab/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 560, in
Thanks. Just wondering, is it possible to run the script without multiprocessing (n=1)?
Also, can you please share the number of lines in your nanopolish file?
Thanks, Akanksha
The number of lines in my nanopolish file is 254747835.
As you suggested, I ran the script with n=1, but it also did not complete:
500000 processed lines
4000000 processed lines
4500000 processed lines
5000000 processed lines
5500000 processed lines
6000000 processed lines
7000000 processed lines
7500000 processed lines
8000000 processed lines
9500000 processed lines
13000000 processed lines
14000000 processed lines
15000000 processed lines
15500000 processed lines
16000000 processed lines
16500000 processed lines
17000000 processed lines
18000000 processed lines
19500000 processed lines
21500000 processed lines
22000000 processed lines
23000000 processed lines
24000000 processed lines
25500000 processed lines
27000000 processed lines
28500000 processed lines
29000000 processed lines
29500000 processed lines
30000000 processed lines
31000000 processed lines
31500000 processed lines
32000000 processed lines
34000000 processed lines
34500000 processed lines
35000000 processed lines
37500000 processed lines
38000000 processed lines
39000000 processed lines
40000000 processed lines
41500000 processed lines
42500000 processed lines
43500000 processed lines
45000000 processed lines
47000000 processed lines
48000000 processed lines
48500000 processed lines
49000000 processed lines
49500000 processed lines
51000000 processed lines
54000000 processed lines
54500000 processed lines
55000000 processed lines
56000000 processed lines
56500000 processed lines
57500000 processed lines
60500000 processed lines
64000000 processed lines
64500000 processed lines
65500000 processed lines
66500000 processed lines
67000000 processed lines
68000000 processed lines
68500000 processed lines
70000000 processed lines
70500000 processed lines
71000000 processed lines
74000000 processed lines
76000000 processed lines
76500000 processed lines
77000000 processed lines
77500000 processed lines
78500000 processed lines
80000000 processed lines
80500000 processed lines
81500000 processed lines
83500000 processed lines
84500000 processed lines
85000000 processed lines
85500000 processed lines
86000000 processed lines
86500000 processed lines
87500000 processed lines
89500000 processed lines
90000000 processed lines
90500000 processed lines
94000000 processed lines
96000000 processed lines
97500000 processed lines
98000000 processed lines
98500000 processed lines
100000000 processed lines
102000000 processed lines
103000000 processed lines
104000000 processed lines
106500000 processed lines
107500000 processed lines
109000000 processed lines
110000000 processed lines
110500000 processed lines
112000000 processed lines
112500000 processed lines
113500000 processed lines
114500000 processed lines
115500000 processed lines
116500000 processed lines
117000000 processed lines
118500000 processed lines
119000000 processed lines
121000000 processed lines
122500000 processed lines
123000000 processed lines
124500000 processed lines
125500000 processed lines
126500000 processed lines
127000000 processed lines
127500000 processed lines
128500000 processed lines
130000000 processed lines
130500000 processed lines
133000000 processed lines
134000000 processed lines
136000000 processed lines
138500000 processed lines
139000000 processed lines
139500000 processed lines
140500000 processed lines
141500000 processed lines
142000000 processed lines
143500000 processed lines
144500000 processed lines
145000000 processed lines
145500000 processed lines
148500000 processed lines
149000000 processed lines
150000000 processed lines
151000000 processed lines
151500000 processed lines
154000000 processed lines
154500000 processed lines
155000000 processed lines
155500000 processed lines
157000000 processed lines
157500000 processed lines
160000000 processed lines
160500000 processed lines
162000000 processed lines
167000000 processed lines
167500000 processed lines
169000000 processed lines
172000000 processed lines
172500000 processed lines
173000000 processed lines
174500000 processed lines
175000000 processed lines
175500000 processed lines
178000000 processed lines
179000000 processed lines
180000000 processed lines
182500000 processed lines
184000000 processed lines
185000000 processed lines
188000000 processed lines
190500000 processed lines
190500000 processed lines
191500000 processed lines
192000000 processed lines
192500000 processed lines
195000000 processed lines
196000000 processed lines
197000000 processed lines
197500000 processed lines
198000000 processed lines
199500000 processed lines
200500000 processed lines
201500000 processed lines
202500000 processed lines
202500000 processed lines
205500000 processed lines
208000000 processed lines
211500000 processed lines
214000000 processed lines
214500000 processed lines
216000000 processed lines
216500000 processed lines
219000000 processed lines
220000000 processed lines
222500000 processed lines
223500000 processed lines
224500000 processed lines
225000000 processed lines
226000000 processed lines
226500000 processed lines
227500000 processed lines
229000000 processed lines
231500000 processed lines
233000000 processed lines
233500000 processed lines
234000000 processed lines
235000000 processed lines
235500000 processed lines
239000000 processed lines
239500000 processed lines
240500000 processed lines
241000000 processed lines
242000000 processed lines
244000000 processed lines
245000000 processed lines
246000000 processed lines
246500000 processed lines
247000000 processed lines
248000000 processed lines
248000000 processed lines
248500000 processed lines
249500000 processed lines
250000000 processed lines
251000000 processed lines
252000000 processed lines
253000000 processed lines
253500000 processed lines
Traceback (most recent call last):
File "/home/aclab/apps/CHEUI/scripts/CHEUI_preprocess_m5C.py", line 555, in
I ran the command with nohup; I don't think that is the issue. Is my nanopolish file too large for the command to run?
Thanks. Yes, running in the background should be fine. I think it gets stuck somewhere between line 253500000 and the end of your file, which is line 254747835. Can you please copy the lines from 253000000 to the end of your file into another file and share it with me? It seems that either the nanopolish file is corrupt or the preprocessing script is missing a check to handle a special case that might be present in the nanopolish file.
Also, the output file that you got from the above run might be useful if you can share it.
Thanks, Akanksha
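Extracting the tail of a file this large can be sketched in Python as below. This is only an illustration of the request above, not part of CHEUI: the function name `copy_tail` and the output file name in the usage note are hypothetical, and the file is read in binary mode so that a corrupt line is copied unchanged.

```python
from itertools import islice


def copy_tail(src_path, dst_path, start_line):
    """Copy lines start_line..EOF (1-based, inclusive) from src to dst.

    Returns the number of lines copied. The file is streamed line by
    line, so memory use stays small even for very large inputs.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # islice skips the first start_line - 1 lines without storing them.
        for line in islice(src, start_line - 1, None):
            dst.write(line)
            copied += 1
    return copied
```

For the file in this thread, a call such as `copy_tail("/Drive7/nanopolish_out.txt", "nanopolish_tail.txt", 253000000)` would write the last ~1.7 million lines to a much smaller file that is easier to share.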
Could you please take a moment to review the nanopolish output file I shared on May 4th? I'd appreciate it if you could identify any issues or problems. Thank you in advance!
Hi, sorry about the delay in replying. If you look at the last two lines of your file, they look weird:
```
1747834 gene9 616.0 GCTGA c9ee63a7-28b7-40a9-a743-c0eccfba7f73 t 127.0 81.03 2.141 0.00996 GCTGA 89.96 2.85 -2.77 27749.0 27779.0 81.3491,85.069,81.2113,80.2469,79.1447,78.0425,78.4558,82.4513,80.1091,79.4202,80.3847,80.2469,82.0379,82.4513,75.9759,82.0379,79.9713,83.829,82.0379,87.5489,78.[post-run summary] total reads: 384066, unparseable: 0, qc fail: 6183, could not calibrate: 415, no alignment: 266, bad fast5: 0
1747835 0425,79.558,82.1757,80.9358,80.798,80.3847,81.9002,82.1757,81.0735,81.7624
```
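That's where the preprocessing code gets stuck: the nanopolish post-run summary was written into the middle of the samples column, so those values can no longer be parsed as numbers. If you remove these two lines, the issue should be resolved.
Also, if you already have the output from the preprocessing code, I think it should be fine to run the next steps.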
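Lines like these can be located before preprocessing with a small scan. The sketch below is an illustration, not CHEUI code: `find_bad_lines` is a hypothetical helper, and it assumes the eventalign file is tab-separated with a header row that contains a `samples` column, as in the excerpt posted earlier in this thread.

```python
def find_bad_lines(path, sep="\t"):
    """Yield (line_number, reason) for rows that would break float parsing.

    Assumes a header row naming a 'samples' column; line numbers are
    1-based and count the header as line 1.
    """
    with open(path, "r", errors="replace") as handle:
        header = handle.readline().rstrip("\n").split(sep)
        n_cols = len(header)
        samples_idx = header.index("samples")
        for line_no, line in enumerate(handle, start=2):
            fields = line.rstrip("\n").split(sep)
            if len(fields) != n_cols:
                yield line_no, "expected %d columns, found %d" % (n_cols, len(fields))
                continue
            try:
                # Same conversion that fails in the traceback above.
                [float(x) for x in fields[samples_idx].split(",")]
            except ValueError as err:
                yield line_no, "non-numeric sample: %s" % err
```

Printing the results of `find_bad_lines("/Drive7/nanopolish_out.txt")` should list the offending rows, which can then be removed from a copy of the file.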
Thanks,
Akanksha
When I run the CHEUI model 1 code with nohup and then open the nohup output file, it shows the message below. I want to know whether my file was processed completely. This message occurs for both the m6A and the m5C command:
Ran out of input
All signals have been processed 44982579
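For context, "Ran out of input" is the text of the `EOFError` that Python's `pickle.load` raises when it is called at the end of a file. A reader that loads records one after another until that error appears would finish with a message of this kind. The sketch below shows the general pattern only; `count_pickled_records` is a hypothetical name, and whether CHEUI reads its `.p` file exactly this way is an assumption.

```python
import pickle


def count_pickled_records(path):
    """Load pickled objects back to back until the stream is exhausted.

    Returns (number_of_records, message_of_the_final_EOFError).
    """
    count = 0
    with open(path, "rb") as handle:
        while True:
            try:
                pickle.load(handle)
            except EOFError as err:
                # Raised when no bytes are left, i.e. a clean end of file.
                return count, str(err)
            count += 1
```

Under this reading, the message marks the normal end of the input rather than a failure.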
I then used the file produced by the read-level detection code as input for site-level detection.
Below is the output file for m6A site-level detection. I want to understand the output of the CHEUI model 2 code: if the second column is the m6A site, then the first row says position 1495 is an m6A site, but position 1495 in my sequence does not contain an m6A site. Likewise, position 1541 does not contain an m6A site either.
transcript.fa 1495 TAGCAGGAA 161 0.30434782608695654 0.30759284
transcript.fa 1502 AACTACTAG 167 0.10256410256410256 0.12798941
transcript.fa 1505 TACTAGTAC 158 0.36904761904761907 0.43889725
transcript.fa 1508 TAGTACCCT 149 0.16393442622950818 0.39143988
transcript.fa 1522 AACAAATAG 108 0.1368421052631579 0.2662186
transcript.fa 1523 ACAAATAGG 108 0.13402061855670103 0.49044427
transcript.fa 1525 AAATAGGAT 112 0.1625 0.4378153
transcript.fa 1539 ACACATAAT 161 0.547945205479452 0.5977786
transcript.fa 1541 ACATAATCC 156 0.3389830508474576 0.86891454
transcript.fa 1542 CATAATCCA 153 0.23622047244094488 0.74543554
transcript.fa 1546 ATCCACCTA 157 0.40963855421686746 0.49316177
transcript.fa 1555 TCCCAGTAG 134 0.17708333333333334 0.18778525
transcript.fa 1558 CAGTAGGAG 131 0.2948717948717949 0.54169106
Thanks in advance
Can you please help resolve this issue? Thanks in advance.
Hi, if it says "All signals have been processed", that means it is complete. The position column gives the position of the first nucleotide of the 9-mer sequence, so to get the position of the center nucleotide you need to add 5. I hope this helps.
Thanks, Akanksha
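The conversion described above can be sketched as follows. The offset of 5 is taken from the reply; the helper name `center_of_kmer` is hypothetical, and which coordinate convention the result follows (0-based or 1-based) should be checked against your reference sequence. The two rows are copied from the site-level output posted earlier.

```python
def center_of_kmer(position, kmer, offset=5):
    """Return (center_position, center_base) for one site-level row.

    position is the value in the second output column (first nucleotide
    of the 9-mer); offset=5 follows the maintainer's reply.
    """
    if len(kmer) != 9:
        raise ValueError("expected a 9-mer, got %r" % kmer)
    # The middle (fifth) base of the 9-mer is the candidate modified site.
    return position + offset, kmer[len(kmer) // 2]


rows = [
    ("transcript.fa", 1495, "TAGCAGGAA"),
    ("transcript.fa", 1541, "ACATAATCC"),
]
for contig, position, kmer in rows:
    center, base = center_of_kmer(position, kmer)
    print(contig, position, kmer, "->", center, base)
```

For both rows the middle base of the 9-mer is an A, which is the nucleotide the m6A prediction refers to, at positions 1500 and 1546 respectively.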