bedmethyl to bigWig: thread 'main' panicked at
Hello,
modkit version: v0.4.4
I'm trying to do a Bigwig conversion, but I got an error message.
thread 'main' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bigtools-0.5.5/src/bbi/bbiwrite.rs:819:14:
Expected to always send.: TrySendError { kind: Disconnected }
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bigtools-0.5.5/src/bbi/bbiwrite.rs:1321:88:
called Result::unwrap() on an Err value: JoinError::Cancelled(Id(68711))
thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bigtools-0.5.5/src/bbi/bbiwrite.rs:1321:88:
called Result::unwrap() on an Err value: JoinError::Cancelled(Id(68713))
CMD
modkit bedmethyl tobigwig --negative-strand-values output/06_modification_analysis_output/00_converted_bigwig/mock_2_modified_base_count_sort.bed --mod-codes a -g ref/homo_sapiens.GRCh38.109.cdna.fasta.fai output/06_modification_analysis_output/00_converted_bigwig/mock_2_modified_base_count_sort.bigWig
More details, the process is not initially broken, I obtained the intermediate output file of 19M, then the file gone ! 2.9G Aug 8 14:11 mock_1_modified_base_count_sort.bed 19M Aug 8 15:23 mock_1_modified_base_count_sort.bigWig
Thank you for your help. Chuang
Hello @Chuang1118,
Sorry about the crash. Could you run the commend with --log $log_filename and attach it here?
I have other pipeline using modkit bedmethyl tobigwig to get genomic coordinates bigwig instead of transcriptome bigwig
1/ conversion transcript coordinates to genomics
2/ sorting
3/ to bigwig
I got the error message:
thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bigtools-0.5.5/src/bbi/bbiwrite.rs:1334:14:
Couldn't send section.: "SendError(..)"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
> Error! duplicated record at 1 944219 944220 a 15 - 944219 944220 255,0,0 15 26.67 4 11 0 0 6 0 0
There are many transcript coordinates have same genomic coordinates e.g.
ENST00000327044 2741 2742 a 116 + 2741 2742 255,0,0 116 29.31 34 82 0 1 82 0 0
ENST00000477976 4187 4188 a 15 + 4187 4188 255,0,0 15 26.67 4 11 0 0 6 0 0
both transcripts map to chr1 944219 -
log file/ CMD
[src/logging.rs::60][2025-08-12 17:09:46][DEBUG] command line: modkit bedmethyl tobigwig --negative-strand-values output/08_methylbed2bigwig_analysis_output/03_reformated_modified_sites/mock_1_modified_base_count_filtered_genomic_reformated_sorted.bed --mod-codes a -g ref/homo_sapiens.GRCh38.109.fasta.fai output/08_methylbed2bigwig_analysis_output/04_converted_bigwig/mock_1_modified_base_count_filtered_genomic_reformated_sorted.bigWig --log-filepath output/08_methylbed2bigwig_analysis_output/04_converted_bigwig/mock_1_bigWig.log
Thank you for your time.
Hello @Chuang1118,
I think the problems you're seeing here are due to the duplicated bedMethyl records. Modkit shouldn't crash the way it is now, it should handle this case more gracefully. Do you think it would be possible to send me a small bedMethyl that reproduces the crash?
To the larger problem at hand, you're trying to do something valid that's a little tricky (and correct me if I'm misunderstanding): map your reads to the transcriptome, then visualize the pileup on the genome. You could map the reads directly to the genome, but I'm assuming you'd rather have the transcript assignments from the transcriptome-mapping (correct?). When you take the transcriptome pileup and lift-over the records to genome coordinates, you get multiple records for a given genomic position, which isn't really a valid bedMethyl. A couple ideas come to mind: (1) you could sum the overlapping records together, (2) you could lift-over the reads in the BAM and assign a tag to the separate transcripts that overlap in genome-coordinates, then do a pileup for each tag (or use --partition-tag).
I don't know of a tool that does either of these things very easily. Maybe Modkit should be that tool? It doesn't have that functionality right now. Could you describe your use case/pipeline and maybe I can come up with some features to add?
Thanks.
Hello ArtRand,
For reminder, I used modkit version: v0.4.4
Yes, I am agree with you. From a biological perspective, 1 gene can produce multiple different transcripts, hence the duplicated bedMethyl records.
In my case, see attached mock_1_modified_base_count_filtered_genomic_reformatted_sorted.txt
cat mock_1_modified_base_count_filtered_genomic_reformatted_sorted.bed | grep 944219
1 944219 944220 a 116 - 2741 2742 255,0,0 116 29.31 34 82 0 1 82 0 0
1 944219 944220 a 15 - 4187 4188 255,0,0 15 26.67 4 11 0 0 6 0 0
Modkit shouldn't crash the way it is now, it should handle this case more gracefully.
Current version modkit (v0.5.0) can handle this issue ?
My goal is to visualize the transcriptome pileup on the genome. because there is no transcription browser available like igv and my input is naive RNA.
You could map the reads directly to the genome, but I'm assuming you'd rather have the transcript assignments from the transcriptome-mapping
Do you have expertise in the field of dRNA mapping on the reference genome with model DNA, or vice versa ?
(2) you could lift-over the reads in the BAM and assign a tag to the separate transcripts that overlap in genome-coordinates, then do a pileup for each tag (or use --partition-tag).
There is one line of my bam. I'm not sure I understood --partition-tag
6e47ac3a-af4a-4cd5-b59e-b75f62b75b99 0 ENST00000488147 38 1 37S61M1D86M13I69M1D467M1I88M1I30M1D20M4I18M2D62M2I12M1D85M3D23M170S * 0 0 AGCTGGCGGCGGGAGCTTCCGGGAGGGCGGCTCGCAGGCACCATGACTCCTGTGAGGATGCAGCACTCCCTGGCAGGTCAGACCTATGCCGTGCCCTTACCCAGCCAGACCTGCGGCGAGAGGAGGCCGTCCAGCAGATGGCGGATGCCCTGCAGTACCTGCAGAAGGTCTCTGGAGACATCTTCAGCAGGATCTCCCAGCAGGTAGAGCAGAGCCGGAGCCAGGTGCAGGCCATTGGAGAGAAGGTCTCCTTGGCCCAGGCCAAGTTGAGAAGATCAAGGGCAGCAAGAAGGCCATCAAGGTGTTCTCCAGTGCCAAGTACCCTGCTCCAGAGCGCCTGCAGGAATATGGCTCCATCTTCACCGGCGCCCAGGACCCTGGCCTGCAGAGACGCCCCCGCCACAGGATCCAGAGCAAGCACCGCCCCCTGGACGAGCGGGCCCTGCAGGAGAAGCTGAAGGACTTTCCTGTGTGCGTGAGCACCAAGCCGGAGCCCGAGGACGATGCAGAAGAGGGACTTGGGGGTCTTCCCAGCAACATCAGCTCTGTCAGCTCCTTGCTGCTCTTCAACACCACCGAGAACCTGTACAAGAAGTATGTCTTCCTGGACCCCCTGGCTGGTGCTGTAACAAAGACCCATGTGATGCTGGGGGCAGAGACAGAGGAGAAGCTGTTTGATGCCCCCTTGTCCATCAGCAAGAGAGAGCAGCTGGAACAGCAGGTCCCAGAGAACTTACTTCTATGTGCCAGACCTGGGCCAGGTGCCTGAGATTGATGTTCCATCCTACCTGCCTGACCTGTCCGGCATTGCCAATGACCTCACTGTACATTGCCGACCTGGGCCCCGGCATTGCCCCTCTGCCCCTGGCACCATTCCTTCCAGAACTGCCCACCTCACACTGAGGTAGCCGAGCCTCTCAAGACCTACAAGATGGGGTACTAACACCACCCCCACCGCCCCCCCACCACCACCCCAGCTCCTGAGGTGCTGGCCAGTGCACCCCCACTCCCACCCTCAACCGCGGCCCCTGTAGGCCAAGGCGCCAGGCAGGACGACAGCAGCAGCGCGTCTCCTTCAGGTGGGAGCAGCTCTTTGAGGCCACCTGATTTCTGGCGTGCTCAGTGCACTTCGGTGGATTTCTGTGGGTTTGTTAAGTGGTCAGAAATTCTCAATTTTTTTGAATAGTTTCGATTTCAAATATCTTGTTCTACTTGGTCCATAAAATAGTGGTTTTCAAAAAAAAAAAAT &&&(4:;999:>>=<<<:;;;;=@8***+5568=>AAAB>==<>><<<;<<>@A@>?==???@@>>@>>>>?@AAA@?=<<;:9:;7688A;2,($$(*-;?>;BBCC@=<:;;;=>=BBA@@A?>===;:<=?>??>;77768BB@@CBBB@;:70;<7777DA@?<<<=@AAAA@??>?AABCCA@@?><;=>=?DDEBAABCCDCA@??=<<<?AAAAA???@BCDFECCB@@@@@AA@C9''())/..5668>ABB>858<6(''(67@=;;<?A?@BBBABCDBB@@>???BBA???AB@?@@A@?>?@@A@???>@@?><<<>?@?>==>?@@?@AB?=;;;;<:808635:;@B=><:::<@@D???>BBA??>=>>>??><:;;?BA<;;<===>=>>>>?====<:<<=<<=>>ABA??@A@A==?>?@BBBAA>?>>@ADDDCCDED@7E@?>@?>@AB>@<;;;==>=???AAA@?>??BA??>?@A??@>@?>?DBAA<;=<;93/678@>>?AA>>=>BA??>>?>>???>?=<==AAABBBABBCCB50(((-/23333A:99:CCB@ABBABBBBBBA@>@>=8=?A??A??>9888=?>>@?>>>?@@?>>>@>@>>:::9;;=@AB@@667<A==?A@??>>@@BCDFABBBCA><;<=?>>?BBBB=?AAA@@@@BBCCECDAAAABBCABBBCCAACCB?>>=???AABE00(''*'&(%'7>ACB996)%%;?@==?AAAABABCDCCDCCDBC..,,56;;?>@?@A@?ACDAAAA@?:777:@BBA?620-+,,/0'&&&/))))(&&'')0712062;985550/334557:8:;4.3679?A@?<80/.//*(+8;;;:;<==?@@=2)&&;9<BCDC@<<=9999?@=:=<;=><<861/++++>=:<=<;;;<>?BB?656.0225555/348=@=879:;=<07@ABCCDDBC<:::?@ACCCCCCCBC@???AA@?>>=<5566732..///568:ACC@><==ADDFFG@====@A@A@@@<999;?AACDB>>96669<===/--7;>>=>>>?@??0///879BBA2.,)'-993-'&&'-9===<47:<>?::7-,.-';@ABADE?;;??>>>AAAA@@>>?????@;D?AD@B99==@989?C>==>?>>>@E8876579>?B@?43;/9=4444:::=<;<==@EHIFA=;3364//.'$&*./////,&-('( qs:f:17.413 du:f:12.867 ns:i:51468 ts:i:2600 mx:i:1 ch:i:768 st:Z:2025-03-06T18:48:54.254+00:00 rn:i:624 fn:Z:PBC13265_e32414dd_60f9ce1c_3.pod5 sm:f:830.16 sd:f:118.119 sv:Z:pa dx:i:0 RG:Z:60f9ce1cf4df970330efd0925891a30e3d74bf7e_rna004_130bps_hac@v5.0.0 MN:i:1249 MM:Z:A+a.,0,2,1,5,0,2,0,0,1,2,3,1,0,4,0,1,0,2,2,1,0,0,2,0,0,0,0,0,1,2,0,2,0,1,0,0,1,1,2,0,1,1,5,4,1,2,0,0,0,0,1,3,1,1,2,3,0,0,0,1,0,2,0,0,2,1,1,0,2,6,0,1,0,0,0,0,1,1,0,0,0,0,0,3,0,0,1,0,2,0,2,0,0,0,1,1,1,0,0,1,0,0,6,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,4,4,1,2,1,3,2,0,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0; ML:B:C,106,180,151,124,22,100,112,169,114,20,58,27,169,29,62,22,19,109,24,108,129,13,30,213,171,38,250,214,49,16,27,16,21,71,66,88,155,120,17,28,247,243,247,56,252,32,18,72,255,24,107,104,251,18,172,36,108,54,88,16,57,93,182,178,46,21,74,34,172,121,13,12,12,137,158,35,18,241,213,133,85,52,129,35,186,69,41,59,26,31,62,27,39,22,13,148,19,221,18,61,119,182,26,160,223,69,62,37,159,154,122,81,59,20,78,156,136,101,53,39,191,189,100,255,160,235,171,12,18,135,79,24,166,16,13,47,124,106,124,143,111,103,148,153,163,169,169,165,160,161 NM:i:36 ms:i:1272 AS:i:1902 nn:i:0 de:f:0.0164729 tp:A:P cm:i:63 s1:i:770 s2:i:793 MD:Z:61^C1T153^A66G30G224G210C13C37^C38^TC74^C85^CAG23 rl:i:0
The bam file is so large, tagging bam file isn't easy staff.
Firstly, the point of view of single molecule profiling, I would like to product your plot at https://github.com/nanoporetech/modkit/issues/367#issuecomment-2641095440 in my case. Secondly, binning single molecule profiling with some value to get the peaks more continuous (region bed), then dmr. My goal is the visual gene level. In other word, ignore heterogeneous profile.
Thanks
Hello @Chuang1118,
Current version modkit (v0.5.0) can handle this issue ?
It cannot. If there are two records reporting on the same genomic position and strand for a modification code, which one should it use for the BigWig record? What you could do for transcripts that overlap in genomic coordinates, is make multiple bedMethyl files for each original transcript and covert each into BigWig (you can skip the intermediate file and pipe into modkit bedmethyl tobigwig). I don't think this is an ideal workflow, so I'm working on better solutions.
Do you have expertise in the field of dRNA mapping on the reference genome with model DNA, or vice versa ?
You can map your dRNA reads to the genome using dorado aligner, documentation can be found here, specifically -x splice:hq is what we usually recommend.
Hello @Chuang1118,
In case you run into errors such as
> Error! Input bedGraph not sorted by chromosome. Sort with sort -k1,1 -k2,2n.
I've pushed some updates to master and attached a build here, if you want to try it. It will still fail on duplicated records since I don't think it is completely valid to have them.
Thanks.
Hello @ArtRand,
Thank you for your time and support.
Error! Input bedGraph not sorted by chromosome. Sort with sort -k1,1 -k2,2n.
I have sorted bed file mock_1_modified_base_count_filtered_genomic_reformatted_sorted.txt
This error does not make sense to me. One possible, when you handle duplicated records, you store them in "dictionary like structure" e.g. hashmap, as result it destroy the initial order.
For the release modkit_dev442e6ba_u16_x86_64.tar.gz I can't build it.
/opt/dist_modkit_v0.5.1_442e6ba` does not contain a Cargo.toml file. --path must point to a directory containing a Cargo.toml file.
As you previously mentioned
(1) you could sum the overlapping records together.
For a given modification site with the same genomic location, I re-defined the methylation percentage using the average value.
Input:
ouput:
then run modkit bedmethyl tobigwig
IGV
chr1 1014209 %m : (11.71 + 0 + 10.83) / 3 = 7.513
Could you tell me, what's it mean ? Value: 11.328976 in bigWig format. It's possible to plot average %m ?
Thanks.