HiCAssembler

misassembly correction

Open yongyiyu opened this issue 6 years ago • 3 comments

Hi Fidel, I created a corrected Hi-C matrix in h5 format with HiCExplorer, and now an assertion error kills the assembly when I run HiCAssembler:

Traceback (most recent call last):
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 312, in <module>
    main(args)
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 308, in main
    chain_file=args.outFolder + "/liftover.chain")
  File "/annoroad/data1/bioinfo/PMO/yangweifei/Hicassemble/py2/bin/assemble", line 218, in save_fasta
    assert(next_contig['start'] - end >= 0)
AssertionError

I'm a little confused after checking the code of assemble. In my data, the misassembly correction seems to produce overlapping regions. According to the principle you described, "this means that a contig was split by the misassembly correction but was later joined together", overlapping data isn't considered, and I think pieces from this kind of misassembly correction shouldn't be joined.
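A minimal sketch of the overlap condition described above, assuming each piece of a split contig is a dict with 'name', 'start' and 'end' keys. The function name, the piece names, and the data layout are hypothetical and only illustrate the check; this is not HiCAssembler's actual code. It applies the same comparison as the failing assertion to neighbouring pieces and reports the pairs that would trip it.

```python
def find_overlaps(pieces):
    """Return (previous, next) name pairs of neighbouring pieces that overlap.

    `pieces` is a list of dicts with 'name', 'start' and 'end' keys, all
    taken from the same original contig. This layout is an assumption made
    for illustration; it is not HiCAssembler's internal data structure.
    """
    ordered = sorted(pieces, key=lambda p: p['start'])
    overlaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        # Inverse of the condition checked in save_fasta:
        #   assert(next_contig['start'] - end >= 0)
        if nxt['start'] - prev['end'] < 0:
            overlaps.append((prev['name'], nxt['name']))
    return overlaps


# Hypothetical pieces of one contig after a split.
pieces = [
    {'name': 'contig1/1', 'start': 0, 'end': 50000},
    {'name': 'contig1/2', 'start': 45000, 'end': 90000},  # starts before the previous piece ends
    {'name': 'contig1/3', 'start': 90000, 'end': 120000},
]
print(find_overlaps(pieces))
```

With these made-up coordinates, only the first two pieces are reported, because the second piece starts 5000 bp before the first one ends. Pieces that merely touch (start equal to the previous end) pass the check.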

Best regards, yongyi

yongyiyu, Jan 10 '19 08:01

To be clear, did you run the misassembly correction before?

fidelram, Jan 11 '19 09:01

Hi Fidel, I ran the misassembly correction before using HiCAssembler. When I tested HiCAssembler with the Hi-C matrix that wasn't corrected, it worked successfully. So I think the overlap situation may not be handled. The commands I used are as follows:

hicBuildMatrix --samFiles L3-8_Lib1_Lane1_genome.reads1.bam \
    L3-8_Lib1_Lane1_genome.reads2.bam --binSize 10000 --restrictionSequence GATC --threads 4 \
    --inputBufferSize 100000 --outBam hic.bam -o hic_matrix.h5 --QCfolder ./hicQC

hicCorrectMatrix correct -m hic_matrix.h5 -t -1.2 5 -o hic_corrected_matrix.h5

assemble -f genome.fa -m hic_corrected_matrix.h5 -o ./assembly_output \
    --min_scaffold_length 100000 --bin_size 5000 --misassembly_zscore_threshold -1.0 \
    --num_iterations 3 --num_processors 16

yongyi

yongyiyu, Jan 15 '19 13:01

Hi yongyi,

I got the same error. How did you solve it? @yongyiyu

Thanks a lot!

xuxiaoman0212, Feb 16 '20 14:02