
Statistics of Hi-C contacts in inter.txt look abnormal

Open JiaoLaboratory opened this issue 2 years ago • 1 comments

Dear JUICER developer,

I am running Juicer to generate Hi-C contact heatmaps for a diploid heterozygous genome (2x = 500M). The genome size file looks like this:

h1JH_chr01 37856113
h1JH_chr02 38277602
h1JH_chr03 31105642
h1JH_chr04 36469279
h1JH_chr05 27916969
h1JH_chr06 45024913
h1JH_chr07 34590854
h2JH_chr01 37686295
h2JH_chr02 39088487
h2JH_chr03 32593457
h2JH_chr04 37426946
h2JH_chr05 27942359
h2JH_chr06 44783632
h2JH_chr07 34270479
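As a quick sanity check on the size file, the per-haplotype and combined lengths can be totalled. The following is only a minimal sketch: the `haplotype_totals` helper is hypothetical, and it assumes the haplotype label is the part of each sequence name before the first underscore (`h1JH`, `h2JH`), which is a guess based on the names above.

```python
from collections import defaultdict

# Sequence names and lengths copied from the genome size file above.
sizes_text = """\
h1JH_chr01 37856113
h1JH_chr02 38277602
h1JH_chr03 31105642
h1JH_chr04 36469279
h1JH_chr05 27916969
h1JH_chr06 45024913
h1JH_chr07 34590854
h2JH_chr01 37686295
h2JH_chr02 39088487
h2JH_chr03 32593457
h2JH_chr04 37426946
h2JH_chr05 27942359
h2JH_chr06 44783632
h2JH_chr07 34270479
"""


def haplotype_totals(text):
    """Sum sequence lengths per haplotype prefix (hypothetical helper).

    Assumes each non-empty line is "<name> <length>" separated by
    whitespace, and that the text before the first "_" is the haplotype.
    """
    totals = defaultdict(int)
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length = line.split()
        haplotype = name.split("_", 1)[0]
        totals[haplotype] += int(length)
    return dict(totals)


totals = haplotype_totals(sizes_text)
genome_total = sum(totals.values())

for haplotype, length in sorted(totals.items()):
    print(f"{haplotype}: {length:,} bp")
print(f"combined: {genome_total:,} bp")
```

The combined length comes to roughly 505 Mb, which agrees with the stated 2x = 500M, so the size file itself appears to cover both haplotypes as intended.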

I then ran Juicer as follows.

Experiment description: Juicer version 1.5.7; BWA 0.7.12-r1039; 10 threads; openjdk version "1.8.0_312"

./scripts/juicer.sh -s MboI -g hi_h2 -z references/h1_h2.fasta -y restriction_sites/h1_h2_MboI.txt -p h1_h2.fasta.size -D ./ -t 10

However, the resulting inter.hic is only 24M (inter_30.hic is 8.6M), which does not look normal.

108G abnormal.sam
4.0K collisions.txt
8.8G dups.txt
4.0K header
0 inter_30_contact_domains
8.6M inter_30.hic
12K inter_30_hists.m
4.0K inter_30.txt
24M inter.hic
12K inter_hists.m
4.0K inter.txt
60G merged_nodups.txt
69G merged_sort.txt
386M opt_dups.txt
12K stats_dups_hists.m
4.0K stats_dups.txt
4.8G unmapped.sam

I checked the statistics in inter.txt and found that Hi-C Contacts is only 3.38% / 6.20%:

Sequenced Read Pairs: 259,892,602
Normal Paired: 126,960,665 (48.85%)
Chimeric Paired: 36,076,539 (13.88%)
Chimeric Ambiguous: 90,558,733 (34.84%)
Unmapped: 6,296,665 (2.42%)
Ligation Motif Present: 172,352,080 (66.32%)
Alignable (Normal+Chimeric Paired): 163,037,204 (62.73%)
Unique Reads: 141,532,529 (54.46%)
PCR Duplicates: 20,614,012 (7.93%)
Optical Duplicates: 890,663 (0.34%)
Library Complexity Estimate: 582,430,468
Intra-fragment Reads: 1,239,893 (0.48% / 0.88%)
Below MAPQ Threshold: 131,514,187 (50.60% / 92.92%)
Hi-C Contacts: 8,778,449 (3.38% / 6.20%)
Ligation Motif Present: 4,967,849 (1.91% / 3.51%)
3' Bias (Long Range): 65% - 35%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
Inter-chromosomal: 1,320,146 (0.51% / 0.93%)
Intra-chromosomal: 7,458,303 (2.87% / 5.27%)
Short Range (<20Kb): 4,571,216 (1.76% / 3.23%)
Long Range (>20Kb): 2,886,831 (1.11% / 2.04%)
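To make the paired percentages easier to read, the sketch below recomputes them from the counts reported above. It assumes the first percentage is relative to Sequenced Read Pairs and the second to Unique Reads; that reading is inferred from the arithmetic on these numbers, not taken from the Juicer documentation. The `pct` helper and variable names are illustrative only.

```python
# Counts copied from the inter.txt statistics above.
sequenced = 259_892_602   # Sequenced Read Pairs
unique = 141_532_529      # Unique Reads

# The three categories that unique reads appear to be split into.
categories = {
    "Intra-fragment Reads": 1_239_893,
    "Below MAPQ Threshold": 131_514_187,
    "Hi-C Contacts": 8_778_449,
}


def pct(count, total):
    """Percentage of `count` in `total`, rounded to two decimals."""
    return round(100 * count / total, 2)


# (percent of sequenced read pairs, percent of unique reads) per category.
breakdown = {
    name: (pct(count, sequenced), pct(count, unique))
    for name, count in categories.items()
}

# Check whether the three categories add up exactly to the unique reads.
partition_ok = sum(categories.values()) == unique

for name, (of_sequenced, of_unique) in breakdown.items():
    print(f"{name}: {of_sequenced:.2f}% / {of_unique:.2f}%")
print("categories sum to Unique Reads:", partition_ok)
```

For these counts the three categories add up exactly to Unique Reads, so the small Hi-C Contacts figure is the remainder after 92.92% of unique reads fall below the MAPQ threshold.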

Any ideas on how to solve this issue?

Thanks in advance,

JiaoLaboratory, Mar 06 '22 06:03

PCR Duplicates: 20,614,012 (7.93%)
Optical Duplicates: 890,663 (0.34%)

These look fine.

Inter-chromosomal: 1,320,146 (0.51% / 0.93%)
Intra-chromosomal: 7,458,303 (2.87% / 5.27%)
Short Range (<20Kb): 4,571,216 (1.76% / 3.23%)
Long Range (>20Kb): 2,886,831 (1.11% / 2.04%)

These look too low; the library or the analysis failed to capture long-range interactions.

yinshiyi, May 16 '22 19:05