ALLHiC
Omni-C
Hello
We have done a phased assembly of a polyploid plant using HiFi data and Improved Phased Assembly (IPA). We then prepared an Omni-C (Dovetail) library for the same plant and obtained its short-read data.
We tried to use ALLHiC with the Omni-C data, but got the following errors at the PreprocessSAMs.pl stage.
Can ALLHiC be used with Omni-C data? Are there any parameters to change in the scripts?
(base) uq@fl002:.../uq/bwa/F-HiC> PreprocessSAMs.pl sample.bwa_aln.sam final.female.p_ctg.fasta MOBI
Sun Feb 21 10:24:32 2021: PreprocessSAMs.pl: samtools view -bS sample.bwa_aln.sam -o sample.bwa_aln.bam
/30days/uq/ALLHiC/scripts/PreprocessSAMs.pl sample.bwa_aln.sam final.female.p_ctg.fasta MOBI
Use of uninitialized value $RE_site in string eq at /30days/uq/ALLHiC/scripts/PreprocessSAMs.pl line 137.
Use of uninitialized value $RE_site in concatenation (.) or string at /30days/uq/ALLHiC/scripts/PreprocessSAMs.pl line 155.
Use of uninitialized value $RE_site in concatenation (.) or string at /30days/uq/ALLHiC/scripts/PreprocessSAMs.pl line 156.
Sun Feb 21 10:28:47 2021: PreprocessSAMs.pl: make_bed_around_RE_site.pl final.female.p_ctg.fasta 500
make_bed_around_RE_site.pl
Find all occurrences of a motif in a genome. Make a 'POS' file listing these occurrences, and also a BED file representing the regions around these occurrences.
SYNTAX: make_bed_around_RE_site.pl
OUTPUT FILES:
Sun Feb 21 10:28:47 2021: PreprocessSAMs.pl: bedtools intersect -abam sample.bwa_aln.bam -b final.female.p_ctg.fasta.near_.500.bed > sample.bwa_aln.REduced.bam
Error: Unable to open file final.female.p_ctg.fasta.near_.500.bed. Exiting.
Sun Feb 21 10:28:48 2021: PreprocessSAMs.pl: samtools view -F12 sample.bwa_aln.REduced.bam -b -o sample.bwa_aln.REduced.paired_only.bam
[main_samview] fail to read the header from "sample.bwa_aln.REduced.bam".
Sun Feb 21 10:28:48 2021: PreprocessSAMs.pl: samtools flagstat sample.bwa_aln.REduced.paired_only.bam > sample.bwa_aln.REduced.paired_only.flagstat
[E::hts_open_format] Failed to open file "sample.bwa_aln.REduced.paired_only.bam" : No such file or directory
samtools flagstat: Cannot open input file "sample.bwa_aln.REduced.paired_only.bam": No such file or directory
Dovetail has detailed instructions for Omni-C fastq preprocessing, from fastq to the final valid-pairs BAM file. Maybe feeding the valid-pairs BAM into ALLHiC (allhic prune) would be better?
For DNase-type data, juicer chooses to skip the fragment-filtering step:
## If DNAse-type experiment, no fragment maps
if [ "$site" == "none" ]
then
nofrag=1;
fi
SALSA2 has a similar process, see https://github.com/marbl/SALSA/issues/55.
But for ALLHiC_partition, can the -e enzyme_sites option be skipped? @tangerzhang
Just out of curiosity, does the IPA assembly have better quality than hifiasm? Does hifiasm have any problems with polyploid plants?
Hi @ardy20
The error was caused by a typo in the enzyme name: it should be MBOI, but your input was MOBI. Alternatively, you can simply supply the restriction site itself: GATC.
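A typo like this can be caught before the pipeline starts. The following is only a minimal sketch, not part of ALLHiC: `resolve_site` is a hypothetical helper, and its name table is an assumption that covers just a few common enzymes with well-known recognition sites (MboI and DpnII cut at GATC, HindIII at AAGCTT). It returns the motif for a known name, passes a raw A/C/G/T motif through unchanged, and fails on anything else, so an input such as MOBI stops the run instead of producing an empty motif.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ALLHiC): resolve an enzyme name or a raw
# motif to a restriction-site sequence, and fail early on unknown input.
resolve_site() {
    local input
    # Upper-case the argument so "MboI" and "MBOI" are treated alike.
    input=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case "$input" in
        # Assumed name table; extend it for other enzymes as needed.
        MBOI|DPNII) echo "GATC" ;;
        HINDIII)    echo "AAGCTT" ;;
        # Empty input, or anything containing a non-ACGT character, is rejected.
        ""|*[!ACGT]*)
            echo "unknown enzyme or motif: $1" >&2
            return 1
            ;;
        # Whatever remains consists only of A/C/G/T: treat it as a raw motif.
        *) echo "$input" ;;
    esac
}
```

One possible way to use it is `site=$(resolve_site MBOI) || exit 1`, followed by `PreprocessSAMs.pl sample.bwa_aln.sam final.female.p_ctg.fasta "$site"`, so that the preprocessing script always receives a non-empty motif.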
We have not tested Omni-C libraries. In theory, ALLHiC can be applied to various types of Hi-C libraries as long as the restriction sites are known. However, if Omni-C libraries do not have defined restriction sites, I am afraid that the current ALLHiC does not support this kind of library.
@baozg The -e enzyme_sites option is still required at the current stage.
Hi All
Thanks a lot for the guidance, and apologies for the typo. I will try the suggestions and get back to you. Regarding the HiFi assembly, we did not test hifiasm because we were happy with IPA. However, we did test HiCanu, and the assembly quality was much better. In particular, IPA creates a phased assembly and provides primary and associated contigs. I am not sure whether hifiasm has the same capability. We combined our IPA HiFi assembly with Omni-C and got a very high-quality chromosome-level assembly of the jojoba plant.
hifiasm usually gives better-quality plant assemblies and is faster than HiCanu and IPA. It can create phased assemblies in FALCON-Unzip style (primary + alternate) and in trio mode.
Thanks for the suggestion. We will test it soon.
In which file should the following code be changed?
For DNase-type data, juicer chooses to skip the fragment-filtering step:
## If DNAse-type experiment, no fragment maps
if [ "$site" == "none" ]
then
nofrag=1;
fi
Just a quick update: we tested hifiasm and found that IPA creates significantly better assemblies with higher N50.
Dear ALLHiC Team
For Omni-C, the -e option is still required. What should we put for that?
Dear Sirs, is there any planned support for OmniC data in allHiC? thanks
Hi @diriano I have not had a chance to test OmniC data. Could anyone share with me some sample data so that we can test ALLHiC on OmniC?
Hi @tangerzhang , I think I can share some data with you. I have a diploid plant that is highly het. Would it be OK if I share the two versions (haplotypes) of a set of contigs that make a chromosome and the corresponding OmniC reads?
Hi @diriano That would be great if you could share with me the contig assembly and OmniC reads. My gmail is tanger.zhang@gmail and google drive works for me. Thanks!
@tangerzhang did you have a chance to check the data that I sent? Cheers
Hi @tangerzhang, is there any update on supporting DNase-type Hi-C data (like Omni-C)? I would also appreciate it very much. I think many people now use it for scaffolding, because of its higher and more even coverage.
OmniC data support would be much appreciated.
Any news on this topic?