ALLHiC
Intermediate files in size of Tb?
Hi @tangerzhang,
I am running ALLHiC on a genome of around 3 Gb, at a read coverage of 100x.
I observed extremely large intermediate files during the pruning step, as shown below:
total 5.3T
-rwxrwx--- 1 sun 4.1T Jun 20 03:48 log.txt*
-rwxrwx--- 1 sun 7.7G Jun 20 03:48 removedb_Allele.txt*
-rwxrwx--- 1 sun 1.3T Jun 20 03:48 removedb_nonBest.txt*
The log file itself is around 4.1 TB, and the program has not finished yet. Is this common? And is there a way to handle such large files?
thanks, Hequan
One more piece of information: Allele.ctg.table is 27 MB, and I have ~20,000 contigs.
Hi,
The final result of ALLHiC_prune is prunning.bam.
You can use the development version of ALLHiC_prune (https://github.com/sc-zhang/ALLHiC_components/tree/main/Prune). This version does not generate intermediate files and runs faster.
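For anyone still on the older version, here is a minimal Python sketch (not part of ALLHiC) for keeping an eye on the pruning directory. It only reports whether prunning.bam is present and how large the intermediate files from the listing above have grown; it does not delete anything. The file names are taken from this thread, and the helper names `human_size` and `report_prune_dir` are made up for illustration. Note that the presence of prunning.bam alone does not prove the run has finished.

```python
from pathlib import Path

# File names as they appear in the listing in this thread; adjust for your run.
INTERMEDIATE_NAMES = ("log.txt", "removedb_Allele.txt", "removedb_nonBest.txt")
FINAL_NAME = "prunning.bam"


def human_size(num_bytes):
    """Format a byte count with binary units, similar to `ls -lh`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def report_prune_dir(workdir):
    """Return (final_file_present, {intermediate_name: size_in_bytes})."""
    workdir = Path(workdir)
    sizes = {
        name: (workdir / name).stat().st_size
        for name in INTERMEDIATE_NAMES
        if (workdir / name).is_file()
    }
    return (workdir / FINAL_NAME).is_file(), sizes


if __name__ == "__main__":
    # Run from inside the pruning working directory.
    present, sizes = report_prune_dir(".")
    print(f"{FINAL_NAME} present: {present}")
    for name, size in sizes.items():
        print(f"{name}\t{human_size(size)}")
```

Running this periodically (for example from cron) would at least show how quickly the log is growing before the disk fills up.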
Thanks @wangyibin. I am running the version you mentioned.
Hi, the links are broken. I have the same issue: the huge log file filled my file system and the system ran out of space.
This link works: https://github.com/sc-zhang/ALLHiC_components/tree/main/Prune
Hello! Do you know how long this step usually takes? This step has been running for about 3 days so far for me and I just want to make sure that's not unusual.