verkko
Homogenize trio vs Hi-C architecture
Currently, trio mode requires users to build homopolymer-compressed (HPC) meryl databases, and we run one round of consensus after rukki resolution and coloring. In contrast, with Hi-C data, we run consensus, map the Hi-C data, then run rukki and another round of consensus. This is similar to the architecture for GFAse as well.
This should be universal: perhaps always use the uncompressed post-consensus sequence? This would let users build normal meryl databases rather than HPC ones, reducing confusion. Additionally, output file locations differ between Hi-C and trio (e.g. trio = 6-layout*/*scfmap vs hic = 8-*/6-layout*/*scfmap). These should be promoted to the top level so they are the same in both cases.
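Until the locations are unified, downstream scripts have to check both layouts. A minimal sketch of that workaround, assuming only the two glob patterns quoted above; the function name `find_scfmaps` and the mode labels are hypothetical and not part of verkko:

```python
from pathlib import Path

# Glob patterns as quoted in this issue; wildcards are kept as-is because
# the full directory names are not spelled out here.
SCFMAP_PATTERNS = {
    "trio": "6-layout*/*scfmap",
    "hic": "8-*/6-layout*/*scfmap",
}


def find_scfmaps(asm_dir):
    """Return {mode: [paths]} for every scfmap file found under asm_dir.

    Hypothetical helper: it checks the trio-style and Hi-C-style
    locations and reports which one(s) actually contain scfmap files.
    """
    root = Path(asm_dir)
    found = {}
    for mode, pattern in SCFMAP_PATTERNS.items():
        # Path.glob is relative to root and not recursive, so the trio
        # pattern does not accidentally match the nested Hi-C location.
        matches = sorted(root.glob(pattern))
        if matches:
            found[mode] = matches
    return found
```

Calling `find_scfmaps("asm")` on an assembly directory (here a hypothetical `asm`) returns a dictionary keyed by whichever layout was found. If the files were promoted to the top level as proposed, a single pattern would be enough and this lookup table would go away.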