transabyss
Assembly strategy for large datasets
Hi Trans-abyss team,
I have a large dataset consisting of 99 libraries of ~35M reads each for a species that does not yet have a reference genome available, and I am interested in building a _de novo_ assembly. I have access to an HPC cluster (24 cores, 250 GB RAM) and have set up Trans-ABySS with Singularity.
I am unsure whether it would be better to assemble a single transcriptome using all libraries or assemble each library separately and then merge them.
Based on your experience with the assembler, can you recommend a strategy given my available computing resources?
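For reference, the per-library approach I am considering would look roughly like this. This is only a sketch that prints the commands it would run (a dry run): the library names, read paths, and image name are placeholders, and I am assuming the `transabyss` and `transabyss-merge` entry points are available inside the container.

```shell
#!/bin/sh
# Dry-run sketch of strategy 2: assemble each library separately,
# then merge the per-library assemblies into one transcriptome.
# IMG, OUT, library names, and file paths are all placeholders.
IMG=transabyss.sif
OUT=assemblies
CMDS=""

for lib in lib001 lib002 lib099; do   # ...and so on through all 99 libraries
    cmd="singularity exec $IMG transabyss \
--pe reads/${lib}_1.fq.gz reads/${lib}_2.fq.gz \
--outdir $OUT/$lib --name $lib --threads 24"
    echo "$cmd"                       # print instead of executing
    CMDS="$CMDS$cmd
"
done

# Merge the per-library assemblies into one non-redundant set
# (assuming each run leaves a <name>-final.fa in its output directory):
merge="singularity exec $IMG transabyss-merge --mink 32 --maxk 64 \
--out merged.fa $OUT/*/*-final.fa"
echo "$merge"
```

If the per-library route is the recommended one, I would then run one `transabyss` job per library (possibly per k-mer size) and a single `transabyss-merge` at the end.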