
Assembly strategy for large datasets

Open mtmcgowan opened this issue 3 years ago • 5 comments

Hi Trans-abyss team,

I have a large dataset of 99 libraries, each with ~35M reads, for a species that does not yet have a reference genome, and I am interested in building a *de novo* assembly. I have access to an HPC cluster (24 cores, 250 GB RAM) and have set up Trans-ABySS with Singularity.

I am unsure whether it would be better to assemble a single transcriptome from all libraries at once, or to assemble each library separately and then merge the resulting assemblies.

Based on your experience with your assembler, can you make any strategy recommendations based on my available computing resources?
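For context, the per-library route I am considering would look roughly like the sketch below. This is only my reading of the Trans-ABySS command-line interface, not a confirmed recipe: the library names, file paths, thread count, and k-mer values are placeholders, and the script only `echo`es the commands it would submit (a dry run) rather than running them.

```shell
#!/bin/sh
# Dry-run sketch of the per-library strategy:
# one Trans-ABySS run per library, then transabyss-merge over the
# resulting assemblies. Flags and output filenames are assumptions
# based on the documentation; adjust before actually submitting.
plan() {
  # One assembly job per library (placeholder library names).
  for lib in lib01 lib02 lib03; do
    echo transabyss --pe "${lib}_1.fq" "${lib}_2.fq" \
      --outdir "asm_${lib}" --name "${lib}" --threads 24
  done
  # Merge the per-library assemblies; --mink/--maxk should span
  # the k values actually used in the individual runs.
  echo transabyss-merge asm_lib01/lib01-final.fa \
    asm_lib02/lib02-final.fa asm_lib03/lib03-final.fa \
    --mink 32 --maxk 32 --out merged.fa
}
plan
```

The alternative (a single pooled run) would concatenate or list all 99 libraries in one `transabyss` invocation, which I suspect is where my 250 GB RAM limit becomes the deciding factor.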

mtmcgowan, Feb 11 '21 19:02