
REQUEST: Large genome parallelisation

Open DLBPointon opened this issue 1 year ago • 2 comments

Description of feature

Currently testing TreeVal by running daSamNigr - the European elder.

The peptide subworkflow is noticeably slower but does not require any further optimization. GAP_FINDER also requires no further optimization: its output is exactly as expected.

Repeat Density is much slower.

Insilico Digest completed, as well as GENERATE_GENOME.

Nothing is moving through the nuc_alignments subworkflow; it is currently stuck in a constant cycle of fail and retry.

We will need to include a fix that will split the genome into 1 Gbp chunks, run the workflow on the chunks in parallel, and then merge the results. For the current pipeline, this could be simple. The more complex subworkflows, however, may require a different solution.
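As a rough illustration of the chunking idea (not TreeVal code), the Python sketch below computes 1 Gbp windows per sequence and shifts per-chunk results back to whole-genome coordinates for the merge step. The function names, the half-open 0-based coordinate convention, and the choice to window each sequence separately are all assumptions made for this example. The real fix would live in the Nextflow workflow and may batch sequences differently.

```python
from typing import Dict, Iterator, List, Tuple

CHUNK_SIZE = 1_000_000_000  # 1 Gbp, the chunk size proposed in this issue


def chunk_intervals(
    seq_lengths: Dict[str, int], chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[str, int, int]]:
    """Yield (sequence name, start, end) windows of at most chunk_size bases.

    Coordinates are 0-based and half-open (an assumption for this sketch).
    Sequences shorter than chunk_size come out as a single window.
    """
    for name, length in seq_lengths.items():
        for start in range(0, length, chunk_size):
            yield name, start, min(start + chunk_size, length)


def to_genome_coords(
    chunk_start: int, hits: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Shift chunk-relative (start, end) hits back onto the full sequence."""
    return [(start + chunk_start, end + chunk_start) for start, end in hits]


if __name__ == "__main__":
    # Hypothetical sequence lengths, for illustration only.
    lengths = {"scaffold_1": 2_500_000_000, "scaffold_2": 400_000_000}
    for name, start, end in chunk_intervals(lengths):
        print(name, start, end)
```

Each window could then be processed as an independent job, with `to_genome_coords` applied to its output before the results are concatenated. Features that span a window boundary are not handled here and would need overlapping windows or a stitching step.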

DLBPointon · Apr 05 '23 14:04