long-read-pipelines
Long read production pipelines
In many places the Docker images offer overlapping functionality, which is sub-optimal.
Clair3 to v0.1-r11: https://github.com/HKU-BAL/Clair3/issues/99
DV to 1.4.0: https://github.com/google/deepvariant/releases/tag/v1.4.0
It takes many hours to run, so to increase throughput we need to investigate which metrics it generates and compute them with a faster algorithm. The ultimate goal is to...
While collecting information on our throughput, I found some low-hanging fruit that should significantly improve it. I'll take the following timing chart as an...
When running large-scale analyses, try to split the work across non-overlapping zones to avoid quota issues. See #324 and #326.
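One way to picture the zone split is a small helper that hands each batch of work its own disjoint set of zones. This is only a sketch: it assumes the limit being hit is a per-zone resource quota, and the function name, batch names, and zone list are all hypothetical, not taken from this repo.

```python
from typing import Dict, List


def assign_disjoint_zones(batches: List[str], zones: List[str]) -> Dict[str, List[str]]:
    """Give each batch its own non-overlapping subset of zones (round-robin)."""
    if not batches:
        return {}
    if len(batches) > len(zones):
        raise ValueError("need at least one zone per batch to keep zone sets disjoint")
    assignment: Dict[str, List[str]] = {batch: [] for batch in batches}
    for i, zone in enumerate(zones):
        # Each zone goes to exactly one batch, so no two batches share a zone.
        assignment[batches[i % len(batches)]].append(zone)
    return assignment


# Hypothetical batch names and an illustrative zone list.
plan = assign_disjoint_zones(
    ["batch_a", "batch_b"],
    ["us-central1-a", "us-central1-b", "us-central1-c", "us-central1-f"],
)
for batch, batch_zones in plan.items():
    print(batch, batch_zones)
```

Each batch's zone list could then be passed to whatever setting the workflow uses to choose zones; how that is wired up depends on the backend and is not shown here.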
Some of the Docker images contain non-trivial code. That code needs to be documented, and potentially moved to its own repos.
This also means the tasks should start that script in the background among their first commands.
When #268 was implemented, there was some time pressure so this wasn't done. We should have the various small variant calls all use the same scatter-gather procedure.
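For illustration, a shared scatter-gather procedure could look roughly like the sketch below: one function shards a contig into intervals, and one function runs any caller over the shards and concatenates the results. This is not the repo's implementation; `scatter`, `scatter_gather`, and `toy_caller` are hypothetical names, and the toy caller only stands in for a real variant caller.

```python
from typing import Callable, List, Tuple

Interval = Tuple[str, int, int]  # (contig, start, end), half-open
Call = Tuple[str, int]           # (contig, position), a stand-in for a variant record


def scatter(contig: str, length: int, shard_size: int) -> List[Interval]:
    """Split one contig into consecutive, non-overlapping shards."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    return [
        (contig, start, min(start + shard_size, length))
        for start in range(0, length, shard_size)
    ]


def scatter_gather(
    intervals: List[Interval],
    call_shard: Callable[[Interval], List[Call]],
) -> List[Call]:
    """Run the same caller on every shard, then concatenate in shard order."""
    gathered: List[Call] = []
    for interval in intervals:
        gathered.extend(call_shard(interval))
    return gathered


def toy_caller(interval: Interval) -> List[Call]:
    """Hypothetical caller: reports the midpoint of its shard as a 'call'."""
    contig, start, end = interval
    return [(contig, (start + end) // 2)]


shards = scatter("chr20", 250, 100)
calls = scatter_gather(shards, toy_caller)
print(shards)
print(calls)
```

The point of the design is that the sharding and gathering logic lives in one place, and each caller only has to supply the per-shard step.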