irida
Pipelines of pipelines - automated chaining of pipelines
Some pipelines only make sense to run under certain conditions, and in some cases those conditions can be detected automatically from the results of other pipelines.
For example, suppose you want to automatically subtype a genome that may or may not be Salmonella using one of the bio_hansel Salmonella SNV subtyping schemes. You could first check whether the genome is Salmonella using the top match results from refseq_masher, then use SISTR to serotype the genome, and if the serotype is Heidelberg or Enteritidis, run the appropriate bio_hansel subtyping scheme.
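The conditional chain in this example could be sketched roughly as follows. This is only an illustration under stated assumptions: `plan_subtyping`, `SEROVAR_TO_SCHEME`, and the scheme labels are hypothetical names, and the genus and serovar are passed in as plain values instead of being parsed from real refseq_masher or SISTR output.

```python
from typing import Callable, Optional

# Hypothetical mapping from a predicted serovar to a subtyping scheme label.
# The labels are placeholders for illustration, not confirmed tool arguments.
SEROVAR_TO_SCHEME = {
    "Heidelberg": "heidelberg",
    "Enteritidis": "enteritidis",
}


def plan_subtyping(
    top_match_genus: str,
    run_serotyping: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the subtyping scheme to run, or None if no scheme applies.

    top_match_genus: genus of the top species-prediction match (assumed input).
    run_serotyping: callable that runs serotyping and returns a serovar name;
        it is only invoked when the genome looks like Salmonella.
    """
    if top_match_genus != "Salmonella":
        # Condition not met: skip serotyping and subtyping entirely.
        return None
    serovar = run_serotyping()
    if serovar is None:
        return None
    # Only serovars with a known scheme trigger the final step.
    return SEROVAR_TO_SCHEME.get(serovar)
```

Passing serotyping as a callable keeps the later steps lazy, so nothing downstream runs unless the earlier condition holds.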
This could be extended to allow chained execution of pipelines that naturally follow one another, e.g.:
- refseq_masher/Kraken prediction:
  - E. coli species prediction?
    - assembly with parameters optimal for E. coli
    - calculate assembly quality metrics and compare to generally acceptable metrics for E. coli (e.g. genome size within accepted range +/- 1 SD)
    - annotation with custom E. coli protein annotation database for more consistent annotations
    - run E. coli serotyping for in silico serotype prediction
    - run MLST scheme for E. coli for MLST allele calls and MLST ST/CC
    - genomic island prediction
    - E. coli-specific AMR detection
    - run cg/wgMLST scheme for E. coli either from reads or from assembly
    - E. coli-specific SNV subtyping
  - Salmonella genus prediction?
    - assembly with parameters optimal for Salmonella
    - calculate assembly quality metrics and compare to generally acceptable metrics for Salmonella (e.g. genome size within accepted range +/- 1 SD)
    - annotation with custom Salmonella protein annotation database for more consistent annotations
    - run SISTR for in silico serotype prediction and Salmonella cgMLST330 allelic profile/ST
    - run MLST scheme for Salmonella for MLST allele calls and MLST ST/CC
    - Salmonella-specific AMR detection
    - run cg/wgMLST scheme for Salmonella either from reads or from assembly
    - Salmonella-specific SNV subtyping
  - other organisms...
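One way to picture the branching above is a lookup table keyed by the predicted taxon. The sketch below is hypothetical: the step labels in `FOLLOW_UPS` and the function `plan_follow_ups` are made up for illustration and are not existing pipeline identifiers.

```python
from typing import Dict, List, Tuple

# Hypothetical step labels mirroring the follow-up analyses listed above.
FOLLOW_UPS: Dict[str, List[str]] = {
    "Escherichia coli": [
        "assembly", "assembly_qc", "annotation", "serotyping", "mlst",
        "genomic_island_prediction", "amr_detection", "cg_wg_mlst",
        "snv_subtyping",
    ],
    "Salmonella": [
        "assembly", "assembly_qc", "annotation", "sistr", "mlst",
        "amr_detection", "cg_wg_mlst", "snv_subtyping",
    ],
}


def plan_follow_ups(genus: str, species: str) -> List[Tuple[str, str]]:
    """Return ordered (step, organism) pairs for a predicted taxon.

    A species-level entry (e.g. E. coli) is preferred over a genus-level
    entry (e.g. Salmonella); unknown organisms get no follow-up steps.
    """
    for key in (f"{genus} {species}", genus):
        if key in FOLLOW_UPS:
            # Each step carries the organism so it can pick
            # organism-specific parameters or databases.
            return [(step, key) for step in FOLLOW_UPS[key]]
    return []
```

Adding support for another organism would then only mean adding another entry to the table.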
Note: the only pipelines compatible with this mode of operation would be those that run on a single sample and produce a single result corresponding to that sample.
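As a rough sketch of that constraint, a chaining mechanism could filter candidate pipelines on two flags. `PipelineSpec`, its fields, and the pipeline names used below are assumptions made for illustration; they do not describe how pipelines are actually registered.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical description of a pipeline for chaining purposes."""
    name: str
    single_sample_input: bool   # runs on exactly one sample per execution
    single_sample_output: bool  # produces one result tied to that sample


def chainable(pipelines: List[PipelineSpec]) -> List[PipelineSpec]:
    """Keep only pipelines that take one sample and yield one result for it."""
    return [
        p for p in pipelines
        if p.single_sample_input and p.single_sample_output
    ]
```

For instance, a per-sample serotyping pipeline would pass this filter, while a multi-sample phylogeny pipeline would not.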
Imported from GitLab issue #644. Originally posted on 2018/05/25 02:36PM
Posted by Peter Kruczkiewicz
Related to #711