xqtl-protocol
Current status and potential improvement of command generators
So that the command generator can be organized into a protocol-module structure similar to other sections, this ticket outlines its current status and some potential caveats.
Current status: four break-points are required for a complete analysis, i.e. four commands must be run in sequence:
- The calling of molecular phenotypes, as a complete and self-contained notebook.
- The subcommand `plink_per_chrom` processes the genotype once.
- The subcommand `factor` runs all the processing of bed and covariate analysis while leaving the genotype file intact (assuming genotype processing is already done).
- The subcommand `TensorQTL`/`APEX`/`UniSuSiE` runs the final association analysis and other file preparation to generate unique input.
Potential enhancements:
- Automatically detect finished steps from the recipe file and skip them.
  - Can the break-points be consolidated further? For example, check all the input genotypes and, if they are all the same, automatically run break-point 2 only once.
- Remove the GRM so that APEX runs an OLS analysis, just like TensorQTL.
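The first enhancement could be sketched roughly as follows. This is only an illustration, assuming each step in the recipe can be described by a name plus the output files it is expected to produce; the function `steps_to_run` and the `(name, outputs)` structure are hypothetical and not part of the current command generator.

```python
from pathlib import Path


def steps_to_run(recipe_steps):
    """Return the names of the steps that still need to be run.

    `recipe_steps` is an ordered list of (step_name, expected_output_paths)
    pairs. This layout is an assumption made for illustration only.

    The pipeline is treated as linear: the first step with a missing output
    and every step after it are returned, because downstream results may be
    stale once an upstream step has to be rerun.
    """
    for index, (_name, outputs) in enumerate(recipe_steps):
        # A step with no declared outputs cannot be verified, so it is rerun.
        finished = bool(outputs) and all(Path(p).is_file() for p in outputs)
        if not finished:
            return [step_name for step_name, _ in recipe_steps[index:]]
    return []
```

Checking only for file existence is the simplest possible test of "deemed finished"; comparing timestamps or checksums of inputs and outputs would be stricter, at the cost of more bookkeeping.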
Caveats:
- The execution of each command requires all inputs to be placed in the correct folder and named as if they were the output of the previous step.
- Accommodating additional tissues is problematic for `plink_per_gene`: the generated region list would have to be merged manually with `cat {} | sort | uniq`.
- The break-point is not clearly defined by the stage as shown below; can this be improved?
Potential non-standard operations:
- Genotype processing done elsewhere: BP 1, 3, and 4 are still required.
- Phenotype processing done elsewhere, but genotype not done: BP 2, 3, and 4 are still required, because the region list needs to be generated and the phenotype needs to be put in the correct order.
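The two cases above can be expressed as a small lookup, sketched here under the assumption that break-points are numbered 1 to 4 in the order listed under "Current status". The function name and its parameters are hypothetical. The result when both flags are set is an extrapolation that the ticket does not state.

```python
# Break-point numbering follows the order of the four commands listed in this ticket.
BREAK_POINTS = {
    1: "molecular phenotype calling",
    2: "plink_per_chrom",
    3: "factor",
    4: "TensorQTL/APEX/UniSuSiE",
}


def required_break_points(phenotype_done_elsewhere=False, genotype_done_elsewhere=False):
    """Return the break-point numbers that still have to be run."""
    required = [1, 2, 3, 4]
    if phenotype_done_elsewhere:
        required.remove(1)  # phenotype calling is skipped
    if genotype_done_elsewhere:
        required.remove(2)  # per-chromosome genotype processing is skipped
    return required
```

For example, `required_break_points(genotype_done_elsewhere=True)` gives `[1, 3, 4]`, matching the first case, and `required_break_points(phenotype_done_elsewhere=True)` gives `[2, 3, 4]`, matching the second.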