xqtl-protocol

Current status and potential improvement of command generators

Open hsun3163 opened this issue 2 years ago • 1 comments

So that the command generator can be organized into a protocol-module structure similar to other sections, this ticket outlines its current status and some potential caveats.

Current status: A complete analysis has four break-points, i.e. four commands that must each be run to complete it.

  1. Molecular phenotype calling, which is a complete and self-contained notebook.
  2. The subcommand plink_per_chrom processes the genotype once.
  3. The subcommand factor runs all of the bed processing and covariate analysis while leaving the genotype file intact (assumed done).
  4. The subcommand TensorQTL/APEX/UniSuSiE runs the final association analysis, plus the other file preparation needed to generate its unique input.

Potential Enhancement:

  1. Automatically detect finished steps from the recipe file and skip them.
     1.1. Can the break-points be further consolidated? For example, check all the input genotypes and, if they are all the same, automatically run break-point 2 just once.
  2. Get rid of the GRM so that APEX runs an OLS analysis just like tensorQTL.
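
A minimal sketch of how enhancement 1 might look, assuming a recipe can be reduced to a mapping from break-point name to the output files expected once it has finished. The function names, the recipe layout, and the file names in the usage note are all hypothetical and are not part of the existing command generator.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def pending_break_points(recipe: Dict[str, Sequence[str]], root: Path) -> List[str]:
    """Return the break-points whose expected outputs are not all present.

    `recipe` is a hypothetical mapping: break-point name -> relative output paths.
    A break-point is deemed finished only if every listed output exists under `root`.
    """
    pending = []
    for name, outputs in recipe.items():
        if not all((root / out).exists() for out in outputs):
            pending.append(name)
    return pending


def group_by_genotype(genotype_by_analysis: Dict[str, str]) -> Dict[str, List[str]]:
    """Group analyses that share a genotype input.

    Each key of the result is one genotype file; it only needs to go through
    the genotype-processing break-point once for all analyses listed under it.
    """
    groups: Dict[str, List[str]] = {}
    for analysis, genotype in genotype_by_analysis.items():
        groups.setdefault(genotype, []).append(analysis)
    return groups
```

For example, `group_by_genotype({"brain": "geno.bed", "blood": "geno.bed"})` yields a single group, which would correspond to running the genotype step once for both analyses. Checking only for file existence is a simplification; it does not detect outputs that are stale or truncated.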

Caveats:

  1. Each command requires all of its input to be placed in the correct folder and named as if it were the output of the previous step.
  2. Accommodating an additional tissue is problematic for plink_per_gene: the generated region list would have to be manipulated manually with cat {} | sort | uniq.
  3. The break-points are not clearly delineated by stage, as shown below; can this be improved?
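
The manual cat {} | sort | uniq step in caveat 2 could be automated along these lines. This is a sketch only: it assumes each region list is a plain text file with one region per line, and the function name `merge_region_lists` is hypothetical. Header lines, if the real region lists have them, are not treated specially.

```python
from pathlib import Path
from typing import Iterable, List


def merge_region_lists(paths: Iterable[Path]) -> List[str]:
    """Concatenate region lists, drop duplicate lines, and sort the result.

    Roughly equivalent to `cat files | sort | uniq`, except that empty lines
    are skipped and sorting uses Python's code-point order (similar to
    `LC_ALL=C sort`), which may differ from a locale-aware shell sort.
    """
    unique_lines = set()
    for path in paths:
        with open(path) as handle:
            for line in handle:
                line = line.rstrip("\n")
                if line:  # skip empty lines
                    unique_lines.add(line)
    return sorted(unique_lines)
```

A caller would pass the per-tissue region lists and write the returned lines back out as the combined list before running plink_per_gene.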

Potential non-standard operations:

  1. Genotype processing done elsewhere: BP 1, 3, 4 are still required.
  2. Phenotype done elsewhere but genotype not done: BP 2, 3, 4 are still required, because the region list needs to be generated and the phenotype needs to be put in the correct order.
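
The two scenarios above can be expressed as a small selection rule. The sketch below is illustrative only: the `BREAK_POINTS` table and the function `required_break_points` are hypothetical names, and the case where both genotype and phenotype are done elsewhere is an extrapolation that the scenarios above do not cover.

```python
from typing import List

# Break-point numbers mapped to the steps listed under "Current status".
BREAK_POINTS = {
    1: "molecular phenotype calling",
    2: "plink_per_chrom",
    3: "factor",
    4: "TensorQTL/APEX/UniSuSiE",
}


def required_break_points(genotype_done_elsewhere: bool = False,
                          phenotype_done_elsewhere: bool = False) -> List[int]:
    """Return the break-point numbers that still have to be run."""
    skipped = set()
    if genotype_done_elsewhere:
        skipped.add(2)  # genotype processing is not repeated
    if phenotype_done_elsewhere:
        skipped.add(1)  # phenotype calling is not repeated
    # Assumption: if both are done elsewhere, only BP 3 and 4 remain.
    return [bp for bp in sorted(BREAK_POINTS) if bp not in skipped]
```

With `genotype_done_elsewhere=True` the function returns BP 1, 3, 4, and with `phenotype_done_elsewhere=True` it returns BP 2, 3, 4, matching the two cases listed.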

hsun3163 · Apr 20 '22 20:04