
Refactoring some of the global steps.

Open ypriverol opened this issue 7 months ago • 5 comments

Description of feature

In the current design of the quantms workflow, some steps, such as FILE_PREPARATION and SUMMARY_PIPELINE, sit at the root of the workflow (see the example 👇: QUANTMS:FILE_PREPARATION), while the majority of the steps are under QUANTMS:DIA:{}. This is confusing because the user cannot tell whether these steps are part of the DIA workflow or something else.

[61/ebba76] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:THERMORAWFILEPARSER (RD139_Narrow_UPS1_0_1fmol_inj1)
[11/da3064] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:THERMORAWFILEPARSER (RD139_Narrow_UPS1_0_1fmol_inj2)
[92/c06fe7] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:THERMORAWFILEPARSER (RD139_Narrow_UPS1_0_25fmol_inj1)
[a8/cbce0a] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:THERMORAWFILEPARSER (RD139_Narrow_UPS1_0_25fmol_inj2)
[03/2951bf] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:GENERATE_CFG (PXD026600.sdrf)
[de/bbcb6d] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:MZML_STATISTICS (RD139_Narrow_UPS1_0_1fmol_inj1)
[bb/dd9713] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:MZML_STATISTICS (RD139_Narrow_UPS1_0_1fmol_inj2)
[18/c95e9a] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:MZML_STATISTICS (RD139_Narrow_UPS1_0_25fmol_inj1)
[f8/0419f0] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:MZML_STATISTICS (RD139_Narrow_UPS1_0_25fmol_inj2)
[a6/bd956f] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:INSILICO_LIBRARY_GENERATION (REF_EColi_K12_UPS1_combined.fasta)
[24/55e348] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:PRELIMINARY_ANALYSIS (RD139_Narrow_UPS1_0_25fmol_inj1)
[bb/84b883] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:PRELIMINARY_ANALYSIS (RD139_Narrow_UPS1_0_25fmol_inj2)
[05/1c9dee] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:PRELIMINARY_ANALYSIS (RD139_Narrow_UPS1_0_1fmol_inj1)
[31/8025f2] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:PRELIMINARY_ANALYSIS (RD139_Narrow_UPS1_0_1fmol_inj2)
[25/4a9fe1] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:ASSEMBLE_EMPIRICAL_LIBRARY (PXD026600.sdrf)
[a7/8c0f3b] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:INDIVIDUAL_ANALYSIS (RD139_Narrow_UPS1_0_25fmol_inj2)
[3a/95b91a] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:INDIVIDUAL_ANALYSIS (RD139_Narrow_UPS1_0_1fmol_inj1)
[65/0149f6] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:INDIVIDUAL_ANALYSIS (RD139_Narrow_UPS1_0_25fmol_inj1)
[54/c62c57] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:INDIVIDUAL_ANALYSIS (RD139_Narrow_UPS1_0_1fmol_inj2)
[bf/61dd1a] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:FINAL_QUANTIFICATION (PXD026600.sdrf)
DIANNCONVERT is based on the output of DIA-NN 1.8.1, 2.0.* and 2.1.*, other versions of DIA-NN don't support mzTab conversion.
[a5/95c2c6] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:CONVERT_RESULTS (PXD026600.sdrf)
[05/693a45] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:MSSTATS_LFQ (PXD026600.sdrf_openms_design_msstats_in.csv)
[92/f17800] Submitted process > BIGBIO_QUANTMS:QUANTMS:SUMMARY_PIPELINE (1)
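
To make the grouping in the log above easier to see, here is a minimal Python sketch that buckets the submitted processes by the scope directly under BIGBIO_QUANTMS:QUANTMS. The function name `group_by_scope` and the `(root)` label are made up for illustration; this is not part of quantms or Nextflow, only a quick way to read the log.

```python
import re
from collections import defaultdict

# Matches Nextflow "Submitted process" lines like the ones in the log above.
LINE_RE = re.compile(r"^\[\w+/\w+\] Submitted process > (\S+)")


def group_by_scope(log_lines, root="BIGBIO_QUANTMS:QUANTMS"):
    """Map each scope directly under `root` to the process names it contains.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(set)
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip warnings and other non-submission lines
        name = match.group(1)
        if not name.startswith(root + ":"):
            continue
        parts = name[len(root) + 1:].split(":")
        # A single remaining part means the process sits directly at the root.
        scope = parts[0] if len(parts) > 1 else "(root)"
        groups[scope].add(parts[-1])
    return dict(groups)


sample = [
    "[61/ebba76] Submitted process > BIGBIO_QUANTMS:QUANTMS:FILE_PREPARATION:THERMORAWFILEPARSER (RD139_Narrow_UPS1_0_1fmol_inj1)",
    "[03/2951bf] Submitted process > BIGBIO_QUANTMS:QUANTMS:DIA:GENERATE_CFG (PXD026600.sdrf)",
    "[92/f17800] Submitted process > BIGBIO_QUANTMS:QUANTMS:SUMMARY_PIPELINE (1)",
]
print(group_by_scope(sample))
```

On these three lines, THERMORAWFILEPARSER lands under FILE_PREPARATION, GENERATE_CFG under DIA, and SUMMARY_PIPELINE is the only name with no enclosing scope.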

I'm wondering whether it would be worth refactoring this and taking a different approach. We have several options:

  • [ ] 1. Leave it as it is, with generic steps at the QUANTMS level: QUANTMS:{}
  • [ ] 2. Repeat some of these steps in each of DIA, TMT and LFQ: QUANTMS:DIA:{} or QUANTMS:LFQ:{}
  • [ ] 3. Use a generic prefix such as: QUANTMS:GENERIC:{}
  • [ ] 4. Create a global subworkflow for each kind of step: QUANTMS:FILE_PROCESSING:{STEPS} and QUANTMS:SUMMARY_PIPELINE:PMULTIQC
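
As a rough illustration of how the four options would change what shows up in the log, the sketch below writes the name patterns from the list as plain string templates. This is only one reading of the options; `TEMPLATES` and `qualified_name` are hypothetical names, and nothing here reflects how Nextflow builds process names internally.

```python
# One reading of the four options, expressed as name templates (illustrative only).
TEMPLATES = {
    1: "QUANTMS:{step}",                # generic step directly under QUANTMS
    2: "QUANTMS:{analysis}:{step}",     # step repeated inside DIA, TMT and LFQ
    3: "QUANTMS:GENERIC:{step}",        # shared generic prefix
    4: "QUANTMS:{subworkflow}:{step}",  # dedicated global subworkflow
}


def qualified_name(option, step, analysis="DIA", subworkflow="FILE_PROCESSING"):
    """Return the name a step would get under the given option (hypothetical helper)."""
    # str.format ignores keyword arguments that a template does not use.
    return TEMPLATES[option].format(
        step=step, analysis=analysis, subworkflow=subworkflow
    )


for option in sorted(TEMPLATES):
    print(option, qualified_name(option, "THERMORAWFILEPARSER"))
print(4, qualified_name(4, "PMULTIQC", subworkflow="SUMMARY_PIPELINE"))
```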

What do you think, guys?

ypriverol avatar May 22 '25 11:05 ypriverol

I have no strong opinion here.

timosachsenberg avatar May 22 '25 11:05 timosachsenberg

Option 1. It's the only one that correctly represents the DAG.

jpfeuffer avatar May 22 '25 11:05 jpfeuffer

Actually, @jpfeuffer, the only step that does not follow this pattern is:

Submitted process > BIGBIO_QUANTMS:QUANTMS:SUMMARY_PIPELINE

We should probably move it into something like BIGBIO_QUANTMS:QUANTMS:SUMMARY_PIPELINE:PMULTIQC

ypriverol avatar May 22 '25 11:05 ypriverol

Option 1. It is consistent with the steps. I have no strong opinion on pmultiqc, because SUMMARY_PIPELINE contains only one step: pmultiqc has just been renamed to SUMMARY_PIPELINE.

daichengxin avatar May 22 '25 12:05 daichengxin

Ah, if pmultiqc is the only step in the summary pipeline subworkflow, you can also just use pmultiqc directly in the parent workflow if you want. But you don't have to; it's fine with me.

jpfeuffer avatar May 22 '25 12:05 jpfeuffer

This issue has been resolved in PR #551.

ypriverol avatar May 30 '25 09:05 ypriverol