atacseq

CONTROL description needs clarifications

Open mdozmorov opened this issue 8 months ago • 1 comments

Description of feature

The "CONTROL" option description in the "samplesheet input" section needs to be improved.

  1. What exactly does the "CONTROL" option do? From the description, it appears "CONTROL" samples may be used as input, but "input" is different from a control. This is especially confusing in the provided examples, where "TREATMENT" samples are designated as "CONTROL".
  2. "Example sheets without controls and with controls" - links are broken.
  3. It appears the "control" and "control_replicate" columns are only recognized when the "--with_control true" parameter is set, which is not clear from the documentation.
  4. The pipeline breaks when the samplesheet is created following the example. I created the spreadsheet as:
sample fastq_1 fastq_2 replicate control control_replicate
HSATACtr 00_raw/HSATACtr1_S67_R1_001.fastq.gz 00_raw/HSATACtr1_S67_R2_001.fastq.gz 1 CONTROL 1
HSATACtr 00_raw/HSATACtr2_S68_R1_001.fastq.gz 00_raw/HSATACtr2_S68_R2_001.fastq.gz 2 CONTROL 2
HSATACun 00_raw/HSATACun1_S63_R1_001.fastq.gz 00_raw/HSATACun1_S63_R2_001.fastq.gz 1
HSATACun 00_raw/HSATACun2_S64_R1_001.fastq.gz 00_raw/HSATACun2_S64_R2_001.fastq.gz 2

The pipeline errors with "ERROR: Please check samplesheet -> Control identifier and replicate has to match a provided sample identifier and replicate!"

Correcting the "control" column of the spreadsheet as follows

sample fastq_1 fastq_2 replicate control control_replicate
HSATACtr 00_raw/HSATACtr1_S67_R1_001.fastq.gz 00_raw/HSATACtr1_S67_R2_001.fastq.gz 1 HSATACtr 1
HSATACtr 00_raw/HSATACtr2_S68_R1_001.fastq.gz 00_raw/HSATACtr2_S68_R2_001.fastq.gz 2 HSATACtr 2
HSATACun 00_raw/HSATACun1_S63_R1_001.fastq.gz 00_raw/HSATACun1_S63_R2_001.fastq.gz 1
HSATACun 00_raw/HSATACun2_S64_R1_001.fastq.gz 00_raw/HSATACun2_S64_R2_001.fastq.gz 2

works. But this contradicts the documentation.
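The error message suggests that each "control" / "control_replicate" pair must equal the "sample" / "replicate" pair of some row in the sheet. Below is a minimal sketch of such a check, only to illustrate why the first sheet fails and the second passes. It is an assumption based on the error text, not the pipeline's actual validation code, and the function name `find_unmatched_controls` is hypothetical.

```python
def find_unmatched_controls(rows):
    """Return (control, control_replicate) pairs that match no (sample, replicate) row.

    Assumed rule, inferred from the error message; not the pipeline's own code.
    """
    # Every (sample, replicate) pair present in the sheet.
    known = {(row["sample"], row["replicate"]) for row in rows}
    unmatched = []
    for row in rows:
        control = row.get("control", "")
        if not control:
            continue  # this row has no control assigned
        pair = (control, row.get("control_replicate", ""))
        if pair not in known:
            unmatched.append(pair)
    return unmatched


# First sheet from this issue (fastq columns omitted for brevity).
failing = [
    {"sample": "HSATACtr", "replicate": "1", "control": "CONTROL", "control_replicate": "1"},
    {"sample": "HSATACtr", "replicate": "2", "control": "CONTROL", "control_replicate": "2"},
    {"sample": "HSATACun", "replicate": "1", "control": "", "control_replicate": ""},
    {"sample": "HSATACun", "replicate": "2", "control": "", "control_replicate": ""},
]

# Second sheet: only the "control" column differs.
working = [dict(row, control="HSATACtr") if row["control"] else row for row in failing]

print(find_unmatched_controls(failing))
print(find_unmatched_controls(working))
```

Under this assumed rule, the literal value "CONTROL" is rejected because no row has "CONTROL" as its sample name, while "HSATACtr" is accepted because it is.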

As of now, it feels safer to run the pipeline without "controls" because it is unclear what the consequences are.

mdozmorov, Oct 06 '23 11:10