MAGeCK is pulling conditions instead of unique sample names
Description of the bug
I have an experimental setup with biological triplicates of each of my conditions: treated_5hr, treated_7hr, and control. My sample names, as specified in the samplesheet, are:
treated_5hr_1
treated_5hr_2
treated_5hr_3
treated_7hr_1
treated_7hr_2
treated_7hr_3
control_1
control_2
control_3
However, my MAGeCK count log shows my sample labels as:
--sample-label treated_5hr,treated_5hr,treated_5hr,treated_7hr,treated_7hr,treated_7hr,control,control,control
As a result, I can't tell my replicates apart in my count tables (count_table.count.txt, count_table.count_normalized.txt):
sgRNA Gene treated_5hr treated_5hr treated_5hr treated_7hr treated_7hr treated_7hr control control control
sgRNA1976 CD28 303 315 134 207 374 438 399 281 329
sgRNA56069 POTEB2 350 652 587 555 501 784 558 785 509
sgRNA37077 ZC2HC1C 224 53 43 121 0 57 308 73 76
sgRNA7735 KRT5 571 458 393 396 533 811 339 278 352
sgRNA9783 OPRM1 164 39 67 167 386 120 177 145 107
[...]
Is there an additional metadata column I can pass in my samplesheet, or is there another parameter somewhere I've missed? Or is this somehow pulling names from the condition column instead?
+1, any suggested workaround for this?
Hi @nick-phillips and @jeremymsimon,
I recently joined the development of the pipeline and have run into the same problem you describe. I don't think there is a workaround at the moment, but it should be easy to fix.
I will open a PR to try to change the sample labels in the mageck command line, and let you know!
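To illustrate the idea, here is a minimal Python sketch of how unique labels could be built from the condition names before they are passed to `--sample-label`. This is not the code from the PR. The helper name `unique_sample_labels` is hypothetical, and the sketch assumes the conditions arrive in the same order as the samples in the samplesheet.

```python
from collections import Counter


def unique_sample_labels(conditions):
    """Append a running replicate index to each condition name.

    Hypothetical helper for illustration only; it assumes replicates
    are listed in samplesheet order.
    """
    seen = Counter()
    labels = []
    for condition in conditions:
        seen[condition] += 1
        labels.append(f"{condition}_{seen[condition]}")
    return labels


# Conditions as they currently appear in the MAGeCK count log.
conditions = ["treated_5hr"] * 3 + ["treated_7hr"] * 3 + ["control"] * 3

labels = unique_sample_labels(conditions)

# Comma-separated value for the --sample-label option of `mageck count`.
sample_label_arg = ",".join(labels)
print("--sample-label", sample_label_arg)
```

With labels built this way, each column in the count table would carry its own replicate name rather than the shared condition name.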
closed by #252