goleft
Suggestions : paths ; case/control
Hi Brent, here are two suggestions for indexcov:
- using a file containing the paths to the BAMs (to avoid something like xargs)
- if we could include the fact that some samples are 'cases' or 'controls', would it improve your algorithm?
Thanks
Could you expand on the first point? Do you mean you want to avoid an "argument list too long" error or something?
For the 2nd point, indexcov only does within-sample normalization, not between-sample. I did have a mode where you could specify that the first $N samples were of interest and the remaining were background, to give an idea of how a maybe small $N looks given a large background, but I removed this as it made the code and interface more complex. I'm hesitant to revisit, but I might be convinced.
> you mean you want to avoid argument list too long error or something?
Yes. Something like what the Broad is doing with the '.list' suffix: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_engine_CommandLineGATK.php#--input_file
> An input file containing sequence data mapped to a reference, in BAM or CRAM format, or a text file containing a list of input files (with extension .list).
> I'm hesitant to revisit, but I might be convinced
I wouldn't be able to convince you. I wondered if there was something to explore here.
> > you mean you want to avoid argument list too long error or something?
> Yes. Something like what the Broad is doing with the '.list' suffix: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_engine_CommandLineGATK.php#--input_file
That's doable, though I suspect if your command is that long then it'll be hard to do much with the output. Though I guess you have your own viewer that can overcome the limitations of the html one included. Let's keep this open and I'll try to get around to adding that.
> > An input file containing sequence data mapped to a reference, in BAM or CRAM format, or a text file containing a list of input files (with extension .list).
> > I'm hesitant to revisit, but I might be convinced
> I wouldn't be able to convince you. I wondered if there was something to explore here.
There may be, but I don't have the bandwidth for now.
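For anyone landing here, the '.list' convention discussed above could be sketched roughly as follows. This is only an illustration of the idea, not how indexcov handles its arguments: `expand_inputs` is a hypothetical helper, and skipping blank lines and `#` comments is my own assumption rather than something GATK or goleft documents.

```python
from pathlib import Path
from typing import Iterable, List


def expand_inputs(args: Iterable[str], list_suffix: str = ".list") -> List[str]:
    """Expand any argument ending in `list_suffix` into the paths it lists.

    Hypothetical helper: arguments without the suffix are passed through
    unchanged, in their original order.
    """
    paths: List[str] = []
    for arg in args:
        if arg.endswith(list_suffix):
            # Assumption: one path per line; blank lines and '#' comments are ignored.
            for line in Path(arg).read_text().splitlines():
                line = line.strip()
                if line and not line.startswith("#"):
                    paths.append(line)
        else:
            paths.append(arg)
    return paths
```

Called as `expand_inputs(["cohort.list", "extra.bam"])`, it would return the paths listed in `cohort.list` followed by `extra.bam`. Because the paths are read from a file instead of the command line, the shell's argument-length limit no longer applies to them.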