pandora
Compare should have option to pass reads as a list
I understand that compare takes a tsv of sample -> filepath mappings to support the use case where the user has a large number of samples.
It would also be great if you could pass the reads on the command line without needing to create a tsv. Something like:
pandora compare -p prg.fa -r data/*.fastq -o outdir/
Maybe a better way would be to have two input options, exactly one of which is required:
pandora compare --reads data/*.fastq ...
# and tsv option
pandora compare --tsv mappings.tsv ...
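The "one or the other" requirement can be sketched with a mutually exclusive, required option group. This is only an illustration in Python's `argparse`, not pandora's actual argument parser; the program name `compare-sketch` and the `--prg`/`--outdir` long names are assumptions made for the example, while `--reads` and `--tsv` follow the proposal above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where exactly one of --reads / --tsv must be given."""
    # "compare-sketch" is a hypothetical name, not the real pandora CLI.
    parser = argparse.ArgumentParser(prog="compare-sketch")
    parser.add_argument("-p", "--prg", required=True, help="PRG file")
    parser.add_argument("-o", "--outdir", default=".", help="output directory")

    # required=True: at least one of the group's options must be present.
    # Mutual exclusivity: supplying both is rejected by argparse.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument(
        "--reads",
        nargs="+",
        metavar="FASTQ",
        help="one fastq per sample (the shell expands data/*.fastq)",
    )
    source.add_argument(
        "--tsv",
        metavar="TSV",
        help="tsv of sample -> filepath mappings",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because the shell expands `data/*.fastq` before the program starts, the parser simply receives a list of paths; `nargs="+"` collects them into `args.reads`.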
How is data/*.fastq to be interpreted? Which fastq is which sample, and what sample names go in the vcf? All fastq for one sample? Or is each fastq a different one?
> How is data/*.fastq to be interpreted? Which fastq is which sample, and what sample names go in the vcf?
Well, the way I envision it being interpreted is that you have 4 fastqs in a directory (like with our four-way), where each one is a sample.
$ ls data/
ST38.fastq CFT073.fastq 063.fastq H131800743.fastq
The idea would be that the sample is named after the filename prefix. So ST38.fastq would get the name ST38 and this would be the name that goes in the VCF.
> All fastq for one sample? Or each fastq is a different one?
Each fastq would be a different sample.
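A minimal sketch of that naming rule, assuming one fastq per sample and the sample name taken from the filename prefix. The helper names `sample_name` and `reads_to_samples` are hypothetical, and stripping `.fq` and gzipped suffixes as well as `.fastq` is an extra assumption not discussed above. Rejecting two files that map to the same name is also an assumption, added because sample names in the VCF need to be unique.

```python
from pathlib import Path

# Suffixes treated as read files; longest first so ".fastq.gz" wins over ".gz".
# Anything beyond ".fastq" is an assumption for this sketch.
READ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def sample_name(path: str) -> str:
    """Derive a sample name from the filename prefix of a reads file."""
    name = Path(path).name
    for suffix in READ_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    # Unknown extension: fall back to dropping only the last suffix.
    return Path(path).stem


def reads_to_samples(paths: list[str]) -> dict[str, str]:
    """Map each reads file to its own sample, refusing duplicate names."""
    samples: dict[str, str] = {}
    for path in paths:
        name = sample_name(path)
        if name in samples:
            raise ValueError(
                f"sample name {name!r} derived from both "
                f"{samples[name]} and {path}"
            )
        samples[name] = path
    return samples


if __name__ == "__main__":
    files = [
        "data/ST38.fastq",
        "data/CFT073.fastq",
        "data/063.fastq",
        "data/H131800743.fastq",
    ]
    # Print one "sample<TAB>filepath" row per reads file.
    for name, path in reads_to_samples(files).items():
        print(f"{name}\t{path}")
```

The resulting mapping carries the same sample -> filepath information as the tsv, so both input styles could feed the same downstream code.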
I propose adding this feature when dealing with #53 and #140, which are both related to improving argument parsing. As this issue and the other argument-parsing issues are not hard to work around, I'd suggest leaving them until after all the issues that are essential for the paper's analysis are resolved (i.e. during the submission/reviewing process).