disq
Support for FASTQ
The current Support Matrix does not mention the status of FASTQ files (currently there is no support). HadoopBAM has it, although it is not widely used, and I would like to see support for it here.
We can port the HadoopBAM version, or we can look at other solutions such as Fastdoop. I had previously contacted @umbfer about the availability of his library (before it was published on GitHub) and also mentioned this project to him, so maybe he can chime in if he is interested in contributing a port of his code to this project.
In any case, I think the library would benefit from this support, as FASTQ is still the most common file format for unmapped reads.
ADAM has support for FASTQ files. My main goal with the Disq library is to reduce duplication of effort among all projects using Apache Spark for genomics.
@heuermh - is it planned to port some functionality from ADAM?
Yes, I am in favor of bringing things from downstream projects (e.g. GATK4, ADAM) up into Disq as necessary.
Perfect! So in that case we keep this open until it is ported (or we decide not to).
Hi everybody, we are focused on developing a new version of Fastdoop and, unfortunately, we lack the time required to port our code into your framework. However, we could provide some form of support if some of you are willing to undertake this task.