Ruben Vorderman
On second thought, this extra open_multiple API is going to be confusing, and there is some code duplication. Let me simplify this.
Well, the gains are significant, but only because dnaio is already fast. It is going from 900ms for 5 million records to 700ms. So it is quite significant in percentage...
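To put the quoted figures in proportion, here is a small back-of-the-envelope sketch. It uses only the numbers from the comment above (5 million records, 900 ms before, 700 ms after); the function names are made up for illustration and are not part of dnaio.

```python
# Figures quoted in the comment: 5 million records, 900 ms before, 700 ms after.
RECORDS = 5_000_000
BEFORE_MS = 900
AFTER_MS = 700


def per_record_ns(total_ms: float, records: int) -> float:
    """Average time per record in nanoseconds (1 ms = 1,000,000 ns)."""
    return total_ms * 1_000_000 / records


def reduction_percent(before: float, after: float) -> float:
    """Relative reduction in run time, as a percentage of the old time."""
    return (before - after) / before * 100


def speedup(before: float, after: float) -> float:
    """How many times faster the new run is compared to the old one."""
    return before / after


if __name__ == "__main__":
    print(f"before: {per_record_ns(BEFORE_MS, RECORDS):.0f} ns/record")
    print(f"after:  {per_record_ns(AFTER_MS, RECORDS):.0f} ns/record")
    print(f"reduction: {reduction_percent(BEFORE_MS, AFTER_MS):.1f} %")
    print(f"speedup:   {speedup(BEFORE_MS, AFTER_MS):.2f}x")
```

In absolute terms this is a saving of 40 ns per record (180 ns down to 140 ns), which is roughly a 22 % shorter run time.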
I have an idea about an architectural change that may improve speed, but it only makes sense if we do the above move to C. The current flow is that...
Yes that is what I meant. Generalizing dnaio.open seems indeed the best path. `MultipleReader` is not intended to be the final name. I am struggling to think of a better...
Wouldn't it be easier to use the threading module and do the alignment `with nogil:`? That way the SequenceRecords themselves can be distributed at low cost (threading allows for shared...
Sure, you know that project best. So the SequenceRecordArray should be implemented. I propose the following design: We create a new struct: ``` cdef struct SequenceRecordOffsets: Py_ssize_t name_offset Py_ssize_t name_length...
I have been thinking a bit about this. FastqWriter could also simply use the boolean flag that is part of fastq_bytes. That would make it a lot simpler. As for...
I tried factoring the two-header system out of FastqIter altogether, but it is impossible to determine the two_header status outside that loop if the file cannot be seeked and...
Hi, in my latest PR #12 I noticed that adding methods is not without cost. It makes the parsing slightly slower. I also noticed a weird pattern. First it turned...
Thank you for your answer. I understand this is hard and I also understand that this feature may never be implemented because of this. The Docker registry was never designed...