pymzML

Iterator reinitializes itself at the end of iteration, leading to pointless reindexing.

Open arabidopsis opened this issue 2 years ago • 1 comments

In the run.py:Reader.__next__ method, the 'END' event triggers a re-opening of the file, which leads to another reindexing of a possibly very large file (if it contains no index).

Since we have already run through the file this is quite likely pointless.

If I want to run through the file again I'll re-open it myself.

If you want to be smarter about this, then just a seek(0) on the underlying file pointer would be better. Possibly a reset method on the underlying interface(s).
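A minimal sketch of the seek(0) idea, for illustration only. The class RewindableReader, its reset method, and the line-offset index are hypothetical and are not pymzML's actual Reader; the sketch assumes a seekable text file object in which every line is one record.

```python
import io


class RewindableReader:
    """Hypothetical reader: builds its offset index once and rewinds with seek(0)."""

    def __init__(self, file_obj):
        self._fh = file_obj
        self.index_builds = 0  # counts how often the (expensive) indexing runs
        self._offsets = self._build_index()
        self._pos = 0

    def _build_index(self):
        # Stand-in for the expensive indexing step: record where each line starts.
        self.index_builds += 1
        offsets = []
        self._fh.seek(0)
        while True:
            offset = self._fh.tell()
            line = self._fh.readline()
            if not line:
                break
            offsets.append(offset)
        self._fh.seek(0)
        return offsets

    def reset(self):
        # Rewind the existing file pointer; the index is kept as-is.
        self._fh.seek(0)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._offsets):
            # End of iteration: rewind instead of reopening and reindexing.
            self.reset()
            raise StopIteration
        self._fh.seek(self._offsets[self._pos])
        self._pos += 1
        return self._fh.readline().rstrip("\n")


reader = RewindableReader(io.StringIO("spec1\nspec2\nspec3\n"))
first_pass = list(reader)
second_pass = list(reader)
```

Both passes yield the same records, and index_builds stays at 1 because the second pass reuses the index built in the constructor.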

Plus, the original underlying file pointer is not explicitly closed before the new one is created. (It will be, eventually, with garbage collection, but ... bad form.)
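If a reader does reopen its file, the old handle can be closed explicitly first. The sketch below illustrates this under assumptions: ReopeningReader and its reopen method are hypothetical names and do not reflect how pymzML structures its file handling.

```python
import os
import tempfile


class ReopeningReader:
    """Hypothetical reader that closes its old handle before opening a new one."""

    def __init__(self, path):
        self._path = path
        self._fh = open(path, "r")

    def reopen(self):
        # Close the current handle explicitly rather than leaving it to
        # garbage collection, then open a fresh one.
        old = self._fh
        old.close()
        self._fh = open(self._path, "r")
        return old  # returned only so the caller can inspect it

    def readline(self):
        return self._fh.readline()

    def close(self):
        self._fh.close()


# Usage with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("spectrum 1\nspectrum 2\n")

reader = ReopeningReader(tmp.name)
old_handle = reader.reopen()
first_line = reader.readline()
reader.close()
os.remove(tmp.name)
```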

arabidopsis avatar Jun 24 '22 21:06 arabidopsis

Hi @arabidopsis,

Thanks for reporting this. The reinitialization was implemented due to a user request after some internal discussion. But you are right: reindexing the whole thing after iterating does not make sense, and setting the pointer to the beginning is indeed smarter. I'll change this as soon as I have time and also fix the problem with the open pointer!

Best, Manuel

MKoesters avatar Jun 27 '22 21:06 MKoesters

Should be solved with #307. We still reset the iterator, but no reindexing is performed after the reset. If you think your issue was not addressed as it should be, feel free to reopen it.

Best, Manuel

MKoesters avatar Nov 10 '22 12:11 MKoesters