Niek de Jonge

68 comments by Niek de Jonge

Interesting plans! If I understand it correctly, the main goals are to make creating new pipelines easier and cleaner, and to speed up the score calculation? One thing I wonder...

Thanks, good points. Good news: the following HDF5-based storage method was recently published. mzMLb: A Future-Proof Raw Mass Spectrometry Data Format Based on Standards-Compliant mzML and Optimized for Speed...

I agree with your point. We are indeed not looking to create a new mzML format, and actually I would always suggest sharing and storing the data in the publicly...

I think #584 should fix the issue, @ADablanc. Sorry that it took so long to fix this issue...

Yes that is what I want to do. I could just set it to 100 spectra or 100 unique inchikeys, but, since I am not fully sure what it is...

In MS2Query I solved it by just using large test sets of 2000 spectra, so that the issue never occurs, but as a result the tests take very long to run (minutes), which...

I think I figured it out now. It seems like the number of unique inchikeys has to be larger than the batch size. So for the tests we can use...

Yes, I will add a check in the data generators that checks if there are more unique inchikeys than the batch size. This should indeed return a clear error message.
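A minimal sketch of what such a check could look like, assuming the data generator has access to the InChIKey of every spectrum before batching starts. The function name `check_enough_unique_inchikeys` and its parameters are hypothetical and not taken from the actual code base; the condition follows the observation above that the number of unique InChIKeys has to be larger than the batch size.

```python
from typing import Iterable, Optional


def check_enough_unique_inchikeys(
    inchikeys: Iterable[Optional[str]], batch_size: int
) -> int:
    """Return the number of unique InChIKeys, or raise a clear error.

    Hypothetical helper: raises ValueError unless the number of unique
    InChIKeys is strictly larger than batch_size.
    """
    # Ignore missing or empty InChIKeys; only real keys count as unique.
    unique_inchikeys = {key for key in inchikeys if key}
    if len(unique_inchikeys) <= batch_size:
        raise ValueError(
            f"Found {len(unique_inchikeys)} unique inchikeys, but the batch size "
            f"is {batch_size}. The number of unique inchikeys must be larger "
            "than the batch size; add more spectra or lower the batch size."
        )
    return len(unique_inchikeys)
```

Calling this once when the generator is constructed would surface the problem immediately with a readable message, instead of failing later during batching.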

Yes, this restarting is always a bit difficult. Brainstorming a bit: Just checking the number of already processed spectra would not work, since spectra can be removed during filtering. We...

The logging that you mention could also be a good option. I like the idea of storing it in the header of the temporary spectrum file. This would directly be...