Manuel Nuno Melo
Sadly, this speedup is still well behind `gmx trjconv` speeds :( For a synthetic test with an `adk.xtc` consisting of 1000 frames of the same structure (`adk_oplsaa.gro`) I get the...
Made a final optimization, and I think this is ready to merge. An initial check for bonds was being made that was costing about 40% of the entire unwrap call...
... and the results are in: @richardjgowers' tweak to `make_whole` brings the time of that synthetic trajectory test way down! We're now only 5x slower than `gmx trjconv`! ``` real...
Hello and as usual apologies for real life keeping me away from the cool stuff I'd like to make happen. If you're up to the task, @PicoCentauri, I'll try to...
@richardjgowers I agree with leaving the index-splitting caching out. Too complex, little gain. The fragment cache stays, right? There was quite some gain there.
Perfect, I'll split it up, then. Just to be clear, I only optimized the wrapping. Was actually right now looking at the unwrapping and think we can look at some...
Thanks for the positive outlook, @orbeckst. To make this work I propose some refactoring to the compound and cache code in #3000 and #3005, and further unwrap-specific optimizations in #2376....
I don't know if we'd be able to work around the GIL, but it'd probably be a task for multithreading, where one of the threads carries on with CPU-intensive...
I think this might work, and certainly more so if the load can be spread between threads because then you also gain whatever performance is lost to I/O. Just as...
@orbeckst My idea was to find the maximum possible gain if (a) frame reading occurs separately from per-frame calculations and (b) frame iteration is limited by CPU, not by frame reading. This would mean...
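For illustration only, here is a rough sketch of the idea discussed in these comments: one thread reads frames ahead while the main thread does the per-frame work. It is not code from any of the PRs mentioned. `prefetch`, `fake_reader` and `analyse` are hypothetical names, and the `time.sleep` call merely stands in for frame I/O. Whether this helps in practice depends on the reading code releasing the GIL. A reader that reuses a single frame buffer would also need to copy each frame before queueing it.

```python
import queue
import threading
import time

_SENTINEL = object()  # marks the end of the frame stream


def prefetch(frames, maxsize=4):
    """Yield items from `frames`, reading ahead in a background thread.

    At most `maxsize` frames are buffered. Errors raised by the reader
    are not propagated in this sketch; the stream simply ends.
    """
    buf = queue.Queue(maxsize=maxsize)

    def reader():
        try:
            for frame in frames:
                buf.put(frame)  # blocks while the buffer is full
        finally:
            buf.put(_SENTINEL)

    # Daemon thread, so an abandoned iteration does not block exit.
    threading.Thread(target=reader, daemon=True).start()

    while True:
        item = buf.get()
        if item is _SENTINEL:
            return
        yield item


def fake_reader(n_frames):
    """Hypothetical stand-in for a trajectory reader."""
    for i in range(n_frames):
        time.sleep(0.001)  # stands in for I/O wait
        yield i


def analyse(frame):
    """Hypothetical stand-in for the CPU-bound per-frame calculation."""
    return frame * frame


results = [analyse(f) for f in prefetch(fake_reader(5))]
```

With this layout the loop body only waits when the buffer is empty, so iteration is limited by the slower of the two stages rather than by their sum.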