openff-evaluator icon indicating copy to clipboard operation
openff-evaluator copied to clipboard

"extract_" protocols take a long time relative to simulation

Open lilyminium opened this issue 3 months ago • 1 comments

During equilibration, by default we simulate 200 ps at a time. This seems to take on the order of a minute.

To check equilibration conditions, I usually set up conditions for Density and PotentialEnergy. The protocol of extracting the data often takes up to 7 minutes, and this happens in sequence. That means up to 90% the compute time is actually taken up by processing data. I believe the work happens in BaseAverageObservable._execute, which runs the following steps:

  • analyses the timeseries to find the equilibration point and statistical inefficiency using pymbar
  • gets uncorrelated points of data
  • bootstraps the observables

Can any time be saved in these steps? For equilibration purposes, we don't need the boostrapping -- potentially we could subclass BaseAverageObservable (or superclass it) and skip that last step.

Example logging with timestamps:

2025-08-05 12:49:00.748 INFO     dhmix_1064373607860850902|equilibration_simulation_mixture finished executing after 53063.88087599771 ms
2025-08-05 12:49:00.866 INFO     Executing dhmix_1064373607860850902|extract_1_Density_mixture
2025-08-05 12:56:02.345 INFO     dhmix_1064373607860850902|extract_1_Density_mixture finished executing after 421478.9291450288 ms
2025-08-05 12:56:02.348 INFO     Executing dhmix_1064373607860850902|multiply_1_Density_mixture
2025-08-05 12:56:02.348 INFO     dhmix_1064373607860850902|multiply_1_Density_mixture finished executing after 0.2952350187115371 ms
2025-08-05 12:56:02.350 INFO     Executing dhmix_1064373607860850902|absolute_1_Density_mixture
2025-08-05 12:56:02.350 INFO     dhmix_1064373607860850902|absolute_1_Density_mixture finished executing after 0.19303098088130355 ms
2025-08-05 12:56:02.745 INFO     Executing dhmix_1064373607860850902|extract_0_PotentialEnergy_mixture
2025-08-05 13:03:46.404 INFO     dhmix_1064373607860850902|extract_0_PotentialEnergy_mixture finished executing after 463659.129380947 ms
2025-08-05 13:03:46.408 INFO     Executing dhmix_1064373607860850902|multiply_0_PotentialEnergy_mixture
2025-08-05 13:03:46.408 INFO     dhmix_1064373607860850902|multiply_0_PotentialEnergy_mixture finished executing after 0.29414903838187456 ms
2025-08-05 13:03:46.410 INFO     Executing dhmix_1064373607860850902|absolute_0_PotentialEnergy_mixture
2025-08-05 13:03:46.410 INFO     dhmix_1064373607860850902|absolute_0_PotentialEnergy_mixture finished executing after 0.200728012714535 ms
2025-08-05 13:03:46.414 INFO     Executing dhmix_1064373607860850902|get_maximum_statistical_inefficiency_mixture

lilyminium avatar Aug 05 '25 20:08 lilyminium