foyer
Parallelizing option for foyer
Describe the behavior you would like added to Foyer
Atomtyping and parametrizing large systems can be slow, so I propose that we parallelize some of the steps in foyer to speed them up. The two places where I think parallelization can be applied are the atomtyping step (in atomtyper.py) and the parametrization step (in forcefield.py, the parametrize method of the Forcefield class).
Describe the solution you'd like
Add an option for the user to parallelize the steps mentioned above.
Within the foyer API, this gets hard because methods like parametrize_system all operate on the same object, so shared-memory approaches might be difficult to parallelize.
For something like run_atomtyping, you could imagine trying to split up your entire chemical system (pmd.Structure) into individual molecules (pmd.Structure) and distributing those structures across threads, processes, or workers. With the residue map, you could try to send the same molecules to the same worker. Or you could use something like dask to handle distributing the workloads.
Regardless, it could end up looking hairy.
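To make the splitting idea a bit more concrete, here is a minimal sketch of it using only the standard library. Nothing in it is foyer or parmed API: the molecule representation (a residue name plus a tuple of elements), `type_molecule`, and `type_system` are all hypothetical stand-ins. It types each unique molecule once, in the spirit of the residue map, and fans the unique molecules out over an executor.

```python
from concurrent.futures import ThreadPoolExecutor


def type_molecule(molecule):
    """Hypothetical toy atomtyper: label atoms by residue, element, index."""
    name, elements = molecule
    return [f"{name}_{element}{i}" for i, element in enumerate(elements)]


def type_system(molecules, executor):
    """Type every molecule, doing the work only once per unique molecule."""
    # dict.fromkeys keeps first-seen order and drops repeated molecules
    unique = list(dict.fromkeys(molecules))
    # executor.map returns results in the same order as its input
    typed = dict(zip(unique, executor.map(type_molecule, unique)))
    # Fan the cached result back out to every copy in the system
    return [typed[molecule] for molecule in molecules]


# Assumed toy system: three waters and two methanes
system = [("HOH", ("O", "H", "H"))] * 3 + [("MET", ("C", "H", "H", "H", "H"))] * 2

with ThreadPoolExecutor(max_workers=2) as pool:
    types = type_system(system, pool)
```

A thread pool is used here only to keep the sketch easy to run. For CPU-bound pure-Python work the GIL limits what threads can gain, so a process pool or dask workers would be the more likely choice, at the cost of pickling each molecule on the way to a worker.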
Apologies if this information is readily available, but have we done any profiling? What are the limiting steps?
In general it's parmed that causes big systems to type slowly. Finding the atom types is pretty quick, but actually populating the parmed system with the parameters can be slow (and sometimes even memory limited). There's a figure in the 2019 paper (Fig. 2, I think) that demonstrates linear scaling over a decent range of sizes for a chemically simple system. I have some old notebooks that tried to make this systematic for different types of systems (highly bonded, not at all bonded, large XMLs, small XMLs, etc.), but the results were unsurprising.
On Wed, Jun 17, 2020, 7:16 PM Ryan S. DeFever wrote:
Apologies if this information is readily available, but have we done any profiling? What are the limiting steps?
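For anyone who wants to check the limiting steps on their own system, a rough sketch of how that could be profiled with the standard library's cProfile is below. The three functions are hypothetical placeholders, not foyer or parmed calls; in practice the profiled call would be whatever applies the force field to your structure.

```python
import cProfile
import io
import pstats


def find_atomtypes(n_atoms):
    """Hypothetical stand-in for the atomtyping step."""
    return ["type_A"] * n_atoms


def populate_parameters(atomtypes):
    """Hypothetical stand-in for filling a structure with parameters."""
    return [{"type": atomtype, "charge": 0.0} for atomtype in atomtypes]


def apply_forcefield(n_atoms):
    """Run both toy steps back to back, like a single 'apply' call."""
    return populate_parameters(find_atomtypes(n_atoms))


profiler = cProfile.Profile()
profiler.enable()
result = apply_forcefield(10_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time makes it easy to see whether the typing step or the parameter-population step dominates, which is the comparison described above.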
@mattwthompson thanks for the insight. Do you think that's something we can improve on when we replace parmed with our own backend? It's also good to know that we should keep an eye on performance there as we figure that out.
Yes, and splitting out the two logical steps of Forcefield.apply should probably be the first step to being able to refactor for performance.
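As a rough illustration of what splitting the two steps could look like, here is a toy sketch. It is not how Forcefield.apply is implemented: `assign_atomtypes`, `assign_parameters`, `apply_two_step`, and `TOY_PARAMETERS` are hypothetical names, and the typing rule (type equals element) is deliberately trivial. The point is only that once the steps are separate functions, each can be timed, tested, and later optimized or parallelized on its own.

```python
import time

# Assumed toy parameter table, keyed by atom type
TOY_PARAMETERS = {"O": {"charge": -0.8}, "H": {"charge": 0.4}}


def assign_atomtypes(elements):
    """Step 1: pick a type name for every atom (toy rule: type == element)."""
    return list(elements)


def assign_parameters(atomtypes, parameters):
    """Step 2: look up parameters for atoms that are already typed."""
    return [parameters[atomtype] for atomtype in atomtypes]


def apply_two_step(elements, parameters):
    """Run both steps and record how long each one takes."""
    timings = {}
    start = time.perf_counter()
    atomtypes = assign_atomtypes(elements)
    timings["atomtyping"] = time.perf_counter() - start

    start = time.perf_counter()
    typed = assign_parameters(atomtypes, parameters)
    timings["parametrization"] = time.perf_counter() - start
    return typed, timings


water_box = ["O", "H", "H"] * 1000
typed, timings = apply_two_step(water_box, TOY_PARAMETERS)
total_charge = sum(atom["charge"] for atom in typed)
print(timings)
```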