Question about memory scaling during training
Hi,
Thanks for building these models. I noticed that the training scripts for the MP pre-trained models use a small batch size of 16. What was the reasoning behind this choice?
My application requires training on graphs with hundreds to a few thousand nodes, and I was hoping that MACE's lack of explicit triplet angle computation (as in DimeNet or GemNet) would give more favorable memory scaling with system size. Any insights would be greatly appreciated.
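For context, this is roughly how I've been probing peak GPU memory as a function of graph size. It's only a minimal sketch with a placeholder dense model standing in for the real network (the model, feature size, and node counts here are made up, not the MACE training code), but the same measurement around a forward/backward pass is what I'd apply to actual training batches:

```python
import torch

def peak_memory_gb(model: torch.nn.Module, inputs: torch.Tensor) -> float:
    """Peak CUDA memory (GB) for one forward/backward pass."""
    torch.cuda.reset_peak_memory_stats()
    loss = model(inputs).sum()
    loss.backward()  # backward pass usually dominates activation memory
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 1024 ** 3

if __name__ == "__main__":
    device = "cuda"
    # Placeholder model; in practice this would be the actual MACE model.
    model = torch.nn.Sequential(
        torch.nn.Linear(128, 512), torch.nn.SiLU(), torch.nn.Linear(512, 1)
    ).to(device)
    # Stand-in for graphs of increasing size: more rows ~ more nodes.
    for n_nodes in (100, 1_000, 5_000):
        x = torch.randn(n_nodes, 128, device=device)
        print(n_nodes, f"{peak_memory_gb(model, x):.3f} GB")
```

My hope is that for a message-passing model without explicit triplets, peak memory grows roughly with the number of edges rather than with the number of angle triplets.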
Thanks,
Rees