torchmd-net icon indicating copy to clipboard operation
torchmd-net copied to clipboard

Consistent naming between dy and forces

Open PhilippThoelke opened this issue 1 year ago • 9 comments

This PR renames y to energy and dy to forces in places where we mean forces and not gradients. PyTorch-Geometric Data objects are now constructed as Data(energy=<energy>, ...) or Data(energy=<energy>, forces=<forces>, ...). This should not change any behavior but just concerns naming conventions. Linked to #116

PhilippThoelke avatar Aug 30 '22 00:08 PhilippThoelke

Do you also want to rename y to energy? It seems like it would be more consistent to either name both as abstract variables (y and dy, in which case the latter would be the gradient of the former), or else name both as physical quantities (energy and forces, in which case it's the negative gradient). Mixing them can be confusing.

peastman avatar Aug 30 '22 00:08 peastman

Initially I thought torch geometric was requiring setting y when constructing a Data object but now I'm not sure anymore. I agree that energy and forces makes more sense than y and forces. I'll rename y as well if torch geometric doesn't prohibit that.

PhilippThoelke avatar Aug 30 '22 01:08 PhilippThoelke

I'm a bit divided here. We use torchmd-net also for pKa prediction and in the future maybe also for charge prediction. This will make it more confusing, will it not?

stefdoerr avatar Aug 30 '22 07:08 stefdoerr

Do you mean renaming both y and dy are confusing or would you want dy to be forces and y to stay as it is?

PhilippThoelke avatar Aug 30 '22 17:08 PhilippThoelke

I think renaming them both is confusing. I think they should be named energy/forces only internally at the datasets level (and they should return y/dy) and not at the train/script level where you could replace the datasets with other type of data. The problem here being that we don't train on dy but on minus_dy which makes it ugly.

stefdoerr avatar Aug 31 '22 07:08 stefdoerr

If the models are intended to be used for predicting things other than energy, I agree that y and dy is a better choice, and dy should be the positive gradient.

peastman avatar Aug 31 '22 17:08 peastman

Changing the model to return the positive gradient now would silently break a lot of applications, I don't think that's a good idea. I'd be okay with calling it y and neg_dy or something like that to make it clear.

PhilippThoelke avatar Aug 31 '22 18:08 PhilippThoelke

Regarding the variable names, my preference is d/dy, but energy/forces is fine. Could they be made consistent in models.model.TorchMD_Net too.

raimis avatar Sep 01 '22 08:09 raimis

Another options: models.model.TorchMD_Net.forward does not only the computation of y (the forward pass), but also the computation of dy (the backward pass). We can factor out the gradient computation into a separate function/method. As a side effect, the sign of dy could be changed without breaking the others code silently.

raimis avatar Sep 01 '22 08:09 raimis

I updated (hopefully) all places to consistently use y as the label/prediction and neg_dy as the negative derivative of the label/prediction.

PhilippThoelke avatar Oct 10 '22 18:10 PhilippThoelke

neg_dy is the best compromise in my view

On Wed, Aug 31, 2022, 20:52 Philipp Thölke @.***> wrote:

Changing the model to return the positive gradient now would silently break a lot of applications, I don't think that's a good idea. I'd be okay with calling it y and neg_dy or something like that to make it clear.

— Reply to this email directly, view it on GitHub https://github.com/torchmd/torchmd-net/pull/121#issuecomment-1233296689, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3KUORMOPSLA5VL6FBCYMLV36SWPANCNFSM577VBPCQ . You are receiving this because your review was requested.Message ID: @.***>

giadefa avatar Oct 11 '22 07:10 giadefa