Issues trying to reproduce atom typing recovery experiment
I'm trying to reproduce the atom typing recovery experiment from the docs and ran into some issues. I'm including the steps I've tried below, but first I had a couple of general questions:
- Is the code from the atom typing recovery experiment docs compatible with version 0.3.2 of the espaloma package, or should I be using a different release?
- I see that there are a few repos out there for reproducing parts of the espaloma-related preprints. Is there code available for reproducing the analysis from Fig 2 in this preprint?
Steps I've tried so far
First, in order to set up the environment I used
mamba create -n espaloma-032 -c conda-forge espaloma=0.3.2
as suggested in https://github.com/choderalab/espaloma/issues/195#issuecomment-1776752844
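After installing, I did a quick check to confirm the solver picked up the expected versions (a minimal sketch; the __version__ attributes are the usual packaging convention, worth verifying in your own env):

# Sanity-check the resolved package versions inside the espaloma-032 env
import espaloma as esp
import torch
import dgl

print(esp.__version__)    # expecting 0.3.2
print(torch.__version__)  # torch/dgl versions come from the conda solver
print(dgl.__version__)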
The URL for the ZINC dataset was not working, so I replaced that chunk of code with the suggestion in https://github.com/choderalab/espaloma/issues/120
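Roughly, my replacement looks like the sketch below. The local file name zinc.sdf is a placeholder of mine, and the esp.Graph construction follows the espaloma README rather than the exact code from issue #120:

# Build espaloma graphs from a locally downloaded ZINC SDF file
# instead of fetching from the broken URL
import espaloma as esp
from openff.toolkit.topology import Molecule

# from_file returns a list when the SDF contains multiple molecules
mols = Molecule.from_file("zinc.sdf", allow_undefined_stereo=True)
graphs = [esp.Graph(mol) for mol in mols]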
Following along with the code after that, I ran into the following warnings and error:
/opt/mambaforge/envs/espaloma-032/lib/python3.11/site-packages/h5py/__init__.py:36: UserWarning: h5py is running against HDF5 1.14.3 when it was built against 1.14.2, this may cause problems
_warn(("h5py is running against HDF5 {0} when it was built against {1}, "
/opt/mambaforge/envs/espaloma-032/lib/python3.11/site-packages/dgl/heterograph.py:92: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
dgl_warning(
[02:28:17] Explicit valence for atom # 9 N, 5, is greater than permitted
[02:28:17] ERROR: Could not sanitize molecule ending on line 142174
[02:28:17] ERROR: Explicit valence for atom # 9 N, 5, is greater than permitted
<few more warnings like above not included here>
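As far as I can tell, the sanitization errors come from a handful of bad ZINC entries (e.g. pentavalent nitrogens) rather than from espaloma itself. If anyone else hits this, RDKit's SDMolSupplier yields None for molecules it cannot sanitize, so they can be filtered out (a sketch assuming the local SDF from above):

# Drop ZINC entries that RDKit cannot sanitize
from rdkit import Chem

mols = [m for m in Chem.SDMolSupplier("zinc.sdf") if m is not None]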
AttributeError Traceback (most recent call last)
Cell In[8], line 8
6 for g in ds_tr:
7 optimizer.zero_grad()
----> 8 net(g.heterograph)
9 loss = loss_fn(g.heterograph)
10 loss.backward()
AttributeError: 'DGLGraph' object has no attribute 'heterograph'
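Poking at the dataset suggests why: with this setup, iterating over ds_tr already yields DGL graph objects, so there is no .heterograph attribute to unwrap (my debugging observation, not confirmed against the espaloma internals):

# Check what ds_tr actually yields
g = next(iter(ds_tr))
print(type(g))  # a DGL graph, hence g.heterograph raising AttributeError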
At this point I referred to the docs for some of the other experiments and modified the following chunks of code to pass the graphs to the network directly:
if torch.cuda.is_available():
    net = net.cuda()
----------------------------
from tqdm import tqdm

for idx_epoch in range(3000):
    train_iterator = tqdm(ds_tr, desc=f'Epoch {idx_epoch+1}/3000', unit='batch')
    for g in train_iterator:
        optimizer.zero_grad()
        if torch.cuda.is_available():
            g = g.to("cuda:0")
        g = net(g)
        loss = loss_fn(g)
        loss.requires_grad = True
        loss.backward()
        optimizer.step()
        train_iterator.set_postfix(loss=loss.item())
    loss_tr.append(loss.item())
With this I was able to get the model to train, but the training loss looks off, so I'm probably doing something wrong. Anyone have any ideas/suggestions?
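For reference, this is how I'm inspecting the loss curve (a minimal matplotlib sketch over the loss_tr list recorded above):

# Plot the per-epoch training loss recorded in loss_tr
import matplotlib.pyplot as plt

plt.plot(loss_tr)
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.show()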
Figured out the issue: I should've realized that I wouldn't need to specify loss.requires_grad = True.
In case others run into the same issue, the problem was resolved by using loss_fn = esp.metrics.TypingCrossEntropy()
instead of loss_fn = esp.metrics.TypingAccuracy()
as the docs suggested. I got much better training/validation loss curves after that. There were some other deviations from the docs, but that was the main one throwing me off.
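My reading of why this matters (not from the docs, so take it with a grain of salt): the cross-entropy metric is differentiable while the accuracy metric is not, which is why backward() only worked after the requires_grad hack, and why that hack effectively trained nothing:

import espaloma as esp

loss_fn = esp.metrics.TypingCrossEntropy()  # differentiable, suitable as a training loss
acc_fn = esp.metrics.TypingAccuracy()       # not differentiable, use for evaluation only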
Glad you figured this out! Is there anything we can do to make this more clear in our documentation?
Thanks for following up! It might be helpful to update the atom typing recovery docs with some of the changes I mentioned.
Specifically, changing loss_fn = esp.metrics.TypingAccuracy()
to loss_fn = esp.metrics.TypingCrossEntropy()
and modifying the last code block on the page to something like:
# define optimizer
optimizer = torch.optim.Adam(net.parameters(), 1e-5)

# Uncomment below to use the GPU for training
# if torch.cuda.is_available():
#     net = net.cuda()

# train the model
for _ in range(3000):
    for g in ds_tr:
        optimizer.zero_grad()
        # Uncomment below to use the GPU for training
        # if torch.cuda.is_available():
        #     g = g.to("cuda:0")
        g = net(g)
        loss = loss_fn(g)
        loss.backward()
        optimizer.step()
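It might also be worth adding a held-out evaluation pass. Something like the sketch below, where ds_vl is a hypothetical validation split built the same way as ds_tr and the metric call mirrors how loss_fn is applied above:

# Evaluate typing accuracy on a validation set after training
import torch
import espaloma as esp

acc_fn = esp.metrics.TypingAccuracy()
with torch.no_grad():
    accs = [acc_fn(net(g)).item() for g in ds_vl]
print(sum(accs) / len(accs))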
Happy to submit a Pull Request if it's appropriate!