Brian DeCost
right, ok. How many dataloader workers are you using? Each worker process [apparently makes a full copy](https://discuss.pytorch.org/t/is-a-dataset-copied-as-part-of-dataloader-with-multiple-workers/112924/2) of the dataset, so if you're using multiple you can try to reduce...
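As a rough back-of-the-envelope check on the point above, here is a minimal sketch that assumes the worst case, where the main process and every worker each end up holding a full copy of the dataset. The helper names `estimate_loader_memory` and `max_workers_for_budget` are hypothetical and not part of PyTorch or alignn; actual memory use depends on how the dataset is stored and how worker processes are started.

```python
def estimate_loader_memory(dataset_bytes: int, num_workers: int) -> int:
    """Worst-case memory if the main process and each worker hold a full copy.

    Hypothetical helper for illustration; this is an upper-bound estimate,
    not a measurement.
    """
    if dataset_bytes < 0 or num_workers < 0:
        raise ValueError("dataset_bytes and num_workers must be non-negative")
    # With num_workers=0 the data is loaded in the main process only,
    # so there is a single copy.
    return dataset_bytes * (1 + num_workers)


def max_workers_for_budget(dataset_bytes: int, budget_bytes: int) -> int:
    """Largest worker count whose worst-case footprint fits in the budget."""
    if dataset_bytes <= 0:
        raise ValueError("dataset_bytes must be positive")
    copies = budget_bytes // dataset_bytes
    # One copy is reserved for the main process; never return a negative count.
    return max(copies - 1, 0)


GIB = 1024 ** 3
print(estimate_loader_memory(2 * GIB, num_workers=4) / GIB)
print(max_workers_for_budget(2 * GIB, budget_bytes=16 * GIB))
```

If the estimate exceeds available RAM, lowering `num_workers` on the `DataLoader` is the first thing to try.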
I'm having a bit of trouble reproducing your result -- mainly because I am not able to parse the CIF data you shared. I have tried the jarvis, ase, and...
Hi @kdmsit -- would you mind giving some more details about the loss function you're using and what is not working as expected? One thing that I noticed is that...
since we are using [ignite checkpointing here](https://github.com/usnistgov/alignn/blob/b70232f86e46be8dd0235c6feb287f47f34176bb/alignn/train.py#L398), resuming should be straightforward enough. I think `train_dgl` should take an optional checkpoint to resume from, and then it can load model and...
There is a config option to wrap the model in DistributedDataParallel, but I think that may be a dead code path at this point. https://github.com/usnistgov/alignn/blob/736ec739dfd697d64b1c2a01dc84678a24bcfacd/alignn/train.py#L651 I think the most straightforward...
> My model does not fit into single GPU because of big molecule structures. These use cases are pretty common if you are studying big structures. I think adding a...
Ok, cool. I wasn't sure if you had such large molecules that fitting a single instance in GPU memory was a problem. What you want is more straightforward to do...
It looks like newer versions of scipy [require python 3.9 or newer](https://github.com/scipy/scipy/blob/37b77ad355d5c408007bcd7c0ccf3b8be76df986/setup.py#L32). You might be able to downgrade scipy to a version that will work with 3.8, but I would...
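A quick way to check whether the running interpreter meets that requirement is sketched below. It assumes the 3.9 minimum mentioned in the comment above; the constant and function names are hypothetical, and the exact minimum for a given scipy release should be confirmed against that release's own metadata.

```python
import sys

# Assumption: minimum Python version taken from the comment above.
MIN_PYTHON_FOR_NEW_SCIPY = (3, 9)


def python_is_new_enough(version=None, minimum=MIN_PYTHON_FOR_NEW_SCIPY):
    """Return True if (major, minor) meets the minimum version.

    Defaults to the interpreter currently running this code.
    """
    if version is None:
        version = sys.version_info[:2]
    # Tuples compare element by element, so (3, 10) >= (3, 9) is True.
    return tuple(version[:2]) >= tuple(minimum)


if not python_is_new_enough():
    print("This Python is older than 3.9; newer scipy releases may not install.")
```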
I saw that our README specifies python 3.8, so that will need to be updated... if you're using anaconda, you should change the environment setup to `conda create --name alignn...
One more thing that I recently encountered - in some of my code I'm hitting a possible CUDA bug in the latest version of dgl, so you may wish to...