main.py issues/improvements for mass inference with an ase-db with a checkpoint
It isn't obvious how to do mass inference on an ase-db with main.py.
Something like:
python main.py --mode predict --checkpoint gnoc_oc22_oc20_all_s2ef.pt --task.dataset=ase_db --test_dataset.src=data.db
should be sufficient, I think, but this fails because --config-yml is a required argument. Given the checkpoint, I think it should not be required: at best it duplicates the model information, and at worst it could be inconsistent with what is in the checkpoint. For predictions, it is hard to see why you would change the model.
Even when I make a config.yml file, though, it appears you have to populate both the train and test datasets:
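To illustrate the behavior I am asking for, here is a minimal argparse sketch. It is not the actual main.py parser: build_parser and parse_args are hypothetical names, and only the --mode, --config-yml and --checkpoint flags are taken from the command above. The rule it encodes (a config is optional only for predict mode with a checkpoint) is my proposal, not current behavior.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real flag definitions in main.py.
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", choices=["train", "predict"], required=True)
    # Proposed change: no longer required=True.
    parser.add_argument("--config-yml", default=None)
    parser.add_argument("--checkpoint", default=None)
    return parser


def parse_args(argv):
    parser = build_parser()
    # Unknown dotted options such as --test_dataset.src=... are collected
    # as overrides instead of causing a parse error.
    args, overrides = parser.parse_known_args(argv)
    config_optional = args.mode == "predict" and args.checkpoint is not None
    if args.config_yml is None and not config_optional:
        parser.error(
            "--config-yml is required unless --mode predict is used with --checkpoint"
        )
    return args, overrides


if __name__ == "__main__":
    args, overrides = parse_args(
        [
            "--mode", "predict",
            "--checkpoint", "gnoc_oc22_oc20_all_s2ef.pt",
            "--task.dataset=ase_db",
            "--test_dataset.src=data.db",
        ]
    )
    print(args, overrides)
```

With this rule, training without a config still fails early, while prediction from a checkpoint goes through with only the dataset overrides.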
'dataset.train.a2g_args.r_energy': False,
'dataset.train.a2g_args.r_forces': False,
# Test data - prediction only so no regression
'dataset.test.src': 'data.db',
'dataset.test.a2g_args.r_energy': False,
'dataset.test.a2g_args.r_forces': False,
})
or you get an error:
File "/home/jovyan/shared-scratch/jkitchin/tutorial/ocp-tutorial/fine-tuning/ocp/ocpmodels/trainers/base_trainer.py", line 344, in load_datasets
if self.normalizer.get("normalize_labels", False):
AttributeError: 'NoneType' object has no attribute 'get'
This also doesn't make sense to me; I think you should only need to specify the source you want to make predictions from.
I guess this isn't very specific to ase-db, and also applies to other data sources like lmdb.
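For the AttributeError in the traceback, the failing expression is self.normalizer.get(...) with self.normalizer being None. A minimal sketch of a defensive check is below. I have not traced how base_trainer.py populates self.normalizer, so this assumes it is simply None when no train dataset is configured; should_normalize_labels is a hypothetical helper, not an existing ocpmodels function.

```python
def should_normalize_labels(normalizer):
    """Return True only if a normalizer config exists and asks for label normalization.

    `normalizer` is assumed to be either a dict-like config or None
    (the None case is what the traceback shows).
    """
    # Fall back to an empty dict so .get() is always called on a mapping.
    return bool((normalizer or {}).get("normalize_labels", False))


if __name__ == "__main__":
    # Prediction-only run with no train dataset: no normalizer configured.
    print(should_normalize_labels(None))
    # Config that explicitly requests label normalization.
    print(should_normalize_labels({"normalize_labels": True}))
```

A check like this would let load_datasets skip label normalization when only a test source is given, instead of raising.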
Adding to OCP 2.0 planned changes #520
This issue has been marked as stale because it has been open for 30 days with no activity.
@lbluque Can we close this? Do we have an equivalent option for the new cli now?