SaProt Error about mutation

When I run the following command, I get an error:

    python mutation_zeroshot.py -c SaProt/config/ClinVar/saprot.yaml

The output is listed below:

    data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: Metric SpearmanCorrcoef will save all targets and predictions in the buffer. For large datasets, this may lead to large memory footprint.
      warnings.warn(*args, **kwargs) # noqa: B028
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    0%| | 0/2525 [00:00<?, ?it/s]NP_891847.1.csv
    0%| | 0/2525 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 71, in <module>
        main()
      File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 66, in main
        run(config)
      File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 41, in run
        result = trainer.test(model=model, datamodule=data_module)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 755, in test
        return call._call_and_handle_interrupt(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
        return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 102, in launch
        return function(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 795, in _test_impl
        results = self._run(model, ckpt_path=ckpt_path)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 938, in _run
        _verify_loop_configurations(self)
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 42, in _verify_loop_configurations
        __verify_eval_loop_configuration(model, "test")
      File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 115, in __verify_eval_loop_configuration
        raise NotImplementedError(
    NotImplementedError: Support for test_epoch_end has been removed in v2.0.0. SaprotFoldseekMutationModel implements this method. You can use the on_test_epoch_end hook instead. To access outputs, save them in-memory as instance attributes. You can find migration examples in https://github.com/Lightning-AI/lightning/pull/16520.

Can you help me? Many thanks.

Sep 02 '25 13:09 rescuz

Hi!

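This error is probably caused by an incompatible pytorch-lightning version. As the traceback says, support for test_epoch_end was removed in v2.0.0, and SaprotFoldseekMutationModel still implements it. You can solve the problem by installing an older version: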

pip install pytorch-lightning==1.8.3
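To confirm which version is active in the environment before and after the reinstall, a small check like the one below may help. It is only a sketch: the helper names `supports_test_epoch_end` and `installed_lightning_version` are made up for illustration, and the rule "major version below 2" is taken from the error message above, which states that test_epoch_end was removed in v2.0.0.

```python
from importlib import metadata
from typing import Optional


def supports_test_epoch_end(version: str) -> bool:
    """Return True if this pytorch-lightning version predates v2.0.0.

    Assumption: only the major version matters, since the error message
    says support for test_epoch_end was removed in v2.0.0.
    """
    major = int(version.split(".")[0])
    return major < 2


def installed_lightning_version() -> Optional[str]:
    """Return the installed pytorch-lightning version, or None if absent."""
    try:
        return metadata.version("pytorch-lightning")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_lightning_version()
    if version is None:
        print("pytorch-lightning is not installed in this environment")
    elif supports_test_epoch_end(version):
        print(f"pytorch-lightning {version}: test_epoch_end is still accepted")
    else:
        print(f"pytorch-lightning {version}: test_epoch_end was removed, use 1.8.3")
```

Run it with the same interpreter that runs mutation_zeroshot.py, so that it inspects the same conda environment.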

Sep 03 '25 10:09 LTEnjoy

Thank you. Could you share the ClinVar data format expected by "python mutation_zeroshot.py -c SaProt/config/ClinVar/saprot.yaml"? I downloaded the ClinVar data from the ProteinGym website, but I got this error:

    return int(self._get("length"))
    [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^
    [rank0]: TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

The ClinVar data I used is in this format:

    ,protein,protein_sequence,mutant,mutated_sequence,DMS_bin_score
    85343,NP_998885.1,MPRGSRSAASRPASRPAAPSAHPPAHPPPSAAAPAPAPSGQPGLMAQMATTAAGVAVGSAVGHVMGSALTGAFSGGSSEPSQPAVQQAPTPAAPQPLQMGPCAYEIRQFLDCSTTQSDLSLCEGFSEALKQCKYYHGLSSLP,Y135H,MPRGSRSAASRPASRPAAPSAHPPAHPPPSAAAPAPAPSGQPGLMAQMATTAAGVAVGSAVGHVMGSALTGAFSGGSSEPSQPAVQQAPTPAAPQPLQMGPCAYEIRQFLDCSTTQSDLSLCEGFSEALKQCKYHHGLSSLP,Benign
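For reference, the CSV layout above can be sanity-checked independently of SaProt. The sketch below assumes that the mutant column holds a single substitution written as wild-type residue, 1-based position, mutant residue (as in Y135H), and it verifies that applying it to protein_sequence reproduces mutated_sequence. The helper names and the toy row are illustrative only and are not part of SaProt or ProteinGym.

```python
import csv
import io
import re

# Single substitution such as "Y135H": wild type, 1-based position, mutant.
MUTANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(sequence: str, mutant: str) -> str:
    """Apply one substitution to a sequence, checking the wild-type residue."""
    match = MUTANT_RE.match(mutant)
    if match is None:
        raise ValueError(f"unsupported mutant string: {mutant!r}")
    wild_type, position, new_residue = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # assumption: positions in the CSV are 1-based
    if not 0 <= index < len(sequence) or sequence[index] != wild_type:
        raise ValueError(f"{mutant} does not match the sequence at position {position}")
    return sequence[:index] + new_residue + sequence[index + 1:]


def check_rows(csv_text: str) -> list:
    """Return one boolean per row: does the mutation yield mutated_sequence?"""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        apply_mutation(row["protein_sequence"], row["mutant"]) == row["mutated_sequence"]
        for row in reader
    ]


# Toy row in the same column layout (not real ClinVar data).
TOY_CSV = (
    ",protein,protein_sequence,mutant,mutated_sequence,DMS_bin_score\n"
    "0,TOY_1,MKYYH,Y4H,MKYHH,Benign\n"
)

if __name__ == "__main__":
    print(check_rows(TOY_CSV))
```

A check like this only validates the CSV itself. It does not convert the file into whatever input mutation_zeroshot.py expects.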

Sep 11 '25 08:09 rescuz

Hi,

You could download all datasets used in our paper from here :)

Sep 11 '25 09:09 LTEnjoy

OK, thanks. I noticed the data provided is in LMDB format. Could you please share the details of the original source files and their formats that were used to generate it?

Sep 11 '25 10:09 rescuz

The original files are .pdb files. We extracted the relevant information and the corresponding labels from them to generate a unified LMDB for data loading.

Sep 11 '25 11:09 LTEnjoy
