
Error about mutation_zeroshot.py

Open rescuz opened this issue 4 months ago • 5 comments

When I run the following command, an error occurs:

python mutation_zeroshot.py -c SaProt/config/ClinVar/saprot.yaml

The output is listed below:

data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: Metric SpearmanCorrcoef will save all targets and predictions in the buffer. For large datasets, this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)  # noqa: B028
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
0%| | 0/2525 [00:00<?, ?it/s]NP_891847.1.csv
0%| | 0/2525 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 71, in <module>
    main()
  File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 66, in main
    run(config)
  File "/data3/soft/User/zengbing/method_data3/PLM/SaProt/scripts/mutation_zeroshot.py", line 41, in run
    result = trainer.test(model=model, datamodule=data_module)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 755, in test
    return call._call_and_handle_interrupt(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 102, in launch
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 795, in _test_impl
    results = self._run(model, ckpt_path=ckpt_path)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 938, in _run
    _verify_loop_configurations(self)
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 42, in _verify_loop_configurations
    __verify_eval_loop_configuration(model, "test")
  File "/data/Users/zengbing1/software/anaconda3/envs/esm/lib/python3.11/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 115, in __verify_eval_loop_configuration
    raise NotImplementedError(
NotImplementedError: Support for test_epoch_end has been removed in v2.0.0. SaprotFoldseekMutationModel implements this method. You can use the on_test_epoch_end hook instead. To access outputs, save them in-memory as instance attributes. You can find migration examples in https://github.com/Lightning-AI/lightning/pull/16520.

Can you help me? Many thanks.

rescuz avatar Sep 02 '25 13:09 rescuz

Hi!

This error is most likely caused by an incompatible pytorch-lightning version: the traceback says that test_epoch_end was removed in v2.0.0, so the code needs an older release. You can fix it by installing the version below:

pip install pytorch-lightning==1.8.3 
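
If downgrading is not an option, the error message itself describes the alternative: rename test_epoch_end to on_test_epoch_end and keep the per-batch outputs in an instance attribute. The sketch below shows only that pattern in plain Python, without importing Lightning. The class name ZeroShotModule, the test_outputs attribute, and the hand-written driver loop are illustrative assumptions, not SaProt's code; in a real model the class would subclass pytorch_lightning.LightningModule and the Trainer would call the hooks.

```python
class ZeroShotModule:
    """Hypothetical stand-in for a LightningModule (not SaProt's model)."""

    def __init__(self):
        # Lightning >= 2.0 no longer passes outputs to an epoch-end hook,
        # so the module stores them itself.
        self.test_outputs = []
        self.result = None

    def test_step(self, batch, batch_idx):
        score = sum(batch) / len(batch)  # placeholder per-batch computation
        self.test_outputs.append(score)
        return score

    def on_test_epoch_end(self):
        # Replaces `test_epoch_end(self, outputs)`: read the stored outputs.
        self.result = sum(self.test_outputs) / len(self.test_outputs)
        self.test_outputs.clear()  # free the buffer for the next run


def run_test_loop(module, batches):
    """Roughly mimics the order in which a Trainer calls the two hooks."""
    for idx, batch in enumerate(batches):
        module.test_step(batch, idx)
    module.on_test_epoch_end()
    return module.result
```

The only behavioral change compared with the old hook is where the outputs live: they are appended in test_step and cleared at the end of the epoch instead of arriving as an argument.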

LTEnjoy avatar Sep 03 '25 10:09 LTEnjoy

Thank you. Could you share the ClinVar data format expected by the command "python mutation_zeroshot.py -c SaProt/config/ClinVar/saprot.yaml"? I downloaded the ClinVar data from the ProteinGym website, but the following error occurred:

return int(self._get("length"))
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

The ClinVar data I used is formatted like this:

,protein,protein_sequence,mutant,mutated_sequence,DMS_bin_score
85343,NP_998885.1,MPRGSRSAASRPASRPAAPSAHPPAHPPPSAAAPAPAPSGQPGLMAQMATTAAGVAVGSAVGHVMGSALTGAFSGGSSEPSQPAVQQAPTPAAPQPLQMGPCAYEIRQFLDCSTTQSDLSLCEGFSEALKQCKYYHGLSSLP,Y135H,MPRGSRSAASRPASRPAAPSAHPPAHPPPSAAAPAPAPSGQPGLMAQMATTAAGVAVGSAVGHVMGSALTGAFSGGSSEPSQPAVQQAPTPAAPQPLQMGPCAYEIRQFLDCSTTQSDLSLCEGFSEALKQCKYHHGLSSLP,Benign

rescuz avatar Sep 11 '25 08:09 rescuz

Hi,

You could download all datasets used in our paper from here :)
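
The TypeError above comes from int(self._get("length")) receiving None, which is consistent with the loader looking up a "length" entry that the supplied file does not contain, as could happen when a CSV is passed where a key-value database is expected. The sketch below reproduces that failure mode with a plain dictionary. The function read_length and the dictionary-backed store are hypothetical; they only illustrate why a missing key produces this message, not how SaProt's dataset class is implemented.

```python
def read_length(store):
    """Return the stored record count, with a clearer error if it is absent.

    `store` is any mapping with a `.get` method; a dict stands in here for
    the key-value database (an assumption made for illustration).
    """
    raw = store.get("length")
    if raw is None:
        # Calling int(None) here would raise the TypeError seen in the thread.
        raise KeyError('no "length" entry: is the dataset in the expected format?')
    return int(raw)
```

Calling read_length({"length": "3"}) returns 3, while read_length({}) raises a KeyError that names the missing entry instead of the less informative TypeError.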

LTEnjoy avatar Sep 11 '25 09:09 LTEnjoy

OK, thanks. I noticed the data provided is in LMDB format. Could you please share the details of the original source files and their formats that were used to generate it?

rescuz avatar Sep 11 '25 10:09 rescuz

The original files are .pdb files. We extracted the relevant information from them, together with the corresponding labels, to build a unified LMDB for data loading.

LTEnjoy avatar Sep 11 '25 11:09 LTEnjoy