
Command for the MSA Transformer in the Variant Prediction example results in a runtime error

Open aybarsnazlica opened this issue 2 years ago • 2 comments

Bug description

Running the example command for the MSA Transformer in the Variant Prediction example at https://github.com/facebookresearch/esm/tree/main/examples/variant-prediction results in a runtime error: "Received unaligned sequences for input to MSA, all sequence lengths must be equal".

Reproduction steps

python predict.py \
    --model-location esm_msa1b_t12_100M_UR50S \
    --sequence HPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRPEERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRGIIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW \
    --dms-input ./data/BLAT_ECOLX_Ranganathan2015.csv \
    --mutation-col mutant \
    --dms-output ./data/BLAT_ECOLX_Ranganathan2015_labeled.csv \
    --offset-idx 24 \
    --scoring-strategy masked-marginals \
    --msa-path ./data/BLAT_ECOLX_1_b0.5.a3m

Logs

Traceback (most recent call last):
  File "predict.py", line 241, in <module>
    main(args)
  File "predict.py", line 167, in main
    batch_labels, batch_strs, batch_tokens = batch_converter(data)
  File "/home/ubuntu/miniconda3/envs/msa_trans/lib/python3.7/site-packages/esm/data.py", line 328, in __call__
    "Received unaligned sequences for input to MSA, all sequence "
RuntimeError: Received unaligned sequences for input to MSA, all sequence lengths must be equal.
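
Since the error complains about unequal sequence lengths, one way to narrow it down might be to inspect the MSA file directly. The following is a minimal diagnostic sketch, not part of the esm package: the helper names `read_fasta` and `aligned_lengths` are hypothetical, and it assumes the usual a3m convention that lowercase letters and `.` mark insertions, so that removing them should leave every record with the same number of columns.

```python
import string
import sys
from collections import Counter

# Assumption: in a3m files, lowercase letters and '.' denote insertions
# relative to the query; deleting them should yield equal-length rows.
_INSERTIONS = str.maketrans("", "", string.ascii_lowercase + ".")


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA/a3m-style lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def aligned_lengths(lines):
    """Count how many records have each length after removing insertions."""
    return Counter(len(seq.translate(_INSERTIONS)) for _, seq in read_fasta(lines))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_msa.py ./data/BLAT_ECOLX_1_b0.5.a3m
    with open(sys.argv[1]) as handle:
        for length, count in sorted(aligned_lengths(handle).items()):
            print(f"{count} sequence(s) of aligned length {length}")
```

If this reports more than one distinct length, the rows in the file itself are not aligned after insertion removal; if it reports a single length, the mismatch more likely arises elsewhere in how the input is assembled before it reaches the batch converter.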

aybarsnazlica, Jan 23 '23 07:01