Jin Su comments

Results: 198 comments of Jin Su

Error about mutation_zeroshot.py

The original files are ``.pdb`` files. We just extracted such information and corresponding labels to generate a unified lmdb for data loading.

How to extract whole-protein structural representations with multi-chain input?

Hi, I think the approach you mentioned, concatenating the sequences of different chains into one and feeding it into SaProt, is feasible. Besides, you may also obtain the embeddings...
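The concatenation approach mentioned above can be sketched roughly as follows. This is only an illustration, not SaProt's own code: the function name `concat_chains`, the chain IDs, and the toy sequences are all hypothetical, and how the merged string is then tokenized and passed to SaProt is not shown. Recording each chain's span is an extra assumption that makes it possible to map per-residue embeddings back to their chain afterwards.

```python
from typing import Dict, List, Tuple


def concat_chains(chains: Dict[str, str]) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Join per-chain sequences into one string and record each chain's span.

    Returns the merged sequence and a list of (chain_id, start, end) tuples,
    where [start, end) indexes into the merged sequence.
    """
    pieces: List[str] = []
    spans: List[Tuple[str, int, int]] = []
    offset = 0
    for chain_id, seq in chains.items():  # dicts keep insertion order
        pieces.append(seq)
        spans.append((chain_id, offset, offset + len(seq)))
        offset += len(seq)
    return "".join(pieces), spans


# Hypothetical two-chain input (toy sequences, not a real protein).
chains = {"A": "MKTAYIAK", "B": "GSHMLE"}
merged, spans = concat_chains(chains)
print(merged)
print(spans)
```

The spans let you slice a per-residue embedding matrix by chain, for example to pool each chain separately before combining them.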

How to extract whole-protein structural representations with multi-chain input?

We split a multi-chain protein into single-chain proteins and then adopted the same training strategy :)
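As a rough illustration of the splitting step, the sketch below groups the atom records of a PDB file by chain. It is not the authors' preprocessing script: `split_pdb_by_chain` and `atom_line` are hypothetical helpers, and the sample records are made up. The only thing it relies on is the fixed-column PDB layout, where the chain identifier sits in column 22 of each `ATOM`/`HETATM` record.

```python
from typing import Dict, List


def split_pdb_by_chain(pdb_text: str) -> Dict[str, List[str]]:
    """Group ATOM/HETATM records by the chain ID stored in column 22."""
    chains: Dict[str, List[str]] = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain_id = line[21]  # column 22, zero-based index 21
            chains.setdefault(chain_id, []).append(line)
    return chains


def atom_line(serial: int, name: str, res: str, chain: str, resseq: int) -> str:
    """Build the leading columns of a toy ATOM record (coordinates omitted)."""
    return f"ATOM  {serial:>5}  {name:<3} {res:>3} {chain}{resseq:>4}"


# Made-up records: two atoms on chain A, one atom on chain B.
sample = "\n".join([
    atom_line(1, "N", "MET", "A", 1),
    atom_line(2, "CA", "MET", "A", 1),
    atom_line(3, "N", "GLY", "B", 1),
])

by_chain = split_pdb_by_chain(sample)
for chain_id, records in by_chain.items():
    print(chain_id, len(records))
```

Each value in `by_chain` can then be written out as its own single-chain file and processed like any other single-chain protein.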

More evaluation metrics

Hi, it's good to include more metrics to comprehensively evaluate the performance of different models! Since the results were recorded around 2 years ago, we didn't save all model checkpoints...

How to use SaProt as a binary classification embedding in a proper way?

Hi, in my opinion, you could formulate your scenario as a classification task where label 1 indicates two proteins can bind and label 0 indicates they cannot bind. Then you...
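The label formulation above (1 = the pair binds, 0 = it does not) can be illustrated with a small sketch. The rest of the comment is truncated, so everything beyond the labels is an assumption: concatenating the two protein embeddings, scoring the pair with a logistic layer, and the helper names `pair_features`, `predict_proba`, and `bce_loss` are all hypothetical, and the toy vectors stand in for real embeddings.

```python
import math
from typing import List, Sequence


def pair_features(emb_a: Sequence[float], emb_b: Sequence[float]) -> List[float]:
    """Represent a protein pair by concatenating its two embeddings (assumed design)."""
    return list(emb_a) + list(emb_b)


def predict_proba(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score: estimated probability that the pair binds (label 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(p: float, label: int) -> float:
    """Binary cross-entropy for a single example; eps guards against log(0)."""
    eps = 1e-12
    return -(label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps))


# Toy embeddings standing in for the two proteins' representations.
emb_a = [0.2, -0.1, 0.4]
emb_b = [0.0, 0.3, -0.5]
features = pair_features(emb_a, emb_b)

# Untrained classifier: zero weights and zero bias.
weights = [0.0] * len(features)
p = predict_proba(features, weights, bias=0.0)
loss = bce_loss(p, label=1)
print(p, loss)
```

In practice the weights would be learned by minimizing this loss over many labeled pairs, with the embeddings either frozen or fine-tuned.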

How are `sequence`, `structure`, and `text` files matched in FAISS index?

Hi, thank you for your interest in our work! These three parts are not aligned directly. They are different databases for retrieval purposes. For example, if you have a function...

code for pretraining

Hi, we have already released the pre-training code. You could refer to https://github.com/westlake-repl/SaProt/issues/24.

generate_embedding error

Hi, it looks like the node you used was busy at that time. Maybe it was executing an earlier search?

This is because the system cannot find the config file of the protein encoder. Did you download the model weights correctly following https://github.com/westlake-repl/ProTrek?tab=readme-ov-file#Download-model-weights ?

Confusing Documentation + Missing JSON file

Hi, thank you for your feedback, which helps make our documentation clearer! To avoid confusion, we have removed ``load_protein_pretrained`` and ``load_text_pretrained`` from the example. We keep these arguments for debugging...