How to properly use SaProt embeddings for binary classification?
Hello, I am researching whether a T cell receptor can bind a specific antigen. The task may not be too complicated, but I am wondering how to use SaProt to produce useful embeddings. I followed the instructions in the GitHub README, but the performance did not reach a high level. I noticed that a downstream task on Metal Ion Binding was conducted, so I would like to know how it works, as it may give me inspiration for my study. Thank you!
Hi,
In my opinion, you could formulate your scenario as a binary classification task, where label 1 indicates that the two proteins bind and label 0 indicates that they do not. You would then need to construct a training set to fine-tune SaProt on this task. For instance, you could collect positive examples in which a T cell receptor binds a given antigen, and negative examples that pair a T cell receptor with unrelated antigens it does not bind. After fine-tuning SaProt on this training set, you could expect the model to learn to identify which antigens a T cell receptor can bind.
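As a rough illustration of the training-set construction described above, here is a minimal sketch in plain Python. It is not part of SaProt: the function name `build_pair_dataset`, its parameters, and the placeholder identifiers (`TCR_A`, `AG_1`, ...) are all hypothetical. It also assumes that a TCR paired with an antigen it is not known to bind can be treated as a negative, which may not hold when receptors are cross-reactive, so experimentally confirmed negatives are preferable where available.

```python
import random


def build_pair_dataset(positive_pairs, negatives_per_positive=1, seed=0):
    """Build (tcr, antigen, label) triples from known binding pairs.

    Label 1 marks a known binding pair. Label 0 marks a mismatched pair,
    sampled under the ASSUMPTION that a TCR does not bind antigens it is
    not listed with (hypothetical helper, not a SaProt API).
    """
    rng = random.Random(seed)
    known = set(positive_pairs)
    antigens = sorted({antigen for _, antigen in positive_pairs})

    # Every known binding pair becomes a positive example.
    dataset = [(tcr, antigen, 1) for tcr, antigen in positive_pairs]

    seen_negatives = set()
    for tcr, _ in positive_pairs:
        # Candidate negatives: antigens this TCR is not known to bind
        # and that have not already been paired with it as a negative.
        candidates = [
            antigen for antigen in antigens
            if (tcr, antigen) not in known
            and (tcr, antigen) not in seen_negatives
        ]
        rng.shuffle(candidates)
        for antigen in candidates[:negatives_per_positive]:
            seen_negatives.add((tcr, antigen))
            dataset.append((tcr, antigen, 0))

    rng.shuffle(dataset)
    return dataset


# Placeholder identifiers; real entries would be the protein sequences.
positive_pairs = [("TCR_A", "AG_1"), ("TCR_B", "AG_2"), ("TCR_C", "AG_3")]
dataset = build_pair_dataset(positive_pairs)
for tcr, antigen, label in dataset:
    print(tcr, antigen, label)
```

The resulting triples would still need to be converted into whatever input format your fine-tuning pipeline expects. It is also worth splitting train and test sets so that the same TCR or antigen does not appear in both, otherwise the measured performance may be overly optimistic.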
Thank you for your useful advice! After some attempts, I think fine-tuning might be a necessary step. I plan to follow the approach used in the metal ion binding task and give it a try.