progen
Prediction model for the CM and MDH datasets
Hi, thank you for the beautiful work.
ProGen has been applied to generate proteins for the CM and MDH families. In the Methods section, the details are described as:
We computed the AUC in receiver operating characteristic (ROC) curves for predicting binary function labels from model scores. We computed model scores for each sequence in both CM and MDH by using the per-token model log-likelihood in Eq. 2.
Does this mean that (1) a log-likelihood is calculated for each token of each sequence, and (2) a classifier is then trained on these per-token log-likelihoods as features to predict whether the whole sequence is reactive or not (with labels taken from the experimental data)? Could you please also release the data, code, and models for this part?
Best regards
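For reference, one possible reading of the quoted passage is that no separate classifier is trained: each sequence gets a single scalar score (for example, the mean of its per-token log-likelihoods), and the ROC AUC is computed directly from those scores against the binary labels. The sketch below illustrates that reading only; it is not confirmed by the authors, the function names are hypothetical, and the numbers are made-up toy values, not ProGen outputs.

```python
from typing import Sequence


def mean_log_likelihood(token_log_probs: Sequence[float]) -> float:
    """Collapse per-token log-likelihoods into one score per sequence.

    Assumption: the sequence score is the average over tokens.
    """
    if not token_log_probs:
        raise ValueError("sequence has no tokens")
    return sum(token_log_probs) / len(token_log_probs)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy per-token log-likelihoods for four hypothetical sequences.
toy_token_log_probs = [
    [-0.2, -0.5, -0.3],
    [-1.5, -2.0, -1.0],
    [-0.4, -0.6],
    [-2.5, -1.8, -2.2, -1.9],
]
# Toy binary function labels (1 = functional, 0 = non-functional).
toy_labels = [1, 0, 1, 0]

toy_scores = [mean_log_likelihood(lp) for lp in toy_token_log_probs]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"scores={toy_scores}, AUC={toy_auc}")
```

Under this reading the model score itself acts as the predictor, so the AUC measures how well likelihood ranks functional sequences above non-functional ones. Whether the paper averages or sums over tokens is something only the authors can confirm.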
Could you please release the corresponding data for the generated CM/MDH sequences?
Best, Liguo
I don't understand what GB1 (top100avg) means. How is it calculated?