GDPO
Clarification on the DS Score Normalization
Hi.
I hope this message finds you well.
Section 5.2 of the paper explains (following prior research, and as shown in the figure) that the DS score is normalized by dividing it by 20. However, in scorer/evaluate.py, specifically in the gen_score_list function (lines 65 to 74), the DS score appears to be clipped to [0, 20] and then divided by 10. Could you clarify which DS score normalization is intended?
Best, Sujin
def gen_score_list(protein, smiles, train_fps=None, weight_list=None):
    ...
    df = df[~df[protein].isin([-1])]
    dsscore = np.clip(df[protein], 0, 20) / 10
    novelscore = 1 - df["sim"]
    df['qed'] = get_scores('qed', df['mol'])
    qedscore = np.array(df["qed"])
    df['sa'] = get_scores('sa', df['mol'])
    sascore = np.array(df["sa"])
    if weight_list is None:
        score_list = 0.1*qedscore + 0.1*sascore + 0.4*novelscore + 0.4*dsscore
    else:
        score_list = weight_list[0]*qedscore + weight_list[1]*sascore + weight_list[2]*novelscore + weight_list[3]*dsscore
    valid_score_list = score_list.tolist()
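
To make the difference concrete, here is a minimal sketch comparing the two normalizations. The values in `raw_ds` are made up for illustration, and the names `ds_div10` and `ds_div20` are mine, not from the repository; only the clipping to [0, 20] and the divisors 10 and 20 come from the code and the paper.

```python
import numpy as np

# Hypothetical DS values for illustration only (not real docking results).
raw_ds = np.array([0.0, 5.0, 10.0, 20.0, 25.0])

# Same clipping as in gen_score_list.
clipped = np.clip(raw_ds, 0, 20)

# Division by 10, as in scorer/evaluate.py: values fall in [0, 2].
ds_div10 = clipped / 10

# Division by 20, as described in Section 5.2 of the paper: values fall in [0, 1].
ds_div20 = clipped / 20

print("divide by 10:", ds_div10)
print("divide by 20:", ds_div20)
```

If I read this correctly, dividing by 10 lets the DS term range over [0, 2] rather than [0, 1], so with the default weight of 0.4 it can contribute up to 0.8 to the total score instead of 0.4.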