More evaluation metrics

Open kalifadan opened this issue 6 months ago • 1 comment

Hey, could you provide more rigorous metrics, such as AUROC and AUPRC, rather than ACC for the classification tasks (such as HumanPPI or DeepLoc)? I think it would make it easier to compare SaProt with other baselines. Thank you!
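For reference, here is a minimal sketch of how the two metrics could be computed from saved predictions for a binary task such as HumanPPI. It is not taken from the SaProt codebase: the function names `auroc` and `average_precision` are hypothetical, and it assumes one positive-class score per example plus 0/1 labels. Average precision is used here as a common estimate of AUPRC. A multi-class task such as DeepLoc would presumably need a one-vs-rest version averaged over classes.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUROC: the probability that a random positive example is
    scored above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    # O(n_pos * n_neg) pairwise comparison; fine for a sketch, slow at scale.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: the mean of precision@k over the ranks k at which
    a positive example appears. Assumes no tied scores (ties are broken by
    input order here, which a careful implementation would avoid)."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


# Toy example (made-up numbers, not SaProt results).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(f"AUROC={auroc(labels, scores):.4f}  AP={average_precision(labels, scores):.4f}")
```

Both functions only need the per-example scores and labels on the test set, so they could be added to an evaluation loop alongside ACC without changing training.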

Jun 22 '25 14:06 kalifadan

Hi,

It's good to include more metrics to evaluate the performance of different models comprehensively! However, the results were recorded around 2 years ago and we didn't save all model checkpoints for that long, so rerunning the experiments would require a lot of computational resources. Nevertheless, following existing benchmarks, we think ACC is a good metric to reflect models' performance :)

Jun 23 '25 01:06 LTEnjoy