ConvE

About the activation function

Open HammerWang98 opened this issue 2 years ago • 2 comments

Hello, Tim. Have you experimented with replacing the sigmoid with a softmax in the logits layer? I ran your code and got a lower MRR score than the result reported in your paper with sigmoid. When I changed it to softmax, I got a higher MRR score than the one in your paper. I want to cite your paper in our experiments; could you tell me how to address this problem so that we can use your result as our baseline? Thank you, looking forward to your reply.

HammerWang98, May 19 '22 08:05

Hi. This is a multi-label classification problem, so it should be sigmoid. Sometimes you have to predict 30 correct classes at once from the logits. If you are interested, use focal loss or its newer version (ICLR 2022) to improve the result. @TimDettmers might agree with me.

saeedizade, May 25 '22 17:05

Hi! Thanks for raising this issue. While mathematically the logistic sigmoid should be the right thing to do, I have heard before that a softmax actually performs better in practice, and some authors use it for that reason. The focal loss or the enhanced version suggested by @saeedizade might be even better. For the experiments in your paper, I would suggest using a better framework with more baselines across different models. I would recommend PyKEEN, which is actively developed and has a ConvE baseline. It should give you very robust baselines that are easy to compare against.

TimDettmers, Jun 06 '22 18:06
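
For readers following the thread, here is a minimal sketch of the three options discussed above: per-class sigmoid with binary cross-entropy, softmax with cross-entropy, and a binary focal loss. It is not the ConvE repository's code. The function names, the toy logits, and the choice to normalise the multi-hot targets before the softmax cross-entropy are all assumptions made for illustration, and it uses only the Python standard library.

```python
import math

EPS = 1e-12  # guards against log(0); illustrative choice


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_loss(logits, targets):
    """Mean binary cross-entropy: every class is scored independently."""
    losses = []
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        losses.append(-(t * math.log(p + EPS) + (1 - t) * math.log(1 - p + EPS)))
    return sum(losses) / len(losses)


def softmax_ce_loss(logits, targets):
    """Cross-entropy against multi-hot targets normalised to sum to 1
    (an assumption; other ways of handling several true labels exist)."""
    probs = softmax(logits)
    total = sum(targets)
    return -sum((t / total) * math.log(p + EPS) for p, t in zip(probs, targets))


def focal_loss(logits, targets, gamma=2.0):
    """Binary focal loss, -(1 - p_t)^gamma * log(p_t), averaged over classes.
    With gamma = 0 it reduces to the binary cross-entropy above."""
    losses = []
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        p_t = p if t == 1 else 1.0 - p
        losses.append(-((1.0 - p_t) ** gamma) * math.log(p_t + EPS))
    return sum(losses) / len(losses)


# Toy query with two correct classes out of four (hypothetical values).
logits = [4.0, 4.0, -3.0, -3.0]
targets = [1, 1, 0, 0]

sigmoid_probs = [sigmoid(z) for z in logits]
softmax_probs = softmax(logits)

print("sigmoid probabilities:", sigmoid_probs)
print("softmax probabilities:", softmax_probs)
print("BCE loss:", bce_loss(logits, targets))
print("softmax CE loss:", softmax_ce_loss(logits, targets))
print("focal loss (gamma=2):", focal_loss(logits, targets))
```

With sigmoid, both correct classes can receive a probability close to 1 at the same time. With softmax the probabilities must sum to 1, so two equally scored correct classes each stay below 0.5. For ranking metrics such as MRR only the ordering of the scores matters, which may be one reason softmax can still do well in practice.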