
Ganea model parameters

lej-la opened this issue 5 years ago · 18 comments

I was trying to run your model with the "rel-norm" type and 1 relation. As you suggest in the paper, it should be equivalent to the Ganea & Hofmann (2017) model, but the result I got on the AIDA-B dataset was not the same as they reported.

Their reported number was 92.22 micro F1, while I got only 83.71 micro F1 on average (highest was 86.24).

Did you actually manage to replicate their results? Am I missing some parameter settings?

Thanks.

lej-la · Jul 03 '19 14:07

I wasn't able to replicate their results at first, but then reading their source code (https://github.com/dalab/deep-ed) was super helpful.

lephong · Jul 03 '19 15:07

Thank you. I've read their code and also tried to reimplement it (https://github.com/lej-la/deep-ed-pytorch), but I eventually gave up.

I thought your model was a generalized version of theirs, and that it would be able to produce the same results using only 1 relation (i.e. the number of relations set to 1).

So were you eventually able to replicate their results with this code?

lej-la · Jul 03 '19 16:07

You are right: if the number of relations is set to 1, we have their model.

Yes, I successfully replicated their results (I even got slightly higher scores, but not significantly so).
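For reference, here is my reading of why K = 1 gives their model (a sketch in the paper's notation, not a verbatim quote of its equations):

```latex
% Pairwise score with K latent relations (Sec. 3.2 of the paper):
%   \Phi(e_i, e_j, D) = \sum_{k=1}^{K} \alpha_{ijk} \, e_i^\top R_k \, e_j
% With K = 1 there is a single relation embedding R_1, so the score
% collapses to one bilinear form, as in Ganea & Hofmann's global model:
\[
  \Phi(e_i, e_j, D) = \alpha_{ij1} \, e_i^\top R_1 \, e_j
\]
```

How \alpha_{ij1} is computed still depends on the normalization variant (rel-norm vs ment-norm).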

lephong · Jul 03 '19 16:07

Just for clarification: my assumption is that if I run your code with the following parameters:

python -u -m nel.main --mode train --n_rels 1 --mulrel_type rel-norm --model_path model

I should be able to get a Ganea-like model with performance of ~92 micro F1 on AIDA-B. Is that correct?

I ask because I've tried that several times, and the results were only 83.71 micro F1 on average.

Thanks

lej-la · Jul 04 '19 07:07

Unfortunately, I don't have the facilities to run the code at the moment.

When you ran the command line in the README

python -u -m nel.main --mode train --n_rels 3 --mulrel_type ment-norm --model_path model

did you get the reported number?

For a Ganea-like model, could you try:

python -u -m nel.main --mode train --n_rels 1 --mulrel_type rel-norm --model_path model

lephong · Jul 04 '19 07:07

The reason for trying "rel-norm" instead of "ment-norm" is that with ment-norm the model uses mention padding, which the Ganea model doesn't have.
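Roughly, the padding works like this (a conceptual sketch, not the repo's actual code; names and values are made up):

```python
import torch
import torch.nn.functional as F

# ment-norm normalizes attention over mentions, so a dummy "padding"
# mention is appended: a mention can put attention mass on the pad
# instead of being forced to relate strongly to a real neighbour.
n_ments = 5
scores = torch.randn(n_ments, n_ments)  # pairwise mention scores (made up)
pad_col = torch.zeros(n_ments, 1)       # score column for the padding mention
alpha = F.softmax(torch.cat([scores, pad_col], dim=1), dim=1)
# alpha[i] sums to 1 over the n_ments real mentions plus the pad.
# rel-norm normalizes over relations instead and needs no such pad,
# hence it looked like the closer match to Ganea & Hofmann.
```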

lephong · Jul 04 '19 07:07

The best results I was able to get with your model (by running the first command) were 91.62 micro F1 on average.

lej-la · Jul 04 '19 07:07

Hmm, could you please send me the log files (or whatever output you get when running the command)?

lephong · Jul 04 '19 07:07

I don't have them, but I'll re-run the training and send them to you. Can I use your email address from your latest paper (https://arxiv.org/pdf/1906.01250.pdf)?

lej-la · Jul 04 '19 07:07

Yes, please send it to my Gmail address (I no longer use my UoEdin address). Thanks!

lephong · Jul 04 '19 07:07

Thank you :)

lej-la · Jul 04 '19 07:07

I have the same problem of not being able to achieve the claimed 93.07 score. Did you manage to find the cause?

martinj96 · Jul 24 '19 14:07

Yes, I found the issue, but it was in a private discussion that isn't shown here. Long story short, lej-la had commented out an important line.

Could you show your log file?

lephong · Jul 24 '19 14:07

Thanks for the fast response. I ran it once, got her result, and assumed there was some issue. Let me run it at least five times and get back to you with the log file, or with a message that I've reproduced the result.

martinj96 · Jul 24 '19 14:07

Hey, I got a bit confused by the part in section 3.2 about rel-norm. The correct parameters to replicate the Ganea global model actually use ment-norm with K=1; that way, the normalization factor becomes the same as in equation 3. With rel-norm, the normalization factor is just 1 instead of 1/(n-1). I got as close as 91.6 micro F1 on AIDA-B.
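To spell out the difference as I understand it (a sketch in the paper's notation; I may be off on the details):

```latex
% rel-norm, K = 1: the softmax runs over relations, and with a single
% relation it is trivially 1, so every pair contributes with weight 1:
\[
  \alpha_{ij1} = 1
  \quad\Rightarrow\quad
  \Phi(e_i, e_j) = e_i^\top R_1 \, e_j
\]
% ment-norm, K = 1: the softmax runs over the other mentions:
\[
  \alpha_{ij1} = \frac{\exp f(m_i, m_j)}{\sum_{j' \neq i} \exp f(m_i, m_{j'})}
\]
% The weights sum to 1 over the n - 1 neighbours, so for near-uniform
% scores each pair gets roughly 1/(n-1), matching the factor in the
% Ganea & Hofmann pairwise term (equation 3):
\[
  \Phi(e_i, e_j) = \frac{1}{n-1} \, e_i^\top R \, e_j
\]
```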

lej-la · Jul 24 '19 14:07

Okay, so they differ in the normalization factor. Thanks for pointing that out.

lephong · Jul 24 '19 14:07

That said, the ment-norm model actually seems to have the same performance with K=1 and K=3.

lej-la · Jul 24 '19 14:07

All is good: I managed to reproduce the results. Very simple steps, with no issues; good job @lephong. Maybe you could also close this thread, as it seems to be resolved. Best.

martinj96 · Jul 24 '19 15:07