coref
Is there a bug here?
https://github.com/mandarjoshi90/coref/blob/bdd15253d174a6a9e155b578ea8e53a46a9aff4c/independent.py#L231
Hi, I guess an offset can only be the first term, not the subtraction.
Apologies for the late response. IIRC, I don't think I changed that part of the code from e2e-coref. You could be right, though. At a quick glance, it would seem that it's computing distances between indices of the span pairs. That should still be fine for the mask in the following line but less so for the distances.
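For illustration, here is a minimal plain-Python sketch of what "distances between indices of the span pairs" would look like. It assumes the linked line computes a pairwise difference of top-span indices and that the following line keeps entries `>= 1`; the function names `pairwise_offsets` and `antecedent_mask` are hypothetical and not taken from the repository.

```python
def pairwise_offsets(k):
    """Return a k x k matrix whose entry [i][j] is i - j (index distance)."""
    return [[i - j for j in range(k)] for i in range(k)]


def antecedent_mask(offsets):
    """Keep only pairs where candidate j comes strictly before span i."""
    return [[off >= 1 for off in row] for row in offsets]


offsets = pairwise_offsets(3)
mask = antecedent_mask(offsets)

# Entries above the diagonal are negative and the diagonal is zero;
# the mask is True only below the diagonal.
for off_row, mask_row in zip(offsets, mask):
    print(off_row, mask_row)
```

Under these assumptions the mask is unaffected by the negative entries, but the offsets count positions in the top-span list, not token distances in the document.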
Hi,
Thanks for the reply.
I was worried about the negative values in `antecedent_offsets`.
Maybe we only need to consider the positive ones (they are real distances), and the negative ones will be masked out later on?
Right. The `bucket_distance` function will mask out the negative values.
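A scalar sketch of how such bucketing could absorb non-positive offsets is below. It assumes the scheme inherited from e2e-coref: small distances map to themselves, larger ones to `floor(log2(d)) + 3`, and the result is clipped to `[0, 9]`. The threshold of 4 and the default of 10 buckets are assumptions here, not values confirmed in this thread.

```python
import math


def bucket_distance(distance, num_buckets=10):
    """Map an integer distance to a bucket index (assumed scheme, see above)."""
    if distance <= 4:
        # Identity branch: also taken by zero and negative offsets.
        idx = distance
    else:
        # Log-space branch: only reached for distances of 5 or more.
        idx = int(math.floor(math.log2(distance))) + 3
    # Clipping sends every negative index to bucket 0.
    return max(0, min(idx, num_buckets - 1))


for d in (-3, 0, 1, 4, 5, 8, 64):
    print(d, bucket_distance(d))
```

In this scalar version the logarithm is never evaluated for non-positive inputs. A vectorized implementation would typically evaluate it on every entry and then select the identity branch, so whether the discarded values are harmless is worth checking against the actual code.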
Hi,
I'm still confused after taking another close look. My understanding is that `top_antecedent_offsets` will contain negative values, since `antecedent_offsets` has negative values.
As a result, there will be some unexpected behavior within the `bucket_distance` function, as it calculates the log of `top_antecedent_offsets`.
Also, I think the `top_fast_antecedent_scores` in https://github.com/mandarjoshi90/coref/blob/master/independent.py#L324 may need to be updated inside the loop, since `top_span_emb` gets updated after every iteration.
Maybe? I suspect it doesn't matter since the slow scores are doing the heavy lifting. Happy to accept a PR though if you're seeing an improvement with that change :)