
Tuning the model to handle imbalanced data

Open jeremytanjianle opened this issue 3 years ago • 1 comment

Love the paper.

I've tried it on my own closed-domain dataset and got poor recall.

Role identification: P: 49.30, R: 28.43, F: 36.06
Role: P: 44.41, R: 25.60, F: 32.48
Coref Role identification: P: 69.93, R: 40.32, F: 51.15
Coref Role: P: 48.60, R: 28.02, F: 35.55

I believe the low recall is due to imbalanced labels, but I value recall over precision. Is there some way to tune the model to increase recall at the cost of precision?

jeremytanjianle, May 12 '21 04:05

Unfortunately, I can't think of any straightforward way to increase recall, since the model is trained for generation using a token-level cross-entropy loss. Perhaps you could try lowering the probability of producing the <arg> token?

raspberryice, May 15 '21 20:05
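
For anyone wanting to try the suggestion above, here is a minimal sketch of what "lowering the probability of the <arg> token" could look like at a single decoding step. It is not code from the gen-arg repository. It assumes that an unfilled <arg> placeholder in the generated template means no argument was predicted for that slot, so making <arg> less likely should push the model to fill more slots (higher recall, probably lower precision). The toy vocabulary, the logits, and the `penalty` value are all hypothetical; the penalty would need tuning on a dev set.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def penalize_token(logits, token_id, penalty):
    """Return a copy of `logits` with `penalty` subtracted from one token's logit."""
    adjusted = list(logits)
    adjusted[token_id] -= penalty
    return adjusted


# Hypothetical toy vocabulary and logits for one decoding step (assumption).
vocab = ["<arg>", "the", "police", "officer"]
ARG_ID = vocab.index("<arg>")
step_logits = [2.0, 0.5, 1.8, 0.1]

before = softmax(step_logits)
# `penalty=1.0` is an arbitrary illustrative value, not a recommended setting.
after = softmax(penalize_token(step_logits, ARG_ID, penalty=1.0))

print(vocab[before.index(max(before))])  # prints "<arg>"
print(vocab[after.index(max(after))])    # prints "police"
```

If decoding goes through Hugging Face `generate`, the same subtraction could presumably be applied at every step from a custom `LogitsProcessor`, but whether that fits this repo's decoding code is an assumption I have not checked.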