metal
metal copied to clipboard
Customizable loss reduction breaks assumptions
PR #75 introduced a customizable reduction for the loss function, but there are two places later in the code where we make the assumption that its the total loss (reduction='sum') when we go to calculate the average loss per example:
- https://github.com/HazyResearch/metal/blob/31587d42205e0e1ee55135d8e65c341cc2dd0029/metal/classifier.py#L222
- https://github.com/HazyResearch/metal/blob/31587d42205e0e1ee55135d8e65c341cc2dd0029/metal/classifier.py#L228
I think we either need to set it to sum and allow post-processing if they want to combine in some other way, or add some special handling so that we're not dividing by train set size twice. I think I'd probably be in favor of always calculating some and then dividing later. What was the motivation for allowing it to be changed?
Good call, will fix!
Active development is being moved to our main repo, https://github.com/snorkel-team/snorkel. The model there does not have this issue.