code-intelligence
Add one more step to select labels which can be predicted
Currently, a label is eligible for prediction only if it satisfies both the precision and recall thresholds (defaults: 0.7 and 0.5, respectively). Depending on the repository, this can result in low label coverage.
One option is to choose labels in two steps:
- Choose labels that satisfy both the precision and recall thresholds. (the current method)
- From the remaining labels, also include those that meet the precision threshold. (a new step)
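The two steps above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function name `select_labels`, the `metrics` mapping, and the example numbers are hypothetical, and "satisfy" is assumed to mean greater than or equal to the threshold.

```python
def select_labels(metrics, precision_threshold=0.7, recall_threshold=0.5):
    """Return (step_one, step_two) lists of label names.

    `metrics` is a hypothetical mapping: label -> (precision, recall),
    measured on held-out data for that repository.
    """
    step_one, step_two = [], []
    for label, (precision, recall) in metrics.items():
        if precision < precision_threshold:
            continue  # fails the precision requirement in both steps
        if recall >= recall_threshold:
            step_one.append(label)  # current method: both thresholds met
        else:
            step_two.append(label)  # new step: precision threshold only
    return step_one, step_two


# Made-up numbers, for illustration only.
example_metrics = {
    "kind/bug": (0.82, 0.61),
    "kind/feature": (0.75, 0.30),
    "area/docs": (0.55, 0.70),
}
step_one, step_two = select_labels(example_metrics)
print(step_one, step_two)  # ['kind/bug'] ['kind/feature']
```

Keeping the two lists separate would make it easy to report how much coverage the new step adds for a given repository.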
The rationale for the second step is that maintainers may care more about false positives than about missed labels. It is therefore reasonable to include every label that meets the precision threshold, even if it is rarely predicted.
There is, however, a trade-off between precision and recall: in the second step, pushing precision as high as possible is likely to drive recall toward its minimum. In my opinion, we should choose each label's probability threshold so that its precision is higher than, but close to, the precision threshold, since that threshold is meant to be the minimum acceptable value. Only labels that satisfy the precision threshold would be eligible for prediction.
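A minimal sketch of that threshold choice for a single label, assuming we have predicted probabilities and ground-truth values on a validation set. The function name `pick_threshold` and the sample data are hypothetical. Candidate thresholds are simply the distinct predicted scores, and ties in precision are broken in favour of higher recall.

```python
def pick_threshold(probs, truths, precision_threshold=0.7):
    """Return (threshold, precision, recall), or None if no threshold qualifies.

    Picks the candidate whose precision is >= precision_threshold and
    closest to it; among equal precisions, prefers the higher recall.
    """
    total_positives = sum(truths)
    best = None
    for t in sorted(set(probs)):
        predicted = [p >= t for p in probs]
        tp = sum(1 for pred, y in zip(predicted, truths) if pred and y)
        fp = sum(1 for pred, y in zip(predicted, truths) if pred and not y)
        # tp + fp >= 1 because t is one of the observed scores.
        precision = tp / (tp + fp)
        recall = tp / total_positives if total_positives else 0.0
        if precision < precision_threshold:
            continue  # below the minimum acceptable precision
        key = (precision, -recall)
        if best is None or key < best[0]:
            best = (key, (t, precision, recall))
    return None if best is None else best[1]


# Made-up validation data for one label.
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
truths = [1, 1, 0, 1, 0, 0]
result = pick_threshold(probs, truths)
print(result)  # (0.6, 0.75, 1.0)
```

A label for which `pick_threshold` returns `None` would be excluded from prediction. Note that precision is not monotonic in the threshold, which is why the sketch scans every candidate rather than stopping at the first one that qualifies.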
Issue-Label Bot is automatically applying the label kind/feature
to this issue, with a confidence of 0.85. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!
Links: app homepage, dashboard and code for this bot.