
Risk Score Card Development

Open cfkstat opened this issue 2 years ago • 3 comments

Maximize the AUC of the model on both the training set and the validation set, while ensuring that the difference between the two AUCs is less than 0.02, or that the difference between the KS statistics is less than 3%. Note that the training set and validation set are split by time, for example by loan month.

cfkstat, Jun 17 '23 16:06

Thanks for this comment, but I do not fully understand it. Do you mean splitting samples across time, like sklearn.model_selection.TimeSeriesSplit?

Mamba413, Jun 19 '23 08:06

It's similar, but not exactly the same. For example, to develop a loan application score, I use loans from 202204 to 202208 as the training set and loans from 202209 to 202210 as the validation set. The AUC on both the training set and the validation set should be optimized, and the model must not overfit: the gap between the training and validation AUC should be less than or equal to 2%, and the gap between the training and validation KS should be less than or equal to 3%.

cfkstat, Jun 19 '23 15:06
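
To make the request concrete, here is a minimal sketch of the check described above: a single out-of-time split by loan month, followed by a comparison of AUC and KS between the two sets. It is plain Python and not part of abess. The helper names (`auc`, `ks`, `split_by_month`, `gap_report`), the `(month, label, score)` row layout, and the toy rows are all assumptions made for illustration, and the 3% KS tolerance is read as 0.03 on a 0 to 1 scale. Because the thread states the two conditions once with "or" and once with "and", the sketch reports them separately instead of combining them.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks(labels, scores):
    """KS statistic: largest gap between the score CDFs of the two classes."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return max(
        abs(sum(p <= t for p in pos) / len(pos) - sum(n <= t for n in neg) / len(neg))
        for t in set(scores)
    )


def split_by_month(rows, train_months, valid_months):
    """Out-of-time split; rows are (month, label, score), months are YYYYMM integers."""
    train = [r for r in rows if train_months[0] <= r[0] <= train_months[1]]
    valid = [r for r in rows if valid_months[0] <= r[0] <= valid_months[1]]
    return train, valid


def gap_report(train, valid, max_auc_gap=0.02, max_ks_gap=0.03):
    """Compute AUC/KS on both sets and flag whether each gap is within tolerance."""
    def metrics(rows):
        labels = [r[1] for r in rows]
        scores = [r[2] for r in rows]
        return auc(labels, scores), ks(labels, scores)

    auc_train, ks_train = metrics(train)
    auc_valid, ks_valid = metrics(valid)
    return {
        "auc_train": auc_train,
        "auc_valid": auc_valid,
        "auc_gap_ok": abs(auc_train - auc_valid) <= max_auc_gap,
        "ks_train": ks_train,
        "ks_valid": ks_valid,
        "ks_gap_ok": abs(ks_train - ks_valid) <= max_ks_gap,
    }


# Made-up rows for illustration only: (loan month, default label, model score).
rows = [
    (202204, 0, 0.10), (202205, 0, 0.40), (202206, 1, 0.35), (202208, 1, 0.80),
    (202209, 0, 0.10), (202209, 0, 0.40), (202210, 1, 0.35), (202210, 1, 0.80),
]
train, valid = split_by_month(rows, (202204, 202208), (202209, 202210))
report = gap_report(train, valid)
```

Unlike `TimeSeriesSplit`, which produces several expanding train/test folds, this uses one fixed split whose boundaries are chosen by the modeller.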

Based on my understanding, the difference from sklearn.model_selection.TimeSeriesSplit is that you want to keep the gap between the validation AUC and the training AUC within a certain range (e.g., 2%). Is that correct?

Mamba413, Jun 20 '23 16:06
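
One way to read "controlling the gap" is as a constraint during model selection: among candidate models, discard those whose training and validation AUC differ by more than the tolerance, then keep the one with the highest validation AUC. The sketch below illustrates that reading only. The function `select_model`, the dictionary keys, and every number in `candidates` are hypothetical, and nothing here reflects an existing abess feature.

```python
def select_model(candidates, max_auc_gap=0.02):
    """Pick the candidate with the best validation AUC among those within the gap limit.

    Each candidate is a dict with 'auc_train' and 'auc_valid' keys.
    Returns None when no candidate satisfies the constraint.
    """
    eligible = [
        c for c in candidates
        if abs(c["auc_train"] - c["auc_valid"]) <= max_auc_gap
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["auc_valid"])


# Illustrative values only, e.g. one entry per candidate model size.
candidates = [
    {"size": 5, "auc_train": 0.70, "auc_valid": 0.69},
    {"size": 10, "auc_train": 0.74, "auc_valid": 0.73},
    {"size": 20, "auc_train": 0.80, "auc_valid": 0.74},  # gap too large, excluded
]
best = select_model(candidates)
```

The same filter could be applied to the KS gap by adding a second condition to the list comprehension.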