cfkstat issues

Results: 12 issues of cfkstat

Model variable screening

How can I develop a scorecard that uses lasso or ridge for variable screening, so that the model generalizes better than one fitted on the full set of variables?

enhancement

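How do I write a custom evaluation function that keeps the difference between the training-set and validation-set AUC as small as possible while giving the best validation AUC?

The reward_metric function in the example seems to receive only the training set's predicted and actual values?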

How to set test data

How can I set a fixed test dataset to evaluate the model?

documentation

Risk Scorecard Development

Maximize the AUC Score of the model training set and validation set, while ensuring that the difference between the two AUCs is less than 0.02, or the difference between KS...

enhancement

glum with ray

Using glum and joblib with ray, I ran multiple models and found that threads could use 1 core, and if I set n_jobs=1, I could only use 50% of all...

the package predict result

The prediction from the PMML differs from the result given by the package. The precision of the node values in each tree also differs from that of the PMML....

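When developing a risk scorecard, we usually look for a logistic regression model that is considered optimal under the following conditions (for a given number of variables in the model, optimal subject to the constraints):

1. A training set (train) and a validation set (test) are given. They hold the final risk outcome (whether the customer became overdue) of loans from different points in time.
2. Score1 = AUC_train - if(abs(AUC_train-AUC_test) >= 0.015, abs(AUC_train-AUC_test), 0.5*abs(AUC_train-AUC_test))
3. Score2 = KS_train - if(abs(KS_train-KS_test) >= 0.03, abs(KS_train-KS_test), 0.5*abs(KS_train-KS_test))

Either Score1 or Score2 can serve as the evaluation function. The test set must not take part in model training, so cross-validation cannot be used; test may only be used to evaluate the model.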
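The two scores above can be sketched in plain Python. This is only an illustration of the stated formulas, not the API of any particular package: the function names `auc`, `ks`, and `penalized_score` are hypothetical, and the AUC and KS helpers are simple reference implementations that assume binary labels coded as 0/1.

```python
from typing import Sequence


def _sorted_pairs(y_true: Sequence[int], y_score: Sequence[float]):
    """Return (score, label) pairs sorted by ascending score."""
    return sorted(zip(y_score, y_true))


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC via the rank-sum identity, using average ranks for tied scores."""
    pairs = _sorted_pairs(y_true, y_score)
    n = len(pairs)
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    rank_sum = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(y for _, y in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def ks(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """KS statistic: largest gap between the cumulative score
    distributions of the positive and negative classes."""
    pairs = _sorted_pairs(y_true, y_score)
    n = len(pairs)
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            cum_pos += pairs[j][1]
            cum_neg += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def penalized_score(train_metric: float, test_metric: float, threshold: float) -> float:
    """Training metric minus the train/test gap; the gap counts in full
    once it reaches the threshold and at half weight below it."""
    gap = abs(train_metric - test_metric)
    penalty = gap if gap >= threshold else 0.5 * gap
    return train_metric - penalty


def score1(y_train, p_train, y_test, p_test) -> float:
    # Threshold 0.015 is taken from the Score1 definition above.
    return penalized_score(auc(y_train, p_train), auc(y_test, p_test), 0.015)


def score2(y_train, p_train, y_test, p_test) -> float:
    # Threshold 0.03 is taken from the Score2 definition above.
    return penalized_score(ks(y_train, p_train), ks(y_test, p_test), 0.03)
```

Under these assumptions, the test set enters only through `score1` or `score2`, so it is used to evaluate a fitted model and never to fit it.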

cfkstat

Binning of special values

Model variable screening

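How do I write a custom evaluation function that keeps the difference between the training-set and validation-set AUC as small as possible while giving the best validation AUC?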

Jupyter does not display visualizations

How to set test data

Risk Scorecard Development

how about question_words.txt?

glum with ray

the package predict result

Risk score optimization