woe
Bug in check_point when merging segments
The check_point function has a bug when it merges segments. The problem is at line 112 of feature_process.py:
pdf = df[(df[var] > split[i]) & (df[var] <= split[i+1])]
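This does not handle the case where adjacent intervals in split each contain fewer than min_sample samples but together contain more than min_sample. Each interval is tested on its own, so both are skipped and the combined segment is lost.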
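A minimal reproduction of that case may help. It assumes a DataFrame with a feature column `x` and a binary `target` column. The function name `merge_per_interval` and the toy data are hypothetical, and the loop only mirrors the line quoted above, not the full check_point:

```python
import numpy as np
import pandas as pd


def merge_per_interval(df, var, split, min_sample):
    # Hypothetical stand-in for the loop around line 112: every interval
    # (split[i], split[i+1]] is tested in isolation.
    new_split = []
    for i in range(len(split) - 1):
        pdf = df[(df[var] > split[i]) & (df[var] <= split[i + 1])]
        if pdf.shape[0] < min_sample or len(np.unique(pdf['target'])) <= 1:
            continue
        new_split.append(split[i + 1])
    return new_split


# Toy data: 3 samples in (0, 1] and 3 samples in (1, 2], both classes present.
df = pd.DataFrame({
    'x':      [0.2, 0.5, 0.8, 1.2, 1.5, 1.8],
    'target': [0,   1,   0,   1,   0,   1],
})

kept = merge_per_interval(df, 'x', [0, 1, 2], min_sample=5)
# Each interval holds 3 < 5 samples, so neither split point is kept,
# although (0, 2] holds 6 samples and would be a valid segment.
print(kept)  # []
```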
It could be changed to:
pre_left_position = float('-inf')  # left edge of the segment currently being accumulated
for i in range(-1, len(split) - 1):
    pdf = df[(df[var] > pre_left_position) & (df[var] <= split[i+1])]
    if (pdf.shape[0] < min_sample) or (len(np.unique(pdf['target'])) <= 1):
        # print(var, pre_left_position, i, "continue")
        continue
    else:
        new_split.append(split[i+1])
        pre_left_position = split[i+1]
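To check the proposed loop in isolation, here is a self-contained sketch. The wrapper name `merge_with_left_edge`, the column name `x`, and the toy data are assumptions for illustration only. The real check_point may do more around this loop, such as handling the interval above the last split point, which is not shown here:

```python
import numpy as np
import pandas as pd


def merge_with_left_edge(df, var, split, min_sample):
    # Hypothetical wrapper around the proposed loop. The left edge only
    # advances when a split point is accepted, so undersized neighbours
    # keep accumulating into one segment.
    new_split = []
    pre_left_position = float('-inf')
    for i in range(-1, len(split) - 1):
        pdf = df[(df[var] > pre_left_position) & (df[var] <= split[i + 1])]
        if pdf.shape[0] < min_sample or len(np.unique(pdf['target'])) <= 1:
            continue
        new_split.append(split[i + 1])
        pre_left_position = split[i + 1]
    return new_split


# Toy data: 3 samples at or below 1, 3 samples in (1, 2], 1 sample above 2.
df = pd.DataFrame({
    'x':      [0.2, 0.5, 0.8, 1.2, 1.5, 1.8, 2.5],
    'target': [0,   1,   0,   1,   0,   1,   0],
})

result = merge_with_left_edge(df, 'x', [1, 2], min_sample=5)
# Split point 1 is rejected (3 samples), then the segment up to 2 is
# accepted because it now covers 6 samples with both classes.
print(result)  # [2]
```

Starting the range at -1 lets the first iteration test everything up to split[0] against the same rule, so a separate check for the first segment is not needed in this sketch.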