
Bug in the check_point function when merging segments

Open paleylouie opened this issue 6 years ago • 0 comments

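The check_point function has a bug when merging segments. At line 112 of feature_process.py: pdf = df[(df[var] > split[i]) & (df[var] <= split[i+1])] This does not handle the case where adjacent segments in split each contain fewer than min_sample samples, but together contain more than min_sample. It could be changed to: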

pre_left_position = float('-inf')  # left edge of the segment currently being accumulated
for i in range(-1, len(split) - 1):
    pdf = df[(df[var] > pre_left_position) & (df[var] <= split[i+1])]
    if (pdf.shape[0] < min_sample) or (len(np.unique(pdf['target'])) <= 1):
        # Too few samples or only one target class: fold this range into the next segment.
        #print(var, pre_left_position, i, "continue")
        continue
    else:
        new_split.append(split[i+1])
        pre_left_position = split[i+1]
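
To make the failing case concrete, here is a minimal self-contained sketch of the same idea. The function name merge_small_segments, the toy DataFrame, and the split values are all made up for illustration and are not taken from feature_process.py; it only assumes, as in the snippet above, that the DataFrame has a 'target' column.

```python
import numpy as np
import pandas as pd

def merge_small_segments(df, var, split, min_sample):
    """Hypothetical helper (not from the repository): keep a split point only
    when the segment ending at it has at least min_sample rows and more than
    one target class; otherwise carry the rows over into the next segment."""
    new_split = []
    left = float("-inf")  # left edge of the segment being accumulated
    for right in split:
        pdf = df[(df[var] > left) & (df[var] <= right)]
        if pdf.shape[0] < min_sample or len(np.unique(pdf["target"])) <= 1:
            continue  # too small or single-class: fold into the next segment
        new_split.append(right)
        left = right
    return new_split

# Toy data: the ranges (-inf, 2] and (2, 4] hold 2 rows each, below min_sample=3,
# but merged they hold 4 rows, so the boundary at 4 should survive.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "target": [0, 1, 0, 1, 0, 1, 0, 1],
})
result = merge_small_segments(df, "x", split=[2, 4, 8], min_sample=3)
print(result)  # [4, 8]
```

A check that only looks at split[i] and split[i+1] would reject both (-inf, 2] and (2, 4] here, since each has 2 rows, and would lose the boundary at 4. Tracking the left edge of the last kept segment lets the two small ranges accumulate into one valid segment.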

paleylouie, Jun 03 '18 10:06