kaggle-HomeDepot icon indicating copy to clipboard operation
kaggle-HomeDepot copied to clipboard

split_param in class HomedepotSplitter

Open great-thoughts opened this issue 8 years ago • 0 comments

The class HomedepotSplitter has a split_param=[0.5, 0.25, 0.5]. What do these number represents? Is it: split_param[0]: split unique search terms appear in dTrain into 50%train (df0) and 50%validation (df1) split_param[1]: split the terms appearing in df0 into 25% in train and 75% as common between train and validation split_param[2]: 50% of the 75% will be appended to the validation dataset?

How do those values relate to the ven diagram?

great-thoughts avatar Oct 15 '16 04:10 great-thoughts