kaggle-HomeDepot
kaggle-HomeDepot copied to clipboard
split_param in class HomedepotSplitter
The class HomedepotSplitter has a split_param=[0.5, 0.25, 0.5]. What do these number represents? Is it: split_param[0]: split unique search terms appear in dTrain into 50%train (df0) and 50%validation (df1) split_param[1]: split the terms appearing in df0 into 25% in train and 75% as common between train and validation split_param[2]: 50% of the 75% will be appended to the validation dataset?
How do those values relate to the ven diagram?