handson-ml
Chapter 2: Stratify
Why use StratifiedShuffleSplit instead of train_test_split with the stratify argument?
X_train, X_test = train_test_split(housing.values, test_size=0.2, stratify=housing['income_cat'], random_state=42)
I think it's clearer and more Pythonic than creating a splitter and using a for loop for a single iteration.
Hi @Jeffresh, thanks for this great suggestion. I wanted to show an example of how to use the splitter classes, but I think I should point out that there's an alternative (and simpler) way to stratify. I'll add a note in the book and the notebooks. 👍
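For anyone comparing the two approaches, here is a minimal sketch that runs both side by side. It does not use the real housing data: the DataFrame below is a small synthetic stand-in with hypothetical values and category counts, chosen only so the example is self-contained. It also takes the single split with next() rather than a for loop, which is one possible way to avoid looping over one iteration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split

# Synthetic stand-in for the housing DataFrame (assumption: made-up values,
# 1000 rows, five income categories with uneven sizes).
rng = np.random.default_rng(0)
income_cat = np.repeat([1, 2, 3, 4, 5], [50, 300, 350, 200, 100])
housing = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, size=len(income_cat)),
    "income_cat": income_cat,
})

# Option A: the splitter class. split() yields (train, test) index arrays;
# with n_splits=1 we can take the only split with next().
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(housing, housing["income_cat"]))
train_a, test_a = housing.iloc[train_idx], housing.iloc[test_idx]

# Option B: train_test_split with the stratify argument. Passing the
# DataFrame itself (rather than .values) keeps the column names.
train_b, test_b = train_test_split(
    housing, test_size=0.2, stratify=housing["income_cat"], random_state=42
)

# Both test sets should preserve the category proportions of the full data.
print(housing["income_cat"].value_counts(normalize=True).sort_index())
print(test_a["income_cat"].value_counts(normalize=True).sort_index())
print(test_b["income_cat"].value_counts(normalize=True).sort_index())
```

The splitter class is still useful when you want several stratified splits (n_splits greater than 1), for example for cross-validation; for a single train/test split, the one-liner is shorter.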