Chapter 2: Stratify

Open Jeffresh opened this issue 4 years ago • 1 comments
Why use StratifiedShuffleSplit instead of train_test_split with the stratify parameter?

X_train, X_test = train_test_split(housing.values, test_size=0.2, stratify=housing['income_cat'], random_state=42)

I think it's clearer and more Pythonic than creating a splitter and writing a for loop that runs for only one iteration.
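For comparison, here is a minimal sketch of the two approaches side by side. It assumes a small synthetic DataFrame standing in for the book's housing data (the values, the 1000-row size, and the helper name `proportions` are made up for illustration), with income_cat built by binning median_income. It is meant to show the difference in shape between the two calls, not to reproduce the notebook's exact code.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split

# Synthetic stand-in for the housing data (hypothetical values, not the real dataset).
rng = np.random.default_rng(42)
housing = pd.DataFrame({"median_income": rng.uniform(0.5, 15.0, size=1000)})
housing["income_cat"] = pd.cut(
    housing["median_income"],
    bins=[0.0, 1.5, 3.0, 4.5, 6.0, np.inf],
    labels=[1, 2, 3, 4, 5],
)

# Approach 1: splitter class, with a for loop that runs exactly once (n_splits=1).
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
for train_idx, test_idx in splitter.split(housing, housing["income_cat"]):
    strat_train_a = housing.iloc[train_idx]
    strat_test_a = housing.iloc[test_idx]

# Approach 2: a single call to train_test_split with the stratify parameter.
strat_train_b, strat_test_b = train_test_split(
    housing, test_size=0.2, stratify=housing["income_cat"], random_state=42
)


def proportions(df):
    """Share of rows in each income category (hypothetical helper)."""
    return df["income_cat"].value_counts(normalize=True).sort_index()


# Both test sets should track the category proportions of the full dataset.
comparison = pd.DataFrame({
    "overall": proportions(housing),
    "splitter": proportions(strat_test_a),
    "train_test_split": proportions(strat_test_b),
})
print(comparison)
```

The splitter class earns its keep when several stratified splits are needed (n_splits greater than 1), for example to repeat an evaluation; for a single split, the one-line call is easier to read.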

Jeffresh avatar Aug 03 '21 09:08 Jeffresh

Hi @Jeffresh, thanks for this great suggestion. I wanted to show an example of how to use the splitter classes, but I agree I should point out that there's an alternative (and simpler) way to stratify. I'll add a note in the book and the notebooks. 👍

ageron avatar Aug 17 '21 01:08 ageron