
[QUESTION] Why should all estimators be fitted to the training data only?

Open vasili111 opened this issue 1 year ago • 1 comments

Page 45 says:

As with all estimators, it is important to fit the scalers to the training data only: never use fit() or fit_transform() for anything else than the training set.

Could you please explain why it is important and what happens if this recommendation is not followed?

vasili111 avatar Mar 29 '23 04:03 vasili111

Hi @vasili111, that's a common and really useful question. When I get asked this, I usually point folks to this Stack Overflow answer for an insightful explanation:

https://stackoverflow.com/questions/48692500/fit-transform-on-training-data-and-transform-on-test-data
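
In case a concrete example helps, here is a minimal sketch of the idea, assuming scikit-learn's StandardScaler and a tiny made-up dataset (the numbers and variable names are mine, not from the book). The scaler learns its mean and standard deviation from the training set, and those same statistics are then reused on the test set:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration: one feature, 4 training rows, 2 test rows.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[10.0], [20.0]])

# Recommended: learn the statistics from the training set only...
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
# ...and reuse those same statistics on the test set.
X_test_scaled = scaler.transform(X_test)

# Anti-pattern: refitting on the test set replaces the training statistics
# with test-set statistics.
leaky_scaler = StandardScaler()
X_test_leaky = leaky_scaler.fit_transform(X_test)

print("training mean:", scaler.mean_, "training std:", scaler.scale_)
print("test scaled with training stats:", X_test_scaled.ravel())
print("test scaled with its own stats: ", X_test_leaky.ravel())
```

With the refit scaler, the test values come out as -1 and 1, as if they were ordinary points, even though they sit far outside the training range. A model trained on X_train_scaled would then get inputs on a different scale than the one it learned from, and the test set's own statistics would have leaked into preprocessing, so the evaluation would no longer reflect how the model behaves on genuinely unseen data.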

UPstartDeveloper avatar Apr 20 '23 10:04 UPstartDeveloper