Chapter 15: rail_test is not used; models in the notebook not evaluated on the test set
In the code for Chapter 15 of handson-ml3 at https://github.com/ageron/handson-ml3/blob/main/15_processing_sequences_using_rnns_and_cnns.ipynb
I see code that creates the train, validation, and test sets:
    rail_train = df["rail"]["2016-01":"2018-12"] / 1e6
    rail_valid = df["rail"]["2019-01":"2019-05"] / 1e6
    rail_test = df["rail"]["2019-06":] / 1e6
I observe that:
• rail_test is never used in the notebook.
• The models created in the notebook are not evaluated on the test set.
Q. Why are the models in the notebook not evaluated on the test set?
Q. Should the models created in the notebook be evaluated on the test set (rail_test)?
The test set goes unused in plenty of places across the book. Why does he create it, then? I believe he wants to build the habit of splitting the dataset properly. As Aurelien repeats many times in the book: you only evaluate the model on the test set once you are comfortable with it and want to send it to production. It has to be the very last step.
Many of the models we create in the book are "toy" models, and you tend to improve them many times. There's no need to evaluate them on the test set, as you are not going to do anything with them. Evaluating every model on the test set would be repetitive, as would many other common practices that go into building a model. Some of this has to be taken for granted, since that saves a lot of repetitive code when teaching something new. It's more like "Hey, remember this is a thing, even though we are not doing it right now". Feel free to evaluate the final model on the test set though, you have the data!
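If it helps, here is a minimal sketch of that workflow: pick a candidate using the validation period only, then touch the test period once for the selected one. It does not use the notebook's models or data. The series is synthetic, the candidates are simple "copy the value from `lag` days earlier" forecasts, and names such as `naive_mae`, `valid_period`, and `test_period` are made up for illustration. Only the date boundaries are taken from the split quoted in the question.

```python
import math
from datetime import date, timedelta

# Synthetic daily series with a weekly cycle and a slight upward trend.
# It only stands in for the ridership data; the numbers are made up.
start = date(2016, 1, 1)
dates = [start + timedelta(days=i) for i in range(1461)]  # through 2019-12-31
values = [0.5 + 0.2 * math.sin(2 * math.pi * d.weekday() / 7) + 0.0001 * i
          for i, d in enumerate(dates)]

# Same date boundaries as the validation and test splits in the question.
valid_period = (date(2019, 1, 1), date(2019, 5, 31))
test_period = (date(2019, 6, 1), dates[-1])


def naive_mae(lag, period):
    """MAE of the forecast 'same value as `lag` days earlier' over `period`."""
    first, last = period
    errors = [abs(values[i] - values[i - lag])
              for i, d in enumerate(dates)
              if i >= lag and first <= d <= last]
    return sum(errors) / len(errors)


# Model selection uses the validation period only.
valid_scores = {lag: naive_mae(lag, valid_period) for lag in (1, 7)}
best_lag = min(valid_scores, key=valid_scores.get)

# The test period is touched once, for the selected candidate only.
test_mae = naive_mae(best_lag, test_period)
print(f"best lag: {best_lag}, validation MAE: {valid_scores[best_lag]:.4f}, "
      f"test MAE: {test_mae:.4f}")
```

With a real model the shape would be the same: compare candidates on rail_valid, and compute the metric on rail_test only for the one you settle on.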
Thanks for your question, @tc-git-1, and thanks to @cf-chmod for the best possible answer! That's exactly right.