cmc-csci145-math166

Regularization and Out of Sample Error

Open mvalsania opened this issue 1 year ago • 2 comments

Hey there!

I was just curious about the question mark under Eout. Are we saying it is undetermined because it will depend on the extent of the change in Ein and |Eout - Ein|, or can we confidently say that if the model was already overfitting the data then Eout will go down and if it was already underfitting the data then Eout will go up?

In other words, can we make any claims on the extent of change of Ein and |Ein - Eout| based on whether the model was already overfitting or underfitting?

image

Thanks!

mvalsania avatar Dec 09 '24 04:12 mvalsania

> Are we saying it is undetermined because it will depend on the extent of the change in Ein and |Eout - Ein|

This is correct.

> or can we confidently say that if the model was already overfitting the data then Eout will go down and if it was already underfitting the data then Eout will go up?

This is not necessarily correct. For example, you could currently be overfitting and so choose to add regularization. But you might add too much, overshoot, and end up underfitting "worse" than you were previously overfitting, resulting in a higher out-of-sample error even though you moved $\lambda$ in the right direction.
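A minimal sketch of this overshoot, using a toy problem that is not taken from the course materials: the targets are assumed to be `MU` plus noise with variance `NOISE_VAR`, the hypothesis is a single constant `w`, and the regularized fit minimizes $\sum_i (y_i - w)^2 + \lambda w^2$. The names `MU`, `NOISE_VAR`, `ys`, `fit`, `e_in`, and `e_out` are all illustrative assumptions.

```python
# Toy setup (all values are illustrative assumptions):
# targets are y = MU + noise, and the hypothesis is a single constant w.
MU = 1.5              # assumed true mean of the targets
NOISE_VAR = 1.0       # assumed noise variance
ys = [3.0, 1.0, 2.0]  # small fixed training set; its mean (2.0) overshoots MU


def fit(ys, lam):
    """Minimize sum((y - w)**2) + lam * w**2; closed form is sum(y) / (n + lam)."""
    return sum(ys) / (len(ys) + lam)


def e_in(ys, w):
    """Mean squared error on the training set."""
    return sum((y - w) ** 2 for y in ys) / len(ys)


def e_out(w):
    """Expected squared error on a fresh sample: squared bias plus noise variance."""
    return (w - MU) ** 2 + NOISE_VAR


results = {}
for lam in [0.0, 1.0, 9.0]:
    w = fit(ys, lam)
    results[lam] = (w, e_in(ys, w), e_out(w))
    print(f"lambda={lam:4.1f}  w={w:.3f}  "
          f"E_in={results[lam][1]:.3f}  E_out={results[lam][2]:.3f}")
```

In this sketch the in-sample error rises at every step, while the out-of-sample error first drops (from 1.25 at $\lambda = 0$ to 1.00 at $\lambda = 1$) and then rises to 2.00 at $\lambda = 9$, which is worse than with no regularization at all. Every step moved $\lambda$ in the "right" direction, yet the sign of the change in the out-of-sample error depended on how far it moved.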

mikeizbicki avatar Dec 09 '24 07:12 mikeizbicki

That makes sense, thank you!

mvalsania avatar Dec 09 '24 08:12 mvalsania