Question about the eval_mask of the airquality dataset
As described in the article, only values that are valid now but missing at the same time in the following month are used as evaluation data. I do not understand this setting. Why can't all non-missing data be used as evaluation data? Doesn't this restriction keep the model from learning a broader representation? Also, the code mentions a whiten_prob. Is this whiten operation applied only to eval_mask?
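For concreteness, here is a minimal NumPy sketch of the evaluation-mask rule as it is read in this question. It is not the repository's code: the function name `next_month_eval_mask`, the `steps_per_month` parameter, and the toy data are all hypothetical, and the rule itself (valid now, missing one month later) is an assumption about what the article means.

```python
import numpy as np

def next_month_eval_mask(valid, steps_per_month):
    """Mark entries that are valid now but missing one 'month' later.

    valid: boolean array of shape (time, sensors), True where a value is observed.
    steps_per_month: hypothetical number of time steps in one month.
    """
    if steps_per_month <= 0:
        raise ValueError("steps_per_month must be positive")
    valid = np.asarray(valid, dtype=bool)
    eval_mask = np.zeros_like(valid)
    if steps_per_month < len(valid):
        now = valid[:-steps_per_month]
        later = valid[steps_per_month:]
        # Valid at time t, missing at time t + steps_per_month.
        eval_mask[:-steps_per_month] = now & ~later
    # The final month has no following month, so it stays all False.
    return eval_mask

# Toy example: 4 time steps, 2 sensors, a "month" of 2 steps.
valid = np.array([[1, 1],
                  [1, 0],
                  [0, 1],
                  [1, 1]], dtype=bool)
eval_mask = next_month_eval_mask(valid, steps_per_month=2)
print(eval_mask.astype(int))
```

Under this reading, the evaluation set reproduces the missing pattern of the following month, so it covers only a subset of the valid values rather than all of them.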
I am also confused about how the model is trained. I had assumed that the model is trained by masking data with the whiten operation, but when I changed airquality's eval_mask to a mask in which every valid value is set, the model could not be trained. Could it be that the model is trained on eval_mask? (I know there is a training_mask, but in the code it looks like the inverse of eval_mask.)
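The observation in the parentheses can be illustrated with a small NumPy sketch. It assumes, without confirmation from the code, that the training mask is the set of valid values minus the evaluation values. The helper below is a hypothetical stand-in, not the library's implementation, and the arrays are toy data.

```python
import numpy as np

def training_mask(valid, eval_mask):
    """Assumed relationship: train on valid entries not held out for evaluation."""
    valid = np.asarray(valid, dtype=bool)
    eval_mask = np.asarray(eval_mask, dtype=bool)
    return valid & ~eval_mask

# Toy data: 2 time steps, 3 sensors, True where a value is observed.
valid = np.array([[1, 1, 0],
                  [1, 0, 1]], dtype=bool)

# Case 1: only one valid entry is held out for evaluation.
partial_eval = np.array([[0, 1, 0],
                         [0, 0, 0]], dtype=bool)
train_partial = training_mask(valid, partial_eval)

# Case 2: every valid entry is put into the evaluation mask.
full_eval = valid.copy()
train_full = training_mask(valid, full_eval)

print("training entries, partial eval mask:", int(train_partial.sum()))
print("training entries, full eval mask:   ", int(train_full.sum()))
```

If that assumption holds, an eval_mask that covers every valid value leaves no entries in the training mask, which would be consistent with the model failing to train in the experiment described above.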