ludwig
Setting validation split probability to 0 causes size errors
When using the following config:
preprocessing:
  split:
    probabilities: [0.7, 0, 0.3]
Ludwig throws the following error: ValueError: Dataset is empty following preprocessing.
This error actually surfaces when writing/reading from the cache because Ludwig instantiates a RayDataset object that has an explicit check for size being > 0.
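To make the failure mode concrete, here is a minimal sketch in plain Python. It is not Ludwig's code: `random_split` and `NonEmptyDataset` are hypothetical stand-ins for the random split step and for a dataset wrapper that rejects empty input, which is how I understand the RayDataset check to behave. Only the error message is taken from the report above.

```python
import random


def random_split(rows, probabilities, seed=0):
    """Hypothetical stand-in for a random split: assign each row to
    train / validation / test according to the given probabilities."""
    rng = random.Random(seed)
    splits = ([], [], [])
    for row in rows:
        # A weight of 0 means that split is never selected.
        idx = rng.choices(range(3), weights=probabilities)[0]
        splits[idx].append(row)
    return splits


class NonEmptyDataset:
    """Hypothetical wrapper that, like the check described above,
    refuses to be built from an empty split."""

    def __init__(self, rows):
        if len(rows) == 0:
            raise ValueError("Dataset is empty following preprocessing.")
        self.rows = rows


train, validation, test = random_split(list(range(100)), [0.7, 0, 0.3])

NonEmptyDataset(train)  # fine, the split has rows
try:
    NonEmptyDataset(validation)  # the zero-probability split is empty
except ValueError as err:
    print(f"validation split failed: {err}")
```

If the real check works this way, any split with probability 0 would need to be skipped (or treated as absent) before the dataset object is constructed, rather than wrapped unconditionally.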
With probabilities of [0.7, 0.3, 0], the validation set reports N/A for metrics while the train and test sets report metrics. With [0.7, 0, 0.3], you hit the error above.