Setting validation split probability to 0 causes size errors

Open arnavgarg1 opened this issue 3 years ago • 0 comments

When using the following config:

preprocessing:
    split:
        probabilities: [0.7, 0, 0.3]

Ludwig throws the following error: ValueError: Dataset is empty following preprocessing.

This error actually surfaces when writing/reading from the cache because Ludwig instantiates a RayDataset object that has an explicit check for size being > 0.

With probabilities [0.7, 0.3, 0], the validation set reports N/A for metrics while the train and test sets have metrics. With probabilities [0.7, 0, 0.3], you hit the error above.
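
To make the failure mode concrete, here is a minimal sketch in plain Python of how a zero split probability can yield an empty split that then trips a size check. This is not Ludwig's code: `random_split` and `NonEmptyDataset` are hypothetical stand-ins, and the only thing taken from the report is the error message and the fact that the dataset wrapper checks for size > 0.

```python
import random


def random_split(rows, probabilities, seed=0):
    """Assign each row to train/validation/test by weighted sampling.

    Hypothetical helper for illustration; not Ludwig's splitter.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    names = list(splits)
    for row in rows:
        # A split with weight 0 is never chosen, so it stays empty.
        name = rng.choices(names, weights=probabilities, k=1)[0]
        splits[name].append(row)
    return splits


class NonEmptyDataset:
    """Hypothetical stand-in for a dataset wrapper that rejects empty input."""

    def __init__(self, rows):
        if len(rows) == 0:
            raise ValueError("Dataset is empty following preprocessing.")
        self.rows = rows


splits = random_split(list(range(100)), [0.7, 0, 0.3])

# Wrapping the empty validation split raises the reported error.
try:
    NonEmptyDataset(splits["validation"])
    raised = False
except ValueError:
    raised = True

# One possible guard (an assumption, not a confirmed fix): skip empty splits.
datasets = {name: NonEmptyDataset(rows) for name, rows in splits.items() if rows}
```

With a guard like the last line, an empty split is simply absent from `datasets` rather than raising, which would be consistent with reporting N/A for that split's metrics. Whether that is the right behavior for Ludwig is a design question for the maintainers.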

arnavgarg1, Aug 09 '22 18:08