Informative error message when split col is missing train data

Open jppgks opened this issue 2 years ago • 0 comments

A dataset whose split column contains no 0 (train) values yields the following uninformative error during Ludwig preprocessing:


  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/api.py", line 434, in train
    preprocessed_data = self.preprocess(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/api.py", line 1276, in preprocess
    preprocessed_data = preprocess_for_training(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 1599, in preprocess_for_training
    processed = cache.put(*processed)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/cache/manager.py", line 41, in put
    training_set = self.dataset_manager.save(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/dataset/ray.py", line 139, in save
    self.backend.df_engine.to_parquet(dataset, cache_path)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/dataframe/dask.py", line 93, in to_parquet
    df.to_parquet(
AttributeError: 'NoneType' object has no attribute 'to_parquet'
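
One possible shape for the requested check, as a minimal sketch rather than Ludwig's actual implementation: validate the split column before the training set is built, and raise an error that names the problem. The helper `check_split_has_train`, the `SPLIT_NAMES` mapping, and the default column name `"split"` are hypothetical names introduced here for illustration. Only the meaning of 0 (train) comes from this issue; the mapping of 1 and 2 to validation and test is an assumption.

```python
from collections import Counter
from typing import Iterable

TRAIN_SPLIT = 0  # per this issue, 0 marks training rows
# Assumed mapping for the remaining codes; illustrative only.
SPLIT_NAMES = {0: "train", 1: "validation", 2: "test"}


def check_split_has_train(split_values: Iterable, column_name: str = "split") -> Counter:
    """Hypothetical helper: fail early if no row is assigned to the train split.

    Returns the per-value row counts when at least one train row exists.
    """
    counts = Counter(split_values)
    # Counter returns 0 for missing keys, so this also covers an empty column.
    if counts[TRAIN_SPLIT] == 0:
        found = ", ".join(
            f"{value!r} ({n} rows)"
            for value, n in sorted(counts.items(), key=lambda kv: repr(kv[0]))
        ) or "none"
        raise ValueError(
            f"Split column '{column_name}' contains no rows with value "
            f"{TRAIN_SPLIT} ({SPLIT_NAMES[TRAIN_SPLIT]}), so the training set "
            f"would be empty. Values found: {found}."
        )
    return counts
```

The function accepts any iterable of split values, so a dataframe column could be passed directly. Running a check like this before the cache step would surface a `ValueError` that describes the dataset problem instead of the `AttributeError` above.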

jppgks, Jun 21 '22 16:06