Error while fitting

Open GDGauravDutta opened this issue 3 years ago • 3 comments

AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_33/4176855811.py in <module>
----> 1 automl.fit(X_train, y_train, task="regression",metric='rmse',time_budget=3600)

/opt/conda/lib/python3.7/site-packages/flaml/automl.py in fit(self, X_train, y_train, dataframe, label, metric, task, n_jobs, log_file_name, estimator_list, time_budget, max_iter, sample, ensemble, eval_method, log_type, model_history, split_ratio, n_splits, log_training_metric, mem_thres, pred_time_limit, train_time_limit, X_val, y_val, sample_weight_val, groups_val, groups, verbose, retrain_full, split_type, learner_selector, hpo_method, starting_points, seed, n_concurrent_trials, keep_search_state, early_stop, append_log, auto_augment, min_sample_size, use_ray, metric_constraints, **fit_kwargs)
   2089
   2090         self._validate_data(
-> 2091             X_train, y_train, dataframe, label, X_val, y_val, groups_val, groups
   2092         )
   2093         self._search_states = {}  # key: estimator name; value: SearchState

/opt/conda/lib/python3.7/site-packages/flaml/automl.py in _validate_data(self, X_train_all, y_train_all, dataframe, label, X_val, y_val, groups_val, groups)
    887             assert isinstance(y_train_all, np.ndarray) or isinstance(
    888                 y_train_all, pd.Series
--> 889             ), "y_train_all must be a numpy array or a pandas series."
    890             assert (
    891                 X_train_all.size != 0 and y_train_all.size != 0

AssertionError: y_train_all must be a numpy array or a pandas series.

Info about data

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15129 entries, 0 to 15128
Data columns (total 28 columns):

 #   Column                Non-Null Count  Dtype
 0   bedrooms              15129 non-null  uint8
 1   bathrooms             15129 non-null  float32
 2   sqft_living           15129 non-null  uint16
 3   sqft_lot              15129 non-null  uint32
 4   floors                15129 non-null  float32
 5   waterfront            15129 non-null  uint8
 6   view                  15129 non-null  uint8
 7   condition             15129 non-null  uint8
 8   grade                 15129 non-null  uint8
 9   sqft_above            15129 non-null  uint16
 10  sqft_basement         15129 non-null  uint16
 11  yr_built              15129 non-null  uint16
 12  yr_renovated          15129 non-null  uint16
 13  lat                   15129 non-null  float32
 14  long                  15129 non-null  float32
 15  dateyear              15129 non-null  uint16
 16  datequarter           15129 non-null  uint8
 17  datemonth             15129 non-null  uint8
 18  dateday               15129 non-null  uint8
 19  dateday_of_week       15129 non-null  uint8
 20  dateday_of_year       15129 non-null  uint16
 21  dateweekofyear        15129 non-null  uint8
 22  dateis_month_end      15129 non-null  uint8
 23  dateis_month_start    15129 non-null  uint8
 24  dateis_quarter_end    15129 non-null  uint8
 25  dateis_quarter_start  15129 non-null  uint8
 26  dateis_year_end       15129 non-null  uint8
 27  dateis_weekend        15129 non-null  uint8
dtypes: float32(4), uint16(7), uint32(1), uint8(16)

GDGauravDutta avatar Apr 09 '22 20:04 GDGauravDutta

Hi @GDGauravDutta, is your y_train_all a pandas series? Make sure it is in pandas series format. Let me know if it does not resolve the issue. Thank you!
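One way to check this before calling `automl.fit` is sketched below. It is only an illustration: the helper name `to_flaml_target` and the `price` column are hypothetical and not part of FLAML. The sketch assumes the target may have been selected as a single-column DataFrame (for example with `df[["price"]]`) or passed as a plain list, neither of which passes the `isinstance` check shown in the traceback.

```python
import numpy as np
import pandas as pd


def to_flaml_target(y):
    """Hypothetical helper: return y as a pandas Series or a NumPy array."""
    # Already one of the two types accepted by the assertion in the traceback.
    if isinstance(y, (pd.Series, np.ndarray)):
        return y
    # A single-column DataFrame is a common cause: take that column as a Series.
    if isinstance(y, pd.DataFrame):
        if y.shape[1] != 1:
            raise ValueError("expected exactly one target column")
        return y.iloc[:, 0]
    # Lists, tuples and similar sequences become a NumPy array.
    return np.asarray(y)


# Toy target with an assumed column name; double brackets give a DataFrame.
y_frame = pd.DataFrame({"price": [221900.0, 538000.0, 180000.0]})
print(type(y_frame).__name__)

y_train = to_flaml_target(y_frame)
print(type(y_train).__name__)
```

Printing `type(y_train)` right before the `fit` call is usually enough to confirm whether the target really is a Series at that point.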

qingyun-wu avatar Apr 11 '22 00:04 qingyun-wu

> Hi @GDGauravDutta, is your y_train_all a pandas series? Make sure it is in pandas series format. Let me know if it does not resolve the issue. Thank you!

All are series

GDGauravDutta avatar Apr 12 '22 15:04 GDGauravDutta

Hi @GDGauravDutta, can you share a copy or a sub-sample of your data so that I can reproduce the error you had? Thank you!
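A small sub-sample could be produced along these lines. This is a rough sketch, not a required procedure: the function name `make_subsample`, the `target` column name and the toy data are all assumptions made for illustration, and it presumes the target has one value per row of the features.

```python
import numpy as np
import pandas as pd


def make_subsample(X, y, n=200, seed=0):
    """Hypothetical helper: draw n rows of X and attach the matching targets."""
    # Sample row labels reproducibly so the same rows can be regenerated later.
    rows = X.sample(n=min(n, len(X)), random_state=seed).index
    sample = X.loc[rows].copy()
    # Align the target to X's index, then keep only the sampled rows.
    sample["target"] = pd.Series(y, index=X.index).loc[rows]
    return sample


# Toy stand-in data; the real X_train and y_train would be used instead.
X_demo = pd.DataFrame({"bedrooms": np.arange(10, dtype="uint8"),
                       "lat": np.linspace(47.0, 48.0, 10, dtype="float32")})
y_demo = pd.Series(np.arange(10) * 1000.0)

subsample = make_subsample(X_demo, y_demo, n=4)
print(subsample.shape)
```

The result can then be written out with `subsample.to_csv("subsample.csv", index=False)` and attached to the issue, together with the output of `type(y_train)`.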

qingyun-wu avatar Apr 14 '22 04:04 qingyun-wu