Error while fitting
AssertionError Traceback (most recent call last)
/tmp/ipykernel_33/4176855811.py in
/opt/conda/lib/python3.7/site-packages/flaml/automl.py in fit(self, X_train, y_train, dataframe, label, metric, task, n_jobs, log_file_name, estimator_list, time_budget, max_iter, sample, ensemble, eval_method, log_type, model_history, split_ratio, n_splits, log_training_metric, mem_thres, pred_time_limit, train_time_limit, X_val, y_val, sample_weight_val, groups_val, groups, verbose, retrain_full, split_type, learner_selector, hpo_method, starting_points, seed, n_concurrent_trials, keep_search_state, early_stop, append_log, auto_augment, min_sample_size, use_ray, metric_constraints, **fit_kwargs)
   2089
   2090         self._validate_data(
-> 2091             X_train, y_train, dataframe, label, X_val, y_val, groups_val, groups
   2092         )
   2093         self._search_states = {}  # key: estimator name; value: SearchState
/opt/conda/lib/python3.7/site-packages/flaml/automl.py in _validate_data(self, X_train_all, y_train_all, dataframe, label, X_val, y_val, groups_val, groups)
    887                 assert isinstance(y_train_all, np.ndarray) or isinstance(
    888                     y_train_all, pd.Series
--> 889                 ), "y_train_all must be a numpy array or a pandas series."
    890             assert (
    891                 X_train_all.size != 0 and y_train_all.size != 0
AssertionError: y_train_all must be a numpy array or a pandas series.
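The check in the traceback accepts only a NumPy array or a pandas Series as the label. The sketch below mirrors that `isinstance` test to show one common way it can fail: selecting the label with double brackets yields a one-column DataFrame, not a Series. Whether that is what happened here is an assumption, since the thread does not show how `y_train` was built; the `price` column and the helper `passes_label_check` are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name "price" is an assumption for illustration.
df = pd.DataFrame(
    {"bedrooms": [3, 2, 4], "price": [221900.0, 180000.0, 604000.0]}
)


def passes_label_check(y):
    """Mirror the isinstance test shown in the traceback above."""
    return isinstance(y, np.ndarray) or isinstance(y, pd.Series)


y_frame = df[["price"]]  # double brackets: one-column DataFrame
y_series = df["price"]   # single brackets: Series

# Print the type of each candidate label and whether it would pass the check.
print(type(y_frame).__name__, passes_label_check(y_frame))
print(type(y_series).__name__, passes_label_check(y_series))
```

Printing `type(y_train)` just before calling `fit` is a quick way to confirm which case applies.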
Info about data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15129 entries, 0 to 15128
Data columns (total 28 columns):
 #   Column  Non-Null Count  Dtype
0 bedrooms 15129 non-null uint8
1 bathrooms 15129 non-null float32
2 sqft_living 15129 non-null uint16
3 sqft_lot 15129 non-null uint32
4 floors 15129 non-null float32
5 waterfront 15129 non-null uint8
6 view 15129 non-null uint8
7 condition 15129 non-null uint8
8 grade 15129 non-null uint8
9 sqft_above 15129 non-null uint16
10 sqft_basement 15129 non-null uint16
11 yr_built 15129 non-null uint16
12 yr_renovated 15129 non-null uint16
13 lat 15129 non-null float32
14 long 15129 non-null float32
15 dateyear 15129 non-null uint16
16 datequarter 15129 non-null uint8
17 datemonth 15129 non-null uint8
18 dateday 15129 non-null uint8
19 dateday_of_week 15129 non-null uint8
20 dateday_of_year 15129 non-null uint16
21 dateweekofyear 15129 non-null uint8
22 dateis_month_end 15129 non-null uint8
23 dateis_month_start 15129 non-null uint8
24 dateis_quarter_end 15129 non-null uint8
25 dateis_quarter_start 15129 non-null uint8
26 dateis_year_end 15129 non-null uint8
27 dateis_weekend 15129 non-null uint8
dtypes: float32(4), uint16(7), uint32(1), uint8(16)
Hi @GDGauravDutta, is your y_train_all a pandas series? Make sure it is in pandas series format. Let me know if it does not resolve the issue. Thank you!
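If the label turns out to be a one-column DataFrame, a minimal sketch of converting it to a Series could look like the following. The helper `to_label_series` and the `price` column are hypothetical and not part of FLAML; this is only one way to do the conversion.

```python
import pandas as pd


def to_label_series(y):
    """Return a one-column DataFrame as a Series; return anything else unchanged."""
    if isinstance(y, pd.DataFrame):
        if y.shape[1] != 1:
            raise ValueError(
                f"expected exactly one label column, got {y.shape[1]}"
            )
        # Selecting the only column by position gives a Series.
        return y.iloc[:, 0]
    return y


# Hypothetical label frame; "price" is an assumed column name.
y_train = pd.DataFrame({"price": [221900.0, 180000.0, 604000.0]})
y_train = to_label_series(y_train)
print(type(y_train))
```

The resulting Series can then be passed as `y_train` to `fit`.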
All are series
Hi @GDGauravDutta, could you share a copy or a sub-sample of your data so that I can reproduce the error? Thank you!
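One way to prepare such a sub-sample is sketched below. The stand-in frame, the `price` label column, and the sample size of 100 are all assumptions for illustration; the real training data would replace `df`.

```python
import numpy as np
import pandas as pd

# Stand-in frame with made-up values; replace with the real training data.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "bedrooms": rng.integers(1, 6, size=1000).astype("uint8"),
        "sqft_living": rng.integers(500, 5000, size=1000).astype("uint16"),
        "price": rng.uniform(1e5, 1e6, size=1000),  # hypothetical label column
    }
)

# Draw a reproducible random sub-sample of 100 rows.
sample = df.sample(n=100, random_state=0)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = sample.to_csv(index=False)

# Report the column dtypes and the label's type alongside the file.
print(sample.dtypes)
print(type(sample["price"]))
```

CSV does not store dtypes, so posting the `dtypes` output and the type of the label together with the file makes it easier to rebuild the same inputs.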