BUG when judging ndim of a series in dataframe

Open huyp182 opened this issue 4 months ago • 2 comments

File: qlib/contrib/model/gbdt.py, line 43:

    x, y = df["feature"], df["label"]
    if y.values.ndim == 2 and y.values.shape[1] == 1:
        y = np.squeeze(y.values)
    else:
        raise ValueError("LightGBM doesn't support multi-label training")

y is a Series taken from a DataFrame, so y.values.ndim can only be 1 and can never be 2. The condition therefore fails and the ValueError is raised.

huyp182 • Aug 15 '25 07:08

Hi @huyp182, can you tell me how to reproduce this?

SunsetWolf • Sep 01 '25 08:09

Hi, @huyp182

Thank you for your attention to qlib. The issue you mentioned is not a bug, because:

  • The DataFrame has multi-level indices (datetime, instrument);
  • The DataFrame has multi-level columns (feature and label);
  • There is only one subcolumn LABEL0 under the label column.

Your observation rests on the fact that "Series values must be 1D," which is correct. However, because this DataFrame has multi-level columns, df["label"] returns a DataFrame, not a Series. Its values are therefore 2D with shape (n, 1), so y.values.ndim == 2 holds and the check passes.
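To illustrate the explanation above, here is a minimal sketch using plain pandas rather than qlib's actual data loader. The column names FEAT0 and FEAT1, the instrument codes, and the values are made up for the example; only the (datetime, instrument) index and the feature/label column groups with a single LABEL0 subcolumn follow the description above.

```python
import numpy as np
import pandas as pd

# Two-level row index (datetime, instrument), as described above.
# The dates and instrument codes are placeholders for illustration.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["SH600000", "SZ000001"]],
    names=["datetime", "instrument"],
)

# Two-level columns: the group ("feature" or "label"), then the field name.
# FEAT0 and FEAT1 are hypothetical names; LABEL0 is the single label subcolumn.
columns = pd.MultiIndex.from_tuples(
    [("feature", "FEAT0"), ("feature", "FEAT1"), ("label", "LABEL0")]
)
df = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3), index=index, columns=columns
)

# Selecting a first-level key on multi-level columns yields a DataFrame.
y = df["label"]
label_is_frame = isinstance(y, pd.DataFrame)
label_shape = y.values.shape                 # 2D: one column, LABEL0
squeezed_shape = np.squeeze(y.values).shape  # 1D after squeezing

# With flat (single-level) columns, the same selection yields a Series.
flat = pd.DataFrame({"feature": [1.0, 2.0], "label": [0.1, 0.2]})
flat_is_series = isinstance(flat["label"], pd.Series)
flat_ndim = flat["label"].values.ndim

print(label_is_frame, label_shape, squeezed_shape, flat_is_series, flat_ndim)
```

The flat-column case at the end is the situation the original report assumed. The check in gbdt.py would raise there, but per the explanation above that is not the layout the data handler produces.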

Reference code: LGBModel._prepare_data -> DatasetH.prepare -> DatasetH._prepare_seg -> DataHandler.fetch -> DataHandler._fetch_data

SunsetWolf • Sep 16 '25 06:09