
Fail to concat factors with different MultiIndex

Open Hanson13 opened this issue 9 months ago • 0 comments

🐛 Bug Description

Hi, when I run the rdagent fin_factor demo, it raises AssertionError: Length of new_levels (3) must be <= self.nlevels (2). After debugging, I found that one factor has a 3-level MultiIndex, unlike the other factors, which have a 2-level MultiIndex. This causes an error at rdagent/scenarios/qlib/developer/factor_runner.py:159, where the factor_dfs are concatenated. Is there a suggested way to handle this exception? Thank you!
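To narrow down which factor is responsible, a small check like the one below may help. It is only a sketch: `find_mismatched_factors` is a hypothetical helper (not part of RD-Agent), the toy frames stand in for the real `factor_dfs`, and the level names `datetime` and `instrument` are assumptions rather than something shown in the debugger output.

```python
import pandas as pd


def find_mismatched_factors(factor_dfs, expected_nlevels=2):
    """Return (position, column names, nlevels) for each frame whose
    index depth differs from the expected one. Hypothetical helper."""
    mismatched = []
    for pos, df in enumerate(factor_dfs):
        nlevels = df.index.nlevels
        if nlevels != expected_nlevels:
            mismatched.append((pos, list(df.columns), nlevels))
    return mismatched


# Toy stand-ins for the real factor frames (values are made up).
dates = pd.to_datetime(["2025-02-14", "2025-02-17"])
instruments = ["SH000300", "SZ399300"]

# A well-formed factor: (datetime, instrument) index. Level names are assumed.
good_index = pd.MultiIndex.from_product(
    [dates, instruments], names=["datetime", "instrument"]
)
good = pd.DataFrame({"MediumTermMomentum": [0.1, 0.2, 0.3, 0.4]}, index=good_index)

# A malformed factor shaped like factor_dfs[3]: (instrument, datetime, instrument).
bad_index = pd.MultiIndex.from_tuples(
    [(inst, dt, inst) for inst in instruments for dt in dates]
)
bad = pd.DataFrame({"PVT": [1.0, 2.0, 3.0, 4.0]}, index=bad_index)

report = find_mismatched_factors([good, bad])
print(report)
```

Running this on the real list just before the `pd.concat` call should point at the frame with the extra level without having to print `.info()` for every factor.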

To Reproduce

Steps to reproduce the behavior:

  1. Run fin_factor.
  2. Wait for the program to run.

Expected Behavior

Screenshot

ipdb> pd.concat(factor_dfs, axis=1)
AssertionError: Length of new_levels (3) must be <= self.nlevels (2)

ipdb> p factor_dfs[0].info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 13011155 entries, (Timestamp('2008-12-29 00:00:00'), 'SH000300') to (Timestamp('2025-02-17 00:00:00'), 'SZ399300')
Data columns (total 1 columns):
 #   Column                  Dtype
 0   MediumTermMomentum      float32
dtypes: float32(1)
memory usage: 615.6+ MB
None

ipdb> p factor_dfs[1].info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 13011155 entries, (Timestamp('2008-12-29 00:00:00'), 'SH000300') to (Timestamp('2025-02-17 00:00:00'), 'SZ399300')
Data columns (total 1 columns):
 #   Column                  Dtype
 0   VolumeWeightedMomentum  float64
dtypes: float64(1)
memory usage: 665.2+ MB
None

ipdb> p factor_dfs[2].info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 13011155 entries, (Timestamp('2008-12-29 00:00:00'), 'SH000300') to (Timestamp('2025-02-17 00:00:00'), 'SZ399300')
Data columns (total 1 columns):
 #   Column                  Dtype
 0   RelativeStrengthIndex   float64
dtypes: float64(1)
memory usage: 665.2+ MB
None

ipdb> p factor_dfs[3].info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 13011155 entries, ('BJ430017', Timestamp('2023-05-31 00:00:00'), 'BJ430017') to ('SZ399300', Timestamp('2025-02-17 00:00:00'), 'SZ399300')
Data columns (total 1 columns):
 #   Column                  Dtype
 0   PVT                     float32
dtypes: float32(1)
memory usage: 640.6+ MB
None
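In the output above, the index of factor_dfs[3] looks like (instrument, datetime, instrument), while the other three are (datetime, instrument). One possible workaround, assuming the extra level is always a leading one and the remaining levels are already in (datetime, instrument) order, is to drop the surplus leading levels before concatenating. The sketch below uses a hypothetical helper `normalize_factor_index` and toy data; the level names are assumptions, and this is not the fix adopted by RD-Agent.

```python
import pandas as pd


def normalize_factor_index(df, level_names=("datetime", "instrument")):
    """Drop surplus leading index levels so the frame has len(level_names)
    levels, then sort and name them. Hypothetical helper; it assumes any
    extra levels come first, as in the PVT frame above."""
    extra = df.index.nlevels - len(level_names)
    if extra < 0:
        raise ValueError(f"index has too few levels: {df.index.nlevels}")
    if extra > 0:
        # Positions 0..extra-1 are the surplus leading levels.
        df = df.droplevel(list(range(extra)))
    df = df.sort_index()  # returns a new frame, so the caller's frame is untouched
    df.index = df.index.set_names(list(level_names))
    return df


# Toy stand-ins for the real factor frames (values are made up).
dates = pd.to_datetime(["2025-02-14", "2025-02-17"])
instruments = ["SH000300", "SZ399300"]

good = pd.DataFrame(
    {"MediumTermMomentum": [0.1, 0.2, 0.3, 0.4]},
    index=pd.MultiIndex.from_product(
        [dates, instruments], names=["datetime", "instrument"]
    ),
)
bad = pd.DataFrame(
    {"PVT": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_tuples(
        [(inst, dt, inst) for inst in instruments for dt in dates]
    ),
)

factor_dfs = [good, bad]
combined = pd.concat([normalize_factor_index(df) for df in factor_dfs], axis=1)
print(combined)
```

Dropping levels by position rather than by name is deliberate here: if the extra level shares a name with an existing one, a name-based lookup would be ambiguous. Whether the factor code itself should be regenerated instead of patched at concat time is a separate question.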

Environment

Note: Users can run rdagent collect_info to get system information and paste it directly here.

  • Name of current operating system:
  • Processor architecture:
  • System, version, and hardware information:
  • Version number of the system:
  • Python version:
  • Container ID:
  • Container Name:
  • Container Status:
  • Image ID used by the container:
  • Image tag used by the container:
  • Container port mapping:
  • Container Label:
  • Startup Commands:
  • RD-Agent version:
  • Package version:

Additional Notes

Hanson13 · Mar 13 '25 10:03