Is it possible to use Numba/CuPy/Cython to speed up the preprocessing step of FinRL?

Open dev590t opened this issue 2 years ago • 10 comments

The preprocessing is very slow in FinRL. Numba allows GPU parallelization with C-like speed for processing array data. Some high-speed trading frameworks, such as vectorbt, use it.

Is it possible to learn from vectorbt's experience and use Numba to speed up FinRL?
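For illustration, here is a minimal sketch of what a Numba-compiled indicator could look like. It is not FinRL or vectorbt code: `rolling_mean` and the sample `close` array are hypothetical names chosen for this example. Note that `numba.njit` compiles the loop to machine code for the CPU; running on a GPU would need a separate kernel written with `numba.cuda`. The import fallback is only there so the sketch still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback (assumption for this sketch): run as plain Python.
        return func


@njit
def rolling_mean(values, window):
    """Simple moving average; the first window-1 entries are NaN."""
    n = values.shape[0]
    out = np.full(n, np.nan)
    running = 0.0
    for i in range(n):
        running += values[i]
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        if i >= window - 1:
            out[i] = running / window
    return out


# Hypothetical closing prices, not real market data.
close = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sma = rolling_mean(close, 3)
```

The explicit loop is the kind of code that is slow in plain Python and that Numba is designed to compile, which is why it is written this way rather than with vectorized NumPy calls.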

dev590t avatar May 14 '22 06:05 dev590t

https://github.com/AI4Finance-Foundation/ElegantRL If you check ElegantRL, you will find more advanced ways of using GPUs, a DGX-2 server, and even NVIDIA's SuperPod. Enjoy!

YangletLiu avatar May 14 '22 18:05 YangletLiu

@XiaoYangLiu-FinRL If I understood the code correctly, ElegantRL is used in the DRL step. I mean using Numba in the data preprocessing step, like here:

data = DP.clean_data(data)
data = DP.add_technical_indicator(data, INDICATORS)
data = DP.add_vix(data)

Does this part of the code also use GPU acceleration?

dev590t avatar May 14 '22 21:05 dev590t

Yes, ElegantRL is for DRL algorithms. Your suggestions for accelerating data preprocessing are quite interesting. We look forward to workable code; would that be possible from your end?

YangletLiu avatar May 14 '22 22:05 YangletLiu

@dev590t Not yet. The current version does not support Numba. We may test the acceleration and add Numba if the results are good.

zhumingpassional avatar May 15 '22 03:05 zhumingpassional

Unfortunately, I'm not a Python data science programmer and have never used Numba. I discovered this library through the trading library vectorbt.

vectorbt can also preprocess data. I think it may be possible to take the dataframe produced by vectorbt and pass it directly to the DRL agent, instead of using the dataframe produced by FinRL's built-in code.

This has a few advantages:

  • GPU acceleration in preprocessing
  • it may support more indicators than FinRL. According to its documentation, it supports 99% of the indicators in the Technical Analysis Library, Pandas TA, and TA-Lib, so more indicators could be passed to the DRL agent.
  • less code to maintain in FinRL

Do you think externalizing the preprocessing to vectorbt is a good direction to explore?

dev590t avatar May 15 '22 07:05 dev590t

Yes, it is a good direction. Thanks for your suggestions.

zhumingpassional avatar May 15 '22 09:05 zhumingpassional

Some engineers said that they tried Numba, but errors occur when the scenario becomes complex. They recommend CuPy and Cython.
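As a rough sketch of the CuPy route (an illustration under stated assumptions, not FinRL code): CuPy mirrors much of the NumPy API, so a vectorized indicator can be written once against a module alias and run on either backend. The names `xp`, `rolling_mean_xp`, and `to_numpy` are hypothetical, and the code falls back to NumPy when CuPy or a usable GPU is not available.

```python
import numpy as np

try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable GPU is found
    xp = cp
except Exception:
    xp = np  # CPU fallback so the sketch runs anywhere


def rolling_mean_xp(values, window):
    """Vectorized moving average via cumulative sums (NaN until the window fills)."""
    arr = xp.asarray(values, dtype=np.float64)
    # Prepend 0 so that padded[k] is the sum of the first k values.
    padded = xp.concatenate((xp.zeros(1), xp.cumsum(arr)))
    out = xp.full(arr.shape[0], np.nan)
    # Window sum ending at index i is padded[i + 1] - padded[i + 1 - window].
    out[window - 1:] = (padded[window:] - padded[:-window]) / window
    return out


def to_numpy(array):
    """Copy a CuPy array back to host memory; NumPy arrays pass through."""
    return array.get() if hasattr(array, "get") else array


# Hypothetical closing prices, not real market data.
sma_xp = to_numpy(rolling_mean_xp([1.0, 2.0, 3.0, 4.0, 5.0], 3))
```

Whether this is faster than NumPy depends on array size, since small arrays can be dominated by the cost of moving data to and from the GPU.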

zhumingpassional avatar Jun 15 '22 03:06 zhumingpassional

I think it is better to first test existing GPU trading indicator libraries before deciding whether or not to reimplement the preprocessing step with Numba or CuPy. If the existing GPU trading indicator libraries are not suitable, CuPy could also be an interesting solution.

dev590t avatar Jun 15 '22 07:06 dev590t

With respect to existing GPU trading indicator libraries, could you give any examples?

zhumingpassional avatar Jun 15 '22 08:06 zhumingpassional

If I understand FinRL's interface correctly, FinRL can directly use data from other libraries. During the preprocessing step: for classic TA-Lib indicators, FinRL can use a subset of the features of the GPU trading library vectorbt: https://vectorbt.dev/ For portfolio management indicators, FinRL may be able to use some indicators from PyPortfolioOpt: https://pyportfolioopt.readthedocs.io/en/latest/RiskModels.html.
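A minimal sketch of that idea, assuming a long-format price table with `date`, `tic`, and `close` columns (the column names, tickers, and values below are made up for illustration). The indicator is computed outside the table, here with a plain pandas rolling mean standing in for an external library, and then merged back as an extra column.

```python
import pandas as pd

# Long-format prices: one row per (date, ticker). Hypothetical data.
data = pd.DataFrame({
    "date": ["2022-01-03", "2022-01-03", "2022-01-04",
             "2022-01-04", "2022-01-05", "2022-01-05"],
    "tic": ["AAA", "BBB"] * 3,
    "close": [10.0, 20.0, 11.0, 19.0, 12.0, 21.0],
})

# Many indicator libraries expect a wide table: dates as rows, tickers as columns.
wide = data.pivot(index="date", columns="tic", values="close")

# Stand-in for an externally computed indicator (2-day moving average).
external_sma = wide.rolling(window=2).mean()

# Convert the wide result back to long format and attach it as a new column.
long_sma = external_sma.reset_index().melt(
    id_vars="date", var_name="tic", value_name="sma_2"
)
merged = data.merge(long_sma, on=["date", "tic"], how="left")
```

The left merge keeps every original row, so dates where the indicator is not yet defined simply carry NaN and can be handled by whatever cleaning step follows.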

dev590t avatar Jun 15 '22 14:06 dev590t