[Proposal] Systematic RL support in qlib
2022/2/25
Package name:
- qlib.neutrader?
  - Sound, brand
  - Sounds limited to the "trading" scenario
- qlib.rl?
  - Shorter, easier to remember
  - Not exactly an RL framework
  - No ML as opposed to RL.
- qlib.sdm?
  - What is sdm?
K: Keep in an internal repo (possibly another repo)
D: Delete
TBD: To be discussed (possibly need merge effort)
Major efforts
- Unify config
  - Currently neutrader is based on `utilsd.config`.
  - qlib has another configuration system.
- Unify entry
  - Currently neutrader has several command line tools.
  - qlib is mostly launched from Python.
  - Merge with workflow, qrun.
- Documentation and tests
  - Neutrader has almost no documentation.
  - Neutrader has its own pytest.
  - … and coveragerc.
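
To make the "Unify config" item a bit more concrete, here is a minimal sketch of one possible bridge. It assumes the neutrader side can be treated as plain dataclasses (an assumption about how `utilsd.config` is used here) and that the qlib side is the usual `class` / `module_path` / `kwargs` dict. `PPOConfig`, `to_qlib_style` and the module path `qlib.rl.policy` are hypothetical names used only for illustration.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict


@dataclass
class PPOConfig:
    """Hypothetical typed config standing in for a neutrader-style config."""

    lr: float = 1e-4
    hidden_dim: int = 64


def to_qlib_style(cfg: Any, class_name: str, module_path: str) -> Dict[str, Any]:
    """Convert a dataclass instance into a class/module_path/kwargs dict."""
    if not is_dataclass(cfg) or isinstance(cfg, type):
        raise TypeError("expected a dataclass instance")
    return {"class": class_name, "module_path": module_path, "kwargs": asdict(cfg)}


# "qlib.rl.policy" is a placeholder module path, not an existing module.
policy_cfg = to_qlib_style(PPOConfig(lr=3e-4), "PPO", "qlib.rl.policy")
print(policy_cfg)
```

A one-way conversion like this would keep typed configs for RL components while letting the qlib workflow consume them as dicts. Whether to go this way or the reverse is part of what needs to be decided.
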
| Decision / target | Current file |
| --- | --- |
| K | .azure/docker.yml |
| D | .azure/pipeline.yml |
| TBD | .coveragerc |
| K | .gitignore |
| K | README.md |
| K | docker/base.Dockerfile |
| K | docker/neutrader.Dockerfile |
| qlib/examples | examples/backtest/qlib.yml |
| qlib.contrib.data.utils | examples/data/ordergen.py |
| | neutrader/__init__.py |
| <neutrader>.action | neutrader/action.py |
| | neutrader/data/__init__.py |
| neutrader.data | neutrader/data/base.py |
| neutrader.utils | neutrader/data/data_queue.py |
| qlib.contrib.data | neutrader/data/highfreq_handler.py |
| qlib.contrib.data | neutrader/data/highfreq_handler_order.py |
| qlib.contrib.data | neutrader/data/highfreq_handler_order_other_price.py |
| qlib.contrib.data | neutrader/data/highfreq_label_handler.py |
| qlib.contrib.data | neutrader/data/highfreq_label_handler_other_price.py |
| qlib.contrib.ops | neutrader/data/highfreq_ops.py |
| qlib.contrib.data | neutrader/data/highfreq_processor.py |
| qlib.contrib.data | neutrader/data/highfreq_provider.py |
| neutrader.data | neutrader/data/intraday.py |
| | neutrader/env/__init__.py |
| neutrader.utils | neutrader/env/finite_env.py |
| neutrader.env (deprecated) | neutrader/env/intraday_sa.py |
| neutrader.utils | neutrader/env/logging.py |
| | neutrader/forecast/__init__.py |
| K | neutrader/forecast/__main__.py |
| K | neutrader/forecast/common/__init__.py |
| K | neutrader/forecast/common/function.py |
| K | neutrader/forecast/common/util.py |
| K | neutrader/forecast/config.py |
| K | neutrader/forecast/dataset/__init__.py |
| K | neutrader/forecast/dataset/forecast.py |
| K | neutrader/forecast/dataset/minlevel.py |
| K | neutrader/forecast/model/__init__.py |
| K | neutrader/forecast/model/base.py |
| K | neutrader/forecast/model/darnn.py |
| | neutrader/network/__init__.py |
| neutrader.network | neutrader/network/base.py |
| neutrader.network | neutrader/network/darnn.py |
| K | neutrader/network/darnn4pred.py |
| neutrader.network | neutrader/network/recurrent.py |
| neutrader.observation | neutrader/observation.py |
| | neutrader/policy/__init__.py |
| neutrader.policy | neutrader/policy/base.py |
| K | neutrader/policy/baseline.py |
| neutrader.policy | neutrader/policy/twap/vwap/ac.py |
| K | neutrader/policy/mappo.py |
| neutrader.policy | neutrader/policy/ppo.py |
| neutrader.policy | neutrader/policy/utils.py |
| | neutrader/qlib_integration/__init__.py |
| neutrader.integration | neutrader/qlib_integration/feature.py |
| neutrader.integration | neutrader/qlib_integration/infrastructure.py |
| K | neutrader/qlib_integration/predictor.py |
| neutrader.integration | neutrader/qlib_integration/simulator.py |
| neutrader.integration | neutrader/qlib_integration/strategy.py |
| neutrader.reward | neutrader/reward.py |
| D | neutrader/search/__init__.py |
| D | neutrader/search/config_gen.py |
| D | neutrader/search/rerun_exp.py |
| D | neutrader/search/search.py |
| D | neutrader/search/util.py |
| neutrader.state | neutrader/state.py |
| neutrader.cli | neutrader/tools/__init__.py |
| neutrader.cli | neutrader/tools/backtest.py |
| neutrader.cli | neutrader/tools/backtest_qlib.py |
| neutrader.cli | neutrader/tools/config.py |
| neutrader.cli | neutrader/tools/ctl.py |
| neutrader.cli | neutrader/tools/openpai.py |
| neutrader.cli | neutrader/tools/train_onpolicy.py |
| TBD | setup.py |
| qlib/tests/rl | tests/assets/opds_15_225_backtest_qlib.csv |
| qlib/tests | tests/assets/opds_15_225_inner_twap_backtest_qlib.csv |
| qlib/tests | tests/assets/opds_15_225_single_day_backtest_qlib.csv |
| qlib/tests | tests/assets/peppo_15_225_backtest_qlib.csv |
| qlib/tests | tests/assets/twap_backtest_qlib.csv |
| qlib/tests | tests/assets/twap_nested_backtest_qlib.csv |
| qlib/tests | tests/assets/twap_single_day_backtest_qlib.csv |
| qlib/tests | tests/configs/hamburger.yml |
| qlib/tests | tests/configs/opds_15_225_backtest_qlib.py |
| qlib/tests | tests/configs/peppo_15_225_backtest_qlib.py |
| qlib/tests | tests/configs/ppo_30min_test.yml |
| qlib/tests | tests/configs/ppo_30min_test_qlib.yml |
| qlib/tests | tests/configs/ppo_30min_train.yml |
| qlib/tests | tests/configs/twap_30min.yml |
| qlib/tests | tests/configs/twap_backtest_qlib.yml |
| qlib/tests | tests/configs/twap_nested_backtest_qlib.yml |
| qlib/tests | tests/test_dataloader.py |
| qlib/tests | tests/test_e2e.py |
| qlib/tests | tests/test_finite_env.py |
| qlib/tests | tests/test_qlib_integration.py |
| qlib/tests | tests/test_state.py |
Status update (5/27)

Immediate work items are those I believe to be important; they are marked in italics.
RL framework - self-contained, agnostic to tasks
- [ ] Trainer @ultmaster #1125 waiting for review
- [x] Policy - interpreter - simulator
- [ ] Logging system (only the basics so far; many TODOs remain, e.g. more loggers including TensorBoard, MLflow, memory buffer) - 2 weeks
- [x] Auxiliary info collector
- [x] Reward
- [x] Seed (aka initial state)
- [ ] Other utilities - detailed breakdowns from #1076
- [x] Data queue, finite env
- [x] Env wrapper (env = interpreter + simulator, policy = policy)
- [ ] Non-Linux compatibility fix
- [ ] Performance optimization
- [ ] Rechargeable queue (needed by PM)
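
The "env = interpreter + simulator" split in the list above can be sketched roughly as follows. This is a toy under stated assumptions, not the actual framework code: `CountdownSimulator`, `EnvWrapper` and all method names are hypothetical. The idea is that the simulator owns the task state, the interpreters translate between simulator state/actions and what the policy sees, and the reward is computed from the simulator state.

```python
from typing import Callable, List, Tuple


class CountdownSimulator:
    """Hypothetical toy simulator: the state is the amount of work left."""

    def __init__(self, seed: int) -> None:
        self.remaining = seed  # the "seed" is the initial state

    def step(self, action: int) -> None:
        self.remaining -= action

    def get_state(self) -> int:
        return self.remaining

    def done(self) -> bool:
        return self.remaining <= 0


class EnvWrapper:
    """env = interpreter + simulator; the policy never sees the raw simulator."""

    def __init__(
        self,
        simulator_fn: Callable[[int], CountdownSimulator],
        state_interpreter: Callable[[int], List[float]],
        action_interpreter: Callable[[int], int],
        reward_fn: Callable[[int], float],
    ) -> None:
        self.simulator_fn = simulator_fn
        self.state_interpreter = state_interpreter
        self.action_interpreter = action_interpreter
        self.reward_fn = reward_fn

    def reset(self, seed: int) -> List[float]:
        # A fresh simulator is built from each seed (initial state).
        self.simulator = self.simulator_fn(seed)
        return self.state_interpreter(self.simulator.get_state())

    def step(self, policy_action: int) -> Tuple[List[float], float, bool]:
        # Policy action -> simulator action -> new state -> observation, reward.
        self.simulator.step(self.action_interpreter(policy_action))
        state = self.simulator.get_state()
        return self.state_interpreter(state), self.reward_fn(state), self.simulator.done()


env = EnvWrapper(
    simulator_fn=CountdownSimulator,
    state_interpreter=lambda s: [float(s)],  # simulator state -> observation
    action_interpreter=lambda a: a + 1,      # policy index -> amount to consume
    reward_fn=lambda s: -float(s),           # less work left is better
)
obs = env.reset(seed=3)
trajectory = [obs]
done = False
while not done:
    obs, reward, done = env.step(0)  # a trivial "policy" that always picks 0
    trajectory.append(obs)
print(trajectory)
```

Keeping the simulator free of any observation or reward logic is what makes the framework agnostic to tasks: swapping the simulator or the interpreters does not touch the policy.
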
Qlib integration - Make RL framework part of qlib
- [x] Use `qlib.backtest.Order` everywhere an "order" is needed.
- [ ] Strategy wrapper (strategy = interpreter + policy, simulator = `qlib.backtest` + something else). @lihuoran
- neutrader simulator migrated.
- [ ] RL can use simulator provided by qlib backtest (can run)
- [ ] Qlib inference can use trained policy (including simple policies like TWAP) in RL (only has internal drafts).
- [ ] Experiment/workflow management (closely related to "trainer" above).
- [ ] Programming with config only - launching backtest / training via config.
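
For the "strategy = interpreter + policy" item above, a rough sketch of how a simple policy such as TWAP could sit behind a strategy wrapper. Everything here is hypothetical (`ExecutionState`, `StrategyWrapper`, `decide`) and it does not use qlib's actual strategy or backtest classes; the loop at the bottom only stands in for whatever the backtest would do.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExecutionState:
    """Hypothetical snapshot the backtest would hand to the strategy."""

    remaining_amount: float
    remaining_steps: int


def twap_policy(obs: List[float]) -> float:
    """TWAP: trade an equal share of what is left in each remaining step."""
    remaining_amount, remaining_steps = obs
    return remaining_amount / remaining_steps


class StrategyWrapper:
    """strategy = interpreter + policy; the simulator lives outside."""

    def __init__(
        self,
        policy: Callable[[List[float]], float],
        state_interpreter: Callable[[ExecutionState], List[float]],
        action_interpreter: Callable[[float, ExecutionState], float],
    ) -> None:
        self.policy = policy
        self.state_interpreter = state_interpreter
        self.action_interpreter = action_interpreter

    def decide(self, state: ExecutionState) -> float:
        obs = self.state_interpreter(state)
        return self.action_interpreter(self.policy(obs), state)


strategy = StrategyWrapper(
    policy=twap_policy,
    state_interpreter=lambda s: [s.remaining_amount, float(s.remaining_steps)],
    # Never trade more than what is left.
    action_interpreter=lambda a, s: min(a, s.remaining_amount),
)

# Stand-in for the backtest loop: apply each decision and advance one step.
state = ExecutionState(remaining_amount=90.0, remaining_steps=3)
amounts = []
while state.remaining_steps > 0:
    amount = strategy.decide(state)
    amounts.append(amount)
    state = ExecutionState(state.remaining_amount - amount, state.remaining_steps - 1)
print(amounts)
```

A trained RL policy would plug into the same `policy` slot, which is why TWAP is a convenient first target for the inference path.
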
Tasks and algorithms - somewhat independent
- [ ] SAOE
- [x] The first SAOE simulator built upon "OPD-styled" data, along with several interpreters and basic policies.
- [ ] Second SAOE simulator based on `qlib.backtest`. Depends on "Strategy wrapper" above.
- [ ] New (and old) algorithms listed by Kan. @rk2900
- [ ] OPD
- [ ] Depends on: log actions of agents
- [ ] DDQN, PPO, AC, VWAP: decision needed - whether to load data and predict online, or cache the prediction offline.
- [ ] PM
Others
- [ ] Tutorials for first-time users.
- [ ] Continual improvements to tests.