trading-momentum-transformer
differences between paper and validation
This is great work, thanks for sharing the methods.
I followed the validation steps in the order listed in the README, and the calculated results are below:
They are very different from the results in the paper:
I am wondering what the reason is. Thank you for your help.
I believe I heard/read somewhere that they ran the model 5 times and averaged the results.
How long did you take to build all the features @nkchem09 ? In my computer it is very slow. Have you tried with 5% target volatility instead of 15% ?
It was run just as the code provides, with default parameters. The machine is an i9-9900K, 64 GB DDR4 and a 2080 Ti, and it takes a long time to get the result too.
I ran the model using Kieran's data (5x per @replacementAI and as specified in the paper) and can confirm the performance. I've also run it on daily, day-part, hourly and five-minute equity prices. It finds signal with this relatively noisy data, though (a) the signal declines as the time-series frequency increases, and (b) the importance of CPD breaks down. Regarding CPD, it takes a while to run, with limited upside from using a GPU. That's due either to a configuration issue (getting the gpflow libs right is challenging) or to the sequential nature of the problem. At 3.7 GHz it will do a ticker in roughly 60 minutes per CPU on the default data set. As you move to higher-frequency data, the time required blows up as the value breaks down.
That's good to hear @MickyDowns. Could you share the results on the performance of the different time candles?
Hey @replacementAI, let me see about providing my version of @kieranjwood's framework after I'm done testing it. It's table-driven, so you can run data at different frequencies relatively easily. I think that will be more useful than providing numbers which depend on my separate feature engineering. However, as you'd expect, the performance of these predictors degrades on higher-frequency data. Fortunately, new types of predictors become available as you move from bar, to bid/ask, to LOB data.
Hi, I also get different results.
I have tried running the TFT model without the change-point module with the command "python.exe -m examples.run_dmn_experiment TFT"
It then runs experiments by default on:
- 2016-2017 - Here position is always -1
- 2017-2018 - Position varies but most of the time higher than 0.9
- 2019-2020 - Position always over 0.9
- 2020-2021 - Position varies but most of the time higher than 0.9
For each experiment, long-only is always better than the TFT model. Here is the comparison between the cumulative sums of captured returns and raw returns for the 2017-2018 and 2020-2021 periods. The returns line should be the same as always going fully long.
2017-2018
2020-2021
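A comparison like the one in these charts can be rebuilt from the experiment output with pandas. The column names "position" and "returns" below are assumptions for illustration, so map them to whatever the captured_returns file actually contains:

```python
import pandas as pd

def compare_to_long(df):
    """Cumulative captured returns of the strategy vs an always-fully-long
    benchmark, built from per-day positions and raw returns.
    Assumes columns 'position' and 'returns' (names are assumptions)."""
    out = pd.DataFrame(index=df.index)
    out["experiment"] = (df["position"] * df["returns"]).cumsum()
    out["long_only"] = df["returns"].cumsum()  # always long with position = 1
    return out
```

Plotting both columns of the result should reproduce the two lines compared above.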
Is this normal? Does anyone have any idea about what is the issue?
No, it is not. When I ran the experiments with the futures contracts as described in the paper, the position values varied a lot day by day. The Sharpe ratio was around 2 to 3 for the out-of-sample years, if I remember correctly.
I haven't made any changes to the code though; I just ran the commands shown in the README, so I also used the same dataset. Then I created the features and ran this command:
python.exe -m examples.run_dmn_experiment TFT
I haven't changed anything inside the code.
Did you calculate the 21- and 126-day CPD and include them in the features file before running the experiment? They affect the performance of the model quite a bit compared to no CPD.
No I haven't, because it takes a lot of time. But even without CPD, the paper shows some results without the module, and they are not comparable to what I am getting. Most of the time I get positions over 0.9 or below -0.9.
Hmm, I did a quick rerun of the experiment in my local codebase without the CPD. It still shows a Sharpe ratio of around 1-2.
Maybe an issue local to your codebase? Here is my config:
Python version: Python 3.10.8 (using pyenv)
OS: WSL Ubuntu 22.04.1 (with CUDA support)
Pip packages:
~/trading-momentum-transformer$ pip freeze absl-py==1.3.0 aiohttp==3.8.3 aiosignal==1.3.1 appdirs==1.4.4 astroid==2.9.0 astunparse==1.6.3 async-timeout==4.0.2 attrs==22.1.0 backcall==0.2.0 black==22.1.0 cachetools==4.2.2 certifi==2021.10.8 charset-normalizer==2.0.4 click==8.0.4 cloudpickle==1.6.0 coloredlogs==15.0.1 contourpy==1.0.6 cycler==0.10.0 dataclasses==0.6 debugpy==1.5.1 decorator==5.0.9 Deprecated==1.2.12 dm-tree==0.1.6 empyrical==0.5.5 flatbuffers==2.0.7 fonttools==4.38.0 frozenlist==1.3.3 fsspec==2022.11.0 gast==0.3.3 gcsfs==2022.11.0 google-api-core==2.8.2 google-auth==1.34.0 google-auth-oauthlib==0.4.5 google-cloud-appengine-logging==1.1.1 google-cloud-audit-log==0.2.4 google-cloud-core==2.3.2 google-cloud-logging==3.3.1 google-cloud-storage==2.7.0 google-crc32c==1.5.0 google-pasta==0.2.0 google-resumable-media==2.4.0 googleapis-common-protos==1.56.4 gpflow==2.6.3 grpc-google-iam-v1==0.12.4 grpcio==1.51.1 grpcio-status==1.48.2 h5py==3.7.0 humanfriendly==10.0 ibapi @ file:///home/aicheung/trading-momentum-transformer/ibapi-10.20.1-py3-none-any.whl idna==3.2 inflection==0.5.1 ipykernel==6.9.2 ipython==7.26.0 ipython-genutils==0.2.0 isort==5.9.3 jedi==0.18.0 joblib==1.2.0 jupyter-client==6.1.12 jupyter-core==4.7.1 keras==2.11.0 Keras-Preprocessing==1.1.2 keras-tuner==1.0.3 kiwisolver==1.3.1 kt-legacy==1.0.4 lark==1.1.5 lazy-object-proxy==1.6.0 libclang==14.0.6 lxml==4.9.1 Markdown==3.3.4 matplotlib==3.6.2 matplotlib-inline==0.1.6 mccabe==0.6.1 more-itertools==8.12.0 mpmath==1.2.1 multidict==6.0.3 multipledispatch==0.6.0 multitasking==0.0.9 mypy-extensions==0.4.3 nest-asyncio==1.5.4 numpy==1.23.5 oauthlib==3.1.1 onnx==1.12.0 onnxruntime-gpu==1.13.1 opt-einsum==3.3.0 packaging==21.0 pandas==1.3.5 pandas-datareader==0.10.0 parso==0.8.2 pathspec==0.9.0 pexpect==4.8.0 pickleshare==0.7.5 Pillow==9.3.0 platformdirs==2.5.1 prompt-toolkit==3.0.19 proto-plus==1.22.1 protobuf==3.19.6 psutil==5.9.0 ptyprocess==0.7.0 pyasn1==0.4.8 pyasn1-modules==0.2.8 Pygments==2.9.0 
pylint==2.12.2 pyparsing==2.4.7 python-dateutil==2.8.2 pytz==2021.1 pyzmq==22.2.1 Quandl==3.7.0 regex==2022.10.31 requests==2.26.0 requests-oauthlib==1.3.0 rsa==4.7.2 scikit-learn==1.2.0 scipy==1.9.3 six==1.16.0 sympy==1.11.1 tabulate==0.8.9 tensorboard==2.11.0 tensorboard-data-server==0.6.1 tensorboard-plugin-wit==1.8.1 tensorflow==2.11.0 tensorflow-estimator==2.11.0 tensorflow-io-gcs-filesystem==0.28.0 tensorflow-probability==0.18.0 termcolor==1.1.0 tf2onnx==1.13.0 threadpoolctl==2.2.0 toml==0.10.2 tomli==2.0.1 tornado==6.1 traitlets==5.1.1 typing_extensions==4.1.1 urllib3==1.26.6 wcwidth==0.2.5 Werkzeug==2.0.1 wrapt==1.13.3 yarl==1.8.2 yfinance==0.1.63
The Quandl data is the same as when I originally ran it a few months ago. I did not refresh it.
Sharpe ratio 1-2 on what period?
For example, for 2020-2021 I also get a Sharpe ratio of 1.77 looking at result.json, but it is still worse than the baseline, meaning it is doing worse than the always-fully-long position. Looking back at the captured_returns file for 2020-2021 with TFT without the CPD module, the position sizes vary, and most of them are not over 0.9 or below -0.9. But for the other periods, what I previously said still holds.
For example, this is 2019-2020; as you can see, most of the time the positions are either over 0.9 or below -0.9. I don't think this is a good sign.
If you are getting good Sharpe ratios, then the system is working properly. Momentum trading is different from buy and hold, and 2017-2018 and 2020-2021 were both extremely good years for buy-and-holders. On the other hand, momentum trading gives you good returns in bear-market years (my system using this codebase had a Sharpe ratio over 2 in 2022, while buy-and-holders lost money). And since the Sharpe ratio is high with this system, you can apply leverage (or volatility-scale the whole portfolio to a target volatility, similar to what the paper has done) to get higher absolute returns.
I understand what you are saying, but I still think that if the strategy does this, it is losing 1/4 of the potential gain, and that is not good. The strategy should at least match buy-and-hold returns in bull markets while performing best in bear markets.
If it loses 1/4 of the potential gain that could have been made by just buying and holding, and then avoids losses when there is a bear market, overall it would still be the same as buy and hold. Do you agree with me?
The potential gain you mention, as well as the account balance of the long-only portfolio in the chart, are absolute returns. This system is not optimised for the maximum absolute return; it is optimised for the maximum return per unit of risk (Sharpe ratio). And it works: notice that there are very few drawdowns in the Experiment line on your chart.
Here is one thing you can check for yourself: multiply the Experiment line on your chart so that the final values of both lines meet, then check the Sharpe ratio, portfolio volatility and maximum drawdown for both lines. The results from the experiment are better in all three metrics.
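These three metrics can be computed from a daily returns series with a short sketch like the following (the helper names are mine, not part of the repo):

```python
import numpy as np

def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of a daily returns series (risk-free rate assumed 0)."""
    returns = np.asarray(returns, dtype=float)
    return returns.mean() / returns.std() * np.sqrt(periods_per_year)

def ann_vol(returns, periods_per_year=252):
    """Annualised volatility of a daily returns series."""
    return np.asarray(returns, dtype=float).std() * np.sqrt(periods_per_year)

def max_drawdown(returns):
    """Maximum peak-to-trough drawdown of the compounded equity curve."""
    curve = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    peak = np.maximum.accumulate(curve)
    return np.max((peak - curve) / peak)
```

Note that multiplying the whole equity curve by a constant leaves the per-period returns, and therefore all three metrics, unchanged, which is what makes rescaling the Experiment line to meet the long-only line a fair comparison.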
As for "The strategy must at least be as good as buy-and-hold returns in a bull market while performing best in a bear market": no, you can't get that by simply applying the signals from this system. You have to use leverage or volatility scaling in addition, i.e. in bull markets, as shown in your chart, the volatility is low, so you should increase leverage to maximise absolute returns.
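Volatility targeting of this kind can be sketched roughly as follows. This uses a simple trailing rolling-std estimate, not the paper's exact ex-ante estimator, and the function name and defaults are assumptions:

```python
import numpy as np

def vol_target_positions(positions, returns, target_vol=0.15, span=60,
                         periods_per_year=252):
    """Scale raw positions so the strategy's ex-ante annualised volatility
    matches target_vol, using a trailing window of past returns only."""
    positions = np.asarray(positions, dtype=float)
    returns = np.asarray(returns, dtype=float)
    scaled = np.empty_like(positions)
    for t in range(len(positions)):
        window = returns[max(0, t - span):t]  # strictly past data, no look-ahead
        if len(window) > 1:
            sigma = window.std() * np.sqrt(periods_per_year)
        else:
            sigma = target_vol  # no history yet: leave the position unscaled
        scaled[t] = positions[t] * target_vol / max(sigma, 1e-8)
    return scaled
```

In a calm bull market sigma is small, so the scaling leverages the position up; in a volatile bear market it scales the position down.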
[image: image] https://user-images.githubusercontent.com/21694707/224821587-113c3a2a-4344-4538-9613-4a1e9bdc3b51.png
Hi @MickyDowns, it's great to see that you can confirm the performance. I have a question about how 'captured_returns' is calculated in the code; please see https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/deep_momentum_network.py#L484, where 'returns' is a volatility-scaled return. Should we use the real daily return instead? I understand the training objective is to optimise the Sharpe ratio, which is calculated using volatility-scaled returns, but when we want the actual return, I think we need to use the actual daily return.
Hi @aicheung, do you have any thoughts on my comments above? Thanks!
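To make the distinction concrete, here is a minimal sketch of the two ways a captured return could be computed; the function names are mine, and the volatilities are assumed to be annualised:

```python
def captured_return_scaled(position, daily_return, realised_vol, target_vol=0.15):
    """Volatility-scaled captured return, the form used inside the Sharpe
    training objective: position * (target_vol / realised_vol) * return."""
    return position * (target_vol / realised_vol) * daily_return

def captured_return_actual(position, daily_return):
    """Unscaled captured return: what the position actually earns that day."""
    return position * daily_return
```

Evaluating absolute performance with the scaled version implicitly assumes the position has already been levered to the target volatility, which is why the two numbers can diverge.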
Same here. Did you find out the reason? You can add me @danzb0 on Telegram for further discussion :)