qlib
qlib copied to clipboard
Bug fix for Rank and WMA operators
Description
- change the scaling in the inner function of Rank operator from len(x1) to 100
- add 1 on the non-normalized weights of WMA operators
Motivation and Context
For 1): I would guess the goal is to scale the ranking result between 0 and 1. According to the scipy document, percentileofscore
Compute the percentile rank of a score relative to a list of scores. A percentileofscore of, for example, 80% means that 80% of the scores in a are below the given score.
it doesn't make sense to divide by length to array size, instead, should divide by 100.
For 2): The common convention of linear weighted MA with window size d should be weighted by d, d-1, ..., 1, instead of d-1, d-2, ..., 0, see e.g. https://www.fidelity.com/learning-center/trading-investing/technical-analysis/technical-indicator-guide/wma and the decay_linear
function in famous 101 Formulaic Alphas by Zura Kakushadze from WorldQuant.
Though convention problem is arguable I think it's better to be more intuitive.
How Has This Been Tested?
- [x] Pass the test by running:
pytest qlib/tests/test_all_pipeline.py
under upper directory ofqlib
. - [ ] If you are adding a new feature, test on your own test scripts.
Screenshots of Test Results (if appropriate):
-
Pipeline test:
-
Your own tests:
Types of changes
- [x] Fix bugs
- [ ] Add new feature
- [ ] Update documentation
@you-n-g Would you take a look please? Let me know if anything needs improvement. Thanks!
Nice shot! Thanks for fixing the bug! I have left one comment. LGTM after it is fixed.
Thanks for reply. Updated as you suggested. Also corrected some wrong param list comments.
@you-n-g any feedback please?
@qianyun210603 It looks that the CI still fails. Please check the error. Thanks
@you-n-g It's because the base 'microsoft:main' on Aug 13, which was the base of my pull request, had failed CI . Fixed after merging the latest 'microsoft:main'.
@you-n-g took a further look on the failure it complains
self = Rolling [window=10,min_periods=1,center=False,axis=0,method=single]
attr = 'rank'
def __getattr__(self, attr: str):
if attr in self._internal_names_set:
return object.__getattribute__(self, attr)
if attr in self.obj:
return self[attr]
raise AttributeError(
> f"'{type(self).__name__}' object has no attribute '{attr}'"
)
E AttributeError: 'Rolling' object has no attribute 'rank'
I would guess it is caused by too low pandas version which rank has not been implemented in Rolling. Could you let me know where to check the pandas version in the Action run? Thanks!
@you-n-g
Dig in some further, it looks the py3.7 tests fails because Rolling.rank
was added from pandas 1.4.0+ but py3.7 can only run pandas up to 1.3.5. So the failures.
Therefore we need to fall back to the original scipy solution for rank for py3.7.
I modified the pull request to use scipy version for pandas 1.3.5 and below and native pandas Rolling.rank
for pandas 1.4.0 and above. The code is a bit ugly though.
Alternative would be drop the py37 support and use native pandas Rolling.rank
, or
fall back to the scipy solution for all versions (I would recommend against as scipy version is 20 times slower than pandas native from a simple test I did)
Let me know your thought. thx.
@you-n-g any thoughts please?
@you-n-g any progress please?
@you-n-g any comments on this?
Sorry for the late response. Thanks for your great efforts!