pandas-ta
pandas-ta copied to clipboard
Potential bug with large chained custom strategies
Which version are you running? The lastest version is on Github. Pip is for major releases.
import pandas_ta as ta
print(ta.version)
0.3.14b0
Do you have TA Lib also installed in your environment?
$ pip list
Yes TA-Lib 0.4.24
Most of my custom indicators rely on TA-Lib.
Did you upgrade? Did the upgrade resolve the issue?
$ pip install -U git+https://github.com/twopirllc/pandas-ta
Yes I did install the latest. No it didn't help.
Describe the bug A clear and concise description of what the bug is. I'm chaining together custom indicators. If I have more than 5 indicators it fails to recognize the custom indicator columns. If there are 5 or fewer it runs just fine.
To Reproduce
# THIS FAILS .....
class DSStrategizer(Strategizer):
def init_indicators(self, base_df, **kwargs):
self.indicators = base_df # Start with history <<-- Pulled from kucoin exchange
self.indicators.set_index(
pd.DatetimeIndex(self.indicators["date_time"]), inplace=True)
TheStrategy = ta.Strategy(
name="DSIND",
description="Just Testing",
ta=[
{"kind": 'lrsi'}, # custom
{"kind": 'mama_ppo'}, # custom
{"kind": 'ohlc4'},
{"kind": 'ht_dcperiod'}, # Direct use of talib col name HT_DCPERIOD
{"kind": 'minvp'}, # Uses HT_DCPERIOD
{"kind": 'maxvp'}, # Uses HT_DCPERIOD
{"kind": "mavp", "matype": MA_Type.TEMA,
"maxperiod": 40, "minperiod": 10}, # Uses HT_DCPERIOD
]
)
self.indicators.ta.strategy(TheStrategy, **kwargs)
return self.indicators
Expected behavior The strategy method should populate indicators. The three indicators that depend on on HT_DCPERIOD should not generate errors. If there are five or fewer indicators in the ta array, it's fine. But as soon as I add the 6th indicator it fails (see below).
I expect to be able to load an arbitrarily large number of indicators, including chaining custom indicators.
Screenshots When it fails....
[X] Ooops!!! It's True, the series 'HT_DCPERIOD' was not found in start_timestamp, date_time, open, close, high, low, amount, volume
[X] Ooops!!! It's True, the series 'HT_DCPERIOD' was not found in start_timestamp, date_time, open, close, high, low, amount, volume
[X] Ooops!!! It's True, the series 'HT_DCPERIOD' was not found in start_timestamp, date_time, open, close, high, low, amount, volume
When it works ...
Additional context It looks like after 5 indicators the strategy method loses the new indicators and only has my base history (the columns above are what I've populated from the kucoin exchange) date_time is the time index.
I've changed the order. I've always made sure HT_DCPERIOD is called before the variable period methods. As long as there are 5 or fewer methods everything fails As soon as I add the 6th method, all three methods that depend on HT_DCPERIOD FAIL (minvp, maxvp, mavp)
Also, it doesn't seem to matter what the indicators are. I can add sma, ema etc. it's the number of indicators in the list that triggers the error.
This DOES NOT FAIL....
ta=[
# {"kind": 'lrsi'}, # custom
# {"kind": 'mama_ppo'}, # custom
{"kind": 'ohlc4'},
{"kind": 'ht_dcperiod'}, # Direct use of talib col name HT_DCPERIOD
{"kind": 'minvp'}, # Uses HT_DCPERIOD
{"kind": 'maxvp'}, # Uses HT_DCPERIOD
{"kind": "mavp", "matype": MA_Type.TEMA, "maxperiod": 40, "minperiod": 10}, # Uses HT_DCPERIOD
]
I have a workaround
Seems to be related to the number of cores and I expect it's multiprocessing .
Not sure what a long-term fix would be.
class DSStrategizer(Strategizer):
def init_indicators(self, base_df, **kwargs):
self.indicators = base_df # Start with history
self.indicators.set_index(
pd.DatetimeIndex(self.indicators["date_time"]), inplace=True)
Basestrategy = ta.Strategy(
name="DSBase",
description="Just Testing",
ta=[
{"kind": 'ht_dcperiod'}, # Direct use of talib col name HT_DCPERIOD
{"kind": 'lrsi'}, # custom
{"kind": 'mama_ppo'}, # custom
]
)
self.indicators.ta.strategy(Basestrategy, verbose=True, **kwargs)
TheStrategy = ta.Strategy(
name="DSIND",
description="Just Testing",
ta=[
# {"kind": 'ohlc4'},
{"kind": 'minvp'},
{"kind": 'maxvp'}, # Uses HT_DCPERIOD
{"kind": "mavp", "matype": MA_Type.TEMA,
"maxperiod": 40, "minperiod": 10}, # Uses HT_DCPERIOD
]
)
self.indicators.ta.strategy(TheStrategy, verbose=True, **kwargs)
return self.indicators
# self.indicators.concat(dcp)
Hello @DavidEBrumbaugh,
Hmmm... quite a few things to unpack here. Especially since you have ht_dcperiod
, minvp
, maxvp
, and mav
which are not part of this library yet, so I am a little confused how Pandas TA (PTA) is even processing them. I plan to include the rest of the ht_*
indicators before updating the main branch. So are you running ht_dcperiod
somewhere before DSStrategizer.init_indicators(base_df)
is called? Or is it already included with base_df
? In fact, which columns exist in base_df
? Also, I think 🤔 , if ht_dcperiod
exists in base_df
, then you do not need to include it in the ta.Study()
(or ta.Strategy()
).
Before moving forward, I recommend updating to the development version.
$ pip install -U git+https://github.com/twopirllc/pandas-ta.git@development
Note: df.ta.strategy()
is depreciated in lieu of df.ta.study()
as well as ta.Strategy()
-> ta.Study()
.
Development README
Short Answer
Since multiprocessing is the default mode of df.ta.study()
and you have few indicators and chaining, you should dodf.ta.study(TheStudy, cores=0, **kwargs)
. Multiprocessing serves no benefit for less than a certain number of indicators. It can take more time to spin up the multiprocessing pool, process, then tear it down than to process them sequentially. Furthermore, sequential processing makes it easier for chaining to work.
# With the Development Branch
class DSStrategizer(Strategizer):
def init_indicators(self, base_df, **kwargs):
self.indicators = base_df # Start with history <<-- Pulled from kucoin exchange
self.indicators.set_index(pd.DatetimeIndex(self.indicators["date_time"]), inplace=True)
TheStudy = ta.Study(
name="DSIND",
description="Just Testing",
cores=0, # <<--- Add this
ta=[
{"kind": 'lrsi'}, # custom
{"kind": 'mama_ppo'}, # custom
{"kind": 'ohlc4'},
{"kind": 'ht_dcperiod'}, # Direct use of talib col name HT_DCPERIOD
{"kind": 'minvp'}, # Uses HT_DCPERIOD
{"kind": 'maxvp'}, # Uses HT_DCPERIOD
{"kind": "mavp", "matype": MA_Type.TEMA, "maxperiod": 40, "minperiod": 10}, # Uses HT_DCPERIOD
]
)
self.indicators.ta.study(TheStudy, **kwargs)
return self.indicators
Alternative
If you absolutely must use multiprocessing, then you may need to play around with chunksize
argument tailored to your environment and use case. The default value of chunksize
is the number of cores of the machines... 6 in your case. Some details about chunksize. Note: there is no simple solution.
Long Term
It appears I need to loosen control of the multiprocessing pool arguments. 🤔 🤷🏼♂️
Hope this helps!
Thanks for the support. 😎 KJ
Thanks KJ - yes it does help a lot.
As to your question I used ht_dcperiod as a custom indicator ... All of my custom indicators use HT so I'm just building a ton of custom indicators....
Originally when base_df comes in it's just open, close, high, low, amount, volume - I downloaded it from the kucoin exchange API.
# pandas_ta custom implemenation of ta ht_dcperiod
from pandas import Series
from pandas_ta.utils import get_offset, verify_series
import talib as tal # NOTE: talib is just plain required
def ht_dcperiod(close, offset=None, **kwargs):
close = verify_series(close)
offset = get_offset(offset)
if close is None:
return
result = tal.HT_DCPERIOD(close)
dcp = Series(result, index=close.index)
# Offset
if offset != 0:
dcp = dcp.shift(offset)
# Handle fills
if "fillna" in kwargs:
dcp.fillna(kwargs["fillna"], inplace=True)
if "fill_method" in kwargs:
dcp.fillna(method=kwargs["fill_method"], inplace=True)
# Name and Categorize it
dcp.name = "HT_DCPERIOD"
dcp.category = "cycles"
return dcp
ht_dcperiod.__doc__ = \
'''
Yada yada
'''
def ht_dcperiod_method(self, offset=None, **kwargs):
close = self._get_column(kwargs.pop("close", "close"))
result = ht_dcperiod(close=close, offset=offset, **kwargs)
return self._post_process(result, **kwargs)
@DavidEBrumbaugh
yes it does help a lot.
Excellent! Thanks for your patience.
As to your question I used ht_dcperiod as a custom indicator ... All of my custom indicators use HT so I'm just building a ton of custom indicators....
Gotcha!
Yeah my goal is to finish the rest of the ht_*
indicators before updating the main branch and pip. Of course I will be wrapping the TA Lib ones as well like you did in the previous comment.
Originally when base_df comes in it's just open, close, high, low, amount, volume - I downloaded it from the kucoin exchange API.
That's what I thought. Just wanted to make sure. 😉
KJ