vectorbt
bug: business day frequency leads to error
When a time series has a business-day index frequency, the code crashes as soon as I call the stats() method of an indicator.
import numpy as np
import pandas as pd
import vectorbt as vbt
df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=20),
    index=pd.bdate_range(start="2021-01-01", periods=20),
)
ma = vbt.MA.run(close=df, window=10)
print(ma.stats())
Traceback (most recent call last):
File "/home/andi/coding/test_project/test_project/vbt_bus_day_issue.py", line 12, in <module>
ma.stats()
File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/generic/stats_builder.py", line 250, in stats
silence_warnings = self.stats_defaults.get('silence_warnings', False)
File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/generic/stats_builder.py", line 58, in stats_defaults
dict(settings=dict(freq=self.wrapper.freq))
File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/base/array_wrapper.py", line 432, in freq
return freq_to_timedelta(self.index.freq)
File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/utils/datetime_.py", line 24, in freq_to_timedelta
return pd.Timedelta(arg)
File "pandas/_libs/tslibs/timedeltas.pyx", line 1315, in pandas._libs.tslibs.timedeltas.Timedelta.__new__
ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not BusinessDay
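The failure can be reproduced with pandas alone, which suggests that the index frequency, not the indicator, trips the conversion. This is a minimal sketch. `to_timedelta_or_none` is a hypothetical helper (not part of vectorbt) that mirrors the `pd.Timedelta(arg)` call from the traceback:

```python
import pandas as pd


def to_timedelta_or_none(freq):
    """Hypothetical helper: return pd.Timedelta(freq), or None if pandas rejects it."""
    try:
        return pd.Timedelta(freq)
    except (ValueError, TypeError):
        return None


bday_index = pd.bdate_range(start="2021-01-01", periods=20)

# The index carries a BusinessDay offset, which has no fixed duration.
bday_result = to_timedelta_or_none(bday_index.freq)

# A plain "1 day" string is a fixed span and converts fine.
day_result = to_timedelta_or_none("1 day")

print(type(bday_index.freq).__name__, "->", bday_result)
print("1 day ->", day_result)
```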
When df is defined without business days, the code runs smoothly.
import numpy as np
import pandas as pd
import vectorbt as vbt
df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=20),
    index=pd.date_range(start="2021-01-01", periods=20),  # change from 'bdate' to 'date'
)
ma = vbt.MA.run(close=df, window=10)
print(ma.stats())
Start 2021-01-01 00:00:00
End 2021-01-20 00:00:00
Period 20 days 00:00:00
Name: agg_func_mean, dtype: object
@andreas-vester business days cannot be represented as timedelta. Use regular days as frequency and reduce number of days in year_freq accordingly (see https://github.com/polakowo/vectorbt/issues/294#issuecomment-989641568).
@andreas-vester business days cannot be represented as timedelta.
OK, that's clear.
Use regular days as frequency and reduce number of days in year_freq accordingly (see #294 (comment)).
That's not clear to me. You're probably not talking about reindexing the time series, are you?
df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=260),
    index=pd.bdate_range(start="2021-01-01", periods=260),
)
idx = pd.date_range(start=df.index[0], end=df.index[-1])
df = df.reindex(idx)
vbt.settings.returns["year_freq"] = "260 days"
ma = vbt.MA.run(close=df, window=10)
print(ma.stats())
print(ma.ma)  # the np.nan rows added by reindexing are not ignored
Start 2021-01-01 00:00:00
End 2021-12-30 00:00:00
Period 364 days 00:00:00
Name: agg_func_mean, dtype: object
ma_window 10
2021-01-01 NaN
2021-01-02 NaN
2021-01-03 NaN
2021-01-04 NaN
2021-01-05 NaN
... ..
2021-12-26 NaN
2021-12-27 NaN
2021-12-28 NaN
2021-12-29 NaN
2021-12-30 NaN
[364 rows x 1 columns]
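The all-NaN result above follows from the reindexing step itself. The sketch below uses a plain pandas rolling mean as a stand-in for `vbt.MA`. This is an assumption: the indicator's NaN handling is inferred only from the output shown. Deterministic data replaces the random values so the counts are reproducible:

```python
import numpy as np
import pandas as pd

bdays = pd.bdate_range(start="2021-01-01", periods=260)
close = pd.Series(np.arange(260, dtype=float), index=bdays)

# Reindexing onto calendar days inserts a NaN row for every weekend day.
calendar_days = pd.date_range(start=bdays[0], end=bdays[-1])
reindexed = close.reindex(calendar_days)
added_nans = int(reindexed.isna().sum())

# pandas rolling mean as a stand-in for vbt.MA (an assumption, see above).
# Every 10-day calendar window contains a weekend, so no window is complete.
rolled = reindexed.rolling(window=10).mean()

print(len(reindexed), added_nans, bool(rolled.isna().all()))
```

On the original business-day index the same rolling mean has values from the tenth row onward, so the reindexing is what wipes out the moving average.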
No need to reindex, just pass freq='d' and year_freq=... to portfolio so it can internally do to_timedelta(year_freq) / to_timedelta(freq) to get the annualization factor.
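The ratio described here can be checked with pandas directly. `annualization_factor` is a hypothetical helper written for illustration, not a vectorbt function. The frequency is spelled "1 day" rather than 'd' because this sketch does not rely on vectorbt's own frequency parsing:

```python
import pandas as pd


def annualization_factor(year_freq, freq):
    """Hypothetical helper: number of `freq` periods in one `year_freq`."""
    return pd.Timedelta(year_freq) / pd.Timedelta(freq)


# 260 trading days per year, one bar per day.
factor = annualization_factor("260 days", "1 day")
print(factor)
```

Both arguments have to be fixed-length spans for the division to work, which is why a business-day offset cannot be used as the frequency.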
Sorry, but I didn't mention any Portfolio class in my example. For now I am only struggling with the indicator. So which settings exactly am I supposed to set?
vbt.settings.returns["year_freq"] = "260 days"
vbt.settings.portfolio["freq"] = "d"
This is not working as intended.
This will set frequency for all classes: vbt.settings.array_wrapper['freq'] = 'd'
This will set frequency for all classes:
vbt.settings.array_wrapper['freq'] = 'd'
OK, this works perfectly. Thanks.
Two more questions:
- Is this setting mentioned in the docs?
- Would it make sense for the code to apply this setting automatically when it infers the date frequency to be business days? When would that not be useful?
It may be somewhere in the docs, but you can definitely find it in the example notebooks, wherever the frequency cannot be inferred. Yes, I can make it default to days automatically.
@polakowo I found this issue after hitting the same error with 1M and 1w interval data from the Binance API, when running a strategy and then trying to print stats and plots from the portfolio object.
ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not Week
or
ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not Month
I don't fully understand what these settings are doing, and what they should be set to for weekly/monthly series data. Are these settings (vbt.settings.array_wrapper['freq'] = 'd' and vbt.settings.returns["year_freq"] = "260 days") only relevant for calculating the Sharpe ratio and other annualized statistics?
If I set:
vbt.settings.array_wrapper['freq'] = 'w' my strategy completes successfully without error (weekly series from Binance). I don't appear to need to set the ["year_freq"]. What am I actually changing here and what should the combination of these two settings be?
The ArrayWrapper documentation does not say what the valid values of freq are: https://vectorbt.dev/api/base/array_wrapper/#vectorbt.base.array_wrapper.ArrayWrapper, so I don't know whether I should set it to m or M for monthly data, for example.
Any clarity on what these settings are doing is appreciated.
For one month you can simply set 30d. You cannot use 1M because it is an irregular date offset that cannot be converted into a timedelta, which is required for annualization.
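The irregularity is easy to see by measuring the gaps between consecutive month starts and comparing them with weekly gaps. This is a small pandas sketch. The 2021 date ranges are arbitrary examples chosen for illustration:

```python
import pandas as pd

# Gaps between consecutive month starts across 2021, in days.
month_starts = pd.date_range(start="2021-01-01", periods=13, freq="MS")
month_gaps = month_starts.to_series().diff().dropna().dt.days
month_lengths = sorted({int(d) for d in month_gaps})

# Gaps between consecutive weekly timestamps, in days.
week_marks = pd.date_range(start="2021-01-03", periods=10, freq="W")
week_gaps = week_marks.to_series().diff().dropna().dt.days
week_lengths = sorted({int(d) for d in week_gaps})

print(month_lengths)
print(week_lengths)
```

For 2021 the month gaps come out as 28, 30 and 31 days, while every weekly gap is 7 days. A weekly series therefore maps exactly onto '7d', whereas '30d' for monthly data is only an approximation.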
Thanks, so for weekly series:
vbt.settings.array_wrapper['freq'] = '7d'
Monthly series:
vbt.settings.array_wrapper['freq'] = '30d'
Is that correct @polakowo ?
Do I need to set vbt.settings.returns["year_freq"] = "260 days" at all? Or is that only needed when the series is not contiguous (like traditional equities, Mon-Fri)?
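The thread leaves this question open. Going only by the ratio described earlier (year_freq divided by freq), the arithmetic can at least be laid out. The 365-day year for series that trade every day is an assumption made for this sketch, not a confirmed vectorbt default, and `periods_per_year` is a hypothetical helper:

```python
from datetime import timedelta


def periods_per_year(year_days, bar_days):
    """Hypothetical helper: year length divided by bar length."""
    return timedelta(days=year_days) / timedelta(days=bar_days)


# Assumed year lengths: 365 days for markets that trade every day,
# 260 days for a Mon-Fri series with one bar per business day.
weekly = periods_per_year(365, 7)          # '7d' bars
monthly = periods_per_year(365, 30)        # '30d' bars
business_daily = periods_per_year(260, 1)  # 'd' bars with a 260-day year

print(round(weekly, 2), round(monthly, 2), business_daily)
```

Under these assumptions, shortening the year length only matters when the bar frequency overstates how many bars a year actually contains, as with daily bars on a Mon-Fri calendar.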