bug: business day freq leads to error

Open andreas-vester opened this issue 3 years ago • 10 comments

When having a time series with a business day index frequency, the code crashes once I call the stats() method of an indicator.

import numpy as np
import pandas as pd
import vectorbt as vbt

df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=20),
    index=pd.bdate_range(start="2021-01-01", periods=20),
)

ma = vbt.MA.run(close=df, window=10)

print(ma.stats())


Traceback (most recent call last):
  File "/home/andi/coding/test_project/test_project/vbt_bus_day_issue.py", line 12, in <module>
    ma.stats()
  File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/generic/stats_builder.py", line 250, in stats
    silence_warnings = self.stats_defaults.get('silence_warnings', False)
  File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/generic/stats_builder.py", line 58, in stats_defaults
    dict(settings=dict(freq=self.wrapper.freq))
  File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/base/array_wrapper.py", line 432, in freq
    return freq_to_timedelta(self.index.freq)
  File "/home/andi/coding/test_project/.venv/lib/python3.9/site-packages/vectorbt/utils/datetime_.py", line 24, in freq_to_timedelta
    return pd.Timedelta(arg)
  File "pandas/_libs/tslibs/timedeltas.pyx", line 1315, in pandas._libs.tslibs.timedeltas.Timedelta.__new__
ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not BusinessDay

When df is defined without business days, the code runs smoothly.

import numpy as np
import pandas as pd
import vectorbt as vbt

df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=20),
    index=pd.date_range(start="2021-01-01", periods=20),  # change from 'bdate' to 'date'
)

ma = vbt.MA.run(close=df, window=10)

print(ma.stats())


Start     2021-01-01 00:00:00
End       2021-01-20 00:00:00
Period       20 days 00:00:00
Name: agg_func_mean, dtype: object

andreas-vester avatar Dec 13 '21 14:12 andreas-vester

@andreas-vester business days cannot be represented as timedelta. Use regular days as frequency and reduce number of days in year_freq accordingly (see https://github.com/polakowo/vectorbt/issues/294#issuecomment-989641568).
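The point can be reproduced with pandas alone. The following is a minimal sketch, not vectorbt code: it tries to convert a `BusinessDay` offset to a `Timedelta` and records whether that succeeds, then converts a fixed one-day string for comparison. The variable names are illustrative.

```python
import pandas as pd

# A business day has no fixed length (Friday -> Monday spans three
# calendar days), so pandas refuses to turn it into a Timedelta.
bday = pd.offsets.BusinessDay()
try:
    pd.Timedelta(bday)
    bday_converted = True
except (ValueError, TypeError):
    bday_converted = False

# A fixed-length frequency converts without problems.
one_day = pd.Timedelta("1 days")

print(bday_converted)
print(one_day)
```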

polakowo avatar Dec 13 '21 15:12 polakowo

@andreas-vester business days cannot be represented as timedelta.

OK, that's clear.

Use regular days as frequency and reduce number of days in year_freq accordingly (see #294 (comment)).

That's not clear to me. You're probably not talking about reindexing the time series, are you?

df = pd.DataFrame(
    data=np.random.randint(low=1, high=100, size=260),
    index=pd.bdate_range(start="2021-01-01", periods=260),
)

idx = pd.date_range(start=df.index[0], end=df.index[-1])
df = df.reindex(idx)

vbt.settings.returns["year_freq"] = "260 days"

ma = vbt.MA.run(close=df, window=10)

print(ma.stats())

print(ma.ma)  # doesn't ignore additional ``np.nan``


Start     2021-01-01 00:00:00
End       2021-12-30 00:00:00
Period      364 days 00:00:00
Name: agg_func_mean, dtype: object
ma_window   10
2021-01-01 NaN
2021-01-02 NaN
2021-01-03 NaN
2021-01-04 NaN
2021-01-05 NaN
...         ..
2021-12-26 NaN
2021-12-27 NaN
2021-12-28 NaN
2021-12-29 NaN
2021-12-30 NaN

[364 rows x 1 columns]

andreas-vester avatar Dec 13 '21 19:12 andreas-vester

No need to reindex, just pass freq='d' and year_freq=... to portfolio so it can internally do to_timedelta(year_freq) / to_timedelta(freq) to get the annualization factor.
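The annualization factor described above is a plain ratio of two timedeltas. A minimal sketch using only pandas, assuming the "260 days" value used earlier in the thread and writing the daily frequency as "1 days" so that pandas can parse it directly; the variable names are illustrative:

```python
import pandas as pd

# Assumed values: 260 trading days per year, one bar per day.
year_freq = pd.to_timedelta("260 days")
freq = pd.to_timedelta("1 days")

# Dividing two Timedelta objects yields a plain float:
# the number of bars per year.
ann_factor = year_freq / freq
print(ann_factor)
```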

polakowo avatar Dec 13 '21 20:12 polakowo

Sorry, but I didn't mention any Portfolio class in my example yet. I am just fighting with the indicator. So which settings exactly am I supposed to set?

vbt.settings.returns["year_freq"] = "260 days"
vbt.settings.portfolio["freq"] = "d"

This is not working as intended.

andreas-vester avatar Dec 13 '21 20:12 andreas-vester

This will set frequency for all classes: vbt.settings.array_wrapper['freq'] = 'd'

polakowo avatar Dec 13 '21 20:12 polakowo

This will set frequency for all classes: vbt.settings.array_wrapper['freq'] = 'd'

OK, this works perfectly. Thanks.

Two more questions:

  1. Is this setting mentioned in the docs?
  2. Would it make sense to let the code apply this setting automatically if it infers the date freq to be "business days"? When would that not be considered useful?

andreas-vester avatar Dec 14 '21 08:12 andreas-vester

Maybe somewhere, but you can definitely find it in the example notebooks whenever frequency cannot be inferred. Yes, I can make it default to days automatically.
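One way such an automatic fallback could look is sketched below. This is an assumption about the idea, not vectorbt's actual implementation: `to_timedelta_with_fallback` is a hypothetical helper that maps a business-day offset to one calendar day and passes everything else to `pd.Timedelta` unchanged.

```python
import pandas as pd


def to_timedelta_with_fallback(freq):
    """Hypothetical helper: treat business days as regular days."""
    if isinstance(freq, pd.offsets.BusinessDay):
        # Business days have no fixed length, so approximate with one day.
        return pd.Timedelta(days=1)
    # Fixed-length frequencies (strings, Timedelta, tick offsets) convert directly.
    return pd.Timedelta(freq)


bday_index = pd.bdate_range(start="2021-01-01", periods=20)
print(to_timedelta_with_fallback(bday_index.freq))
print(to_timedelta_with_fallback(pd.offsets.Hour(2)))
```

With a fallback like this the annualized statistics still depend on year_freq being reduced to the number of trading days, as discussed earlier in the thread.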

polakowo avatar Dec 14 '21 11:12 polakowo

@polakowo I found this issue after hitting the same error with 1M and 1w interval data from the Binance API, when running a strategy and trying to print stats and plots from the portfolio object.

ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not Week

or

ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible, not Month

I don't fully understand what these settings are doing, and what they should be set to for weekly/monthly series data. Are these settings (vbt.settings.array_wrapper['freq'] = 'd' and vbt.settings.returns["year_freq"] = "260 days") only relevant for calculating the Sharpe ratio and other annualized statistics?

If I set: vbt.settings.array_wrapper['freq'] = 'w' my strategy completes successfully without error (weekly series from Binance). I don't appear to need to set the ["year_freq"]. What am I actually changing here and what should the combination of these two settings be?

ArrayWrapper documentation does not tell me what the valid values of freq are: https://vectorbt.dev/api/base/array_wrapper/#vectorbt.base.array_wrapper.ArrayWrapper, so I don't know if for monthly I should be setting it to m or M etc.

Any clarity on what these settings are doing is appreciated.

aaronmboyd avatar Nov 28 '22 11:11 aaronmboyd

For one month you can simply set 30d. You cannot use 1M because it is an irregular date offset that cannot be converted into a timedelta, which is required for annualization.
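Why a month counts as irregular can be seen by measuring a few of them. A minimal pandas-only sketch with illustrative variable names: it computes the lengths of the first three months of 2021 and shows that the 30-day approximation converts to a timedelta without trouble.

```python
import pandas as pd

# Month start dates for January to April 2021.
month_starts = pd.date_range("2021-01-01", periods=4, freq="MS")

# Differences between consecutive month starts give each month's length.
month_lengths = (month_starts[1:] - month_starts[:-1]).days.tolist()
print(month_lengths)  # [31, 28, 31]

# A fixed 30-day period is only an approximation, but it has a
# well-defined length and therefore converts to a timedelta.
approx_month = pd.Timedelta("30 days")
print(approx_month)
```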

polakowo avatar Nov 28 '22 11:11 polakowo

Thanks, so for weekly series: vbt.settings.array_wrapper['freq'] = '7d'

Monthly series: vbt.settings.array_wrapper['freq'] = '30d'

Is that correct @polakowo ?

Do I need to set the vbt.settings.returns["year_freq"] = "260 days" at all? Or is this only if you don't have a contiguous series set (like traditional equities Mon-Fri)?

aaronmboyd avatar Nov 28 '22 12:11 aaronmboyd