Invalid daily aggregation of OHLC data with timezone/datetime offset

Open h0wXD opened this issue 2 years ago • 6 comments

Expected Behavior

resample('D') should take into account the correct trading day when the dates carry a timezone offset (an issue with date parsing?).

Actual Behavior

resample('D') of hourly candles puts the equity sample on the weekend instead of Friday. The position entry was clearly on Friday, so the equity balance should also fall on Friday, not Saturday.

Steps to Reproduce

I added log lines (see below) and ran the sample strategy on AAPL 1H data exported from TradingView. The only way to get candles and entries plotted correctly in my timezone (+8) is to include the timezone offset in the export. With that, both the plot and the entries/exits match 100% (see the trades table), except that the last position entry in backtesting.py uses the final candle.close instead of the final candle.open.

In my C# code, the first equity change is at the day end of Friday 2022-08-26 (10077.1), whereas backtesting.py moves it to Saturday, leading to incorrect results on lower timeframes. I have compared a daily ('D') backtest in both my program and backtesting.py and the results are equal, so I think backtesting.py is not taking the datetime offset into account for candles with a lower interval.

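    # Excerpt from backtesting.py's stats computation; the two to_csv() calls are the added log lines
    day_returns = np.array(np.nan)
    annual_trading_days = np.nan
    if isinstance(index, pd.DatetimeIndex):
        day_returns = equity_df['Equity'].resample('D').last().dropna().pct_change()
        equity_df['Equity'].to_csv("Equity.csv")
        equity_df['Equity'].resample('D').last().dropna().to_csv("EquityD.csv")

# Sample strategy (imports as in the backtesting.py quick-start)
from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import SMA

class SmaCross(Strategy):
    n1 = 50   # fast moving-average period
    n2 = 100  # slow moving-average period

    def init(self):
        close = self.data.Close
        self.sma1 = self.I(SMA, close, self.n1)
        self.sma2 = self.I(SMA, close, self.n2)

    def next(self):
        # Go long when the fast SMA crosses above the slow SMA, short on the reverse
        if crossover(self.sma1, self.sma2):
            self.buy()
        elif crossover(self.sma2, self.sma1):
            self.sell()

# AAPL1H: hourly OHLCV DataFrame loaded from the attached AAPL1H.csv
bt = Backtest(AAPL1H, SmaCross,
              cash=10000, commission=.00,
              exclusive_orders=True)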

Additional info

Attachments: AAPL1H.csv, Equity.csv, EquityD.csv

[image]

Some C# logic I wrote shows the first change in portfolio balance on Friday 2022-08-26:

[image]

backtesting.py logic shows the first change in portfolio balance on Saturday 2022-08-27:

[image]

  • Backtesting version: 0.3.3
  • bokeh.__version__: 3.1.1
  • OS: Win 10

h0wXD avatar Jun 13 '23 20:06 h0wXD

In Equity.csv, the first change occurs:

2022-08-27 01:30:00+08:00,10000.0
2022-08-27 02:30:00+08:00,10028.392  <--
2022-08-27 03:30:00+08:00,10077.1
2022-08-29 21:30:00+08:00,10173.56

In EquityD.csv, this shows as:

2022-08-27 00:00:00+08:00,10077.1

which I guess is reasonable since the two dates match.

Can you use:

df.index = df.index.tz_convert(None)

before passing df to Backtest()?
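
To illustrate why this helps, here is a minimal sketch that rebuilds the four Equity.csv rows quoted above as a pandas Series (the variable names are made up for the example) and resamples them daily, once with the +08:00 offset kept and once after `tz_convert(None)`, which converts to UTC and drops the timezone:

```python
import pandas as pd

# Equity samples copied from the Equity.csv excerpt above (timestamps in UTC+8).
equity = pd.Series(
    [10000.0, 10028.392, 10077.1, 10173.56],
    index=pd.to_datetime([
        "2022-08-27 01:30:00+08:00",
        "2022-08-27 02:30:00+08:00",
        "2022-08-27 03:30:00+08:00",
        "2022-08-29 21:30:00+08:00",
    ]),
    name="Equity",
)

# Daily buckets follow the index's own timezone, so the first bucket
# is Saturday 2022-08-27 (local midnight at +08:00).
daily_local = equity.resample("D").last().dropna()

# After converting to naive UTC, the same samples fall on Friday 2022-08-26.
equity_utc = equity.copy()
equity_utc.index = equity_utc.index.tz_convert(None)
daily_utc = equity_utc.resample("D").last().dropna()

print(daily_local)
print(daily_utc)
```

Note that the buckets are then UTC calendar days, which is an assumption that happens to suit US market hours; it is not the same as grouping by the exchange's local trading day.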

kernc avatar Jun 13 '23 22:06 kernc

@kernc that works perfectly, thanks for the quick response.

After doing the following before passing the data to Backtest():

AAPL1H.index = AAPL1H.index.tz_convert(None)

the Equity results are now correct compared to my previously shared C# sample:

2022-08-23,10000.0
2022-08-24,10000.0
2022-08-25,10000.0
2022-08-26,10077.1
2022-08-29,10214.5
2022-08-30,10362.1
2022-08-31,10466.5
2022-09-01,10418.5
2022-09-02,10545.7
2022-09-06,10624.3
2022-09-07,10541.5
2022-09-08,10628.5

EquityD.csv Equity.csv

Do you reckon this should be built-in to backtesting.py?

h0wXD avatar Jun 14 '23 06:06 h0wXD

Do you reckon this should be built-in to backtesting.py?

I'm not too certain. If the user prefers timestamps in TZ-aware UTC time, why override it? In all respects, the user (should) know what they are doing. And it's a simple enough workaround.

kernc avatar Jun 14 '23 12:06 kernc

I still think the library should handle this when users pass unintended datetime formats: a dataset whose timestamps carry an offset causes invalid backtest results, which is a date-handling issue. The datasets used above are default TradingView exports with only the CSV headers updated to ,Open,High,Low,Close,Volume,VolMa. When the TradingView chart is set to UTC, exported dates have the format "2022-08-03T16:30:00Z"; when exporting from your local timezone, they have the format "2022-08-04 00:30:00+08:00". If this is not supported or leads to invalid backtest results, it would be nice to at least show a warning message. Thanks for the quick response and for the time spent building this amazing library!
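
For reference, a quick check (a sketch using pandas directly, not anything from backtesting.py) showing that the two export formats quoted above describe the same instant but different calendar dates, which is what daily resampling keys on:

```python
import pandas as pd

# The two TradingView export formats quoted above.
utc_export = pd.Timestamp("2022-08-03T16:30:00Z")
local_export = pd.Timestamp("2022-08-04 00:30:00+08:00")

# Both strings denote the same point in time ...
same_instant = utc_export == local_export

# ... but their wall-clock calendar dates differ by one day.
utc_date = utc_export.date()
local_date = local_export.date()

print(same_instant, utc_date, local_date)
```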

h0wXD avatar Jun 14 '23 12:06 h0wXD

using date time dataset with offset causes invalid backtest results

Those results are not invalid! In the dataset's own timezone (UTC+8), it was simply already Saturday when the trade closed!

I feel this change would force a behavior which then couldn't be reverted. Maybe we can indeed issue a warning if timezone offset is present somewhere around here: https://github.com/kernc/backtesting.py/blob/0ce24d80b1bcb8120d95d31dc3bb351b1052a27d/backtesting/backtesting.py#L1123-L1126
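
A rough sketch of what such a warning could look like. The helper name `warn_if_tz_aware` and the message wording are hypothetical and not part of backtesting.py; it only assumes the data is a DataFrame with a pandas `DatetimeIndex`:

```python
import warnings

import pandas as pd


def warn_if_tz_aware(df: pd.DataFrame) -> bool:
    """Hypothetical helper (not backtesting.py API): warn on a tz-aware index.

    Returns True if a warning was issued, False otherwise.
    """
    index = df.index
    if isinstance(index, pd.DatetimeIndex) and index.tz is not None:
        warnings.warn(
            f"Data index is timezone-aware ({index.tz}); daily statistics are "
            "resampled on that timezone's calendar days. Use "
            "`df.index = df.index.tz_convert(None)` if UTC days are intended.",
            stacklevel=2,
        )
        return True
    return False
```

Warning instead of converting keeps the user's timestamps untouched, so the current behavior stays available to anyone who wants it.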

kernc avatar Jun 14 '23 13:06 kernc