backtesting.py
Plotting HTML from 15m data fails with "ValueError: Length of values (2) does not match length of index (1)"
Expected Behavior
I expected the plot HTML to be drawn.
(The resample parameter is True by default. The same code works fine with 1-day data, so it seems to be related to the large number of data points?)
Actual Behavior
D:\Users\MECHREVO\PycharmProjects\backtesting.py\backtesting\_plotting.py:122: UserWarning: Data contains too many candlesticks to plot; downsampling to '8H'. See `Backtest.plot(resample=...)`
warnings.warn(f"Data contains too many candlesticks to plot; downsampling to {freq!r}. "
Traceback (most recent call last):
File "D:\Users\MECHREVO\AppData\Local\Programs\Python\Python37\lib\code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm 2021.3.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2021.3.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/Users/MECHREVO/PycharmProjects/backtesting.py/main.py", line 30, in <module>
bt.plot()
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\backtesting\backtesting.py", line 1609, in plot
open_browser=open_browser)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\backtesting\_plotting.py", line 204, in plot
resample, df, indicators, equity_data, trades)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\backtesting\_plotting.py", line 158, in _maybe_resample_data
ExitBar=_group_trades('ExitTime'),
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\resample.py", line 335, in aggregate
result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\apply.py", line 161, in agg
return self.agg_dict_like()
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\apply.py", line 436, in agg_dict_like
key: obj._gotitem(key, ndim=1).agg(how) for key, how in arg.items()
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\apply.py", line 436, in <dictcomp>
key: obj._gotitem(key, ndim=1).agg(how) for key, how in arg.items()
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\groupby\generic.py", line 265, in aggregate
return self._python_agg_general(func, *args, **kwargs)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\groupby\groupby.py", line 1332, in _python_agg_general
result = self.grouper.agg_series(obj, f)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\groupby\ops.py", line 1060, in agg_series
result = self._aggregate_series_fast(obj, func)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\groupby\ops.py", line 1283, in _aggregate_series_fast
result, _ = sbg.get_result()
File "pandas\_libs\reduction.pyx", line 184, in pandas._libs.reduction.SeriesBinGrouper.get_result
File "pandas\_libs\reduction.pyx", line 88, in pandas._libs.reduction._BaseGrouper._apply_to_group
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\groupby\groupby.py", line 1318, in <lambda>
f = lambda x: func(x, *args, **kwargs)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\backtesting\_plotting.py", line 147, in f
mean_time = int(bars.loc[s.index].view(int).mean())
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\series.py", line 801, in view
self._values.view(dtype), index=self.index
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\series.py", line 428, in __init__
com.require_length_match(data, index)
File "D:\Users\MECHREVO\PycharmProjects\backtesting.py\venv\lib\site-packages\pandas\core\common.py", line 532, in require_length_match
"Length of values "
ValueError: Length of values (2) does not match length of index (1)
Steps to Reproduce
ETHUSDT15m = _read_file('ETHUSDT-15m.csv')
main.py
from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import SMA, ETHUSDT15m


class SmaCross(Strategy):
    def init(self):
        price = self.data.Close
        self.ma1 = self.I(SMA, price, 10)
        self.ma2 = self.I(SMA, price, 20)

    def next(self):
        reverse = not self.data.High.max(initial=0) > 65000
        if crossover(self.ma1, self.ma2):
            if reverse:
                self.buy()
            else:
                self.sell()
        elif crossover(self.ma2, self.ma1):
            if reverse:
                self.sell()
            else:
                self.buy()


bt = Backtest(ETHUSDT15m, SmaCross, cash=5000, commission=0.02, margin=1 / 125, exclusive_orders=True)
stats = bt.run()
bt.plot()
print(stats)
ETHUSDT-15m.csv ETHUSDT-1d.csv
Additional info
- Backtesting version: 0.3.4.dev1+g94d20da
You can install backtesting 0.3.2, and it will plot after some warnings. The problem has something to do with the way backtesting 0.3.3 resamples when there are too many data points.
Experiencing the same issue with 0.3.3. The bug seems to be located somewhere in the resampling functions, as it is only triggered when the resample=True flag takes effect (i.e., more than 10000 entries by default). When resampling is forced with a string, it is always triggered, no matter the number of entries.
Experiencing the same issue with 0.3.2 and 0.3.3.
I thought I was the only one having this problem. Just set resample=False and it is fixed, but then you cannot use resampling for your plots.
Any update on this issue? Resampling doesn't seem to be working on the current build, and I wasn't able to diagnose the issue. I'm not even sure what the root cause is... 🤷‍♂️
From my debugging, _group_trades inside _maybe_resample_data doesn't work correctly: the error is raised in the aggregation below.
https://github.com/kernc/backtesting.py/blob/65f54f6819cac5f36fd94ebf0377644c62b4ee3d/backtesting/_plotting.py#L143-L159
By the way, why do we need another aggregation for EntryBar/ExitBar? My impression is that TRADES_AGG already covers them and we could simply use it. So can we remove these two lines, or am I missing something?
My version was 0.3.3.
TRADES_AGG = OrderedDict((
    ('Size', 'sum'),
    ('EntryBar', 'first'),
    ('ExitBar', 'last'),
    ('EntryPrice', 'mean'),
    ('ExitPrice', 'mean'),
    ('PnL', 'sum'),
    ('ReturnPct', 'mean'),
    ('EntryTime', 'first'),
    ('ExitTime', 'last'),
    ('Duration', 'sum'),
))
Hi! I'm experiencing the same problem. Any update on this issue? :)
I have removed the extra aggregation for EntryBar and ExitBar. That appears to solve the problem, but you lose the plot of the Entry/Exit points.
if len(trades):  # Avoid pandas "resampling on Int64 index" error
    trades = trades.assign(count=1).resample(freq, on='ExitTime', label='right').agg(dict(
        TRADES_AGG,
        ReturnPct=_weighted_returns,
        count='sum',
        # EntryBar=_group_trades('EntryTime'),
        # ExitBar=_group_trades('ExitTime'),
    )).dropna()
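One plausible reading of why the extra EntryBar/ExitBar aggregation exists, offered as an assumption rather than something confirmed in this thread: 'first' and 'last' in TRADES_AGG keep bar positions that refer to the original, un-resampled index, so after downsampling they no longer point at valid candles. The sketch below uses made-up 15-minute bars and two invented trades purely to illustrate that mismatch; the names bars_15m, bars_8h and stale_entry_bar are hypothetical.

```python
import pandas as pd

# Hypothetical 15-minute bars for one day (96 bars); values are illustrative only.
bars_15m = pd.date_range('2021-01-01', periods=96, freq='15min')

# Two invented trades; EntryBar is a position in the 15-minute index.
trades = pd.DataFrame({
    'Size': [1, 1],
    'EntryBar': [40, 44],
    'ExitTime': bars_15m[[42, 46]],
})

# Downsample the bars to 8-hour candles: only 3 candles remain.
bars_8h = pd.Series(0, index=bars_15m).resample('8h').first().index

# Aggregate the trades the way TRADES_AGG would ('first' for EntryBar).
agg = trades.resample('8h', on='ExitTime').agg({'Size': 'sum', 'EntryBar': 'first'})
stale_entry_bar = int(agg['EntryBar'].iloc[0])

# The aggregated EntryBar still refers to the 15-minute index,
# so it is out of range for the 8-hour candles.
print(len(bars_8h), stale_entry_bar)
```

If this reading is right, dropping the two _group_trades lines avoids the crash but leaves the markers without a usable position, which would match the missing Entry/Exit points reported above.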
Bump. Having the same issue.
UserWarning:
Data contains too many candlesticks to plot; downsampling to '8H'. See `Backtest.plot(resample=...)`
ValueError: Length of values (2) does not match length of index (1)
Interestingly enough, I tried running this in WSL and it worked fine with Bokeh 3.1.1 and backtesting.py 0.3.3. I'm using more than 50K rows.
I wrestled a whole lot with this one! I tried downgrading Bokeh, checking the length of the DataFrame in every possible way, and setting resample to 2H, but the advice from casper to set it to False is what fixed it. It is still sad that, for the sake of speed, I have to plot hundreds of thousands of 5-minute candles that I cannot even see on the screen. Will there be a fix, or is it something we could fix ourselves?
Same issue, any fix?
Same issue, any fix?
My solution was to plot it manually with Plotly graph objects. Resampling to 1H gives decent enough performance. While doing that, I also made a neatly color-formatted table of the stats with pandas to_html.
In my case, resampling from an hourly time series to weekly, it worked once I changed view(int) to view('int64') as shown below.
def _group_trades(column):
    def f(s, new_index=pd.Index(df.index.view('int64')), bars=trades[column]):
        if s.size:
            # Via int64 because on pandas recently broken datetime
            mean_time = int(bars.loc[s.index].view('int64').mean())
            new_bar_idx = new_index.get_loc(mean_time, method='nearest')
            return new_bar_idx
    return f
The following is the original one. https://github.com/kernc/backtesting.py/blob/65f54f6819cac5f36fd94ebf0377644c62b4ee3d/backtesting/_plotting.py#L143-L150
From my observation, view(int) actually returned int32 instead of int64, and line 147 of _plotting.py crashed for some reason with pandas 2.0.1 and backtesting 0.3.3. I think this issue happens when the data is more frequent than daily, as the original post in this thread says.
This is what I saw in dtype.
> df.index.view(int).dtype
dtype('int32')
alignment: 4
base: dtype('int32')
byteorder: '='
char: 'l'
descr: [('', '<i4')]
fields: None
flags: 0
hasobject: False
isalignedstruct: False
isbuiltin: 1
isnative: True
itemsize: 4
kind: 'i'
metadata: None
name: 'int32'
names: None
ndim: 0
num: 7
shape: ()
str: '<i4'
subdtype: None
> df.index.view('int64').dtype
dtype('int64')
alignment: 8
base: dtype('int64')
byteorder: '='
char: 'q'
descr: [('', '<i8')]
fields: None
flags: 0
hasobject: False
isalignedstruct: False
isbuiltin: 1
isnative: True
itemsize: 8
kind: 'i'
metadata: None
name: 'int64'
names: None
ndim: 0
num: 9
shape: ()
str: '<i8'
subdtype: None
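A likely explanation for the error message, sketched under the assumption that the failing setups run NumPy older than 2.0 on Windows, where the Python `int` type maps to a 32-bit C long: reinterpreting 8-byte datetime64 values as 4-byte integers doubles the number of elements, so one timestamp becomes two values while the index still has one entry. The snippet below uses explicit np.int32 and np.int64 so it behaves the same on every platform; the variable names are illustrative.

```python
import numpy as np

# One timestamp stored as an 8-byte datetime64[ns] value.
ts = np.array(['2021-01-01T00:00'], dtype='datetime64[ns]')

# Reinterpreting the same bytes as 4-byte integers yields two elements,
# which is what view(int) amounts to where `int` means int32.
as_int32 = ts.view(np.int32)

# Reinterpreting as 8-byte integers keeps one element per timestamp.
as_int64 = ts.view(np.int64)

print(len(ts), len(as_int32), len(as_int64))
```

This would also be consistent with the report above that the same code works under WSL, since on Linux `int` maps to a 64-bit integer.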
Same issue, any fix?
My solution was to plot it manually with Plotly graph objects. Resampling to 1H gives decent enough performance. While doing that, I also made a neatly color-formatted table of the stats with pandas to_html.
Any chance you can share your solution? Also hitting this issue with 200k historical data points.
This obviously hasn't been fixed, but @tani3010, is the _group_trades() change you recently posted a confirmed working solution?
I had to change an additional line because get_loc does not have a method parameter anymore:
def _group_trades(column):
    def f(s, new_index=pd.Index(df.index.view('int64')), bars=trades[column]):
        if s.size:
            # Via int64 because on pandas recently broken datetime
            mean_time = int(bars.loc[s.index].view('int64').mean())
            new_bar_idx = new_index.get_indexer([mean_time], method='nearest')[0]
            return new_bar_idx
    return f
This solution currently works as expected for me.
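For anyone who wants to check the get_indexer replacement in isolation, here is a minimal standalone sketch. It does not use backtesting.py at all; the bar and trade timestamps are invented, and bar_times, trade_times and new_bar_idx are illustrative names. It mirrors the two steps of the patched function: average the trade times as int64 nanoseconds, then look up the nearest bar position.

```python
import numpy as np
import pandas as pd

# Invented resampled bar timestamps (8-hour candles), as int64 nanoseconds.
bar_times = np.array(
    ['2021-01-01T00:00', '2021-01-01T08:00', '2021-01-01T16:00'],
    dtype='datetime64[ns]',
)
new_index = pd.Index(bar_times.view('int64'))

# Invented trade times that fall into the same resampling bin.
trade_times = np.array(
    ['2021-01-01T07:45', '2021-01-01T09:15'],
    dtype='datetime64[ns]',
)
mean_time = int(trade_times.view('int64').mean())  # corresponds to 08:30

# get_indexer takes a list of targets and returns an array of positions;
# method='nearest' picks the closest bar for each target.
new_bar_idx = int(new_index.get_indexer([mean_time], method='nearest')[0])
print(new_bar_idx)
```

The mean of 07:45 and 09:15 is 08:30, whose nearest bar is 08:00, so the lookup returns position 1.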
Same issue, any fix?
My solution was to plot it manually with Plotly graph objects. Resampling to 1H gives decent enough performance. While doing that, I also made a neatly color-formatted table of the stats with pandas to_html.
Any chance you can share your solution? Also hitting this issue with 200k historical data points.
Hi there! Sure! I've stopped using Backtesting because it is too slow, but I dug through my old files to find some hopefully usable code for you. You will need to take the _trades out of the backtest results (the Series object with the stats):
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Not defined in this snippet: df (the OHLCV DataFrame), optistats_trades (the _trades
# DataFrame taken from the backtest stats) and filename (the output HTML path).
light = 'lightgreen'  # placeholder: `light` was never defined in the snippet, so this colour is an assumption

# Downsample to 1-hour candles to keep the chart responsive
ohlcdata = df.resample('1H').agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
charts = make_subplots(rows=1, cols=1)
charts.add_trace(go.Candlestick(showlegend=False, name='OHLC', x=ohlcdata.index, open=ohlcdata['Open'], high=ohlcdata['High'], low=ohlcdata['Low'], close=ohlcdata['Close']), row=1, col=1)
# Entry markers with hover text
hover_entry = [f" <br>{entrytime}<br>Qty: {size}<br>Price: {round(price, 4)}" for entrytime, size, price in zip(optistats_trades['EntryTime'], optistats_trades['Size'], optistats_trades['EntryPrice'] * 0.99)]
charts.add_trace(go.Scatter(hovertemplate=hover_entry, showlegend=False, x=optistats_trades['EntryTime'], y=optistats_trades['EntryPrice'], name=' ', mode='markers', marker=dict(size=10, symbol="arrow", color=light, showscale=False)), row=1, col=1)
# Exit markers with hover text
hover_exit = [f" <br>{time}<br>PnL: {round(pnl, 2)}<br>Return %: {round(return_pct * 100, 2)}<br>Price: {round(price, 2)}" for pnl, return_pct, time, price in zip(optistats_trades['PnL'], optistats_trades['ReturnPct'], optistats_trades['ExitTime'], optistats_trades['ExitPrice'])]
charts.add_trace(go.Scatter(hovertemplate=hover_exit, showlegend=False, x=optistats_trades['ExitTime'], y=optistats_trades['ExitPrice'], name=" ", mode='markers', marker=dict(size=15, symbol="triangle-down")), row=1, col=1)
start_date = '2017-06-01'
end_date = '2023-06-01'
charts.update_xaxes(matches='x1', griddash='dot', range=[start_date, end_date], showdividers=True, showline=False)
charts_equity_html = charts.to_html(div_id='charts')
with open(filename, 'w', encoding='utf-8') as the_file:
    the_file.write(charts_equity_html)
Something like this. I retrieved it from a messy file and tried to clean it up a bit, but it should give you a good head start.
To create an HTML table you can use: results_html = pandas.DataFrame(metrics).to_html()
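As a small illustration of the to_html suggestion, the sketch below builds a table from a dictionary of metrics. The metrics dictionary and its numbers are placeholders invented for the example, not results from any backtest; in practice you would fill it from the stats Series returned by bt.run().

```python
import pandas as pd

# Placeholder metrics; names and values are invented for illustration only.
metrics = {
    'Metric': ['Return [%]', 'Max. Drawdown [%]', '# Trades'],
    'Value': [12.5, -8.3, 42],
}

# index=False leaves out the 0, 1, 2 row labels from the rendered table.
results_html = pd.DataFrame(metrics).to_html(index=False)
print(results_html[:60])
```

The resulting string can be written to a file next to the chart HTML, or concatenated with it into a single report page.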