
bt.optimize() results in "buffer is too small for requested array"

Open Zirafnik opened this issue 9 months ago • 7 comments

Expected behavior

When I try to optimize the parameters of the simple SMA-cross example from the tutorials, but with custom 1-minute OHLC data covering one month (44,640 rows), the following code fails with TypeError: buffer is too small for requested array.

stats = bt.optimize(
    n1=[10], n2=[20], maximize="Equity Final [$]", constraint=lambda param: param.n1 < param.n2
)
bt.plot(plot_equity=False, plot_return=True)
print(stats)
TypeError                                 Traceback (most recent call last)
Cell In[19], line 1
----> 1 stats = bt.optimize(
      2     n1=[10], n2=[20], maximize="Equity Final [$]", constraint=lambda param: param.n1 < param.n2
      3 )
      4 bt.plot(plot_equity=False, plot_return=True)
      5 print(stats)

File ~/Coding/algotrading/.venv/lib/python3.12/site-packages/backtesting/backtesting.py:1630, in Backtest.optimize(self, maximize, method, max_tries, constraint, return_heatmap, return_optimization, random_state, **kwargs)
   1627     return stats if len(output) == 1 else tuple(output)
   1629 if method == 'grid':
-> 1630     output = _optimize_grid()
   1631 elif method in ('sambo', 'skopt'):
   1632     output = _optimize_sambo()

File ~/Coding/algotrading/.venv/lib/python3.12/site-packages/backtesting/backtesting.py:1527, in Backtest.optimize.<locals>._optimize_grid()
   1524     shm_refs.append(shm)
   1525     return shm.name, vals.shape, vals.dtype
-> 1527 data_shm = tuple((
   1528     (column, *arr2shm(values))
   1529     for column, values in chain([(Backtest._mp_task_INDEX_COL, self._data.index)],
   1530                                 self._data.items())
   1531 ))
   1532 with patch(self, '_data', None):
...
-> 1521 buf = np.ndarray(vals.shape, dtype=vals.dtype, buffer=shm.buf)
   1522 buf[:] = vals[:]  # Copy into shared memory
   1523 assert vals.ndim == 1, (vals.ndim, vals.shape, vals)

TypeError: buffer is too small for requested array

As you can see, I have removed the optimization ranges and given just one value per parameter, but it still fails. The original backtest itself, bt.run(), runs fine and completes in 0.5 s.

I don't know whether bt.optimize() runs some kind of vectorized calculation where the data array can somehow end up too big for it to handle. Can I instead run the optimization sequentially?


Software versions

backtesting==0.6.2

Zirafnik • Mar 10 '25 19:03

This seems to break at: https://github.com/kernc/backtesting.py/blob/cf596b4feea48f0b5c28857f953cef1cbae0b6f4/backtesting/backtesting.py#L1518-L1523

Our master has since diverged: https://github.com/kernc/backtesting.py/blob/5503b9d76e3798651eb30be8b2dce4373db41d81/backtesting/_util.py#L306-L309

but this doesn't change the fact that:

buffer is too small for requested array

which it shouldn't be.

What OS/platform is this? How large is ohlc_df.values.nbytes, and how much RAM is available?

Can I instead run the optimization sequentially?

On Python 3.13, you can set the PYTHON_CPU_COUNT= environment variable, but this won't prevent copying into shared memory. Grid optimization (randomized or not) copies the data into shared memory for the workers. An alternative, sequential method is to call bt.optimize(..., method='sambo').
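
For example (a sketch; the parameter ranges here are illustrative, per the SMA-cross tutorial):

# SAMBO runs model-based optimization sequentially in the main process,
# so the data is never copied into multiprocessing shared memory.
stats = bt.optimize(
    n1=range(5, 30, 5),
    n2=range(10, 70, 5),
    maximize='Equity Final [$]',
    constraint=lambda p: p.n1 < p.n2,
    method='sambo',
    max_tries=50,  # number of parameter combinations the sampler evaluates
)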

kernc • Mar 10 '25 20:03

OS/Platform: Linux

I tried running the code both in a Jupyter Notebook (.ipynb) in VSCode and as a regular Python file (.py) directly in the terminal.

Python version: 3.12.3

RAM: 16 GB (although I am new to Python, so I am not sure whether there are some kind of environment limits?)

ohlc_df.values.nbytes: 2142720 (which is still very, very small)

I have tried running it with both method='grid' and method='sambo', but the issue remains with both.

Zirafnik • Mar 18 '25 07:03

TypeError: buffer is too small for requested array

I have tried running it with both method='grid' and method='sambo', but the issue remains with both.

method='sambo' results in the exact same issue/error? I should hope not! Please confirm.

RAM: 16 GB (although I am new to Python, so I am not sure whether there are some kind of environment limits?)

ohlc_df.values.nbytes: 2142720 (which is still very, very small)

Would you care to paste the output of the following commands?

cat /etc/os-release

df -h | grep shm

mount | grep shm

grep -R . /etc/tmpfiles.d/
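
On Linux, multiprocessing's SharedMemory segments are backed by the /dev/shm tmpfs, which is what the commands above inspect. For reference, a minimal standalone sketch of the allocation that fails (the names mirror the traceback; this is an illustration, not the library's exact code):

import numpy as np
from multiprocessing import shared_memory

vals = np.arange(44_640, dtype=float)  # ~350 KB: one month of 1-minute bars
shm = shared_memory.SharedMemory(create=True, size=vals.nbytes)
try:
    # NumPy raises "TypeError: buffer is too small for requested array"
    # whenever shm.buf turns out smaller than vals.nbytes.
    buf = np.ndarray(vals.shape, dtype=vals.dtype, buffer=shm.buf)
    buf[:] = vals  # copy into shared memory for the worker processes
finally:
    shm.close()
    shm.unlink()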

kernc • Mar 20 '25 14:03

This issue also occurs when the Date column isn't set correctly as the index. Check whether the start and end in your backtest results are actual dates:

Data index missing:

Start                                     0.0
End                                    2752.0
Duration                               2752.0

Data index correct:

Start                     2017-08-17 00:00:00
End                       2025-02-28 00:00:00
Duration                   2752 days 00:00:00
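
For example, a sketch of loading data with a proper index (assuming a CSV with a 'Date' column; adjust the name to your source):

import pandas as pd

df = pd.read_csv('ohlc.csv', parse_dates=['Date'], index_col='Date')
# backtesting.py reports real start/end dates only with a DatetimeIndex
assert isinstance(df.index, pd.DatetimeIndex)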

TimonPeng • Mar 21 '25 15:03

@TimonPeng Can you provide some example code that reproduces the issue for you? Are you saying there's something wrong with the df2shm (or the inverse) procedure? Might be; it's wholly new. With the Windows folks experiencing issues and stalling as well, I'm almost leaning toward reverting. 🙄 https://github.com/kernc/backtesting.py/blob/b1a869c67feb531f97bef8769aee09d26a5e0288/backtesting/_util.py#L304-L324

kernc • Mar 21 '25 16:03

This issue also occurs when the Date column isn't set correctly as the index […]

Thank you, I had the same issue; this solution solved it.

KleversonGer • Mar 28 '25 15:03

Is there any update on this issue? 😢 I have also encountered this issue when my df (DataFrame) is too large: if I take only about 10,000 rows it runs, but my original data has 123,841 rows.

bt = Backtest(
     53     df,
     54     MyStrat,
   (...)     57     commission=0.000,
     58 )
     60 # Optimize the ATR multiplier and TP/SL ratio parameters to maximize returns.
---> 61 stats, heatmap = bt.optimize(
     62     atr_multiplier=np.arange(0.5, 5.1, 0.5).tolist(),
     63     tp_sl_ratio=np.arange(0.5, 5.1, 0.5).tolist(),
     64     maximize='Return [%]',
     65     return_heatmap=True,
     66     method='grid',
     67 )
python3.13/site-packages/backtesting/backtesting.py:1624, in Backtest.optimize(self, maximize, method, max_tries, constraint, return_heatmap, return_optimization, random_state, **kwargs)
   1621     return stats if len(output) == 1 else tuple(output)
   1623 if method == 'grid':
-> 1624     output = _optimize_grid()
   1625 elif method in ('sambo', 'skopt'):
   1626     output = _optimize_sambo()
...
--> 310 buf = np.ndarray(vals.shape, dtype=vals.dtype.base, buffer=shm.buf)
    311 has_tz = getattr(vals.dtype, 'tz', None)
    312 buf[:] = vals.tz_localize(None) if has_tz else vals  # Copy into shared memory

TypeError: buffer is too small for requested array

Oh, and by the way, the machine I'm running this on is a Mac M4. I'm not sure if this is relevant, but I want to be as specific as possible. I've been stuck on this for days now.

RithyP • Nov 12 '25 15:11
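
A quick pre-flight check distilled from the findings in this thread (a sketch, not an official fix; it verifies the DatetimeIndex point raised above and shows roughly how many bytes the grid optimizer must copy into shared memory):

import pandas as pd

def preflight(df: pd.DataFrame) -> None:
    # A non-datetime index has been one reported trigger of this error.
    if not isinstance(df.index, pd.DatetimeIndex):
        raise TypeError(f'Index is {type(df.index).__name__}; '
                        'set a DatetimeIndex before Backtest()')
    # Rough size of the column data copied into shared memory by method='grid'.
    print(f'{len(df):,} rows, {df.values.nbytes:,} bytes')

preflight(ohlc_df)  # run on your OHLC frame before bt.optimize()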