Backtest/Plot are not thread-safe?

Open wesleywilian opened this issue 5 years ago • 5 comments

Expected Behavior

Run backtests and plots with multiple threads without concurrency problems.

Actual Behavior

Multithreaded execution results in data inconsistency when plotting (and maybe when backtesting too?).

Steps to Reproduce

  1. Run this code.
import os
import threading
from uuid import uuid4

from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import GOOG, SMA
from flask import Flask
from waitress import serve

app = Flask(__name__)


class SmaCross(Strategy):
    n1 = 15
    n2 = 30

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class Backtesting:
    def compute(self):
        bt = Backtest(GOOG, SmaCross, cash=10000, commission=.002, exclusive_orders=True)
        bt.run()
        file_uuid = str(uuid4())
        filename = "/tmp/backtest_plot_" + file_uuid + ".html"
        print("thread:", threading.get_ident(), "start creating file:", filename)
        bt.plot(open_browser=False, filename=filename)
        try:
            f = open(filename)
            some_raw_data = f.read()
            f.close()
            os.remove(filename)
        except FileNotFoundError:
            some_raw_data = ""
            print("thread:", threading.get_ident(), "file:", filename, "not found!")
        print("thread:", threading.get_ident(), "end creating file:", filename)
        return ""


@app.route('/', methods=['GET'])
def index():
    return Backtesting().compute()


if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=9090, threads=10)
  2. Open two terminals and run the following in both (or use JMeter or a similar tool): while true ; do curl localhost:9090 ; done

  3. Check the logs:

thread: 140514310731520 start creating file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html
thread: 140514302338816 start creating file: /tmp/backtest_plot_46abb441-d70a-472b-b67e-de446dd4d8c1.html
thread: 140514310731520 file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html not found!
thread: 140514310731520 end creating file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html
thread: 140514302338816 end creating file: /tmp/backtest_plot_46abb441-d70a-472b-b67e-de446dd4d8c1.html

Simplified:

thread: 1 start creating file:  A
thread: 2 start creating file:  B
thread: 1 file:                 A not found!
thread: 1 end creating file:    A
thread: 2 end creating file:    B

Thread 1 starts creating file "A", then thread 2 starts creating file "B". Thread 1 then fails to find file "A", so there is an inconsistency caused by running the threads concurrently (sorry for the Flask example).

I suspect the problem is in the "SmaCross" and "Strategy" classes.

@kernc, do you have any suggestions for how we could fix this?

Additional info

  • Backtesting version: 0.2.1

wesleywilian, Aug 04 '20 03:08

Quite likely. The first thing I'd look at is the way we use Bokeh's global state:
https://github.com/kernc/backtesting.py/blob/c8f9cc1e2e71aad5fbcb2fa2895ed4226d01c2d8/backtesting/_plotting.py#L145-L164
https://github.com/kernc/backtesting.py/blob/c8f9cc1e2e71aad5fbcb2fa2895ed4226d01c2d8/backtesting/_plotting.py#L68-L75
PR welcome!

kernc, Aug 04 '20 14:08

I see.

So basically we need to reset Bokeh's global state?

wesleywilian, Aug 05 '20 04:08

I was thinking.

Isn't the problem because we're not inheriting a class "SmaCross"?

Even if we reset the global state, it will still not be thread safe.

(Correct me if I'm wrong)

wesleywilian, Aug 05 '20 04:08

The state is already reset each time. I think we need to replace the use of Bokeh's global state (curstate()) with a new State object for each plot() invocation.

I can't say for sure that that's the only critical section, but it's the obvious one and the plot-file-not-found error points to it as well.
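To illustrate the difference between shared global state and a fresh state object per call, here is a minimal stdlib-only sketch. It does not use Bokeh or backtesting.py: `_global_state`, `PlotState`, and the two `plot_with_*` functions are hypothetical stand-ins, and the barrier only exists to force the unlucky interleaving every time. Whether the real plotting code fails in exactly this way is an assumption, not something confirmed here.

```python
import threading

# Hypothetical stand-in for module-level "current output" state shared by all threads.
_global_state = {"filename": None}


def plot_with_global_state(filename, barrier, written):
    _global_state["filename"] = filename       # step 1: configure the shared output target
    barrier.wait()                             # make sure the other thread has done step 1 too
    written.append(_global_state["filename"])  # step 2: "save" to whatever is configured now


class PlotState:
    """Hypothetical per-call state object; each plot call creates its own."""

    def __init__(self, filename):
        self.filename = filename


def plot_with_local_state(filename, barrier, written):
    state = PlotState(filename)                # step 1: configure private state
    barrier.wait()
    written.append(state.filename)             # step 2: "save" to this call's own target


def run(plot_fn):
    """Run plot_fn in two threads and return the sorted list of files 'written'."""
    barrier = threading.Barrier(2)
    written = []
    threads = [
        threading.Thread(target=plot_fn, args=(name, barrier, written))
        for name in ("A.html", "B.html")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(written)


global_result = run(plot_with_global_state)
local_result = run(plot_with_local_state)
print("global state:", global_result)
print("per-call state:", local_result)
```

With the shared dictionary, both threads end up "writing" the same file and the other one is never produced, which resembles the file-not-found symptom in the logs. With a per-call object, each thread writes its own file.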

We are using SmaCross only to instantiate further new objects; we never refer to the class directly, at least not in a way that writes to it:
https://github.com/kernc/backtesting.py/blob/c8f9cc1e2e71aad5fbcb2fa2895ed4226d01c2d8/backtesting/backtesting.py#L1068
https://github.com/kernc/backtesting.py/blob/c8f9cc1e2e71aad5fbcb2fa2895ed4226d01c2d8/backtesting/backtesting.py#L1110-L1112

kernc, Aug 05 '20 12:08

Hi @kernc

I ran a test to check whether the backtest itself is affected:

import os
import threading
from uuid import uuid4

from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import GOOG, SMA
from flask import Flask
from waitress import serve

app = Flask(__name__)


class SmaCross(Strategy):
    n1 = 15
    n2 = 30

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class SmaCrossBankrupt(Strategy):
    # n1 and n2 inverted on bankrupt class...
    n1 = 30
    n2 = 15

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class Backtesting:

    def __init__(self, strategy_id):
        self.strategy_id = strategy_id

    def compute(self):
        if self.strategy_id == 1:
            strategy_class = SmaCross
        else:
            strategy_class = SmaCrossBankrupt

        bt = Backtest(GOOG, strategy_class, cash=10000, commission=.002, exclusive_orders=True)
        x = bt.run()
        file_uuid = str(uuid4())
        filename = "/tmp/backtest_plot_" + file_uuid + ".html"
        print("thread:", threading.get_ident(), "start creating file:", filename)
        bt.plot(open_browser=False, filename=filename)
        try:
            f = open(filename)
            some_raw_data = f.read()
            f.close()
            os.remove(filename)
        except FileNotFoundError:
            some_raw_data = ""
            print("thread:", threading.get_ident(), "file:", filename, "not found!")
        print("thread:", threading.get_ident(), "end creating file:", filename)

        ret = "Return [%] {}\n".format(x.get("Return [%]"))

        return ret


@app.route('/<int:strategy_id>', methods=['GET'])
def index(strategy_id):
    return Backtesting(strategy_id).compute()


if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=9090, threads=10)

Open multiple terminals and run:
while true ; do curl localhost:9090/1 ; done
Expected result on all requests: Return [%] 194.62947480000028

At the same time, in several other terminals, run against the other strategy:
while true ; do curl localhost:9090/2 ; done
Expected result on all requests: Return [%] -90.43158260000001

So this confirms that the backtest is not affected.
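The same kind of check could probably be done without Flask and curl. Below is a rough sketch of a small harness that calls a function from several threads and checks that every call returned the same value. `run_concurrently`, `all_equal`, and `fake_backtest` are hypothetical names made up for this sketch; `fake_backtest` is only a placeholder so that it runs without backtesting.py installed.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(fn, n_calls=20, n_threads=10):
    """Call fn() n_calls times across n_threads worker threads; return all results."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(fn) for _ in range(n_calls)]
        return [f.result() for f in futures]


def all_equal(results):
    """True if every result equals the first one."""
    return all(r == results[0] for r in results)


def fake_backtest():
    # Placeholder workload standing in for a real backtest run.
    return sum(i * i for i in range(1000))


results = run_concurrently(fake_backtest)
print("all results identical:", all_equal(results))
```

To use it for the real check, `fake_backtest` would be swapped for something like `lambda: Backtest(GOOG, SmaCross, cash=10000, commission=.002, exclusive_orders=True).run().get("Return [%]")`, using the same calls as in the script above.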

I'll look into your suggestions and analyze how the plotting feature works internally.

Thanks @kernc

wesleywilian, Aug 10 '20 01:08