backtesting.py
Issues with updating class variables while walk-forward optimising
I have issues with updating MLTrainOnceStrategy class variables while walk-forward optimizing, as is done with self.clf in MLWalkForwardStrategy in the tutorial notebook Trading with Machine Learning Models:
class MLWalkForwardStrategy(MLTrainOnceStrategy):
    def next(self):
        # Skip the cold start period with too few values available
        if len(self.data) < N_TRAIN:
            return

        # Re-train the model only every 20 iterations.
        # Since 20 << N_TRAIN, we don't lose much in terms of
        # "recent training examples", but the speed-up is significant!
        if len(self.data) % 20:
            return super().next()

        # Retrain on last N_TRAIN values
        df = self.data.df[-N_TRAIN:]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Now that the model is fitted,
        # proceed the same as in MLTrainOnceStrategy
        super().next()


bt = Backtest(data, MLWalkForwardStrategy, commission=.0002, margin=.05)
bt.run()
When I print inside the code above, I see that the model is training, but the self.clf is still the same as defined in def init(self) in the main strategy class MLTrainOnceStrategy.
This issue suddenly appeared last week. This approach worked before. Is anyone experiencing the same?
A way to see that it doesn't work is to look at the equity curves for both MLTrainOnceStrategy and MLWalkForwardStrategy. They will look exactly the same.
MLTrainOnceStrategy in the notebook:
from backtesting import Backtest, Strategy

N_TRAIN = 400


class MLTrainOnceStrategy(Strategy):
    price_delta = .004  # 0.4%

    def init(self):
        # Init our model, a kNN classifier
        self.clf = KNeighborsClassifier(7)

        # Train the classifier in advance on the first N_TRAIN examples
        df = self.data.df.iloc[:N_TRAIN]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Plot y for inspection
        self.I(get_y, self.data.df, name='y_true')

        # Prepare empty, all-NaN forecast indicator
        self.forecasts = self.I(lambda: np.repeat(np.nan, len(self.data)), name='forecast')

    def next(self):
        # Skip the training, in-sample data
        if len(self.data) < N_TRAIN:
            return

        # Proceed only with out-of-sample data. Prepare some variables
        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]

        # Forecast the next movement
        X = get_X(self.data.df.iloc[-1:])
        forecast = self.clf.predict(X)[0]

        # Update the plotted "forecast" indicator
        self.forecasts[-1] = forecast

        # If our forecast is upwards and we don't already hold a long position
        # place a long order for 20% of available account equity. Vice versa for short.
        # Also set target take-profit and stop-loss prices to be one price_delta
        # away from the current closing price.
        upper, lower = close[-1] * (1 + np.r_[1, -1]*self.price_delta)

        if forecast == 1 and not self.position.is_long:
            self.buy(size=.2, tp=upper, sl=lower)
        elif forecast == -1 and not self.position.is_short:
            self.sell(size=.2, tp=lower, sl=upper)

        # Additionally, set aggressive stop-loss on trades that have been open
        # for more than two days
        for trade in self.trades:
            if current_time - trade.entry_time > pd.Timedelta('2 days'):
                if trade.is_long:
                    trade.sl = max(trade.sl, low)
                else:
                    trade.sl = min(trade.sl, high)
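For reference, the retraining schedule in MLWalkForwardStrategy.next() can be made concrete: `len(self.data) % 20` is truthy for every bar count that is not a multiple of 20, so `self.clf.fit` is only reached on multiples of 20 at or after N_TRAIN. A minimal, framework-free sketch of that condition follows; the helper name `retrain_bars` and the total of 500 bars are made up for illustration and are not part of backtesting.py.

```python
N_TRAIN = 400
RETRAIN_EVERY = 20  # matches the `% 20` in MLWalkForwardStrategy.next()


def retrain_bars(n_bars, n_train=N_TRAIN, every=RETRAIN_EVERY):
    """Return the values of len(self.data) at which fit() would be called."""
    bars = []
    for length in range(1, n_bars + 1):
        if length < n_train:
            continue  # cold start: next() returns early
        if length % every:
            continue  # not a multiple of `every`: only super().next() runs
        bars.append(length)  # multiple of `every`: retrain, then trade
    return bars


print(retrain_bars(500))  # [400, 420, 440, 460, 480, 500]
```

Printing `len(self.data)` right before `self.clf.fit(X, y)` in the real strategy should show the same kind of sequence if the retraining branch is being reached.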
Simplified example:
A simple SMA strategy. As a test, increase the moving averages' lengths by 10 every 500 bars (just to check whether the indicators change).
Initial model:
from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import SMA, GOOG


class SmaCross(Strategy):
    n1 = 10
    n2 = 20
    size = 1

    def init(self):
        close = self.data.Close
        self.sma1 = self.I(SMA, close, self.n1)
        self.sma2 = self.I(SMA, close, self.n2)

    def next(self):
        if crossover(self.sma1, self.sma2):
            self.buy(size=self.size)
        elif crossover(self.sma2, self.sma1):
            self.sell(size=self.size)


bt = Backtest(GOOG, SmaCross,
              cash=10000, commission=.002,
              exclusive_orders=True)
output = bt.run()
Walk-forward model:
class WalkForwardSmaCross(SmaCross):
    def next(self):
        if len(self.data) % 500:
            return super().next()

        # Increase moving avg length with 10
        self.n1 += 10
        self.n2 += 10
        super().next()


bt_wf = Backtest(GOOG, WalkForwardSmaCross,
                 cash=10000, commission=.002,
                 exclusive_orders=True)
output_wf = bt_wf.run()
Then compare equity curves:
import matplotlib.pyplot as plt

ax = output["_equity_curve"].Equity.plot(label="SmaCross", figsize=(10,5), alpha=0.5, linestyle=":", lw=2)
output_wf["_equity_curve"].Equity.plot(label="WalkForwardSmaCross", alpha=0.5, lw=2)
ax.set_title("Equity curves")
ax.legend()
plt.show()
Output:
As you can see, the equity curves are identical. This seems to be an issue with these being indicators, because if I do the same to self.size instead of self.n1 or self.n2, it works:
class WalkForwardSmaCross(SmaCross):
    def next(self):
        if len(self.data) % 500:
            return super().next()

        # Increase position size with 1
        self.size += 1
        super().next()


bt_wf = Backtest(GOOG, WalkForwardSmaCross,
                 cash=10000, commission=.002,
                 exclusive_orders=True)
output_wf = bt_wf.run()
Plot output:
In your simplified example, changing n1 and n2 in next() does not (and is not supposed to) affect your SMA indicators precomputed in init()
...
Ok thank you, is there a way to update indicators throughout the backtest? Or do you have any suggestions on how to mimic this?
is there a way to update indicators throughout the backtest?
Working with the simplified example, the SMAs simply need to be recomputed after n1 changes, i.e.:
self.n1 += 10
self.sma1 = self.I(SMA, close, self.n1)
...
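The underlying point can be illustrated without the library: values computed once from a parameter do not follow later changes to that parameter, so they have to be recomputed explicitly. The sketch below is plain Python; the `sma` helper is a hypothetical stand-in for `backtesting.test.SMA`, and it deliberately does not exercise `self.I` or any other backtesting.py machinery.

```python
def sma(values, n):
    """Simple moving average; None until n values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)
        else:
            window = values[i + 1 - n:i + 1]
            out.append(sum(window) / n)
    return out


prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n1 = 2

precomputed = sma(prices, n1)   # computed once, as in init()
snapshot = list(precomputed)

n1 += 1                         # changing the parameter later ...
assert precomputed == snapshot  # ... leaves the stored values untouched

recomputed = sma(prices, n1)    # an explicit recomputation is needed
print(precomputed)  # [None, 1.5, 2.5, 3.5, 4.5, 5.5]
print(recomputed)   # [None, None, 2.0, 3.0, 4.0, 5.0]
```

Whether a recomputed indicator then persists across subsequent next() calls in backtesting.py is a separate question that this sketch does not answer; comparing `self.sma1[-1]` before and after the change is one way to check it.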
but the self.clf is still the same as defined in def init(self)
Fitting doesn't change the model object (after fitting, it still holds that self.clf == self.clf and self.clf is self.clf), but it should change the model's internal parameters. Can you confirm?
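The distinction between object identity and fitted state can be sketched with a toy model. `TinyNearestNeighbor` below is a hypothetical 1-nearest-neighbour classifier written only for this illustration; it is not scikit-learn's KNeighborsClassifier, although its `fit` follows the same convention of updating the instance in place and returning it.

```python
class TinyNearestNeighbor:
    """Hypothetical 1-nearest-neighbour classifier on 1-D inputs."""

    def fit(self, X, y):
        # Overwrite the stored training set in place and return self
        self._X = list(X)
        self._y = list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Index of the stored sample closest to x
            nearest = min(range(len(self._X)), key=lambda i: abs(self._X[i] - x))
            preds.append(self._y[nearest])
        return preds


clf = TinyNearestNeighbor()
clf.fit([0.0, 1.0], [-1, 1])
same_object = clf
before = clf.predict([0.9])

clf.fit([0.0, 1.0], [1, -1])  # retrain on different labels
after = clf.predict([0.9])

print(clf is same_object)  # True: fit() did not replace the object
print(before, after)       # [1] [-1]: the fitted state did change
```

A comparable check in the real strategy would be to predict on a fixed probe row immediately before and after `self.clf.fit(X, y)` and compare the results, rather than comparing the `self.clf` objects themselves.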
A way to see that it doesn't work is to look at the equity curves for both MLTrainOnceStrategy and MLWalkForwardStrategy. They will look exactly the same.
This is certainly not evidence enough that the reiterative fitting doesn't work. Your code looks ok at a glance.