drop usage of sklearn ?
By reading your code i realized that you were using the sklearn lib. But according to what I found in the develop branch, you are not using the intersect parameter of the LinearRegression method.
https://github.com/diku-dk/bfast/blob/476eb8b09e94906f80057d832104e1ed6dc2293e/bfast/monitor/python/base.py#L230
It means that using LInearRegression all-together is useless as the optimisation problem have a closed solution given by
theta = (H^T*H)^-1*H^T*Y
I am proposing a change to the lib using only numpy to compute the coefficients. According to the tests I've performed on my local computer, I get the same results twice as fast (44s with sklearn, 23 without).
It will also reduce the number of dependencies which is alway important.
The Python backend was created for validation of the GPU backend(s), hence performance was not a concern. However, I would like to get rid of sklearn dependency. I am thinking about substituting it with statsmodels since it is going to be used other places as well in the future, and I have also heard that it is a thinner wrapper around the numpy code.
As mentioned in the PR I think I can still improve the numpy backend so I'll continue working on it