stl-java
Issues comparing with R / Python / Fortran version
Hi @brandtg,
I've been looking for a Java STL package and was psyched to find that you were working on one!
My first comparison with the Java code was for some artificial data from the NAB dataset (e.g. https://github.com/numenta/NAB/blob/master/data/artificialWithAnomaly/art_daily_flatmiddle.csv). When I tried this with the Java implementation, the results didn't look quite right - in particular, there were issues with the trend at the ends, IIRC. I was trying to sort out possible config differences (I should repeat with the "periodic" option enabled, though generally I won't want it enabled) when I ran across the performance discussion and the hourly.txt data set that you've been using for testing. So I grabbed that data and did my own set of comparisons. Once I got the configs right, the Python and R implementations basically agree to within numerical noise. However, the stl-java version, while qualitatively similar, is still fairly different from the Python/R results, with the trend off by on the order of 10%. For example:
I'm definitely interested in using this package, but I want to figure out whether the configs just aren't right yet (given the funky inputs to the LoessInterpolator) or whether there's an issue in either the StlDecomposition class or the underlying LoessInterpolator. I'm happy to help dig in if you have any suggestions.
Oh, and I'll repeat my test on the NAB example with "periodic" set, to see whether I still get the odd behavior I saw earlier. Will let you know.
Cheers - Jim
@jcrotinger @brandtg and I have been looking at the funky output for some time now, trying to figure out why it gets weird in certain places. I don't know if he has more insight, but I'm still slightly confused about why it does that at the end; I have reason to believe it's in the loess smoother. Honestly, I think the R impl does the smoothing differently than the Java does, which is what yields the strange result at the tail end, but Greg might have more insight than I do.
The R implementation is just a very thin wrapper around the Fortran, as is the Python implementation. I was confused by this, since the original paper talks about the S implementation, which is NOT the same as the Fortran (it uses k-d trees for performance in the Loess smoothing instead of skips, and it has several additional features, like post-STL seasonality smoothing, which I had assumed were just part of the algorithm after reading the paper, but which are not part of the Fortran version). Here's the S doc. Let me rerun the NAB artificial anomaly example - it is much more clear cut.
I've re-run my earlier test on this NAB dataset. Here is the result from my Python test:
I'm using the default for the length of the trend smoother, and it seems perhaps a little short, as the trend is picking up a bit of the seasonality; there is also a very small non-periodic blip at the beginning, but mostly it's pretty flat until the "anomaly". This run isn't using "robust" smoothing - I have no=0, ni=2 to match how I ran the Java. The Java was run with
java -Dinner.loop=2 -cp /Users/jim.crotinger/.m2/repository/org/apache/commons/commons-math3/3.5/commons-math3-3.5.jar:/Users/jim.crotinger/git/github/stl-java/target/stl-java-0.1.2-SNAPSHOT.jar com.github.brandtg.stl.StlDecomposition 288 <art_daily_flatmiddle.csv >art_daily_flatmiddle_java.csv
Here are the Java results (plotted in the same Python notebook):
As you can see, the trend has an excursion on the order of the size of the seasonal component at the very beginning. Its excursion at the anomaly is qualitatively similar to the Python version's but somewhat smoother, and the Python trend looks like it recovers back to near-flat after the anomaly more quickly than the Java one does.
Another issue that I just noticed is that the Java version is not capturing as much of the seasonality as the original version. Here is a comparison:
As a result, the residual is retaining seasonal structure that it shouldn't. The structure is very close, though - in fact, if you scale the Java seasonal component up by 8-10%, it nearly matches the Python result.
Stepping through the code, I see that the "periodic" option is not implemented in the same fashion as R does it (and as I'm doing it in Python). The R wrapper sets ns = 10 * n + 1, where n is the total number of observations, and isdeg = 0, where isdeg is the degree of the polynomial fit used for the cycle sub-series, and then calls the underlying Fortran; i.e. it computes the "moving average" of the cycle sub-series over a window that is larger than the actual data, resulting in a "fit" of the cycle sub-series by its average value. The R code follows up by ensuring exact periodicity via an explicit post-STL averaging of the cycle sub-series, replacing the individual values with this average. (My Python version doesn't do this and gives the same results, which seems right, given that a Loess with degree 0 and a very large window should already be computing that average.)
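For concreteness, here's a minimal Java sketch of that post-STL averaging step - the method name is illustrative, not actual R or stl-java code:

// Hedged sketch of R's post-STL "periodic" step: replace every value in each
// cycle sub-series with the sub-series average, forcing exact periodicity.
static void enforcePeriodicity(double[] seasonal, int period) {
    for (int phase = 0; phase < period; phase++) {
        double sum = 0;
        int count = 0;
        for (int i = phase; i < seasonal.length; i += period) {
            sum += seasonal[i];
            count++;
        }
        double mean = sum / count;
        for (int i = phase; i < seasonal.length; i += period) {
            seasonal[i] = mean;  // overwrite with the sub-series average
        }
    }
}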
It appears that StlDecomposition is computing the seasonality using the default parameters (bandwidth = 0.4, linear fit [the only option]) and then doing the explicit average after the fact. While the residual is adjusted so the results are consistent, it isn't clear (or even likely) that the trend will be unaffected. In the "hourly.txt" data, for instance, the seasonality would clearly have strong trends that would affect the trend component. The last-minute averaging of the seasonality comes too late to have the same impact on the trend/seasonality iteration that it has when it is imposed inside the loop, as in the R version.
(Replicating the R version is, unfortunately, going to require some work, since the LoessInterpolator only supports linear interpolation.)
All this said, for the data above I would expect very little evolution in the seasonal component, particularly if the outer iteration is run. So I don't think that the post-STL averaging is the cause of the inconsistencies above.
@brandtg @hntd187
I've re-run with minimal bells and whistles, and with the bandwidths chosen so that the bandwidthInPoints computed in the LoessInterpolator matched the values that are set in the calls to the underlying Fortran. Here is the Python result - I've made nl and nt explicit, although these would be the default values:
For the Java run, I had to modify main so that I could set more of the parameters. I think they're obvious. (outer.loop sets numberOfRobustnessIterations; I don't think the number of Loess robustness iterations needs to be exposed - I'm pretty sure we always want to do the weight adjustments at the STL level, based on the STL's residual, not at the Loess level, based on whatever residual it is seeing in the current fit. I haven't played with that option, but it's certainly something the Fortran doesn't expose.) Here is the Java command line:
$ java -Dtrend.bandwidth=0.1075 -Dseasonal.bandwidth=0.85 \
-Douter.loop=1 -Dinner.loop=1 -Dstl.periodic=false \
-cp $BLAH com.github.brandtg.stl.StlDecomposition 288 \
art_daily_flatmiddle.csv art_daily_flatmiddle_java_noperiodic_norobust.csv
There are 14 periods in the data, but during the smoothing of the cyclic sub-series two extra periods are added, giving 16 points in the smoothing, so I had to set the seasonal bandwidth to 0.85 in order to get bandwidthInPoints up to the 13 that I used in Python. (Note that the default is 7 in R/Python, and I don't think there is any attempt to ensure that the default bandwidth results in a minimum of 7 points, or that the number of points will be odd. I've added some local TODOs to fix these things.)
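As a sanity check on that choice, here's the arithmetic, assuming the truncating (int)(bandwidth * n) conversion that Apache Commons Math's LoessInterpolator uses to turn a fractional bandwidth into a point count:

// Sketch of the bandwidth arithmetic described above.
int numPoints = 14 + 2;                    // 14 periods plus 2 padded periods
double seasonalBandwidth = 0.85;
int bandwidthInPoints = (int) (seasonalBandwidth * numPoints);  // (int) 13.6 == 13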
Here are the results:
Note that the seasonality has much more variation than that found by the Python version. As a result, the residuals are really only similar in periods 5-7.
With one robustness iteration (-Douter.loop=2 in the Java case), the trend picks up much less of the anomaly and the seasonality is much more periodic-looking:
Java is heading in a similar direction:
But note that the trend is still off at the origin. This is even more noticeable with 5 robustness iterations, where the trend should lose all of the transient effects and be close to constant:
Note that both the trend and seasonal components appear to have incorrect drift in the first period. After that they look pretty good.
@brandtg @hntd187 I think this is about as close to a minimal criminal as I can come up with. I created a simple two-day pattern:
import pandas as pd

# Two days of 5-minute samples: 20 off-hours, 80 from 09:00 to 18:00.
simple = pd.DataFrame({ 'data' : 20 }, index=pd.date_range("2016-04-16 00:00:00", periods=2*12*24+1, freq="5min"))
simple[(simple.index.hour >= 9) & (simple.index.hour < 18)] = 80
For the Python STL this gives:
For the Java, using
$ java -Dtrend.bandwidth=0.751 -Dseasonal.bandwidth=0.85 \
-Douter.loop=1 -Dinner.loop=1 -Dstl.periodic=false \
-cp $BLAH com.github.brandtg.stl.StlDecomposition 288 \
art_daily_flatmiddle.csv art_daily_flatmiddle_java_noperiodic_robust.csv
This gives:
@brandtg @hntd187
The nice thing about minimal criminals is that it is easy to see what should be happening. In this case, the cyclic sub-series should always be exactly constant. However, the way the procedure is running, the "paddedSeries" that is passed to the LoessInterpolator is (for the first point) [0, 20, 20, 20, 0]. It then does a Loess smoothing of this data, using all five points. That is clearly not what should be happening. What should be happening is a Loess smoothing of [20, 20, 20] that is then extrapolated to the padding points. I think that instead of calling smooth on the padded data set, this can be done by calling interpolate on the un-padded dataset and then using the PolynomialSplineFunction to extrapolate to the padding points.
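A rough sketch of that idea, assuming Apache Commons Math 3.x: each polynomial piece of a PolynomialSplineFunction is parameterized by the offset from its left knot, and value() rejects points outside the knot range, so the end pieces have to be evaluated directly to extrapolate:

// Smooth the un-padded sub-series, then extrapolate the spline's end
// polynomials to the padding points. Sketch only - not stl-java code.
import org.apache.commons.math3.analysis.interpolation.LoessInterpolator;
import org.apache.commons.math3.analysis.polynomials.PolynomialFunction;
import org.apache.commons.math3.analysis.polynomials.PolynomialSplineFunction;

double[] x = {1, 2, 3};    // un-padded cycle sub-series positions
double[] y = {20, 20, 20}; // un-padded cycle sub-series values

PolynomialSplineFunction spline = new LoessInterpolator(1.0, 2).interpolate(x, y);
double[] knots = spline.getKnots();
PolynomialFunction[] pieces = spline.getPolynomials();

// Piece i covers [knots[i], knots[i+1]] and takes (x - knots[i]) as its
// argument, so evaluating outside that range extrapolates the fit.
double leftPad = pieces[0].value(0.0 - knots[0]);                              // x = 0
double rightPad = pieces[pieces.length - 1].value(4.0 - knots[knots.length - 2]); // x = 4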
Digging into the Fortran, it only calls the full Loess smoother with the actual sub-cycle points and then calls a single-point Loess to clean up the endpoints. I don't think LoessInterpolator could be used this way, but it might not be that hard to write the single-point "interpolator" for these extrapolations.
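Something along these lines, perhaps - a hedged sketch of a single-point linear Loess estimate with tricube weights (my paraphrase of the idea, not a port of the Fortran routine):

// Tricube-weighted linear fit over the q nearest neighbors of xq, evaluated
// at xq, which may lie outside [x[0], x[n-1]]. Assumes q >= 2 and that the
// q-th nearest neighbor is at a nonzero distance from xq.
static double loessPoint(double[] x, double[] y, double xq, int q) {
    int n = x.length;
    double[] dist = new double[n];
    for (int i = 0; i < n; i++) dist[i] = Math.abs(x[i] - xq);
    double[] sorted = dist.clone();
    java.util.Arrays.sort(sorted);
    double h = sorted[Math.min(q, n) - 1];  // radius enclosing the q nearest points
    // Weighted least-squares accumulators for the linear fit.
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    for (int i = 0; i < n; i++) {
        double u = dist[i] / h;
        if (u >= 1) continue;
        double w = Math.pow(1 - u * u * u, 3);  // tricube weight
        sw += w; swx += w * x[i]; swy += w * y[i];
        swxx += w * x[i] * x[i]; swxy += w * x[i] * y[i];
    }
    double denom = sw * swxx - swx * swx;
    if (Math.abs(denom) < 1e-12) return swy / sw;  // degenerate: weighted mean
    double slope = (sw * swxy - swx * swy) / denom;
    return (swy - slope * swx) / sw + slope * xq;   // intercept + slope * xq
}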
@jcrotinger Sorry I was out sick for a couple days, so didn't have a chance to get to this. But thank you for the awesome analysis! We have been wondering about this funky behavior, but haven't had too much time to dig into the exact cause.
Your reasoning makes sense. The zero padding at either end of the series is likely causing the weird behavior. Do you want to submit a PR with the fix that you describe? Do you think we can get away with interpolate, or should we write our own single-point interpolator for the endpoint extrapolations?
@brandtg No worries. I'm taking a look at porting the single-point routine from the original RATFOR. :) I just about have the first cut done; I need to write some tests, and then I'll see about integrating it into the stl-java code. If I can get this single-point version working, then ultimately my preference would be to bag the dependence on LoessInterpolator and implement something closer to the Fortran. As I mentioned over on the performance thread, without doing the strided smoothing this is always going to be slow compared to what the Fortran does. (I also want to think about how hard it would be to extend to quadratic, though that can't be recast as a linear operation on the data, so it would never be as simple.)
@jcrotinger Sounds good to me. Removing the dependency on LoessInterpolator would be great.
And if we need a native impl, wanna jump on that Rust hype train? @brandtg choo choo
@hntd187 Haha, I suppose we could try writing something in Rust, but if we can get pure Java that would be ideal
Hey @hntd187 / @jcrotinger I put in a hack to do some kind of extrapolation when padding the cycle subseries, and this is what we get for that input (but with 10 inner loop iterations vs. 1):
With 1 it still looks a little funky, but it seems like progress if we can achieve reasonable output with the default config:
What do you think?
Hmm... Actually this seems to make the tests fail and really throws off that output... going to dig into that one a bit more, but open to ideas on how we can do the extrapolation part
Switched back to LoessInterpolator and just tried to handle the edge cases better. Given the limitation of LoessInterpolator (it can't extrapolate), this is effectively a hack - it just interpolates at the edge and uses that value, assuming it's similar enough.
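In other words, something like this minimal sketch, assuming spline is the PolynomialSplineFunction returned by an earlier interpolate call - the padding points just reuse the nearest in-range value:

// Sketch of the clamping hack described above: LoessInterpolator's spline
// can't be evaluated outside its knots, so reuse the edge values as padding.
double[] knots = spline.getKnots();
double leftPad = spline.value(knots[0]);                 // value at first knot
double rightPad = spline.value(knots[knots.length - 1]); // value at last knot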
Here is a run with the following configuration:
// Decompose with period 288 (5-minute data, daily seasonality), using the
// same bandwidths as the earlier runs, 10 inner-loop passes, non-periodic.
StlDecomposition stl = new StlDecomposition(288);
stl.getConfig().setTrendComponentBandwidth(0.751);
stl.getConfig().setSeasonalComponentBandwidth(0.85);
stl.getConfig().setNumberOfInnerLoopPasses(10);
stl.getConfig().setPeriodic(false);
// times and measures hold the input timestamps and values.
StlResult res = stl.decompose(times, measures);
There still seems to be some pattern in the residuals, but they're more balanced and closer to zero than the previous version. Also the tests pass 😅
Hi @brandtg. I'm afraid I gave up on this and "ported" the original Fortran to Java (I cleaned up the interface in the process, though I'd like to make more improvements - just not enough spare time). I want to give that back, but I will have to go through a fair bit of red tape at work in order to put it into the public domain, and right now I don't have time. (I also extended it to support second-order LOESS, a change I'd also like to put back into the original Fortran.) The experience was enlightening - the version of LOESS in the original Fortran is completely specialized for the use case of implementing the Fortran STL. It assumes evenly spaced data and is written to compute no more than necessary to interpolate the point that it is smoothing. As a result, IMO, trying to use a generic LOESS package will never perform well. In the meantime, if you are not constrained against using unmanaged code, it might be better to do a JNI interface to the original Fortran.
Yeah, but a JNI interface to that would be a portability nightmare, no doubt
Agree - that's why I did a Java version (well, that and other reasons). I've pinged our legal department about steps to release my version. Will let you guys know. Might be a project I could do over the holidays.
@brandtg and @hntd187, at long last my Java port is available on GitHub. Sorry it took so long - it was a back-burner project that had to go through corporate channels, since it is based on work I did during my day job. The now-public repo is https://github.com/ServiceNow/stl-decomp-4j. Please take a look, give it a whirl, etc. Performance is about half the speed of the native Fortran, which I think isn't too bad for Java, though I'm not thrilled that it isn't better. Currently it does create some garbage, and I'd like to change it to reuse the work arrays that it allocates during the low-pass phase, but the current version will have to do for now.
@jcrotinger Thanks for the update! Will dig into it - probably will deprecate this project and point towards yours
FYI, stl-decomp-4j is officially released, with version 1.0.0 at a Maven Central repository near you! I'd like to make an "stl server" variant that would cache its working arrays so that it doesn't generate garbage (which it currently does for the low-pass smoothing), but that will have to wait for a future pause in the action here at work. :)