Imp/tsmixer basic
Checklist before merging this PR:
- [x] Mentioned all issues that this PR fixes or addresses.
- [x] Summarized the updates of this PR under Summary.
- [x] Added an entry under Unreleased in the Changelog.
Implements #2510
Summary
Adds the option to project to the output temporal space at the end of TS-Mixer, rather than the beginning. This was how most of the results in the original google-research paper were achieved (ie, the architecture in Fig #1 of the paper). This may allow higher performance in cases where past covariates are important by allowing a more direct series of residual connections along the input time dimension.
I allowed support for future covariates by instead projecting them into the lookback temporal space, but this probably won't perform well in cases where they are more important than the historical targets and past covariates.
Other Information
The original paper and source code do not clarify whether the final temporal projection should go before or after the final feature projection as they hardcoded hidden_size to output_dim and therefore did not have need a final feature projection. I erred on the side of putting the temporal projection first, as otherwise the common output_dim==1 could lead to unexpected, catastrophic compression before the temporal projection step.
Check out this pull request on ![]()
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Hi @eschibli,
First of all, thanks for opening this PR!
For the linting, it will make your life much easier if you follow these instruction, or you can also run it manually
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 95.14%. Comparing base (cbf1c1a) to head (7ca964b).
Additional details and impacted files
@@ Coverage Diff @@
## master #2555 +/- ##
==========================================
- Coverage 95.21% 95.14% -0.08%
==========================================
Files 146 146
Lines 15554 15568 +14
==========================================
+ Hits 14810 14812 +2
- Misses 744 756 +12
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Hi @eschibli, ...
Thanks @madtoinou. I was not able to get Gradle running on my machine and didn't realize ruff was that easy to set up so sorry for spamming your test pipeline.
I don't believe the failing mac build is a result of my changes so it should be good for review now.
Hi @eschibli, thanks for the PR. Yes, the failing mac tests are unrelated to your PR, we're working on it :). Also, give us some time to review, our capacity is currently a bit limited 🙏
Understood Dennis
@eschibli, do you think that you would have the time to apply the modifications mentioned above so that we can get closer to merge this in main?
Hi @madtoinou: Sorry I was very busy in December and January. I should have some time this week or next week to finish this.
Also, in my own work I have had more success with an intermediate variant, which I mentioned in the original feature request thread. If we would consider adding that additional functionality I will add it to this pull request.
Sure! We will then try to compare each one of them on small use-cases, to make sure that they bring something but based on your experience, it seems to already be the case :)
@madtoinou are you happy with these changes?
@madtoinou @dennisbader sorry to nag but could you please clarify if you are unhappy with this addition or if more changes are needed to approve this?
@madtoinou could you give an update on this one? :)
Hi @eschibli,
Sorry for the delay, I started reviewing but it took me longer than expected.
These changes are welcomed, I have the impression that it will bring considerable performance gain but I want to make sure that the current behavior is still accessible and reproducible.