Add Model Support for xLSTM
Model description
Inspired by recent rumors about xLSTM - a hidden successor to LSTM - by Sepp Hochreiter, this issue tracks the open-source implementation effort for adding xLSTM to the Transformers library.
Open source status
- [ ] The model implementation is available
- [ ] The model weights are available
Provide useful links for the implementation
- [x] Paper is available here
At the moment, no implementation exists.
There are only rumors that xLSTM surpasses GPT-2 on various (small) downstream datasets.
A good overview is the xLSTM Resources repository from @AI-Guru.
Sounds like a money grab. If it is something useful, he should have chosen the academic path or at least filed a patent.
This way of boldly claiming success via non-serious media channels is highly unprofessional. It smells like publicity is more relevant than results, which further suggests motivations like funding, personal gain, or politics.
If I understood correctly, a patent is on its way, and at least a paper about xLSTM will be published in less than 6 months.
I have some doubts about whether this is planned as an open-source model.
The paper is published now: https://arxiv.org/abs/2405.04517
Need code and checkpoint or it didn't happen.
The official implementation is out now:
https://github.com/NX-AI/xlstm
Note that the official source code is AGPL-licensed.