
[Summary] Ability to pass in initial state

Open CompRhys opened this issue 1 year ago • 3 comments

Several issues have requested the ability to pass in an initial state, but many of them were closed by their authors without being resolved. This issue collates those earlier issues so that maintainer responses are easier to find when searching open issues.

https://github.com/state-spaces/mamba/issues/155 https://github.com/state-spaces/mamba/issues/146 https://github.com/state-spaces/mamba/issues/141 https://github.com/state-spaces/mamba/issues/127 https://github.com/state-spaces/mamba/issues/101

TL;DR: this functionality is a work in progress.

CompRhys, Feb 14 '24 01:02

https://github.com/state-spaces/mamba/issues/258

CompRhys, Mar 20 '24 22:03

I'd also benefit a lot from this feature. I have multiple training sequences with long common prefixes, and I'd like to run the model over each prefix once, then fork the state for each continuation. This would be used during training, so, contrary to what @albertfgu said in #101, I would need gradient flow through the pause/resume process.
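
To make the prefix-forking idea concrete, here is a minimal sketch using a toy scalar linear recurrence, not Mamba's selective scan or any part of its API. The function `scan` and its parameters `h0`, `decay`, and `gain` are hypothetical names chosen for illustration. It shows that resuming from a saved state gives the same outputs as rerunning the whole sequence.

```python
import math

def scan(xs, h0=0.0, decay=0.9, gain=0.5):
    """Toy recurrence h_t = decay * h_{t-1} + gain * x_t (hypothetical, not Mamba).

    Returns the per-step outputs and the final state.
    """
    h = h0
    ys = []
    for x in xs:
        h = decay * h + gain * x
        ys.append(h)
    return ys, h

prefix = [1.0, 2.0, 3.0]
continuations = [[4.0, 5.0], [-1.0, 0.5]]

# Run the shared prefix once and keep its final state.
_, prefix_state = scan(prefix)

# Fork: every continuation starts from the same saved state.
forked = [scan(c, h0=prefix_state)[0] for c in continuations]

# Reference: rerun prefix + continuation from scratch each time.
full = [scan(prefix + c)[0][len(prefix):] for c in continuations]

# The forked outputs should agree with the from-scratch outputs.
matches = all(
    math.isclose(a, b)
    for f, g in zip(forked, full)
    for a, b in zip(f, g)
)
```

In an autograd framework, the saved prefix state would have to remain part of the computation graph (not detached) for gradients from every continuation to reach the prefix, which is the training requirement described above.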

SamPruden, Mar 24 '24 14:03

Actually, if you don't mind it being 10x slower and using 2x the GPU memory, there is a workaround for now: https://github.com/state-spaces/mamba/issues/51

But I guess a true Mamba implementation with initial hidden states will need a CUDA expert to write it.

radarFudan, Mar 24 '24 14:03