Truncated BPTT

Open jramapuram opened this issue 9 years ago • 4 comments

Is it possible to do truncated BPTT currently? I have a really long time series: 1,411,889 samples. This overflows when trying to train on any backend.

jramapuram avatar Oct 28 '15 10:10 jramapuram

We don't have specific support for truncated BPTT currently. What you can do is chunk up your sequence and treat the chunks as separate sequences. That will lose the internal state between chunks, but it will at least allow you to train. Carrying the internal state over, precisely for this use case, is on our agenda (see #57).
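A minimal NumPy sketch of that chunking, assuming brainstorm's time-major (time, batch, features) data layout; the function name and chunk length are just illustrative:

```python
import numpy as np

def chunk_sequence(data, chunk_len):
    """Cut one long (time, features) series into fixed-length chunks.

    Trailing samples that don't fill a whole chunk are dropped; the
    chunks are stacked as independent sequences along the batch axis,
    giving shape (chunk_len, n_chunks, features).
    """
    n_chunks = data.shape[0] // chunk_len
    trimmed = data[:n_chunks * chunk_len]
    # (n_chunks, chunk_len, features) -> (chunk_len, n_chunks, features)
    return trimmed.reshape(n_chunks, chunk_len, -1).transpose(1, 0, 2)

# e.g. the 1,411,889-step series from above, cut into 100-step chunks
series = np.random.randn(1411889, 1)
chunks = chunk_sequence(series, 100)  # shape (100, 14118, 1)
```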

Qwlouse avatar Oct 28 '15 12:10 Qwlouse

Yeah, I had considered chunking; however, as you mentioned, the cross-sequence context is lost, which in turn prevents learning truly 'long-term' dependencies. Looking forward to your solution to #57.

jramapuram avatar Oct 28 '15 12:10 jramapuram

Any news with this?

jramapuram avatar Mar 26 '16 17:03 jramapuram

We realized that this was not needed for our current experiments, so we wouldn't be able to test it properly. We haven't finalized how this should be cleanly integrated with everything else, but the lower-level machinery needed to get and restore context is in place, so it should be possible to write a custom SgdStepper that restores context across forward passes with the help of network.get_context().
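For illustration, one hedged sketch of such a stepper. Only SgdStepper and network.get_context() are named above; the import path, the run() hook, self.net, and the apply_context() counterpart are assumptions about the stepper interface, not confirmed API:

```python
# Hypothetical sketch: carry recurrent state across chunk boundaries by
# saving the context after each update and restoring it before the next.
from brainstorm.training.steppers import SgdStepper  # assumed import path


class ContextCarryingSgdStepper(SgdStepper):
    """SGD stepper that restores internal state between chunks (sketch)."""

    def __init__(self, *args, **kwargs):
        super(ContextCarryingSgdStepper, self).__init__(*args, **kwargs)
        self.context = None  # state saved at the end of the previous chunk

    def run(self):
        if self.context is not None:
            # Restore the hidden state from the previous chunk before the
            # forward pass; apply_context() is a hypothetical counterpart
            # to the get_context() mentioned in this thread.
            self.net.apply_context(self.context)
        result = super(ContextCarryingSgdStepper, self).run()
        # Save this chunk's final state for the next forward pass.
        self.context = self.net.get_context()
        return result
```

Note that with this approach the chunks of a given sequence must be fed in order, with a consistent batch layout, so the saved context lines up with the right sequences.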

flukeskywalker avatar Mar 27 '16 17:03 flukeskywalker