clstm
An OCR example for 2D LSTM
It would be very helpful to have code or a high-level description of how to use 2D LSTMs in CLSTM for OCR tasks. Going through the image filter example in test-2d.cc, which uses a 1x1 patch (if I understand correctly), it is not obvious how to use 2D LSTMs in, e.g., clstmocrtrain.cc. Directly replacing the BLSTM with a 2D LSTM compiles and runs, but it does not converge, and it would be using 48x1 (i.e. 1D) patches. What one would like to try instead is, e.g., 1x2 or 2x4 patches, as in here or here.
Also, I would like to ask what the limitations of CLSTM's implementation are compared to RNNLIB's when handling 2D LSTMs.
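To make the patch idea concrete, here is a minimal sketch of the kind of tiling I have in mind. It is plain C++ and does not use any CLSTM API; `tile_patches` is a hypothetical helper name, and the sketch assumes a row-major image whose height and width are exact multiples of the patch size (no padding).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of CLSTM: tiles a row-major H x W image into
// non-overlapping ph x pw patches and flattens each patch into one feature
// vector. Returns (H/ph) * (W/pw) vectors in row-major patch order, each of
// length ph * pw.
std::vector<std::vector<float>> tile_patches(const std::vector<float> &img,
                                             std::size_t H, std::size_t W,
                                             std::size_t ph, std::size_t pw) {
  assert(ph > 0 && pw > 0);
  assert(img.size() == H * W);
  assert(H % ph == 0 && W % pw == 0);  // assumption of this sketch: no padding

  std::vector<std::vector<float>> patches;
  patches.reserve((H / ph) * (W / pw));
  for (std::size_t py = 0; py < H; py += ph) {
    for (std::size_t px = 0; px < W; px += pw) {
      std::vector<float> v;
      v.reserve(ph * pw);
      // Copy the patch row by row into a flat feature vector.
      for (std::size_t dy = 0; dy < ph; ++dy)
        for (std::size_t dx = 0; dx < pw; ++dx)
          v.push_back(img[(py + dy) * W + (px + dx)]);
      patches.push_back(v);
    }
  }
  return patches;
}
```

With a 48-pixel-high line image, `tile_patches(img, 48, W, 48, 1)` yields W vectors of length 48, which is the 1D case; `tile_patches(img, 48, W, 2, 4)` yields a 24 x (W/4) grid of 8-dimensional vectors that a 2D network could then scan in both directions.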
Thanks,
The 2D LSTM hasn't been used yet for OCR, so that's still to be worked out.
CLSTM's 2D LSTM uses a "separable 2D LSTM"; that is, it doesn't zig-zag across the image, but instead processes rows and columns in parallel. In existing benchmarks, that's turned out to be at least as good as traditional 2D LSTM, but it's a lot faster and more parallelizable.
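As a rough illustration of what "separable" means here, the sketch below runs a 1D recurrence along every row and then along every column of the result. This is not CLSTM's code: a leaky running average (`scan`) stands in for the LSTM cell, only one direction per axis is shown, and the exact way CLSTM composes the row and column passes is an assumption. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<float>>;  // indexed as img[y][x]

// Stand-in for an LSTM (assumption for illustration only): a leaky running
// average h_t = a * h_{t-1} + (1 - a) * x_t along one sequence.
std::vector<float> scan(const std::vector<float> &xs, float a) {
  std::vector<float> hs(xs.size());
  float h = 0.0f;
  for (std::size_t t = 0; t < xs.size(); ++t) {
    h = a * h + (1.0f - a) * xs[t];
    hs[t] = h;
  }
  return hs;
}

// Swap the two axes of a rectangular image.
Image transpose(const Image &m) {
  assert(!m.empty());
  Image t(m[0].size(), std::vector<float>(m.size()));
  for (std::size_t y = 0; y < m.size(); ++y)
    for (std::size_t x = 0; x < m[y].size(); ++x)
      t[x][y] = m[y][x];
  return t;
}

// Separable 2D pass: rows first, then columns of the row output.
// Every row (and later every column) is an independent 1D sequence,
// so each pass could be run in parallel across sequences.
Image separable_scan(const Image &img, float a) {
  Image rows;
  for (const auto &r : img) rows.push_back(scan(r, a));
  Image cols = transpose(rows);
  for (auto &c : cols) c = scan(c, a);
  return transpose(cols);
}
```

Even though no step walks the image diagonally, the output at the bottom-right pixel still depends on the top-left input: the row pass carries it rightwards and the column pass carries it downwards. A zig-zag 2D LSTM, by contrast, needs the states of both the left and the upper neighbour before it can compute a pixel, which is what limits its parallelism.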
Is this the architecture used here?