sightseq
sightseq copied to clipboard
Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
The shape of the picture is (Batch, Channel, H, W) The data shape that the sequence_generate can receive is (batch, seq_len,...) I did not find a solution in your code,...
I checked the network roughly, and I found it seems no recurrent layers like Bi-LSTM? Is this repo another implementation for CRNN? I just see several CNN backbone and fully...
The convolutional part of the architecture act as a encoder part, it capture image's contexture information, the architecture should ensemble a decoder part (deconvolution layer or RNN layer) to recover...