handwritten-chinese-ocr-samples
The mapping between the feature sequence and the input image
I am interested in fully convolutional architectures for handwriting recognition. I read your paper "Searching from the Prediction of Visual and Language Model for Handwritten Chinese Text Recognition", which was presented at ICDAR this year, and I am very interested in your method.
In particular, I am confused about the picture below, i.e. how the feature sequence maps to the input image.

In my opinion, if the mapping is read directly from the downsampling ratio, the original image is downsampled by a factor of 32 along the width, so each element of the feature sequence should correspond to a 32-pixel-wide region of the original image, with no overlap between neighboring regions.
I hope to get your advice.
Thank you very much for your time.
@yusirhhh If you are familiar with the concept of the receptive field in convolutional neural networks, you will know that the receptive fields of adjacent features in deeper layers clearly overlap.
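To make the overlap concrete, here is a minimal sketch of the standard receptive-field recurrence along the width axis. The layer stack is hypothetical (five blocks of a 3-wide convolution followed by 2-wide pooling with stride 2) and is not the architecture from the paper; it is only chosen so that the total downsampling is 32.

```python
def receptive_field(layers):
    """Compute the receptive field width and total stride of a layer stack.

    layers: list of (kernel_size, stride) pairs along the width axis,
            ordered from input to output.
    """
    rf, jump = 1, 1  # one input pixel sees itself; adjacent pixels are 1 apart
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input-level steps
        jump *= stride             # distance between adjacent outputs, in input pixels
    return rf, jump


# Hypothetical stack (an assumption, not the paper's network):
# five blocks of conv (kernel 3, stride 1) + pooling (kernel 2, stride 2).
layers = [(3, 1), (2, 2)] * 5
rf, total_stride = receptive_field(layers)
print(rf, total_stride)  # 94 32
```

In this toy stack, adjacent features are 32 pixels apart but each one sees 94 pixels, so neighboring features share 62 pixels of input. The stride only tells you how far the window moves, not how wide it is.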
But a deep CNN has a very large receptive field; in theory it can even cover the whole image. From the schematic, though, the receptive field appears to be only about 96 pixels wide.
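For reference, this is how a receptive field of 96 with a stride of 32 would map feature indices to image columns. It is only a sketch: the helper name `feature_span` is made up, and centering each window on its 32-pixel cell is an assumption, since the real offset depends on the padding used in the network.

```python
def feature_span(index, stride=32, rf=96):
    """Return the [left, right) column range seen by feature `index`.

    Assumes the window is centered on the feature's stride-wide cell;
    the actual alignment depends on the network's padding.
    """
    center = index * stride + stride // 2
    left = center - rf // 2
    return left, left + rf


for i in range(3):
    print(i, feature_span(i))
# 0 (-32, 64)   negative columns fall in the padding
# 1 (0, 96)
# 2 (32, 128)
```

Under these assumptions, features 1 and 2 share columns 32 to 96, i.e. 64 pixels, which is the kind of overlap the schematic seems to show.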