
The mapping between the feature sequence and the input image

Open BigTree765 opened this issue 3 years ago • 2 comments

I am interested in fully convolutional architectures for handwriting recognition. I read your article "Searching from the Prediction of Visual and Language Model for Handwritten Chinese Text Recognition", which was presented at the ICDAR conference this year, and I am very interested in your method.

In particular, I am confused by the figure below, i.e. how the feature sequence maps to the input image.

[Image: mapRelationship]

In my opinion, if the figure is interpreted directly from the downsampling ratio, the original image is downsampled 32 times along the width, so each element of the feature sequence should correspond to a 32-pixel-wide region of the original image, and adjacent regions should not overlap.

I hope to get your advice.

Thank you very much for your time.

BigTree765 · Jan 11 '22 05:01

@yusirhhh If you are familiar with the concept of the receptive field in convolutional neural networks, you will see that the receptive fields of adjacent features in deeper layers clearly overlap.
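To make the overlap concrete, here is a minimal sketch of the standard receptive-field recurrence along the width axis. The layer stack is hypothetical (five blocks of a 3-wide convolution followed by 2-wide, stride-2 pooling), chosen only because it yields a total stride of 32; it is not the architecture from the paper or this repository.

```python
def receptive_field(layers):
    """Return (receptive_field, total_stride) along one axis.

    `layers` is a list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent features
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * current jump
        jump *= stride             # striding spreads adjacent features further apart
    return rf, jump


# Hypothetical stack, NOT the repository's network: 5 x (conv k=3 s=1, pool k=2 s=2)
layers = [(3, 1), (2, 2)] * 5
rf, total_stride = receptive_field(layers)
overlap = rf - total_stride  # input columns shared by two adjacent features
print(rf, total_stride, overlap)
```

With this assumed stack, adjacent features are 32 pixels apart but each one sees 94 input columns, so neighbouring features share 62 columns. The downsampling ratio fixes only the spacing between features, not the width each feature covers.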

bliu3650 · Jan 18 '22 12:01

But for a deep CNN the receptive field is very large, and the theoretical receptive field can even cover the whole image. From the schematic, however, I think the receptive field is only about 96 pixels wide.
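As a rough illustration of what a width of about 96 would mean for the mapping, the sketch below computes the input columns seen by each feature. It takes the stride of 32 from the downsampling ratio and the width of 96 from the estimate above, and it assumes each window is centered on its 32-pixel cell; that centering, and the helper name `input_span`, are assumptions made for illustration and are not taken from the paper.

```python
def input_span(index, stride=32, rf_width=96, image_width=None):
    """Return the half-open column range [start, end) seen by feature `index`.

    Assumes the receptive field is centered on the stride-wide cell of the
    feature (an assumption; real networks depend on padding details).
    """
    start = index * stride + stride // 2 - rf_width // 2
    end = start + rf_width
    start = max(start, 0)                # clip at the left image border
    if image_width is not None:
        end = min(end, image_width)      # clip at the right image border
    return start, end


for i in range(3):
    print(i, input_span(i))
```

Under these assumptions, feature 1 covers columns 0 to 96 and feature 2 covers columns 32 to 128, so the two share 64 columns even though their centers are only 32 pixels apart.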

BigTree765 · Jan 18 '22 13:01