CRNN-Keras
CNN - other architectures and transfer learning
Has anyone tried replacing the default simple CNN with a pretrained architecture such as VGG or InceptionV3? On one hand, a typical text image (a license plate, for example) is quite different in nature from a typical photo (of a dog, say). On the other hand, it might still be beneficial to reuse the first few layers.
I have the same question.
The original paper does not appear to use a pretrained model, probably because VGG16 with its fully connected top layers expects a fixed 224x224 input, and resizing text images to that size is not appropriate.
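For what it's worth, the 224x224 restriction comes from the dense layers on top of VGG16. In Keras the convolutional part can be built with `include_top=False` and a different `input_shape`. Below is a minimal sketch of truncating VGG16 after its third block so it could stand in for the first few CNN layers. The function name `build_vgg_feature_extractor`, the 64x128x3 input size, and the choice of `block3_pool` as the cut point are my own assumptions for illustration, not something taken from this repo or the paper.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model


def build_vgg_feature_extractor(input_shape=(64, 128, 3),
                                weights=None,
                                last_layer="block3_pool"):
    """Hypothetical helper: VGG16 conv layers up to `last_layer`, frozen.

    weights=None keeps the sketch offline; pass weights="imagenet" to
    actually load pretrained weights for transfer learning.
    """
    # include_top=False drops the dense layers, so the input no longer
    # has to be 224x224 (Keras still requires 3 channels here).
    base = VGG16(include_top=False, weights=weights, input_shape=input_shape)

    # Keep only the first few blocks; the deeper blocks downsample further.
    features = base.get_layer(last_layer).output
    extractor = Model(inputs=base.input, outputs=features,
                      name="vgg16_truncated")

    # Freeze the reused layers (only meaningful with pretrained weights).
    for layer in extractor.layers:
        layer.trainable = False
    return extractor


extractor = build_vgg_feature_extractor()
print(extractor.output_shape)
```

With these assumed settings the feature map is downsampled by a factor of 8 in each spatial dimension, so the recurrent part would have to be adapted to that width. Grayscale plate images would also need to be repeated across three channels before being fed in. Whether this beats the simple CNN trained from scratch is something I have not tested.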