
DBNet in poly mode for curved-text detection: is there a pretrained model?

Open Tomhouxin opened this issue 2 years ago • 10 comments

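My dataset is small, only 400 images, and training with nothing but a ResNet-18 model as the pretrained model struggles to converge. Quad mode works without problems, but most of my samples are curved text, so I'd like to ask for advice. Thanks!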

Tomhouxin avatar Jun 09 '22 07:06 Tomhouxin

We recommend using English or English & Chinese for issues so that we could have broader discussion.

mm-assistant[bot] avatar Jun 09 '22 07:06 mm-assistant[bot]

We have added it to our backlog and will release it when it's ready. But for now, we don't have the checkpoint for this setting.

gaotongxiao avatar Jun 09 '22 07:06 gaotongxiao

Ok, Thanks!

How much data is required to use poly mode?

Tomhouxin avatar Jun 09 '22 09:06 Tomhouxin

Generally, text detection models can achieve satisfactory performance without a very large amount of data. For example, ICDAR 2015 (ic15) contains only 1,000 training samples.

gaotongxiao avatar Jun 09 '22 10:06 gaotongxiao

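Here are the training logs. It looks like training has stopped converging: hmean-iou:hmean is stuck at around 0.8 and the loss is no longer decreasing, yet the detection results look like a mess. Should I adjust the learning rate and continue training?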

{"mode": "val", "epoch": 69, "iter": 80, "lr": 0.00482, "hmean-iou:recall": 0.76316, "hmean-iou:precision": 0.84466, "hmean-iou:hmean": 0.80184}
{"mode": "train", "epoch": 70, "iter": 5, "lr": 0.00478, "memory": 6697, "data_time": 1.78142, "loss_prob": 0.8945, "loss_db": 0.14975, "loss_thr": 0.41529, "loss": 1.45953, "time": 2.87308}
{"mode": "train", "epoch": 70, "iter": 10, "lr": 0.00478, "memory": 6697, "data_time": 0.36489, "loss_prob": 0.73329, "loss_db": 0.13053, "loss_thr": 0.38303, "loss": 1.24684, "time": 1.38242}
{"mode": "train", "epoch": 70, "iter": 15, "lr": 0.00478, "memory": 6697, "data_time": 0.27795, "loss_prob": 0.79045, "loss_db": 0.14272, "loss_thr": 0.39014, "loss": 1.32331, "time": 1.27847}
{"mode": "train", "epoch": 70, "iter": 20, "lr": 0.00478, "memory": 6697, "data_time": 0.21064, "loss_prob": 0.83678, "loss_db": 0.14476, "loss_thr": 0.42282, "loss": 1.40436, "time": 1.1707}
{"mode": "val", "epoch": 70, "iter": 80, "lr": 0.00478, "hmean-iou:recall": 0.7807, "hmean-iou:precision": 0.84762, "hmean-iou:hmean": 0.81279}
{"mode": "train", "epoch": 71, "iter": 5, "lr": 0.00475, "memory": 6697, "data_time": 1.92355, "loss_prob": 0.81302, "loss_db": 0.13625, "loss_thr": 0.42131, "loss": 1.37059, "time": 3.07605}
{"mode": "train", "epoch": 71, "iter": 10, "lr": 0.00475, "memory": 6697, "data_time": 0.35925, "loss_prob": 0.90666, "loss_db": 0.15217, "loss_thr": 0.44795, "loss": 1.50678, "time": 1.28993}
{"mode": "train", "epoch": 71, "iter": 15, "lr": 0.00475, "memory": 6697, "data_time": 0.26736, "loss_prob": 0.94919, "loss_db": 0.16886, "loss_thr": 0.41646, "loss": 1.53451, "time": 1.21419}
{"mode": "train", "epoch": 71, "iter": 20, "lr": 0.00475, "memory": 6697, "data_time": 0.2183, "loss_prob": 0.75486, "loss_db": 0.13175, "loss_thr": 0.4056, "loss": 1.2922, "time": 1.27543}
{"mode": "val", "epoch": 71, "iter": 80, "lr": 0.00475, "hmean-iou:recall": 0.81579, "hmean-iou:precision": 0.83784, "hmean-iou:hmean": 0.82667}
{"mode": "train", "epoch": 72, "iter": 5, "lr": 0.00472, "memory": 6697, "data_time": 1.85821, "loss_prob": 0.82848, "loss_db": 0.14779, "loss_thr": 0.39736, "loss": 1.37362, "time": 3.30308}
{"mode": "train", "epoch": 72, "iter": 10, "lr": 0.00472, "memory": 6697, "data_time": 0.54173, "loss_prob": 0.95, "loss_db": 0.16089, "loss_thr": 0.43677, "loss": 1.54766, "time": 1.33972}
{"mode": "train", "epoch": 72, "iter": 15, "lr": 0.00472, "memory": 6697, "data_time": 0.22638, "loss_prob": 0.75431, "loss_db": 0.13126, "loss_thr": 0.3615, "loss": 1.24707, "time": 1.19844}
{"mode": "train", "epoch": 72, "iter": 20, "lr": 0.00472, "memory": 6697, "data_time": 0.24143, "loss_prob": 0.97624, "loss_db": 0.16567, "loss_thr": 0.43851, "loss": 1.58042, "time": 1.26011}
{"mode": "val", "epoch": 72, "iter": 80, "lr": 0.00472, "hmean-iou:recall": 0.7807, "hmean-iou:precision": 0.83962, "hmean-iou:hmean": 0.80909}
{"mode": "train", "epoch": 73, "iter": 5, "lr": 0.00468, "memory": 6697, "data_time": 1.7954, "loss_prob": 0.85423, "loss_db": 0.15098, "loss_thr": 0.45753, "loss": 1.46274, "time": 2.90421}
{"mode": "train", "epoch": 73, "iter": 10, "lr": 0.00468, "memory": 6697, "data_time": 0.29776, "loss_prob": 0.85731, "loss_db": 0.14573, "loss_thr": 0.39932, "loss": 1.40237, "time": 1.23691}
{"mode": "train", "epoch": 73, "iter": 15, "lr": 0.00468, "memory": 6697, "data_time": 0.27035, "loss_prob": 0.85157, "loss_db": 0.14417, "loss_thr": 0.39404, "loss": 1.38979, "time": 1.33959}
{"mode": "train", "epoch": 73, "iter": 20, "lr": 0.00468, "memory": 6697, "data_time": 0.24757, "loss_prob": 0.97287, "loss_db": 0.17615, "loss_thr": 0.4258, "loss": 1.57482, "time": 1.25033}
{"mode": "val", "epoch": 73, "iter": 80, "lr": 0.00468, "hmean-iou:recall": 0.79825, "hmean-iou:precision": 0.80531, "hmean-iou:hmean": 0.80176}
{"mode": "train", "epoch": 74, "iter": 5, "lr": 0.00465, "memory": 6697, "data_time": 1.77191, "loss_prob": 0.79204, "loss_db": 0.13724, "loss_thr": 0.40282, "loss": 1.33211, "time": 2.81048}
{"mode": "train", "epoch": 74, "iter": 10, "lr": 0.00465, "memory": 6697, "data_time": 0.35505, "loss_prob": 0.90555, "loss_db": 0.16267, "loss_thr": 0.43164, "loss": 1.49986, "time": 1.32076}
{"mode": "train", "epoch": 74, "iter": 15, "lr": 0.00465, "memory": 6697, "data_time": 0.28285, "loss_prob": 0.84884, "loss_db": 0.14985, "loss_thr": 0.40945, "loss": 1.40814, "time": 1.26473}
{"mode": "train", "epoch": 74, "iter": 20, "lr": 0.00465, "memory": 6697, "data_time": 0.22423, "loss_prob": 0.83922, "loss_db": 0.13588, "loss_thr": 0.37352, "loss": 1.34863, "time": 1.17557}
{"mode": "val", "epoch": 74, "iter": 80, "lr": 0.00465, "hmean-iou:recall": 0.80702, "hmean-iou:precision": 0.81416, "hmean-iou:hmean": 0.81057}
{"mode": "train", "epoch": 75, "iter": 5, "lr": 0.00462, "memory": 6697, "data_time": 1.89862, "loss_prob": 0.85342, "loss_db": 0.15324, "loss_thr": 0.38652, "loss": 1.39318, "time": 3.14093}
{"mode": "train", "epoch": 75, "iter": 10, "lr": 0.00462, "memory": 6697, "data_time": 0.40921, "loss_prob": 0.91439, "loss_db": 0.1647, "loss_thr": 0.41639, "loss": 1.49547, "time": 1.22508}
{"mode": "train", "epoch": 75, "iter": 15, "lr": 0.00462, "memory": 6697, "data_time": 0.24701, "loss_prob": 0.86097, "loss_db": 0.16405, "loss_thr": 0.4155, "loss": 1.44053, "time": 1.17545}
{"mode": "train", "epoch": 75, "iter": 20, "lr": 0.00462, "memory": 6697, "data_time": 0.21653, "loss_prob": 0.83336, "loss_db": 0.14021, "loss_thr": 0.41576, "loss": 1.38932, "time": 1.21775}
{"mode": "val", "epoch": 75, "iter": 80, "lr": 0.00462, "hmean-iou:recall": 0.77193, "hmean-iou:precision": 0.85437, "hmean-iou:hmean": 0.81106}
{"mode": "train", "epoch": 76, "iter": 5, "lr": 0.00459, "memory": 6697, "data_time": 1.8262, "loss_prob": 0.96722, "loss_db": 0.16866, "loss_thr": 0.42837, "loss": 1.56425, "time": 3.11078}
{"mode": "train", "epoch": 76, "iter": 10, "lr": 0.00459, "memory": 6697, "data_time": 0.34228, "loss_prob": 0.83945, "loss_db": 0.13983, "loss_thr": 0.41511, "loss": 1.39439, "time": 1.24785}
{"mode": "train", "epoch": 76, "iter": 15, "lr": 0.00459, "memory": 6697, "data_time": 0.24269, "loss_prob": 0.77103, "loss_db": 0.13846, "loss_thr": 0.39751, "loss": 1.30701, "time": 1.16602}
{"mode": "train", "epoch": 76, "iter": 20, "lr": 0.00459, "memory": 6697, "data_time": 0.23284, "loss_prob": 0.82548, "loss_db": 0.14137, "loss_thr": 0.4183, "loss": 1.38514, "time": 1.17493}
{"mode": "val", "epoch": 76, "iter": 80, "lr": 0.00459, "hmean-iou:recall": 0.80702, "hmean-iou:precision": 0.83636, "hmean-iou:hmean": 0.82143}
{"mode": "train", "epoch": 77, "iter": 5, "lr": 0.00455, "memory": 6697, "data_time": 1.89037, "loss_prob": 0.86756, "loss_db": 0.14867, "loss_thr": 0.43502, "loss": 1.45125, "time": 3.01924}
{"mode": "train", "epoch": 77, "iter": 10, "lr": 0.00455, "memory": 6697, "data_time": 0.28837, "loss_prob": 0.80347, "loss_db": 0.13739, "loss_thr": 0.42769, "loss": 1.36856, "time": 1.19132}
{"mode": "train", "epoch": 77, "iter": 15, "lr": 0.00455, "memory": 6697, "data_time": 0.25167, "loss_prob": 0.77439, "loss_db": 0.13453, "loss_thr": 0.38776, "loss": 1.29668, "time": 1.23971}
{"mode": "train", "epoch": 77, "iter": 20, "lr": 0.00455, "memory": 6697, "data_time": 0.23134, "loss_prob": 0.9148, "loss_db": 0.16201, "loss_thr": 0.43418, "loss": 1.51099, "time": 1.17831}
{"mode": "val", "epoch": 77, "iter": 80, "lr": 0.00455, "hmean-iou:recall": 0.80702, "hmean-iou:precision": 0.85981, "hmean-iou:hmean": 0.83258}
{"mode": "train", "epoch": 78, "iter": 5, "lr": 0.00452, "memory": 6697, "data_time": 1.92259, "loss_prob": 0.81544, "loss_db": 0.14349, "loss_thr": 0.43224, "loss": 1.39118, "time": 3.1056}
{"mode": "train", "epoch": 78, "iter": 10, "lr": 0.00452, "memory": 6697, "data_time": 0.36711, "loss_prob": 0.79501, "loss_db": 0.13893, "loss_thr": 0.40264, "loss": 1.33658, "time": 1.271}
{"mode": "train", "epoch": 78, "iter": 15, "lr": 0.00452, "memory": 6697, "data_time": 0.24121, "loss_prob": 0.80556, "loss_db": 0.14648, "loss_thr": 0.40289, "loss": 1.35494, "time": 1.27677}
{"mode": "train", "epoch": 78, "iter": 20, "lr": 0.00452, "memory": 6697, "data_time": 0.2458, "loss_prob": 0.83969, "loss_db": 0.14757, "loss_thr": 0.41819, "loss": 1.40545, "time": 1.21414}
{"mode": "val", "epoch": 78, "iter": 80, "lr": 0.00452, "hmean-iou:recall": 0.80702, "hmean-iou:precision": 0.80702, "hmean-iou:hmean": 0.80702}
{"mode": "train", "epoch": 79, "iter": 5, "lr": 0.00449, "memory": 6697, "data_time": 2.18414, "loss_prob": 0.77072, "loss_db": 0.13662, "loss_thr": 0.40215, "loss": 1.3095, "time": 3.49511}
{"mode": "train", "epoch": 79, "iter": 10, "lr": 0.00449, "memory": 6697, "data_time": 0.55483, "loss_prob": 0.80347, "loss_db": 0.13312, "loss_thr": 0.42606, "loss": 1.36265, "time": 1.4829}
{"mode": "train", "epoch": 79, "iter": 15, "lr": 0.00449, "memory": 6697, "data_time": 0.14666, "loss_prob": 0.85638, "loss_db": 0.14485, "loss_thr": 0.42296, "loss": 1.42419, "time": 1.19862}
{"mode": "train", "epoch": 79, "iter": 20, "lr": 0.00449, "memory": 6697, "data_time": 0.22317, "loss_prob": 0.83489, "loss_db": 0.13855, "loss_thr": 0.40241, "loss": 1.37585, "time": 1.24663}
{"mode": "val", "epoch": 79, "iter": 80, "lr": 0.00449, "hmean-iou:recall": 0.72807, "hmean-iou:precision": 0.93258, "hmean-iou:hmean": 0.81773}
{"mode": "train", "epoch": 80, "iter": 5, "lr": 0.00445, "memory": 6697, "data_time": 1.90082, "loss_prob": 0.81433, "loss_db": 0.14039, "loss_thr": 0.41021, "loss": 1.36494, "time": 3.28465}
{"mode": "train", "epoch": 80, "iter": 10, "lr": 0.00445, "memory": 6697, "data_time": 0.26379, "loss_prob": 0.77511, "loss_db": 0.13023, "loss_thr": 0.4288, "loss": 1.33414, "time": 1.25493}
{"mode": "train", "epoch": 80, "iter": 15, "lr": 0.00445, "memory": 6697, "data_time": 0.29805, "loss_prob": 1.04423, "loss_db": 0.1674, "loss_thr": 0.44752, "loss": 1.65915, "time": 1.29785}
{"mode": "train", "epoch": 80, "iter": 20, "lr": 0.00445, "memory": 6697, "data_time": 0.2231, "loss_prob": 0.79777, "loss_db": 0.13595, "loss_thr": 0.41598, "loss": 1.3497, "time": 1.23199}
{"mode": "val", "epoch": 80, "iter": 80, "lr": 0.00445, "hmean-iou:recall": 0.81579, "hmean-iou:precision": 0.80172, "hmean-iou:hmean": 0.8087}
{"mode": "train", "epoch": 81, "iter": 5, "lr": 0.00442, "memory": 6697, "data_time": 1.99578, "loss_prob": 0.98923, "loss_db": 0.16665, "loss_thr": 0.42496, "loss": 1.58084, "time": 3.14785}
{"mode": "train", "epoch": 81, "iter": 10, "lr": 0.00442, "memory": 6697, "data_time": 0.22379, "loss_prob": 0.95486, "loss_db": 0.16162, "loss_thr": 0.43368, "loss": 1.55015, "time": 1.23538}
{"mode": "train", "epoch": 81, "iter": 15, "lr": 0.00442, "memory": 6697, "data_time": 0.37464, "loss_prob": 0.79042, "loss_db": 0.14499, "loss_thr": 0.40032, "loss": 1.33574, "time": 1.41556}
{"mode": "train", "epoch": 81, "iter": 20, "lr": 0.00442, "memory": 6697, "data_time": 0.22999, "loss_prob": 0.80191, "loss_db": 0.13728, "loss_thr": 0.39115, "loss": 1.33035, "time": 1.23056}
{"mode": "val", "epoch": 81, "iter": 80, "lr": 0.00442, "hmean-iou:recall": 0.76316, "hmean-iou:precision": 0.85294, "hmean-iou:hmean": 0.80556}
{"mode": "train", "epoch": 82, "iter": 5, "lr": 0.00439, "memory": 6697, "data_time": 1.88433, "loss_prob": 0.8251, "loss_db": 0.14148, "loss_thr": 0.41003, "loss": 1.37661, "time": 3.11013}
{"mode": "train", "epoch": 82, "iter": 10, "lr": 0.00439, "memory": 6697, "data_time": 0.52942, "loss_prob": 0.76283, "loss_db": 0.12862, "loss_thr": 0.42403, "loss": 1.31549, "time": 1.52728}
{"mode": "train", "epoch": 82, "iter": 15, "lr": 0.00439, "memory": 6697, "data_time": 0.24408, "loss_prob": 0.77767, "loss_db": 0.13427, "loss_thr": 0.3993, "loss": 1.31124, "time": 1.20119}
{"mode": "train", "epoch": 82, "iter": 20, "lr": 0.00439, "memory": 6697, "data_time": 0.23964, "loss_prob": 0.89766, "loss_db": 0.16041, "loss_thr": 0.41673, "loss": 1.4748, "time": 1.22605}
{"mode": "val", "epoch": 82, "iter": 80, "lr": 0.00439, "hmean-iou:recall": 0.81579, "hmean-iou:precision": 0.80172, "hmean-iou:hmean": 0.8087}

Tomhouxin avatar Jun 10 '22 03:06 Tomhouxin
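
A log like the one above is easier to judge once it is reduced to one line per epoch. The following is a minimal sketch, assuming each record is a standalone JSON object on its own line; the helper names `hmean` and `summarize` are illustrative and not part of MMOCR, and the sample records are copied from the log with some fields trimmed for brevity. It averages the training loss per epoch, collects the validation hmean, and recomputes hmean as the harmonic mean of precision and recall as a sanity check.

```python
import json
from collections import defaultdict

# Records copied from the log above, with some fields trimmed for brevity.
SAMPLE_LOG = """
{"mode": "val", "epoch": 69, "hmean-iou:recall": 0.76316, "hmean-iou:precision": 0.84466, "hmean-iou:hmean": 0.80184}
{"mode": "train", "epoch": 70, "iter": 5, "loss": 1.45953}
{"mode": "train", "epoch": 70, "iter": 10, "loss": 1.24684}
{"mode": "train", "epoch": 70, "iter": 15, "loss": 1.32331}
{"mode": "train", "epoch": 70, "iter": 20, "loss": 1.40436}
{"mode": "val", "epoch": 70, "hmean-iou:recall": 0.7807, "hmean-iou:precision": 0.84762, "hmean-iou:hmean": 0.81279}
"""


def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def summarize(lines):
    """Return (mean train loss per epoch, validation hmean per epoch)."""
    train_losses = defaultdict(list)
    val_hmean = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("mode") == "train":
            train_losses[record["epoch"]].append(record["loss"])
        elif record.get("mode") == "val":
            val_hmean[record["epoch"]] = record["hmean-iou:hmean"]
    mean_loss = {epoch: sum(v) / len(v) for epoch, v in train_losses.items()}
    return mean_loss, val_hmean


mean_loss, val_hmean = summarize(SAMPLE_LOG.splitlines())
for epoch in sorted(val_hmean):
    loss = mean_loss.get(epoch)
    loss_text = f"{loss:.5f}" if loss is not None else "n/a"
    print(f"epoch {epoch}: mean train loss {loss_text}, val hmean {val_hmean[epoch]:.5f}")
```

Reading the full log file line by line into `summarize` instead of `SAMPLE_LOG` gives the whole trend. With only four logged training iterations per epoch, the per-epoch loss is noisy, so a plateau is easier to judge over many epochs than over a dozen.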

Empirically, ~80 epochs might not be sufficient. You can keep training it for another few hundred epochs. Check out the log here

gaotongxiao avatar Jun 10 '22 09:06 gaotongxiao

image But the learning rate has already dropped very low.

Tomhouxin avatar Jun 10 '22 09:06 Tomhouxin
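
How far the learning rate has decayed depends on the schedule. The sketch below assumes a polynomial ("poly") decay with a base rate of 0.007, power 0.9, a floor of 1e-7 and 200 total epochs. None of these settings is stated in this thread; they are an inference, chosen because they match the `lr` values in the log above to within rounding. The function name `poly_lr` is illustrative.

```python
def poly_lr(epoch_index, base_lr=0.007, max_epochs=200, power=0.9, min_lr=1e-7):
    """Polynomial decay from base_lr down to min_lr.

    epoch_index is zero-based, so the epoch logged as N uses index N - 1.
    All default values are assumptions, not settings confirmed by the thread.
    """
    coeff = (1 - epoch_index / max_epochs) ** power
    return (base_lr - min_lr) * coeff + min_lr


# Compare against the "lr" field logged at epochs 69, 70 and 82.
for logged_epoch in (69, 70, 82):
    print(logged_epoch, round(poly_lr(logged_epoch - 1), 5))

# Same epoch under a longer, 1200-epoch schedule (also an assumption).
print(round(poly_lr(81, max_epochs=1200), 5))
```

If the assumed settings are right, the rate at epoch 82 is still above half of its starting value, and raising `max_epochs` slows the decay further, which fits the advice to train for a few hundred more epochs.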

image image

And the detection results look like this.

Tomhouxin avatar Jun 10 '22 10:06 Tomhouxin

The result looks reasonable to me. If you are striving for higher performance, consider trying different data augmentation techniques, adding more data (e.g. CTW1500), and using different learning strategies. We cannot give concrete suggestions, as the actual situation varies from case to case.

gaotongxiao avatar Jun 10 '22 10:06 gaotongxiao
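
With polygon annotations for curved text, any geometric augmentation has to move the polygon vertices together with the pixels. The following is a minimal sketch of a horizontal flip in plain Python, assuming the image is a list of pixel rows and each polygon is a list of (x, y) vertices in pixel coordinates. The function name `hflip_with_polygons` is hypothetical; a real pipeline would normally rely on the augmentation transforms of the training framework instead.

```python
def hflip_with_polygons(image_rows, polygons):
    """Flip an image left-to-right and mirror its polygon annotations.

    image_rows: list of rows, each a list of pixel values.
    polygons: list of polygons, each a list of (x, y) vertices.
    """
    width = len(image_rows[0])
    # Reverse every row to mirror the pixels.
    flipped_rows = [list(reversed(row)) for row in image_rows]
    # Mirror x around the vertical centre line; y is unchanged.
    flipped_polygons = [
        [(width - 1 - x, y) for x, y in polygon] for polygon in polygons
    ]
    return flipped_rows, flipped_polygons


# Toy 2x4 "image" and one rectangle covering columns 0..2.
image = [[0, 1, 2, 3], [4, 5, 6, 7]]
polygon = [(0, 0), (2, 0), (2, 1), (0, 1)]
flipped_image, flipped_polygons = hflip_with_polygons(image, [polygon])
print(flipped_image)
print(flipped_polygons)
```

Mirroring reverses the winding direction of the vertices. If later steps expect a particular orientation, the vertex list needs to be reversed after the flip.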

Ok, Thank you!!!

Tomhouxin avatar Jun 10 '22 10:06 Tomhouxin