DAVAR-Lab-OCR
Need tips on how to visualize GPMA masks
Hi there! I need some help visualizing the GPMA masks. When I visualize the LPMA masks, they look perfectly normal, as shown below.
LPMA Horizontal Mask
LPMA Vertical Mask
But the visualizations of the GPMA masks are incomprehensible.
GPMA Horizontal Mask
GPMA Vertical Mask
To visualize the GPMA masks, I first retrieve the global masks from result[3] of simple_test.py (result[3][1] and result[3][2] are the horizontal and vertical masks respectively), then multiply them by 255. I also tried multiplying the masks by the cell box (result[3][0]), but the result is still messy.
Hope to get some tips on how to improve the visualization of the GPMA masks. Thanks!
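For anyone reproducing this, here is a minimal sketch of one way to turn a mask into a displayable image. It assumes each mask is a 2-D float array whose value range is unknown, which this thread does not confirm. Multiplying by 255 only looks right when the values lie in [0, 1], so the sketch uses min-max scaling instead. The helper name `mask_to_uint8` is hypothetical and is not part of DAVAR-Lab-OCR.

```python
import numpy as np


def mask_to_uint8(mask):
    """Min-max scale a 2-D float mask into the 0..255 range for display.

    Hypothetical helper for illustration; not part of DAVAR-Lab-OCR.
    """
    mask = np.asarray(mask, dtype=np.float32)
    lo, hi = float(mask.min()), float(mask.max())
    # A constant mask has no contrast to show; return an all-black image.
    if hi - lo < 1e-12:
        return np.zeros(mask.shape, dtype=np.uint8)
    scaled = (mask - lo) / (hi - lo)
    return np.round(scaled * 255.0).astype(np.uint8)


# Assumed usage, following the result layout described in the post above:
#   horizontal_img = mask_to_uint8(result[3][1])
#   vertical_img = mask_to_uint8(result[3][2])
```

If the scaled image still looks like noise, the value range is probably not the cause, and the issue is more likely in the predictions themselves.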
It looks like the model has not been trained well. Can you check whether the prediction target of the model is correct (L163 in pipelines/gpma_data.py)?
The model is the one pretrained on PubTabNet and provided by the authors, and the image visualized is also a test image from PubTabNet. I tried visualizing a training image as well, but got the same result.
Hi! Did you manage to solve this problem? I am having the same issue. The global segmentation doesn't seem to work properly. I understand that this task mainly exists to improve the local tasks. However, if the global segmentation doesn't work, I would expect a decrease in performance in the local segmentations and pyramids as well.