DAVAR-Lab-OCR: Does CTUNet's masked self-attention mechanism in the graph attention network use the neighbour relations from the Structural Graph Construction part?

Open Arlen-yuzu opened this issue 7 months ago • 1 comments

Nov 29 '23 07:11 Arlen-yuzu

From the open-sourced CTUNet code, the only relevant call I can find is this:

    multimodal_context, batched_img_label, batched_img_bieo_label, bert_token_embeddings = \
        self.infor_context_module(info_feat_list,
                                  pos_feat=gt_bboxes[0],
                                  img_metas=img_metas,
                                  info_labels=None,
                                  bieo_labels=None,
                                  gt_texts=gt_texts[0],
                                  char_nums=char_nums)

I can't see any parameter that carries graph neighbourhood information. All I can find is that the code uses a BERT encoder to perform the masked self-attention.
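To make the question concrete, here is a minimal sketch of what neighbour-restricted self-attention could look like. It is not taken from the CTUNet code: the helper names `knn_adjacency` and `masked_self_attention`, the k-nearest-neighbour rule on box centres, and the single-head attention without learned projections are all assumptions made for illustration. The point is only that a neighbour relation would normally show up as an explicit N x N mask passed into the attention step.

```python
import numpy as np


def knn_adjacency(bboxes, k=2):
    """Hypothetical graph construction: boolean (N, N) matrix that is True
    where node j is one of the k nearest neighbours of node i (measured
    between box centres), plus self-loops. Boxes are (x1, y1, x2, y2)."""
    bboxes = np.asarray(bboxes, dtype=float)
    centres = np.stack([(bboxes[:, 0] + bboxes[:, 2]) / 2,
                        (bboxes[:, 1] + bboxes[:, 3]) / 2], axis=1)
    # Pairwise Euclidean distances between box centres.
    dist = np.linalg.norm(centres[:, None, :] - centres[None, :, :], axis=-1)
    n = len(bboxes)
    adj = np.eye(n, dtype=bool)  # every node may attend to itself
    for i in range(n):
        order = np.argsort(dist[i])
        neighbours = [j for j in order if j != i][:k]
        adj[i, neighbours] = True
    return adj


def masked_self_attention(x, adj):
    """Single-head scaled dot-product self-attention (assumption: no learned
    Q/K/V projections) in which node i can only attend to nodes j with
    adj[i, j] == True."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    # Non-neighbours get -inf so that softmax assigns them zero weight.
    scores = np.where(adj, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights


if __name__ == "__main__":
    boxes = [[0, 0, 1, 1], [1, 0, 2, 1], [3, 0, 4, 1], [10, 0, 11, 1]]
    feats = np.random.default_rng(0).normal(size=(4, 8))
    adj = knn_adjacency(boxes, k=1)
    out, weights = masked_self_attention(feats, adj)
    print(adj.astype(int))
    print(weights.round(3))
```

If CTUNet uses the graph in this way, some tensor playing the role of `adj` should be built from the boxes and handed to the attention layers; if only a padding mask reaches the BERT encoder, then every node attends to every other node.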
