
Does CTUNet's masked self-attention mechanism in the graph attention network use the neighbour relations from the Structural Graph Construction part?

Open Arlen-yuzu opened this issue 7 months ago • 1 comment

Arlen-yuzu avatar Nov 29 '23 07:11 Arlen-yuzu

From the open-sourced CTUNet code, I can only see this call:

    multimodal_context, batched_img_label, batched_img_bieo_label, bert_token_embeddings = \
        self.infor_context_module(info_feat_list, pos_feat=gt_bboxes[0], img_metas=img_metas,
                                  info_labels=None, bieo_labels=None, gt_texts=gt_texts[0],
                                  char_nums=char_nums)

I can't see any parameter that carries graph neighbourhood information. All I can find is that the code uses a BERT encoder to execute the masked self-attention mechanism.
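To make clear what I expected to find, here is a minimal sketch of self-attention masked by graph neighbours. It is not taken from CTUNet or DAVAR-Lab-OCR; `neighbour_mask` and `masked_self_attention` are hypothetical names, the edge list stands in for whatever the Structural Graph Construction step would produce, and learned projections are left out for brevity.

```python
import math


def neighbour_mask(num_nodes, edges):
    """Return mask[i][j] == True when node i may attend to node j.

    Assumption: edges are undirected and every node attends to itself.
    """
    mask = [[i == j for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        mask[i][j] = True
        mask[j][i] = True
    return mask


def masked_self_attention(features, mask):
    """Single-head scaled dot-product self-attention restricted by `mask`.

    Queries, keys and values are the raw node features (no learned weights).
    Returns (outputs, attention_weights).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, all_weights = [], []
    for i, query in enumerate(features):
        # Non-neighbours get a score of -inf, so their softmax weight is 0.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            if mask[i][j] else float("-inf")
            for j, key in enumerate(features)
        ]
        max_score = max(scores)  # finite, because of the self-loop
        exps = [math.exp(s - max_score) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, features)) for d in range(dim)]
        )
        all_weights.append(weights)
    return outputs, all_weights


# Three text nodes; only nodes 0 and 1 are neighbours in the graph.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = neighbour_mask(3, [(0, 1)])
out, attn = masked_self_attention(feats, mask)
```

With a mask like this, node 2 has no neighbours and only attends to itself. In the released code I would expect something equivalent (an adjacency matrix or an attention mask built from it) to be passed into the context module, which is what I cannot find.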
