DAVAR-Lab-OCR
Does CTUNet's masked self-attention mechanism in the graph attention network use the neighbour relations from the Structural Graph Construction part?
In the open-source CTUNet code, I only see this:
multimodal_context, batched_img_label, batched_img_bieo_label, bert_token_embeddings = \
    self.infor_context_module(info_feat_list,
                              pos_feat=gt_bboxes[0],
                              img_metas=img_metas,
                              info_labels=None,
                              bieo_labels=None,
                              gt_texts=gt_texts[0],
                              char_nums=char_nums)
I can't see any parameter that carries graph neighbourhood information. All I can find is that the code uses a BERT encoder to execute the masked self-attention mechanism.
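To make clear what I was expecting to find, here is a minimal sketch of how a neighbour relation could be turned into an attention mask. It is not taken from the CTUNet code: the function names (`knn_adjacency`, `masked_attention_weights`), the k-nearest-neighbour rule on box centres, and the `(x1, y1, x2, y2)` box format are all my own assumptions, used only for illustration.

```python
import math


def bbox_center(box):
    # Assumed box format: (x1, y1, x2, y2).
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def knn_adjacency(boxes, k):
    """Hypothetical graph construction: each node is linked to itself
    and to its k nearest boxes (by centre distance). The result is a
    directed relation, so it is not necessarily symmetric."""
    centers = [bbox_center(b) for b in boxes]
    n = len(boxes)
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = True  # self-loop keeps every row non-empty
        dists = sorted(
            (math.dist(centers[i], centers[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = True
    return adj


def masked_attention_weights(scores, adj):
    """Row-wise softmax over raw attention scores, where positions that
    are not neighbours in `adj` are set to -inf and so get weight 0."""
    weights = []
    for row_scores, row_mask in zip(scores, adj):
        masked = [s if m else float("-inf") for s, m in zip(row_scores, row_mask)]
        top = max(masked)  # finite because of the self-loop
        exps = [math.exp(s - top) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Three text boxes: two close together, one far to the right.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (100, 0, 110, 10)]
adj = knn_adjacency(boxes, k=1)
scores = [[0.0] * 3 for _ in range(3)]  # uniform scores, for illustration
weights = masked_attention_weights(scores, adj)
```

With a mask like this, node 0 would attend only to itself and node 1, and never to node 2. I could not find an argument playing the role of `adj` in the call above, which is why I am asking whether the neighbour relation is actually used.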