VQA_ReGAT

The semantic_embedding and spatic_embedding types.

Open haoopan opened this issue 4 years ago • 4 comments

Hi, this is great work on VQA. I haven't downloaded the datasets, so I want to know the types of semantic_embedding and spatic_embedding: are they one-hot embeddings, word embeddings, or features extracted from a model? I'm looking forward to your reply, thanks!

haoopan avatar Jan 15 '21 14:01 haoopan

Hi, thanks for your interest in this work, and sorry for the late reply. They are one-hot embeddings.
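For readers wondering what that means concretely, here is a minimal sketch of one-hot encoding a relation type id. This is illustrative only, not the actual ReGAT code; `num_classes` and the example label are made up for the demonstration:

```python
import numpy as np

def one_hot(label, num_classes):
    """Return a vector of zeros with a single 1.0 at index `label`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[label] = 1.0
    return vec

# Hypothetical example: relation type 4 out of 15 semantic relation classes
print(one_hot(4, 15))  # 1.0 at index 4, zeros elsewhere
```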

linjieli222 avatar Jan 21 '21 05:01 linjieli222

[image] Dear scholar, does your code in pos_embedding.py produce the same type-id numbers as shown in the attached picture?

alice-cool avatar Apr 15 '21 02:04 alice-cool

Dear scholar, this is your code in pos_emb.py:

                    y_diff = center_y[i] - center_y[j]
                    x_diff = center_x[i] - center_x[j]
                    diag = math.sqrt((y_diff)**2 + (x_diff)**2)
                    if diag < 0.5 * image_diag:
                        sin_ij = y_diff/diag
                        cos_ij = x_diff/diag
                        if sin_ij >= 0 and cos_ij >= 0:
                            label_i = np.arcsin(sin_ij)
                            label_j = 2*math.pi - label_i
                        elif sin_ij < 0 and cos_ij >= 0:
                            label_i = np.arcsin(sin_ij)+2*math.pi
                            label_j = label_i - math.pi
                        elif sin_ij >= 0 and cos_ij < 0:
                            label_i = np.arccos(cos_ij)
                            label_j = 2*math.pi - label_i
                        else:
                            label_i = -np.arccos(sin_ij)+2*math.pi
                            label_j = label_i - math.pi
                        adj_matrix[i, j] = int(np.ceil(label_i/(math.pi/4)))+3
                        adj_matrix[j, i] = int(np.ceil(label_j/(math.pi/4)))+3

But I think if we follow the type-id numbering in the attached picture, the code should perhaps be the following:

                        if sin_ij >= 0 and cos_ij >= 0:# j is in the second Quadrant, i is the reference center
                            label_i = math.pi - np.arcsin(sin_ij)
                            label_j = 2*math.pi - np.arcsin(sin_ij)
                            print(math.degrees(label_i))
                            print(math.degrees(label_j))
                        elif sin_ij < 0 and cos_ij >= 0:#j is in  the third Quadrant, i is the reference center
                            label_i = -np.arcsin(sin_ij)+math.pi
                            label_j = np.arccos(cos_ij)
                            print(math.degrees(label_i))
                            print(math.degrees(label_j))
                        elif sin_ij >= 0 and cos_ij < 0: #j is in the first Quadrant, i is the reference center
                            label_i = np.arcsin(sin_ij)
                            label_j = math.pi + np.arcsin(sin_ij)
                            print(math.degrees(label_i))
                            print(math.degrees(label_j))
                        else:# j is in the fourth Quadrant, i is the reference center
                            label_i = np.arcsin(sin_ij)+2*math.pi
                            label_j = math.pi + np.arcsin(sin_ij)
                            print(math.degrees(label_i))
                            print(math.degrees(label_j))
                        adj_matrix[i, j] = int(np.ceil(label_i/(math.pi/4)))+3
                        adj_matrix[j, i] = int(np.ceil(label_j/(math.pi/4)))+3

For spatial relations, since we do not use their semantic meaning during graph attention, the order of the labels does not matter. But you are right, the labels are not exactly the same as the ones in the pictures.
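To make the binning in the quoted snippet easier to reason about: the direction from box i to box j is quantized into pi/4-wide sectors, and the sector index plus an offset of 3 becomes the spatial relation label. Below is a minimal self-contained sketch (not the repository code) that uses `math.atan2` to get the full-circle angle directly; the original code reconstructs the angle per quadrant, so exact labels at bin boundaries may differ:

```python
import math
import numpy as np

def angle_bin(y_diff, x_diff):
    """Quantize the direction (y_diff, x_diff) from box i to box j
    into one of eight pi/4-wide sectors, offset by 3 as in the
    quoted pos_emb.py snippet."""
    angle = math.atan2(y_diff, x_diff)  # in (-pi, pi]
    if angle <= 0:
        angle += 2 * math.pi            # map to (0, 2*pi]
    return int(np.ceil(angle / (math.pi / 4))) + 3

# Direction straight "up" (y_diff > 0, x_diff = 0): angle = pi/2,
# sector ceil(2.0) = 2, so the label is 2 + 3 = 5.
print(angle_bin(1.0, 0.0))  # -> 5
```

Note that the reverse direction j -> i differs by pi, i.e. by exactly four sectors (mod 8), which is why the snippet can fill `adj_matrix[i, j]` and `adj_matrix[j, i]` from the same pair of angles.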

linjieli222 avatar Apr 15 '21 17:04 linjieli222

[three images] 0: wearing, 1: holding, 2: sitting on, 3: standing on, 4: riding, 5: eating, 6: hanging from, 7: carrying, 8: attached to, 9: walking on, 10: playing, 11: covering, 12: lying on, 13: watching, 14: looking at. The predicted relations are 4: riding and 10: playing. I think it must be my error, but I don't know where the error is.

Remember that our semantic relation labels are predictions from a neural network, not ground-truth labels, so mistakes in the predictions are quite likely. Also, can you remind me where you got the label-to-relation mapping? It has been a while since I worked on this project; I just want to make sure we are on the same page.

linjieli222 avatar Apr 15 '21 18:04 linjieli222