Why is `input_id` not contained by `n_id`?

Open smiles724 opened this issue 11 months ago • 4 comments

Hi, thanks for sharing the benchmark.

In the rel-attendance data, what is the relationship between `input_id` and `n_id` in the heterogeneous graph sampled by NeighborLoader? From my understanding, `input_id` corresponds to the global index of the input nodes, while `n_id` corresponds to the global node index of every sampled node. Therefore, `n_id` should contain `input_id`. However, when I printed them out, this was not the case.

Can you please give me some hints to understand this?

smiles724 avatar Dec 20 '24 22:12 smiles724

@rusty1s can you please take a look?

rishabh-ranjan avatar Dec 21 '24 04:12 rishabh-ranjan

input_id refers to the ID of the training table, i.e., the row of the training table the example subgraph was sampled from.
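To make the distinction concrete, here is a minimal pure-Python sketch of why `input_id` and `n_id` live in different index spaces. It is a toy illustration, not the NeighborLoader implementation: the table contents and the names `train_table_entity_idx`, `seed_nodes`, and `sampled_neighbors` are hypothetical.

```python
# Toy illustration only: names and values are hypothetical,
# not relbench or PyG internals.

# A training table with 5 rows; row i stores the global node index
# of the entity (e.g. a user) that the row refers to.
train_table_entity_idx = [42, 7, 42, 13, 99]

# Suppose a mini-batch is drawn from training-table rows 1, 2 and 4.
input_id = [1, 2, 4]  # row positions in the training table

# The seed nodes are looked up through the table ...
seed_nodes = [train_table_entity_idx[i] for i in input_id]

# ... and the sampler then adds further nodes reached from the seeds.
sampled_neighbors = [5, 13]  # made-up extra entity nodes
n_id = seed_nodes + sampled_neighbors  # global node indices in the subgraph

print(input_id)  # [1, 2, 4]
print(n_id)      # [7, 42, 99, 5, 13]
```

Under these assumptions, `input_id` indexes rows of the training table while `n_id` indexes nodes of the graph, so there is no reason for the values of one to appear in the other.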

rusty1s avatar Dec 21 '24 10:12 rusty1s

> input_id refers to the ID of the training table, i.e., the row of the training table the example subgraph was sampled from.

Matthias, thanks for your immediate reply! I have two more questions.

(1) Since `input_id` refers to the training table's ID, can we access `input_id` for other tables as well, rather than only for `entity_table`? It seems that in rel-attendance, `input_id` only exists for users.

(2) I am confused about the GNN model's forward function. Since you directly slice `x_dict[entity_table][: seed_time.size(0)]` for the final output prediction, does that mean the first `seed_time.size(0)` entities of the `entity_table` are the target entities that correspond to the seed times?

Look forward to your response!

smiles724 avatar Dec 21 '24 21:12 smiles724

  1. It only exists for the entity table, since this is the seed table that sampling starts from.
  2. Yes
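
The slicing confirmed in the second answer can be pictured with a small pure-Python sketch. Plain lists stand in for tensors, all values are made up, and `entity_embeddings` is a hypothetical stand-in for `x_dict[entity_table]`; the only assumption taken from the thread is that the seed entities occupy the first `seed_time.size(0)` rows.

```python
# Toy illustration only; plain lists stand in for tensors and
# every value here is made up.

seed_time = [100, 100, 105]  # one timestamp per seed entity in the batch

# Hypothetical stand-in for x_dict[entity_table]: seed rows come first,
# followed by other nodes of the same type reached during sampling.
entity_embeddings = [
    [0.1, 0.2],  # seed 0
    [0.3, 0.4],  # seed 1
    [0.5, 0.6],  # seed 2
    [0.7, 0.8],  # sampled neighbor
    [0.9, 1.0],  # sampled neighbor
]

# Equivalent of x_dict[entity_table][: seed_time.size(0)].
num_seeds = len(seed_time)
seed_embeddings = entity_embeddings[:num_seeds]

print(len(seed_embeddings))  # 3
```

Only the seed rows are kept for prediction; the remaining rows contribute through message passing but receive no output of their own.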

rusty1s avatar Dec 22 '24 07:12 rusty1s