
Trainers defined without scripts or README in outputs

Open buaalyx opened this issue 2 years ago • 7 comments

For example, the DeepWalk and HeGAN trainers are defined, but there seems to be no command for them in scripts/run_experiments.py and no README in outputs/. I don't know which datasets are supported or how to run the models.

buaalyx avatar Apr 19 '22 13:04 buaalyx

Hello. First, DeepWalk is a homogeneous graph algorithm, so it will not be included in OpenHGNN. DeepWalk appears in the trainers only because metapath2vec uses the same training procedure as DeepWalk. If you are interested in DeepWalk, refer to the implementation in DGL. Second, HeGAN has now been uploaded. However, our developers could not reproduce the reported performance even when using the author's source code, so our implementation can at best match what the author's source code achieves.

Theheavens avatar Apr 22 '22 14:04 Theheavens

In sample_graph_for_dis in HeGAN_trainer.py, the comment states that the function returns three graphs (pos_hg, neg_hg1 and neg_hg2), but the actual return values are pos_hg, pos_hg1 and pos_hg2, and pos_hg2 looks the same as pos_hg.

buaalyx avatar Apr 27 '22 14:04 buaalyx

Besides, I have a question about args.meta_path_key. Do herec and mp2vec use only one meta-path? Many datasets contain multiple meta-paths in their meta_paths_dict (e.g. dblp4MAGNN has 'APVPA' and 'APA'), but in config.ini or config.py the argument for these two models has only one value, namely 'APVPA'. Will the other meta-paths be used during training? How can I use multiple meta-paths?

buaalyx avatar Apr 27 '22 17:04 buaalyx

Good question!

  1. It seems that the mp2vec experiments use only one meta-path.
  2. Section 4.2.2, "Setting the Fusion Function", of the herec paper offers three functions for fusing different meta-paths, which make the training end-to-end. For generality, we provide only the embedding training and do not include the fusion training.
  3. For now, we recommend setting a single meta-path for training; the other meta-paths will not be used.
  4. How to use multiple meta-paths? The most direct way is to concatenate the embeddings of the different meta-paths. There are also more advanced algorithms, like the fusion function in Herec and the HEAD.
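
The direct concatenation approach in point 4 could be sketched roughly as follows. This is only an illustration, not OpenHGNN code: the `embeddings` dictionary, its values, and the helper `concat_meta_path_embeddings` are hypothetical, and it assumes you have already trained one embedding per meta-path in separate runs.

```python
# Hypothetical per-meta-path embeddings: {meta_path: {node_id: vector}}.
# In practice each inner dict would come from a separate training run
# with a different meta_path_key; the numbers here are made up.
embeddings = {
    "APA":   {0: [0.1, 0.2], 1: [0.3, 0.4]},
    "APVPA": {0: [0.5, 0.6], 1: [0.7, 0.8]},
}


def concat_meta_path_embeddings(embeddings, meta_paths):
    """Concatenate each node's vectors, following the given meta-path order."""
    fused = {}
    for node in embeddings[meta_paths[0]]:
        vec = []
        for mp in meta_paths:
            vec.extend(embeddings[mp][node])
        fused[node] = vec
    return fused


fused = concat_meta_path_embeddings(embeddings, ["APA", "APVPA"])
```

Each fused vector has the sum of the per-meta-path dimensions and can be passed to a downstream classifier as usual.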

Theheavens avatar Apr 28 '22 03:04 Theheavens

In sample_graph_for_dis in HeGAN_trainer.py, the comment states that the function returns three graphs (pos_hg, neg_hg1 and neg_hg2), but the actual return values are pos_hg, pos_hg1 and pos_hg2, and pos_hg2 looks the same as pos_hg.

neg_hg2 is a negative-sampled graph whose node embeddings are fake ones generated by the Generator, but whose adjacency matrix is real. So its adjacency matrix can be the same as that of pos_hg, and the node embeddings are assigned HERE.
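
As a rough illustration of that idea (not the actual OpenHGNN or HeGAN code), the toy sketch below builds a positive and a negative sample that share the same edges and differ only in node embeddings. The names `edges`, `real_emb` and `fake_embedding` are hypothetical, and the random vectors merely stand in for the Generator's output.

```python
import random

# Real edges (src, dst), shared by the positive and the negative sample.
edges = [(0, 1), (1, 2)]

# Made-up "real" node embeddings.
real_emb = {0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.5, 0.6]}


def fake_embedding(dim, rng):
    """Stand-in for the Generator: returns a random vector of length dim."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
pos_sample = {"edges": edges, "emb": real_emb}
neg_sample = {
    "edges": edges,  # same adjacency as the positive sample
    "emb": {node: fake_embedding(2, rng) for node in real_emb},
}
```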

clearhanhui avatar Apr 28 '22 11:04 clearhanhui

Thanks for the reply. But when I try to run HeGAN on my own dataset, at this part of sample_graph_for_dis:

for nt in self.hg_dict.keys():
    for src in self.hg_dict[nt].keys():
        for i in range(self.k):

I ran into another problem:

  File "/root/Downloads/lyx/heter/hgt/base_methods/OpenHGNN-main/openhgnn/trainerflow/HeGAN_trainer.py", line 57, in sample_graph_for_dis
    dst = random.choice(self.hg_dict[nt][src][et])
  File "/usr/local/anaconda3/envs/d80/lib/python3.7/random.py", line 261, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

In the end I found that self.hg_dict[nt][src][et] is an empty tensor([]). Since nt and src are keys of hg_dict, I guess the parameter k may not be compatible with my dataset? I don't know what k means. How should I set its value for my own dataset?

buaalyx avatar May 01 '22 18:05 buaalyx

Q1: The meaning of k
A1: The parameter self.k here is the number of samples; a similar implementation can be found in the author's code.

Q2: IndexError
A2: My guess is that your dataset contains nodes that have no neighbors for some edge type. I suggest simply skipping the empty tensor with an if statement: if len(self.hg_dict[nt][src][et]) == 0: continue
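
A minimal, self-contained sketch of that guard is shown below. It is not the trainer's actual code: the toy `hg_dict` only mimics the `[nt][src][et]` indexing seen in the traceback, the node and edge names are made up, and the loop order is simplified for illustration.

```python
import random

# Toy stand-in for self.hg_dict: {node_type: {src_id: {edge_type: [dst_ids]}}}
hg_dict = {
    "author": {
        0: {"writes": [10, 11]},
        1: {"writes": []},  # this node has no neighbors for "writes"
    }
}
k = 2  # number of samples drawn per (src, edge_type)


def sample_pairs(hg_dict, k, rng=random):
    """Draw k destinations per (src, edge_type), skipping empty neighbor lists."""
    pairs = []
    for nt in hg_dict:
        for src in hg_dict[nt]:
            for et, dsts in hg_dict[nt][src].items():
                if len(dsts) == 0:
                    continue  # random.choice would raise IndexError here
                for _ in range(k):
                    pairs.append((src, et, rng.choice(dsts)))
    return pairs


pairs = sample_pairs(hg_dict, k)
```

With this guard, node 1 contributes no samples instead of raising IndexError, so the total number of sampled pairs can be smaller than k times the number of source nodes.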

clearhanhui avatar May 04 '22 09:05 clearhanhui