
Something weird with the order of setting arguments when using best config

Open LspongebobJH opened this issue 3 years ago • 3 comments

https://github.com/BUPT-GAMMA/OpenHGNN/blob/143aa56fafa5d5858c0a0528126c679cb4f518e0/openhgnn/config.py#L232

Hi,

I found that Config first reads its arguments from config.ini and then uses best_config.py to overwrite them when --use_best_config is set. However, derived attributes like self.hidden_dim in config.py are computed while Config is still reading args from config.ini, and they do not change even after best_config.py modifies self.h_dim and self.num_heads, because best_config.py only runs inside OpenHGNN(). I think this is probably a bug, since a change to self.h_dim or self.num_heads should also change self.hidden_dim.
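A minimal Python sketch of what I mean (the values are made up; the real code reads them from config.ini):

```python
# Sketch of the ordering issue (not the actual OpenHGNN code).
# Derived attributes are frozen at Config() time, before best_config runs.

class Config:
    def __init__(self):
        # values read from config.ini
        self.h_dim = 8
        self.num_heads = 4
        # L232: derived attribute, computed immediately
        self.hidden_dim = self.h_dim * self.num_heads   # 32

config = Config()

# later, inside OpenHGNN(), best_config.py overwrites the base params ...
config.h_dim = 16
config.num_heads = 8

# ... but the derived attribute is stale:
print(config.hidden_dim)                 # still 32
print(config.h_dim * config.num_heads)   # 128, what it should be
```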

LspongebobJH avatar Mar 02 '22 08:03 LspongebobJH

In a way, it's a bug: self.hidden_dim = self.h_dim * self.num_heads. Maybe we should remove one of the parameters to fix it, or move the code at L232 to after the best configuration is applied. Originally I just wanted to avoid the integer-division problem (hidden_dim / num_heads). Do you have other solutions?
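A rough sketch of the second option, where apply_best_config is a hypothetical stand-in for the overwriting done in best_config.py:

```python
class Config:
    def __init__(self):
        self.h_dim = 8          # from config.ini
        self.num_heads = 4      # from config.ini

def apply_best_config(config):
    # stand-in for best_config.py overwriting the base params
    config.h_dim = 16
    config.num_heads = 8

def build_config(use_best_config):
    config = Config()
    if use_best_config:
        apply_best_config(config)
    # the former L232, now guaranteed to see the final values
    config.hidden_dim = config.h_dim * config.num_heads
    return config

print(build_config(True).hidden_dim)   # 128, consistent with h_dim * num_heads
```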

Theheavens avatar Mar 02 '22 08:03 Theheavens

I don't think we should remove one of the parameters. I'm not sure how these two are used in other models, but self.h_dim and self.hidden_dim mean different, important things in MAGNN: the former is the hidden dimension of each head, while the latter is the total hidden dimension across all heads. I see both parameters in other models' configs as well (probably those with an attention mechanism), so removing one might significantly affect some models.

It would probably be better to differentiate the sources from which we read config arguments, instead of using the best config to overwrite the original one. For example, read args from best_config.py when --use_best_config is set, and from config.ini otherwise.
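A rough sketch of this, assuming best_config.py exposes a dict keyed by model name (the real structure may differ) and simplifying the ini parsing:

```python
import configparser

# hypothetical stand-in for the contents of best_config.py
BEST_CONFIGS = {"MAGNN": {"h_dim": 16, "num_heads": 8}}

def load_args(model, use_best_config):
    if use_best_config and model in BEST_CONFIGS:
        # single source: best_config.py
        args = dict(BEST_CONFIGS[model])
    else:
        # single source: config.ini
        parser = configparser.ConfigParser()
        parser.read("config.ini")
        section = parser[model]
        args = {"h_dim": section.getint("h_dim"),
                "num_heads": section.getint("num_heads")}
    # derived values are computed once, after the source is chosen
    args["hidden_dim"] = args["h_dim"] * args["num_heads"]
    return args
```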

I'm not sure how important it is to give hidden_dim a single clear definition in the multi-head attention setting, i.e. the hidden dimension of each head versus the total across all heads. If we want to remove one of the parameters, we would first have to settle that definition and revise the model code that confuses the two concepts or uses both h_dim and hidden_dim. The method mentioned above might be more convenient for now.

LspongebobJH avatar Mar 02 '22 12:03 LspongebobJH

I think we do not need to specify three parameters; two of them are enough, since self.hidden_dim = self.h_dim * self.num_heads, so given any two, the third is determined. I think we should keep self.hidden_dim and self.num_heads: the preprocess module needs args.hidden_dim, so that parameter is necessary. Then, inside the models, we can replace self.h_dim with self.hidden_dim // self.num_heads. With this method we need to check that self.hidden_dim is divisible by self.num_heads.
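A minimal sketch of the divisibility check (the function name is illustrative):

```python
def head_size(hidden_dim: int, num_heads: int) -> int:
    """Derive the per-head dimension; replaces every use of self.h_dim."""
    if hidden_dim % num_heads != 0:
        raise ValueError(
            f"hidden_dim ({hidden_dim}) must be divisible by "
            f"num_heads ({num_heads})")
    return hidden_dim // num_heads

print(head_size(128, 8))   # 16
```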

We could rename self.h_dim to self.head_size, like the definition in dgl.gatconv: head_size means the hidden dimension of each head, while hidden_dim means the output dimension of each hidden layer.

We can keep these two parameters in config.ini and best_config.py; when best_config changes one of them, the other parameters will not be affected.

dddg617 avatar Mar 09 '22 08:03 dddg617