luzai
I find that it is related to [here](https://github.com/Zhongdao/gcn_clustering/blob/8f1e5e37f07b53c1350f8784f7c3ac3e5ca31405/feeder/feeder.py#L82). On the test dataset, there may exist some nodes that are not connected to any other node, causing A/D to contain 1/0 entries. May...
Thank you very much for your advice! It is a good idea to ignore the NaN linkages for large `k_at_hop`, and indeed `k_at_hop=[20,5]` is enough for high-performance clustering.
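For anyone else hitting this, here is a minimal sketch of the isolated-node issue and the masking idea (illustrative only, with a toy adjacency matrix; not the actual code in `feeder.py`):

```python
import numpy as np

# Toy adjacency matrix A; node 2 has no neighbors, so its degree is 0
# and naive row normalization A / D would produce NaN entries.
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
D = A.sum(axis=1, keepdims=True)  # per-node degree

# Divide only where the degree is positive; rows of isolated nodes
# stay all-zero instead of becoming NaN, i.e. their linkage is ignored.
A_norm = np.divide(A, D, out=np.zeros_like(A), where=D > 0)
```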
Hi @cfifty , may I ask why not replace it with `loss = tf.random.shuffle(loss)`?
Thank you very much for your detailed explanation! 1. The loss list is not shuffled when using `tf.random.shuffle(loss)`. The reason, I think, is that `tf.random.shuffle` is not an in-place operation, and...
I am sorry, I made a mistake: the gradient operation is still not defined for `loss = tf.random.shuffle(loss)` in `tf 1.15.3`. We should consider using `loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0])))` instead.
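For anyone hitting the same error, a minimal sketch of the index-shuffle workaround (TF 1.x semantics; the `x` / `loss` tensors here are toy stand-ins):

```python
import tensorflow as tf  # TF 1.x

# Toy per-example losses standing in for the real `loss` tensor.
x = tf.Variable([1.0, 2.0, 3.0])
loss = x * x

# tf.random.shuffle has no gradient registered in TF 1.15, so shuffle
# indices and gather instead; tf.gather is differentiable w.r.t. its
# params, so the permuted losses still backpropagate to `x`.
perm = tf.random.shuffle(tf.range(tf.shape(loss)[0]))
shuffled_loss = tf.gather(loss, perm)

grads = tf.gradients(tf.reduce_sum(shuffled_loss), x)  # now well-defined
```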
I implemented a naive version of PCGrad in PyTorch:

```python
import torch
from torch.optim.optimizer import Optimizer
import numpy as np
from torch.utils.cpp_extension import load

@torch.no_grad()
def...
```
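For context, the core projection step of PCGrad looks roughly like this (a standalone sketch, not the full optimizer above; `grads` is a hypothetical list of flattened per-task gradient tensors):

```python
import torch

def pcgrad_project(grads):
    """PCGrad's conflict resolution: project each task gradient onto
    the normal plane of any other task gradient it conflicts with.
    `grads`: list of flattened per-task gradient tensors."""
    projected = [g.clone() for g in grads]
    for i, g_i in enumerate(projected):
        # Visit the other tasks in random order, as in the paper.
        for j in torch.randperm(len(grads)).tolist():
            if j == i:
                continue
            g_j = grads[j]
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # negative inner product => conflicting gradients
                g_i -= (dot / g_j.norm() ** 2) * g_j  # in-place projection
    return projected
```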
I think it may be related to the task the method is applied to; for example, on the CIFAR-100 classification task, momentum=0.9, weight_decay=1e-4, mopt_weit=1 may be suitable. This naive version is relatively slow, because...
May I ask what the optimal hyperparameters are? I tried `xi=1e-6, eps=8, batch_size=128`; performance is better, with a test accuracy of 60.74%.
Hi @mk-minchul , could you share your solution with DistributedDataParallel? It would be quite helpful for increasing the batch size and speeding up the search process with multiple GPUs. Thank you very much!
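In case it helps others, a generic sketch of wrapping a model in DistributedDataParallel (standard PyTorch API, not @mk-minchul's actual solution; `setup_ddp` is a hypothetical helper, one process per GPU as launched by torchrun):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp(model):
    """Wrap `model` for multi-GPU training; LOCAL_RANK is set by the
    launcher (e.g. torchrun), one process per GPU."""
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = model.cuda(local_rank)
    return DDP(model, device_ids=[local_rank])
```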
Hi youkaichao, Thank you so much for your reply! Do you mean "if you evaluate LogME scores on the pretrained models before finetuning, they would have similar LogME scores"? I...