luzai
I find that it is related to [here](https://github.com/Zhongdao/gcn_clustering/blob/8f1e5e37f07b53c1350f8784f7c3ac3e5ca31405/feeder/feeder.py#L82). On the test dataset, there may exist some nodes that are not connected to any other node, causing A/D to contain 1/0 entries. May...
Thank you very much for your advice! It is a good idea to ignore the NaN linkages for large `k_at_hop`, and indeed `k_at_hop=[20,5]` is enough for high-performance clustering.
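For anyone else hitting this, here is a minimal sketch of the isolated-node issue and the masking idea (illustrative only, with a toy adjacency matrix; not the actual code in `feeder.py`):

```python
import numpy as np

# Toy adjacency matrix A; node 2 has no neighbors, so its degree is 0
# and naive row normalization A / D would produce NaN entries.
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
D = A.sum(axis=1, keepdims=True)  # per-node degree

# Divide only where the degree is positive; rows of isolated nodes
# stay all-zero instead of becoming NaN, i.e. their linkage is ignored.
A_norm = np.divide(A, D, out=np.zeros_like(A), where=D > 0)
```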
Hi @cfifty , may I ask why not replace it with `loss = tf.random.shuffle(loss)`?
Thank you very much for your detailed explanation! 1. The loss list is not shuffled when using `tf.random.shuffle(loss)`. The reason, I think, is that `tf.random.shuffle` is not an in-place operation, and...
I am sorry, I made a mistake: the gradient operation is still not defined for `loss = tf.random.shuffle(loss)` in `tf 1.15.3`. We should consider using `loss = tf.gather(loss, tf.random.shuffle(tf.range(tf.shape(loss)[0])))` instead.
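For anyone hitting the same error, a minimal sketch of the index-shuffle workaround (TF 1.x semantics; the `x` / `loss` tensors here are toy stand-ins):

```python
import tensorflow as tf  # TF 1.x

# Toy per-example losses standing in for the real `loss` tensor.
x = tf.Variable([1.0, 2.0, 3.0])
loss = x * x

# tf.random.shuffle has no gradient registered in TF 1.15, so shuffle
# indices and gather instead; tf.gather is differentiable w.r.t. its
# params, so the permuted losses still backpropagate to `x`.
perm = tf.random.shuffle(tf.range(tf.shape(loss)[0]))
shuffled_loss = tf.gather(loss, perm)

grads = tf.gradients(tf.reduce_sum(shuffled_loss), x)  # now well-defined
```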
I implemented a naive version of PCGrad in PyTorch:

```python
import torch
from torch.optim.optimizer import Optimizer
import numpy as np
from torch.utils.cpp_extension import load

@torch.no_grad()
def...
```
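For context, the core projection step of PCGrad looks roughly like this (a standalone sketch, not the full optimizer above; `grads` is a hypothetical list of flattened per-task gradient tensors):

```python
import torch

def pcgrad_project(grads):
    """PCGrad's conflict resolution: project each task gradient onto
    the normal plane of any other task gradient it conflicts with.
    `grads`: list of flattened per-task gradient tensors."""
    projected = [g.clone() for g in grads]
    for i, g_i in enumerate(projected):
        # Visit the other tasks in random order, as in the paper.
        for j in torch.randperm(len(grads)).tolist():
            if j == i:
                continue
            g_j = grads[j]
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # negative inner product => conflicting gradients
                g_i -= (dot / g_j.norm() ** 2) * g_j  # in-place projection
    return projected
```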
I think it may be related to the task the method is applied to; for example, on the CIFAR-100 classification task, momentum=0.9, weight_decay=1e-4, mopt_weit=1 may be suitable. This naive version is relatively slow, because...
May I ask what the optimal hyperparameters are? I tried `xi=1e-6, eps=8, batch_size=128`; performance is better, with a test accuracy of 60.74%.
Hi @mk-minchul , could you share your solution with DistributedDataParallel? It would be quite helpful for increasing the batch size and speeding up the search process with multiple GPUs. Thank you very much!
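In case it helps others, a generic sketch of wrapping a model in DistributedDataParallel (standard PyTorch API, not @mk-minchul's actual solution; `setup_ddp` is a hypothetical helper, one process per GPU as launched by torchrun):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp(model):
    """Wrap `model` for multi-GPU training; LOCAL_RANK is set by the
    launcher (e.g. torchrun), one process per GPU."""
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = model.cuda(local_rank)
    return DDP(model, device_ids=[local_rank])
```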
Hi youkaichao, Thank you so much for your reply! Do you mean "if you evaluate LogME scores on the pretrained models before finetuning, they would have similar LogME scores"? I...