
Why is the accuracy so low?

XueBaolu opened this issue 2 years ago · 9 comments

Excuse me, I copied your code into my own experimental environment. First, the virtual data is generated by upsampling. Then it is loaded into a Dataset and normalized with mean=(0.5, 0.5, 0.5) and std=(0.25, 0.25, 0.25), and the virtual dataloader is built from that dataset with batch_size=64 and batch_num_per_class=20. The real dataset, CIFAR-10, is normalized with mean=[0.5071, 0.4865, 0.4409] and std=[0.2673, 0.2564, 0.2762] as in your code, with batch_size=128. I set the weight of the virtual-data loss to 1 and the weight of the proxy-align loss to 0.5. Why is the accuracy only 0.2 at round 100? Did I miss something? Could you help me debug?

[Screenshot 2023-06-04 10 09 54]
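For reference, a minimal sketch of the pipeline described above, assuming PyTorch. The random virtual tensors, the model's `(features, logits)` interface, and the prototype-MSE alignment term are placeholders for illustration, not the VHL repo's actual implementation:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from torchvision import datasets, transforms

# Hypothetical stand-in for the generated virtual data: random images in [0, 1],
# normalized with mean 0.5 / std 0.25 per channel as described above.
# 12800 samples = 10 classes * batch_num_per_class (20) * batch_size (64).
virtual_x = (torch.rand(12800, 3, 32, 32) - 0.5) / 0.25
virtual_y = torch.arange(10).repeat(1280)
virtual_loader = DataLoader(TensorDataset(virtual_x, virtual_y),
                            batch_size=64, shuffle=True)

# CIFAR-10 with the per-channel statistics quoted above.
cifar_tf = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5071, 0.4865, 0.4409],
                         std=[0.2673, 0.2564, 0.2762]),
])
real_loader = DataLoader(
    datasets.CIFAR10("./data", train=True, download=True, transform=cifar_tf),
    batch_size=128, shuffle=True)

def vhl_loss(model, real_batch, virtual_batch,
             noise_weight=1.0, align_weight=0.5):
    """CE on real data + noise_weight * CE on virtual data
    + align_weight * an alignment term, sketched here as class-prototype MSE.
    The paper's proxy-align loss is a contrastive objective, so this only
    illustrates the structure of the total loss."""
    xr, yr = real_batch
    xv, yv = virtual_batch
    feat_r, logits_r = model(xr)   # assumes the model returns (features, logits)
    feat_v, logits_v = model(xv)
    loss = (F.cross_entropy(logits_r, yr)
            + noise_weight * F.cross_entropy(logits_v, yv))
    align = xr.new_zeros(())
    for c in yr.unique():
        if (yv == c).any():
            align = align + F.mse_loss(feat_r[yr == c].mean(0),
                                       feat_v[yv == c].mean(0))
    return loss + align_weight * align
```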

XueBaolu · Jun 04 '23 02:06

Did you run the original project, or re-implement the algorithm with your own code? If you re-implemented it, can you provide the accuracy of FedAvg and other baselines, to check whether the low accuracy comes from the implementation?

wizard1203 · Jun 04 '23 13:06

Thank you for responding. Yes, I re-implemented the algorithm with my own code. I set lr=0.01 without a scheduler, so accuracy increases slowly. But FedAvg still reaches 0.3 accuracy at round 100, which is higher than my re-implementation of VHL. So I wonder whether I missed some steps of VHL in my code. Could you help me check?

XueBaolu · Jun 05 '23 02:06

Got it. So FedAvg has similar accuracy to VHL in your case. What neural network do you use? A simple CNN may not have enough capacity to fit both the real data and the noise data.

wizard1203 · Jun 05 '23 08:06

VHL is, strangely, worse than FedAvg in my case. And I use ResNet-18.

XueBaolu · Jun 05 '23 12:06

Could you try lr=0.1 or lr=0.3? Also check the total number of clients, the number of clients sampled each round, the number of local epochs, and the Non-IID degree of the datasets.

wizard1203 · Jun 06 '23 04:06

Thank you for responding. I will try that. I set the number of clients to 10, all sampled in each round, and the number of local epochs to 1. The alpha of the LDA (Dirichlet) partition is 0.05.
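For context, an LDA (Dirichlet) partition with alpha=0.05 is extremely skewed, so most clients see only a few classes. A minimal sketch of such a partition, assuming NumPy; the repo's actual partitioner may differ:

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.05, seed=0):
    """Split sample indices among clients with a per-class Dirichlet draw.
    Smaller alpha -> more skewed (more Non-IID) label distributions."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # Proportions of class c assigned to each client.
        p = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(p) * len(idx_c)).astype(int)[:-1]
        for client, part in enumerate(np.split(idx_c, splits)):
            client_idx[client].extend(part.tolist())
    return client_idx

# e.g. labels = np.array(cifar_train.targets); parts = dirichlet_partition(labels)
```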

XueBaolu · Jun 06 '23 12:06

Excuse me. I ran my code with lr=0.1, but VHL is still worse. The VHL result is the upper screenshot and the FedAvg result is the lower one.

[Screenshot 2023-06-07 09 02 29] [Screenshot 2023-06-07 09 02 46]

XueBaolu · Jun 07 '23 01:06

I see. The accuracy of FedAvg can be improved further. You can try lr=0.3, or lr=0.1 with momentum. Also, the first conv layer of ResNet-18 should be 3x3 for CIFAR-10 instead of the 7x7 used for ImageNet (see the sketch after the list below). To compare VHL and FedAvg:

  1. Could you try loading more noise data, and ensuring that every class can be sampled in each iteration?
  2. Maybe you can try sampling 5 clients each round instead of all 10. I'm thinking that sampling all clients may lead to low variance.
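A minimal sketch of the CIFAR-style ResNet-18 stem and of a class-balanced noise batch for point 1, assuming torchvision; `balanced_virtual_batch` is a hypothetical helper, not the VHL repo's sampler:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def cifar_resnet18(num_classes=10):
    """torchvision resnet18 with a CIFAR-style stem: a 3x3/stride-1 first
    conv and no initial max-pool, instead of the ImageNet 7x7/stride-2 stem."""
    model = resnet18(num_classes=num_classes)
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    return model

def balanced_virtual_batch(virtual_x, virtual_y, per_class=6, num_classes=10):
    """Draw a noise batch containing per_class samples of every class,
    so each class is represented in every iteration (point 1 above)."""
    parts = []
    for c in range(num_classes):
        idx_c = torch.where(virtual_y == c)[0]
        parts.append(idx_c[torch.randperm(len(idx_c))[:per_class]])
    idx = torch.cat(parts)
    return virtual_x[idx], virtual_y[idx]
```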

wizard1203 · Jun 07 '23 02:06

OK, I will try that later. Thanks!

XueBaolu · Jun 07 '23 07:06