AlphaGAN
Some questions about training details, thank you
Dear researcher, hello. Apologies for the intrusion. After reading this paper, I feel it is of great significance to my research. Since I am a beginner, I would like to ask you a few questions about training; I hope you can help. The difficulty I ran into: the paper uses P40 GPUs for computation, while my server has P100 GPUs with 16 GB of memory, and AlphaGAN's code seems to need about 24 GB. I reduced the batch size to 32, but it still does not fit. Could I make a simple modification to the code to run multi-GPU parallel training, and would that affect the accuracy of the generated results? Looking forward to your reply. Thank you very much.
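To frame the question, the kind of simple change I have in mind is roughly the following (a minimal sketch using PyTorch's generic nn.DataParallel, with a hypothetical stand-in model, not AlphaGAN's actual generator):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for AlphaGAN's generator.
generator = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 3072))

if torch.cuda.device_count() > 1:
    # DataParallel splits each batch across the visible GPUs,
    # so the per-card memory footprint shrinks roughly in proportion.
    generator = nn.DataParallel(generator)
generator = generator.cuda()

noise = torch.randn(32, 128).cuda()  # batch of 32, split across cards
fake = generator(noise)
```

As far as I understand, the gathered outputs and averaged loss should leave gradients essentially unchanged, though batch-norm statistics become per-card under DataParallel, which is one way parallelism can shift results slightly.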
Dear researcher, hello. For the search process, I reduced the three batch sizes to 32 and the validation batch size to 50, and it can then run on a 16 GB P100. However, I still have a question: the parallel computation in this code seems to have a problem. My server has eight P100 cards, but an obvious error is raised (the matrices cannot be aligned), so it fails to run in parallel.
While retraining the CIFAR10 model, I got the error below. Does it have any impact on accuracy?
Hi, the warning comes from the calculation of Params and FLOPs. You can ignore it; it does not affect the search or retraining process.
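For context, such counters usually just hook each module and warn on layer types they do not recognize, while the parameter count itself is a plain sum. A generic sketch, not the exact utility in this repo:

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # Total trainable parameters: the "Params" number typically reported.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```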
About the parallel search of AlphaGAN, I will release the modified code for parallel searching after the NeurIPS 2022 deadline. However, I still recommend single-card search. Even when searching StyleGAN2, I use a single V100.
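If single-card memory is the obstacle, gradient accumulation is one generic way to emulate a larger batch on one GPU. A minimal sketch with a hypothetical stand-in model and optimizer; the search presumably alternates several optimizers (weights, architecture, discriminator), so this would need to be applied to each update:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10).cuda()            # hypothetical stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.CrossEntropyLoss()

accum_steps = 4                              # 4 micro-batches of 16 ~ one batch of 64
optimizer.zero_grad()
for step in range(100):
    x = torch.randn(16, 128).cuda()          # micro-batch that fits in memory
    y = torch.randint(0, 10, (16,)).cuda()
    loss = criterion(model(x), y) / accum_steps  # scale so accumulated grads average
    loss.backward()                          # grads accumulate until step()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```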
Dear researcher,
Hello, I followed your advice and switched to a V100 card, but my batch size can only be set to 48 and still cannot reach 64, and the validation batch size can only be set to 70, not 100. With the parameters in your paper, the GPU memory required may exceed 42 GB. Is there any way to work around this? Thank you very much!
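One generic way to cut activation memory without shrinking the batch further is automatic mixed precision. A minimal sketch with a hypothetical stand-in model and loss, not AlphaGAN's actual training code:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10).cuda()            # hypothetical stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()         # rescales loss to avoid fp16 underflow

for _ in range(100):
    x = torch.randn(64, 128).cuda()
    y = torch.randint(0, 10, (64,)).cuda()
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():          # runs the forward pass in mixed precision
        loss = criterion(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Autocast keeps most activations in fp16, which often roughly halves activation memory, while the GradScaler guards against gradient underflow; whether this preserves GAN training stability here would need to be verified.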