Wentao Jiang

21 comments by Wentao Jiang

> Replace every `64` with `32` in `Generator.__init__` (in net.py); the peak GPU memory usage is then around 7500 MiB, which works well on my GTX 1070.

That's a good...

> BTW, how do you choose the best G model from the snapshots?

Through qualitative evaluation.

Changing the dtype in inference/sample.py to fp16 could work: `dtype = "fp16"  # use fp16 instead of bf16`
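A minimal sketch of that fallback, assuming the script takes the dtype as the string `"bf16"` or `"fp16"` as the comment shows. The helpers `detect_bf16_support` and `choose_dtype` are hypothetical names introduced here for illustration; they are not part of the repository.

```python
def detect_bf16_support():
    """Best-effort check; returns False when PyTorch or a CUDA GPU is unavailable."""
    try:
        import torch
    except ImportError:
        return False
    # Only query bf16 support once a CUDA device is known to exist.
    return bool(torch.cuda.is_available() and torch.cuda.is_bf16_supported())


def choose_dtype(bf16_supported):
    """Prefer bf16 when the GPU supports it, otherwise fall back to fp16."""
    return "bf16" if bf16_supported else "fp16"


# Assumed usage: assign the result where the script currently hard-codes the dtype.
dtype = choose_dtype(detect_bf16_support())
```

On a GPU without bf16 support this yields `"fp16"`, which matches the manual change suggested in the comment.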

> @wtjiang98 genius

I still don't know how to solve the problem. Should we modify the inference code?

> > For fine-tuning, we offer the following suggestions:
> >
> > 1. Reduce the learning rate. We recommend a learning rate of 1e-5 to 1e-6 for fine-tuning.
> ...

> For fine-tuning, we offer the following suggestions:
>
> 1. Reduce the learning rate. We recommend a learning rate of 1e-5 to 1e-6 for fine-tuning.
> 2. If there...

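> > > For fine-tuning, we offer the following suggestions:
> > >
> > > 1. Reduce the learning rate. We recommend a learning rate of 1e-5 to 1e-6 for fine-tuning.
> > > 2. If additional modules are added, load the pre-trained weights and run inference with zero initialization. This verifies whether the initialization and the code are correct.
> > > 3. Keep watching the loss curve. A spike in the loss most likely means the model has collapsed; resume training from the most recent checkpoint. If training collapses frequently, consider increasing the batch size or lowering the learning rate further.
> > > ...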
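The loss-spike advice in the quoted suggestions can be sketched as a small monitor that compares each new loss against a running average. This is only an illustration: `LossSpikeMonitor` and its `window`, `factor`, and `warmup` parameters are hypothetical, and the thresholds are assumptions rather than values recommended in the thread.

```python
import math
from collections import deque
from statistics import mean


class LossSpikeMonitor:
    """Flags a loss far above the recent running average (hypothetical helper)."""

    def __init__(self, window=50, factor=3.0, warmup=10):
        self.history = deque(maxlen=window)  # recent non-spike losses
        self.factor = factor                 # spike if loss > factor * running mean
        self.warmup = warmup                 # steps to observe before flagging

    def update(self, loss):
        """Return True if `loss` looks like a spike; otherwise record it."""
        if math.isnan(loss) or math.isinf(loss):
            return True
        if len(self.history) >= self.warmup and loss > self.factor * mean(self.history):
            return True  # spikes are not added to the running average
        self.history.append(loss)
        return False
```

In a training loop, a `True` result would be the cue to stop, reload the most recent checkpoint, and possibly retry with a lower learning rate or a larger batch size, as the suggestions describe.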

> > > > For training from scratch, this is normal. The point of confusion for me is that if you guys are zero-init from pre-training weights, then the results...

> > > > > > For training from scratch, this is normal. The point of confusion for me is that if you guys are zero-init from pre-training weights, then...