Hi, thank you for the great code. I have run into a problem: I downloaded the code and trained for 35 epochs in distributed training mode, but when I evaluated the trained model with eval.py, the 30th-epoch checkpoint got 0.0 accuracy. What should I do? Thank you very much.
@yangsuhui Train for more epochs until the model converges; try 80 or 90 epochs.
@yizt Thank you, I will try training for more epochs. When you trained the model on the generated horizontal data, you got the best model at 61 epochs, as provided in the code. May I ask from which epoch the accuracy was no longer zero during your training?
I did not calculate accuracy for every epoch. I guess that once the loss has dropped substantially, accuracy will no longer be zero.
@yangsuhui Hi, the loss value is below 0.01 after 30 epochs.
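To find the first epoch where accuracy leaves zero, one option is to score the predictions of each saved checkpoint and scan the results. The sketch below is not taken from eval.py; it assumes accuracy is a whole-sample exact match between predicted and ground-truth strings, and both function names are hypothetical. Under that assumption a prediction that is almost right still counts as wrong, which is one possible reason accuracy can stay at 0.0 while the loss is already falling.

```python
from typing import Dict, Optional, Sequence


def exact_match_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of samples whose prediction equals the label exactly.

    Assumption: the metric is whole-sample exact match; the real eval.py
    may compute accuracy differently.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return correct / len(labels)


def first_nonzero_epoch(accuracy_by_epoch: Dict[int, float]) -> Optional[int]:
    """Return the earliest epoch with accuracy above zero, or None if there is none."""
    for epoch in sorted(accuracy_by_epoch):
        if accuracy_by_epoch[epoch] > 0.0:
            return epoch
    return None


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from this thread.
    history = {30: 0.0, 40: 0.0, 50: 0.25}
    print(first_nonzero_epoch(history))
```

The predictions for each epoch would come from running the evaluation on the corresponding checkpoint; that part depends on the repository and is left out here.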
@yizt OK, understood, thank you very much! One more question: how long did it take you to train from the beginning to convergence at 61 epochs in horizontal mode? I found that convergence with distributed training is slower than with a single GPU.
@yangsuhui I trained the model on 8 RTX 2080 Ti GPUs with an image batch size of 128; the training process took about 36 hours.
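One possible reason distributed training looks slower to converge per epoch: in a typical data-parallel setup each process draws its own batch, so the effective batch grows with the number of GPUs and each epoch contains fewer weight updates. The sketch below only does that arithmetic. It assumes the batch size is per process (the thread does not say whether 128 is per GPU or in total), and the dataset size is a hypothetical placeholder.

```python
import math


def optimizer_steps_per_epoch(num_samples: int, per_process_batch: int, world_size: int = 1) -> int:
    """Number of weight updates in one epoch of data-parallel training.

    Assumption: every process consumes `per_process_batch` samples per step,
    so the effective (global) batch is per_process_batch * world_size.
    """
    global_batch = per_process_batch * world_size
    return math.ceil(num_samples / global_batch)


if __name__ == "__main__":
    num_samples = 1_000_000  # hypothetical dataset size, not from this thread
    single_gpu = optimizer_steps_per_epoch(num_samples, per_process_batch=128, world_size=1)
    eight_gpus = optimizer_steps_per_epoch(num_samples, per_process_batch=128, world_size=8)
    print(single_gpu, eight_gpus)
```

If this is the cause, comparing runs by number of optimizer steps rather than by epoch, or adjusting the learning rate for the larger effective batch, may give a fairer picture.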