crnn.pytorch

Hi, thank you for the great code. But I have run into some problems. I just downloaded the code and ran it in distributed training mode,

Open yangsuhui opened this issue 5 years ago • 7 comments

Aug 26 '20 09:08 yangsuhui

Hi, thank you for the great code. I have run into a problem: I downloaded the code and trained for 35 epochs in distributed training mode, but when I used eval.py to evaluate the trained model, I got 0.0 accuracy with the 30th-epoch checkpoint. What should I do? Thank you very much.

Aug 26 '20 09:08 yangsuhui

@yangsuhui Run more epochs until the model converges; try 80 or 90 epochs.

Aug 26 '20 09:08 yizt

@yizt Thank you, I will try training for more epochs. When you trained the model on the generated horizontal data, you got the best model at 61 epochs, as provided with the code. May I ask from which epoch the accuracy became non-zero during your training?

Aug 26 '20 09:08 yangsuhui

I did not calculate the accuracy for every epoch. I guess that once the loss has dropped substantially, the accuracy will no longer be zero.
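A minimal sketch of why accuracy can stay at 0.0 while the loss is still falling, assuming accuracy is counted as an exact match of the whole decoded text line (an assumption; the repository's eval.py may measure it differently). The function names, the blank index of 0, and the sample label IDs are all hypothetical, chosen only for illustration. With greedy CTC decoding, a single wrong character makes the whole sample count as wrong:

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Greedy CTC decoding: collapse repeated labels, then drop blanks.

    `frame_ids` is the per-frame argmax label sequence; `blank=0` is an
    assumed blank index, not necessarily the one used in the repository.
    """
    decoded = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded


def sequence_accuracy(predictions, targets, blank=0):
    """Fraction of samples whose decoded sequence equals the target exactly."""
    if not targets:
        return 0.0
    correct = sum(
        ctc_greedy_decode(pred, blank) == list(tgt)
        for pred, tgt in zip(predictions, targets)
    )
    return correct / len(targets)


# Two hypothetical samples with the same target [1, 2, 3].
# The second prediction gets only the last character wrong.
predictions = [
    [0, 1, 1, 0, 2, 2, 0, 3],
    [0, 1, 1, 0, 2, 0, 0, 4],
]
targets = [[1, 2, 3], [1, 2, 3]]
accuracy = sequence_accuracy(predictions, targets)
```

Under this metric a model that gets most characters right can still score zero on every line, which is consistent with the accuracy only becoming non-zero once the loss is very low.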

Aug 26 '20 09:08 yizt

@yangsuhui Hi, the loss value is below 0.01 after 30 epochs.

Aug 28 '20 06:08 yizt

@yizt OK, understood, thank you very much! One more question: how long did training take from the beginning until the model converged at 61 epochs in horizontal mode? I found that convergence with distributed training is slower than with a single GPU.

Aug 28 '20 09:08 yangsuhui

@yangsuhui I trained the model on 8 RTX 2080 Ti GPUs with an image batch size of 128; the training process took about 36 hours.

Aug 31 '20 05:08 yizt
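One possible reason for the slower per-epoch convergence reported above with distributed training can be sketched with simple arithmetic. This assumes the batch size of 128 is per GPU (the thread does not say whether it is per GPU or global) and uses a hypothetical dataset size of 1,000,000 images; `steps_per_epoch` is an illustrative helper, not a function from the repository. If each GPU processes its own batch, the global batch grows with the GPU count and the number of optimizer updates per epoch shrinks:

```python
import math


def steps_per_epoch(num_samples, per_gpu_batch, num_gpus):
    """Optimizer updates per epoch when every GPU processes its own batch."""
    global_batch = per_gpu_batch * num_gpus
    return math.ceil(num_samples / global_batch)


# Hypothetical dataset size, chosen only for illustration.
num_samples = 1_000_000

single_gpu_steps = steps_per_epoch(num_samples, per_gpu_batch=128, num_gpus=1)
eight_gpu_steps = steps_per_epoch(num_samples, per_gpu_batch=128, num_gpus=8)
```

Under these assumptions the 8-GPU run performs about one eighth as many weight updates per epoch, so it may need more epochs (or an adjusted learning rate) to reach the same loss, even though each epoch finishes faster.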