fasterrcnn-pytorch-training-pipeline

Training is very slow: prefetching and caching data?

Open tolsicsse opened this issue 2 years ago • 14 comments

I have been trying to implement my own custom dataset, and it works, but it seems like the data is read from files every epoch. Is there no way of caching the data? I have mostly worked in TensorFlow, where prefetching and caching the data can give a tremendous improvement in speed.

tolsicsse avatar Nov 24 '22 08:11 tolsicsse

@tolsicsse I am not sure why you are facing slowdown issues. The datasets are created once before training. Although there is no in-memory caching, the dataset pipeline has not been giving me any slowdown issues. If you don't mind answering this, are you using your own dataset and data loader script? If so, can you try the one given in the repository and see whether it speeds things up?

sovit-123 avatar Nov 24 '22 09:11 sovit-123

I am calling train.py with my own YAML file for the data. I will see if I have time to test with the example.

tolsicsse avatar Nov 24 '22 09:11 tolsicsse

In that case, you should not face any slowness in data loading. It's pretty well optimized. But surely, let me know if you have ideas to optimize it further. I am open to suggestions.
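One way to check whether data loading is actually the bottleneck is to time how long each batch takes to arrive, without running the model at all. The following is only a minimal sketch: `time_batches` is a hypothetical helper, not part of this repository, and it accepts any iterable (for example a PyTorch `DataLoader`).

```python
import time


def time_batches(loader, max_batches=20):
    """Return the seconds spent waiting for each of the first batches.

    `loader` can be any iterable, e.g. a torch.utils.data.DataLoader.
    This is a hypothetical helper for illustration only.
    """
    times = []
    iterator = iter(loader)
    for _ in range(max_batches):
        start = time.perf_counter()
        try:
            next(iterator)  # blocks until the next batch is ready
        except StopIteration:
            break  # the loader had fewer than max_batches batches
        times.append(time.perf_counter() - start)
    return times
```

If the per-batch wait is close to the total time per training iteration, the input pipeline rather than the model is probably the limiting factor.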

sovit-123 avatar Nov 24 '22 11:11 sovit-123

I now have a version running with the following changes:

  • It trains with images without labels, since I have only one label that I am interested in, and it seems to work with --use-train-aug
  • I have very large images but only about 500 of them, so repeatedly reading them from file takes a lot of time. I have added customizable caching, persistent workers, and prefetching, which on my laptop speeds up training from 17 s/iteration to 4.5-5.5 s/iteration
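The caching and worker settings described above could look roughly like the sketch below. It is not the code from the fork: `CachedDataset` and `LOADER_KWARGS` are hypothetical names, the `max_items` limit is an assumption about what "customizable" means, and the specific values are placeholders. The keys in `LOADER_KWARGS` are existing `torch.utils.data.DataLoader` arguments.

```python
from collections import OrderedDict


class CachedDataset:
    """Map-style dataset wrapper that keeps up to `max_items` samples in RAM.

    Hypothetical sketch: `base` is any object with __len__ and __getitem__,
    e.g. a dataset that decodes large images from disk.
    """

    def __init__(self, base, max_items=None, transform=None):
        self.base = base
        self.max_items = max_items  # None means cache every sample
        self.transform = transform  # applied after the cache lookup
        self._cache = OrderedDict()

    def __len__(self):
        return len(self.base)

    def _load(self, idx):
        if idx in self._cache:
            self._cache.move_to_end(idx)  # mark as most recently used
            return self._cache[idx]
        sample = self.base[idx]  # slow path: read and decode from disk
        self._cache[idx] = sample
        if self.max_items is not None and len(self._cache) > self.max_items:
            self._cache.popitem(last=False)  # evict least recently used
        return sample

    def __getitem__(self, idx):
        sample = self._load(idx)
        # Random augmentations belong here, so they are not frozen in the cache.
        return self.transform(sample) if self.transform else sample


# Existing DataLoader arguments; the values are placeholders, not tuned.
LOADER_KWARGS = dict(
    num_workers=4,            # load samples in background processes
    persistent_workers=True,  # keep workers (and their caches) across epochs
    prefetch_factor=2,        # batches each worker prepares in advance
    pin_memory=True,          # can speed up host-to-GPU copies
)
# Usage, assuming PyTorch is installed and `dataset` is your own dataset:
#   loader = torch.utils.data.DataLoader(CachedDataset(dataset), **LOADER_KWARGS)
```

With `num_workers > 0`, each worker process holds its own copy of the dataset and therefore its own cache. Keeping the workers alive with `persistent_workers=True` is what lets those caches survive from one epoch to the next; the cost is that memory use grows with the number of workers.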

I have not had time to test whether this works with other setups. If you are interested in trying it out, I can submit my version. However, I am not that good at Git, so I am not sure how to push it.

tolsicsse avatar Nov 28 '22 08:11 tolsicsse

@tolsicsse Thanks a lot for the update. It means a lot to see such work from your side. In case you are not comfortable with Git yet, and if you have made changes in the existing files only, then maybe you can email them to me. I can surely take a look at them. And if I find that there are no merge conflicts they will be included in the main codebase. Your contributions will surely be acknowledged. Please let me know what you think.

sovit-123 avatar Nov 28 '22 08:11 sovit-123

I have cloned the project, can I just push it? Am I allowed? I did not create a fork or anything similar. Could be good to learn how this works. :-)

tolsicsse avatar Nov 28 '22 08:11 tolsicsse

I have forked and pushed to my own repo. I think you can somehow merge from there? https://github.com/tolsicsse/fasterrcnn-pytorch-training-pipeline/tree/modified

tolsicsse avatar Nov 28 '22 11:11 tolsicsse

I think you forked, created a new branch, and then created a PR. That is the right process. Have you pulled the code recently? I have made some major changes. Nothing that would break your code, but your PR may overwrite the new code if you have not pulled recently. If you can pull the recent code and create a PR on top of that, it will be really great.
Please let me know if you have any concerns.

sovit-123 avatar Nov 28 '22 11:11 sovit-123

OK, I have synced my fork. Should I do anything more? Do I have to create a new PR?

tolsicsse avatar Nov 28 '22 12:11 tolsicsse

@tolsicsse Just to be safe, I would like to discard the current PR so that you can commit and create a new one. I have not discarded it yet; if you reply to this message asking me to, then I will do it. I think that's the best approach. By the way, I was taking a look at the changed files, and it looks really good to have these functionalities in place.

sovit-123 avatar Nov 28 '22 12:11 sovit-123

Thanks, it's OK to throw it away.

tolsicsse avatar Nov 28 '22 12:11 tolsicsse

@tolsicsse I think there is some confusion. I am not saying that I will throw away the contribution. I want to merge it. I just need to be sure that it is on the latest code that I had committed earlier. I absolutely like the contribution that you have made. If you will not be creating a new PR with the latest pulled code, I will surely go through the current PR carefully and merge it. I just hope that there is no miscommunication. Thanks.

sovit-123 avatar Nov 28 '22 17:11 sovit-123

OK, yes, it is up to date now. I synchronized with your code.

tolsicsse avatar Nov 28 '22 18:11 tolsicsse

Thanks. I will try to merge it by end of the day today.

sovit-123 avatar Nov 29 '22 03:11 sovit-123