comma2k19 dataset: the sequence length of 800 costs too much memory
Dear authors: In the comma2k19 dataset, the length of the image sequences is set to 800. This costs too much memory, and the supercombo model only takes the last 2 frames of the image sequence as input. Why don't we set the image sequence length shorter to save memory?
Yes, you are correct. Unfortunately, while reading the frames from the videos and batching them on the fly is the best solution, at least in theory, implementing it properly requires a lot of development effort. Therefore, we decided to extract all the frames of the videos in a batch before training on that batch begins, which is why it consumes so much memory. If you are interested in implementing the "on-the-fly" mechanism, discussions and PRs are always welcome.
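For reference, a minimal sketch of what such an on-the-fly reader could look like, assuming OpenCV is available; the function name and parameters here are hypothetical, not part of the repo:

```python
import cv2
import numpy as np

def read_frames_on_the_fly(video_path, frame_indices):
    """Decode only the requested frames from a video segment instead of
    extracting the whole 800-frame sequence up front.

    video_path    -- path to the video segment (hypothetical argument)
    frame_indices -- frame numbers to decode, e.g. the last two of the sequence
    """
    cap = cv2.VideoCapture(video_path)
    frames = []
    for idx in frame_indices:
        # Seek directly to the frame we need, then decode just that frame.
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError(f"Could not read frame {idx} from {video_path}")
        frames.append(frame)
    cap.release()
    return np.stack(frames)  # shape: (len(frame_indices), H, W, 3)
```

Note that frame-accurate seeking in compressed video can be slow or imprecise depending on keyframe placement, which is part of why a proper implementation needs careful development.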
I removed the image sequence from the dataset and only return the last two images; this way I save a lot of memory.
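A minimal sketch of that workaround, assuming a PyTorch-style Dataset whose samples contain the full pre-extracted sequence; the class name and the `seq_input_img` key are placeholders, not the repo's actual names:

```python
from torch.utils.data import Dataset

class LastTwoFramesDataset(Dataset):
    """Wraps an existing full-sequence dataset but keeps only the last two
    frames, since those are the only ones the supercombo-style model consumes."""

    def __init__(self, base_dataset):
        self.base_dataset = base_dataset  # hypothetical full-sequence dataset

    def __len__(self):
        return len(self.base_dataset)

    def __getitem__(self, index):
        sample = self.base_dataset[index]
        # Keep only the last two frames of the (T, C, H, W) image sequence.
        sample['seq_input_img'] = sample['seq_input_img'][-2:]
        return sample
```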