Ensure efficient IO when training on large sets of 2D images
As discussed with @atbenmurray and previously with @wyli and @luiscarlosgph, this is a follow-up of cmiclab issue #205. We now have support for 2D images, but it is rather crude and would benefit from being optimised, for example by storing the images in a dedicated high-performance database (LMDB?).
The first task would be to survey the current state of the art for this in other TF-based projects.
Probably best to stick to the recommended TFRecord format if possible; a rough sketch of what a TFRecord pipeline could look like is below. Some relevant links:
- https://www.tensorflow.org/performance/datasets_performance
- https://github.com/tensorflow/tensorflow/issues/21129
- https://stackoverflow.com/questions/48309631/tensorflow-tf-data-dataset-reading-large-hdf5-files
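For reference, here is a minimal sketch of the kind of TFRecord workflow the links above describe: serialise the 2D image/label pairs into a TFRecord file once, then stream them back with `tf.data` using parallel parsing and prefetching. This is not existing project code; the names (`write_tfrecord`, `make_dataset`, the feature keys) and the assumption that images are uint8 arrays of a common shape are illustrative only.

```python
import tensorflow as tf


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def write_tfrecord(images, labels, path):
    """Serialise 2D images (H, W, C uint8 numpy arrays) and integer labels."""
    with tf.python_io.TFRecordWriter(path) as writer:
        for image, label in zip(images, labels):
            h, w, c = image.shape
            example = tf.train.Example(features=tf.train.Features(feature={
                'image_raw': _bytes_feature(image.tobytes()),
                'height': _int64_feature(h),
                'width': _int64_feature(w),
                'channels': _int64_feature(c),
                'label': _int64_feature(int(label)),
            }))
            writer.write(example.SerializeToString())


def _parse(serialised):
    # Decode one serialised Example back into an image tensor and a label.
    features = tf.parse_single_example(serialised, features={
        'image_raw': tf.FixedLenFeature([], tf.string),
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'channels': tf.FixedLenFeature([], tf.int64),
        'label': tf.FixedLenFeature([], tf.int64),
    })
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    shape = tf.stack([features['height'], features['width'], features['channels']])
    image = tf.reshape(image, shape)
    return image, features['label']


def make_dataset(path, batch_size=32):
    # Parallel parsing + prefetching, as suggested in the datasets_performance guide.
    # Batching assumes all images share the same shape.
    dataset = tf.data.TFRecordDataset(path)
    dataset = dataset.map(_parse, num_parallel_calls=4)
    dataset = dataset.shuffle(buffer_size=1000).batch(batch_size)
    dataset = dataset.prefetch(1)
    return dataset
```

Usage would be something like `write_tfrecord(images, labels, '2d_data.tfrecord')` once as a preprocessing step, then iterating over `make_dataset('2d_data.tfrecord')` during training. Whether this beats an LMDB-backed reader for our use case is exactly what the investigation should establish.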
Ok, I'll look into this today