DeepPoseKit
Feature request: support training and annotating datasets with image data outside hdf5
This seems like a great package!
From what I can tell, it seems that the TrainingGenerator and DataGenerator classes only support loading data from an hdf5 file. However, I expect this would not work with datasets which do not fit into RAM.
Would it be possible to have some way to load the image files from paths stored in the hdf5 file, rather than from the hdf5 file directly?
Thank you!
Hi, thanks for your interest! However, I'm a bit confused by your feature request, so I'd appreciate it if you could provide more details.
During model training and when annotating images, the image data are loaded dynamically from the HDF5 file as single images or in small training batches. The file is stored on disk, not in memory, so there should be no problems with the data fitting into memory. See https://github.com/jgraving/deepposekit/blob/master/deepposekit/io/DataGenerator.py#L80 for more details.
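To illustrate the point about reading from disk, here is a minimal sketch using h5py and NumPy. It is not the DataGenerator code itself: the dataset name "images", the file name, and the helper functions are assumptions made for the example. It shows that slicing an open HDF5 dataset returns only the requested images, so the whole file never has to fit in memory.

```python
import os
import tempfile

import h5py
import numpy as np


def write_demo_file(path, n_images=10, height=8, width=8):
    """Write a small stack of random grayscale images to an HDF5 file.

    The dataset name "images" is an assumption for this sketch.
    """
    rng = np.random.default_rng(0)
    images = rng.integers(0, 256, size=(n_images, height, width, 1), dtype=np.uint8)
    with h5py.File(path, "w") as f:
        # One chunk per image, so a single image can be read on its own.
        f.create_dataset("images", data=images, chunks=(1, height, width, 1))
    return images


def iter_batches(path, batch_size):
    """Yield batches read from disk; only the requested slice is loaded."""
    with h5py.File(path, "r") as f:
        dataset = f["images"]  # a handle to the on-disk data, not the data itself
        for start in range(0, dataset.shape[0], batch_size):
            # Slicing reads just this range into a NumPy array.
            yield dataset[start:start + batch_size]


demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
original = write_demo_file(demo_path)
batches = list(iter_batches(demo_path, batch_size=4))
batch_shapes = [b.shape for b in batches]
```

Each batch here holds at most four images, regardless of how many images the file contains.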
If you'd like to use an existing dataset, we have preliminary code for converting to the deepposekit format, but we are still working to make this easier. You can find examples here: https://github.com/jgraving/deepposekit-contrib/tree/master/notebooks
That being said, the DataGenerator class could certainly be modified to support other data formats, but it's unclear to me how useful this would be vs. converting data to the existing HDF5 format.
Ah I see, thank you for clarifying!
Yes, that makes sense. I didn't realize that it was opening the HDF5 file each time to read a couple of images rather than loading everything into memory.
That said, I still think this would be a useful feature. Many existing datasets (e.g. MPII Human Pose, COCO keypoints) consist of folders of images, so it would be more convenient to generate a small HDF5 file that holds the annotations and the paths to all the images, rather than a huge HDF5 file containing all the image data. It would also make the images easier to inspect.
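As a rough sketch of the idea (not an existing DeepPoseKit class), the container below keeps only a list of image paths and reads each file when it is indexed. The class name PathBackedImages, the loader argument, and the .bin demo files are hypothetical. The paths are held in a plain Python list for brevity, though they could equally be stored in a small HDF5 file next to the annotations, and the stand-in loader returns raw bytes where a real one would decode the image with an image library.

```python
import os
import tempfile


class PathBackedImages:
    """Hypothetical container: holds image paths, reads files only on access."""

    def __init__(self, paths, loader):
        self.paths = list(paths)
        self.loader = loader  # callable mapping a path to a loaded image

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # A slice loads a batch; an integer loads a single image.
        if isinstance(index, slice):
            return [self.loader(p) for p in self.paths[index]]
        return self.loader(self.paths[index])


def read_bytes(path):
    """Stand-in loader; a real one would decode the file into an array."""
    with open(path, "rb") as f:
        return f.read()


# Demo: write five tiny placeholder files, then load two of them as a batch.
image_dir = tempfile.mkdtemp()
paths = []
for i in range(5):
    p = os.path.join(image_dir, f"frame_{i}.bin")
    with open(p, "wb") as f:
        f.write(bytes([i]) * 16)
    paths.append(p)

images = PathBackedImages(paths, read_bytes)
batch = images[1:3]  # only frame_1 and frame_2 are read from disk
```

Because the loader is passed in, the same container would work with any image format, and the files stay in ordinary folders where they can be inspected directly.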
We've now added an (experimental) data generator for loading DeepLabCut data (see here for an example: https://github.com/jgraving/DeepPoseKit/blob/master/examples/deeplacut_data_example.ipynb). The BaseGenerator class is now abstracted, so it is possible to write a custom generator for arbitrary data formats (see here for an example: https://github.com/jgraving/DeepPoseKit/blob/master/examples/custom_data_generator.ipynb).
The deepposekit.io API has changed quite a bit, so check out the updated examples for more details: https://github.com/jgraving/DeepPoseKit/tree/master/examples
We're still working to update and abstract the GUI and Annotator classes to work with the abstract BaseGenerator, so I'll leave this open until we get that update pushed.