
Feature request: support training and annotating datasets with image data outside hdf5

Open lambdaloop opened this issue 6 years ago • 4 comments

This seems like a great package!

From what I can tell, the TrainingGenerator and DataGenerator classes only support loading data from an HDF5 file. I expect this would not work with datasets that do not fit into RAM.

Would it be possible to have some way to load the image files from paths stored in the hdf5 file, rather than from the hdf5 file directly?

Thank you!

lambdaloop · Aug 04 '19 22:08

Hi, thanks for your interest! However, I'm a bit confused by your feature request, so I'd appreciate it if you could provide more details.

During model training and when annotating images, the image data are loaded into memory on demand, as single images or in small training batches, from the HDF5 file. That file is stored on disk, not in memory, so there should be no problem with the data fitting into memory. See https://github.com/jgraving/deepposekit/blob/master/deepposekit/io/DataGenerator.py#L80 for more details.
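As a rough illustration of the on-demand loading described above, the sketch below uses h5py directly rather than DeepPoseKit's own classes. The dataset name `images`, the array shape, and the helper `read_batch` are assumptions made for the example, not details taken from the package.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small stand-in file. The dataset name "images" and the shape
# are assumptions for this sketch, not the package's actual layout.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    f.create_dataset(
        "images",
        data=np.random.randint(0, 255, size=(100, 32, 32, 1), dtype=np.uint8),
        chunks=(1, 32, 32, 1),  # one frame per chunk
    )


def read_batch(h5_path, indexes):
    """Read only the requested frames from disk (hypothetical helper)."""
    with h5py.File(h5_path, "r") as f:
        dataset = f["images"]  # a handle; no pixel data is read yet
        # h5py list indexing needs increasing indices, hence the sort.
        return dataset[sorted(indexes)]


batch = read_batch(path, [7, 3, 42])
print(batch.shape)  # (3, 32, 32, 1)
```

Only the three requested frames are read into memory here; the rest of the dataset stays on disk.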

If you'd like to use an existing dataset, we have preliminary code for converting to the deepposekit format, but we are still working to make this easier. You can find examples here: https://github.com/jgraving/deepposekit-contrib/tree/master/notebooks

That being said, the DataGenerator class could certainly be modified to support other data formats, but it's unclear to me how useful this would be vs. converting data to the existing HDF5 format.

jgraving · Aug 05 '19 10:08

Ah I see, thank you for clarifying!

Yes, that makes sense. I didn't realize that it opens the HDF5 file each time to read a couple of images rather than loading everything into memory.

That said, I still think this would be a useful feature. Many existing datasets (e.g. MPII Human Pose, COCO keypoints) consist of folders of images, so it would be more convenient to generate a small HDF5 file that holds the annotations and the paths to all the images, rather than a huge HDF5 file containing all the image data. It also makes the images easier to inspect.
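A minimal sketch of what such a small index file might look like, written with h5py. The dataset names `image_paths` and `annotations`, the example paths, and the keypoint shape are all hypothetical, chosen only to illustrate the idea; this is not a format DeepPoseKit is known to read.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical example data: three image paths, two (x, y) keypoints each.
image_paths = ["images/0001.jpg", "images/0002.jpg", "images/0003.jpg"]
keypoints = np.zeros((3, 2, 2), dtype=np.float32)

index_path = os.path.join(tempfile.mkdtemp(), "annotations.h5")
with h5py.File(index_path, "w") as f:
    # Variable-length UTF-8 strings keep the file small: paths, not pixels.
    f.create_dataset(
        "image_paths",
        data=image_paths,
        dtype=h5py.string_dtype(encoding="utf-8"),
    )
    f.create_dataset("annotations", data=keypoints)

with h5py.File(index_path, "r") as f:
    # Depending on the h5py version, strings come back as bytes or str.
    stored_paths = [
        p.decode("utf-8") if isinstance(p, bytes) else p
        for p in f["image_paths"][:]
    ]
    stored_keypoints = f["annotations"][:]

print(stored_paths)
print(stored_keypoints.shape)
```

The images themselves stay in their original folders and can be opened with any image viewer.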

lambdaloop · Aug 05 '19 17:08

We've now added an (experimental) data generator for loading DeepLabCut data (see here for an example: https://github.com/jgraving/DeepPoseKit/blob/master/examples/deeplacut_data_example.ipynb). The BaseGenerator class is now abstracted, so it is possible to write a custom generator for arbitrary data formats (see here for an example: https://github.com/jgraving/DeepPoseKit/blob/master/examples/custom_data_generator.ipynb).
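To give a feel for what a path-based generator could look like, here is a standalone sketch. It deliberately does not subclass BaseGenerator, and the class name `PathImageGenerator`, its method names, and the `load_image` callable are all hypothetical; the linked custom generator notebook is the place to check the actual interface.

```python
import numpy as np


class PathImageGenerator:
    """Hypothetical stand-in for a custom generator; not the DeepPoseKit API."""

    def __init__(self, image_paths, keypoints, load_image):
        if len(image_paths) != len(keypoints):
            raise ValueError("image_paths and keypoints must have the same length")
        self.image_paths = list(image_paths)
        self.keypoints = np.asarray(keypoints, dtype=np.float32)
        self.load_image = load_image  # callable: path -> (H, W, C) array

    def __len__(self):
        return len(self.image_paths)

    def get_images(self, indexes):
        # Images are read from their files only when a batch is requested.
        return np.stack([self.load_image(self.image_paths[i]) for i in indexes])

    def get_keypoints(self, indexes):
        return self.keypoints[list(indexes)]


def fake_loader(path):
    # Stand-in for a real image reader, so the sketch runs without image files.
    return np.zeros((32, 32, 1), dtype=np.uint8)


generator = PathImageGenerator(
    ["a.jpg", "b.jpg", "c.jpg"], np.zeros((3, 2, 2)), fake_loader
)
images = generator.get_images([0, 2])
print(images.shape)  # (2, 32, 32, 1)
```

Passing the loader in as a callable keeps the sketch independent of any particular image library; in practice it would wrap whichever reader the dataset needs.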

The deepposekit.io API has changed quite a bit, so check out the updated examples for more details: https://github.com/jgraving/DeepPoseKit/tree/master/examples

We're still working to update and abstract the GUI and Annotator classes to work with the abstract BaseGenerator, so I'll leave this open until we get that update pushed.

jgraving · Sep 30 '19 11:09

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] · Oct 25 '19 02:10