fuel
Caltech101
This is a converter and a loader for Caltech101, which could be useful to other people.
It might lack a bit of the polish that other datasets have, but it could be useful, at least as a starting point.
There is no downloader and I couldn't figure out how to get scipy to read from an open file.
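On the open-file problem, one possible workaround is to read each archive member into memory and hand the reader an `io.BytesIO` wrapper, which behaves like an ordinary seekable file. This is only a sketch: it assumes the dataset arrives as a tar archive, and `iter_archive_members` is a hypothetical helper name, not something in this PR. Whether the SciPy reader in use accepts file-like objects is untested here; Pillow's `Image.open` does.

```python
import io
import tarfile


def iter_archive_members(archive_file):
    """Yield (name, file-like object) pairs for each regular file.

    `archive_file` is an already-open, seekable binary file containing
    a tar archive (optionally compressed). Hypothetical helper.
    """
    with tarfile.open(fileobj=archive_file, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and links
            # Read the member fully, then wrap the bytes so that image
            # readers get a seekable in-memory file.
            raw = archive.extractfile(member).read()
            yield member.name, io.BytesIO(raw)
```

Nothing is extracted to disk, so the converter can work directly from the open archive.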
If you have ideas for improvements that don't require a large time investment, I can probably implement them. Otherwise, this is mostly to avoid duplication of effort in case someone else is working on this.
I think having a downloader would be very helpful for the dataset.
@abergeron I'm doing a cleanup of the open PRs. Would you still like to go ahead with this one?
Is there much to do?
Aside from my inline comments, we would need a downloader for the dataset, and we would also need to make this a variable-length dataset.
I think you could reuse a lot of code from the ImageNet converter if you encode images as a variable-length vector of raw bytes.
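To make the raw-bytes idea concrete, here is a minimal sketch of the encoding step. It assumes NumPy is available and that each image is kept as its still-compressed file contents. The function names are hypothetical and this is not the ImageNet converter's actual code.

```python
import numpy as np


def encode_images(raw_images):
    """Turn encoded image files (bytes) into 1-D uint8 vectors.

    Each vector keeps its own length, so no resizing or padding
    is needed at conversion time.
    """
    return [np.frombuffer(raw, dtype=np.uint8) for raw in raw_images]


def decode_image_bytes(vector):
    """Recover the original file contents from a uint8 vector."""
    return vector.tobytes()
```

In an HDF5 file, such vectors would typically go into a dataset created with a variable-length dtype, for example `h5py.special_dtype(vlen=np.dtype('uint8'))`, and be decoded into pixels only at load time.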
You might find this helpful: https://github.com/MartinThoma/algorithms/blob/master/ML/datasets/caltech101.py#L22-L37