plantcv
Create test datasets
Description
We have large datasets available here, but it would be useful to have smaller datasets that users could easily download and use for testing, learning, and similar purposes. This idea is based on issue #159.
Details
Smaller datasets are quicker to download onto a personal computer and easier to visualize. We could have a variety of small datasets.
- It could be worth having a sample dataset associated with each public dataset, available through the same mechanisms the full datasets are available through.
- In addition, or as an alternative, we could provide sample datasets for each data type or analysis type.
- A dataset for images used in the documentation. This could be best stored in a separate GitHub repository so that we could easily add to it over time, as long as it stays under 1GB total size.
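If the documentation images do go into a separate repository, the 1GB ceiling could be checked with a small script before each addition. The following is only a rough sketch: the function names are hypothetical, and it assumes "1GB" means 10**9 bytes and that the limit applies to the files in the working tree rather than to the Git history.

```python
from pathlib import Path

# Assumption: "1GB" from the discussion is interpreted as 10**9 bytes.
SIZE_LIMIT_BYTES = 10**9


def total_size_bytes(root):
    """Sum the sizes of all regular files under root, recursively."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def within_limit(root, limit=SIZE_LIMIT_BYTES):
    """Return True if the directory tree at root is strictly under the limit."""
    return total_size_bytes(root) < limit
```

Calling `within_limit("path/to/dataset")` on a local checkout would then indicate whether there is still room to add images.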
Completion Criteria
- [ ] Create dataset(s)
- [ ] Make dataset(s) available
- [ ] Update the documentation with instructions on how to get the dataset(s)
Hi Noah, I need to train a deep neural network with images of vegetables (those normally suitable for green house farming). Can you please help with this data or provide me with a suitable link. Thanks.
@abiodungit I'm not aware of any datasets for vegetables per se, or even many datasets with labeled training data (at the moment). I can point you to our publicly available datasets: http://plantcv.danforthcenter.org/pages/data.html and the additional datasets that are described here: http://www.plant-image-analysis.org/dataset.
Dear Noah, Thank you for those links. They are very useful. @abiodungit
Onile A. E.
Could we use "original" images from tutorials and other examples in the documentation as one of the small datasets? Maybe another dataset could be a subset of images from currently available datasets.
Yeah, I think it would be ideal if the documentation images were available for people to work on since we demonstrate functions with that data. Since the static documentation on Read the Docs is reduced quality/downsized, maybe it's easier to worry about the test datasets for now in the context of the interactive documentation. We could potentially replace our existing documentation images with this new dataset later if we think it's important for the data to match between the static and interactive docs.
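The suggestion above of building a small dataset from a subset of an existing one could be sketched roughly as follows. This is an illustration only: `sample_images`, the extension list, and the flat-directory layout are assumptions, not anything the existing datasets are known to follow.

```python
import random
import shutil
from pathlib import Path

# Assumption: images are identified by these file extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def sample_images(source_dir, dest_dir, n, seed=0):
    """Copy a reproducible random subset of up to n images into dest_dir.

    Only files directly inside source_dir are considered (no recursion),
    so file names cannot collide in the destination.
    """
    images = sorted(
        p for p in Path(source_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # A seeded generator makes the same subset come out on every run.
    rng = random.Random(seed)
    chosen = rng.sample(images, min(n, len(images)))

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(chosen):
        target = dest / src.name
        shutil.copy2(src, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

Fixing the seed means everyone who regenerates the sample gets the same files, which matters if the documentation is going to refer to specific images.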