jupyter-earth
"To database or not to database"
@fperez mentioned this at today's project meeting. The thought came from the Cryosphere working group meeting several weeks ago, and I feel it might be worth sharpening the question itself, so I would like to get some ideas from you.
Suppose a research group is producing a data set. If they want to open their data to other people, they first have to choose a data structure. People usually choose whatever they think is the best way to structure and share their data, but this is not necessarily the best way to use or analyze the data from a user's perspective. As a researcher, how do I know the best way to structure the data so that other people can explore it with maximum efficiency?
I know many research agencies (NSIDC, NASA, USGS, ...) run a lot of end-user surveys to gather this kind of information, and they often offer multiple ways to get their data. However, doing this might be hard for a single research lab or a small working group. I also feel that many geoscience people, including me, are unfamiliar with the various data structures and database management systems. When we have to make such a decision, we choose whatever we know best. For example, I worked with Landsat 8 GeoTIFF data, and for most of the derived data sets I generated, I used GeoTIFF and did not consider whether it was the most convenient way to share them.
Any thoughts are welcome!