Add function to create CF-compliant arrays/metadata for HDF5 stacks
Description of proposed changes
This is the start of the implementation described in https://github.com/insarlab/MintPy/discussions/1016, to make MintPy HDF5 stacks readable by GDAL/xarray/QGIS.
I'll need to think of a way we can smoothly incorporate this into the prep_ scripts without disruption.
As a side note:
Maybe we could use "datetime" instead? Neither "date" nor "time" feels accurate, given that we handle both spaceborne and airborne data.
For now I've kept `time` as the variable name for the date/datetime stacks, just because that's what the CF conventions suggest: CF datasets also generally deal in dates/datetimes, but use `time` as the standard variable name (even at daily granularity).
Also, it currently uses `units=f"days since {str(date_arr[0])}"` as the time units, but this could be seconds for intra-day stacks (e.g. @taliboliver's deltaX data). I don't know what modifications he made, though; I've only seen that there's usually a `date` dataset.
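For context, here is a minimal sketch of what the proposed CF-style time coordinate could look like. The file path, date list, and exact attribute set are illustrative only, not MintPy's actual writer:

```python
import os
import tempfile
from datetime import datetime

import h5py
import numpy as np

# Hypothetical daily date list, as stored in a typical "date" dataset.
date_list = ["20200101", "20200113", "20200125"]
dates = [datetime.strptime(d, "%Y%m%d") for d in date_list]

path = os.path.join(tempfile.mkdtemp(), "timeseries.h5")
with h5py.File(path, "w") as hf:
    # CF convention: a coordinate variable named "time" holds numeric
    # offsets from a reference epoch, with the epoch encoded in "units".
    offsets = np.array([(d - dates[0]).days for d in dates], dtype="float64")
    ds = hf.create_dataset("time", data=offsets)
    ds.attrs["standard_name"] = "time"
    ds.attrs["units"] = f"days since {dates[0].isoformat()}"
    ds.attrs["calendar"] = "standard"
```

This is the pattern xarray's CF decoder understands, so `xr.open_dataset(path)` would then recover real datetimes from the offsets.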
Reminders
- [ ] Fix #xxxx
- [ ] Pass Pre-commit check (green)
- [ ] Pass Codacy code review (green)
- [ ] Pass Circle CI test (green)
- [ ] Make sure that your code follows our style. Use the other functions/files as a basis.
- [ ] If modifying functionality, describe changes to function behavior and arguments in a comment below the function declaration.
- [ ] If adding new functionality, add a detailed description to the documentation and/or an example.
It's currently failing on one of the integration tests, in `readfile.read_hdf5_file`.
It would be good to add a unit test for this, since the error says `data` is never assigned.
There are only checks for 2D and higher dimensions:
```python
# 2D dataset
if ds.ndim == 2:
    ...
    data = ...
```
so it's likely one of the 1D or 0D datasets is tripping it up (though that should be caught separately in `read_hdf5_file`, I think, since there shouldn't be a code path to an undefined variable).
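For the unit test, one option is to make the missing branch explicit, so 0D/1D inputs raise immediately instead of falling through to an unbound `data`. A hypothetical sketch of that pattern, not the actual `read_hdf5_file` logic:

```python
import numpy as np

def read_dataset(ds):
    """Stand-in for the dimension dispatch in the read logic.

    Only 2D+ datasets are handled; raising for 0D/1D turns the
    NameError ("data is never assigned") into a clear failure that
    a unit test can assert on.
    """
    if ds.ndim == 2:
        data = ds[()]   # stand-in for the real 2D slicing
    elif ds.ndim >= 3:
        data = ds[0]    # stand-in for the real 3D slicing
    else:
        raise ValueError(f"unsupported dataset ndim: {ds.ndim}")
    return data
```

A test would then feed a 1D array (or a 1D HDF5 dataset) and assert that `ValueError` is raised, pinning down the code path that currently slips through.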
Thank you @scottstanie for this exciting PR!!!
I am trying to review this PR, but it does not seem easy to see what data and metadata have been added in the new format. Here are two questions for you:
- I assume the San Francisco Bay dataset from ARIA can be used to test this new capability, right?
- To document this new format, could you provide a minimal example for creating and examining the old/new formats of the HDF5 file? I was relying on `info.py`, but it does not seem to work with the new format yet.
> Also, it currently uses `units=f"days since {str(date_arr[0])}"` as the time units, but this could be seconds for intra-day stacks (e.g. @taliboliver's deltaX data). I don't know what modifications he made, though; I've only seen that there's usually a `date` dataset.
For the intra-day stacks, @taliboliver uses the `YYYYMMDDTHHMM` format in the `date` dataset, instead of the usual `YYYYMMDD` format, and modified the code throughout MintPy to automatically identify this difference while reading it.
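If both formats need to coexist, the reader could pick the offset unit from the date-string format itself. A hypothetical helper sketch (`cf_time_attrs` is not an existing MintPy function):

```python
from datetime import datetime

def cf_time_attrs(date_list):
    """Build CF-style time values + units from MintPy date strings.

    Accepts either daily "YYYYMMDD" or intra-day "YYYYMMDDTHHMM"
    strings and chooses "days since" vs "seconds since" accordingly.
    """
    fmt = "%Y%m%dT%H%M" if "T" in date_list[0] else "%Y%m%d"
    dates = [datetime.strptime(d, fmt) for d in date_list]
    ref = dates[0]
    if fmt == "%Y%m%d":
        values = [(d - ref).days for d in dates]
        units = f"days since {ref.isoformat()}"
    else:
        values = [(d - ref).total_seconds() for d in dates]
        units = f"seconds since {ref.isoformat()}"
    return values, units
```

The "T" sniff mirrors the format distinction described above; whether that check belongs here or in the existing date-parsing utilities is an open design question.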
I am not familiar enough with the new changes in this PR yet to help with the choice here. Having the above two questions answered would help.