rioxarray
ENVI header information is stripped on write
Code Sample
import rioxarray as rx
# Read an ENVI file, then write it straight back out with the ENVI driver.
data = rx.open_rasterio("./test_file.envi")
data.rio.to_raster("./output.envi", driver="ENVI")
Problem description
In the code above, every tag under the ENVI namespace is stripped when the file is written back out. The data reads in fine, but because the tag structure is flattened on read, the namespace context is lost by the time the file is written. As a result, header fields such as wavelengths, wavelength units, and acquisition time are lost.
Expected Output
I would expect the file written by to_raster to carry the same ENVI header information as the file that was read in.
Environment Information
rioxarray (0.13.3) deps:
rasterio: 1.3.5.post1
xarray: 2023.1.0
GDAL: 3.5.3
GEOS: 3.11.1
PROJ: 9.0.1
PROJ DATA: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/lib/python3.8/site-packages/rasterio/proj_data
GDAL DATA: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/lib/python3.8/site-packages/rasterio/gdal_data
Other python deps:
scipy: None
pyproj: 3.4.1
System:
python: 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]
executable: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/bin/python
machine: Linux-5.15.0-58-generic-x86_64-with-glibc2.29
Installation method
poetry
A simple solution would be to store an attribute recording which fields were read from the ENVI namespace, and then automatically include those tags on write. The gotcha, however, is that anyone who adds attributes or band information they want included in the ENVI header would need a way to indicate that.
Tracking of the read headers would probably go here: https://github.com/corteva/rioxarray/blob/master/rioxarray/_io.py#L713-L725
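To make the tracking idea concrete, here is a minimal pure-Python sketch. It is not rioxarray code: the `_tag_namespaces` key and both helper functions are hypothetical names, the tag values are made up for illustration, and plain dicts stand in for the tags read from the file and for `attrs`.

```python
# Hypothetical bookkeeping key; not an existing rioxarray attribute.
NAMESPACE_KEY = "_tag_namespaces"


def flatten_tags(tags_by_namespace):
    """Flatten {namespace: {key: value}} into one attrs dict, recording origins."""
    attrs = {}
    origins = {}
    for namespace, tags in tags_by_namespace.items():
        for key, value in tags.items():
            attrs[key] = value
            if namespace is not None:  # None stands for the default domain
                origins[key] = namespace
    attrs[NAMESPACE_KEY] = origins
    return attrs


def regroup_tags(attrs):
    """Invert flatten_tags: route each attribute back to its namespace."""
    origins = attrs.get(NAMESPACE_KEY, {})
    grouped = {}
    for key, value in attrs.items():
        if key == NAMESPACE_KEY:
            continue
        # Keys with no recorded origin fall back to the default domain (None).
        grouped.setdefault(origins.get(key), {})[key] = value
    return grouped


# Illustrative tags, as they might look when read per namespace.
read_tags = {
    None: {"AREA_OR_POINT": "Area"},
    "ENVI": {"wavelength units": "Nanometers", "sensor type": "Unknown"},
}
attrs = flatten_tags(read_tags)
attrs["comment"] = "added after reading"  # untracked, so it stays in the default domain
regrouped = regroup_tags(attrs)
```

This also shows the gotcha: the attribute added after reading lands in the default domain unless the user also registers it under `_tag_namespaces`. Keys that appear in more than one namespace would collide in the flattened dict, which this sketch does not handle.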
One potential solution would be to not flatten the tag structure for general metadata at all. So if a user wanted to include data in the ENVI namespace they could do so this way:
data.attrs["ENVI"]["wavelength units"]
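If attributes were kept nested like this, the writer would need to separate the namespaced groups from the flat attributes before writing. A rough sketch of that step, assuming every dict-valued attribute is meant as a namespace (`split_attrs` is a hypothetical helper, and a plain dict stands in for `data.attrs`):

```python
def split_attrs(attrs):
    """Separate flat attributes from dict-valued ones treated as namespaces."""
    default_tags = {}
    namespaced_tags = {}
    for key, value in attrs.items():
        if isinstance(value, dict):
            namespaced_tags[key] = dict(value)  # copy so the caller's attrs stay untouched
        else:
            default_tags[key] = value
    return default_tags, namespaced_tags


# Illustrative attributes with an unflattened ENVI namespace.
attrs = {
    "AREA_OR_POINT": "Area",
    "ENVI": {"wavelength units": "Nanometers"},
}
default_tags, namespaced_tags = split_attrs(attrs)

# A writer could then emit the default tags once and each namespace separately,
# for example with rasterio's dataset.update_tags(ns=namespace, **tags).
```

Treating every dict as a namespace is a simplification; attributes that are legitimately dict-valued for other reasons would need to be excluded.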
Per-band information is a bit more difficult...
Tag writing is here
When there are multiple bands, the band_tags key can be added as a list of dicts.
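As a small illustration of that shape, the list holds one dict per band, in band order. The wavelength values and the "wavelength" key below are made up for the example, and a plain dict stands in for the array's attrs.

```python
# Illustrative per-band values; a real file would supply its own.
wavelengths = [450.0, 550.0, 650.0]

# One dict per band, in band order.
band_tags = [{"wavelength": str(value)} for value in wavelengths]

# Stand-in for DataArray.attrs; on a real array this would be
# data.attrs["band_tags"] = band_tags
attrs = {"band_tags": band_tags}
```

Whether these per-band tags end up in the ENVI header rather than the default domain is a separate question that this sketch does not address.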