
Envi header information is stripped on write

Open AndrewGuenther opened this issue 2 years ago • 3 comments

Code Sample

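import rioxarray as rx

# Read an ENVI file, then write it straight back out with the ENVI driver.
# Named `data` rather than `input` to avoid shadowing the Python builtin.
data = rx.open_rasterio("./test_file.envi")
data.rio.to_raster("./output.envi", driver="ENVI")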

Problem description

In the above code, any tag under the ENVI namespace is stripped when the file is written back out. The data reads in just fine, but because the tag structure is flattened on read, the namespace context is lost on write. As a result, headers such as wavelengths, wavelength units, and acquisition time are lost.

Expected Output

I would expect the file written by a subsequent to_raster call to carry exactly the same header information as the file that was read in.

Environment Information

rioxarray (0.13.3) deps:
  rasterio: 1.3.5.post1
    xarray: 2023.1.0
      GDAL: 3.5.3
      GEOS: 3.11.1
      PROJ: 9.0.1
 PROJ DATA: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/lib/python3.8/site-packages/rasterio/proj_data
 GDAL DATA: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/lib/python3.8/site-packages/rasterio/gdal_data

Other python deps:
     scipy: None
    pyproj: 3.4.1

System:
    python: 3.8.10 (default, Nov 14 2022, 12:59:47)  [GCC 9.4.0]
executable: /home/andrew/.cache/pypoetry/virtualenvs/rioxarray-test-GzltKOLB-py3.8/bin/python
   machine: Linux-5.15.0-58-generic-x86_64-with-glibc2.29

Installation method

poetry

AndrewGuenther, Feb 09 '23 22:02

A simple solution would be to store an attribute indicating which fields were read from the ENVI namespace and then automatically include those tags on write. The gotcha, however, is that if someone added attributes or band information that they wanted included in the ENVI header, they would need a way to indicate that.

Tracking of the read headers would probably go here: https://github.com/corteva/rioxarray/blob/master/rioxarray/_io.py#L713-L725
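The tracking idea above can be sketched in plain Python, independent of rioxarray internals. This is only an illustration under assumptions: the `_envi_keys` attribute name and the `flatten_tags` / `split_tags` helpers are hypothetical and do not exist in rioxarray.

```python
# Hypothetical bookkeeping attribute; not part of rioxarray.
ENVI_KEYS_ATTR = "_envi_keys"


def flatten_tags(default_tags, envi_tags):
    """Merge default and ENVI tags into one flat dict, recording the ENVI keys."""
    attrs = {**default_tags, **envi_tags}
    attrs[ENVI_KEYS_ATTR] = sorted(envi_tags)
    return attrs


def split_tags(attrs):
    """Split flat attrs back into (default_tags, envi_tags) for writing."""
    envi_keys = set(attrs.get(ENVI_KEYS_ATTR, ()))
    default_tags, envi_tags = {}, {}
    for key, value in attrs.items():
        if key == ENVI_KEYS_ATTR:
            continue  # bookkeeping only, never written out
        target = envi_tags if key in envi_keys else default_tags
        target[key] = value
    return default_tags, envi_tags


# Simulated read: one default-namespace tag and one ENVI tag.
attrs = flatten_tags(
    {"AREA_OR_POINT": "Area"},
    {"wavelength units": "Nanometers"},
)

# The gotcha: a key added after the read is not known to be an ENVI key.
attrs["sensor type"] = "Unknown"

default_tags, envi_tags = split_tags(attrs)
```

Here `wavelength units` is routed back to the ENVI namespace, while the user-added `sensor type` falls into the default tags unless the user also appends it to the tracking attribute.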

One potential solution would be to not flatten the tag structure for general metadata at all. So if a user wanted to include data in the ENVI namespace they could do so this way:

data.attrs["ENVI"]["wavelength units"]

Per-band information is a bit more difficult...
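For the dataset-level case, the nested layout suggested above might look roughly like this on the write side. It is a sketch, not how rioxarray behaves today: `split_namespaces` is a hypothetical helper, and treating "ENVI" as the only namespace is an assumption.

```python
def split_namespaces(attrs, namespaces=("ENVI",)):
    """Separate flat attributes from nested, per-namespace tag dicts.

    Hypothetical helper: returns (flat_tags, {namespace: tags}).
    """
    flat_tags = {}
    namespaced_tags = {}
    for key, value in attrs.items():
        if key in namespaces and isinstance(value, dict):
            namespaced_tags[key] = dict(value)  # copy to avoid aliasing attrs
        else:
            flat_tags[key] = value
    return flat_tags, namespaced_tags


# Attributes as they might look if the ENVI namespace were not flattened.
attrs = {
    "AREA_OR_POINT": "Area",
    "ENVI": {"wavelength units": "Nanometers"},
}

# Adding a new ENVI header field needs no extra bookkeeping.
attrs["ENVI"]["sensor type"] = "Unknown"

flat_tags, namespaced_tags = split_namespaces(attrs)
```

On write, each entry of `namespaced_tags` could then be handed to rasterio's `update_tags(ns=..., **tags)` on the open dataset. Whether the ENVI driver persists every such key in the header file is not something this sketch verifies.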

AndrewGuenther, Feb 09 '23 22:02

Tag writing is here

snowman2, Feb 10 '23 21:02

When there are multiple bands, the band_tags key can be added as a list of dicts.
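As a rough illustration of the shape described above, the list holds one dict per band, in band order. The sketch assumes the `band_tags` key is set in the attributes that get written, which the comment does not spell out. The `build_band_tags` helper and the `wavelength` / `wavelength_units` key names are illustrative only.

```python
def build_band_tags(wavelengths, units):
    """Build one tag dict per band, in band order (hypothetical helper)."""
    return [
        {"wavelength": str(wavelength), "wavelength_units": units}
        for wavelength in wavelengths
    ]


# A plain dict standing in for the attributes of a three-band raster.
attrs = {"AREA_OR_POINT": "Area"}
attrs["band_tags"] = build_band_tags([450.0, 550.0, 650.0], "Nanometers")
```

The list length should match the number of bands, since each dict is applied to the band at the same position.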

snowman2, Feb 10 '23 21:02