
Rewrite cell/swath xarray readers as MultiFileHandlers

Open claytharrison opened this issue 2 months ago • 1 comments

This pull request reimplements the reading and merging logic for swath and cell files using the structure established by MultiFileHandler, ChronFiles, etc. in the file_handling module.

As of this commit, readers for cell files are implemented (RaggedArray and OrthoMulti). Basic usage looks something like this:

```python
from ascat.read_native.cell_collection import RaggedArrayFiles, OrthoMultiArrayFiles

contiguous_ra_source = "/path/to/contiguous/sig0_12.5/metop_a"
indexed_ra_source = "/path/to/indexed/sig0_12.5/metop_a"
multisat_ra_source = "/path/to/indexed/sig0_12.5/"
orthomulti_source = "/path/to/era5_land_2023/"
orthomulti_grid = "/path/to/era5_land_2023/grid.nc"

# A chunk of the Amazon; the values here are in (latmin, latmax, lonmin, lonmax) order.
# You can also query by a list of location_ids, by cell number, or by lon/lat coords.
bbox = (-7, -4, -69, -65)

contiguous_ra_files = RaggedArrayFiles(contiguous_ra_source, product_id="sig0_12.5")
indexed_ra_files = RaggedArrayFiles(indexed_ra_source, product_id="sig0_12.5")

# Right now the "all_sats" parameter just indicates that the files are nested within
# metop_a/metop_b/metop_c directories underneath the root dir.
# This is of course neither general nor ideal.
multisat_ra_files = RaggedArrayFiles(multisat_ra_source, product_id="sig0_12.5", all_sats=True)

# For OrthoMulti, right now you just pass the grid file path as an argument and a
# pygeogrids object is generated from it. The product_id has no effect in this case.
orthomulti_files = OrthoMultiArrayFiles(orthomulti_source, product_id="this_doesnt_matter_in_this_case", grid=orthomulti_grid)

# Extract the data.
contiguous_ra_ds = contiguous_ra_files.extract(bbox=bbox)
indexed_ra_ds = indexed_ra_files.extract(bbox=bbox)
# ^ These two should be the same, since contiguous RAs are converted to indexed before merging.

multisat_ra_ds = multisat_ra_files.extract(bbox=bbox)

orthomulti_ds = orthomulti_files.extract(bbox=bbox)
```
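The comment above notes that contiguous ragged arrays are converted to indexed before merging. The following is a minimal, library-free sketch of that idea, not the code in this pull request: the helper names (`contiguous_to_indexed`, `merge_indexed`), the variable names, and the sample values are all made up for illustration. In a contiguous ragged array each location stores a count of its observations, whereas in an indexed ragged array each observation stores the index of its location, which makes concatenating several files straightforward.

```python
from itertools import chain, repeat


def contiguous_to_indexed(row_size):
    """Expand per-location observation counts into a per-observation location index."""
    return list(chain.from_iterable(repeat(i, n) for i, n in enumerate(row_size)))


def merge_indexed(parts):
    """Merge indexed ragged arrays given as (location_ids, location_index, values).

    Hypothetical helper: location ids are unified and every observation index
    is remapped to point into the merged id list.
    """
    merged_ids = sorted({lid for ids, _, _ in parts for lid in ids})
    position = {lid: i for i, lid in enumerate(merged_ids)}
    merged_index, merged_values = [], []
    for ids, index, values in parts:
        merged_index.extend(position[ids[i]] for i in index)
        merged_values.extend(values)
    return merged_ids, merged_index, merged_values


# Contiguous layout: observations are stored location by location, and
# row_size says how many observations belong to each location.
location_id = [101, 102, 103]
row_size = [2, 0, 3]
sig0 = [-8.1, -8.4, -9.0, -9.2, -8.8]

# Indexed layout: one location index per observation.
location_index = contiguous_to_indexed(row_size)

# A second (already indexed) file that shares location 102 with the first one.
other = ([102, 104], [0, 1, 0], [-7.5, -7.9, -7.7])

merged_ids, merged_index, merged_values = merge_indexed(
    [(location_id, location_index, sig0), other]
)
```

With array data the expansion step is usually a single vectorized repeat of the location positions by their counts; the pure-Python version here only illustrates the data layout.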

To do:

  • ~Add swath file reader~ Finish swath file reader
  • Find a robust method of handling product-specific information such as grids, including a way for users to provide it themselves. For the cell reader we only really need to pass the grid, but for the swath reader this will get more complicated.
  • Add the ability to write data out according to a different (arbitrary) cell scheme.
  • Try integration with the regrid applications and make sure that still works nicely.
  • Choose better names for things.
  • Add whatever else is missing compared to the old version.
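As a rough illustration of the "different cell scheme" item above, here is one way points could be regrouped under another cell size. This is a sketch under stated assumptions, not the planned implementation: it assumes a regular lon/lat cell partition numbered column by column from the lower-left corner (-180, -90), and the function names `lonlat_to_cell` and `group_by_cell` are hypothetical.

```python
import math


def lonlat_to_cell(lon, lat, cellsize=5.0):
    """Return the cell number of a point on a regular lon/lat cell partition.

    Assumption: cells are numbered column by column, starting at the
    lower-left corner (-180, -90) and counting northwards first.
    """
    n_rows = round(180 / cellsize)
    n_cols = round(360 / cellsize)
    # Clamp so that lon == 180 or lat == 90 falls into the last column/row.
    col = min(int(math.floor((lon + 180.0) / cellsize)), n_cols - 1)
    row = min(int(math.floor((lat + 90.0) / cellsize)), n_rows - 1)
    return col * n_rows + row


def group_by_cell(points, cellsize):
    """Group (lon, lat) points by the cell they fall into."""
    groups = {}
    for lon, lat in points:
        groups.setdefault(lonlat_to_cell(lon, lat, cellsize), []).append((lon, lat))
    return groups


# Two corners of the bbox used earlier, plus one point in Europe.
points = [(-69.0, -7.0), (-65.0, -4.0), (16.37, 48.2)]

cells_5deg = group_by_cell(points, cellsize=5.0)    # three separate cells
cells_10deg = group_by_cell(points, cellsize=10.0)  # the two Amazon points share a cell
```

Writing to a new scheme would then amount to collecting, for each target cell, the locations that fall inside it and writing one file per cell; how that maps onto the existing file handlers is left open here.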

claytharrison, Apr 24 '24 12:04