Granules, datacubes, composites and mosaics fixes

Open mliukis opened this issue 2 years ago • 0 comments

The following changes are required for:

Elevation:

  • [x] Fix mapping attributes for ANT_G1920V01_GroundedIceHeight.nc according to NSIDC standard
  • [x] Create premet and spatial files for ingest
  • [x] Fill out data submission form

V02 Granules:

  • [ ] Correction for projection distortion in optical and radar data

Datacubes:

  • [x] Add radar variables to the data cubes (M11, M12). Keep them empty for optical image pairs
    • va and vr are already stored in datacubes
  • [x] Add chunking for 1-d data variables
  • [x] Add landice mask to each of the datacubes
  • [x] Add floatingice mask to each of the datacubes
  • [x] Set mid_date to (date_center + microseconds(int('YYMMDD'))) where YYMMDD is date of acquisition_date_img1 of the granule
  • [x] If there is no stable_shift, set stable_shift_flag = 0; vx_stable_shift and vy_stable_shift should equal zero.
  • [x] Map all possible L89 sensor values ('8.', '9.', '8.0', '9.0', 8.0, 9.0) to the same string: "8" or "9".
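The last two datacube items can be sketched as small helpers. This is a minimal illustration, not the project's code: the function names `normalize_l89_sensor` and `offset_mid_date` are hypothetical, and the sketch assumes the dates are available as Python `datetime` objects.

```python
from datetime import datetime, timedelta


def normalize_l89_sensor(value):
    """Collapse '8.', '8.0', 8.0 (and the '9' equivalents) to '8' or '9'."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Non-numeric sensor value: return it unchanged as a string.
        return str(value)
    return str(int(number)) if number.is_integer() else str(value)


def offset_mid_date(date_center, acquisition_date_img1):
    """Return date_center + microseconds(int('YYMMDD')) of the first image date."""
    yymmdd = int(acquisition_date_img1.strftime("%y%m%d"))
    return date_center + timedelta(microseconds=yymmdd)
```

Because YYMMDD is at most 991231, the offset always stays below one second and fits in the microsecond field of the timestamp.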

Composites:

  • [x] Fix v_error: re-compute v_error based on vx and vy components instead of vx0 and vy0 components as it was done originally.
  • [x] Change v_error computation to autoRIFT computation: V_error = np.sqrt((vx_error * VX / V)**2 + (vy_error * VY / V)**2)
  • [x] Use analytical solution for v_phase and v_amplitude
  • [x] Fix the "datecube" typo (should be "datacube") in some composite attributes: datecube_created, datecube_s3, datecube_updated, datecube_url
  • [x] Add the EPSG code back to composite filenames so that composites for different EPSG codes do not end up with the same filename under the same subdirectory. For example, both of the following datacubes result in the same composite filename, s3://its-live-data/composites/annual/v02/N40E070/ITS_LIVE_velocity_120m_X650000_Y4750000.zarr:
    • s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32642_G0120_X650000_Y4750000.zarr
    • s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32643_G0120_X650000_Y4750000.zarr
  • [x] Add landice to composites if it's not in datacube, propagate the mask to mosaics
  • [x] Add floating ice mask to composites if it's not in datacube, propagate the mask to mosaics
  • [x] Add stable_shift filter
  • [x] Change dtype of count0 data variable to uint32 to avoid overflow
  • [x] Revise dtype of all data variables (see Chad's and Alex's summary spreadsheet on Slack from 8/24/2022)
  • [x] Convert outlier_fraction to percent and set _FillValue=255 when writing to the NetCDF or Zarr store
  • [x] Determine minimum observation threshold to avoid huge v0 values for some composites (examine count0 for such points for existing GRE composites): Set model threshold to 30, invalidate all of the return variables from LSQ fit for all values exceeding that threshold
  • [x] New filter to handle "bad" composite within GRE mosaics:
    • [x] For the composite exclude S2 cube layers that contain 23WPN in their file name
    • [x] Look into adding a mission-specific seasonal amplitude check: compute seasonal amplitudes for S1+L8 and S2 separately; if S2_amp > 2 × (S1+L8)_amp, exclude S2 from the seasonal fit
    • [x] Add a minimum difference in amplitude before removing S2 data: compute seasonal amplitudes for S1+L8 and S2 separately; if S2_amp > 2 × (S1+L8)_amp and (S2_amp - (S1+L8)_amp) > 2 m/yr, exclude S2 from the seasonal fit
    • [x] Use 2km in-buffer land ice masking:
      • SensorExcludeFilter should only be applied if landice_2km_inbuff == 0
      • The 2nd LSQ S2 filter should only be applied where landice_2km_inbuff == 1
  • [x] Fix outlier_percent when second LSQ fit is applied (don't exclude land ice cells from the count; reset total count only for the cells which use 2nd LSQ fit results)
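
The S2 exclusion rule from the list above (amplitude ratio, minimum amplitude difference, and the 2 km in-buffer mask) can be written as a single boolean mask. This is a sketch under assumptions: the function name `s2_exclusion_mask` and its parameter names are hypothetical, and the inputs are assumed to be NumPy-compatible arrays of the same shape, with amplitudes in m/yr.

```python
import numpy as np


def s2_exclusion_mask(s2_amp, s1_l8_amp, landice_2km_inbuff,
                      ratio=2.0, min_diff=2.0):
    """True where S2 should be excluded from the seasonal fit."""
    s2_amp = np.asarray(s2_amp, dtype=float)
    s1_l8_amp = np.asarray(s1_l8_amp, dtype=float)
    in_buffer = np.asarray(landice_2km_inbuff) == 1

    # S2 amplitude must exceed twice the S1+L8 amplitude ...
    too_large = s2_amp > ratio * s1_l8_amp
    # ... and exceed it by more than min_diff (2 m/yr in the issue).
    large_gap = (s2_amp - s1_l8_amp) > min_diff

    # Apply the rule only where landice_2km_inbuff == 1.
    return too_large & large_gap & in_buffer
```

Cells with NaN amplitudes compare as False, so they are never flagged for exclusion in this sketch.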
Existing composites already include some of the fixes mentioned above (applied by running the itslive/src/tools/fix_composites_*.py scripts):
  • [x] v_error
  • [x] analytical solution for v_phase and v_amplitude
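
For reference, the v_error formula quoted in the Composites list can be evaluated as follows. This is a minimal sketch: the function name `compute_v_error` is hypothetical, and returning NaN where the speed is zero is an assumption made here, since the issue does not say how that case is handled.

```python
import numpy as np


def compute_v_error(vx, vy, vx_error, vy_error):
    """V_error = sqrt((vx_error * VX / V)**2 + (vy_error * VY / V)**2)."""
    vx = np.asarray(vx, dtype=float)
    vy = np.asarray(vy, dtype=float)
    vx_error = np.asarray(vx_error, dtype=float)
    vy_error = np.asarray(vy_error, dtype=float)

    v = np.hypot(vx, vy)  # speed magnitude V
    # Where V == 0 the ratio is 0/0; suppress the warning and keep NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.sqrt((vx_error * vx / v) ** 2 + (vy_error * vy / v) ** 2)
```

With vx = 3, vy = 4, vx_error = 1 and vy_error = 2, V is 5 and the result is sqrt(0.6² + 1.6²) = sqrt(2.92).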

Annual mosaics:

  • [x] Strip zero from data variables names and their metadata in static mosaics

  • [x] Parallelize re-projection code (matrix creation, apply transformation)
    • Using Dask to parallelize processing is slower than not using Dask
    • Try to use https://taichi-lang.org/

  • [x] Increase chunking size when storing to NetCDF (to speed up data access)

  • [x] Add missing sensor_flag data variable to static mosaics

  • [x] Store cumulative attributes from composites in a standalone file for each mosaic, so as not to overcrowd the mosaics with metadata:

    composites_software_version, datacube_software_version, composites_created, composites_updated, datecube_created, datecube_s3, datecube_updated, datecube_url, geo_polygon, proj_polygon, composites_url

  • [x] Re-generate composites for HMA mosaics, build annual HMA mosaics based on "good" composites to verify the code works as expected

Production Runs

Note: these runs are tentative; they may not be relevant if we need to re-generate all datacubes and composites. If we need to fix granules because of the map distortion issue, then all datacubes and composites must be re-generated as well.

  • [ ] Rename "good" composites (those not affected by the issues described above) to include the EPSG code in the filename
  • [ ] Change dtype of count0 to uint32 for the composites where values don't overflow; re-generate composites for which the current count0 overflows
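
The filename collision behind the rename can be reproduced with a short sketch. The datacube pattern and the EPSG-free composite name follow the N40E070 example quoted earlier in this issue. The function name `composite_names`, the reading of `G0120` as a 120 m grid, and the EPSG-qualified layout are assumptions for illustration only, since the issue does not specify the new naming scheme.

```python
import re

# Datacube file name pattern, taken from the N40E070 example in this issue.
_DATACUBE_NAME = re.compile(
    r"ITS_LIVE_vel_EPSG(?P<epsg>\d+)_G(?P<grid>\d{4})_X(?P<x>-?\d+)_Y(?P<y>-?\d+)\.zarr"
)


def composite_names(datacube_url):
    """Return (name_without_epsg, name_with_epsg) for a datacube URL."""
    name = datacube_url.rstrip("/").rsplit("/", 1)[-1]
    match = _DATACUBE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"Unexpected datacube file name: {name}")

    resolution = int(match["grid"])  # assumed: G0120 means a 120 m grid
    tile = f"X{match['x']}_Y{match['y']}"

    without_epsg = f"ITS_LIVE_velocity_{resolution}m_{tile}.zarr"
    # Hypothetical EPSG-qualified layout; the real scheme may differ.
    with_epsg = f"ITS_LIVE_velocity_EPSG{match['epsg']}_{resolution}m_{tile}.zarr"
    return without_epsg, with_epsg
```

Applied to the EPSG32642 and EPSG32643 datacubes from the example, the first name is identical for both, while the second differs.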

mliukis, Aug 17 '22 18:08