its_live_production
Granules, datacubes, composites and mosaics fixes
The following changes are required for:
Elevation:
- [x] Fix mapping attributes for `ANT_G1920V01_GroundedIceHeight.nc` according to NSIDC standard
- [x] Create premet and spatial files for ingest
- [x] Fill out data submission form
V02 Granules:
- [ ] Correction for projection distortion in optical and radar data
Datacubes:
- [x] Add radar variables to the data cubes (M11, M12). Keep them empty for optical image pairs
- va and vr are already stored in datacubes
- [x] Add chunking for 1-d data variables
- [x] Add landice mask to each of the datacubes
- [x] Add floatingice mask to each of the datacubes
- [x] Set `mid_date` to (date_center + microseconds(int('YYMMDD'))), where YYMMDD is the date of `acquisition_date_img1` of the granule
- [x] If there is no stable_shift, then stable_shift_flag = 0 and vy/x_stable_shift should equal zero.
- [x] Map all possible L89 sensor values ( '8.', '9.', '8.0', '9.0', 8.0, 9.0) to the same string: "8" or "9".
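The `mid_date` offset and the L89 sensor mapping from the list above can be sketched as follows. This is a minimal illustration, not the production code: the function names are hypothetical, and it assumes `YYMMDD` means the two-digit year, month and day of `acquisition_date_img1`.

```python
from datetime import datetime, timedelta


def unique_mid_date(date_center: datetime, acquisition_date_img1: datetime) -> datetime:
    """Offset date_center by int('YYMMDD') microseconds.

    Assumption: YYMMDD is the two-digit year, month and day of the
    first image acquisition date.
    """
    yymmdd = int(acquisition_date_img1.strftime("%y%m%d"))
    return date_center + timedelta(microseconds=yymmdd)


def normalize_l89_sensor(value) -> str:
    """Map '8.', '9.', '8.0', '9.0', 8.0, 9.0 to the string '8' or '9'.

    Values that do not parse as a number (for example other mission
    identifiers) are returned unchanged as strings.
    """
    try:
        return str(int(float(value)))
    except (TypeError, ValueError):
        return str(value)


if __name__ == "__main__":
    print(unique_mid_date(datetime(2019, 7, 10, 12), datetime(2019, 7, 4)))
    print([normalize_l89_sensor(v) for v in ("8.", "9.", "8.0", "9.0", 8.0, 9.0)])
```

Since YYMMDD never exceeds 991231, the offset stays below one second and only the microsecond field of `mid_date` changes.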
Composites:
- [x] Fix `v_error`: re-compute v_error based on `vx` and `vy` components instead of `vx0` and `vy0` components as it was done originally.
- [x] Change `v_error` computation to autoRIFT computation: `V_error = np.sqrt((vx_error * VX / V)**2 + (vy_error * VY / V)**2)`
- [x] Use analytical solution for `v_phase` and `v_amplitude`
- [x] Fix `datecube` typo in some composite attributes: datecube_created datecube_s3 datecube_updated datecube_url
- [x] Add EPSG code back to composite filenames so that composites for different EPSG codes do not get the same filename under the same subdirectory. Example of datacubes that result in the same composite filename `s3://its-live-data/composites/annual/v02/N40E070/ITS_LIVE_velocity_120m_X650000_Y4750000.zarr`:
  - `s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32642_G0120_X650000_Y4750000.zarr`
  - `s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32643_G0120_X650000_Y4750000.zarr`
- [x] Add landice to composites if it's not in datacube, propagate the mask to mosaics
- [x] Add floating ice mask to composites if it's not in datacube, propagate the mask to mosaics
- [x] Add stable_shift filter
- [x] Change dtype of `count0` data variable to uint32 to avoid overflow
- [x] Revise `dtype` of all data variables (see Chad's and Alex's summary spreadsheet on Slack from 8/24/2022)
- [x] Convert `outlier_fraction` to percent and set `_FillValue=255` when writing to the NetCDF or Zarr store
- [x] Determine minimum observation threshold to avoid huge `v0` values for some composites (examine `count0` for such points for existing GRE composites): Set model threshold to 30, invalidate all of the return variables from LSQ fit for all values exceeding that threshold
- [x] New filter to handle "bad" composite within GRE mosaics:
- [x] For the composite, exclude S2 cube layers that contain `23WPN` in their file name
- [x] Consider adding a mission-specific seasonal amplitude check: compute seasonal amplitudes for S1+L8 and S2 separately... if S2_amp > (S1+L8_amp)*2 then exclude S2 from seasonal fit
- [x] add a minimum difference in amplitude before removing S2 data: compute seasonal amplitudes for S1+L8 and S2 separately... if {S2_amp > [2 x (S1_L8)_amp]} & {(S2_amp - [(S1_L8)_amp] ) > 2 m/yr} then exclude S2 from seasonal fit
- [x] Use 2km in-buffer land ice masking:
- SensorExcludeFilter should only be applied if landice_2km_inbuff == 0
- The 2nd LSQ S2 filter should only be applied where landice_2km_inbuff == 1
- [x] Fix outlier_percent when second LSQ fit is applied (don't exclude land ice cells from the count; reset total count only for the cells which use 2nd LSQ fit results)
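The S2 exclusion rule from the list above can be written as a per-cell boolean mask. This is a hedged sketch under two assumptions: the amplitude check is the "2nd LSQ S2 filter" that applies only where `landice_2km_inbuff == 1`, and the function and parameter names are hypothetical rather than taken from the ITS_LIVE code.

```python
import numpy as np


def exclude_s2_mask(s2_amp, s1_l8_amp, landice_2km_inbuff, ratio=2.0, min_diff=2.0):
    """Return True where S2 data would be excluded from the seasonal fit.

    Rule from the checklist:
        S2_amp > ratio * (S1+L8)_amp  AND  S2_amp - (S1+L8)_amp > min_diff [m/yr]
    Assumption: the rule is evaluated only where landice_2km_inbuff == 1.
    """
    s2_amp = np.asarray(s2_amp, dtype=float)
    s1_l8_amp = np.asarray(s1_l8_amp, dtype=float)
    in_buffer = np.asarray(landice_2km_inbuff) == 1

    exceeds_ratio = s2_amp > ratio * s1_l8_amp
    exceeds_diff = (s2_amp - s1_l8_amp) > min_diff
    return exceeds_ratio & exceeds_diff & in_buffer


if __name__ == "__main__":
    mask = exclude_s2_mask(
        s2_amp=[10.0, 3.0, 10.0, 10.0],
        s1_l8_amp=[4.0, 1.0, 9.0, 4.0],
        landice_2km_inbuff=[1, 1, 1, 0],
    )
    print(mask)
```

The minimum-difference term keeps cells with small amplitudes in both groups from being flagged just because their ratio is large.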
Existing composites already include some of the fixes mentioned above (applied by running the `itslive/src/tools/fix_composites_*.py` scripts):
- [x] `v_error`
- [x] analytical solution for `v_phase` and `v_amplitude`
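The `v_error` formula quoted in the Composites list, `V_error = np.sqrt((vx_error * VX / V)**2 + (vy_error * VY / V)**2)`, can be sketched with NumPy as below. The function name is hypothetical, and returning NaN where the speed is zero is an assumption of this sketch, not a documented behavior of the composites code.

```python
import numpy as np


def v_error_from_components(vx, vy, vx_error, vy_error):
    """Propagate component errors to the velocity magnitude.

    V_error = sqrt((vx_error * VX / V)**2 + (vy_error * VY / V)**2),
    where V = sqrt(VX**2 + VY**2). Cells with V == 0 yield NaN (0/0).
    """
    vx = np.asarray(vx, dtype=float)
    vy = np.asarray(vy, dtype=float)
    vx_error = np.asarray(vx_error, dtype=float)
    vy_error = np.asarray(vy_error, dtype=float)

    v = np.hypot(vx, vy)
    # Suppress divide-by-zero warnings; the affected cells become NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.sqrt((vx_error * vx / v) ** 2 + (vy_error * vy / v) ** 2)


if __name__ == "__main__":
    print(v_error_from_components([3.0, 0.0], [4.0, 0.0], [1.0, 1.0], [2.0, 2.0]))
```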
Annual mosaics:
- [x] Strip `zero` from data variables names and their metadata in static mosaics
- [x] Parallelize re-projection code (matrix creation, apply transformation)
  * Using Dask to parallelize processing is slower than not using Dask
  * Try to use https://taichi-lang.org/
- [x] Increase chunking size when storing to NetCDF (to speed up data access)
- [x] Add missing `sensor_flag` data variable to static mosaics
- [x] Store cumulative attributes from composites in a standalone file for each mosaic, so as not to overcrowd the mosaics with metadata: composites_software_version datacube_software_version composites_created composites_updated datecube_created datecube_s3 datecube_updated datecube_url geo_polygon proj_polygon composites_url
- [x] Re-generate composites for HMA mosaics, build annual HMA mosaics based on "good" composites to verify the code works as expected
Production Runs
Note: these are possible runs, as they might not be relevant if we need to re-generate all datacubes and composites.
*** If we need to fix granules due to the map distortion issue, then we need to re-generate all datacubes and composites as well.
- [ ] Re-name "good" composites (not affected by described above issues) to include EPSG code into filename
- [ ] Change dtype of `count0` to uint32 for the composites where values don't overflow; re-generate composites for which the current `count0` overflows
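To illustrate why the renaming item above matters, the sketch below derives a composite filename from a datacube filename with and without the EPSG code. The regular expression follows the datacube names shown in the Composites section; the position of the EPSG token in the new composite name is a hypothetical choice for illustration only, since the actual naming scheme is not stated here.

```python
import re

# Pattern for datacube names such as
# ITS_LIVE_vel_EPSG32642_G0120_X650000_Y4750000.zarr
DATACUBE_RE = re.compile(
    r"ITS_LIVE_vel_EPSG(?P<epsg>\d+)_G(?P<grid>\d+)_X(?P<x>-?\d+)_Y(?P<y>-?\d+)\.zarr$"
)


def composite_name(datacube_url: str, include_epsg: bool) -> str:
    """Build a composite filename from a datacube URL.

    With include_epsg=False this reproduces the colliding name from the
    example; with include_epsg=True it uses a hypothetical layout.
    """
    match = DATACUBE_RE.search(datacube_url)
    if match is None:
        raise ValueError(f"Unexpected datacube name: {datacube_url}")
    grid_m = int(match["grid"])  # 'G0120' -> 120 (metres)
    epsg_part = f"EPSG{match['epsg']}_" if include_epsg else ""
    return f"ITS_LIVE_velocity_{epsg_part}{grid_m}m_X{match['x']}_Y{match['y']}.zarr"


if __name__ == "__main__":
    cubes = [
        "s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32642_G0120_X650000_Y4750000.zarr",
        "s3://its-live-data/datacubes/v02/N40E070/ITS_LIVE_vel_EPSG32643_G0120_X650000_Y4750000.zarr",
    ]
    for cube in cubes:
        print(composite_name(cube, include_epsg=False), composite_name(cube, include_epsg=True))
```

Without the EPSG token both datacubes map to the same composite name; with it the names differ.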