ml_drought era5POS preprocessor

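When the hourly era5POS data is converted to monthly, it doesn't necessarily make sense to just take the mean, which is what `resample_time` currently does when `upsampling` is False (hourly to monthly is downsampling, not upsampling):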

    @staticmethod
    def resample_time(ds: xr.Dataset,
                      resample_length: str = 'M',
                      upsampling: bool = False) -> xr.Dataset:

        # TODO: would be nice to programmatically get upsampling / not
        ds = ds.sortby('time')

        resampler = ds.resample(time=resample_length)

        if not upsampling:
            return resampler.mean()
        else:
            return resampler.nearest()

The result is then the mean HOURLY precipitation over a month (mm/hour), when really we should be summing over the month to get a monthly total (mm/month).
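To make the difference concrete, here is a minimal pure-Python sketch (no xarray) of the two aggregations for a single pixel. The pixel values and the helper names `hours_in_month` and `monthly_mean_and_total` are hypothetical and not part of ml_drought; it also assumes the hourly values are accumulations in mm.

```python
import calendar


def hours_in_month(year: int, month: int) -> int:
    """Number of hourly time steps in a calendar month."""
    _, n_days = calendar.monthrange(year, month)
    return n_days * 24


def monthly_mean_and_total(hourly_mm):
    """Return (mean mm/hour, total mm/month) for one month of hourly values."""
    total = sum(hourly_mm)
    return total / len(hourly_mm), total


# Hypothetical pixel: a constant 0.25 mm/hour throughout June 2019
n_hours = hours_in_month(2019, 6)
june = [0.25] * n_hours
mean_rate, total = monthly_mean_and_total(june)

print(n_hours, mean_rate, total)  # 720 0.25 180.0
```

The monthly total is the mean multiplied by the number of hours in the month (672 to 744 depending on the month), so the two quantities differ in units and by roughly three orders of magnitude.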

...
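        # NOTE: `data` (the dataset name) is not yet an argument of resample_time
        if upsampling:
            return resampler.nearest()
        elif data == 'era5POS':
            # hourly precipitation -> monthly total
            return resampler.sum()
        else:
            return resampler.mean()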

Otherwise we get extremely different spatial patterns:

CHIRPS (mm/month)

[image: Screenshot 2019-06-27 at 12 16 37]

ERA5POS (mean mm/hour for each month)

[image: Screenshot 2019-06-27 at 12 16 42]

Jun 27 '19 11:06 tommylees112

Isn't this okay if we normalize (which we do)?

Jun 28 '19 16:06 gabrieltseng

Hmm, yeah, you would think so actually. But normalizing wouldn't change the spatial patterns, and they're so vastly different that I'm just a little concerned we haven't done the preprocessing correctly!
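As a rough check of the normalization point, here is a small sketch assuming the normalization is a z-score (the thread doesn't say which normalization is used) and using made-up pixel values. Within a single month, the summed field is just the mean field times a constant, so the normalized spatial pattern is identical:

```python
import math
import statistics


def zscore(values):
    """Standardize values to zero mean and unit (population) std."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical mean hourly precipitation (mm/hour) at four pixels in one month
mean_field = [0.0, 0.125, 0.25, 0.5]
n_hours = 720  # a 30-day month
total_field = [v * n_hours for v in mean_field]  # mm/month

# A constant positive scale factor drops out of the z-score
same = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(zscore(mean_field), zscore(total_field))
)
print(same)  # True
```

Across months the factor is not constant (672 to 744 hours), so normalizing over time would not make the mean and the sum exactly equivalent, only close.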

Jun 28 '19 18:06 tommylees112

Are they so different? They have extremely different resolutions (this is only resolved at the engineering step), but the peaks in precipitation look like they are in roughly the same spot, as well as the troughs.

Jun 28 '19 18:06 gabrieltseng
