ml_drought

era5POS preprocessor

Open tommylees112 opened this issue 5 years ago • 3 comments

The era5POS data is hourly, and converting it to monthly by simply resampling with a mean doesn't necessarily make sense:

    @staticmethod
    def resample_time(ds: xr.Dataset,
                      resample_length: str = 'M',
                      upsampling: bool = False) -> xr.Dataset:

        # TODO: would be nice to programmatically get upsampling / not
        ds = ds.sortby('time')

        resampler = ds.resample(time=resample_length)

        if not upsampling:
            return resampler.mean()
        else:
            return resampler.nearest()

Because the result is then the mean HOURLY precipitation over a month, when really we should be summing over the month to get a monthly total.
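To make the difference concrete, here is a minimal sketch using only the standard library rather than xarray. The constant rainfall rate and the helper names (`hours_in_month`, `monthly_mean_and_total`) are assumptions for illustration and are not part of the preprocessor.

```python
import calendar


def hours_in_month(year: int, month: int) -> int:
    """Number of hourly timesteps in a calendar month."""
    _, n_days = calendar.monthrange(year, month)
    return n_days * 24


def monthly_mean_and_total(hourly_mm):
    """Return (mean in mm/hour, total in mm/month) for one month of hourly values."""
    total = sum(hourly_mm)
    return total / len(hourly_mm), total


# Hypothetical constant rainfall rate, chosen only to keep the arithmetic exact
RATE_MM_PER_HOUR = 0.25

jan = [RATE_MM_PER_HOUR] * hours_in_month(2019, 1)
feb = [RATE_MM_PER_HOUR] * hours_in_month(2019, 2)

jan_mean, jan_total = monthly_mean_and_total(jan)
feb_mean, feb_total = monthly_mean_and_total(feb)

print(jan_mean, jan_total)  # 0.25 186.0
print(feb_mean, feb_total)  # 0.25 168.0
```

The mean is identical for both months, while the totals differ by the number of hours in each month (744 vs 672). A mean in mm/hour and a CHIRPS-style total in mm/month are therefore on very different scales.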

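...
        # `data` is assumed to be the dataset name, passed in by the caller
        if data == 'era5POS':
            # hourly precipitation: sum to get a monthly total
            return resampler.sum()
        elif not upsampling:
            return resampler.mean()
        else:
            return resampler.nearest()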

Otherwise we get extremely different spatial patterns:

CHIRPS (mm/month)

Screenshot 2019-06-27 at 12 16 37

ERA5POS (mm/hour mean for each month)

Screenshot 2019-06-27 at 12 16 42

tommylees112, Jun 27 '19 11:06

Isn't this okay if we normalize (which we do)?

gabrieltseng, Jun 28 '19 16:06
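On the normalization question above, a rough check is sketched below. It assumes a single global z-score over all months, which may not match how the pipeline actually normalizes, and the monthly rates are made-up numbers. A constant scale factor drops out of a z-score, but the mean-to-sum factor is the number of hours in each month, which varies from 672 to 744, so the two normalized series end up close but not identical.

```python
import calendar
import statistics


def zscore(values):
    """Standardize a sequence to zero mean and unit (population) std."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical monthly mean rates in mm/hour for 2019 (illustrative only)
mean_rates = [0.10, 0.12, 0.20, 0.30, 0.25, 0.05,
              0.02, 0.03, 0.08, 0.22, 0.28, 0.15]

# Hours in each month of 2019, and the corresponding monthly totals in mm
hours = [calendar.monthrange(2019, m)[1] * 24 for m in range(1, 13)]
totals = [rate * h for rate, h in zip(mean_rates, hours)]

# Case 1: constant scale factor, which z-scoring removes
scaled = [rate * 730 for rate in mean_rates]
max_diff_const = max(
    abs(a - b) for a, b in zip(zscore(mean_rates), zscore(scaled))
)

# Case 2: month-dependent factor (mean -> sum), which z-scoring does not remove
max_diff_month = max(
    abs(a - b) for a, b in zip(zscore(mean_rates), zscore(totals))
)

print(max_diff_const)  # effectively zero (floating-point noise)
print(max_diff_month)  # small, but clearly non-zero
```

If the normalization were instead done separately for each calendar month, the factor would be constant within each group apart from leap-year Februaries, and the difference would mostly vanish.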

Hmm, yeah, you would think so actually. But normalizing wouldn't change the spatial patterns, and they're so vastly different that I'm just a little concerned we haven't done the preprocessing correctly!

tommylees112, Jun 28 '19 18:06

Are they so different? They have extremely different resolutions (this is only resolved at the engineering step), but the peaks in precipitation look like they are in roughly the same spot, as well as the troughs.

gabrieltseng, Jun 28 '19 18:06