Time misalignment between ERA5 and SARAH?

Open matzech opened this issue 1 year ago • 6 comments

Version Checks (indicate both or one)

  • [ ] I have confirmed this bug exists on the latest release of Atlite.

  • [X] I have confirmed this bug exists on the current master branch of Atlite.

Issue Description

Hi,

I think there may be a time misalignment in the current implementation when working with instantaneous (satellite) data. As correctly noted and handled in the ERA5 module (e.g. here: https://github.com/PyPSA/atlite/blob/master/atlite/datasets/era5.py#L173-L175), ERA5 labels accumulated values with the end of the accumulation hour, meaning 11:00 refers to 10:00-11:00.

Now, in the SARAH implementation you take the mean of the arrays at 11:00 and 11:30 and assign the time index of the first array (11:00): https://github.com/PyPSA/atlite/blob/master/atlite/datasets/sarah.py#L153-L156, which leads to the 1-hour time misalignment. See e.g. the spatially averaged GHI values for a June day: (image)

So if I did not overlook anything and this is indeed a bug, the only change required would be: ds = ds.assign_coords(time=ds.indexes["time"] + pd.Timedelta(60, "m")) after merging the data with the solar position (https://github.com/PyPSA/atlite/blob/master/atlite/datasets/sarah.py#L237)
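To make the labelling issue concrete, here is a minimal sketch with made-up half-hourly values. It uses a plain pandas Series as a stand-in for the xarray dataset, and the names ghi, hourly and shifted are purely illustrative; this is not the atlite code itself, only the labelling behaviour as described above.

```python
import pandas as pd

# Made-up half-hourly instantaneous values from 10:00 to 12:30.
times = pd.date_range("2013-06-01 10:00", periods=6, freq="30min")
ghi = pd.Series([100.0, 200.0, 300.0, 400.0, 500.0, 600.0], index=times)

# Behaviour described in the issue: average the scans at hh:00 and hh:30
# and keep the label of the first scan (hh:00).
first = ghi.iloc[0::2]
second = ghi.iloc[1::2]
hourly = pd.Series((first.to_numpy() + second.to_numpy()) / 2, index=first.index)

# Proposed fix: move every label forward by 60 minutes so that it marks
# the end of the hour, as in the ERA5 convention.
shifted = hourly.copy()
shifted.index = shifted.index + pd.Timedelta(minutes=60)

print(hourly)
print(shifted)
```

In this toy series the mean of the 11:00 and 11:30 scans is labelled 11:00 before the shift and 12:00 after it, while an ERA5 value labelled 11:00 covers 10:00-11:00.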

I could fix this in the sarah3 compatibility pull request (https://github.com/PyPSA/atlite/pull/352) if required.

Reproducible Example

No response

Expected Behavior

No response

Installed Versions

Replace this line.

matzech avatar Jul 10 '24 16:07 matzech

Mmmh, I have the concern that you are right. The point is that the hourly mean function in the sarah module is "wrong". As far as I can see, it should be + instead of - in https://github.com/PyPSA/atlite/blob/1b3a3c0908538a178997a4991f9c4c062f8612fe/atlite/datasets/sarah.py#L155, right?

FabianHofmann avatar Jul 11 '24 08:07 FabianHofmann

Yes, indeed. Although I think that changing this line to '+' would mess up the calculation of the solar position, right? https://github.com/PyPSA/atlite/blob/1b3a3c0908538a178997a4991f9c4c062f8612fe/atlite/datasets/sarah.py#L233

So either

  • change the reference time when calculating the hourly mean and change the timeshift attribute in the solar position calculation, or
  • change the labels to the ERA5 convention at the end

matzech avatar Jul 11 '24 08:07 matzech

I am not so sure about that. So the convention should be that an indexing hour (assuming hourly resolution) represents the completed hour. So, a value at 11:00 am represents the mean from 10:00 am to 11:00 am. This is how it is handled by era5 and how it was intended by the sarah module (however there is this bug).

Could you explain to what extent the solar position is misaligned? Perhaps the cleanest way is to also take the average between 10:30 and 11:00 for the solar position in this example.

FabianHofmann avatar Jul 11 '24 08:07 FabianHofmann

Hi both,

Just wanted to comment on this since @martavp and I looked into this issue for an analysis I'm currently doing that involves modeling east-facing and west-facing solar panels. To start, here are two links from PVGIS that I found helpful for explaining the issue and the problems it could cause:

  • PVGIS documentation note 9.3
  • PVGIS 5.2 release notes

After reading them, I tried shifting the original cutout (time shift of '-1 days +23:30:00') to create a new cutout. I tested this new cutout, and the azimuth and altitude fit better with the ERA5 cutout. (image)

So for me, solar position was misaligned by 30 minutes and what SARAH showed at 8:30 is what ERA5 showed at 8:00.
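As a small illustration of that shift: the string '-1 days +23:30:00' is how pandas writes a negative half hour. The sketch below applies it to a few made-up SARAH timestamps; the names sarah_times and aligned are hypothetical and not part of atlite.

```python
import pandas as pd

# '-1 days +23:30:00' is the pandas representation of minus 30 minutes.
shift = pd.Timedelta("-1 days +23:30:00")
print(shift == pd.Timedelta(minutes=-30))

# Made-up SARAH time stamps at half past the hour.
sarah_times = pd.date_range("2013-06-01 08:30", periods=3, freq="60min")

# After the shift, the 08:30 stamp lines up with the 08:00 ERA5 stamp.
aligned = sarah_times + shift
print(aligned)
```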

Parisra avatar Jul 11 '24 09:07 Parisra

Thanks for the comments.

Oh, yes, @FabianHofmann. You are right about the solar position.

Speaking of the scan weighting, I think the weighting can also be improved. Taking only 2 values assumes that these two scans approximate the hour reasonably. This means that the average of 10:30 and 11:00 is considered a good estimate of 11:00, but you could argue it only describes the evolution of the half-hour from 10:30 to 11:00.

I think the more accurate way would be to reconsider the weighting of the instantaneous values and adapt it, as described for instance in the dissertation by Annette Hammer (sorry, in German: https://oops.uni-oldenburg.de/317/1/347.pdf, p. 82). Note that satellite scan times are also considered there; these could be ignored for simplicity.

This (and the time alignment error) could then be solved by

 ds1 = ds.isel(time=slice(None, None, 2))
 ds2 = ds.isel(time=slice(1, None, 2))
 ds3 = ds.isel(time=slice(2, None, 2))

 ds2 = ds2.assign_coords(time=ds2.indexes["time"] + pd.Timedelta(30, "m"))
 ds3 = ds3.assign_coords(time=ds3.indexes["time"] + pd.Timedelta(60, "m"))
 ds = (.25*ds1 + .5*ds2 + .25*ds3)

So this means that for hour 10, we consider 09:00 (1/4), 09:30 (1/2) and 10:00 (1/4).
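To check which scans end up under which hourly label, here is a self-contained sketch of the same 1/4, 1/2, 1/4 weighting on made-up data. It uses a pandas Series instead of an xarray dataset, and the helper relabel is hypothetical. One assumed difference: pandas fills labels that lack one of the three scans with NaN, so they are dropped explicitly here.

```python
import pandas as pd

# Made-up half-hourly scans from 09:00 to 12:00 (a linear ramp).
times = pd.date_range("2013-06-01 09:00", periods=7, freq="30min")
ghi = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0], index=times)


def relabel(series, minutes):
    """Return a copy of the series with its time labels moved forward."""
    out = series.copy()
    out.index = out.index + pd.Timedelta(minutes=minutes)
    return out


s1 = ghi.iloc[0::2]               # scans at hh:00, label unchanged
s2 = relabel(ghi.iloc[1::2], 30)  # scans at hh:30, labelled with the next full hour
s3 = relabel(ghi.iloc[2::2], 60)  # scans at hh:00, labelled one hour later

# Arithmetic aligns on the time label; incomplete labels become NaN.
hourly = (0.25 * s1 + 0.5 * s2 + 0.25 * s3).dropna()
print(hourly)
```

With this ramp, the value labelled 11:00 combines the scans at 10:00, 10:30 and 11:00 and equals the 10:30 scan, so each label marks the end of the hour it describes.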

I can run an evaluation against different in-situ measurements if desired. I am not aware of a publication on this; it is more like "unpublished knowledge".

For the solar position, this approach could be done analogously.

matzech avatar Jul 11 '24 12:07 matzech

Thanks all for looking into this!

Some notes from the side lines:

  • We have the following comment on time convention in the documentation, which should represent our initial intention behind the implementation: https://atlite.readthedocs.io/en/latest/conventions.html#time-points

  • Regarding solar position and investigating this potential bug: Please keep in mind that we also support cutouts with SARAH and ERA5 data combined, so any changes or fixes made should aim to keep the datasets compatible, e.g. temperature from ERA5 and irradiation from SARAH. I haven't looked at it closely on whether or not the proposed changes would affect this, just noting it here to raise awareness.

  • And related: I thought we had fixed that issue years ago, no? cf. https://github.com/PyPSA/atlite/issues/158

euronion avatar Jul 11 '24 22:07 euronion