atlite
Time misalignment between ERA5 and SARAH?
Version Checks (indicate both or one)
- [ ] I have confirmed this bug exists on the latest release of Atlite.
- [X] I have confirmed this bug exists on the current master branch of Atlite.
Issue Description
Hi,
I think there may be a time misalignment in the current implementation when working with instantaneous (satellite) data. As correctly written and handled (e.g. here: https://github.com/PyPSA/atlite/blob/master/atlite/datasets/era5.py#L173-L175), ERA5 labels each accumulated value with the end of its accumulation hour, meaning 11:00 refers to 10:00-11:00.
Now, in the SARAH implementation you take the mean of the arrays at 11:00 and 11:30 and assign the time index of the first array (11:00): https://github.com/PyPSA/atlite/blob/master/atlite/datasets/sarah.py#L153-L156, which leads to the 1-hour time misalignment. See e.g. the spatially averaged values (GHI) for a June day:
So if I did not overlook anything and this bug is real, the only change required would be:
ds = ds.assign_coords(time=ds.indexes["time"] + pd.Timedelta(60, "m"))
after merging the data with the solar position (https://github.com/PyPSA/atlite/blob/master/atlite/datasets/sarah.py#L237)
I could fix this in the sarah3 compatibility pull request (https://github.com/PyPSA/atlite/pull/352) if required.
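To make the two labelling conventions concrete, here is a minimal sketch on synthetic data, not the atlite code itself. It assumes a half-hourly series whose value is simply the time of day in hours (the variable name `ghi` and the date are made up for illustration), pairs each full-hour scan with the following half-hour scan as described above, and then applies the proposed 60-minute shift:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic half-hourly "instantaneous" data (assumption): value = time of day in hours.
time = pd.date_range("2013-06-01 00:00", "2013-06-01 23:30", freq="30min")
values = (time.hour + time.minute / 60.0).to_numpy()
ds = xr.Dataset({"ghi": ("time", values)}, coords={"time": time})

# Mean of the hh:00 and hh:30 scans, labelled with the first scan (hh:00).
ds1 = ds.isel(time=slice(None, None, 2))
ds2 = ds.isel(time=slice(1, None, 2))
ds2 = ds2.assign_coords(time=ds2.indexes["time"] - pd.Timedelta(minutes=30))
start_labelled = (ds1 + ds2) / 2

# Proposed change: move every label to the end of the hour (ERA5-style).
end_labelled = start_labelled.assign_coords(
    time=start_labelled.indexes["time"] + pd.Timedelta(minutes=60)
)

t11 = pd.Timestamp("2013-06-01 11:00")
t12 = pd.Timestamp("2013-06-01 12:00")
print(float(start_labelled["ghi"].sel(time=t11)))  # 11.25, scans from 11:00 and 11:30
print(float(end_labelled["ghi"].sel(time=t12)))    # 11.25, now labelled 12:00
```

With start labelling, the value stored at 11:00 describes the hour after 11:00, whereas an ERA5 value at 11:00 describes the hour before it. After the shift, the same value sits at 12:00, which under the ERA5 convention covers 11:00-12:00.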
Reproducible Example
No response
Expected Behavior
No response
Installed Versions
Replace this line.
Mmmh, I am afraid you are right. The point is that the hourly mean function in the sarah module is "wrong". As far as I can see, it should be + instead of - in https://github.com/PyPSA/atlite/blob/1b3a3c0908538a178997a4991f9c4c062f8612fe/atlite/datasets/sarah.py#L155, right?
Yes, indeed. Although I think that changing this line to '+' would mess up the calculation of the solar position, right? https://github.com/PyPSA/atlite/blob/1b3a3c0908538a178997a4991f9c4c062f8612fe/atlite/datasets/sarah.py#L233
So either
- change the reference time when calculating the hourly mean and change the timeshift attribute in the solar position calculation, or
- change the code to the ERA5 convention at the end
I am not so sure about that. So the convention should be that an indexing hour (assuming hourly resolution) represents the completed hour. So, a value at 11:00 am represents the mean from 10:00 am to 11:00 am. This is how it is handled by era5 and how it was intended by the sarah module (however there is this bug).
Could you explain to what extent the solar position is misaligned? Perhaps the cleanest way is to also take the average between 10:30 and 11:00 for the solar position in this example.
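The effect of the sign discussed above can be sketched with plain pandas. This is an illustration on synthetic data (value = time of day in hours), and `hourly_mean` is a hypothetical helper written for this example, not the function in the sarah module:

```python
import pandas as pd

# Synthetic half-hourly scans (assumption): value = time of day in hours.
time = pd.date_range("2013-06-01 00:00", "2013-06-01 23:30", freq="30min")
scans = pd.Series((time.hour + time.minute / 60.0).to_numpy(), index=time)

full = scans.iloc[::2]   # scans at hh:00
half = scans.iloc[1::2]  # scans at hh:30


def hourly_mean(shift_minutes):
    """Average each hh:00 scan with the hh:30 scan moved by shift_minutes."""
    shifted = half.copy()
    shifted.index = shifted.index + pd.Timedelta(minutes=shift_minutes)
    # pandas aligns on the index; labels lacking one of the two scans become NaN.
    return ((full + shifted) / 2).dropna()


t = pd.Timestamp("2013-06-01 11:00")
print(hourly_mean(-30)[t])  # 11.25: pairs 11:00 with 11:30 (hour after the label)
print(hourly_mean(+30)[t])  # 10.75: pairs 11:00 with 10:30 (hour before the label)
```

Only the `+` variant uses scans from inside 10:00-11:00, which is what a value labelled 11:00 should represent under the convention described above.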
Hi both,
Just wanted to comment on this since @martavp and I looked into this issue for an analysis I'm currently doing that involves modeling east-facing and west-facing solar panels. To start, here are two links from PVGIS that I found helpful for explaining the issue and the problems it could cause:
PVGIS documentation note 9.3
PVGIS 5.2 release notes
After reading them, I tried shifting the original cutout (time shift of '-1 days +23:30:00', i.e. 30 minutes earlier) to create a new cutout. I tested this new cutout, and its azimuth and altitude fit better with the ERA5 cutout.
So for me, solar position was misaligned by 30 minutes and what SARAH showed at 8:30 is what ERA5 showed at 8:00.
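For readers puzzled by the shift string: '-1 days +23:30:00' is simply how pandas displays a timedelta of minus 30 minutes. A small sketch with made-up timestamps, not the actual cutout code:

```python
import pandas as pd

# A negative half hour, printed in pandas' normalised form.
shift = pd.Timedelta(minutes=-30)
print(shift)  # -1 days +23:30:00

# Applying it to half-hourly timestamps moves 08:30 onto 08:00.
time = pd.date_range("2013-06-01 08:00", periods=4, freq="30min")
shifted = time + shift
print(time[1], "->", shifted[1])
```

So a value that was stored at 8:30 ends up at 8:00 after the shift, matching the 30-minute offset described above.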
Thanks for the comments.
Oh, yes, @FabianHofmann. You are right about the solar position.
Speaking of the scan weighting, I think the weighting can also be improved. Taking only 2 values assumes that these two scans approximate the hour reasonably. This means that the average of 10:30 and 11:00 is considered a good estimate of 11:00, but you could argue it only describes the evolution of the half-hour from 10:30 to 11:00.
I think the more accurate way would be to reconsider the weighting of the instantaneous values and adapt them, as for instance described in the dissertation by Annette Hammer (sorry, in German: https://oops.uni-oldenburg.de/317/1/347.pdf, p. 82). Note that satellite scan times are also considered there, which could be ignored for simplicity.
This (and the time alignment error) could then be solved by
ds1 = ds.isel(time=slice(None, None, 2))
ds2 = ds.isel(time=slice(1, None, 2))
ds3 = ds.isel(time=slice(2, None, 2))
ds2 = ds2.assign_coords(time=ds2.indexes["time"] + pd.Timedelta(30, "m"))
ds3 = ds3.assign_coords(time=ds3.indexes["time"] + pd.Timedelta(60, "m"))
ds = (.25*ds1 + .5*ds2 + .25*ds3)
So this means that for hour 10, we consider 09:00 (1/4), 09:30 (1/2) and 10:00 (1/4).
I can run an evaluation against different in-situ measurements if desired. I am not aware of a publication on this; it is more like "unpublished knowledge".
For the solar position, this approach could be done analogously.
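The 1/4, 1/2, 1/4 weights are what the trapezoidal rule gives for the mean over one hour sampled at its start, middle and end. A minimal check of the snippet above on synthetic data (value = time of day in hours; the variable name `ghi` and the date are assumptions for illustration):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic half-hourly scans (assumption): a linear ramp, value = time of day in hours.
time = pd.date_range("2013-06-01 00:00", "2013-06-01 23:30", freq="30min")
values = (time.hour + time.minute / 60.0).to_numpy()
ds = xr.Dataset({"ghi": ("time", values)}, coords={"time": time})

ds1 = ds.isel(time=slice(None, None, 2))  # scan at the end of the hour, label kept
ds2 = ds.isel(time=slice(1, None, 2))     # scan in the middle of the hour
ds3 = ds.isel(time=slice(2, None, 2))     # scan at the start of the hour
ds2 = ds2.assign_coords(time=ds2.indexes["time"] + pd.Timedelta(minutes=30))
ds3 = ds3.assign_coords(time=ds3.indexes["time"] + pd.Timedelta(minutes=60))

# xarray aligns on time with an inner join, so only labels present in all
# three datasets survive.
hourly = 0.25 * ds1 + 0.5 * ds2 + 0.25 * ds3

t10 = pd.Timestamp("2013-06-01 10:00")
print(float(hourly["ghi"].sel(time=t10)))  # 9.5 = 0.25*10 + 0.5*9.5 + 0.25*9
```

For this linear ramp the result at 10:00 equals the exact mean over 09:00-10:00, so the value and its label agree with the ERA5 convention. Note that the first labels of the day drop out because not all three scans are available for them.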
Thanks all for looking into this!
Some notes from the sidelines:
- We have the following comment on time convention in the documentation, which should represent our initial intention behind the implementation: https://atlite.readthedocs.io/en/latest/conventions.html#time-points
- Regarding solar position and investigating this potential bug: Please keep in mind that we also support cutouts with SARAH and ERA5 data combined, so any changes or fixes made should aim to keep the datasets compatible, e.g. temperature from ERA5 and irradiation from SARAH. I haven't looked closely at whether or not the proposed changes would affect this, just noting it here to raise awareness.
- And related: I thought we had fixed that issue years ago, no? cf. https://github.com/PyPSA/atlite/issues/158