YAXArrays.jl
YAXArrays.jl copied to clipboard
Speed comparison of mapslices for YAXArray and Raster
This is a speed comparison between Rasters and YAXArrays. This uses the COSMO REA dataset. It seems that YAXArrays has some overhead , but then it doesn't matter much whether it is the european or the restriction to the germany data. I also tried to do this with a lazy Raster object, but then the computation didn't finish in many minutes. And when I do a subset of a lazy Raster the data got read at some point. It seems that Raster is better suited for smaller arrays and then for larger Arrays the overhead of YAXArrays doesn't matter that much and it is becoming faster. I would like to explore exactly where this tradeoff lays.
url = "http://esgf1.dkrz.de/thredds/fileServer/cosmo-rea/reanalysis/EUR-6km/DWD/ECMWF-ERAINT/REA6/r1i1p1f1/COSMO/v1/mon/atmos/tas/v20230314/tas_EUR-6km_ECMWF-ERAINT_REA6_r1i1p1f1_COSMO_v1_mon_199501-199512.nc"
filename = split(url, "/")[end]
mkpath("data/")
path = download(url, "data/$filename")
ds = open_dataset(path)
olonp = 180 + ds.rotated_latitude_longitude.properties["grid_north_pole_longitude"]
olatp = ds.rotated_latitude_longitude.properties["grid_north_pole_latitude"]
proj = "+proj=ob_tran +o_proj=latlon +o_lon_p=0 +o_lat_p=$olatp +lon_0=$olonp"
using GADM
using ArchGDAL:ArchGDAL as AG
deu = GADM.get("DEU")
projdeu = AG.reproject(only(deu.geom), ProjString(AG.toPROJ4(AG.getspatialref(deu.geom[1]))), ProjString(proj))
bbox = GI.extent(projdeu)
rger = r[bbox]
r = Raster(path, key="tas")
r = set(r, :rlat=>Y, :rlon=>X)
yax = ds.tas
yaxmem = readcubedata(yax)
yaxger = set(yax, :rlat=>Y, :rlon=> X)[bbox]
julia> @time mapslices(mean, yax, dims="Time");
1.987760 seconds (5.60 M allocations: 215.388 MiB, 71.15% gc time)
julia> @time mapslices(mean, yaxmem, dims="Time");
1.679276 seconds (5.59 M allocations: 182.658 MiB, 79.56% gc time)
julia> @time mapslices(mean, r, dims=Ti);
0.509211 seconds (4.04 M allocations: 64.265 MiB)
julia> @time mapslices(mean, yaxger, dims="Time");
1.231583 seconds (120.19 k allocations: 3.956 MiB, 99.25% gc time)
julia> @time mapslices(mean, rger, dims=Ti);
0.010213 seconds (73.89 k allocations: 1.185 MiB)