Rasters.jl cellsize returns erroneous results for grids with fine spacing [no reprojeciton]

This makes sense as gridsize is increasing linearly with latitude

using Rasters
using Rasters.Lookups
import GeoFormatTypes as GFT
using Plots

dX = 0.1
dY = -0.1
lon = X(Projected(166.:dX:168.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY ), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

if we decrease the step size in X we start to see some jutter

dX = 0.0006
dY = -0.1
lon = X(Projected(166.0:dX:168.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

and things really fall apart when we decrease step size in X and Y

dX = 0.0006
dY = -0.0003
lon = X(Projected(166.0:dX:167.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-79.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

Sep 25 '24 20:09 alex-s-gardner

@alex-s-gardner can you confirm this is on v0.11.8 of Rasters.jl?

Sep 26 '24 21:09 asinghvi17

Yes, it is v0.11.8

Sep 26 '24 21:09 alex-s-gardner

~I wonder if we need to densify and subsample points on the grid? I would expect that to have an effect as we increase dx and dy though, not as we decrease...~

~https://github.com/rafaqz/Rasters.jl/blob/8d13c1887f90444d24e3c19ecbff42c8d5f0de68/ext/RastersArchGDALExt/cellsize.jl#L71-L77~

nevermind, I just saw that the crs there is in long-lat. It might be something with the circular area calculation then?

Sep 26 '24 22:09 asinghvi17

Seems like a precision issue to me... is Float64 geometry being downgraded anywhere?

Sep 26 '24 23:09 alex-s-gardner

No, but it is being converted to radians and then multiplied by pi

Sep 26 '24 23:09 asinghvi17

Any ideas @tiemvanderdeure ?

Sep 27 '24 06:09 rafaqz

Thanks for pointing this out. I don't know exactly but it seems like something is being rounded somewhere?

Anyway, it's been on my list for a little while to make a specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells. That should be much faster and avoids this type of error.

The current implementation just treats each cell as a linearring, which makes it super generic but slow and (apparently) introduces some precision issues.

Sep 27 '24 07:09 tiemvanderdeure

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

Sep 27 '24 21:09 asinghvi17

Yes we should be able to do the conversion for any grid aligned crs

Sep 27 '24 22:09 rafaqz

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

It depends on this line, which (I think) should return true for any lat/long crs: https://github.com/rafaqz/Rasters.jl/blob/8d13c1887f90444d24e3c19ecbff42c8d5f0de68/ext/RastersArchGDALExt/cellsize.jl#L57

But there may be a better way to test this?

Sep 28 '24 07:09 tiemvanderdeure

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X. But then the user has specified the conversion explicitly and we are just checking it's not really wrong, so its not the same situation.

Sep 28 '24 08:09 rafaqz

Okay, I've never used mappedcrs and don't think I've ever seen it in the wild, so just trying to wrap my head around this.

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X.

Are there ever cases where that holds for two projections with a different CoordSys?

I know that there are some projections which use lat/lon but are shifted/rotated compared to EPSG:4326, but I think all of those have Earth Projection 1. My thinking was that, since we're assuming the world is spherical anyway, we don't need to do any reprojecting in those cases.

Sep 28 '24 08:09 tiemvanderdeure

I don't really know CoordSys well. But I imagine yes? Anything grid aligned can be converted to another with independent transformations of X and Y. So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys. In lat/lon X will have regular steps, Y will be irregular. Rasters.reproject should try to do this for you.

But it means you can use the same math you are using for lat/lon after the transformation of the lookups.

Sep 28 '24 09:09 rafaqz

I don't really know CoordSys well.

I don't either and I can't really find much about it, so if you know a better check I'd be happy.

So basically we can just check if the mappedcrs is lon-lat and if it reproject the lookups and calculate the cellsize for those. That makes sense to me.

Sep 28 '24 09:09 tiemvanderdeure

Yeah. There is probably a more correct way... We need Julia crs parsers that give you more useful information more easily.

Sep 28 '24 09:09 rafaqz

So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys.

Do you have an example of a raster like this, so I can add a test? I can't find any in the docs.

Sep 28 '24 09:09 tiemvanderdeure

You can resample to it?

Sep 28 '24 10:09 rafaqz

I just came across https://github.com/google/s2geometry/issues/190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

Oct 03 '24 00:10 asinghvi17

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

hello

Oct 03 '24 01:10 asinghvi17

I just came across google/s2geometry#190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

But even in that thread they mention their fallback is Girard's theorem, which is exactly what we use.

In this blogpost: https://www.johndcook.com/blog/2021/11/29/area-of-spherical-triangle/, Erikson's formula was basically equal to the estimate using just planar geometry for the area between some cities in Texas: "The triangle is so flat that numerical error has a bigger effect than the curvature of the earth." And here we're talking about the error that becomes relevant when gridcells are well below 1 km^2.

We actually already test for an equal-area projection with grid spacing of 100x100m (the test is if all grid cells are within 1% of 0.1 km^2).

My vote is for assuming flat space whenever grid cells get much smaller than that and just using the haversine formula.

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

Could you copy the code?

Oct 03 '24 09:10 tiemvanderdeure

Okay thinking about this some more, I'm not sure if the Haversine formula is numerically stable at this grid spacing either. It also involves a bunch of square roots and sin and cos

Oct 03 '24 09:10 tiemvanderdeure

Did some more testing and I can report back that

yes, this is a numeric stability issue
yes, Eriksson's formula performs better in this particular example

I did this (_linearring_area is the current implementation):

xb = (166.0, 166.0006)
yb = (-78.0, -78.0003)

ring = [
    (xb[1], yb[1]), 
    (xb[2], yb[1]), 
    (xb[2], yb[2]), 
    (xb[1], yb[2]),
    (xb[1], yb[1])
]

ring_bf = map(p -> BigFloat.(p), ring)

eriksson(GI.LinearRing(ring))
_linearring_area(GI.LinearRing(ring); radius = 1)

eriksson(GI.LinearRing(ring_bf))
_linearring_area(GI.LinearRing(ring_bf); radius = 1)

which returns

julia> eriksson(GI.LinearRing(ring))
1.1399895306967292e-11

julia> _linearring_area(GI.LinearRing(ring); radius = 1)
2.2859936166241823e-11

julia> eriksson(GI.LinearRing(ring_bf))
1.139989369278327047449715369542151261981068071545916771493908664977035933833957e-11

julia> _linearring_area(GI.LinearRing(ring_bf); radius = 1)
1.140013862214309994513259890860796762002232266534835233057189872420615878310853e-11

So: when we use big floats both methods return the same answer, and this answer corresponds to what Eriksson's formula returns using Float64.

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

function eriksson(ring)
    points = GI.getpoint(ring)
    cartesian_points = lonlat_to_cartesian.(points)
    area = 0.0
    area += eriksson_triangle(cartesian_points[1], cartesian_points[2], cartesian_points[3])
    area += eriksson_triangle(cartesian_points[3], cartesian_points[4], cartesian_points[1])
    return area
end

function eriksson_triangle(a, b, c)
    t = abs(dot(a, cross(b, c)))
    t /= 1 + dot(b,c) + dot(c, a) + dot(a, b)

    # alternative that could be more precise but allocates more
    # t = abs(dot(a, (cross(b .- a, c .- a))) / dot(b .+ a, c .+ a))

    2*atan(t)
end

lonlat_to_cartesian(args) = lonlat_to_cartesian(args...)
function lonlat_to_cartesian(lon, lat)
    x = cosd(lat) * cosd(lon)
    y = cosd(lat) * sind(lon)
    z = sind(lat)
    return [x, y, z]
end

Oct 03 '24 12:10 tiemvanderdeure

Just some observations from testing in this way:

calculation with Girard's theorem starts misbehaving when the height is lower than 0.001
it is more unstable for high/low latitudes

So the options are:

always use eriksson
implement both and let users decide
implement both and automatically switch when the area is less than let's say 1e-6 square degrees.

Oct 03 '24 12:10 tiemvanderdeure

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

I usually use static vectors or tuples to get around that, you could probably try that, but StaticArrays is a pretty heavy dependency, and tuples don't have all the nice LinearAlgebra dispatches you'd need.

Although you could define a SphericalPoint type here that encodes those specific dispatches in a way where size is known at compile-time...that sounds like a lot of work :D

Oct 03 '24 14:10 asinghvi17

Yes or we can define a _dot and _cross that work on tuples of size 3 :)

Oct 03 '24 14:10 tiemvanderdeure

Here's a super basic implementation that should get you what you need:

using LinearAlgebra

struct SphericalPoint{T <: Real}
	data::NTuple{T, 3}
end

SphericalPoint(x, y, z) = SphericalPoint((x, y, z))

# define the 4 basic mathematical operators elementwise on the data tuple
Base.:+(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .+ q.data)
Base.:-(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .- q.data)
Base.:*(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .* q.data)
Base.:/(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data ./ q.data)
# Define sum on a SphericalPoint to sum across its data
Base.sum(p::SphericalPoint) = sum(p.data)

# define dot and cross products
LinearAlgebra.dot(p::SphericalPoint, q::SphericalPoint) = sum(p * q)
function LinearAlgebra.cross(a::SphericalPoint, b::SphericalPoint)
	a1, a2, a3 = a
    b1, b2, b3 = b
	SphericalPoint((a2*b3-a3*b2, a3*b1-a1*b3, a1*b2-a2*b1))
end

Oct 03 '24 14:10 asinghvi17

okay let me benchmark this!

EDIT: not bad!

julia> @btime eriksson($ring)
  1.045 μs (1 allocation: 176 bytes)
1.1399895306958715e-11

julia> @btime _linearring_area($ring; radius = 1)
  2.050 μs (0 allocations: 0 bytes)
2.2859936166241823e-11

EDIT2: even better after I got rid of the final allocation

julia> @btime eriksson($ring)
  678.571 ns (0 allocations: 0 bytes)
1.1399895306958715e-11

Oct 03 '24 14:10 tiemvanderdeure

Hmm, curious why there's still an allocation though

Oct 03 '24 14:10 asinghvi17

Hmm, curious why there's still an allocation though

because of this line, which isn't so hard to get rid of.

    cartesian_points = lonlat_to_cartesian.(points)

Oct 03 '24 15:10 tiemvanderdeure

Also just reading back in https://github.com/google/s2geometry/issues/190, and the reason they cite for needing to keep Girard's Theorem around is for points that are antipodal, which I don't think is ever going to happen with a Raster, so I think we can safely go for the option to just always use Eriksson's formula

Oct 03 '24 15:10 tiemvanderdeure

Rasters.jl Rasters.jl copied to clipboard

cellsize returns erroneous results for grids with fine spacing [no reprojeciton]

Rasters.jl
Rasters.jl copied to clipboard