stars icon indicating copy to clipboard operation
stars copied to clipboard

identify points that occupy same array "cell" location

Open btupper opened this issue 3 years ago • 6 comments

I would like to filter a set of points scattered on a stars array such that no array location is duplicated. I don't think that the raster concept of "cell" isn't supported - at least that is what I gather from the raster-stars functionality map. But something like raster::cellFromXY() seems like what I am looking for. The closest thing I can think of is a function that creates a copy of the stars object, reassigns values to simple "cell" index values and then runs extract. I can then filter the original dataset for duplicate "cell" indices.

In the example below, I try to programmatically identify that points A and B occupy the same "cell". It works, but I feel like there might be a better way to do this. Any guidance on this?

# https://github.com/r-spatial/stars/wiki/How-%60raster%60-functions-map-to-%60stars%60-functions

suppressPackageStartupMessages({
  library(dplyr)
  library(sf)
  library(stars)
  })

# Retrieve the 2d "cell" index of a set of points into a stars object
# 
# @param x stars object
# @param p object with POINT geometry
# @return vector of 'cell' indices
cell_index <- function(x, p){
  x = x[1]
  names(x) <- "index"
  d = dim(x)
  mutate(x, index = seq_len(prod(d[1:2]))) |>
    st_extract(at = p) |>
    pull(index)
}


nr = 5
nc = 4
n = nc*nr
m = matrix(seq_len(n), nrow = nr, ncol = nc)
dim(m) = c(x = nr, y = nc)
s = st_as_stars(m)

x = tibble(id = LETTERS[1:4],
           x = c(1.2, 1.7, 4.3, 0.9),
           y = c(2.1, 2.2, 0.9, 3.8)) |>
  st_as_sf(coords = c("x", "y"))

plot(s, axes = TRUE, text_values = TRUE, reset = FALSE, col = sf.colors())
plot(x, add = TRUE, cex = 4, col = "black")
text(st_coordinates(x), labels = x$id)

mutate(x, index = cell_index(s,x), .after = 1)
#> Simple feature collection with 4 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 0.9 ymin: 0.9 xmax: 4.3 ymax: 3.8
#> CRS:           NA
#> # A tibble: 4 × 3
#>   id    index  geometry
#> * <chr> <int>   <POINT>
#> 1 A        12 (1.2 2.1)
#> 2 B        12 (1.7 2.2)
#> 3 C         5 (4.3 0.9)
#> 4 D        16 (0.9 3.8)

Created on 2022-08-26 by the reprex package (v2.0.1)

btupper avatar Aug 26 '22 13:08 btupper

You can try rewriting the cellFromXY() from terra or raster.

kadyb avatar Sep 08 '22 10:09 kadyb

There's an unexported function that could be used

> stars:::colrow_from_xy(st_coordinates(x), s)
  [,1] [,2]
1    2    3
2    2    3
3    5    1
4    1    4

and then combined, e.g. in

> cell_from_xy = function(pts, s) {
		cr = stars:::colrow_from_xy(st_coordinates(x), s)
		nc = dim(s)[1]
		(cr[,2] - 1) * nc + cr[,1]
}
> cell_from_xy(pts, s)
 1  2  3  4 
12 12  5 16 

This way, it also works for rectilinear and sheared/rotated grids. Shall I add this & call it st_cell_from_points, or implement a _from_xy that takes a matrix?

edzer avatar Sep 08 '22 14:09 edzer

Great reprex, btw @btupper !

edzer avatar Sep 08 '22 15:09 edzer

Oh, sweet! Your question about what to implement gets at an issue that we have been tripping over the last umpteen years. We help many folks (ecologists mostly) who start with a spreadsheet of observations that they want to marry with well known environmental covariates (satellite data, weather/climate models, ocean circulation models, etc). They spend a lot of effort trying to negotiate their locational data into various formats needed to extract data. We (helpers) keep thinking there must be a standard coin-of-the-realm that can be used for these extraction-type transactions. We have been pushing sf - "transform your observations into sf and then we'll help you get your covariates". That's what we tried to push at the recent Ocean Hack Week ala xyzt, ghrsst, obpg, hycom and others. Despite not getting much momentum at OHW we still think it is worthwhile. Well, that's a long winded way of saying I would vote for st_cells(stars, sf) exposed for the general user armed with points/lines/etc and _from_xy(stars, matrix) under the hood available for coders. But wishes and fishes and all that... I'd be grateful for either!

btupper avatar Sep 08 '22 15:09 btupper

gives us

library(stars)
# Loading required package: abind
# Loading required package: sf
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE
library(tibble)
nr = 5
nc = 4
n = nc*nr
m = matrix(seq_len(n), nrow = nr, ncol = nc)
dim(m) = c(x = nr, y = nc)
s = st_as_stars(m)

x = tibble(id = LETTERS[1:4],
           x = c(1.2, 1.7, 4.3, 0.9),
           y = c(2.1, 2.2, 0.9, 3.8)) |>
  st_as_sf(coords = c("x", "y"))

st_cells(s, x)
#  1  2  3  4 
# 12 12  5 16 

edzer avatar Sep 09 '22 14:09 edzer

Yay! Thank you!

btupper avatar Sep 09 '22 14:09 btupper