stars icon indicating copy to clipboard operation
stars copied to clipboard

Pixels in the border get weird values with `st_rasterize()`

Open jospueyo opened this issue 1 year ago • 4 comments

I have a grid of 1e6 observations covering a region. At the end of the pipeline, I convert this observations to a stars object, but I get weird values in the outside pixels. I use this workflow (I'm sorry I cannot produce a reprex, if you want, I can't share the grid as a csv file):

rasterize_points <- function(grid_xy, df, field, area_sf, resol = 180){

  template <- st_read(area_file, quiet = TRUE) |>
    st_rasterize(dx = resol, dy = resol)

    df |>
        select(all_of(field)) |>
        bind_cols(punts_obs[,c("X", "Y"),]) |>
        st_as_sf(coords = c("X", "Y")) |>
        st_rasterize(template)
}

 summary(df$field)
 #>  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 #>  0.00    0.00   0.00   19.57   29.42  100.00 

rasterize_points(grid_xy, df, field, area_sf)

#> stars object with 2 dimensions and 1 attribute
#> attribute(s), summary of first 1e+05 cells:
#>           Min.  1st Qu. Median    Mean 3rd Qu.     Max.  NA's
#> m07  40.54316 98.50634    100 300.895     100 12165.57 92209
#> dimension(s):
#>   from   to  offset delta                refsys point x/y
#> x    1 1485  260160   180 ETRS89 / UTM zone 31N FALSE [x]
#> y    1 1441 4747976  -180 ETRS89 / UTM zone 31N FALSE [y]

While the values in df are between 0 and 100, there values in the raster of 12165. Can this be solved passing some option to GDALrasterize? I couldn't find the documentation for this function anywhere.

Thank you! And thanks for this amazing package!

jospueyo avatar Feb 22 '24 16:02 jospueyo

I need some kind of reprex to be able to look at this; if you want you can share data or a link through email.

edzer avatar Feb 22 '24 18:02 edzer

Preparing the reprex I found that my error was that I was using a little different area to create the grid of points that the one I was using to create the template. However, despite my specific problem being solved, I still think that this a non-expected behaviour. I uploaded the grid of points and the study area to github so it can be used for the reprex:

library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
library(stars)
#> Loading required package: abind
library(tidytable)
#> 
#> Attaching package: 'tidytable'
#> The following object is masked from 'package:stars':
#> 
#>     %in%
#> The following objects are masked from 'package:stats':
#> 
#>     dt, filter, lag
#> The following object is masked from 'package:base':
#> 
#>     %in%

resol <- 180

grid_points_rds <- tempfile()
download.file("https://github.com/jospueyo/root/raw/master/grid_points.rds", grid_points_rds)
grid_points <- readRDS(grid_points_rds)

area_rds <- tempfile()
download.file("https://github.com/jospueyo/root/raw/master/study_area_sf.rds", area_rds)
study_area <- readRDS(area_rds)

# Generate a random variable for the observations
observation <- tidytable(value = sample(0:100, size = nrow(grid_points), replace = TRUE))

# Create a template for st_rasterize
template <- study_area |>
  st_rasterize(dx = resol, dy = resol)

# Create the raster with the value in grid observations
raster <- observation |>
  bind_cols(grid_points[,c("X", "Y"),]) |>
  st_as_sf(coords = c("X", "Y")) |>
  st_rasterize(template)

plot(raster)
#> downsample set to 1

Created on 2024-02-23 with reprex v2.1.0

As you can see, the values from observation were between 0 and 100, but the raster has higher values up to 12,000. They are located in the margins. I suppose the expected behaviour would be this pixels in the border being NA.

jospueyo avatar Feb 23 '24 17:02 jospueyo

This is because template has values over 12000; if you use tm <- template * 0 as template, you get x

edzer avatar Feb 23 '24 22:02 edzer

Ok, now I have a clearer understanding. Actually, I realized that I don't event need a template. I can pass only the sf object to st_rasterize and it guesses the resolution, or I can pass the resolution which, in my computer, is 2'' faster, because, I suppose, it doesn't have to guess it.

However, I think this is not clear in the documentation, it says template = stars object with desired target geometry, or target geometry alignment if align=TRUE. Maybe a mention that if pixels from template are not covered by sf, values from template will be used.

Thank you!

jospueyo avatar Feb 26 '24 12:02 jospueyo

Good idea

...
# Create the raster with the value in grid observations
raster <- observation |>
  bind_cols(grid_points[,c("X", "Y"),]) |>
  st_as_sf(coords = c("X", "Y")) |>
  st_rasterize(template, align = TRUE)

now does the job (better than before).

edzer avatar Feb 26 '24 19:02 edzer