terra
terra copied to clipboard
spatSample won't create the number of sample requested for each stratum
Hi,
I'm trying to use spatSample to create a survey design that generate random points for each category of my raster:
pred_recl class : SpatRaster dimensions : 1686, 1872, 1 (nrow, ncol, nlyr) resolution : 0.0025, 0.0025 (x, y) extent : 143.8175, 148.4975, -43.7425, -39.5275 (xmin, xmax, ymin, ymax) coord. ref. : +proj=longlat +ellps=GRS80 +no_defs source : memory name : stratum min value : 1 max value : 3
set.seed(1) s <- spatSample(pred_recl, size = 20, method = "stratified", as.points=TRUE, xy=TRUE)
But when I run spatSample() I get the Warning messages: 1: [spatSample] fewer cells returned than requested 2: [sample] fewer samples than requested for strata: 2, 3
All categories are big enough to easily fit 20 points, so I don't understand why it doesn't create them.
To test if the issue was with my raster, I read my data with the raster package and tried the function sampleStratified (x = pred_recl, size = 20, xy = TRUE, sp = TRUE). This works just fine, so I thought I'd open an issue as it seems there is a problem with the spatSample function.
Thanks,
Fernanda.
I think I have a related issue. When I use spatSample(method = "stratified) with a raster with one stratum that only covers a small area I get an error. If I run the same with method = "random" and na.rm = TRUE there is no error but also no samples. Everything works as expected if the area is larger. There are 25 cells in the stratum and I am only looking for 2 samples so I would expect it to be possible.
I tried with the CRAN release and the development version from R universe.
library(terra)
rr <- rast(ncol=100, nrow=100, names="stratum")
rr[6:10, 6:10] <- 1
# returns an error
spatSample(rr, 2, "stratified")
# Error in `colnames<-`(`*tmp*`, value = c("cell", names(x))) :
# attempt to set 'colnames' on an object with less than two dimensions
# In addition: Warning message:
# [spatSample] fewer cells returned than requested
# No error but no samples either
spatSample(rr, size = 2,
method = "random", na.rm = TRUE)
# [1] stratum
# <0 rows> (or 0-length row.names)
# Warning message:
# [spatSample] fewer values returned than requested
# works fine if the patch is big enough
rr <- rast(ncol=100, nrow=100, names="stratum")
rr[6:30, 6:30] <- 1
spatSample(rr, 2, "stratified")
spatSample(rr, size = 2,
method = "random", na.rm = TRUE)
Running into this again and going to try and contribute. I have two ideas one is that if no values are sampled in a stratum try setting other values to NA and then trimming so that if the stratum is in a clump will be less likely to miss it. This would work for my use case but perhaps not many others? The other idea is use the notfound flag that is recorded to redo samples for those strata with a larger sample size.
Am I correct in understanding that the current approach is just to take a random sample that is 5 times larger than the desired sample size times the number of strata and then keep enough samples to match the sample size and warn if that was not reached?
Hello, Do you know if this issue has been looked into? I have a similar problem to the one described by @fernandabiologia but with the "random" sampling method.
The input raster I want to sample the locations from has 178,276 cells that are not NA and I want to randomly select 1% of them but the following command only returns 10 locations.
potential.ignition.points <- spatSample(input.raster, size = 180, method = "random", na.rm = TRUE, as.points = TRUE, as.df = FALSE, ext = ext(input.raster))
input.raster class : SpatRaster dimensions : 4867, 4349, 1 (nrow, ncol, nlyr) resolution : 100, 100 (x, y) extent : -1611382, -1176482, -3930115, -3443415 (xmin, xmax, ymin, ymax) coord. ref. : GDA94 / MGA zone 55 (EPSG:28355) source : memory name : layer min value : 1 max value : 1
Many thanks, Amelie
@AmelieJeanneau The pull request that I created might help with that. If you want to try it you could install from my fork here https://github.com/see24/terra. I just fetched the changes from terra so it should be up to date. I haven't heard anything on this issue or my pull request
I believe these issues have now been solved; or at least partly. Feel free to open a new issue if you run into this problem again, and please include a reproducible example if possible.
Hello @rhijmans,
I am experiencing this problem. I am trying to sample different numbers of pixels randomly, but each time it returns a smaller number than asked.
Unfortunately github won't accept a .tif file, but here is the line of code I used:
samp = spatSample(x, size = 4000, method = 'random', as.points=TRUE, na.rm = TRUE)
Additionally, I would like to generate a regular sample, with pixels chosen every 9 km. However, I only want pixels with values (ignoring NAs). When I set method = 'random', it returns mostly NAs (as there are many NAs in my file). Is it possible to sample regular by perhaps sampling the nearest pixel with a value to the NA pixel that would be chosen if strictly following a regular sampling design?
Thanks. Can you please add two new issues? One to report a possible bug, and please be as specific as possible, by including a reproducible example that recreates this problem. Or, at the very least, include the result of show(x) and global(x, "notNA").
The other one would be for a feature request (a new way to get a regular sample).
Thanks @rhijmans
I think the problem was simply a matter of not using the latest version of terra. I updated the package and now the issue appears to be resolved. I just added an issue for the regular sampling feature.
Thanks!
Dear rhijmans, I meet this problem again.
a1 = terra::spatSample(x1, 2000, as.df=T, replace=T, na.rm=T, xy=T)
#Warning message:
#[spatSample] fewer cells returned than requested
x1
#class : SpatRaster
#dimensions : 24093, 14980, 3 (nrow, ncol, nlyr)
#resolution : 30, 30 (x, y)
#extent : 110850, 560250, 3077450, 3800240 (xmin, xmax, ymin, ymax)
#coord. ref. : WGS 84 / UTM zone 48N (EPSG:32648)
#sources : spat_lvzsihE3Db1Mjos_4516.tif
# bufdem30mutm.tif
# aspectcos1.tif
#names : bamboo21, elev, aspect
#min values : TRUE, 218.7693, -1
#max values : TRUE, 7468.8027, 1
global(x1, "notNA")
# notNA
#bamboo21 2979387
#elev 146837026
#aspect 146677365
I set size=2000, but, I can only get 90 cells' value, I update terra to 1.6-43, but still get such result.
@PetiteTong I suspect that the reason for this is that only very few cells are not NA for all three layers. For "bamboo21" it is less than 1%.
2979387 / (24093 * 14980)
#[1] 0.008255136
Perhaps the number of cells where all three layers are not NA is 90? You can find out with
global(anyNA(x1), sum)
If there are more than 90 cells you could also try to increase argument exp
thanks, I tried,
global(anyNA(x1), sum)
# sum
#lyr1 357933753
maybe what we need is any notNA?
It is indirect, but by subtracting that number from the number of cells, we can see that where is data in bamboo2 there is data in the other layers:
(24093 * 14980) - 357933753
#[1] 2979387
They are just very hard to find when the values are so sparse. I can add a separate method for this (feel free to add a new issue so that I do not forget). For now, you could sample repeatedly and combine the samples, If you keep the cell numbers you can get a sample without replacement by running unique.
Ok, thanks a lot~!