tmap icon indicating copy to clipboard operation
tmap copied to clipboard

Defining "by" parameter in tm_facets() for stars objects

Open loreabad6 opened this issue 5 years ago • 5 comments

Hello and thank you for the package!

I want to create a facetted map where the faceting variable is a dimension other than "band". Currently tm_facets() throws an error when given the "by" argument for raster objects:

library(tmap)
library(stars)

tif = system.file("tif/L7_ETMs.tif", package = "stars") %>% read_stars()

tm_shape(tif) +
  tm_raster() +
  tm_facets()

tm_shape(tif) +
  tm_raster() + 
  tm_facets(by = 'band')
#> Error in mapply(function(gpl, indices, l) {: zero-length inputs cannot be mixed with those of non-zero length

I am particularly interested in this since I would like to create rgb facets by a dimension other than "band", and even I think possibly with the "along" argument for a tmap animation. I noticed that combining tm_rgb and tm_facets currently ignores the facetting code.

tm_shape(tif) +
  tm_rgb(3,2,1) +
  tm_facets(by = 'random_dim') #made up dimension

I would be interested in doing something like the following:

(tif2 = c(tif,tif, along = 'testdim'))
#> stars object with 4 dimensions and 1 attribute
#> attribute(s), summary of first 1e+05 cells:
#>   L7_ETMs.tif    
#>  Min.   : 47.00  
#>  1st Qu.: 65.00  
#>  Median : 76.00  
#>  Mean   : 77.34  
#>  3rd Qu.: 87.00  
#>  Max.   :255.00  
#> dimension(s):
#>         from  to  offset delta                       refsys point values    
#> x          1 349  288776  28.5 UTM Zone 25, Southern Hem... FALSE   NULL [x]
#> y          1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE   NULL [y]
#> band       1   6      NA    NA                           NA    NA   NULL    
#> testdim    1   2      NA    NA                           NA    NA   NULL

tm_shape(tif2) +
  tm_rgb(3,2,1) +
  tm_facets(by = 'testdim')

Or even for showing a single "band" but facetted by other dimensions, which now gives a single panel:

tm_shape(tif2[,,,1]) +
  tm_raster() +
  tm_facets(by = 'testdim')

Is there a way already to do these, might I have missed something? And if not, are these features planned to be implemented, or will they be in the scope of the package?

loreabad6 avatar Sep 10 '20 13:09 loreabad6

Very interesting and important one.

The current implementation ignores the by argument, but in case there is a third dimension (the others being the raster dimensions, usually x and y), this third dimension is treated as by.

This should be improved, but it's not easy at all. Under the hood, a data frame is created where the rows are cell-ids and the columns are values of the third star-dimension. This data is processed as other spatial (sf) data.

Now we need to add 3 other data-dimensions, because in theory, it should be possible to do the following in tmap:

tm_shape(s) +
  tm_rgb(3,2,1) +
  tm_facets(by = c('facet_row', 'facet_column'), along = "page")

So in this example, s has 6 (!) star-dimensions: x, y, band (the color bands), facet_row, facet_column (to create facets) and page (to create pages needed for tmap_animation).

mtennekes avatar Sep 10 '20 16:09 mtennekes

@mtennekes I do not fully grasp the underline implementation of tm_facets() in tmap, however, my thinking about facets are as follows:

  1. It would be great to allow users to create 1D and 2D facet grids. For example, when the user provides just one argument in by - the result stays the same as in the current version of tmap. On the other hand, when the user provides two arguments for by, then the result is shown in rows (the first argument) and columns (the second argument).
  2. It would be also great if users could either select any dimension (except x and y) in by. For example tm_facets(by = c("band", "time")).
  3. There would be also a need to find a way to specify that we want to present attributes (not dimensions). For example, tm_facets(by = c("time", "ATTRIBUTES")).

Nowosad avatar Sep 13 '20 09:09 Nowosad

Yes, 1 and 2 were indeed as I planned to do. Nice that you think the same way! Good point 3: we can probably treat attributes as if they are values for another dimension.

mtennekes avatar Sep 13 '20 18:09 mtennekes

@loreabad6 (or @Nowosad or @edzer ) Could you give me (preferably realistic) examples of stars objects that have 4 or more dimensions and/or objects with 3 or more dimensions and multiple attributes?

I plan to submit tmap 3.2 to CRAN today, and implement this feature later. The reason for this, is that next Saturday, there will be an useR2020 tutorial by Edzer and me called "Analyzing and visualising spatial and spatiotemporal data cubes". With 3.2., stars objects with 3 dimensions or 2 dimensions and multiple attributes can still be visualized.

About the implementation itself: I've checked how it currently works for sf objects:

data(World)
tm_shape(World) +
	tm_fill("life_exp") +
	tm_facets(by = c("economy", "income_grp"), free.coords = F, along = "continent")

In this example, there are three group variables: "economy", "income_grp", and "continent". Temporarily, a new factor variable called GROUP_BY is created that is a cross-term of those three variables. In this way, the data is processed in the same way as with only one group variable.

For stars object, we could do the same, although we have to be aware of the memory use / computation time that it would take. On the other hand, very large stars objects are downsampled anyway.

mtennekes avatar Sep 15 '20 07:09 mtennekes

I was checking on some of the stars system files and think this one can work for testing, 3 dimensions and 2 variables:

library(stars)
(t = read_ncdf(system.file('nc/bcsd_obs_1999.nc', package = 'stars')))
#> stars object with 3 dimensions and 2 attributes
#> attribute(s):
#>    pr [mm/m]         tas [C]      
#>  Min.   :  0.59   Min.   :-0.421  
#>  1st Qu.: 56.14   1st Qu.: 8.899  
#>  Median : 81.88   Median :15.658  
#>  Mean   :101.26   Mean   :15.489  
#>  3rd Qu.:121.07   3rd Qu.:21.780  
#>  Max.   :848.55   Max.   :29.386  
#>  NA's   :7116     NA's   :7116    
#> dimension(s):
#>           from to offset delta  refsys point                    values    
#> longitude    1 81    -85 0.125  WGS 84    NA                      NULL [x]
#> latitude     1 33     33 0.125  WGS 84    NA                      NULL [y]
#> time         1 12     NA    NA POSIXct    NA 1999-01-31,...,1999-12-31

We could also play with the attributes and dimensions to get further test cases:

(t1 = st_set_dimensions(merge(t), names = c('longitude','latitude','time','attributes')))
#> stars object with 4 dimensions and 1 attribute
#> attribute(s):
#>        X          
#>  Min.   : -0.421  
#>  1st Qu.: 15.627  
#>  Median : 27.599  
#>  Mean   : 58.377  
#>  3rd Qu.: 81.880  
#>  Max.   :848.550  
#>  NA's   :14232    
#> dimension(s):
#>            from to offset delta  refsys point                    values    
#> longitude     1 81    -85 0.125  WGS 84    NA                      NULL [x]
#> latitude      1 33     33 0.125  WGS 84    NA                      NULL [y]
#> time          1 12     NA    NA POSIXct    NA 1999-01-31,...,1999-12-31    
#> attributes    1  2     NA    NA      NA    NA                  pr , tas

(t2 = split(t1, 'time'))
#> stars object with 3 dimensions and 12 attributes
#> attribute(s):
#>   1999-01-31        1999-02-28         1999-03-31         1999-04-30      
#>  Min.   : -0.421   Min.   : -0.2148   Min.   :  0.4626   Min.   :  9.068  
#>  1st Qu.:  7.510   1st Qu.:  7.7029   1st Qu.:  8.8546   1st Qu.: 16.657  
#>  Median : 24.409   Median : 17.3044   Median : 12.7710   Median : 20.194  
#>  Mean   : 81.071   Mean   : 38.0218   Mean   : 46.5753   Mean   : 53.547  
#>  3rd Qu.:151.553   3rd Qu.: 61.1050   3rd Qu.: 83.1900   3rd Qu.: 90.315  
#>  Max.   :332.820   Max.   :220.3100   Max.   :229.5600   Max.   :238.160  
#>  NA's   :1186      NA's   :1186       NA's   :1186       NA's   :1186     
#>   1999-05-31       1999-06-30       1999-07-31       1999-08-31     
#>  Min.   :  0.59   Min.   : 11.32   Min.   : 14.46   Min.   :  8.49  
#>  1st Qu.: 19.18   1st Qu.: 23.25   1st Qu.: 26.40   1st Qu.: 26.25  
#>  Median : 21.18   Median : 25.03   Median : 28.35   Median : 28.53  
#>  Mean   : 44.24   Mean   : 67.39   Mean   : 67.78   Mean   : 56.21  
#>  3rd Qu.: 62.30   3rd Qu.:105.82   3rd Qu.:105.45   3rd Qu.: 77.97  
#>  Max.   :247.74   Max.   :401.33   Max.   :300.46   Max.   :388.91  
#>  NA's   :1186     NA's   :1186     NA's   :1186     NA's   :1186    
#>   1999-09-30       1999-10-31        1999-11-30        1999-12-31       
#>  Min.   : 12.87   Min.   :  7.635   Min.   :  5.113   Min.   : -0.4147  
#>  1st Qu.: 21.09   1st Qu.: 15.449   1st Qu.: 12.802   1st Qu.:  6.7550  
#>  Median : 23.23   Median : 21.266   Median : 16.148   Median : 11.4548  
#>  Mean   :119.60   Mean   : 60.357   Mean   : 36.712   Mean   : 29.0312  
#>  3rd Qu.:129.58   3rd Qu.: 86.338   3rd Qu.: 53.300   3rd Qu.: 50.1725  
#>  Max.   :848.55   Max.   :286.590   Max.   :295.510   Max.   :149.8700  
#>  NA's   :1186     NA's   :1186      NA's   :1186      NA's   :1186      
#> dimension(s):
#>            from to offset delta refsys point   values    
#> longitude     1 81    -85 0.125 WGS 84    NA     NULL [x]
#> latitude      1 33     33 0.125 WGS 84    NA     NULL [y]
#> attributes    1  2     NA    NA     NA    NA pr , tas

loreabad6 avatar Sep 16 '20 10:09 loreabad6

The discussion can be continued at https://github.com/r-tmap/tmap/issues/597

Nowosad avatar Sep 18 '23 08:09 Nowosad