tmap
tm_labels -> looking for algorithms
In tmap4, I started the implementation of two text layer functions:
- tm_text: intended to print text to represent data directly, i.e. with visual variables.
- tm_labels: to label points, lines, and/or polygons.
The coordinates in tm_text are used as they are (by default). Example:
tm_shape(World, bbox = World) +
tm_text("name", size="pop_est", col="continent",
col.scale = tm_scale_categorical(values = "seaborn.dark"),
col.legend = tm_legend_hide(),
size.scale = tm_scale_continuous(values.scale = 4),
size.legend = tm_legend_hide())
For tm_labels the aim is not to print the text at the exact coordinates, but next to (or on top of) the geometries that they refer to.
So, this function requires some intelligent algorithms to place the text.
tmap3 already contained some 'intelligent' features in tm_text, namely the arguments auto.placement, remove.overlap, along.lines and overwrite.lines. I've migrated all of them in tmap4 already (except the last one, which was also not working well in the latest version of tmap3). But there still is a need for further extensions.
Points
The automatic placement function for points is based on car::pointLabel (earlier part of maptools). This function was also used in tmap3, but I improved it a bit. What it does: it places the labels close to (but not on top of) the points such that overlap is minimised. Under the hood, simulated annealing and a genetic algorithm are used:
metroAfrica = sf::st_intersection(metro, World[World$continent == "Africa", ])
Africa = World[World$continent == "Africa", ]
tm_shape(land) +
tm_raster("cover_cls",
col.scale = tm_scale(values = cols4all::c4a("brewer.pastel1")[c(3,7,7,2,6,1,2,2)]),
col.legend = tm_legend_hide()) +
tm_shape(rivers) +
tm_lines(lwd = "strokelwd", lwd.scale = tm_scale_asis(values.scale = .3), col = cols4all::c4a("brewer.pastel1")[2]) +
tm_shape(Africa, is.main = TRUE) +
tm_borders() +
tm_shape(metroAfrica) +
tm_symbols(fill = "red", shape = "pop2020", size = "pop2020",
size.scale = tm_scale_intervals(breaks = c(1, 2, 5, 10, 15, 20, 25) * 1e6, values.range = c(0.2,2)),
size.legend = tm_legend("Population in 2020"),
shape.scale = tm_scale_intervals(breaks = c(1, 2, 5, 10, 15, 20, 25) * 1e6, values = c(21, 23, 22, 21, 23, 22)),
shape.legend = tm_legend_combine("size")) +
tm_labels("name")
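To make the idea behind the point placement concrete, here is a minimal base R sketch of simulated annealing over four candidate positions per point. It is not the car::pointLabel or tmap code; all function and argument names (label_box, overlap_cost, place_labels_sa, temp, cooling) are hypothetical, and it only penalises label-label overlap, not overlap with the points themselves.

```r
# Sketch only: simulated annealing over four candidate label positions per point.
# None of these functions exist in tmap or car; they are illustrative.

# Bounding boxes of labels (width w, height h) placed in quadrant `pos`
# (1 = NE, 2 = NW, 3 = SW, 4 = SE) of the points (x, y).
label_box <- function(x, y, w, h, pos) {
  dx <- c(0, -w, -w, 0)[pos]
  dy <- c(0, 0, -h, -h)[pos]
  cbind(xmin = x + dx, xmax = x + dx + w, ymin = y + dy, ymax = y + dy + h)
}

# Total pairwise overlap area between boxes (the cost to minimise).
overlap_cost <- function(b) {
  n <- nrow(b)
  if (n < 2) return(0)
  cost <- 0
  for (i in 1:(n - 1)) {
    j <- (i + 1):n
    ox <- pmax(0, pmin(b[i, "xmax"], b[j, "xmax"]) - pmax(b[i, "xmin"], b[j, "xmin"]))
    oy <- pmax(0, pmin(b[i, "ymax"], b[j, "ymax"]) - pmax(b[i, "ymin"], b[j, "ymin"]))
    cost <- cost + sum(ox * oy)
  }
  cost
}

place_labels_sa <- function(x, y, w, h, iter = 2000, temp = 1, cooling = 0.995) {
  n <- length(x)
  pos <- rep(1L, n)
  cost <- overlap_cost(label_box(x, y, w, h, pos))
  best_pos <- pos
  best_cost <- cost
  for (k in seq_len(iter)) {
    # Propose moving one randomly chosen label to another quadrant.
    cand <- pos
    i <- sample.int(n, 1)
    cand[i] <- sample(setdiff(1:4, pos[i]), 1)
    cand_cost <- overlap_cost(label_box(x, y, w, h, cand))
    # Always accept improvements; accept worse states with a probability
    # that shrinks as the temperature cools.
    if (cand_cost <= cost || runif(1) < exp((cost - cand_cost) / temp)) {
      pos <- cand
      cost <- cand_cost
    }
    if (cost < best_cost) {
      best_pos <- pos
      best_cost <- cost
    }
    temp <- temp * cooling
  }
  list(pos = best_pos, cost = best_cost)
}

# Three points on a row whose labels overlap when all are placed to the NE.
set.seed(1)
res <- place_labels_sa(x = c(0, 0.5, 1), y = c(0, 0, 0), w = 1, h = 0.2)
```

The returned cost can never exceed the cost of the starting configuration, because the best state seen so far is tracked separately from the current state.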
The automatic removal of overlapping labels is also implemented, but could be improved. For instance, we should be able to specify some sort of weight, determining the importance of the labels.
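One possible reading of such a weight, sketched in base R: visit the labels from most to least important and keep a label only if it does not overlap one that has already been kept. The function name remove_overlap_weighted and the weight argument are hypothetical, not existing tmap options.

```r
# Sketch only: greedy, weight-ordered removal of overlapping labels.
# `boxes` is a matrix with columns xmin, xmax, ymin, ymax (one row per label);
# `weight` is a hypothetical importance score, higher = more important.
remove_overlap_weighted <- function(boxes, weight) {
  keep <- logical(nrow(boxes))
  for (i in order(weight, decreasing = TRUE)) {
    k <- which(keep)
    # Axis-aligned rectangle intersection test against all kept labels.
    hit <- boxes[i, "xmin"] < boxes[k, "xmax"] & boxes[i, "xmax"] > boxes[k, "xmin"] &
      boxes[i, "ymin"] < boxes[k, "ymax"] & boxes[i, "ymax"] > boxes[k, "ymin"]
    if (!any(hit)) keep[i] <- TRUE
  }
  keep
}

boxes <- rbind(c(0, 2, 0, 1), c(1, 3, 0, 1), c(2.5, 4, 0, 1))
colnames(boxes) <- c("xmin", "xmax", "ymin", "ymax")
remove_overlap_weighted(boxes, weight = c(1, 5, 3))  # FALSE  TRUE FALSE
remove_overlap_weighted(boxes, weight = c(5, 1, 3))  #  TRUE FALSE  TRUE
```

With the same three boxes, changing the weights changes which labels survive, which is the behaviour a weight option would need to provide.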
What we also need: linking lines between labels and points, especially for those that are far away.
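As a rough illustration of the linking lines, the base R sketch below connects each point to the nearest spot on its label box, but only when that distance exceeds a threshold. The function name leader_lines and the min_dist argument are assumptions made for this example.

```r
# Sketch only: linking lines from points to labels that ended up far away.
# `boxes` has columns xmin, xmax, ymin, ymax; `min_dist` is a hypothetical
# threshold in map units below which no line is drawn.
leader_lines <- function(px, py, boxes, min_dist) {
  # Nearest point on each label box to its anchor: clamp the anchor to the box.
  lx <- pmin(pmax(px, boxes[, "xmin"]), boxes[, "xmax"])
  ly <- pmin(pmax(py, boxes[, "ymin"]), boxes[, "ymax"])
  d <- sqrt((lx - px)^2 + (ly - py)^2)
  far <- d > min_dist
  data.frame(id = which(far), x0 = px[far], y0 = py[far], x1 = lx[far], y1 = ly[far])
}

boxes <- rbind(c(0.1, 1.1, 0.1, 0.3), c(3, 4, 4, 4.5))
colnames(boxes) <- c("xmin", "xmax", "ymin", "ymax")
leader_lines(px = c(0, 0), py = c(0, 0), boxes = boxes, min_dist = 1)
# one row: id 2, from (0, 0) to (3, 4)
```

The resulting data frame could be drawn with segments() in base graphics or converted to line geometries.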
Lines
For labeling lines, the option along.lines has been migrated from tmap3. This calculates the angle of the line at the centroid. It works okayish, but needs refinement. Ideally, the labels should be right next to the lines rather than on top:
DE = World[World$name == "Germany",]
rivers_DE = sf::st_intersection(rivers, DE)
tm_shape(DE, crs = 3035) +
tm_polygons() +
tm_shape(rivers_DE) +
tm_lines(lwd = "strokelwd", lwd.scale = tm_scale_asis()) +
tm_labels("name", bgcol = "grey85")
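A possible refinement for placing labels next to a line, sketched in base R: take the point halfway along the line, use the direction of the segment it falls on as the text angle, and shift the label perpendicular to that segment. This is not how along.lines is implemented; label_along_line and offset are hypothetical names, and the sketch uses the midpoint along the line length rather than the centroid.

```r
# Sketch only: label position next to a line, rotated to follow it.
# `coords` is a two-column matrix of line vertices; `offset` is a hypothetical
# perpendicular distance in map units.
label_along_line <- function(coords, offset) {
  seg <- diff(coords)                         # segment vectors (one row each)
  len <- sqrt(rowSums(seg^2))                 # segment lengths
  half <- sum(len) / 2
  i <- which(cumsum(len) >= half)[1]          # segment containing the midpoint
  t <- (half - (cumsum(len)[i] - len[i])) / len[i]
  mid <- unname(coords[i, ] + t * seg[i, ])
  dir <- unname(seg[i, ] / len[i])
  if (dir[1] < 0) dir <- -dir                 # keep the text reading left to right
  normal <- c(-dir[2], dir[1])                # unit vector to the left of dir
  list(x = mid[1] + offset * normal[1],
       y = mid[2] + offset * normal[2],
       angle = atan2(dir[2], dir[1]) * 180 / pi)
}

# An L-shaped line: the midpoint lies on the vertical part.
label_along_line(rbind(c(0, 0), c(2, 0), c(2, 6)), offset = 0.5)
# x = 1.5, y = 2, angle = 90
```

A fuller version would also have to deal with multilinestrings and with labels that are longer than the segment they sit on.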
Polygons
No implementation yet. In tmap3, the user could scale the text with "AREA", and use several scaling settings with root, print.tiny, and size.lowerbound. In tmap4, those belong imho to tm_text and the visual variable size. For tm_labels I am looking for a geometry-driven rather than data-driven procedure.
What I have in mind is a configurable procedure like this:
- Find a spot in the (multi)polygon where the text fits in. Options could be:
  - stay.at.centroid to prevent text labels from being drawn elsewhere,
  - allow.rotation to allow labels to be rotated if they have a better fit (think of Italy),
  - decrease.font.size to decrease font size in case labels do not fit.
- Find spots for the labels of the unlabelled polygons. They should be placed outside any polygon.
- We need linking lines between those labels and points. A common application is a standard US state map.
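The first step of this procedure could look roughly like the base R sketch below, which covers only the stay.at.centroid case: it tries to fit the label rectangle at the centre of a single polygon ring, optionally rotating it and shrinking it. Everything here is an assumption made for illustration: the function names, the angles and min_scale arguments, the use of the vertex mean instead of the true centroid, and the grid-based (approximate) containment test.

```r
# Sketch only: fit a label rectangle at the centre of a polygon ring.
# `poly` is a two-column matrix of ring vertices (not closed, no holes).

# Ray-casting test: is the point (px, py) inside the ring?
point_in_poly <- function(px, py, poly) {
  n <- nrow(poly)
  x1 <- poly[, 1]
  y1 <- poly[, 2]
  x2 <- poly[c(2:n, 1), 1]
  y2 <- poly[c(2:n, 1), 2]
  crosses <- ((y1 > py) != (y2 > py)) &
    (px < (x2 - x1) * (py - y1) / (y2 - y1) + x1)
  sum(crosses) %% 2 == 1
}

# Does a w-by-h rectangle centred at (cx, cy), rotated by `angle` degrees, fit?
# Approximation: test a grid of sample points rather than exact containment.
rect_fits <- function(cx, cy, w, h, angle, poly) {
  g <- expand.grid(u = seq(-w / 2, w / 2, length.out = 9),
                   v = seq(-h / 2, h / 2, length.out = 5))
  a <- angle * pi / 180
  x <- cx + g$u * cos(a) - g$v * sin(a)
  y <- cy + g$u * sin(a) + g$v * cos(a)
  all(mapply(point_in_poly, x, y, MoreArgs = list(poly = poly)))
}

fit_polygon_label <- function(poly, w, h, allow.rotation = TRUE,
                              decrease.font.size = TRUE,
                              angles = c(0, 30, 60, 90, -30, -60),
                              min_scale = 0.5) {
  centre <- colMeans(poly)  # vertex mean as a stand-in for the true centroid
  angles <- if (allow.rotation) angles else 0
  scales <- if (decrease.font.size) seq(1, min_scale, by = -0.1) else 1
  # Prefer the full font size; within a size, prefer the earliest angle.
  for (s in scales) for (a in angles) {
    if (rect_fits(centre[1], centre[2], w * s, h * s, a, poly)) {
      return(list(x = centre[[1]], y = centre[[2]], angle = a, scale = s))
    }
  }
  NULL  # no fit: this polygon's label would go outside, with a linking line
}

# A tall, narrow polygon: a wide label only fits when rotated by 90 degrees.
tall <- rbind(c(0, 0), c(1, 0), c(1, 5), c(0, 5))
fit_polygon_label(tall, w = 3, h = 0.5)                          # angle 90, scale 1
fit_polygon_label(tall, w = 3, h = 0.5, allow.rotation = FALSE)  # NULL
```

Polygons for which this returns NULL are the "unlabelled polygons" of the second step, whose labels would be placed outside and connected with linking lines.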
Options
Note: tm_text and tm_labels are the same layer function, but with different layer options. As you can see, I've placed them into opt_tm_<layer> (see #848 option4).
The option names are not finalised, so if you have suggestions for better option names, let me know!
Tips and help welcome!
Do you know any implemented algorithms that we can use? There is ggrepel (mentioned in #808), but so far, I wasn't able to extract the algorithms from the ggplot2 ecosystem. Help is more than welcome.
Also related to #279 and #373, and pinging @Nowosad @Robinlovelace @olivroy @agila5 @rogerbeecham @staropram
This all looks good to me. I remember seeing a very good package for generating labels along curved lines but cannot recall what it's called. The {ggrepel} package also provides good non-overlapping labels. Could that be useful for tm_text()?
Maybe also useful for polygons https://fosstodon.org/@atsyplenkov/112109165619723305
I used FField for automatic label placement (function FFieldPtRep), but the package has since been archived.
Thanks for your input!
@Robinlovelace Yes, indeed, but see the last paragraph of my opening post.
@tim-salabim Lots of interesting projects mentioned there: https://github.com/atsyplenkov/centerline https://github.com/tylermorganwall/raybevel
@sjewo Awesome, those are the type of functions I am looking for. The source code is still available: https://github.com/cran/FField/blob/master/R/FField.R My understanding is that it essentially does the same thing as car::pointLabel, but I like how the labels are aligned. It doesn't take polygon areas into account, does it?
Hi @mtennekes !
I used a combination of automatic optimization of y coordinate and manual placement in x direction. I updated my code to use sf and changed it to take the borders into account.
library(tmap)
library(sf)
library(FField)
data("NLD_prov")
# rename shape
shp <- NLD_prov
# Centroids of polygon object
pts_label <- pts_anchor <- st_coordinates(st_centroid(shp))
# Identify polygon position relative to center
lrvec <- (pts_anchor[,"X"] >= mean(pts_anchor[, "X"])) + 1
# detect boundary of shp
shp_boundary <- st_coordinates(st_simplify(st_boundary(st_union(shp)), dTolerance = 2000))
# identify x value from border at height y (here one might also take the x position into account)
pts_label[,1] <- sapply(pts_label[,2], function(y) {
  row <- which.min(abs(shp_boundary[, "Y"] - y))
  shp_boundary[row, "X"]
})
# move x coordinate away from the polygon boundary to the left or right (lrvec)
pts_label[,1] <- pts_label[,1] + c(-1, 1)[lrvec] * 0.5 * sd(pts_label[,1])
# normalize coordinates
z_pts_label <- scale(pts_label)
# optimize position
pts_label_optimized <- FFieldPtRep(z_pts_label, rep.dist.lmt = 1, iter.max = 20000)
# inverse normalization only in y direction
#pts_label_optimized[,1] <- (pts_label_optimized[,1] * attr(z_pts_label, "scaled:scale")[1]) + attr(z_pts_label, "scaled:center")[1]
pts_label_optimized[,1] <- pts_label[,1]
pts_label_optimized[,2] <- (pts_label_optimized[,2] * attr(z_pts_label, "scaled:scale")[2]) +attr(z_pts_label, "scaled:center")[2]
# create sf-object
shp_label_optimized <- st_as_sf(data.frame(pts_label_optimized, st_drop_geometry(shp)), coords = c("x", "y"), crs = st_crs(shp))
# get position relative to mean for the coordinates
shp_label_optimized$just <- c(-1.5, 1.5)[(st_coordinates(shp_label_optimized)[,"X"] >= mean(st_coordinates(shp_label_optimized)[, "X"])) + 1]
# create connecting lines
m <- lapply(1:nrow(pts_anchor), function(i) {
  matrix(t(data.frame(pts_anchor, pts_label_optimized)[i,]), ncol = 2, byrow = TRUE)
})
edges <- sf::st_sfc(st_multilinestring(x = m), crs = st_crs(shp))
# plot
tm_shape(shp) +
tm_dots(size = 0.7) +
tm_borders() +
tm_shape(shp_label_optimized) +
tm_text("name", size = 0.8, xmod = "just") +
tm_shape(edges) +
tm_lines() +
tm_layout(frame = FALSE,
inner.margins.extra = rep(0.3, 4))
> I remember seeing a very good package for generating labels along curved lines but cannot recall what it's called.
I was reminded of isoband, but I'm not sure how good the algo is. Placement is pretty nice: