geofi
Consider the VRK building address and voting district data (rakennusten osoitetiedot ja äänestysalueet)
Väestörekisterikeskus (VRK, the Population Register Centre) annually publishes a dataset containing all buildings in Finland. The data is a zipped, semicolon-delimited file with an .OPT extension and has about 3.6 million rows. It can be read and processed in R (slowly) with the following code:
# 2019
library(dplyr)
library(sp)
library(sf)
tmpfile <- tempfile()
tmpdir <- tempdir()
download.file("https://www.avoindata.fi/data/dataset/cf9208dc-63a9-44a2-9312-bbd2c3952596/resource/ae13f168-e835-4412-8661-355ea6c4c468/download/suomi_osoitteet_2019-05-15.zip",
destfile = tmpfile)
unzip(zipfile = tmpfile,
exdir = tmpdir)
opt <- read.csv(glue::glue("{tmpdir}/Suomi_osoitteet_2019-05-15.OPT"),
sep = ";",
stringsAsFactors = FALSE,
header = FALSE)
names(opt) <- c("rakennustu","sijaintiku",
"sijaintima","rakennusty",
"CoordY","CoordX",
"osoitenume", "katunimi_f",
"katunimi_s", "katunumero",
"postinumer", "vaalipiirikoodi",
"vaalipiirinimi","tyhja",
"idx", "date")
if (FALSE) { # optional: subset the data just to make the conversions faster
opt_orig <- as_tibble(opt)
opt <- sample_n(opt_orig, size = 2000)
}
opt$katunimi_f <- iconv(opt$katunimi_f, from = "windows-1252", to = "UTF-8")
opt$katunimi_s <- iconv(opt$katunimi_s, from = "windows-1252", to = "UTF-8")
opt$katunumero <- iconv(opt$katunumero, from = "windows-1252", to = "UTF-8")
opt$vaalipiirinimi <- iconv(opt$vaalipiirinimi, from = "windows-1252", to = "UTF-8")
sp.data <- SpatialPointsDataFrame(opt[, c("CoordX", "CoordY")],
opt,
proj4string = CRS("+init=epsg:3067"))
# Project the spatial data to lat/lon
# sp.data <- spTransform(sp.data, CRS("+proj=longlat +datum=WGS84"))
shape <- st_as_sf(sp.data)
# Inspect only the first few coordinates; printing all 3.6 million rows is slow
head(st_coordinates(shape))
# shape %>% select(rakennustu) %>% plot()
saveRDS(shape, file = "./sf19_buildings.RDS")
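As a possible simplification, the detour through sp could be skipped by building the sf object directly from the coordinate columns. The following is only a sketch under assumptions: `opt_toy` is a made-up stand-in for the `opt` data frame with invented coordinates, and it assumes CoordX holds the easting and CoordY the northing in EPSG:3067 (ETRS-TM35FIN), as in the code above.

```r
library(sf)

# Hypothetical toy stand-in for `opt`; the coordinate values are invented
opt_toy <- data.frame(
  rakennustu = c("A1", "B2"),
  CoordX = c(385000, 328000),   # assumed easting
  CoordY = c(6672000, 6822000), # assumed northing
  stringsAsFactors = FALSE
)

# Build point geometries straight from the coordinate columns.
# remove = FALSE keeps CoordX and CoordY as ordinary columns as well.
shape_toy <- st_as_sf(
  opt_toy,
  coords = c("CoordX", "CoordY"),
  crs = 3067,
  remove = FALSE
)

# Optional: reproject to WGS84 longitude/latitude
shape_wgs84 <- st_transform(shape_toy, crs = 4326)
```

With the real data, the same `st_as_sf()` call on `opt` would replace the `SpatialPointsDataFrame()` and `st_as_sf(sp.data)` steps; whether it is noticeably faster on 3.6 million rows has not been measured here.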
Any ideas on how to incorporate this into geofi? It is useful, for instance, when geocoding sensitive addresses.
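To illustrate the geocoding use case, here is a minimal sketch of looking up coordinates locally, so that sensitive addresses are never sent to an external service. Everything in it is an assumption for illustration: `buildings` and `addresses` are made-up toy tables, and matching on street name, street number and postal code is just one possible key.

```r
library(dplyr)

# Hypothetical toy stand-in for the preprocessed building table (invented values)
buildings <- tibble(
  katunimi_f = c("Mannerheimintie", "Aleksanterinkatu"),
  katunumero = c("1", "10"),
  postinumer = c("00100", "33100"),
  CoordX = c(385000, 328000),
  CoordY = c(6672000, 6822000)
)

# Addresses to geocode; these stay on the local machine
addresses <- tibble(
  id = 1:2,
  katunimi_f = c("Aleksanterinkatu", "Olematonkatu"),
  katunumero = c("10", "5"),
  postinumer = c("33100", "00000")
)

# Exact-match lookup: unmatched addresses get NA coordinates
geocoded <- left_join(
  addresses, buildings,
  by = c("katunimi_f", "katunumero", "postinumer")
)
```

With the real data, the key columns would need consistent types and spelling before joining; for example, postal codes read as numbers would lose their leading zeros.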
However, this would require storage, as the data should be preprocessed. Do you think this is suitable data for geofi, and should we create a data repo such as geofi_data?