bcmaps icon indicating copy to clipboard operation
bcmaps copied to clipboard

Download LiDAR tiles from LidarBC

Open ateucher opened this issue 2 years ago • 36 comments

https://www2.gov.bc.ca/gov/content/data/geographic-data-services/lidarbc

We can probably parse this file: https://nrs.objectstore.gov.bc.ca/gdwuts/

ateucher avatar Nov 16 '21 20:11 ateucher

There are three types of products:

  1. Raw Point Cloud: .laz files
  2. Digital Elevation Model (DEM): 1m and 10m as .tif files
  3. Digital Surface Model (DSM): .laz files

ateucher avatar Nov 16 '21 20:11 ateucher

A first step to parse the files and plot things up

Outstanding issues:

  • The geotifs have a CRS but the LAZ files do not, LAZ are in UTM, so need to know the tile to be able to know the correct CRS (EDIT: UTM Zone is in the LAZ filename...)
  • Need to make some search functions using the tiles
  • Would be nice to be able to query the available tiles (like a tile index sf object)
library(tidyverse)
library(RCurl)
#> 
#> Attaching package: 'RCurl'
#> The following object is masked from 'package:tidyr':
#> 
#>     complete
library(httr)
library(XML)
library(lidR)
#> Loading required package: raster
#> Loading required package: sp
#> 
#> Attaching package: 'raster'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> The following object is masked from 'package:tidyr':
#> 
#>     extract
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1

# XML Location 

url <- "https://nrs.objectstore.gov.bc.ca/gdwuts/"

# Parse XML 

my_xml <- XML::xmlToList(XML::xmlParse(RCurl::getURL(url)))

# Format XML 

my_xml_list <- my_xml %>% 
  unlist(recursive = FALSE) %>% 
  enframe() %>% 
  filter(name == "Contents.Key") %>% 
  mutate(value = unlist(value)) 

# File list 
tif_list <- my_xml_list[str_detect(my_xml_list$value, ".tif"),]
laz_list <- my_xml_list[str_detect(my_xml_list$value, ".laz"),]

#### TIF ####

file <- tif_list[200,2]$value
fileurl <- paste0(url, file)
local_file <- paste0(gsub("/","_",file))
download.file(url = fileurl, destfile = local_file, mode="wb")
tif <- stars::read_stars(local_file) 
tif %>% plot()
#> downsample set to c(2,2)

#### LAZ ####

file <- laz_list[200,2]$value
fileurl <- paste0(url, file)
local_file <- paste0(gsub("/","_",file))
download.file(url = fileurl, destfile = local_file, mode="wb")

las = lidR::readLAS(local_file)
projection(las) <- sf::st_crs(tif)
dtm <- grid_terrain(las, res = 1, knnidw(k=10,p=2))
#> Warning: There were 13 degenerated ground points. Some X Y Z coordinates were
#> repeated. They were removed.
#> Warning: There were 5 degenerated ground points. Some X Y coordinates were
#> repeated but with different Z coordinates. min Z were retained.
plot(dtm)

bevingtona avatar Nov 17 '21 06:11 bevingtona

Amazing @bevingtona!

ateucher avatar Nov 17 '21 17:11 ateucher

Hmm XML is only reading the first 1000 lines... somthing wrong in:

my_xml <- XML::xmlToList(XML::xmlParse(RCurl::getURL(url)))

image

bevingtona avatar Nov 17 '21 17:11 bevingtona

These are stored in an S3 bucket, so can use the {aws.s3} package:

library(aws.s3)
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.8.1, GDAL 3.2.1, PROJ 7.2.1
#> Registered S3 methods overwritten by 'stars':
#>   method             from
#>   st_bbox.SpatRaster sf  
#>   st_crs.SpatRaster  sf

bucket_092bdem <- get_bucket("gdwuts", 
                   prefix = "092/092b/2019/dem", 
                   base_url = "nrs.objectstore.gov.bc.ca", 
                   region = "")

bucket_df <- as.data.frame(bucket_092bdem)

file <- save_object(grep("092b053", bucket_df$Key, value = TRUE), 
                   bucket = "gdwuts", 
                   base_url = "nrs.objectstore.gov.bc.ca", 
                   region = "")

plot(read_stars(file))
#> downsample set to c(27)

Created on 2021-11-18 by the reprex package (v2.0.1)

ateucher avatar Nov 18 '21 23:11 ateucher

Very nice @ateucher ! Do you know of a way to query the available files via aws.s3?

bevingtona avatar Nov 19 '21 00:11 bevingtona

As far as I can see there is get_bucket where you can list objects starting with a prefix... The S3 documentation is here: https://docs.aws.amazon.com/s3/index.html, and it looks like {aws.s3} wraps most of that functionality.

It looks like there are many tens of thousands of files in that bucket, so listing them all isn't really feasible - I think searching/prefixing is the right way to go

ateucher avatar Nov 19 '21 00:11 ateucher

Feels like we can approach this the same way as the CDED data (with the exception of local storage) by just giving a function some spatial data and then returning (using prefixes and maptiles) to return the lidar data. @bevingtona would you use a vrt to similarly stitch together a few lidar files?

boshek avatar Nov 19 '21 00:11 boshek

@boshek For the existing .tif files available, yes the .vrt approach would work well

But the thing about lidar is that there are a great many data products possible from the .las file

So I suppose we need to decide what bcmaps wants to "offer"

  1. Download and mosaic geotiffs for an AOI
  2. Download las/laz files
  3. Process las/laz files into DTM, DSM, CHM, hillshade, etc.?

bevingtona avatar Nov 19 '21 00:11 bevingtona

Is this just leading to a new package? bclidar? where one can plot the index files and download the tif and or the las and then even have some vignettes for processing using the lidR and rlas packages?

bevingtona avatar Nov 19 '21 16:11 bevingtona

+1

boshek avatar Nov 19 '21 17:11 boshek

I think so too

ateucher avatar Nov 19 '21 17:11 ateucher

@ateucher @boshek let me know when it's done ;)

bevingtona avatar Nov 19 '21 17:11 bevingtona

jk, should we close the issue and start fresh in a new repo? I think cded can stay in bcmaps.. it plays well as a "wall to wall" dataset for BC..

bevingtona avatar Nov 19 '21 17:11 bevingtona

You can transfer an issue from one repo to another as well, that might be helpful

ateucher avatar Nov 19 '21 17:11 ateucher

+1 for {bclidar}

stephhazlitt avatar Nov 19 '21 17:11 stephhazlitt

Well this will make things easier!
https://catalogue.data.gov.bc.ca/dataset/lidarbc-open-lidar-data-portal/resource/c4d76fbf-fe8a-433a-8b9c-06a9ad00e664

bevingtona avatar Nov 19 '21 19:11 bevingtona

Totally! This package can definitely has bcdata as a dependency making retrieval pretty robust to backend changes (🤞 )

boshek avatar Nov 19 '21 20:11 boshek

Some more experimentation

library(esri2sf)
library(mapview)
library(bcmaps)
library(dplyr)
library(sf)

# AOI
aoi <- bcmaps::bc_cities() %>% filter(NAME == "Vancouver") %>% 
  st_buffer(5000) %>% 
  st_transform(4326) # NEEDS TO BE EPSG:4326 

mapview(aoi)


# LIST OF DATASETS
lidar_extent     <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/0"
lidar_dsm_02500k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/1"
lidar_dsm_20000k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/2"
lidar_pointcloud <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/3"
lidar_dem_02500k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/4"
lidar_dem_20000k <- "https://services6.arcgis.com/ubm4tcTYICKBpist/arcgis/rest/services/LiDAR_BC_S3_Public/FeatureServer/5"

# QUERY DATASET
tiles <- esri2sf(lidar_dem_20000k, bbox = aoi %>% st_bbox())
#> Layer Type: Feature Layer
#> Geometry Type: esriGeometryPolygon
#> Service Coordinate Reference System: PROJCS["PCS_Albers",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Albers"],PARAMETER["False_Easting",1000000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-126.0],PARAMETER["Standard_Parallel_1",50.0],PARAMETER["Standard_Parallel_2",58.5],PARAMETER["Latitude_Of_Origin",45.0],UNIT["Meter",1.0]]
#> Output Coordinate Reference System: 4326

mapview(aoi) + mapview(tiles)


# DOWNLOAD
fol <- tempdir()

for(i in 1:nrow(tiles)){
  print(i)
  tile <- tiles[i,]
  file <- tile$filename
  url <- tile$s3Url
  httr::GET(url, 
            httr::write_disk(paste(fol,file,sep="/"), 
                             overwrite=TRUE))}
#> [1] 1
#> [1] 2

l <- list.files(fol, pattern = "bc_", full.names = T)

sf::gdal_utils(util = "buildvrt", 
               source = l, 
               destination = "my.vrt")

stars::read_stars("my.vrt", proxy = T) %>% 
  mapview(maxpixels = 10000) + mapview(tiles)
#> Number of pixels is above 10000.Only about 10000 pixels will be shown.
#> You can increase the value of `maxpixels` to 323986185 to avoid this.

Created on 2021-11-26 by the reprex package (v2.0.0)

bevingtona avatar Nov 26 '21 08:11 bevingtona

Searching for a list of available lidar tiles brought me to this thread before I could find the index.

+1 for {bclidar} as well. Would love to see a command line interface to it as well if possible.

smnorris avatar Dec 02 '21 17:12 smnorris

Just wondering what the status of this functionality is - is this going to be written into the bcmaps package, or is bclidar being actively developed? I tried searching for that package name on Google and GitHub but being outside of government I wasn't able to find anything. Thanks all!

mcoghill avatar Nov 01 '22 18:11 mcoghill

No progress from me! Just the prototype above.. anyone else made any strides? @ateucher @stephhazlitt @boshek @HunterGleason @SashaNasonova

bevingtona avatar Nov 01 '22 20:11 bevingtona

I'm afraid not over here either...

ateucher avatar Nov 01 '22 22:11 ateucher

It is an open issue here too. https://github.com/smnorris/bcdata/issues/107 - has not been a priority so far.

smnorris avatar Nov 01 '22 22:11 smnorris

Not sure a 'bulk download' function is the best approach. Still ends up being quite a bit of processing locally. Wouldn't the better approach be to host the data as Cloud Optimized GeoTIFF and as Cloud Optimized Point Clouds?

bevingtona avatar Nov 01 '22 22:11 bevingtona

I'm sure you're right @bevingtona, unfortunately the hosting question is out of scope for bcmaps!

ateucher avatar Nov 01 '22 22:11 ateucher

I think the DEM TIFFs should be valid COGs already? I wasn't involved with the final publication/conversion but evaluated possible formats / compression and we pushed for COG, and I think with DEFLATE.

smnorris avatar Nov 01 '22 22:11 smnorris

$ rio cogeo validate bc_092b091_xl1m_utm10_2019.tif
The following warnings were found:
- The file is greater than 512xH or 512xW, it is recommended to include internal overviews

/Users/snorris/Downloads/bc_092b091_xl1m_utm10_2019.tif is a valid cloud optimized GeoTIFF

smnorris avatar Nov 01 '22 23:11 smnorris

@ateucher i guess this might be out of scope for bcmaps, but I think if the storage is solved, then bcmaps or bcdata would have associated functions to access the data. So in my mind, not sure where else to bring this conversation.

@smnorris good point! So either a SpatioTemporal Asset Catalog (STAC) of the COG tiles is needed, or stitch the lidar tiles into mosaics per project and serve them as COGs, then the user can query the bounding box etc in the HTTP request... or simply add the COG path into QGIS for example.

Not sure the best path forward.. but cool that these are COGs! I didn't realize!

image

bevingtona avatar Nov 01 '22 23:11 bevingtona

Thanks for the discussion! I know this isn't being prioritized, but I'll throw it out there and say that there is interest in a simple function that can accomplish downloading available LiDAR of a defined area in a similar manner to the cded functions when downloading TRIM elevation. Even if it starts out with @bevingtona's prototype of bulk downloading tiles, as a user I would be okay with that for now, and then perhaps later on changing the function to accomplish what you hope for using STAC's (that is beyond my understanding but hopefully it's not too difficult for you to figure out!).

Thanks all for the wonderful work you all do :)

mcoghill avatar Nov 09 '22 15:11 mcoghill