osmprj icon indicating copy to clipboard operation
osmprj copied to clipboard

Add support for Global Human Settle Layer datasets

Open travishathaway opened this issue 3 years ago • 1 comments

The global human settlement layer (GHSL) offers a global data set for population densities. Combining this with OSM data could provide for some pretty interesting analysis. I am creating this issue as a way to document how I go about adding support for this data in this project.

Acceptance Criteria

  • The process for importing this data is clearly documented somewhere (docs for the project or even a link to a publicly available Google Doc)
  • A new report command has been created showing how to query this data and create some reports/visualizations
  • Perhaps even a new command for importing this data (optional)

travishathaway avatar Apr 19 '22 09:04 travishathaway

Here's the method I've come up so far for this (written as a bash script):

# Download the whole data set first
curl -O -L https://cidportal.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_54009_250/V1-0/GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.zip

# Unzip to preferred location

# Reproject data set to EPSG:3857
gdalwarp \
  -t_srs EPSG:3857 \
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.tif \
  GHS_POP_tmp.tif

# Compress; the temp file we just created is about 45GB!!!
gdal_translate \
  -co compress=lzw \
  GHS_POP_tmp.tif \ 
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif

rm GHS_POP_tmp.tif

# Extract based on a country polygon
gdalwarp \ 
    -of GTiff \
    -cutline ../natural_earth_datasets/germany.geojson \
    -crop_to_cutline GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif \
    germany_population.tiff 

# Import into PostgreSQL
# This step can also be piped directly to the psql command
raster2pgsql \
  -c \
  -s 3857 \
  -t auto \
  -I \
  -M \
  germany_population.tiff \
  public.pop_data > germany_pop.sql

At this point, we are ready to begin querying our database.

The only thing that I don't like about the above solution is that when I reproject to a different SRS, the data set balloons to 45GB!!! (from only being about 500MB). I have no idea why this is happening, but it seems like the solution is just to compress it again and delete the temp file.

travishathaway avatar Apr 19 '22 14:04 travishathaway