t-rex
t-rex copied to clipboard
What format of the geometry is faster for T-Rex?
Hi there,
in the project we are building at the moment the biggest problem is tiles generation speed.
So, these are input parameters:
T-Rex version 0.14.1
Config in toml.
# t-rex configuration
[service.mvt]
viewer = true
[[datasource]]
dbconn = "postgresql://{{USER}}:{{PASS}}@{{DBHOST}}.c4epkxdwfidz.us-east-1.rds.amazonaws.com/{{DBNAME}}"
name = "shared"
[grid]
# Predefined grids: web_mercator, wgs84
predefined = "web_mercator"
# ************** Global Settings **************
[[tileset]]
name = "municipal_zoning_code"
extent = [-198.808594,-16.499992,-61.699219,70.958956]
[[tileset.layer]]
datasource = "shared"
name = "municipal_zoning_code"
table_name = "land_use_city"
geometry_field = "geom"
geometry_type = "MULTIPOLYGON"
srid = 3857
#fid_field = "id"
#tile_size = "4096"
buffer_size = 3
simplify = false
minzoom = 10
maxzoom = 18
#query_limit = 1000
[[tileset.layer.query]]
sql = """SELECT Sieve(geom, ZRes(!zoom!::int - 1) * 3) as geom,id,title,land_use_type,place,jurisdiction,uniqueid,color FROM land_use_city WHERE geom && !bbox! and land_use_type='Municipal Zoning Code'"""
[cache.s3]
endpoint = "https://s3.us-east-1.amazonaws.com"
bucket = "{{BUCKET}}"
access_key = "{{AKEY}}"
secret_key = "{{SKEY}}"
region = "us-east-1"
baseurl = "https://st1-tiles.{{DOMAIN}}.com"
[webserver]
# Bind address. Use 0.0.0.0 to listen on all adresses.
bind = "0.0.0.0"
port = 6767
threads = 4
cache_control_max_age = 2592000
Number of records in DB - 13231 Number of records that match query in selected bbox - 60
Generation logs for:
time docker-compose run --name TL_01 --rm trex-master sh -c 't_rex generate --tileset municipal_zoning_code --overwrite true --minzoom 10 --maxzoom 18 --extent -81.658619,28.346742,-80.863155,28.786278 --config /storage/trex/config.toml;'
Timing
Z10 - 0m 20.023s
Z11 - 0m 14.679s
Z12 - 0m 31.691s
Z13 - 1m 55.904s
Z14 - 6m 17.114s
Z15 - 19m 56.263s
Z16 - 73m 25.188s
Z17 - 284m 10.441s
Z18 - 1130m 6.336s
Generation was performed on ec2 r5.4xlarge (16 vCPU 128 GB RAM) and with DB db.r6g.4xlarge (16 vCPU 128 GB RAM)
With bigger zoom levels, like 17 or 18 this takes hours. We are wondering if the issue is in data that we store as multipolygons. Those multipolygons contain from 1 to 2-3K individual polygons.

From the performance perspective, would it be better to have these data represented in DB as polygons, so instead of 60 records, in our case, we will have several thousand? Looking forward to recommendations.
Note: this is a part of the query that takes the most of the time:
SELECT ST_Multi(ST_Buffer(ST_Intersection(geom,ST_MakeEnvelope($1-?*$6::FLOAT8,$2-?*$6::FLOAT8,$3+?*$6::FLOAT8,$4+?*$6::FLOAT8,?)), ?)) AS geom,"id","title","land_use_type","place","jurisdiction","uniqueid","color" FROM (SELECT Sieve(geom, ZRes($5::int - ?) * ?) as geom,id,title,land_use_type,place,jurisdiction,uniqueid,color FROM land_use_city WHERE geom && ST_MakeEnvelope($1-?*$6::FLOAT8,$2-?*$6::FLOAT8,$3+?*$6::FLOAT8,$4+?*$6::FLOAT8,?) and land_use_type=?) AS _q

This is really extremely slow. Is there an index on the geometry field (CREATE INDEX ON land_use_city USING GIST(geom);) or could the Sieve function on the geometry be slow? I would recommend using EXPLAIN to find out which part of the query is slow.
Is there an index on the geometry field (CREATE INDEX ON land_use_city USING GIST(geom);)
There is an index for the geometry column. I've performed tests with it and without it for zoom levels 12 and 13 using the smaller machine. Generation time was the same.
The blue color and first execution is zoom level 12 and no index.
The red color and first execution is zoom level 13 and no index.
The second execution for both zoom levels was with the index field added.

or could the Sieve function on the geometry be slow?
I don't think that this is the case since the same performance was before we started using that function.
I would recommend using EXPLAIN to find out which part of the query is slow
I'll try to get this information for you and post it here.
Here is a log from DB for all of the queries that are longer than 4 seconds postgresql.log.2021-11-25-11(more than 4 seconds).txt .
Geometry index usage looks fine. How's the timing without using the Sieve function?
This is a log without using the Sieve function
postgresql.log.2021-11-26-13.txt
Also, this is a generation timing using ec2 r5.4xlarge (16 vCPU 128 GB RAM) and with DB db.r6g.4xlarge (16 vCPU 128 GB RAM) without the Sieve function.
| Zoom | With Sieve | Without Sieve |
|---|---|---|
| Z10 | 0m 20.023s | 0m 22.549s |
| Z11 | 0m 14.679s | 0m 8.360s |
| Z12 | 0m 31.691s | 0m 15.165s |
| Z13 | 1m 55.904s | 0m 20.980s |
| Z14 | 6m 17.114s | 0m 37.817s |
| Z15 | 19m 56.263s | 2m 13.246s |
| Z16 | 73m 25.188s | 5m 19.229s |
| Z17 | 284m 10.441s | 11m 45.406s |
| Z18 | 1130m 6.336s | 34m 49.984s |
Looks like I know the answer to why generation is so slow. However, still, wondering what it's better - polygon or multipolygon?