Round all floats for frontend

Open mattbowen-usds opened this issue 2 years ago • 2 comments

While trying to understand what drives the size of our pbf tiles, we determined that adding an unrounded float that exists in most tracts increases file sizes for zoom level 5 by about 20% over the minimum file size possible.

We checked, and every float field in the tiles can safely be rounded to two decimal places, so let's make that change.
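As a rough illustration of the change being asked for, here is a minimal Python sketch that rounds every float in a tract's properties before it is written into a tile. The function name `round_floats` and the field names in the sample record are hypothetical, not taken from the actual pipeline.

```python
def round_floats(properties, ndigits=2):
    """Return a copy of `properties` with every float rounded to `ndigits` places.

    Non-float values (strings, booleans, ints) are passed through unchanged.
    """
    return {
        key: round(value, ndigits) if isinstance(value, float) else value
        for key, value in properties.items()
    }


# Hypothetical tract record; the real field names may differ.
tract = {
    "GEOID10": "01001020100",
    "score": 0.48231947,
    "is_disadvantaged": True,
    "state": "Alabama",
}
rounded = round_floats(tract)
```

Only the float field changes here; the identifier, the boolean, and the state name are left as they were.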

mattbowen-usds, Oct 25 '22 20:10

I did a little extra analysis, and here's WHY that matters:

  • The pbf format specifies that key-value pairs (the variables we send for every tract) have to be string:string. The type information is sent separately, but in the file everything is a string.
  • The format extracts strings into a "string table," which is why booleans and state names are so inexpensive: every unique string goes in the string table exactly once.
  • This also explains the difference between rounded and unrounded floats: rounded floats add only about 100 strings to the string table (0.00 through 1.00), but unrounded floats can add a distinct value for every tract (around 14K for some of the bigger files).
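The deduplication effect described above can be sketched with a small simulation. This is not a pbf encoder; it only models the string table as a set of unique strings, and the 14,000-tract count and the uniformly random scores are assumptions chosen to mirror the numbers in this comment.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRACT_COUNT = 14_000  # assumed size of one of the bigger tiles

# One made-up score in [0.0, 1.0) per tract.
values = [random.random() for _ in range(TRACT_COUNT)]


def table_size(vals):
    """Number of entries a deduplicating string table would need for `vals`."""
    return len({str(v) for v in vals})


unrounded_entries = table_size(values)
rounded_entries = table_size(round(v, 2) for v in values)

print(f"unrounded: {unrounded_entries} entries")
print(f"rounded:   {rounded_entries} entries")
```

With rounding, the table can never hold more than 101 distinct entries for this field (0.0 through 1.0 in steps of 0.01), however many tracts the tile covers. Without it, nearly every tract contributes its own entry.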

If I were building this again, I might consider putting much less data in each tile: basically the minimum necessary to render the map's styles and look up the tract. The rest of the fields could then come from a separate AJAX call, which could return a bunch of tracts at a time by grouping the results by FIPS code, and might even use a more compact encoding like MessagePack. But hindsight is of course 20/20.
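A minimal sketch of that alternative layout, under several assumptions: the set of fields needed for rendering (`RENDER_FIELDS`), the `GEOID10` key, and grouping by the five-digit county prefix of the tract ID are all illustrative choices, not something the project implemented. Each group could then be serialized (as JSON or MessagePack) and served by one request.

```python
from collections import defaultdict

# Hypothetical: the only fields the map styles need inside the tile itself.
RENDER_FIELDS = {"GEOID10", "is_disadvantaged"}


def split_tract(properties):
    """Split one tract's properties into (tile fields, detail fields)."""
    tile_props = {k: v for k, v in properties.items() if k in RENDER_FIELDS}
    detail_props = {k: v for k, v in properties.items() if k not in RENDER_FIELDS}
    return tile_props, detail_props


def group_details_by_county(tracts):
    """Group detail fields by county FIPS (first 5 digits of the tract ID).

    Returns {county_fips: {tract_id: detail_props}}.
    """
    groups = defaultdict(dict)
    for props in tracts:
        _, details = split_tract(props)
        geoid = props["GEOID10"]
        groups[geoid[:5]][geoid] = details
    return dict(groups)


# Made-up sample records.
tracts = [
    {"GEOID10": "01001020100", "is_disadvantaged": True, "score": 0.48},
    {"GEOID10": "01001020200", "is_disadvantaged": False, "score": 0.12},
    {"GEOID10": "01003010100", "is_disadvantaged": False, "score": 0.31},
]
grouped = group_details_by_county(tracts)
```

The tile keeps only the ID and the styling flag, and the frontend would fetch a county's detail fields when a tract in it is selected.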

mattbowen-usds, Oct 25 '22 21:10

This ended up saving 448KB off our worst-case tile (about 10%), which isn't nothing but also isn't anything I'd write home about.

mattbowen-usds, Oct 26 '22 18:10