Consider compressing the json files to reduce repo size?

Open lyricnz opened this issue 2 years ago • 6 comments

Not sure if it's an issue (or whether there is any limit), but the /results directory is currently ~700MB with only 4.0% of all suburbs processed (39% of the listed suburbs).

If we need to reduce that, we could compress the geojson files for roughly 10:1 savings. This would make "diffs" fairly useless, and would require updates to sites/index.html to support it. (A sketch of what the compression pass might look like is below.)
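
Something like this minimal Node.js sketch could do the one-off compression pass; it assumes the GeoJSON files sit under results/ (the recursion and the .gz naming are illustrative, not necessarily how the repo is laid out):

        // Recursively gzip every .geojson file under a directory, replacing
        // the originals with .gz copies (illustrative one-off script).
        const fs = require('fs');
        const path = require('path');
        const zlib = require('zlib');

        function compressDir(dir) {
            for (const name of fs.readdirSync(dir)) {
                const file = path.join(dir, name);
                if (fs.statSync(file).isDirectory()) {
                    compressDir(file); // recurse into subdirectories
                } else if (name.endsWith('.geojson')) {
                    const data = fs.readFileSync(file);
                    // Level 9 trades CPU for the best ratio, in line with
                    // the ~10:1 savings mentioned above.
                    fs.writeFileSync(file + '.gz', zlib.gzipSync(data, { level: 9 }));
                    fs.unlinkSync(file); // drop the uncompressed original
                }
            }
        }

        compressDir('results');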

FWIW this is not a problem for the live website, as most browsers will use gzip content-encoding when fetching the geojson files.

lyricnz avatar Jun 06 '23 00:06 lyricnz

GitHub has a soft limit of 5GB for repositories.

Judging by the total number of addresses in the results output, I think the suburbs we have already completed tend to skew larger. (PS: it might be good to have a tally of the total number of premises processed vs. all premises in GNAF.)

We may well have to do that, which would be annoying, but I don't know what other alternatives exist.

LukePrior avatar Jun 06 '23 01:06 LukePrior

I'm not a JS person at all, but here's how to decompress gzip files in JS (using async/await rather than promise chains):

        // Fetch a gzip-compressed GeoJSON file and decompress it in the
        // browser with the built-in DecompressionStream API.
        async function loadSuburbA(url) {
            const ds = new DecompressionStream('gzip');
            const result = await fetch(url);
            // Pipe the compressed response body through the gzip decompressor.
            const decompressedStream = result.body?.pipeThrough(ds);
            // Wrap the stream in a Response to reuse its built-in JSON parser.
            const stuff = await new Response(decompressedStream).json();
            return stuff;
        }
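
A hypothetical call site (the path is illustrative; it just assumes the compressed files keep a .geojson.gz suffix):

        // e.g. inside an async handler:
        const geojson = await loadSuburbA('results/some-suburb.geojson.gz');
        // geojson is now the parsed GeoJSON object, ready for the map layer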

lyricnz avatar Jun 06 '23 01:06 lyricnz

We're currently 42.6% of the way through the list of addresses in the DB: https://github.com/LukePrior/nbn-upgrade-map/issues/126#issuecomment-1577878069

lyricnz avatar Jun 06 '23 04:06 lyricnz

In that case I say we wait and see what the overhead is from the remaining suburbs.

LukePrior avatar Jun 06 '23 04:06 LukePrior

Current ./results is 2.6GB with 57.3% of addresses processed (extrapolating linearly, that's roughly 4.5GB once complete).

lyricnz avatar Jul 04 '23 23:07 lyricnz

I think leaving it as is will probably be fine unless GitHub starts complaining. The gzip content-encoding used when fetching from the website works fine.

LukePrior avatar Jul 06 '23 02:07 LukePrior