Improve resilience if diff fetch fails or .osc.gz file is corrupted
An interrupted download of a .osc.gz file can leave an incomplete file in the diff directory. When apply_osc_to_db.sh then tries to use this file, it hangs and never completes the update. At this point, the script prints the following error:
gzip: stdin: unexpected end of file
Reading XML file ...Parse error at line NNNN:
unclosed token
To recover from this scenario, the broken .osc.gz must be re-downloaded manually, because the fetch_osc.sh script will not replace it automatically. After that, all of the components must be restarted, including the dispatcher processes. If apply_osc_to_db.sh is restarted without restarting the dispatcher, it will still hang and fail to complete the update.
Two fixes would help: the fetch_osc.sh script could be better protected against interrupted downloads, and the apply_osc_to_db.sh script could handle corrupted .osc.gz files more gracefully.
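As a rough sketch of what both protections might look like (not how the existing scripts work): the helper names `osc_is_intact` and `fetch_verified` are hypothetical, and the sketch assumes that `wget` is the downloader and that `gzip -t` is an acceptable integrity check. The idea is to download to a temporary name, verify the archive, and only then rename it into place, so a partial file never shows up under the final name. The same check could be run on the apply side before a diff is handed to the updater.

```shell
# Hypothetical helpers; not part of the existing Overpass scripts.

# Return 0 only if the file exists, is non-empty, and passes gzip's integrity test.
osc_is_intact() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Download URL to TARGET via a temporary file in the same directory.
# The final name only appears once the download has been verified.
fetch_verified() {
  local url="$1" target="$2" tmp
  tmp="$(mktemp "${target}.part.XXXXXX")" || return 1
  if wget -q -O "$tmp" "$url" && osc_is_intact "$tmp"; then
    # A rename within the same directory replaces the target in one step.
    mv -f "$tmp" "$target"
  else
    # Discard the partial or corrupt download so it is never applied.
    rm -f "$tmp"
    return 1
  fi
}
```

A caller could retry `fetch_verified` on failure, and the apply step could skip or re-fetch any diff for which `osc_is_intact` returns non-zero instead of passing it on.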
I think this issue should be reproducible by manually truncating a .osc.gz file before it is processed.
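A minimal way to produce such a truncated file for testing might look like the following. The file names and the placeholder XML are assumptions for illustration only; a real reproduction would truncate an actual minute diff in the diff directory before apply_osc_to_db.sh picks it up.

```shell
# Build a small stand-in diff; the content is a placeholder, not a real minute diff.
workdir="$(mktemp -d)"
printf '<osmChange version="0.6"></osmChange>\n' | gzip > "$workdir/sample.osc.gz"

# Keep only the first half of the bytes to mimic an interrupted download.
size="$(wc -c < "$workdir/sample.osc.gz")"
head -c "$((size / 2))" "$workdir/sample.osc.gz" > "$workdir/truncated.osc.gz"

# gzip -t exits non-zero when the archive is incomplete.
if gzip -t "$workdir/truncated.osc.gz" 2>/dev/null; then
  truncated_status=intact
else
  truncated_status=corrupt
fi
echo "truncated copy: $truncated_status"
```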
The same thing happens in case of a simple network issue:
2025-02-06 22:53:49 URL:https://osm-planet-eu-central-1.s3.dualstack.eu-central-1.amazonaws.com/planet/replication/minute/state.txt [86/86] -> "/tmp/osm-3s_update_nIXV4S/state.txt" [1]
2025-02-06 22:53:49 URL:https://osm-planet-eu-central-1.s3.dualstack.eu-central-1.amazonaws.com/planet/replication/minute/006/462/662.state.txt [86/86] -> "/tmp/osm-3s_update_nIXV4S/006462662.state.txt" [1]
Unable to establish SSL connection.
gzip: stdin: unexpected end of file
Check State
Apply
Reading XML file ...Parse error at line 1:
no element found
Unfortunately, restarting the apply_and_fetch process doesn't help. It downloads new files, but then hangs after printing 'Apply'. A full restart of the dispatcher is needed in this case.