Docker image gets stuck in a retry loop when the daily download limit is reached
I use Docker Compose (managed with Ansible) and run the service with these settings:
geoip:
  image: "ghcr.io/maxmind/geoipupdate"
  restart: always
  volumes:
    - geodb:/usr/share/GeoIP
  environment:
    GEOIPUPDATE_FREQUENCY: 12
That is 12 hours between updates, and the container is restarted if it fails.
Under normal operation the logs show that everything is fine:
# STATE: Sleeping for 12 hours
# STATE: Running geoipupdate
But today I found the container in a loop:
Error retrieving updates: running the job processor: running job: unexpected HTTP status code: received HTTP status code: 429: {"code":"LIMIT_EXCEEDED","error":"Daily GeoIP database download limit reached"}
# STATE: Running geoipupdate
Error retrieving updates: running the job processor: running job: unexpected HTTP status code: received HTTP status code: 429: {"code":"LIMIT_EXCEEDED","error":"Daily GeoIP database download limit reached"}
# STATE: Running geoipupdate
Error retrieving updates: running the job processor: running job: unexpected HTTP status code: received HTTP status code: 429: {"code":"LIMIT_EXCEEDED","error":"Daily GeoIP database download limit reached"}
# STATE: Running geoipupdate
Error retrieving updates: running the job processor: running job: unexpected HTTP status code: received HTTP status code: 429: {"code":"LIMIT_EXCEEDED","error":"Daily GeoIP database download limit reached"}
# STATE: Running geoipupdate
....
That is roughly one request per second....
In my view, geoipupdate must NOT fail under this condition; it should simply report that the update was skipped because the limit was reached. Failing kills the container and triggers a Docker restart, which leads to the same outcome again, hitting the MaxMind servers once per second.
Normally a shell script continues even if a command fails, which would be the desired outcome here, unless it includes the following statement:
https://github.com/maxmind/geoipupdate/blob/1e656edd083235156034a58f846f10c7023b323e/docker/entry.sh#L3C1-L3C7
That statement makes the script fail and triggers the loop.
I have now found this pattern of "failing" when the update cannot be performed in several products that use MaxMind.
Not being able to perform the update due to rate limiting is not a fatal error and should just be logged; otherwise a much bigger problem is created.
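To illustrate what I mean, here is a minimal sketch of an update loop that treats a failed run as non-fatal. This is not the image's actual entry.sh: the geoipupdate invocation is simplified (no config handling), the # STATE messages just mirror the logs above, and only the error handling is the behaviour I am proposing.

#!/bin/sh
# Hypothetical sketch only, not the real entry.sh from the image.

frequency_hours="${GEOIPUPDATE_FREQUENCY:-0}"
sleep_seconds=$((frequency_hours * 60 * 60))

while true; do
    echo "# STATE: Running geoipupdate"
    if ! geoipupdate; then
        # Log and carry on instead of exiting: the container stays alive,
        # Docker never restarts it, and the next attempt only happens after
        # the normal sleep instead of once per second.
        echo "# STATE: Update failed, skipping this cycle" >&2
    fi
    [ "$sleep_seconds" -gt 0 ] || break  # run only once if no frequency is set
    echo "# STATE: Sleeping for ${frequency_hours} hours"
    sleep "$sleep_seconds"
done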
Thank you for the report! I agree that behaviour is not great.
One immediate mitigation to reduce the number of requests might be to increase the delay between retries. It looks like you could do that with restart_policy.
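For example, something along these lines (a sketch only: delay under deploy.restart_policy is part of the Compose specification, but it is only fully honoured when deploying to a Swarm, so plain docker compose may ignore the delay and fall back to the engine's built-in restart behaviour):

geoip:
  image: "ghcr.io/maxmind/geoipupdate"
  # No "restart: always" here; let the restart policy wait between attempts
  # so a failing container does not come straight back up and retry.
  deploy:
    restart_policy:
      condition: on-failure
      delay: 1h
  volumes:
    - geodb:/usr/share/GeoIP
  environment:
    GEOIPUPDATE_FREQUENCY: 12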
As for an overall solution, I'm not sure what's best. I don't think we'd want to continue the loop for all errors.