[management] Auto update geolite
Describe your changes
Introduces automated updates for GeoLite databases, eliminating the need to manually run the download-geolite2.sh script. Key changes include functions to fetch and update database files based on versioned filenames, auto-download logic when outdated files are detected, and a new --disable-geolite-update flag.
By default, this flag is set to true, disabling automatic updates. Users who want to enable auto-updates can do so by setting --disable-geolite-update=false. Additionally, obsolete scripts and reload functions have been removed.
This update simplifies version tracking and ensures database consistency without manual intervention while offering flexibility for self-hosted users.
Issue ticket number and link
Checklist
- [ ] Is it a bug fix
- [ ] Is a typo/documentation fix
- [x] Is a feature enhancement
- [x] It is a refactor
- [ ] Created tests that fail without the change (if possible)
- [x] Extended the README / documentation, if necessary
I can go ahead and refactor the switch...case... statement into two separate functions if that's necessary for the SonarCloud analysis
Hi @benniekiss, Thank you for your valuable contribution!.
How does it handle automatic geolite db's updates? For instance, I'm currently running the 20240305 db's, and when updates are released, will it upgrade the current GeoLite db's?
Hi @benniekiss, Thank you for your valuable contribution!.
How does it handle automatic geolite db's updates? For instance, I'm currently running the
20240305db's, and when updates are released, will it upgrade the current GeoLite db's?
It just downloads the new files and recreates them with the fresh data. The way it decides to download the new files is whether or not the files posted at https://docs.netbird.io/selfhosted/geo-support#downloading-the-databases have changed filenames. None of the logic around database creation has changed -- I really only changed the filename parsing and the heuristics around deciding when to recreate the databases.
Regarding whether it will auto-download when updates are released, I am not 100% sure. However, I have tested by naming the Geolite db's something like GeoLite2-City_20200101.mmdb, starting the management service, and seeing if it will pull the current databases -- and it does successfully download. It does not download if the database filenames match. I've included logging as well that makes this apparent to users.
I noticed that both the geonames.db and Geolite2-Cities.mmdb filenames were hardcoded, and the function getDatabaseFileName always returned download.SUFFIX as the filename since it was parsing the url, not the headers. This meant that it was difficult to tell what version the current geolite database was for admins.
Beyond the auto-updating feature, the new functions name the databases based on the filenames parsed from the Content-Disposition header, not the url, so it's easier to tell what version the database is and whether it is out of date.
The reason I put this together was because I was reviewing the documentation and saw the need to manually run download-geolite2.sh whenever an update was pushed; and I also noticed that outdated geolite data was the cause of some issues users had reported. This should make it easier to diagnose those issues, and also removes the need for users to manually manage the databases.
How should I go about implementing a disable flag? Since the check only happens when the management service is started, I could add a flag, --disable-geolite-update, so that if the geolite database doesnt exist, it is downloaded, and then on subsequent restarts, no further updates are done and only the existing database is loaded.
You’re on the right track with the --disable-geolite-update flag. I suggest implementing it so that the database files are only downloaded the first time if they do not already exist. The key behavior we want to control with this flag is the check for outdated local databases.
If the flag is set, the system should skip the process of checking whether the local databases are outdated by comparing the names from the download URLs. This means that no further updates are attempted unless the user explicitly opts to check for updates on every restart of the management service.
Had to do some refactoring to allow disabling the auto-update feature, but I think this is a little cleaner approach.
I consolidated getGeoNamesDBFilename and getMMDBFilename into an exported function, GetMaxMindFilenames, which accepts two parameters: a dataDir string and an autoUpdate bool. When autoUpdate is false, it skips the URL name lookup and goes straight to finding an existing database file; if it cannot find a database with the new naming scheme, it defaults to the old database names, geonames.db and GeoLite2-City.mmdb. This should allow current deployments to disable auto-updating the database and essentially maintain the same functionality.
I added a unit test for GetMaxMindFilenames, which I tested locally and was passing.
Finally, I changed the .mmdb naming scheme from GeoLite2-City_{DATE}.mmdb to GeoLite2-City-maxmind_{DATE}.mmdb for clarity.
If users want to migrate to the new auto-update functionality, they have to manually remove the old databases.
Otherwise, once I remove the download-geolite2.sh and the reloader function, all that's left is updating the documentation.
Before I remove the reloader functions and download-geolite2.sh, should these maybe be kept around for a while for a deprecation period? Users might still have a need for the download-geolite2.sh script if disabling auto-update.
It would be a little more work, but I could also refactor the reloader functions so that it can periodically poll the url for new database updates if enabled, that way the management service doesn't have to be restarted to get new database updates
Before I remove the reloader functions and
download-geolite2.sh, should these maybe be kept around for a while for a deprecation period? Users might still have a need for thedownload-geolite2.shscript if disabling auto-update.It would be a little more work, but I could also refactor the reloader functions so that it can periodically poll the url for new database updates if enabled, that way the management service doesn't have to be restarted to get new database updates
The database names have changed from the previous ones. We need to either update the documentation for manual updates or adjust the script to use the new names. Since we have an auto-update option that triggers on management restart, updating the script might be redundant. We should remove them.
@benniekiss can you fix test TestGeolite_Name_Lookup failing on windows. Logs
@benniekiss can you fix test
TestGeolite_Name_Lookupfailing on windows. Logs
could you share the logs again? I can't access them.
I think this is because I'm using the path module to parse filenames when I should be using path.filepath so that it targets os-specific file separators
@benniekiss can you fix test
TestGeolite_Name_Lookupfailing on windows. Logscould you share the logs again? I can't access them.
I think this is because I'm using the
pathmodule to parse filenames when I should be usingpath.filepathso that it targets os-specific file separators
Here is the updated link: logs
I think this should fix it I think. I don't have a windows environment to test locally, though.
Do you want me to rebase onto main, or anything else before merging?
I can create a PR updating the docs here, but I was not sure if there were any other places where documentation should be updated.
Do you want me to rebase onto main, or anything else before merging?
I can create a PR updating the docs here, but I was not sure if there were any other places where documentation should be updated.
We're good to merge this PR without a rebase. For the documentation, it would be great to update the page since we’ve moved away from manual downloads to using the auto-download flag.
Quality Gate passed
Issues
0 New issues
1 Accepted issue
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code