[Feature Request] Integrating IPinfo's free IP to Country ASN database
Update: https://github.com/opnsense/core/issues/7779#issuecomment-2945306686
We are offering the network/CIDR based IP dataset.
Important notices
Before you add a new report, we ask you kindly to acknowledge the following:
- [x] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
- [x] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/core/issues?q=is%3Aissue
Is your feature request related to a problem? Please describe.
Using the IPinfo IP to Country ASN or IP to Country database will address several problems with the current IP geolocation implementation:
- It gives users an alternative to (if not a replacement for) the current free data source. https://github.com/opnsense/core/issues/6264
- IPinfo's country-level database provides full accuracy and is updated daily. The current free database is intentionally less accurate because its provider also sells a paid version of the country database.
- It packages both IPv4 and IPv6 in a single database, eliminating the need to use multiple databases. https://github.com/opnsense/core/blob/5dd269f7e1bef0b637bcb9c45a0338288da7268f/src/opnsense/scripts/filter/download_geoip.py#L38-L39
- The ASN/ISP data provided is useful and can be sourced from this single database.
- The project can also have its own project-specific download access token, eliminating the need for users to bring their own access token and reducing setup time.
Describe the solution you like
I am requesting to add support for IPinfo's IP to Country database to the project. The database has the following features:
- It includes country and ASN information in the same database.
- It is updated daily, with zero compromise to accuracy. There is no range clustering, and the database provides full accuracy.
- The data granularity reaches individual IP level.
- The database comes in MMDB database format.
- It is licensed under CC-BY-SA 4.0, permitting commercial usage.
- Available file formats include: CSV, MMDB, JSON
- The data is tabular and unnested, making it very easy to use. The dataset includes both IPv4 and IPv6 in a single file.
Database schema
| Field Name | Example | Data Type | Description |
|---|---|---|---|
| start_ip | 1.0.16.0 | TEXT | Starting IP address of an IP address range |
| end_ip | 1.0.31.255 | TEXT | Ending IP address of an IP address range |
| country | JP | TEXT | ISO 3166 country code of the location |
| country_name | Japan | TEXT | Name of the country |
| continent | AS | TEXT | Continent code of the country |
| continent_name | Asia | TEXT | Name of the continent |
| asn | AS2519 | TEXT | Autonomous System Number |
| as_name | ARTERIA Networks Corporation | TEXT | Name of the AS (Autonomous System) organization |
| as_domain | arteria-net.com | TEXT | Official domain or website of the AS organization |
Documentation: https://ipinfo.io/developers/ip-to-country-asn-database
Samples are available here: https://github.com/ipinfo/sample-database/tree/main/IP%20to%20Country%20ASN
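To illustrate how the flat `start_ip`/`end_ip` schema can be consumed, here is a minimal Python sketch of a range lookup. It is an illustration only, not IPinfo's or OPNsense's implementation. The function names (`build_index`, `lookup_country`) are hypothetical, the rows are sample values quoted in this thread, and the sketch assumes the ranges do not overlap.

```python
import bisect
import ipaddress

# Sample rows (start_ip, end_ip, country) taken from examples in this thread.
SAMPLE_ROWS = [
    ("1.0.16.0", "1.0.31.255", "JP"),
    ("212.221.79.153", "212.221.79.171", "DE"),
    ("2620:0:1cff:dead:bef1:100:1:1aa", "2620:0:1cff:dead:bef1:100:1:1b0", "SG"),
]


def _key(ip_text):
    """Return a sortable key; IPv4 and IPv6 cannot be compared directly."""
    ip = ipaddress.ip_address(ip_text)
    return (ip.version, int(ip))


def build_index(rows):
    """Sort rows by start address so they can be searched with bisect."""
    index = sorted((_key(start), _key(end), country) for start, end, country in rows)
    starts = [entry[0] for entry in index]
    return starts, index


def lookup_country(ip_text, starts, index):
    """Return the country code of the range containing ip_text, or None."""
    key = _key(ip_text)
    pos = bisect.bisect_right(starts, key) - 1  # last range starting at or before key
    if pos < 0:
        return None
    start, end, country = index[pos]
    return country if start <= key <= end else None


starts, index = build_index(SAMPLE_ROWS)
print(lookup_country("1.0.20.1", starts, index))
```

Because IPv4 and IPv6 rows share one file, the sketch keys each address by `(version, integer value)` so both families can live in the same sorted index.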
The database can be downloaded simply by accessing the storage URI with an access token.
curl -L "https://ipinfo.io/data/free/country_asn.mmdb?token=<YOUR_TOKEN>" -o country_asn.mmdb
Describe alternatives you considered
I have not considered an alternative.
Additional context
The business version of OPNsense includes a paid version of the GeoIP country database. However, even though IPinfo's IP to Country database is free, it is the best country-level data available because it is derived from latency and network measurement data rather than from the self-reported locations of ASNs/ISPs.
https://ipinfo.io/accuracy
Additionally, there is no range clustering or delayed updates with IPinfo. IPinfo does not have an accuracy-compromised free country or city database. This database can be considered for the business variant of the software as well, and the license permits commercial usage.
A Reddit post by redspidr demonstrated the idea of introducing IPinfo's data into Opnsense. Their project converts IP addresses into links to the corresponding IPinfo pages, which provide detailed metadata on those IP addresses.
Inspired by it, I have requested the Reddit Opnsense community to review this ticket and recommend bringing our data to Opnsense with the integration of our free IP database first. The current issue does not explore the project demonstrated by the Reddit community user redspidr but a native under-the-hood integration using our database.
A couple of issues were raised by one of the Reddit community users regarding this request, so I am pasting my answers here.
Inclusion of your data rather than the existing provider in the business version?
The free IP database that we have is the best possible variant and is equal to, if not better than, the paid version of the country database that the business version of OPNsense currently uses. Consequently, it is certainly better than the free IP database that the community version of the project uses.
This raises a question: which action would benefit the Opnsense community most?
- Option 1 (My preference): Replacing both the free database in the community version of the project and the paid database in the business version with the single IPinfo free IP to Country ASN or IP to Country database. This will not only provide a unified experience to community and business users with better data, but will also reduce business costs associated with licensing the paid database. Our database is licensed under CC-BY-SA 4.0, which permits commercial use and distribution. We do not have complex EULA agreements, so you can easily use the database.
- Option 2: Replacing the existing database with our free database in the community version only. The community version will have better data than the business version if the project maintainers accept this proposal. The database is designed to support open source projects primarily.
- Option 3: Adding support for our database alongside the existing database. This lets users bring their own database from either us or the existing provider, giving them the option to choose.
Documentation on how to add your data to the free version in parallel to the existing docs?
Considering the previous topic, I am not sure what option would be considered by the project maintainers. If they want to replace their existing database provider they can do that or they can integrate our database in parallel to the existing database.
In terms of documentation, there are slight modifications involved.
- Both are MMDB databases, so parsing the database should not be an issue.
- The existing database uses country-level data, which we also provide here. However, there are differences in the database schema. We use a flat, tabular data structure while the existing database uses a nested structure.
- Update frequency. Our database is updated every day, so we would appreciate it if the local copy were refreshed frequently. The existing database is not updated daily.
- Packaging decision. Because our database has more lenient licensing terms, community version users do not have to download their own database or bring their own access token. The project can use their own project-specific access token.
Here is a blog post: https://ipinfo.io/blog/migrating-from-maxmind-to-ipinfo
Please let me know if you have further questions. Thank you very much.
For us as a core team this isn't a priority, given our business edition does contain simple to use geo aliases out of the box including a documented file format to use for the community version (https://docs.opnsense.org/manual/aliases.html#geoip).
I do understand that your company will prefer your product above one of its competitors, but there's also some marketing involved in claims being made.
Personally I don't have a strong preference for a geoip vendor, but when it's a commercial discussion, our community GitHub might not be the best place.
If someone does want to do the work, and the amount of required guidance is limited, we will assess in the usual way.
@AdSchellevis
Thank you for reviewing the request. I sincerely appreciate you taking the time to review the issue.
This was not a commercial request, nor am I trying to sell the Opnsense community a commercial service. I advocated bringing in highly accurate data designed with open-source projects in mind. The free database is licensed under CC-BY-SA 4.0.
> I do understand that your company will prefer your product over one of its competitors, but there's also some marketing involved in the claims being made.

I understand the skepticism involved. However, in terms of accuracy, we can provide verifiable information to back up our claim, even for a free database. If you are interested in verifying our claims for accuracy, please let me know. I can walk you through a self-evaluation process that lets you and the community verify this information personally.
> Personally, I don't have a strong preference for a geoip vendor, but when it's a commercial discussion, our community GitHub might not be the best place.

No, this is not a commercial discussion at all. The proposal was the integration of a free database. There was absolutely no hint of a commercial service. My apologies if I have indicated otherwise. I have tried my best to understand the issues and motivation for selecting the geoip database, and I have seen a ticket where you mentioned that the software offers the paid IP database through the business version.
However, my proposal was to replace even the paid version of the IP database with a free IP database that we can demonstrate can provide better accuracy.
> If someone does want to do the work, and the amount of required guidance is limited, we will assess it in the usual way.
Thank you for considering the issue.
My apologies if I was unclear in saying this is not a commercial service. We built this free IP to Country database primarily to support open source projects. I understand Opnsense is a massive project and the changes required to adopt it may be significant. I can assure you that we can demonstrate the value of adopting the free database to the community and the project's customers.
Would love to see alternatives to Maxmind as well, unfortunately, definitely do not have the time to do the coding at the moment.
Now, there would be a super-easy and fast way to get this available in OPNsense @abdullahdevrel - "simply" provide the data in the CSV format documented and required for OPNsense. 😉
Thanks @doktornotor. I really appreciate that you reviewed the request. This is a significant request, and I understand that it will require engineering commitment to support it. We will let our OPNsense users know that this request is being considered.
> Now, there would be a super-easy and fast way to get this available in OPNsense @abdullahdevrel - "simply" provide the data in the CSV format documented and required for OPNsense. 😉

I hope this shows what we meant when we said our database is simple to use. The current implementation requires 3 CSV files.
While we have all this information in a single file in our IP to Country database:
| start_ip | end_ip | country | country_name | continent | continent_name |
|---|---|---|---|---|---|
| 2620:0:1cff:dead:bef1:100:1:1aa | 2620:0:1cff:dead:bef1:100:1:1b0 | SG | Singapore | AS | Asia |
| 212.221.79.153 | 212.221.79.171 | DE | Germany | EU | Europe |
https://github.com/ipinfo/sample-database/tree/main/IP%20to%20Country
If anyone wants to use our data for now, they will have to make modifications to the database on their end.
First of all, I am NOT an OPNsense developer, merely a random code contributor.
> If anyone wants to use our data for now, they will have to make modifications to the database on their end.

Well yes, that is the problem. I have been merely hinting at the fastest way to get your GeoIP data used in OPNsense - without any coding being required on the OPNsense part (paste in a URL pointing to a ZIP file with the required CSV files, done.)
Using a single CSV file might even be easier and faster to process - if someone does the coding, however that's not a drop-in replacement. Not having the IP ranges in CIDR format being one of the examples why the current code won't work and non-trivial amount of coding is required to support this single-file format.
Got it. Thank you.
I am not sure why the project does not use the MMDB file format, which is designed for fast and efficient lookups.
> Not having the IP ranges in CIDR format being one of the examples why the current code won't work and non-trivial amount of coding is required to support this single-file format.

We have a tool for that called range2cidr (which is also part of our CLI) that can generate the CIDR/range column.
| cidr | country | country_name | continent | continent_name |
|---|---|---|---|---|
| 1.0.0.0/25 | AU | Australia | OC | Oceania |
| 1.0.0.128/26 | AU | Australia | OC | Oceania |
The issue is that generating the CIDRs is a bit slow:
time ipinfo range2cidr country.csv > country_cidr.csv
real 9m58.852s
user 0m22.040s
sys 0m44.654s
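For readers who want to see what a range-to-CIDR conversion involves, here is a minimal Python sketch using the standard library's `ipaddress.summarize_address_range`. It is not how the `range2cidr` tool is implemented, and it makes no claim about relative speed. The function name `range_to_cidrs` is hypothetical.

```python
import ipaddress


def range_to_cidrs(start_ip, end_ip):
    """Return the minimal list of CIDR blocks covering start_ip..end_ip inclusive."""
    first = ipaddress.ip_address(start_ip)
    last = ipaddress.ip_address(end_ip)
    # summarize_address_range yields network objects; both ends must share a version.
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


# A range aligned on a power-of-two boundary collapses into a single block.
print(range_to_cidrs("1.0.16.0", "1.0.31.255"))

# An unaligned range expands into several blocks.
print(range_to_cidrs("212.221.79.153", "212.221.79.171"))
```

The second call shows why a CIDR variant of a range-based file can have more rows than the original: one unaligned range may need several CIDR blocks to cover it exactly.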
Yeah, a bit. 😉 Same issue with Python, PHP or whatever other code used for the purpose on similar projects. Now, assuming similar HW specs, multiply the wasted CPU time by the user base and one update per day.
As for why MMDB format is not used - the thing is - you are not doing realtime lookups. You simply parse the data once every while and use that parsed data in firewall rules to reject/accept connections. There's no integration for lookups in databases present in the pf firewall code, plus the performance would not exactly rock either I guess.
I will try to think of a solution. The challenge is that we probably won't be able to produce the CIDR variant of the free IP database for download because we would then have to account for maintenance of another variant of the same product.
When MM switched from their legacy geoip to a more modern variant, it broke a lot of things. We aimed to remain stable from day 0 onward to avoid such situations. Introducing a CIDR variant of the database could increase the load on us as we would have to maintain it virtually indefinitely.
I second that this would be awesome!
Update: IPinfo Lite
We now have a CIDR/Network based IP to Country ASN database: IPinfo Lite
🔗 https://ipinfo.io/developers/ipinfo-lite-database
| Field Name | Example | Data Type | Description |
|---|---|---|---|
| network | 154.24.39.204/30 | TEXT | CIDR/IP range or single IP address |
| country | Canada | TEXT | Country name |
| country_code | CA | TEXT | Two-letter ISO 3166 country code of the IP addresses |
| continent | North America | TEXT | Continent name of the IP location |
| continent_code | NA | TEXT | Two-letter continent code of the IP location |
| asn | AS174 | TEXT | Autonomous System Number, an organization that owns the IP range block |
| as_name | Cogent Communications | TEXT | Name of the AS (Autonomous System Number) organization |
| as_domain | cogentco.com | TEXT | Official domain or website of the ASN organization |
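As a rough illustration of the "parse once, then use the parsed data in firewall rules" approach discussed earlier in this thread, the sketch below groups the networks of a CSV with this schema by country code and address family. It is not the OPNsense implementation. The function name `group_networks` is hypothetical, the header is assumed to match the field names in the table above, and the two `ZZ` rows are made-up placeholders using documentation address space.

```python
import csv
import io
import ipaddress
from collections import defaultdict

# First data row is the example from the schema table; the ZZ rows are invented
# placeholders (documentation prefixes) to show IPv6 and single-IP handling.
SAMPLE_CSV = """network,country,country_code,continent,continent_code,asn,as_name,as_domain
154.24.39.204/30,Canada,CA,North America,NA,AS174,Cogent Communications,cogentco.com
2001:db8::/32,Placeholder,ZZ,Placeholder,ZZ,AS0,Placeholder,example.com
192.0.2.1,Placeholder,ZZ,Placeholder,ZZ,AS0,Placeholder,example.com
"""


def group_networks(csv_text):
    """Map country_code -> {4: [IPv4 networks], 6: [IPv6 networks]}."""
    grouped = defaultdict(lambda: {4: [], 6: []})
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A bare IP address is parsed as a single-host network (/32 or /128).
        net = ipaddress.ip_network(row["network"], strict=False)
        grouped[row["country_code"]][net.version].append(str(net))
    return grouped


grouped = group_networks(SAMPLE_CSV)
print(grouped["CA"][4])
```

Splitting on `net.version` is one way to handle a file that mixes IPv4 and IPv6 rows, since the two families usually end up in separate tables.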
Our database, despite being free and having a great OSS-friendly license, is the best IP to Country ASN database available.
We are not making any compromises with our data; it is fully accurate. It would be truly an honor for us if the project considers our data. It is the same data that comes from running 1,000 servers across 130 countries.
Firewall software should have native local databases. However, if downloading and maintenance are issues, we have also developed a standalone API based on this data. It supports unlimited requests and is free as well.
https://ipinfo.io/developers/responses#lite-api
There is no EULA, there are no restrictions on distribution, there is no limitation on commercial intent, there is no compromise, and there is no downside. We are trying our best to be as OSS-friendly as possible, but regardless, it is an enterprise-ready, mission-critical database currently being used by several major F500 companies in production.
Seems cool having more options
Thank you very much, @sopex. It will not be just more options. It is the most accurate data out there right now. Once you integrate our data into the software, regardless of whether the data is free, you will have a direct line to us for asking questions or providing feedback on IP geolocation data. We are super active in all the communities that use our data.
Came across this thread after having issues with Sophos using MaxMind, giving us inaccurate results. I went to block Lithuania due to a large influx of malicious traffic from there, only to find out MaxMind thought the IP was from the USA... IPInfo had no such issue.
@paz That is unfortunate. I just want to mention that our data is the best there is, and if you ever have any doubts about its quality, feel free to reach out to me. We'll gladly investigate the issue and provide evidence to support our reporting.
In case someone wants to try https://github.com/opnsense/core/commit/2d91f22b48d0b72ca8978636e05df0ce7b0b1a1d, install the update on 25.7.1 using:
opnsense-patch 2d91f22
Next, enter the IPinfo download location, using the token of an account registered there ( https://ipinfo.io/data/ipinfo_lite.csv.gz?token=XXXXXXX), into the URL field and hit apply.
In case you need to manually flush the existing geoip data:
rm /usr/local/share/GeoIP/alias.stats
Seems to work great, thank you!! Also, the https://api.ipinfo.io/lite/8.8.8.8?token=XXXXXXXX that the IPinfo GUI suggests seems to work too
@AdSchellevis Thank you very much for supporting our database. We are excited to see Opnsense as a software and community adopting our data. To the Opnsense community, you are not only receiving the data but also a commitment from us to provide excellent data, infrastructure, and active participation in the community.
- IPinfo Community: https://community.ipinfo.io/
- Opnsense community: https://forum.opnsense.org/index.php?action=profile;area=forumprofile;u=41997
- Reddit: https://www.reddit.com/r/opnsense/
If you have any questions about our data, please feel free to ask on your community platforms, and I will be there to answer you. Cheers!
@abdullahdevrel thank you for offering it. The first thing that stood out while working on this was the size of the database: it's quite large and at first glance seems to be more fine-grained than what we used before.
I'm keeping this ticket open for a while, documentation still needs an update, but the code seems to be more or less finished.
We provide full accuracy and zero compromise data for IPinfo Lite, meaning that the granularity of IP address accuracy extends to individual IP addresses. So, the size is quite large.
Can also confirm, this works great! Thank you @AdSchellevis and @abdullahdevrel!