
[feature] Set Up WHOIS Data Retrieval and Storage for Devices

Open DragnEmperor opened this issue 7 months ago • 8 comments

  • To fetch the required WHOIS details, we first need to set up a MaxMind account. This gives us access to the free databases and the keys required for the web service.

    There are two ways to fetch the data using geoip2:

    • Through web service (requires internet for each fetch)
    • Through manually downloading the database (requires scripts for updating)

    We need to finalize which approach we are opting for.
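To make the two options concrete, here is a minimal sketch of how both could sit behind the same call. The function name `make_city_lookup` and its parameters are hypothetical; `geoip2.webservice.Client` and `geoip2.database.Reader` are the geoip2 entry points for the web service and for a downloaded database respectively. The imports are done lazily so that only the chosen mode needs to be usable.

```python
def make_city_lookup(mode, *, account_id=None, license_key=None, db_path=None):
    """Return a callable that maps an IP string to a geoip2 City response.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if mode == "webservice":
        # Needs internet access for every lookup.
        import geoip2.webservice

        client = geoip2.webservice.Client(
            account_id, license_key, host="geolite.info"
        )
        return client.city
    if mode == "database":
        # Needs a downloaded .mmdb file that is refreshed by a separate script.
        import geoip2.database

        reader = geoip2.database.Reader(db_path)
        return reader.city
    raise ValueError(f"unknown lookup mode: {mode!r}")
```

Both the client and the reader hold resources, so real code would keep a reference to them and close them when done.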

  • After finalizing the approach, we can move forward with creating the WHOIS model with the required fields. These are the fields which can be fetched:

class AbstractWHOISInfo(TimeStampedEditableModel):
    """
    Abstract model to store WHOIS information
    for a device.
    """

    device = models.OneToOneField(
        get_model_name('config', 'Device'),
        on_delete=models.CASCADE,
        related_name='whois_info',
        help_text=_('Device to which this WHOIS info belongs'),
    )
    last_public_ip = models.GenericIPAddressField(
        db_index=True,
        help_text=_(
            'indicates the IP address logged from '
            'the last request coming from the device'
        ),
    )
    organization_name = models.CharField(
        max_length=200,
        blank=True,
        help_text=_('Organization name'),
    )
    country = models.CharField(
        max_length=4,
        blank=True,
        help_text=_('Country Code'),
    )
    asn = models.CharField(
        max_length=100,
        blank=True,
        help_text=_('Autonomous System Number'),
    )
    timezone = models.CharField(
        max_length=100,
        blank=True,
        help_text=_('Time zone'),
    )
    address = models.CharField(
        max_length=255,
        blank=True,
        help_text=_('Address'),
    )
    cidr = models.CharField(
        max_length=20,
        blank=True,
        help_text=_('CIDR'),
    )

    class Meta:
        abstract = True
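One thing that may be worth checking in the model above is whether `max_length=20` on `cidr` is enough when the device's public IP is IPv6. The sketch below uses only the standard `ipaddress` module; the helper name `fits_cidr_field` is hypothetical.

```python
import ipaddress

CIDR_MAX_LENGTH = 20  # max_length of the cidr field in the model above


def fits_cidr_field(network: str) -> bool:
    """Return True if the normalized network string fits in the cidr field."""
    normalized = str(ipaddress.ip_network(network, strict=False))
    return len(normalized) <= CIDR_MAX_LENGTH


# The longest IPv4 network string is 18 characters, so IPv4 always fits.
print(fits_cidr_field("255.255.255.255/32"))
# An IPv6 network can easily be longer than 20 characters.
print(fits_cidr_field("2001:db8:abcd:1234::/64"))
```

If IPv6 addresses are expected, a larger `max_length` would avoid validation errors on save.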
  • Moving ahead with the Celery task implementation, I am planning on using the core geoip2 library to get the required details like organization name, ASN and CIDR. Though Django provides a wrapper for geoip2, it exposes only a limited set of fields. Sample implementation of the Celery task:
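import logging

import geoip2.webservice
from celery import shared_task
from django.conf import settings
from geoip2 import errors
from swapper import load_model

logger = logging.getLogger(__name__)


@shared_task
def fetch_whois_details(device_pk, ip):
    """
    Fetches the WHOIS details of the given IP address
    and creates/updates the device's WHOIS information.
    Also creates/updates the device's fuzzy location.
    """
    WHOISInfo = load_model('config', 'WHOISInfo')
    Location = load_model('geo', 'Location')
    DeviceLocation = load_model('geo', 'DeviceLocation')
    Device = load_model('config', 'Device')
    try:
        # 'geolite.info' is available for free
        ip_client = geoip2.webservice.Client(
            settings.GEOIP_ACCOUNT_ID, settings.GEOIP_LICENSE_KEY, 'geolite.info'
        )
        device = Device.objects.get(pk=device_pk)
        data = ip_client.city(ip)
        # Format address using the data from the geoip2 response,
        # skipping any part the response does not provide
        parts = [
            data.city.name,
            data.country.name,
            data.continent.name,
            data.postal.code,
        ]
        address = ', '.join(str(part) for part in parts if part)
        # Create/update the WHOIS information for the device.
        # Missing values are stored as '' because the fields are not nullable.
        WHOISInfo.objects.update_or_create(
            device_id=device_pk,
            defaults={
                'organization_name': data.traits.autonomous_system_organization or '',
                'asn': data.traits.autonomous_system_number or '',
                # ISO code, not the full name: the field holds a country code
                'country': data.country.iso_code or '',
                'timezone': data.location.time_zone or '',
                'address': address,
                # network is an ipaddress network object, store its string form
                'cidr': str(data.traits.network or ''),
                'last_public_ip': ip,
            },
        )
        # The fuzzy location update mentioned in the docstring (which would use
        # device, Location and DeviceLocation) is not shown in this sample.
    except errors.GeoIP2Error as e:
        # Assumption: the sample ends before its except clause; logging the
        # failed lookup is one possible way to handle it.
        logger.warning('WHOIS lookup failed for %s: %s', ip, e)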

The Celery task will run when the last_ip field changes. We can use _changed_checked_fields to track the changes like this:

def trigger_whois_lookup(self):
    """Trigger WHOIS lookup if the last IP has changed and is a public IP."""
    from ipaddress import ip_address

    from .. import tasks

    if self._initial_last_ip == models.DEFERRED:
        return
    # Trigger the WHOIS lookup if no WHOIS info exists yet
    # or if the last IP has changed, and only for a public IP.
    # 'whois_info' is the related_name set on the WHOIS model's device field.
    if (
        self.last_ip
        and (not hasattr(self, 'whois_info') or self.last_ip != self._initial_last_ip)
        and ip_address(self.last_ip).is_global
    ):
        tasks.fetch_whois_details.delay(self.pk, self.last_ip)

    self._initial_last_ip = self.last_ip
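
The "is a public IP" condition can be tried in isolation with the standard `ipaddress` module. This is only a sketch; the helper name `is_public_ip` is hypothetical and not part of the proposed code.

```python
from ipaddress import ip_address


def is_public_ip(value) -> bool:
    """Return True only for a syntactically valid, globally routable IP."""
    if not value:
        # last_ip may be empty before the device has contacted the controller
        return False
    try:
        return ip_address(value).is_global
    except ValueError:
        # Not a valid IPv4/IPv6 address
        return False


for candidate in ("8.8.8.8", "192.168.1.1", "10.0.0.1", None, "not-an-ip"):
    print(candidate, is_public_ip(candidate))
```

Handling the empty and invalid cases up front keeps `ip_address()` from raising inside the save path.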
  • We also need to set up an independent Celery task that updates the database periodically, so that each lookup uses the latest data. The frequency of this task depends on the provider we choose.
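
As a rough sketch of how such a periodic task could be scheduled, Celery beat accepts a `timedelta` as the `schedule` of an entry. The task path, the entry name and the intervals below are assumptions for illustration; the refresh task itself is not defined in this issue.

```python
from datetime import timedelta

# Hypothetical task path: the refresh task does not exist yet.
REFRESH_TASK = "openwisp_controller.config.tasks.refresh_geoip_database"

# Placeholder intervals per provider, to be tuned once a provider is chosen.
REFRESH_INTERVALS = {
    "maxmind": timedelta(days=3),
    "dbip": timedelta(days=30),
}


def build_beat_schedule(provider: str) -> dict:
    """Build a Celery beat schedule entry for the chosen provider."""
    return {
        "refresh-geoip-database": {
            "task": REFRESH_TASK,
            "schedule": REFRESH_INTERVALS[provider],
        }
    }


# In a Django settings module this would typically be assigned directly.
CELERY_BEAT_SCHEDULE = build_beat_schedule("maxmind")
```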

DragnEmperor avatar May 11 '25 13:05 DragnEmperor

There are two ways to fetch the data using geoip2:

  • Through web service (requires internet for each fetch)
  • Through manually downloading the database (requires scripts for updating)

Let me know your thoughts!

DragnEmperor avatar May 12 '25 19:05 DragnEmperor

  • Through manually downloading the database (requires scripts for updating)

I believe Django also advises downloading the data. How frequently would we need to update the database?

devkapilbansal avatar May 14 '25 14:05 devkapilbansal

  • Through manually downloading the database (requires scripts for updating)

I believe Django also advises downloading the data. How frequently would we need to update the database?

MaxMind updates its databases twice a week, every Tuesday and Friday. DB-IP databases are updated monthly.

NOTE:

  • DB-IP databases do not require account creation, while MaxMind databases do.
  • Updates are a bit simpler with MaxMind since they are done via ACCESS_KEY.
  • Both offer their own libraries for automatic updates via ACCESS_KEY, which requires account creation, but DB-IP account creation is paid.
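
Given the Tuesday/Friday release days mentioned above, a script could decide whether a local copy is outdated without contacting the provider. This is a sketch using only the standard library; the function names are hypothetical and the release days are taken from this comment rather than verified against the provider.

```python
from datetime import date, timedelta

# date.weekday(): Monday is 0, so Tuesday is 1 and Friday is 4.
RELEASE_WEEKDAYS = {1, 4}


def latest_release_day(today: date) -> date:
    """Return the most recent Tuesday or Friday on or before `today`."""
    day = today
    while day.weekday() not in RELEASE_WEEKDAYS:
        day -= timedelta(days=1)
    return day


def refresh_due(downloaded_on: date, today: date) -> bool:
    """Return True if a release happened after the local copy was downloaded."""
    return downloaded_on < latest_release_day(today)


# A copy downloaded on Monday 2025-05-12 is outdated by Thursday 2025-05-15.
print(refresh_due(date(2025, 5, 12), date(2025, 5, 15)))
```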

DragnEmperor avatar May 15 '25 03:05 DragnEmperor

There are two ways to fetch the data using geoip2:

  • Through web service (requires internet for each fetch)
  • Through manually downloading the database (requires scripts for updating)

Let me know your thoughts!

I’m also in favor of proceeding by downloading the database, with the added flexibility to configure and switch the database source when needed from app settings.
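
A minimal sketch of what switching the source from app settings could look like. The setting name `WHOIS_GEOIP_SOURCE` and the stand-in lookups are hypothetical; real lookups would wrap the provider's database reader.

```python
def resolve_source(app_settings: dict, available: dict):
    """Pick the lookup callable named by the WHOIS_GEOIP_SOURCE setting."""
    name = app_settings.get("WHOIS_GEOIP_SOURCE", "maxmind")
    try:
        return available[name]
    except KeyError:
        raise ValueError(f"unsupported GeoIP source: {name!r}") from None


# Stand-in lookups that only record which source answered.
available_sources = {
    "maxmind": lambda ip: {"source": "maxmind", "ip": ip},
    "dbip": lambda ip: {"source": "dbip", "ip": ip},
}

lookup = resolve_source({"WHOIS_GEOIP_SOURCE": "dbip"}, available_sources)
print(lookup("8.8.8.8"))
```

Keeping the provider behind a single callable means the Celery task would not need to change when the source does.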

codesankalp avatar May 20 '25 14:05 codesankalp

More information regarding the services provided by MaxMind:

Webservices

  • GeoLite2 web services are free but are limited to 1000 requests per day.
  • GeoIP2 web services are paid, starting at $20 (10,000 queries, which works out to $0.002 per query).
  • Neither of these services requires any additional licensing.

More info can be found here:
https://www.maxmind.com/en/geoip-api-web-services#buy-now
https://geoip2.readthedocs.io/en/latest/#sync-web-service-example

MaxMind's web service repository: https://github.com/maxmind/GeoIP2-python

Databases

  • GeoLite2 databases are free to download but require commercial licensing. The total size can vary between 50 and 70 MB.
  • GeoIP2 databases are paid, with a starting price of $134 (we need the City database, as the fields we require are present only there).

NOTE: GeoLite2 databases are limited to 30 downloads per day.

More references:
https://dev.maxmind.com/geoip/docs/databases/city-and-country/#database-sizes
https://www.maxmind.com/en/site-license-overview
https://www.maxmind.com/en/geoip-databases

Account Creation

The following link can be used for account creation: https://www.maxmind.com/en/geolite2/signup?utm_source=kb&utm_medium=kb-link&utm_campaign=kb-create-account

After successful creation,

  • the user can go to Manage License Keys or use https://www.maxmind.com/en/accounts/<<account_id>>/license-key to create a license key, which can be used for web services, or
  • go to Download Databases to download the GeoLite2 City and ASN databases.

More information regarding the services provided by DB-IP:

Databases

Free databases are updated monthly, with a total size of 100 to 130 MB.

More on database pricing: https://db-ip.com/db/ip-to-location-isp

Webservice

Requires creating a paid account, with prices starting at 11.49 euros (~$13) for the Core API, which has the fields we require.

More info can be found here: https://db-ip.com/api/

DragnEmperor avatar May 20 '25 16:05 DragnEmperor

Hi @DragnEmperor, as discussed in the call today, although databases are the preferred solution, for now we will proceed with the APIs as they will take less time to implement.

You can create a separate task for database integration, which we can look into once our deliverables are done.

devkapilbansal avatar May 20 '25 18:05 devkapilbansal

Hi @DragnEmperor, as discussed in the call today, although databases are the preferred solution, for now we will proceed with the APIs as they will take less time to implement.

You can create a separate task for database integration, which we can look into once our deliverables are done.

Thanks @DragnEmperor for the info. I agree with Kapil: please create a separate issue for this and copy the valuable information you have shared into that issue's description.

nemesifier avatar May 21 '25 14:05 nemesifier

Hi @DragnEmperor, as discussed in the call today, although databases are the preferred solution, for now we will proceed with the APIs as they will take less time to implement.

You can create a separate task for database integration, which we can look into once our deliverables are done.

Thanks @DragnEmperor for the info. I agree with Kapil: please create a separate issue for this and copy the valuable information you have shared into that issue's description.

I have created an issue for this: https://github.com/openwisp/openwisp-controller/issues/1052
As discussed, I am keeping it separate from the main GSoC project. We can pick it up later as per requirement and demand.

DragnEmperor avatar May 21 '25 15:05 DragnEmperor