PyPDNS

Have some way of doing bulk requests

Open cudeso opened this issue 9 months ago • 8 comments

Strictly speaking, this is not related to PyPDNS but rather to the CIRCL Passive DNS service. It would be useful to have some way of doing bulk requests, similar to how Team Cymru does this with its whois query.

Use case: I have a set of +/- 32k IPs and I want to know if they have been observed by pdns. Querying them individually via PyPDNS takes a very, very, very long time: pdnsresult = pypdns.rfc_query(ip)

cudeso avatar Mar 11 '25 18:03 cudeso

Did you paginate, via dribble-paginate-count?

adulau avatar Mar 11 '25 21:03 adulau

As far as I could check in the docs dribble-paginate-count isn't part of the rfc_query request. I'll check it again.

Is dribble-paginate-count not for paginating the result set? In my use case I have about 32k IPs to query, and my guess is that only about 10% of them will have an answer/match.

cudeso avatar Mar 11 '25 21:03 cudeso

Paginate will paginate the result set, so you will still need to do 32K queries (you can parallelize them, but it will still take a while). One way to speed it up is to query only A and AAAA records, for example.

The other issue to keep in mind with the pagination is that the response order is non-deterministic, so even if you paginate, you won't know if you got the most recent entries until you get the complete set.
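A minimal sketch of the parallel approach mentioned above, assuming the per-IP query is passed in as a plain callable. The names bulk_lookup, lookup and max_workers are illustrative and not part of PyPDNS.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def bulk_lookup(
    ips: Iterable[str],
    lookup: Callable[[str], List],
    max_workers: int = 8,
) -> Dict[str, List]:
    """Run one lookup per IP on a thread pool and keep only the IPs with records.

    `lookup` is a hypothetical stand-in for the real query, for example a
    wrapper around the rfc_query call of a configured PyPDNS client.
    """
    unique_ips = list(dict.fromkeys(ips))  # de-duplicate, keep input order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as unique_ips;
        # an exception raised by any lookup propagates to the caller here.
        results = pool.map(lookup, unique_ips)
        return {ip: recs for ip, recs in zip(unique_ips, results) if recs}
```

With PyPDNS, lookup could be something like lambda ip: pdns.rfc_query(ip, filter_rrtype='A'), assuming the installed version accepts a filter_rrtype argument (worth checking against its documentation). This still sends one request per IP, so it only shortens wall-clock time; keeping max_workers modest is probably wise for a shared service.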

Rafiot avatar Mar 11 '25 23:03 Rafiot

I'm just doing a 'stupid' pdns = pypdns.rfc_query(ip) query in https://github.com/cudeso/tools/blob/master/minimedusa/parse_minimedusa.py. I might have to filter it for A/AAAA to get better results. It takes about 2h to get the minimedusa results parsed. Not fast, but we only parse it once a day, so it's still acceptable.

cudeso avatar Mar 12 '25 00:03 cudeso

So now, if only CIRCL / @adulau would provide a WHOIS service similar to Team Cymru I could limit outbound firewall rules to *circl.lu only ;-)

cudeso avatar Mar 12 '25 00:03 cudeso

@cudeso to summarize :

  • A bulk interface to query Passive DNS records (my only concern is that it may be very large for some record types).
  • A GeoOpen-included response for the associated IP, providing something similar to the WHOIS output of Team Cymru.

Some ideas:

  • Extending mmdb-server to include the Passive DNS output with a summarised DNS output. Available via https://ip.circl.lu/geolookup/8.8.8.8

adulau avatar Mar 12 '25 05:03 adulau

One thing that would also be very useful for large responses is to have a way to get only the recent entries, like "ignore entries with a last-seen older than 30 days" for example (for Lookyloo, that would be good enough). Or "give me the 20 most recent A entries".
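Until something like this exists server-side, the same filtering can be approximated on the client once the complete result set has been fetched (needed because the response order is non-deterministic). A minimal sketch, assuming each record is a plain dict with the rrtype and time_last (epoch seconds) fields of the Passive DNS Common Output Format; the function name and its parameters are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def recent_records(
    records: Iterable[Dict],
    max_age_days: int = 30,
    rrtype: Optional[str] = None,
    limit: Optional[int] = None,
    now: Optional[datetime] = None,
) -> List[Dict]:
    """Keep records last seen within max_age_days, most recent first."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=max_age_days)).timestamp()
    kept = [
        r for r in records
        # time_last is assumed to be a UNIX timestamp in seconds
        if r["time_last"] >= cutoff and (rrtype is None or r["rrtype"] == rrtype)
    ]
    kept.sort(key=lambda r: r["time_last"], reverse=True)
    return kept if limit is None else kept[:limit]
```

For example, recent_records(records, rrtype="A", limit=20) would correspond to "give me the 20 most recent A entries" seen in the last 30 days. This does not reduce what the server has to send; it only trims what the caller has to process.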

Rafiot avatar Mar 12 '25 09:03 Rafiot

> @cudeso to summarize :
>
>   • A bulk interface to query Passive DNS records (my only concern is that it may be very large for some record types).

As @Rafiot remarked, having only the most recent ones returned is OK. For bulk querying, limit the results to entries last seen in the past 30 or 60 days.

>   • A GeoOpen-included response for the associated IP, providing something similar to the WHOIS output of Team Cymru.
>
> Some ideas:
>
>   • Extending mmdb-server to include the Passive DNS output with a summarised DNS output. Available via https://ip.circl.lu/geolookup/8.8.8.8

Yes. An extra key for pdns would be great. Then there's only one external resource to use if you want to do these types of queries.

cudeso avatar Mar 14 '25 18:03 cudeso