Questions regarding the batch functionality and API
Hi Internet.nl team!
I work at The Swedish Internet Foundation, the registry for the .se zone. We have about 1.5M domain names that we plan to check periodically with the Internet.nl service.
We’ve set up our own instance of Internet.nl (including the batch API), and we have a few questions regarding a usage pattern that works very well in practice — but we’d like to confirm whether it’s officially supported or recommended.
What we’re doing
We start a large batch job (e.g., with 10,000 domains), which understandably takes some time to process. In parallel, we submit additional batch jobs — sometimes with only a single domain — and have observed that these smaller jobs are processed immediately and very quickly, without being queued behind the larger one.
This opens up the possibility of building our own interactive frontend where:
- a user submits a single domain,
- we create a new (small) batch job in the background,
- the frontend polls job status and displays the result.
In other words, we’re submitting single domains via the batch API, but only to our own private instance — not to the public service.
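A minimal sketch of that submit-and-poll flow, to make the pattern concrete. The endpoint path, the request body fields, the response keys and the status values are assumptions here and should be checked against the API documentation of your own instance. The HTTP calls are passed in as `post` and `get` callables (hypothetical helpers), so the sketch does not depend on a particular HTTP client or authentication setup.

```python
import time
from typing import Callable

# Assumed endpoint path; verify against your instance's batch API documentation.
REQUESTS_PATH = "/api/batch/v2/requests"


def check_single_domain(
    domain: str,
    post: Callable[[str, dict], dict],
    get: Callable[[str], dict],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Submit a one-domain batch job and poll until its results are available."""
    # Body fields ("type", "name", "domains") are assumptions about the API.
    created = post(
        REQUESTS_PATH,
        {"type": "web", "name": f"single-{domain}", "domains": [domain]},
    )
    request_id = created["request"]["request_id"]  # assumed response shape

    waited = 0.0
    while waited <= timeout:
        status = get(f"{REQUESTS_PATH}/{request_id}")["request"]["status"]
        if status == "done":  # assumed terminal status value
            return get(f"{REQUESTS_PATH}/{request_id}/results")
        if status in ("error", "cancelled"):  # assumed failure status values
            raise RuntimeError(f"batch request {request_id} ended as {status}")
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError(f"batch request {request_id} not done after {timeout}s")
```

A frontend would call `check_single_domain("example.se", post, get)` from a background task and render the returned result once it arrives.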
Our questions
1. Is this a supported and acceptable usage pattern, using the batch API with one domain per job, as long as it’s on our self-hosted instance?
2. Are there any technical risks or concerns when submitting many small jobs in parallel?
3. Is the setup guide in Docker-deployment-batch.md considered stable and suitable for production use?
4. Are there any recommendations or best practices you would suggest when running it in production?
Thank you for providing such an excellent and important tool for monitoring internet standards and resilience! We look forward to your feedback.
Hi 🇸🇪!
At Internet.nl we use the services in a different pattern: we split them into two production instances, one without the API and one with the API, plus a separate dashboard instance that uses the API:
- see #1552
Note that large batch jobs currently require quite a lot of memory, because generating the batch result (gather_batch_results) neither uses an .iterator() for the DB query nor writes each result out as it goes; instead everything is done in memory (including building the complete JSON object). That is why we limit ourselves to <5_000 domains at the moment, but preferably this gets fixed so that larger jobs can run without memory issues. If you have no problem with a higher memory load on the slow-worker (whose limit should be increased to avoid container OOM kills), you can also run larger batch jobs (e.g. 10_000):
https://github.com/internetstandards/Internet.nl/blob/1fa9fb7feff873edd00035932b4a2643c975f2ac/docker/defaults.env#L202-L205
So the WORKER_SLOW_MEMORY_LIMIT should be set higher in your own local.env.
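To illustrate the kind of fix meant above, here is a rough sketch of writing results incrementally instead of building the whole JSON object in memory. This is not the actual gather_batch_results code; the function name `write_results_streaming` and the `{"domains": {...}}` output shape are assumptions for illustration. In a Django context, `rows` could be fed from a queryset's `.iterator()` so the rows are not all loaded at once.

```python
import json
from typing import IO, Iterable, Tuple


def write_results_streaming(rows: Iterable[Tuple[str, dict]], out: IO[str]) -> int:
    """Write {"domains": {name: result, ...}} to `out` one entry at a time.

    Only a single (name, result) pair is held in memory at any moment,
    so peak memory no longer grows with the number of domains.
    """
    out.write('{"domains": {')
    count = 0
    for name, result in rows:
        if count:
            out.write(", ")
        # Serialize each entry on its own rather than the complete object.
        out.write(json.dumps(name))
        out.write(": ")
        out.write(json.dumps(result))
        count += 1
    out.write("}}")
    return count
```

The trade-off is that the output is assembled by hand, so the writer has to take care of separators and closing braces itself.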
Regarding your questions:
1. We don't use it this way, but @mxsasha will weigh in about the scheduling.
2. We don't know, since it is untested/unused (although there are quite a few small tests on our current instance).
3. Yes, we use the Docker instances in production:
   - single test internet.nl since v1.8.0 - 2023-11-13
   - API batch.internet.nl since v1.8.7 - 2024-09-11
4. @aequitas can you weigh in on our local.env for batch?
Thank you for the ZoneMaster.net tool. The Dutch Internet Standards Platform would love to get in contact regarding the use and scanning of the .se zone. BTW we're also launching an international standards community at the IGF in 🇳🇴 later this month:
🚀 Join the Launch of the Global Internet Standards Community! 🌐
We’re excited to invite you to the launch of the Global Internet Standards Testing Community (GISTC) — a new international effort to advance internet security by promoting the adoption and testing of modern internet standards.
📅 Date: Tuesday, 24 June 2025
🕘 Time: 09:00–09:30 CEST
📍 Where: IGF 2025, Lillestrøm, Norway & Online via Zoom
🔗 More info: https://www.intgovforum.org/en/content/igf-2025-launch-award-event-96-empower-the-global-internet-standards-testing-community
📝 Register to the IGF by June 8: https://indico.un.org/event/1016806/
@bwbroersma, I am part of the working group for Zonemaster. We are open for discussions. If you would like to communicate privately, you can find my email address on my profile.
Maybe I should join the session at the IGF meeting. They have extended registration to June 18.
Regarding scheduling, my reading of the code is that the scheduler selects one user at random, regardless of how many requests they have open, then takes the oldest request from that user:
https://github.com/internetstandards/Internet.nl/blob/a27ad320e86d25954ddec0741f0b6ea44072989c/interface/batch/scheduler.py#L560-L587
https://github.com/internetstandards/Internet.nl/blob/a27ad320e86d25954ddec0741f0b6ea44072989c/interface/batch/scheduler.py#L145-L178
Were your requests under different users? Under the same user, I do not see any round robin scheduling in the code. That doesn't guarantee it's not happening in a non-obvious way though.
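To make that reading concrete, here is a small stand-alone sketch of the selection rule as I understand it: pick one user at random among those with open requests, then take that user's oldest request. It is a simplified model and not the scheduler's actual code; `BatchRequest` and `pick_next_request` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BatchRequest:
    user: str
    submitted_at: int  # larger value means submitted later
    name: str


def pick_next_request(pending: List[BatchRequest], rng=random) -> Optional[BatchRequest]:
    """Pick a random user with open requests, then that user's oldest request."""
    if not pending:
        return None
    # Each user counts once, no matter how many requests they have open.
    users = sorted({req.user for req in pending})
    chosen_user = rng.choice(users)
    own_requests = [req for req in pending if req.user == chosen_user]
    return min(own_requests, key=lambda req: req.submitted_at)
```

Under this model, a small job submitted by the same user after a large one would always be selected after the large one, while a small job from a different user has an even chance of being selected first.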
@matsduf Great to hear you're open for discussion. I'm providing project management support for the Dutch Internet Standards Platform / internet.nl. I'll reach out by email to invite you for an online meeting with the internet.nl team. Thanks!