
Throttle all requests, regardless of type

Open topaz opened this issue 2 years ago • 4 comments

I'm seeing a lot of traffic generated by User-Agent advent-of-code-data v0.9.8, which I assume is this project. Could you make it so requests are throttled (maybe one every few seconds) regardless of the type of request? For example, I see a lot of same-second requests for puzzle text, input data, and answer submission.

topaz avatar Nov 28 '21 21:11 topaz

Hey Eric! Yes, that's an older version of this app; the User-Agent was added at your request in 2016. Puzzle text, input data, and answers are all cached client-side, so I wouldn't expect to see repeated requests coming from the same IP address.

Perhaps that's someone using the library in CI, with a real auth token but without preserving the cache?

Note that implementing this now would not change the behavior of the v0.9.8 release; the current version is v1.1.0.

wimglenn avatar Nov 29 '21 05:11 wimglenn

I don't mean they're the exact same request; I mean they're sending lots and lots of different requests all within the same second. I was hoping we could prevent this sort of thing in the future by modifying the code to throttle requests in general.

topaz avatar Nov 29 '21 06:11 topaz

Yes, I understand. I'll try and get this into the next version but I'm not sure it will be in time for Dec 1 :-\

When a client requests input data and posts an answer within the same second, they may be checking that their existing code also works on a different dataset. I'm sympathetic to that use case: sometimes your code works on your own dataset only by luck, and one of the main purposes of this app is to validate that a solution actually works across different datasets. So I think throttling to "one request every few seconds" seems a bit too conservative. Would a delay of 0.1s between requests be acceptable?
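
To make the 0.1s proposal concrete, here is a minimal sketch of a fixed-interval throttle that would sit in front of every outgoing request, regardless of type. It is not the library's actual code: the `Throttle` class, its `wait` method, and the injectable `clock`/`sleep` parameters are hypothetical names chosen for illustration.

```python
import time


class Throttle:
    """Enforce a minimum interval between requests (hypothetical helper)."""

    def __init__(self, min_interval=0.1, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between consecutive requests
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` on one shared instance immediately before each HTTP request (puzzle text, input data, or answer submission) would space requests at least `min_interval` seconds apart.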

wimglenn avatar Nov 30 '21 04:11 wimglenn

I'm not sure what the right approach is; I see this tool being used to basically scrape the site. Maybe an increasing throttle would be better? Start with a small delay and gradually increase it?
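
One way the increasing-throttle idea could look, as a rough sketch only: the delay starts small, grows by a constant factor after each request up to a cap, and drops back to the starting value after a quiet period. The class name `IncreasingThrottle` and every parameter value (`initial`, `factor`, `maximum`, `reset_after`) are assumptions for illustration, not anything agreed in this thread.

```python
import time


class IncreasingThrottle:
    """Delay between requests grows with sustained use (hypothetical helper)."""

    def __init__(self, initial=0.1, factor=2.0, maximum=5.0, reset_after=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.initial = initial          # first delay, in seconds
        self.factor = factor            # multiplier applied after each request
        self.maximum = maximum          # upper bound on the delay
        self.reset_after = reset_after  # idle time that resets the delay
        self._clock = clock
        self._sleep = sleep
        self._delay = initial
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block for the current delay, then increase it for the next call."""
        now = self._clock()
        if self._last is None:
            # The very first request goes out immediately.
            self._last = now
            return
        idle = now - self._last
        if idle >= self.reset_after:
            # A long quiet period: start over with the small delay.
            self._delay = self.initial
        remaining = self._delay - idle
        if remaining > 0:
            self._sleep(remaining)
            now = self._clock()
        self._last = now
        self._delay = min(self._delay * self.factor, self.maximum)
```

With these example values, a short burst of requests costs well under a second in total, while a client that keeps requesting is slowed to one request every `maximum` seconds.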

topaz avatar Nov 30 '21 23:11 topaz