twitch-tools
[Blocklist manager] Loading big lists takes a long time
If you have used, for example, the option to block all currently known bots, your blocklist could contain a few million accounts. Loading a list that large when the tool starts can take quite a while. It would be nice if this loading time could be reduced.
Unfortunately, the Twitch API returns blocked accounts in a seemingly random order (not truly random, but still not useful). This means we can't cache the last known state, because there is no point to start from and no way to check what has changed since.
Are you speaking of this request?
https://api.twitch.tv/helix/users/blocks?broadcaster_id=...&first=100&after=......&_=1652452792602
The RangeKey is your index of sorts?
{
  "UserID": "...",
  "PrimaryKey": "...#...",
  "RangeKey": "2022-05-13T14:38:35.138Z#2022-05-13T14:38:35.138Z"
}
I see the response data contains that next link as .pagination.cursor.
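For anyone following along, here is a minimal sketch of walking such a cursor-paginated list. It assumes each page is the decoded JSON body with a `data` array of entries carrying `user_id` and an optional `pagination.cursor`; `fetch_page` is a hypothetical helper (not part of the tool) that would do the actual request with `after=<cursor>`.

```python
from typing import Callable, Iterator, Optional

# A page is the decoded JSON body, assumed to look like:
# {"data": [{"user_id": "..."}], "pagination": {"cursor": "..."}}
Page = dict
FetchPage = Callable[[Optional[str]], Page]  # hypothetical: cursor in, page out


def iter_blocked_ids(fetch_page: FetchPage) -> Iterator[str]:
    """Yield every blocked user ID by following .pagination.cursor."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        for entry in page.get("data", []):
            yield entry["user_id"]
        # An absent or empty cursor means the last page was reached.
        cursor = page.get("pagination", {}).get("cursor")
        if not cursor:
            break
```

Because the function is a generator, a caller can stop consuming it at any time, which matters if only part of the list is needed.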
I've probably run into similar issues with the Twitch API: lists ordered by a less useful index or not sorted at all, and fewer controls for managing parts of the data accurately, probably all to prevent data mining and bot annoyances.
Just wanted to add some detail to the issue at least.
The "Get User Block List" endpoint has no range key, nor does it return the time when the block was made.
Looking at the docs now, they say:
Gets a specified user’s block list. The list is sorted by when the block occurred in descending order (i.e. most recent block first).
Not sure if they changed that or if the docs are wrong, because last time I checked this wasn't the case and the results were not sorted by a blocked-at date.
I have the same problem: I currently have around 3 million blocked accounts. If I want to add the newly known bot accounts, the tool first loads the full list of 3 million accounts. I noticed that it loads about 100 per second, so loading the whole list can take around 8 hours. I was thinking that you could avoid loading the whole list and load only the most recently blocked accounts, in order to determine from which point of the list to start adding the new known bot accounts.
Is there a way to download the list of bot accounts, sorted so that I can determine which accounts are the most recent? That way I could block the accounts manually.
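For scale: at roughly 100 accounts per second, 3,000,000 accounts take about 30,000 seconds, a bit over 8 hours. The early-stop idea could be sketched as below. This only works under the assumption that the API really returns the most recent blocks first, which is exactly what is in doubt in this thread; `load_new_blocks` and `cached_ids` are hypothetical names, not something the tool provides.

```python
from typing import Iterable, List, Set


def load_new_blocks(newest_first_ids: Iterable[str], cached_ids: Set[str]) -> List[str]:
    """Collect IDs from the front of the list until a cached one appears.

    Assumes newest-first ordering; with any other ordering the result
    would silently miss accounts.
    """
    new_ids: List[str] = []
    for user_id in newest_first_ids:
        if user_id in cached_ids:
            # Everything after this point was already seen on a previous run.
            break
        new_ids.append(user_id)
    return new_ids
```

If `newest_first_ids` is a lazy iterator over API pages, breaking out of the loop also means the remaining pages are never requested.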
The currently known bot accounts are listed by this API: https://twitch-tools.rootonline.de/blocklist_manager.php?preset=known_bot_users . However, they are sorted by account ID, and the array contains only account IDs, not usernames, since usernames are not used for blocking.
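Given that array of IDs and a locally known set of already blocked IDs, working out what is still missing is a plain set difference. A rough sketch, assuming the preset decodes to a flat list of IDs (the exact JSON shape is not shown here) and that IDs may arrive as numbers or strings; `ids_to_block` is a made-up name for illustration:

```python
from typing import Iterable, List, Union

UserId = Union[int, str]


def ids_to_block(known_bot_ids: Iterable[UserId], blocked_ids: Iterable[UserId]) -> List[str]:
    """Return known bot IDs that are not blocked yet, keeping the preset's order."""
    # Normalise to strings so 123 and "123" compare as the same account.
    blocked = {str(user_id) for user_id in blocked_ids}
    return [str(user_id) for user_id in known_bot_ids if str(user_id) not in blocked]
```

The set lookup keeps this linear in the size of both lists, so the comparison itself is cheap even for millions of entries; the slow part remains fetching the block list.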
I see. I noticed there is a graph that shows how many accounts were added each day: https://twitch-tools.rootonline.de/known_bot_stats.php
If a list of the accounts added each day (grouped by day) were available, it would be very useful: the list of blocked accounts could then be kept synchronized much faster.
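To make the request concrete, this is roughly how a client could use such a per-day list. The data shape is purely hypothetical (no such endpoint exists as far as this thread shows): a mapping from an ISO date string to the IDs added that day.

```python
from datetime import date
from typing import Dict, List


def ids_added_since(by_day: Dict[str, List[str]], last_sync: date) -> List[str]:
    """Return IDs from every day strictly after last_sync, oldest day first."""
    result: List[str] = []
    for day in sorted(by_day):  # ISO dates sort chronologically as strings
        if date.fromisoformat(day) > last_sync:
            result.extend(by_day[day])
    return result
```

With something like this, a sync would only need to block the accounts from the days since the last run instead of comparing against the full block list.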