dosage
Scheduling & performance
Currently, dosage downloads comics in a very straightforward way:
- Get page
- Parse page
- Get images
- Continue with next page
For better performance, the user can download multiple comics in parallel (via the -p option), but that's more of a crutch: the threads aren't aware of each other, so several threads can end up fetching comics from the same hoster at the same time.
We should evaluate a better scheduling system, satisfying at least the following requirements:
- [ ] Parallel downloads from multiple hosts
- [ ] Throttling per host (we don't want to overload a hoster)
- [ ] Image downloads can be handled separately from page parsing
It might be worthwhile to look at things like asyncio, async/await or something like that...
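To make the discussion concrete, here is a rough sketch of how the three requirements could map onto asyncio. It is not dosage code: `HostThrottle`, `crawl_comic`, `image_worker`, `download_all` and `fake_fetch` are hypothetical names, the network call is simulated with `asyncio.sleep`, and "throttling" is interpreted here as a cap on concurrent requests per host (a minimum delay between requests would be another option).

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlparse


class HostThrottle:
    """Caps concurrent requests per host (hypothetical helper, not dosage API)."""

    def __init__(self, limit):
        self._limit = limit
        self._semaphores = {}
        self._active = defaultdict(int)
        self.peak = defaultdict(int)  # highest concurrency seen per host

    async def run(self, url, fetch):
        host = urlparse(url).netloc
        if host not in self._semaphores:
            self._semaphores[host] = asyncio.Semaphore(self._limit)
        async with self._semaphores[host]:
            self._active[host] += 1
            self.peak[host] = max(self.peak[host], self._active[host])
            try:
                return await fetch(url)
            finally:
                self._active[host] -= 1


async def fake_fetch(url):
    # Stand-in for a real HTTP request; an assumption for this sketch.
    await asyncio.sleep(0.01)
    return f"body of {url}"


async def crawl_comic(pages, queue, throttle, fetch):
    # Get page, "parse" it, hand image URLs to the workers, go to next page.
    for page_url, image_urls in pages:
        await throttle.run(page_url, fetch)
        for image_url in image_urls:  # parsing is faked: URLs are given
            queue.put_nowait(image_url)


async def image_worker(queue, throttle, fetch, done):
    # Image downloads run independently of page parsing.
    while True:
        url = await queue.get()
        try:
            await throttle.run(url, fetch)
            done.append(url)
        finally:
            queue.task_done()


async def download_all(comics, limit=2, workers=4, fetch=fake_fetch):
    throttle = HostThrottle(limit)
    queue = asyncio.Queue()
    done = []
    tasks = [
        asyncio.create_task(image_worker(queue, throttle, fetch, done))
        for _ in range(workers)
    ]
    # All comics are crawled concurrently; the throttle keeps each host in check.
    await asyncio.gather(
        *(crawl_comic(pages, queue, throttle, fetch) for pages in comics)
    )
    await queue.join()  # wait until every queued image has been handled
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return done, dict(throttle.peak)
```

Something like `asyncio.run(download_all(comics))` would drive it, where `comics` is a list of per-comic lists of `(page_url, [image_urls])` tuples. Real code would need an async HTTP client and proper error handling in the workers; the point is only that one shared per-host throttle replaces threads that know nothing about each other.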