rgaudin
Great news! We'll test and integrate it.
Great! Maybe leave progress reporting for a second pass? Switching to libzim would be a great achievement already.
In your ticket, I see `Prefix: mwoffliner-`. Is that a typo? > Can you split the stuff to have one file per language like we have in other Python projects...
Looks like TW supports gettext so that's probably what we're gonna use. We also have strings in JS code. We'd need to assess tools. https://guillaumepotier.github.io/gettext.js/ would help a single format...
I think `retrying` is probably the way to go here. That's what we do on other scrapers. We use `backoff` but `retrying` seems like a better choice.
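For illustration, here is a minimal stdlib-only sketch of the retry-with-exponential-backoff pattern that libraries like `retrying` and `backoff` provide. It is not either library's API: the `retry` decorator and its parameters (`attempts`, `base_delay`, `exceptions`, `sleep`) are hypothetical names made up for this example.

```python
import functools
import time


def retry(attempts=3, base_delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Hypothetical decorator: retry a call, doubling the wait after each failure."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts - 1:
                        raise  # out of attempts: surface the last error
                    # wait base_delay, then 2x, 4x, ... before the next try
                    sleep(base_delay * 2 ** attempt)

        return wrapper

    return decorator
```

A scraper would wrap its download function with something like `@retry(attempts=5, exceptions=(ConnectionError,))`. In practice the maintained libraries are likely the better choice, since they also offer jitter and wait caps.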
@benoit74 I believe the problem is not in the calls themselves but in the fact that they are treated independently, blindly. We are using a single target host and we have more...
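One possible reading of "not treated independently", sketched under assumptions: all calls to the single target host share one pause, so a failure seen by one call slows down every other call too. `HostGate`, `wait` and `penalize` are hypothetical names for this example, not code from the scraper.

```python
import threading
import time


class HostGate:
    """Hypothetical shared pause for every call made to one host."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._resume_at = 0.0  # no call may start before this clock value

    def wait(self):
        """Block until the shared pause, if any, has elapsed."""
        with self._lock:
            remaining = self._resume_at - self._clock()
        if remaining > 0:
            self._sleep(remaining)

    def penalize(self, seconds):
        """Record a failure: every caller now waits at least `seconds`."""
        with self._lock:
            self._resume_at = max(self._resume_at, self._clock() + seconds)
```

Each worker would call `gate.wait()` before a request and `gate.penalize(delay)` when the host answers with an error or a rate-limit response, so the back-off decision is taken once for the host rather than once per call.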
@kevinmcmurtrie, your input is important and duly noted; it's not the first time you're sharing this with us. While it's a highly important point to our process, changing stack...
The most difficult part here is the one that's not been mentioned: the UI. With our generic UI that What do entries look like? An HTML shell that displays epub.js...
I share this conclusion. I'd prefer more scrapers to work without JS but it's hardly realistic. Some of them are just dependent on JS and others, like gutenberg are built...
@Rayan-Rasheed I believe it is, but we only refresh StackOverflow once or twice a year. What should be done here is to tweak the scraper to not download/process data...