Wtr lab trips off in the middle of chapters

Open Okikaemmanuel1 opened this issue 2 years ago • 7 comments

It just shuts down mid grab of chapters and says chapter body not found

Okikaemmanuel1 avatar Sep 13 '22 11:09 Okikaemmanuel1

@Okikaemmanuel1 The site is probably getting suspicious because of the rate at which you're fetching chapters. I suggest going to the advanced options and slowing it down.

dteviot avatar Sep 13 '22 19:09 dteviot

Okay thanks

Okikaemmanuel1 avatar Sep 16 '22 00:09 Okikaemmanuel1

Still trips off

Okikaemmanuel1 avatar Sep 19 '22 09:09 Okikaemmanuel1

Here is an example of one novel that does that https://wtr-lab.com/en/serie-556/my-attributes-cultivation-life

Okikaemmanuel1 avatar Sep 19 '22 09:09 Okikaemmanuel1

@Okikaemmanuel1 OK, it looks like the site will occasionally send a page that has no content. (Presumably to cause problems for people scraping the site.) When this happens, WebToEpub generates the error:

TypeError: chapterdata.body is not iterable
    at WtrlabParser.preprocessRawDom (chrome-extension://omliclippfibgegkdnanknpekbaahlhm/js/parsers/WtrlabParser.js:50:38)
    at chrome-extension://omliclippfibgegkdnanknpekbaahlhm/js/Parser.js:495:24
    at async Promise.all (index 0)
    at async WtrlabParser.fetchWebPages (chrome-extension://omliclippfibgegkdnanknpekbaahlhm/js/Parser.js:470:17)

Note, it looks like there might be some chapters that just don't work.

I've updated WebToEpub to correctly inform you when a chapter has no content.
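For anyone curious what "correctly inform you" could look like in code, here is a minimal sketch, not the actual WebToEpub change. It assumes the chapter payload is an object whose `body` is normally an array; `EmptyChapterError` and `extractParagraphs` are hypothetical names made up for illustration. The idea is to check the shape before iterating, so an empty page produces a readable error instead of "is not iterable".

```javascript
// Hypothetical sketch: these names and the payload shape are assumptions,
// not WebToEpub's real code.
class EmptyChapterError extends Error {
    constructor(url) {
        super(`Chapter has no content: ${url}`);
        this.name = "EmptyChapterError";
        this.url = url;
    }
}

function extractParagraphs(chapterData, url) {
    const body = chapterData?.body;
    // Running for...of over undefined, null, or a plain object throws
    // "TypeError: ... is not iterable", so validate the shape first.
    if (!Array.isArray(body) || body.length === 0) {
        throw new EmptyChapterError(url);
    }
    const paragraphs = [];
    for (const item of body) {
        paragraphs.push(String(item));
    }
    return paragraphs;
}
```

A caller can then catch `EmptyChapterError` specifically and report which chapter URL came back empty.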

There are two possible ways to work around this.

  1. Enable "Skip chapters that return HTTP 404 error", and WebToEpub will continue running, inserting a placeholder in the Epub for those chapters, and give you a list of the problems at the end. You can then run WebToEpub again to get the "missed" chapters. (They will be marked in the chapter list, and you can just select them.) Then you can use something like MergeWebToEpub to replace the chapters.
  2. Disable "Skip chapters that return HTTP 404 error". WebToEpub will stop when a chapter has no content. You can then restart from the failed chapter and, as in option 1, stitch the results together with MergeWebToEpub.
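The difference between the two options can be sketched roughly as follows. This is an illustration only, not how WebToEpub is implemented: `fetchAllChapters`, `fetchChapter`, and `skipFailures` are hypothetical names, and `fetchChapter` is assumed to be any async function that rejects when a chapter comes back empty.

```javascript
// Hypothetical sketch of "skip and report" versus "stop at first failure".
// None of these names come from WebToEpub itself.
async function fetchAllChapters(urls, fetchChapter, skipFailures) {
    const chapters = [];
    const failed = [];
    for (const url of urls) {
        try {
            chapters.push({ url, content: await fetchChapter(url) });
        } catch (err) {
            if (!skipFailures) {
                // Option 2: stop here so the run can be restarted from this chapter.
                throw new Error(`Stopped at ${url}: ${err.message}`);
            }
            // Option 1: insert a placeholder, remember the failure, keep going.
            chapters.push({ url, content: null, placeholder: true });
            failed.push(url);
        }
    }
    return { chapters, failed };
}
```

With `skipFailures` set to true, the returned `failed` list plays the role of the problem list shown at the end of a run: those are the chapters to fetch again and merge in afterwards.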

Test versions for Firefox and Chrome have been uploaded to https://drive.google.com/drive/folders/1B_X2WcsaI_eg9yA-5bHJb8VeTZGKExl8?usp=sharing. Pick the one suitable for you, follow the "How to install from Source (for people who are not developers)" instructions at https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode#user-content-how-to-install-from-source-for-people-who-are-not-developers and let me know how it goes.

For my notes: 71 minutes work

dteviot avatar Oct 01 '22 05:10 dteviot

Thank you very much

Okikaemmanuel1 avatar Oct 01 '22 21:10 Okikaemmanuel1

Working now

Okikaemmanuel1 avatar Oct 01 '22 21:10 Okikaemmanuel1

Going to call this done.

dteviot avatar Feb 26 '23 01:02 dteviot