
Please add site https://buntls.com

Open • Tenome opened this issue 11 months ago • 1 comment

Provide URL for web page that contains Table of Contents (list of chapters) of a typical story on the site

https://buntls.com/story/artist-who-paints-dungeon/

Did you try using the Default Parser for the site? If not, why not?

It looks like the website scrambles the text until you load it, so the text appears jumbled when you try to scrape it. This is also probably why the website is so painfully slow...

What settings did you use? What didn't work?

  • URL of first chapter https://buntls.com/chapter/chapter-1-%f0%9f%96%bc/
  • CSS selector for element holding content to put into EPUB #chapter-content
  • CSS selector for element holding Title of Chapter #paragraph-0
  • CSS selector for element(s) to remove

Tenome, Jan 23 '25 03:01

@Tenome You can use EpubEditor https://github.com/dteviot/EpubEditor with a script like this to decrypt the epub once you create it.

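// Build a lookup table from scrambled character to clear character.
// "crypt" and "clear" must be the same length: position i in "crypt" maps to position i in "clear".
let decryptTable = new Map();
let crypt = "abcde fghij klmno pqrst uvwxyz ABCDE FGHIJ KLMNO PQRST UVWXYZ";
let clear = "tonqu erzla wicvf jpsyh gdmkbx JKABR UDQZC THFVL IWNEY PSXGOM";
for(let i = 0; i < crypt.length; ++i) {
    decryptTable.set(crypt[i], clear[i]);
}
// Characters that are not in the table (digits, punctuation) pass through unchanged.
let decryptChar = c => decryptTable.get(c) ?? c;
let decryptString = cypherText => cypherText.split("").map(c => decryptChar(c)).join("");
// Decrypt the text of every paragraph in the chapter.
for(let e of dom.querySelectorAll("p")) {
    e.textContent = decryptString(e.textContent);
}
return true;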
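
One caveat with the script above: assigning to textContent on a paragraph replaces any child elements inside it (for example italics) with plain text. If that matters, a variant that only rewrites text nodes might look like the sketch below. The helper name decryptTextNodes is hypothetical and not part of EpubEditor, and the sketch assumes the nodes behave like standard DOM nodes with nodeType, nodeValue and childNodes.

```javascript
// Hypothetical helper, not part of EpubEditor: decrypt text nodes only,
// leaving child elements (e.g. <em>, <strong>) in place.
const TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM standard

function decryptTextNodes(node, decryptString) {
    for (const child of node.childNodes) {
        if (child.nodeType === TEXT_NODE) {
            // Rewrite the text in place.
            child.nodeValue = decryptString(child.nodeValue);
        } else {
            // Recurse into elements so nested markup is preserved.
            decryptTextNodes(child, decryptString);
        }
    }
}
```

To try it, the body of the final loop in the script would become decryptTextNodes(e, decryptString); instead of the textContent assignment.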

Please note, you'll probably need to work out the correct values to put into the "clear" string yourself, because it looks like the site might be using a number of different character substitutions. Basically, the steps are:

  1. Create an EPUB.
  2. Open a chapter in the EPUB.
  3. Open the same chapter on the web site.

Compare the same sentence in both, and you can start to fill in the missing mappings. For example, in the EPUB:

Mhxozldnh, Grnnhgcdro hmenrphht zlr uhghdwhb clh txmh rvvdgdxn orcdgh xt Glx Xux zhuh whup ehuenhshb.

On the web site:

Meanwhile, Collection employees who received the same official notice as Cha Ara were very perplexed.

So, comparing these character by character, we can see that M -> M, h -> e, x -> a, G -> C, X -> A, etc.

You can then look up each EPUB character in the "crypt" string and change the character at the same position in the "clear" string.
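
The comparison above can also be done mechanically. The following is a minimal sketch, assuming the scrambled and clear sentences have the same length and line up character for character, as they do in the example. The function names deriveMapping and patchClear are hypothetical helpers, not part of WebToEpub or EpubEditor.

```javascript
// Sentence pair taken from the example above.
const scrambled = "Mhxozldnh, Grnnhgcdro hmenrphht zlr uhghdwhb clh txmh rvvdgdxn orcdgh xt Glx Xux zhuh whup ehuenhshb.";
const plain = "Meanwhile, Collection employees who received the same official notice as Cha Ara were very perplexed.";

// Hypothetical helper: collect scrambled -> clear pairs for letters only.
function deriveMapping(scrambledText, clearText) {
    if (scrambledText.length !== clearText.length) {
        throw new Error("Strings must be the same length");
    }
    const mapping = new Map();
    for (let i = 0; i < scrambledText.length; ++i) {
        const from = scrambledText[i];
        const to = clearText[i];
        if (!/[A-Za-z]/.test(from)) {
            continue; // skip spaces and punctuation
        }
        if (mapping.has(from) && mapping.get(from) !== to) {
            // The pair is misaligned, or the site used more than one substitution.
            throw new Error(`Inconsistent mapping for "${from}"`);
        }
        mapping.set(from, to);
    }
    return mapping;
}

// Hypothetical helper: overwrite the known positions of "clear",
// keeping the existing character wherever no mapping has been found yet.
function patchClear(crypt, clear, mapping) {
    return crypt.split("").map((c, i) => mapping.get(c) ?? clear[i]).join("");
}

const mapping = deriveMapping(scrambled, plain);
const crypt = "abcde fghij klmno pqrst uvwxyz ABCDE FGHIJ KLMNO PQRST UVWXYZ";
const oldClear = "tonqu erzla wicvf jpsyh gdmkbx JKABR UDQZC THFVL IWNEY PSXGOM";
console.log(patchClear(crypt, oldClear, mapping));
```

One sentence rarely covers every letter, so positions the sentence never touches keep their old value. Repeating this with more sentences from the same chapter fills in the rest, and an "Inconsistent mapping" error is a hint that a different substitution is in use.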

dteviot, Jan 26 '25 07:01