Capturing unneeded elements

Open Anangaya opened this issue 4 years ago • 4 comments

So far, this plugin usually captures some unneeded elements for me. For example, it works terribly with Scribble Hub. If there is even one comment in the comment section, the main content is completely ignored and only the comments are captured. When that happens, the EPUB starts with the string "Error: Parse Error:".

Even when the main content is captured, some unneeded elements are captured both before and after the necessary content. It would be nice if we could specify which elements are captured and which are not, preferably with an XPath expression for the needed elements.
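To make the request concrete, here is a minimal sketch of XPath-based capture. It is not the extension's code: the sample page, the element ids (`header`, `chapter`, `comments`), and the `capture` helper are all hypothetical, and it uses the limited XPath subset in Python's standard library on well-formed markup. In a browser extension the equivalent lookup would presumably go through the DOM's `document.evaluate`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page; the ids are made up for illustration
# and do not describe any real site's markup.
SAMPLE_PAGE = """
<html><body>
  <div id="header">Site navigation</div>
  <div id="chapter"><p>First paragraph.</p><p>Second paragraph.</p></div>
  <div id="comments"><p>A reader comment.</p></div>
</body></html>
"""


def capture(page: str, xpath: str) -> list[str]:
    """Return the text of every element matched by `xpath`, in document order."""
    root = ET.fromstring(page.strip())
    # itertext() gathers the text of the element and all of its descendants.
    return ["".join(el.itertext()).strip() for el in root.findall(xpath)]


# Only the chapter paragraphs are kept; the header and comments are skipped.
print(capture(SAMPLE_PAGE, ".//div[@id='chapter']/p"))
```

With a user-supplied expression like this, the comment section would never be picked up, no matter how many comments the page has. Real pages are rarely well-formed XML, so an actual implementation would need an HTML-aware parser.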

Anangaya avatar Nov 09 '20 20:11 Anangaya

I can't reproduce it. Please send the link that's causing problems.

Screenshot 2020-11-12 at 17 26 49

alexadam avatar Nov 12 '20 15:11 alexadam

Basically none of them worked for me. Here is a link:

https://www.scribblehub.com/read/7681-the-mage-emperor/chapter/7683/

This is the EPUB:

The Mage Emperor - Chapter 1 – My sexy childhood friend returned. And… Moved into our house! Scribble Hub.zip

Anangaya avatar Nov 13 '20 02:11 Anangaya

OK, it seems the bug only exists in the Firefox extension. Chrome produced the EPUB as it's supposed to.

Anangaya avatar Nov 13 '20 02:11 Anangaya

A way to actually select the HTML element to capture would be nice. It would be great in cases where a single HTML element spans multiple web pages, in which case it's not possible to select all the text at once. Go to 24symbols.com and try a free book, for example. The save page option can capture a chapter almost perfectly, save for an unwanted footer at the end of each chapter (which is still great, because it's actually inside an iframe and the footers can be removed easily afterwards). But the save selection method fails spectacularly in this case (the chapter spans multiple pages even though the entire chapter gets loaded on each page).

On a side note, I'd be really grateful if you could answer this question: how is 24symbols preventing us from accessing the page source of the books' web pages? (What it gives is a completely different page source.)

OK, here is a web page I saved from 24symbols (with the SingleFile plugin):

Aftershock - A Stone Braide Chronicles Story by Bonnie S. Calhoun - Read book online (11_24_2020 12_25_01 PM).zip

The book was just something I used as a guinea pig; I still don't have any idea what it's about! The page source can be viewed from this file, which is not the case when I try it directly on the site. The entire chapter is there in the page source, but only part of it is visible on the web page, so it's impossible to select it all with the Save Selection option.

Anangaya avatar Nov 23 '20 20:11 Anangaya