Question: How to deal with websites that dynamically insert/delete text with JavaScript?

Open tgf9 opened this issue 11 months ago • 2 comments

I'm trying to convert a website that has a lot of text, but it also has these annoying <details>-like boxes. I checked the HTML and they're actually <div> elements with, I guess, a click handler. When I click the div, text gets added to a child <div> element, which looks like opening the <details>-like element. When I click again, JS deletes the text in the child <div> element, which looks like closing it.

It's pretty annoying because there's a ton of text hidden in these boxes that is missing from the resulting epub. I understand why it's missing (it's literally not in the HTML until you click a button), but I'm wondering if there is anything I can do about this situation.

Is there a way I could manually click on the boxes to add the child text to the page and then have WebToEpub process that HTML?

tgf9, Jan 18 '25 07:01

At the moment it is not possible.

gamebeaker, Jan 19 '25 20:01

@tgf9

It's possible that the text is elsewhere on the web page, probably buried in one or more <script> elements. In that case, it should be possible to make a parser for the site that will scan the page, find the various text elements and stitch them together.
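
To illustrate the idea of scanning <script> elements and stitching the text together, here is a minimal standalone sketch in Python. It is not a WebToEpub parser and does not use WebToEpub's API. It assumes, purely for illustration, that the hidden text sits in a <script> element as JSON shaped like {"sections": [{"text": "..."}]}; the key names "sections" and "text" and the function name extract_hidden_text are hypothetical, and a real site will almost certainly use a different layout.

```python
import json
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the raw text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self._buffer = []
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self.scripts.append("".join(self._buffer))
            self._in_script = False


def extract_hidden_text(html):
    """Return the text found in JSON-bearing <script> elements, in page order.

    ASSUMPTION: the payload looks like {"sections": [{"text": "..."}]}.
    These key names are hypothetical; adjust them to the actual site.
    """
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()

    pieces = []
    for body in collector.scripts:
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # ordinary JavaScript, not a JSON payload
        if not isinstance(payload, dict):
            continue
        sections = payload.get("sections", [])
        if not isinstance(sections, list):
            continue
        for section in sections:
            if isinstance(section, dict) and isinstance(section.get("text"), str):
                pieces.append(section["text"])
    # Stitch the fragments together as separate paragraphs.
    return "\n\n".join(pieces)
```

Calling extract_hidden_text(page_html) on a saved copy of the page returns the stitched text, or an empty string if no <script> element matches the assumed shape.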

Note that it's also possible the text is stored somewhere else entirely, or is encrypted, in which case it may not be possible to solve this in WebToEpub.

If you can provide me with a URL, I can look into this.

dteviot, Jan 21 '25 06:01

no response

gamebeaker, May 12 '25 13:05