Novel-Grabber [SITE BROKEN] NovelUpdates.com

[SITE BROKEN] NovelUpdates.com

Open fangjiejie opened this issue 3 years ago • 8 comments

Pasting the NU link into the Telegram bot returns nothing, but NU is listed as one of the sources.

Jun 13 '21 18:06 fangjiejie

Novel Updates does not work right now, see #135

Jun 13 '21 19:06 Flameish

Alright, I think this is the best and easiest semi-automated solution (and a backup solution for future NU shenanigans):

The user goes to "novelupdates.com/series/xxxx-xxxxx/" and clicks the "show all chapters" button.
The user saves "novelupdates.com/series/xxxx-xxxxx/" as an MHTML file (or saves the full site).
Then they open the saved file in Novel Grabber.
Everything works as it did previously with a novelupdates URL.

On the developer side, you’d just make the scraper look for the links and chapter numbers in the "my_popupreading_wrapper" element of the MHTML/HTML file.

Or, if they still had the "my_popupreading_wrapper" popup open when they saved the page, it should work without changing anything about how Novel Grabber’s NU scraper works now.
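A rough sketch of what that scraper step could look like is below. It is not Novel-Grabber's actual code. It assumes the saved file is plain HTML (an MHTML file would need its MIME encoding decoded first), that "my_popupreading_wrapper" appears as an id or class name on the popup, and that chapter links are the /extnu links mentioned in this thread. The function name `extractChapterLinks` is made up for illustration, and a real implementation would use a proper HTML parser rather than a regex.

```javascript
// Sketch: pull chapter links out of a saved NovelUpdates series page.
// Assumptions (not confirmed by the project): the popup is marked with the
// name "my_popupreading_wrapper", and chapter links contain "/extnu/".
function extractChapterLinks(html, wrapperName = "my_popupreading_wrapper") {
  const start = html.indexOf(wrapperName);
  if (start === -1) return []; // popup was not open when the page was saved

  // Only look at the markup from the wrapper onwards.
  const section = html.slice(start);
  const anchorRe = /<a\b[^>]*?href\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/a>/gi;

  const chapters = [];
  const seen = new Set();
  for (const match of section.matchAll(anchorRe)) {
    let url = match[1].replace(/&amp;/g, "&");
    if (!url.includes("/extnu/")) continue;          // not a chapter link
    if (url.startsWith("//")) url = "https:" + url;  // protocol-relative link
    const title = match[2].replace(/<[^>]*>/g, "").trim(); // strip inner tags
    if (!title || seen.has(url)) continue;           // skip icon-only or duplicate links
    seen.add(url);
    chapters.push({ title, url });
  }
  return chapters;
}
```

The result is an array of `{ title, url }` objects in page order, so it could be fed into the existing link list the same way as links fetched from a live URL.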

local full html file+assets folder

I tested it, and https://www.novelupdates.com/extnu links aren’t behind hCaptcha. Well… yet.

Maybe it could even be automated with a browser extension. Maybe not the click on the "show all chapters" button (I don’t really know how sensitive hCaptcha is to that), but the remaining parts (generate the HTML file, save it, open it in Novel Grabber) could definitely be automated.

I don’t see this backup route (with an extension) as a massive downside, since most users start by browsing novelupdates to find the novel link anyway.

It’s not elegant or fully automated, but it would at least give users a backup way to download a novel when that niche novel isn’t anywhere else but NU and NU adds another layer of protection that breaks the scraper.

Most other popular sites will have all the novels scraped by aggregators, but NU has some niche stuff that’s literally nowhere else. That’s why I think there should be a backup way specifically for Novelupdates.

Sep 09 '21 11:09 frykauf

I do get a 403 error on the /extnu links as well, so this won't help. Also, while not fully automated, you could do this with manual grabbing already: Paste a table of contents url, fetch links (doesn't matter if you get a 403, it's needed to work) then manually insert chapter links. There are probably browser addons to copy links from a selection.

Sep 09 '21 12:09 Flameish

@Flameish I don’t get 403s when I use a headless browser; without it I get them too.

What you said about grabbing and pasting the content from the NU pop-up window: that surprisingly works well.

If you meant pasting the ToC and chapter links from the website that hosts the novel, you run into problems when two or more translation groups are working on a novel, when the site’s ToC isn’t updated with the newest chapter links, or when the site is badly sitemapped or formatted.

But regardless of which one you meant, both require quite a lot of manual work from users and presume that they have a tool and the knowledge to extract URLs from text, and possibly also to reverse the line order (NU).

That’s pretty learning-heavy, slow and unautomated for normal users.
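To give an idea of the text manipulation involved, here is a minimal sketch of the "extract URLs from text, then reverse the lines" step. It assumes the pasted text contains absolute http(s) URLs and that the NU popup lists the newest chapter first, which is why the order is flipped. The function name `urlsOldestFirst` is hypothetical.

```javascript
// Sketch: extract URLs from pasted text and return them oldest-first.
// Assumption: the source (e.g. the NU popup) lists the newest chapter first.
function urlsOldestFirst(pastedText) {
  // Match http(s) URLs up to the next whitespace, quote or angle bracket.
  const urlRe = /https?:\/\/[^\s"'<>]+/g;
  const urls = pastedText.match(urlRe) || [];
  const unique = [...new Set(urls)]; // drop repeated links, keep first occurrence
  return unique.reverse();
}
```

This only covers text that already contains the URLs; text copied from a browser usually carries just the visible chapter names, so the links would still have to be copied separately.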

Sep 09 '21 12:09 frykauf

@Flameish

While we’re on that topic, it would be great if you could add multiple ToCs from multiple sites that host the novel to collect the links. Right now, whenever you add a new ToC link, the currently fetched links disappear.

It would be great if they only disappeared when you click on a "clear all" button.

It would also be useful if you spent 10 minutes editing the chapter links, so that they aren’t gone after loading the URL again.
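As a sketch of the behaviour I mean (purely illustrative, not how Novel-Grabber stores links today): the list keeps accumulating links from every ToC that is fetched, skips URLs it already has, and is only emptied by an explicit "clear all". The class and method names are made up.

```javascript
// Sketch: a link list that survives repeated ToC fetches.
// All names here are hypothetical; this is not Novel-Grabber code.
class ChapterLinkList {
  constructor() {
    this.links = [];      // ordered { title, url } entries
    this.seen = new Set(); // URLs already in the list
  }

  // Append links from one fetched ToC; returns how many were new.
  addFromToc(entries) {
    let added = 0;
    for (const { title, url } of entries) {
      if (this.seen.has(url)) continue; // keep the existing (possibly edited) entry
      this.seen.add(url);
      this.links.push({ title, url });
      added++;
    }
    return added;
  }

  // Only an explicit "clear all" empties the list.
  clearAll() {
    this.links = [];
    this.seen.clear();
  }
}
```

Fetching a second ToC would then call `addFromToc` again instead of replacing the list, so manual edits made in between are kept.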

Sep 09 '21 12:09 frykauf

And a feature where you can just paste links into the links window if they are text links, meaning "Chapter 30" text that is also a URL link to Chapter 30.

For example:

Chapter names with chapter URL links

Just selecting them, copying them and pasting them into the links "tab"/window/frame, whatever it’s called.

Links window

That would also fit much better with what you were saying about people just copying and pasting links. It would make this 100x easier and faster for anyone, even for people who know text manipulation tools.

Sep 09 '21 12:09 frykauf

Headless does indeed work with the /extnu links, my bad.

For manually grabbing to work you need to click on "fetch links" once, doesn't matter if you get a 403/forbidden error, it's an initialization thing. After that you can manually copy and paste the links from NU to it.

I agree that it can be quite a hassle to accomplish that and I like your idea of parsing html to chapter + links directly, even if this doesn't quite work in browsers since they just copy text as far as I know. You would need something like this addon to copy links, it is also what I use for NU in conjunction with this bookmark script:

javascript:(function(){var x = document.querySelectorAll('li a[title]');var i;for(i = 0; i < x.length; i++){x[i].remove();}})();

When executed, this script removes those "Going to chapter" links from the popup window so that you can copy all the links once for the chapter name and once more with the addon for the links.
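For readability, the same logic can be written as a plain function. This is only an unrolled sketch of the bookmarklet above; the function name `removeTitledListLinks` is made up, and the selector `li a[title]` is taken unchanged from the one-liner.

```javascript
// Sketch: readable version of the bookmarklet above.
// Removes every <a> with a title attribute inside a list item and
// returns how many elements were removed.
function removeTitledListLinks(doc) {
  // Copy into an array first so removal cannot affect the iteration.
  const links = Array.from(doc.querySelectorAll("li a[title]"));
  for (const link of links) {
    link.remove();
  }
  return links.length;
}

// In a browser console, with the NU chapter popup open:
// removeTitledListLinks(document);
```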

Nevertheless please keep this issue on topic and create new ones for your feature requests; but be advised that I'm currently in the process of rewriting Novel-Grabber, which is why I'm not adding new features to this version. Separate issues are easier to keep track of than having these requests buried within others.

Sep 09 '21 13:09 Flameish

Yep, I’ll create separate issues. I was just pitching the ideas to see whether they make sense while we were on the topic.

Sep 16 '21 21:09 frykauf
