Raja Tomar comments

Results 55 comments of Raja Tomar

How to clone linked pages?

Are the hyperlinked pages hosted on the same domain as the site, or on an external one?

How to clone linked pages?

pywebcopy builds a hierarchical folder structure, meaning your hyperlinked pages might be in folders relative to the main HTML file.
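To make the idea of a hierarchical layout concrete, here is a minimal sketch of how a URL could be mapped to a nested local path. It is not pywebcopy's actual layout logic; the function name `local_path_for` and the `project` root folder are hypothetical and only illustrate why a linked page may end up in a subfolder rather than next to the main HTML file.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_path_for(url, project_root="project"):
    """Map a URL to a path under project_root that mirrors its host and URL path.

    Hypothetical helper for illustration; pywebcopy's real naming may differ.
    """
    parts = urlparse(url)
    path = parts.path
    # Treat directory-style URLs as an index.html inside that directory.
    if path == "" or path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(project_root, parts.netloc, path.lstrip("/")))


main_page = local_path_for("https://example.com/")
linked_page = local_path_for("https://example.com/docs/page.html")
```

With a layout like this, the main page would sit at `project/example.com/index.html` and the linked page one level deeper, under `project/example.com/docs/`.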

How to clone linked pages?

It could be a server-side, site-specific issue. Maybe the hyperlinks are not resolving properly due to a bad URL or bad HTML formatting.

How to clone linked pages?

OK, just use the `save_website` function instead of `save_webpage`.
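A minimal sketch of that switch, assuming pywebcopy is installed and that `save_website` accepts the `url`, `project_folder` and `project_name` keyword arguments shown in the project README. The wrapper name `clone_site` is hypothetical.

```python
def clone_site(url, project_folder, project_name):
    """Save a whole site rather than a single page (needs `pip install pywebcopy`)."""
    # Imported lazily so this sketch can be defined without pywebcopy installed.
    from pywebcopy import save_website

    # Per the suggestion above, save_website is used in place of save_webpage
    # so that linked pages are cloned as well.
    save_website(url=url, project_folder=project_folder, project_name=project_name)
```

A call might look like `clone_site("https://example.com/", "/tmp/saved", "example")`, where the URL and folder are placeholder values.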

New release work flow: migration plan and GitHub release

Hey @NickVeld, I can't think of any plan right now other than pushing version 7 to PyPI directly. If you have any plan or roadmap, please do...

Encoding issues for websites in non-English languages such as Chinese, Japanese, etc.

@mima3 this is one of the ways to do it. The other is to change the `.encoding` attribute of the `WebPage` object.
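As a rough illustration of why the declared encoding matters for non-English pages, the sketch below decodes the same bytes with two different encodings using only the standard library. It does not use pywebcopy; `decode_page` is a hypothetical helper, and the sample text and encodings are assumptions chosen for the example.

```python
def decode_page(raw_bytes, encoding):
    """Decode downloaded page bytes using an explicitly chosen encoding."""
    return raw_bytes.decode(encoding)


# Bytes as a Japanese site might serve them (Shift JIS is assumed here).
raw = "日本語".encode("shift_jis")

correct = decode_page(raw, "shift_jis")  # matches the original text
garbled = decode_page(raw, "latin-1")    # wrong encoding yields unreadable text
```

Overriding the encoding on the page object would play the same role as passing the right `encoding` argument here.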

Skip crawling and replacement of other domains

Hey, this will be possible in pywebcopy 7. Currently it is only in the source code on the GitHub repo. If you want the domain exclusion functionality, then you have to go...

Skip crawling and replacement of other domains

So the Session object has an attribute `.domain_blacklist`; it is a set or list object which is empty by default. Now if you want to skip the downloading from a...
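The effect of such a blacklist can be sketched with the standard library alone. This is not pywebcopy's implementation: `is_blacklisted` is a hypothetical helper and the domains are made up; only the idea of a container that starts empty and holds domains to skip comes from the comment above.

```python
from urllib.parse import urlparse


def is_blacklisted(url, domain_blacklist):
    """Return True when the URL's host is one of the blacklisted domains."""
    host = urlparse(url).hostname or ""
    return host in domain_blacklist


domain_blacklist = set()                 # empty by default
domain_blacklist.add("ads.example.net")  # hypothetical domain to skip

urls = [
    "https://example.com/index.html",
    "https://ads.example.net/banner.js",
]
# Keep only the URLs whose host is not on the blacklist.
allowed = [u for u in urls if not is_blacklisted(u, domain_blacklist)]
```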

Skip crawling and replacement of other domains

Blacklisting is done by the session object separately, so a blacklisted link poses as an unreachable link while the localiser still localises it. I guess it's not a bug, it's a feature....

save_website/crawl() does not download PDF

The PDFs are not downloaded because they are not hosted on the same domain, so the process marks them as external and skips them entirely.
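A same-domain check of the kind described can be sketched as follows. This is only an illustration under assumptions, not pywebcopy's code: `is_external` is a hypothetical name, and comparing exact hostnames is one simple way to decide what counts as external.

```python
from urllib.parse import urljoin, urlparse


def is_external(link, page_url):
    """Return True when link resolves to a different host than page_url."""
    # Resolve relative links against the page before comparing hosts.
    absolute = urljoin(page_url, link)
    return urlparse(absolute).hostname != urlparse(page_url).hostname


page = "https://example.com/docs/"
same_site_pdf = is_external("files/report.pdf", page)
other_site_pdf = is_external("https://cdn.example.org/report.pdf", page)
```

Under this check, a PDF served from a separate host such as a CDN is classed as external even though the page links to it directly.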