
Can't save output to text file.

Open languagedan opened this issue 6 years ago • 5 comments

First off, thanks for the tool. Trying to use it for the first time, but cannot get past the problem below...

If I run: linkcheck www.nobleprog.co.uk ...it runs fine.

If I run: linkcheck www.nobleprog.co.uk > list.log ...it gets stuck on: "Crawling..."

I've tried many times on different days, different servers, using different domain names, different log file names, switches such as 2>&1, all to no avail.

Any ideas?

--Daniel

languagedan avatar Aug 07 '18 02:08 languagedan

Hmm. Does your linkcheck www.nobleprog.co.uk (without the > list.log pipe) ever finish?

I just tried this for a while and:

  • linkcheck www.nobleprog.co.uk runs fine, although I didn't wait long enough to see it finish
  • linkcheck www.nobleprog.co.uk > list.log does only add "Crawling..." at first, but when I Ctrl-C the linkcheck process, it will correctly spit out output.

The confusing thing about this is that linkcheck first crawls everything and only then returns output. In a normal terminal, that is okay because you still see the numbers going up. But when linkcheck detects that it's being run in batch mode (no ANSI), it will just print out "Crawling..." and then seemingly stop operating.
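To illustrate the batch-mode behaviour described above, here is a minimal Python sketch of how a command-line tool can choose between a live counter and a single status line, depending on whether its output is a terminal. This is not linkcheck's actual code (linkcheck is written in Dart), and the function name `report_progress` is hypothetical.

```python
import sys


def report_progress(count, stream=sys.stdout):
    """Print a live counter on a terminal, or one static line otherwise."""
    if stream.isatty():
        # Interactive terminal: rewrite the same line so the number ticks up.
        stream.write(f"\rCrawling: {count}")
    elif count == 0:
        # Redirected to a file or pipe: print the status once, then stay quiet.
        stream.write("Crawling...\n")
    stream.flush()


if __name__ == "__main__":
    for n in range(3):
        report_progress(n)
    print()
```

Run directly in a terminal, this shows the counter updating in place. Run with `> list.log`, the file receives only "Crawling..." until the program writes its final report, which matches the symptom in the original post.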

That said, there can definitely also be a bug. Would you mind running linkcheck --debug www.nobleprog.co.uk > list.log, please? Also, please run linkcheck --version and tell me what version you're on.

Thanks for the report!

filiph avatar Aug 07 '18 04:08 filiph

Hi Filip,

Thanks for the quick response.

"Does your linkcheck www.nobleprog.co.uk (without the > list.log pipe) ever finish?"

Yesterday, it finished fine. Today, I got: "Crawling: 31489Killed". No report was output to the console. Running time was 58 min.

I ran 'linkcheck --debug www.nobleprog.co.uk > list.log' but after about 40 min, the process was again killed. Unfortunately, I lost the Kill message after running 'cat' on the output (which was extremely long). The list.log is over 530 MB. Unable to copy it out of the Linux server. What should I be looking for in there?

I'm using linkcheck version 2.0.4. (stable).

languagedan avatar Aug 08 '18 03:08 languagedan

Could you do tail -n5000 list.log > tail.log and send that?

I fear that the process is running out of memory or being killed for some other resource reason. For reasons of speed, the tool assumes you have enough memory to hold all its information. With a huge site and/or a smaller box, I could see it simply running out at some point.
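One way to test the "killed for a resource reason" theory is to look at how the process ended. The sketch below is an illustration in Python under stated assumptions, not part of linkcheck: it runs a command with stdout redirected to a file and reports whether the process exited normally or was terminated by a signal. On Linux, a bare "Killed" message from the shell usually corresponds to SIGKILL, which is the signal the kernel's out-of-memory killer sends. The helper name `run_and_report` is hypothetical.

```python
import signal
import subprocess
import sys


def run_and_report(cmd, log_path):
    """Run cmd with stdout redirected to log_path; describe how it ended."""
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log)
    if result.returncode < 0:
        # On POSIX, a negative return code means "terminated by signal N".
        sig = signal.Signals(-result.returncode)
        return f"killed by {sig.name}"
    return f"exited with status {result.returncode}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_exit.py linkcheck www.nobleprog.co.uk
    print(run_and_report(sys.argv[1:], "list.log"))
```

If this prints "killed by SIGKILL", the kernel log on the server is the next place to look for an out-of-memory entry.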

Btw, are you able to run this on a localhost version of the site? That would make things much quicker and probably easier to debug.

filiph avatar Aug 08 '18 22:08 filiph

Here is the tail.log.

tail.log

Unfortunately, I'm unable to run this on the production server. I will try to run the dev version of dart to see if that makes any difference.

languagedan avatar Aug 09 '18 03:08 languagedan

I ran the command again today: linkcheck www.nobleprog.co.uk > linkcheck_2018-08-09_uk.log

This is what was output to the console:

INTERNAL ERROR: Sorry! Please open https://github.com/filiph/linkcheck/issues/new in your favorite browser and copy paste the following output there:

Bad state: No element
INTERNAL ERROR: Sorry! Please open https://github.com/filiph/linkcheck/issues/new in your favorite browser and copy paste the following output there:

Bad state: No element
Killed

Any thoughts?

--Daniel

languagedan avatar Aug 09 '18 06:08 languagedan