Feedback On New Output Formats
I was thinking of two new output modes.
1. Tree view (--format tree)
Display results as a hierarchical tree structure starting from the root directory.
Features:
- Show files in a tree layout
- Display count of broken links per directory/file
- Show broken links inline (with verbosity control)
- Hierarchical visualization of link check results
Example output (only show directories/files with broken links by default):
.
├── foo (2 broken)
│   └── bar.md (2 broken)
│       ├── ✗ https://example.com/404
│       └── ✗ https://example.org/gone
└── qux/quux.md (1 broken)
    └── ✗ https://dead-link.com
With increased verbosity (-v):
.
├── foo (2 broken)
│   ├── bar.md (2 broken)
│   │   ├── ✗ https://example.com/404
│   │   └── ✗ https://example.org/gone
│   └── baz.md (0 broken)
└── qux (1 broken)
    └── quux.md (1 broken)
        └── ✗ https://dead-link.com
With increased verbosity (-vv):
.
├── foo (2 broken, 5 total)
│   ├── bar.md (2 broken, 3 total)
│   │   ├── ✓ https://example.com
│   │   ├── ✗ https://example.com/404 [404 Not Found]
│   │   └── ✗ https://example.org/gone [410 Gone]
│   └── baz.md (0 broken, 2 total)
│       ├── ✓ https://github.com/lycheeverse/lychee
│       └── ✓ https://rust-lang.org
└── qux (1 broken, 1 total)
    └── quux.md (1 broken, 1 total)
        └── ✗ https://dead-link.com [Connection timeout]
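To make the tree view more concrete, here is a rough sketch of how the per-directory counts and the box-drawing layout could be derived from a plain mapping of file paths to broken links. It is written in Python for brevity and is not lychee's implementation; the names `build_tree`, `count_broken`, `render` and `render_tree` are hypothetical, and the input shape is an assumption. The output is closest to the -v example above: it does not collapse single-child directories, and files with an empty list show up as "(0 broken)".

```python
def build_tree(broken_by_file):
    """Nest {"foo/bar.md": [url, ...]} into dicts keyed by path component.

    Directories become dicts; files become lists of broken URLs.
    """
    root = {}
    for path, links in broken_by_file.items():
        node = root
        parts = path.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = list(links)
    return root


def count_broken(node):
    """Count broken links in a file (list) or recursively in a directory (dict)."""
    if isinstance(node, list):
        return len(node)
    return sum(count_broken(child) for child in node.values())


def render(node, prefix=""):
    """Return the tree lines below `node`, using box-drawing connectors."""
    if isinstance(node, list):
        # A file: its children are the broken links themselves.
        entries = [(f"✗ {url}", None) for url in node]
    else:
        # A directory: children are sorted by name and annotated with counts.
        entries = [
            (f"{name} ({count_broken(child)} broken)", child)
            for name, child in sorted(node.items())
        ]
    lines = []
    for i, (label, child) in enumerate(entries):
        last = i == len(entries) - 1
        lines.append(prefix + ("└── " if last else "├── ") + label)
        if child is not None:
            lines.extend(render(child, prefix + ("    " if last else "│   ")))
    return lines


def render_tree(broken_by_file):
    """Render the whole result set, starting from the root directory."""
    return "\n".join(["."] + render(build_tree(broken_by_file)))
```

Each nesting level adds four characters of prefix, so the width of a line grows with the depth of the folder structure.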
2. Link view (--format links)
Print a unique set of broken links (similar to --dump, but after checking).
Features:
- List each unique broken link once
- Show all files where that link appears
- Inverse of the current default output (grouped by link instead of by file)
Example output:
https://example.com/broken [404 Not Found]
├── foo/bar.md:12
└── qux/quux.md:45
https://example.org/404 [404 Not Found]
└── foo/bar.md:67
https://dead-link.com [Connection timeout]
├── foo/bar.md:89
├── foo/baz.md:23
└── qux/quux.md:5
It could also be sorted by frequency:
https://dead-link.com [Connection timeout] (3 occurrences)
├── foo/bar.md:89
├── foo/baz.md:23
└── qux/quux.md:5
https://example.com/broken [404 Not Found] (2 occurrences)
├── foo/bar.md:12
└── qux/quux.md:45
https://example.org/404 [404 Not Found] (1 occurrence)
└── foo/bar.md:67
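The link view is essentially an inversion of the per-file results. A minimal sketch of that grouping, including the frequency sort, could look like the following. It assumes the check results are available as `(file, line, url, status)` tuples, which is an invented shape for illustration and not lychee's internal data model; `group_by_link` and `render_links` are hypothetical names.

```python
from collections import defaultdict


def group_by_link(broken):
    """Group (file, line, url, status) tuples by link, most frequent first."""
    grouped = defaultdict(list)
    for file, line, url, status in broken:
        grouped[(url, status)].append((file, line))
    # Sort by descending occurrence count, then by URL for a stable order.
    return sorted(grouped.items(), key=lambda item: (-len(item[1]), item[0][0]))


def render_links(broken):
    """Render each unique broken link once, followed by its locations."""
    lines = []
    for (url, status), locations in group_by_link(broken):
        n = len(locations)
        noun = "occurrence" if n == 1 else "occurrences"
        lines.append(f"{url} [{status}] ({n} {noun})")
        locations = sorted(locations)
        for i, (file, line) in enumerate(locations):
            branch = "└── " if i == len(locations) - 1 else "├── "
            lines.append(f"{branch}{file}:{line}")
    return "\n".join(lines)
```

Keying on the `(url, status)` pair means the same URL failing with two different statuses would be listed twice; keying on the URL alone is an equally reasonable choice.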
Feedback
Should we implement these formats?
One caveat: toy examples always look nice, of course, but in reality links or path names might get quite long. I still think it would be nice to support these formats because the data is presented in a different way, which might help with troubleshooting larger websites with many internal links. (That was the initial motivation, as I was checking the links from the Rust docs, specifically the Rustonomicon.)
@mre That's a really cool idea! Especially the tree view sounds really useful. I feel like the link view is not that useful (at least compared to the tree view) in most cases since it's not too common to have the same links across many files.
Maybe I'm a bit biased, but I noticed that there are a lot of duplicate links in internal documentation such as wikis and tool documentation. The reason might be that many pages get auto-generated and use the same base template, so the navigation links and the footer links are often the same. And these are links which ideally shouldn't break because that would break the page navigation, so maybe there's still some merit in showing a list of (unique) links and the number of occurrences?
I like the link view because it enables a new workflow. You can see the most common broken links and fix all occurrences at once with a find+replace.
I am on the fence about the tree view for two reasons. One, it will get very indented if the folder structure is deep. And two, I don't know if it gains anything over the existing output, which is already grouped by file path. We could make the current output format depth-first and sorted, and I think it would be similar to this tree view.
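As a small illustration of the "depth-first and sorted" idea: sorting file paths by their components, rather than as plain strings, yields the same order in which a name-sorted tree would be walked. This is only a sketch in Python with a hypothetical helper name (`depth_first_order`); it says nothing about how lychee currently orders its output.

```python
from pathlib import PurePosixPath


def depth_first_order(paths):
    """Sort paths component by component, matching a name-sorted tree walk."""
    return sorted(paths, key=lambda p: PurePosixPath(p).parts)


paths = ["foo/a.md", "foo/a/z.md", "foo/a-b.md"]

# Component-wise: the directory "a" sorts before the files "a-b.md" and "a.md".
print(depth_first_order(paths))
# Plain string sort: "foo/a/z.md" lands last, because "/" sorts after "-" and ".".
print(sorted(paths))
```

With that ordering, the flat per-file output lists files in the same sequence as the tree view, just without the indentation.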
One thing I don't like about the existing file output is that the paths can get really long and start to feel noisy. But you helped me understand that the same problem still exists in a tree view in the form of nesting. Yet, the tree view might be easier to parse visually due to its hierarchical nature. Personally, I mostly reason about paths relative to the current working directory and not in an absolute sense, as a path starting from the root. How about you?