fix: Include untracked files with --gitignore option
The current --gitignore option correctly ignores files which are "ignored" by git, but the --ignore option to git ls-files does not include untracked files in its output.
These can be detected using git ls-files --others --exclude-standard.
Combine the two calls into a single deduplicated gitignore list to ignore all files properly.
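A minimal sketch of the combining step described above, not the code from this PR. The function names are hypothetical, and the arguments of the first git ls-files call are an assumption about how ignored files are listed today; only the second invocation is quoted from the description.

```rust
use std::collections::BTreeSet;
use std::io;
use std::path::PathBuf;
use std::process::Command;

/// Run `git ls-files` with the given arguments and return its stdout.
/// (Hypothetical helper; paths are assumed to be valid UTF-8 and unquoted.)
fn git_ls_files(args: &[&str]) -> io::Result<String> {
    let output = Command::new("git").arg("ls-files").args(args).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Merge two newline-separated path listings into one deduplicated set.
fn merge_path_lists(first: &str, second: &str) -> BTreeSet<PathBuf> {
    first
        .lines()
        .chain(second.lines())
        .filter(|line| !line.is_empty())
        .map(PathBuf::from)
        .collect()
}

/// Paths that git ignores plus paths that git does not track.
fn ignored_and_untracked() -> io::Result<BTreeSet<PathBuf>> {
    // Assumption: this is how the ignored files are listed.
    let ignored = git_ls_files(&["--ignored", "--others", "--exclude-standard"])?;
    // Quoted from the description: untracked files that are not ignored.
    let untracked = git_ls_files(&["--others", "--exclude-standard"])?;
    Ok(merge_path_lists(&ignored, &untracked))
}
```

Collecting into a set is what removes duplicates, so a path reported by both calls is only skipped once.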
Thanks for the PR. I am a bit hesitant to merge it directly. Technically it all looks fine, but I wonder whether it would overcomplicate things. Is there a real-world use case or need for this? Sooner or later, untracked files in a git repo are committed, deleted, or added to gitignore anyway, aren't they?
Fair question! :)
We had an issue in https://github.com/bitcoin/bitcoin/issues/30496 where a dirty cache hit was sometimes restoring a .pyenv/README.md file, which included a broken markdown link, causing a CI failure.
Of course, this should not really happen, and we have already fixed it by using out-of-tree python builds, but IMO it makes sense for a --gitignore option to ignore all files that are either untracked or ignored by git, I think?
Hm... got it. Still not 100% convinced. I can already think of others having the exact opposite use case. For example, they generate md or html files in the CI pipeline and want to check the output with mlc. I guess they would then be confused if mlc only checked files which are staged.
Might it work better for you if I implemented a new, separate --git-untracked flag (or similar), so that the two were independent?
Maybe even '--gitignoreuntracked', or the opposite such as --gitracked, which would then only check tracked links, would (maybe) make more sense for naming the flag?
If you really see a need for this, I would be OK with merging it if you put the logic behind a separate flag without changing the original --gitignore flag.
Ok I have updated this to be behind its own flag --gituntracked.
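A rough sketch of how two independent flags could compose, assuming each flag contributes its own set of paths to skip. The struct, field, and function names are hypothetical and are not taken from mlc's source.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical option set; the field names mirror the flags in this thread.
#[derive(Debug, Default, Clone, Copy)]
struct GitOptions {
    /// --gitignore: skip files that git ignores.
    gitignore: bool,
    /// --gituntracked: skip files that git does not track.
    gituntracked: bool,
}

/// Decide whether `path` should be skipped, consulting each list only
/// when its flag is set.
fn should_skip(
    path: &Path,
    opts: GitOptions,
    ignored: &BTreeSet<PathBuf>,
    untracked: &BTreeSet<PathBuf>,
) -> bool {
    (opts.gitignore && ignored.contains(path))
        || (opts.gituntracked && untracked.contains(path))
}
```

With only --gitignore set, files generated in a CI pipeline are still checked, which keeps the existing behaviour for that use case.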
I think the CI failure is spurious and will pass on a re-run?
left: Failed("Http(s) request failed. error sending request for url (http://gitlab.com/becheran/mlc)")
Verified that the CI passes for me locally:
running 77 tests
test link_extractors::markdown_link_extractor::tests::code_block ... ok
test link_extractors::html_link_extractor::tests::no_link ... ok
test link_extractors::html_link_extractor::tests::commented ... ok
test link_extractors::markdown_link_extractor::tests::escaped_code_block ... ok
test link_extractors::markdown_link_extractor::tests::commented_link ... ok
test link_extractors::markdown_link_extractor::tests::html_code_block ... ok
test link_extractors::markdown_link_extractor::tests::image_reference ... ok
test link_extractors::markdown_link_extractor::tests::inline_code ... ok
test link_extractors::markdown_link_extractor::tests::inline_link_This_is_a_short_link__less_thanhttp_colon_slash_slashexample_full_stopnet_slash_greater_than_22 ... ok
test link_extractors::markdown_link_extractor::tests::inline_link__less_thanhttp_colon_slash_slashexample_full_stopnet_slash_greater_than_1 ... ok
test link_extractors::markdown_link_extractor::tests::inline_no_link ... ok
test link_extractors::markdown_link_extractor::tests::link_escaped ... ok
test link_extractors::markdown_link_extractor::tests::link_in_code_block ... ok
test link_extractors::markdown_link_extractor::tests::link_in_headline ... ok
test link_extractors::markdown_link_extractor::tests::link_near_inline_code ... ok
test link_extractors::markdown_link_extractor::tests::link_no_title ... ok
test link_extractors::markdown_link_extractor::tests::link_very_near_inline_code ... ok
test link_extractors::markdown_link_extractor::tests::link_with_title ... ok
test link_extractors::markdown_link_extractor::tests::nested_links ... ok
test link_extractors::markdown_link_extractor::tests::referenced_link ... ok
test link_extractors::markdown_link_extractor::tests::referenced_link_tag_only ... ok
test link_extractors::markdown_link_extractor::tests::no_link_colon ... ok
test link_extractors::markdown_link_extractor::tests::referenced_link_no_tag_only ... ok
test link_validator::file_system::test::remove_dot ... ok
test link_extractors::html_link_extractor::tests::space ... ok
test link_extractors::html_link_extractor::tests::links__less_thana_hreflang_equal_double_quoteen_double_quote_href_equal_double_quotehttps_colon_slash_slashwww_full_stopw3schools_full_stopcom_double_quote_greater_thanVisit_W3Schools_full_stopcom_exclamation_less_than_slasha_greater_than_1_1 ... ok
test link_extractors::html_link_extractor::tests::url_encoded_path ... ok
test link_extractors::markdown_link_extractor::tests::html_link_ident ... ok
test link_extractors::markdown_link_extractor::tests::raw_html_issue_31 ... ok
test link_extractors::html_link_extractor::tests::links__less_thana_href_equal_double_quotehttps_colon_slash_slashwww_full_stopw3schools_full_stopcom_double_quote_greater_thanVisit_W3Schools_full_stopcom_exclamation_less_than_slasha_greater_than_1_1 ... ok
test link_extractors::html_link_extractor::tests::links__less_than_exclamation_minus_minuscomment_minus_minus_greater_than_less_thana_href_equal_double_quotehttps_colon_slash_slashwww_full_stopw3schools_full_stopcom_double_quote_greater_thanVisit_W3Schools_full_stopcom_exclamation_less_than_slasha_greater_than_1_15 ... ok
test link_extractors::html_link_extractor::tests::links__less_thana_href__equal____double_quotehttps_colon_slash_slashwww_full_stopw3schools_full_stopcom_double_quote_greater_than_Visit_W3Schools_full_stopcom_exclamation__less_than_slasha_greater_than_1_1 ... ok
test link_extractors::markdown_link_extractor::tests::html_link_no_target ... ok
test link_extractors::markdown_link_extractor::tests::html_link_new_line ... ok
test link_extractors::markdown_link_extractor::tests::html_link_with_target ... ok
test link_validator::link_type::tests::ftp_link_types_ftp_colon_slash_slashmueller_colon12345_atftp_full_stopdownloading_full_stopch ... ok
test link_validator::link_type::tests::http_link_types_http_colon_slash_slashwww_full_stopwebsite_full_stopphp ... ok
test link_validator::link_type::tests::test_file_system_link_types_C_colon_back_slashtraditional_back_slashpaths ... ok
test link_validator::link_type::tests::test_file_system_link_types_D_colon_back_slashProgram_Files_left_paranthesisx86_right_paranthesis_back_slashfile_full_stoplog ... ok
test link_validator::link_type::tests::test_file_system_link_types_D_colon_back_slashProgram_Files_left_paranthesisx86_right_paranthesis_back_slashfolder_back_slashfile_full_stoplog ... ok
test link_validator::link_type::tests::test_file_system_link_types__back_slash_back_slashsmb_right_brace_back_slashpaths ... ok
test link_validator::link_type::tests::test_file_system_link_types_F_colon_slashfake_slashwindows_slashpaths ... ok
test link_validator::link_type::tests::test_file_system_link_types__back_slashfile_full_stopext ... ok
test link_validator::link_type::tests::test_file_system_link_types__full_stop_back_slashfile_full_stopmd ... ok
test link_validator::link_type::tests::test_file_system_link_types__full_stop_full_stop_back_slashupper_dir_full_stopmdc ... ok
test link_validator::link_type::tests::test_file_system_link_types__full_stop_slashfile_full_stopext ... ok
test link_validator::link_type::tests::test_file_system_link_types__full_stop_full_stop_slashupper_dir_full_stopmd ... ok
test link_validator::link_type::tests::test_file_system_link_types_file_colon_slash_slash_slashsome_slashpath_slash ... ok
test link_validator::link_type::tests::test_file_system_link_types_path ... ok
test link_validator::link_type::tests::http_link_types_https_colon_slash_slashdoc_full_stoprust_minuslang_full_stoporg_full_stophtml ... ok
test link_validator::mail::tests::invalid_mail_links_mailto_colon_slash_slash_atbar_atbar ... ok
test link_validator::mail::tests::invalid_mail_links_mailto_colonfoo_atl_astname_full_stopcOM ... ok
test link_validator::mail::tests::invalid_mail_links_mailto_colon_slash_slashfoobar_full_stopcom ... ok
test link_validator::mail::tests::invalid_mail_links_mailto_colon_slash_slashfoo_full_stoplastname_full_stopcom ... ok
test link_validator::mail::tests::invalid_mail_links_mailto_colonfoo_full_stopdo_atl_dollarastname_full_stopcOM ... ok
test link_validator::mail::tests::mail_links_bla_full_stopbla_atweb_full_stopde ... ok
test link_validator::mail::tests::mail_links_mailto_colon_slash_slashfoo_full_stoplastname_atbar_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colon_slash_slashfoo_plus_atbar_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colon_slash_slashtst_atxyz_full_stopus ... ok
test link_validator::mail::tests::mail_links_mailto_colonBlA_full_stopbLa_full_stopext_atweb_full_stopde ... ok
test link_validator::mail::tests::mail_links_mailto_colon_slash_slash_plusbar_atbar_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colonbla_full_stopbla_full_stopext_atweb_full_stopde ... ok
test link_validator::mail::tests::mail_links_mailto_colonbla_full_stopbla_atweb_full_stopde ... ok
test link_validator::mail::tests::mail_links_mailto_colon_exclamation_hash_dollar_percent_ampercand_quote_asterisk_plus_minus_slash_equal_questionmark_caret__backtick_left_brace_vertical_bar_right_brace_tilde_minusfoo_atfoobar_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colonfoo_minusbar_atfoobar_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colonsome_athostnumbers123_full_stopcom ... ok
test link_validator::mail::tests::mail_links_mailto_colonsome_athost_minusname_full_stopcom ... ok
test link_validator::http::test::check_wrong_http_request ... ok
test link_validator::http::test::check_https_crates_io_available ... ok
test link_validator::http::test::check_http_is_available ... ok
test link_validator::http::test::check_http_is_redirection_failure ... ok
test link_validator::http::test::check_http_is_redirection ... ok
test link_validator::http::test::check_http_redirection_do_warn_if_ignored_mismatch ... ok
test link_validator::http::test::check_http_redirection_do_not_warn_if_ignored_star_pattern ... ok
test link_validator::http::test::check_http_request_redirection_with_hash ... ok
test link_validator::http::test::check_http_request_with_hash ... ok
test link_validator::http::test::check_http_redirection_do_not_warn_if_ignored ... ok
test result: ok. 77 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.05s
Running `/home/will/src/mlc/target/debug/deps/mlc-965c11e62faf1c7d`
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running `/home/will/src/mlc/target/debug/deps/end_to_end-06e2817e47cbc5fc`
running 2 tests
test end_to_end_different_root ... ok
test end_to_end ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.53s
Running `/home/will/src/mlc/target/debug/deps/file_traversal-0d0647f95b086ba9`
running 2 tests
test empty_folder ... ok
test find_markdown_files ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running `/home/will/src/mlc/target/debug/deps/markdown_files-58f1419950840e67`
running 2 tests
test no_links ... ok
test some_links ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running `/home/will/src/mlc/target/debug/deps/throttle-33621581b6d6b846`
running 3 tests
test throttle_different_hosts ... ok
test throttle_same_ip ... ok
test throttle_same_hosts ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.94s
Doc-tests mlc
Running `rustdoc --edition=2018 --crate-type lib --crate-name mlc --test src/lib.rs --test-run-directory /home/will/src/mlc -L dependency=/home/will/src/mlc/target/debug/deps -L dependency=/home/will/src/mlc/target/debug/deps -L native=/home/will/src/mlc/target/debug/build/openssl-sys-7deea5c42e56b7e5/out/openssl-build/install/lib --extern async_std=/home/will/src/mlc/target/debug/deps/libasync_std-f4199bb1f8f0e555.rlib --extern clap=/home/will/src/mlc/target/debug/deps/libclap-e95e12cca3d996ac.rlib --extern colored=/home/will/src/mlc/target/debug/deps/libcolored-7f78aa5eb0728bee.rlib --extern criterion=/home/will/src/mlc/target/debug/deps/libcriterion-a4ba2fae80259bf0.rlib --extern futures=/home/will/src/mlc/target/debug/deps/libfutures-cdd1337f01633f6b.rlib --extern lazy_static=/home/will/src/mlc/target/debug/deps/liblazy_static-4518ae97c2d4cb4e.rlib --extern log=/home/will/src/mlc/target/debug/deps/liblog-98048c77c908114d.rlib --extern mlc=/home/will/src/mlc/target/debug/deps/libmlc-ad57c5313c31bae9.rlib --extern ntest=/home/will/src/mlc/target/debug/deps/libntest-be800586001dcdbc.rlib --extern pulldown_cmark=/home/will/src/mlc/target/debug/deps/libpulldown_cmark-a117bae152895062.rlib --extern regex=/home/will/src/mlc/target/debug/deps/libregex-ace53665dad0d67f.rlib --extern reqwest=/home/will/src/mlc/target/debug/deps/libreqwest-f1dcceddaf48302d.rlib --extern serde=/home/will/src/mlc/target/debug/deps/libserde-39b33fb8918c6596.rlib --extern simplelog=/home/will/src/mlc/target/debug/deps/libsimplelog-27d00465f6456821.rlib --extern tokio=/home/will/src/mlc/target/debug/deps/libtokio-0fc28e65821446bc.rlib --extern toml=/home/will/src/mlc/target/debug/deps/libtoml-bf1bfe9e0363f570.rlib --extern url=/home/will/src/mlc/target/debug/deps/liburl-eff0512c9509feb8.rlib --extern url_escape=/home/will/src/mlc/target/debug/deps/liburl_escape-9d14a76e1088960c.rlib --extern walkdir=/home/will/src/mlc/target/debug/deps/libwalkdir-cbd159030c9dea21.rlib --extern wildmatch=/home/will/src/mlc/target/debug/deps/libwildmatch-beb28ad7e489c09e.rlib -C embed-bitcode=no --check-cfg 'cfg(docsrs)' --check-cfg 'cfg(feature, values())' --error-format human`
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Thanks @becheran.
Do you think you could tag a new release with this feature in? We currently use mlc by fetching a tagged binary.