[clang-format] clang-format-ignore: Add support for double asterisk patterns
Adds ** support to .clang-format-ignore, enabling pattern matching similar to .gitignore
closes #110160
@llvm/pr-subscribers-clang-format
Author: Ameer J (ameerj)
Full diff: https://github.com/llvm/llvm-project/pull/110560.diff
2 Files Affected:
- (modified) clang/lib/Format/MatchFilePath.cpp (+20-2)
- (modified) clang/unittests/Format/MatchFilePathTest.cpp (+34)
diff --git a/clang/lib/Format/MatchFilePath.cpp b/clang/lib/Format/MatchFilePath.cpp
index 062b334dcdd8fd..3d7861499b9a46 100644
--- a/clang/lib/Format/MatchFilePath.cpp
+++ b/clang/lib/Format/MatchFilePath.cpp
@@ -49,11 +49,29 @@ bool matchFilePath(StringRef Pattern, StringRef FilePath) {
return false;
break;
case '*': {
- while (++I < EOP && Pattern[I] == '*') { // Skip consecutive stars.
+ if (I + 1 < EOP && Pattern[I + 1] == '*') {
+ // Handle '**' pattern
+ while (++I < EOP && Pattern[I] == '*') { // Skip consecutive stars.
+ }
+ if (I == EOP)
+ return true; // '**' at the end matches everything
+ if (Pattern[I] == Separator) {
+ // Try to match the rest of the pattern without consuming the
+ // separator for the case where we want to match "zero" directories
+ // e.g. "a/**/b" matches "a/b"
+ if (matchFilePath(Pattern.substr(I + 1), FilePath.substr(J)))
+ return true;
+ }
+ while (J < End) {
+ if (matchFilePath(Pattern.substr(I), FilePath.substr(J)))
+ return true;
+ ++J;
+ }
+ return false;
}
const auto K = FilePath.find(Separator, J); // Index of next `Separator`.
const bool NoMoreSeparatorsInFilePath = K == StringRef::npos;
- if (I == EOP) // `Pattern` ends with a star.
+ if (++I == EOP) // `Pattern` ends with a star.
return NoMoreSeparatorsInFilePath;
// `Pattern` ends with a lone backslash.
if (Pattern[I] == '\\' && ++I == EOP)
diff --git a/clang/unittests/Format/MatchFilePathTest.cpp b/clang/unittests/Format/MatchFilePathTest.cpp
index 28f665635718e5..a6df090a802128 100644
--- a/clang/unittests/Format/MatchFilePathTest.cpp
+++ b/clang/unittests/Format/MatchFilePathTest.cpp
@@ -164,6 +164,40 @@ TEST_F(MatchFilePathTest, Path) {
EXPECT_FALSE(match("foo\\", R"(foo*\)"));
}
+TEST_F(MatchFilePathTest, DoubleAsterisk) {
+ EXPECT_TRUE(match("a/b/c/d.cpp", "**b**"));
+ EXPECT_TRUE(match("a/b/c/d.cpp", "**/b/**"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**d_*"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**/d_*"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**d_**"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**/d_**"));
+
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**/b/c/**"));
+
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "a/**/b/c/d_e.cpp"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "a/**/c/d_e.cpp"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "a/**/b/**/d_e.cpp"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "**/b/**/d_e.cpp"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "a/**/**/b/**"));
+
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "**/d"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "**/b/d"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "**/b/d/**"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "**/b/c/"));
+
+ // Multiple consecutive asterisks are treated as **
+ EXPECT_TRUE(match("a/b/c/d.cpp", "***b****"));
+ EXPECT_TRUE(match("a/b/c/d.cpp", "****/b/***"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "***d_**"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "****/d_*"));
+ EXPECT_TRUE(match("a/b/c/d_e.cpp", "***/b/c/*****"));
+
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "*****/d"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "***/b/d"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "*****/b/d/***"));
+ EXPECT_FALSE(match("a/b/c/d_e.cpp", "***/b/c"));
+}
+
} // namespace
} // namespace format
} // namespace clang
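For readers who want to experiment with the matching rules outside the clang-format tree, the following is a minimal standalone sketch of the idea in the diff above. It is not the code from this PR: `globMatch` is a hypothetical helper, it uses `std::string_view` instead of `StringRef`, it assumes `/` as the separator, and it deliberately leaves out the other syntax `matchFilePath` handles (`?`, bracket expressions, backslash escapes). As in the patch, any run of two or more stars is treated as `**`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical, simplified glob matcher (NOT clang-format's matchFilePath).
// Supported syntax: literal characters, '*', and runs of two or more '*'.
constexpr bool globMatch(std::string_view Pattern, std::string_view Path) {
  constexpr char Separator = '/'; // Assumption: POSIX-style paths only.
  std::size_t I = 0, J = 0;
  while (I < Pattern.size()) {
    if (Pattern[I] != '*') {
      // A literal must match the next path character exactly.
      if (J == Path.size() || Pattern[I] != Path[J])
        return false;
      ++I;
      ++J;
      continue;
    }
    // Count and skip the run of consecutive stars.
    std::size_t Stars = 0;
    while (I < Pattern.size() && Pattern[I] == '*') {
      ++I;
      ++Stars;
    }
    const std::string_view Rest = Pattern.substr(I);
    if (Stars >= 2) {
      if (Rest.empty())
        return true; // Trailing "**" matches everything that is left.
      // "**/" may stand for zero directories, so "a/**/b" matches "a/b".
      if (Rest.front() == Separator &&
          globMatch(Rest.substr(1), Path.substr(J)))
        return true;
      // Otherwise let "**" swallow any prefix, separators included.
      for (std::size_t K = J; K < Path.size(); ++K)
        if (globMatch(Rest, Path.substr(K)))
          return true;
      return false;
    }
    // A lone '*' may swallow characters only up to the next separator.
    for (std::size_t K = J;; ++K) {
      if (globMatch(Rest, Path.substr(K)))
        return true;
      if (K == Path.size() || Path[K] == Separator)
        return false;
    }
  }
  return J == Path.size();
}

// Compile-time checks; note the argument order is (pattern, path).
static_assert(globMatch("a/**/b", "a/b"));
static_assert(globMatch("a/**/b", "a/x/y/b"));
static_assert(globMatch("imported/**", "imported/gen/deep/x.cpp"));
static_assert(globMatch("imported/*", "imported/x.cpp"));
static_assert(!globMatch("imported/*", "imported/gen/x.cpp"));
static_assert(globMatch("**/d_*", "a/b/c/d_e.cpp"));
static_assert(!globMatch("**/d", "a/b/c/d_e.cpp"));
```

The sketch requires C++17 for `std::string_view`. The recursive retry at every offset keeps the code short but is quadratic or worse on adversarial patterns; that trade-off is an artifact of this illustration and says nothing about how the final upstream implementation behaves.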
@mydeveloperday @owenca bump for review feedback
@HazardyKnusperkeks Can I get your eyes on this PR please?
I want to express support for this feature. Just today I attempted to integrate .clang-format-ignore and was really confused by the fact that it cannot ignore recursively. The project I wanted to integrate it into has a lot of generated files scattered all over the place. It's very inconvenient to have to repeat this pattern for every directory I want to ignore:
imported/*
imported/*/*
imported/*/*/*
DisableFormat is also inconvenient, since placing .clang-format files with formatting disabled all over the project is not exactly a solution I would want to go with either.
Just having a single .clang-format-ignore file with imported/** would be a much more convenient way to do things, IMO.
See #121404.
Closing in favor of #121404