FR: `git grep` like command

Open Vampire opened this issue 1 month ago • 16 comments

Is your feature request related to a problem? Please describe. I often want to find where some regex matches in all versioned files, or in some subset of the versioned files. Sometimes I also want to search through all versioned files in some given commit-ish. With git you can do git grep 'my.*pattern' or git grep 'my.*pattern' v1.2.3 or git grep 'my.*pattern' v1.2.3 docs.

Describe the solution you'd like A command similar to git grep that searches for a pattern in all versioned files, or a subset of them, in a given revision, defaulting to @.

Describe alternatives you've considered For the files in @:

jj file list --template 'path ++ "\n"' docs | xargs -r grep --color 'my.*pattern'

or

jj file list docs | tr '\\' / | xargs -r grep --color 'my.*pattern'

For files in an arbitrary revision, though this is awfully slow:

__R=v1.2.3; jj file list --revision ${__R} --template "if(file_type == 'file', path ++ \"\n\")" docs | xargs -ri sh -c "jj file show --revision $__R {} | grep --color=always 'my.*pattern' | sed 's|^|$__R:{}:|'"; unset __R

Vampire avatar Nov 04 '25 16:11 Vampire

You should use diff_contains(...) and the upcoming variants (#7595) for this use case.

PhilipMetzger avatar Nov 04 '25 16:11 PhilipMetzger

I'm here not interested in commits that contain something in the diff. I'm interested in versioned files that contain something in a given revision. Like git grep, not like git log -S or git log -G.

Vampire avatar Nov 04 '25 17:11 Vampire

I'm interested in versioned files that contain something in a given revision.

So does the regex in revsets not produce what you need? See String patterns.

joyously avatar Nov 04 '25 20:11 joyously

How would that help? Again, I am not interested in selecting a revision from the changes done in it.

I want to search for a string/pattern/regex inside all the versioned files of one revision I specify, like git grep is doing.

Vampire avatar Nov 04 '25 20:11 Vampire

I want to search for a string/pattern/regex inside all the versioned files of one revision

I have never used git grep, but what is the output you expect to see? Is it simply the matching lines? So conceptually, "extract that revision to a temp folder and run grep on it"?

joyously avatar Nov 04 '25 21:11 joyously

So conceptually, "extract that revision to a temp folder and run grep on it"?

Yep. I also miss this in jj constantly.

apoelstra avatar Nov 04 '25 21:11 apoelstra

I have never used git grep, but what is the output you expect to see? Is it simply the matching lines?

Well, that depends on the options supplied. :-D git grep also provides many of the options vanilla grep has, like -v for inverting the match, -F to search literally, -l to list just the names of matching files, -c to output only the count of matches per matching file, and so on.

I guess a respective jj command might not support all these options and that might be ok. 🤷‍♂️

But speaking about the default invocation without such options it is for example like this in the jj repository:

$ git grep 'update-committer-timestamp\|force-rewrite' cli/tests
cli/tests/[email protected]:* `--force-rewrite` — Rewrite the commit, even if no other metadata changed
cli/tests/[email protected]:   $ JJ_USER='Foo Bar' [email protected] jj metaedit --force-rewrite
cli/tests/test_metaedit_command.rs:        .run_jj(["metaedit", "--force-rewrite", "kkmpptxzrspx"])
cli/tests/test_metaedit_command.rs:    insta::assert_snapshot!(work_dir.run_jj(["metaedit", "--force-rewrite"]), @r"
cli/tests/test_metaedit_command.rs:        "--update-committer-timestamp",
cli/tests/test_metaedit_command.rs:        "--force-rewrite",
cli/tests/test_metaedit_command.rs:    error: the argument '--update-committer-timestamp' cannot be used with '--force-rewrite'

$ git grep 'update-committer-timestamp\|force-rewrite' b8fa9ecf9c93fa9fc951c0a2872c5dd189318189 cli/tests
b8fa9ecf9c93fa9fc951c0a2872c5dd189318189:cli/tests/[email protected]:* `--update-committer-timestamp` — Update the committer timestamp
b8fa9ecf9c93fa9fc951c0a2872c5dd189318189:cli/tests/test_metaedit_command.rs:        .run_jj(["metaedit", "--update-committer-timestamp", "kkmpptxzrspx"])

$ jj file list --template 'path ++ "\n"' cli/tests | xargs -r grep --color 'update-committer-timestamp\|force-rewrite'
cli/tests/[email protected]:* `--force-rewrite` — Rewrite the commit, even if no other metadata changed
cli/tests/[email protected]:   $ JJ_USER='Foo Bar' [email protected] jj metaedit --force-rewrite
cli/tests/test_metaedit_command.rs:        .run_jj(["metaedit", "--force-rewrite", "kkmpptxzrspx"])
cli/tests/test_metaedit_command.rs:    insta::assert_snapshot!(work_dir.run_jj(["metaedit", "--force-rewrite"]), @r"
cli/tests/test_metaedit_command.rs:        "--update-committer-timestamp",
cli/tests/test_metaedit_command.rs:        "--force-rewrite",
cli/tests/test_metaedit_command.rs:    error: the argument '--update-committer-timestamp' cannot be used with '--force-rewrite'

$ __R=b8fa9ecf9c93fa9fc951c0a2872c5dd189318189; jj file list --revision $__R --template "if(file_type == 'file', path ++ \"\n\")" cli/tests | xargs -ri sh -c "jj file show --revision $__R {} | grep --color=always 'update-committer-timestamp\|force-rewrite' | sed 's|^|$__R:{}:|'"; unset __R
b8fa9ecf9c93fa9fc951c0a2872c5dd189318189:cli/tests/[email protected]:* `--update-committer-timestamp` — Update the committer timestamp
b8fa9ecf9c93fa9fc951c0a2872c5dd189318189:cli/tests/test_metaedit_command.rs:        .run_jj(["metaedit", "--update-committer-timestamp", "kkmpptxzrspx"])

So conceptually, "extract that revision to a temp folder and run grep on it"?

Conceptually, yes, exactly

Vampire avatar Nov 05 '25 11:11 Vampire
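
The "extract that revision to a temp folder and run grep on it" idea agreed on above can be sketched as a small shell function. This is only an illustration built from the jj file list and jj file show invocations already shown in this thread; jj_grep_tmp is a hypothetical helper name, not an existing jj command, and it spawns one jj process per file, so it will be slow on large trees.

```shell
# Sketch only: jj_grep_tmp is a hypothetical helper, not part of jj.
# Usage: jj_grep_tmp REVISION PATTERN [PATH...]
jj_grep_tmp() {
    local rev pattern tmp status
    rev=$1
    pattern=$2
    shift 2
    tmp=$(mktemp -d) || return 1

    # List the files tracked in the revision (optionally limited to PATH...)
    # and write each one into the temporary directory.
    jj file list --revision "$rev" --template 'path ++ "\n"' "$@" |
    while IFS= read -r path; do
        mkdir -p "$tmp/$(dirname "$path")"
        # stdin is redirected so jj cannot swallow the remaining file list
        jj file show --revision "$rev" "$path" > "$tmp/$path" < /dev/null
    done

    # Run plain grep on the extracted copy; matches are reported as ./path:line:text
    (cd "$tmp" && grep -rn -e "$pattern" .)
    status=$?

    rm -rf "$tmp"
    return $status
}
```

For example, jj_grep_tmp v1.2.3 'my.*pattern' docs would search the docs directory as it existed in v1.2.3 without changing the working copy.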

Here are a bunch of opinions of mine on this:

  • grep is a horrible unixism which the command shouldn't use, jj search is much better.
  • The current suggestion is using something like rg on a jj repository.
  • If we build something like it, it should integrate easily with large monorepos (so an integration point for Kythe or Glean should exist), since those tools cover the code-search part at the respective hyperscaler.
  • I find it highly likely that we won't copy Git's interface here but build our own; Git's CLI isn't something we strive for.

I'm sure Martin knows how Google integrated hg grep with their internal infrastructure and what went well and what learnings to take from it to apply to jj.

PhilipMetzger avatar Nov 05 '25 15:11 PhilipMetzger

I agree with all these opinions -- but also that the current "use rg after checking out a commit" situation has really poor UX, specifically because it makes you check out the commit. But yes, at the same time it should be possible to do much better than git grep.

apoelstra avatar Nov 05 '25 15:11 apoelstra

grep is a horrible unixism which the command shouldn't use, jj search is much better. I find it highly likely that we won't copy Git's interface here but build our own, Git's CLI isn't something we strive for.

Sure, that's why I asked for a git grep like command. It could, for example, search by "String patterns" as present for revsets, or similar. The title was mainly meant to convey the idea of the feature I'd like to see in the future.

I didn't call it jj search because of #5446, which requests a jj search that is not related at all to the functionality requested here: that other ticket targets a git log -S ... adaptation, not a git grep adaptation.

The current suggestion is using something like rg on a jj repository.

Which does not cover my use case. Even if that searches only through the non-ignored files in the working copy, with git grep you can also search through the file tree of any given change, which to my knowledge you cannot achieve with rg.

Vampire avatar Nov 05 '25 15:11 Vampire

How do the existing jj run and jj util exec fit in with this?

joyously avatar Nov 05 '25 15:11 joyously

How do the existing jj util exec fit in with this?

I don't see how jj util exec is related, unless you use it to half-way knit the command yourself as an alias that actually checks out the target revision somewhere and then runs vanilla grep on it.

How do the existing jj run fit in with this?

I don't see any existing jj run, only jj bisect run. If you are talking about the future jj run coming with #1869, that could probably allow doing something like jj run 'sh -c "jj file list --template \"path ++ \\\"\n\\\"\" docs | xargs -r grep --color \"my.*pattern\""', but that would still not be super convenient.

Additionally, as this is not expected to change any files, it would not necessarily need the files on disk like jj run will require, and thus can probably be implemented much more efficiently with explicit support. Also, if it used the jj "String patterns", for example, users might feel more familiar with it.

Vampire avatar Nov 05 '25 16:11 Vampire

I'm sure Martin knows how Google integrated hg grep with their internal infrastructure and what went well and what learnings to take from it to apply to jj.

We didn't :) The hg integration at Google worked very differently. Rather than storing commits in a central database, it would store them in the file system (typically our distributed file system). hg grep would just search for the files that were in the repo's set of tracked paths. We don't have to go into details, but the point is that it's different enough that I don't think we'll learn much from it.

The obvious way to implement it is to walk the whole tree and read the contents for each file. That will not work in practice for Google except when limiting the search to a small directory. If we want to make it fast enough, we'll need to add a new index method. We can always start with the naive implementation.

martinvonz avatar Nov 05 '25 17:11 martinvonz
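
The naive approach described above (walk the whole tree and read the contents of each file) can be approximated from the outside with existing commands. The following is a rough sketch, not the proposed implementation: jj_grep_naive is a hypothetical helper name, it reuses the jj file list and jj file show invocations from earlier in the thread, and it streams each file through grep without writing anything to disk. It reads every file in the selected paths, which is exactly why it would not scale to a very large repository without an index.

```shell
# Sketch only: jj_grep_naive is a hypothetical helper, not part of jj.
# Usage: jj_grep_naive REVISION PATTERN [PATH...]
jj_grep_naive() {
    local rev pattern
    rev=$1
    pattern=$2
    shift 2

    # Walk the files tracked in the revision (optionally limited to PATH...).
    jj file list --revision "$rev" --template 'path ++ "\n"' "$@" |
    while IFS= read -r path; do
        # Read the file content straight from the revision and filter it;
        # stdin is redirected so jj cannot swallow the remaining file list.
        jj file show --revision "$rev" "$path" < /dev/null |
            grep -n -e "$pattern" |
            while IFS= read -r line; do
                # Prefix each match as revision:path:line-number:text
                printf '%s:%s:%s\n' "$rev" "$path" "$line"
            done
    done
}
```

Prefixing with printf rather than sed avoids trouble with paths that contain characters special to sed, at the cost of one more loop per file.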