Show filename when searching across multiple files with zq
tl;dr
In a directory containing multiple files ending in .log, a user executes a search like:
zq -i line '"test"' *.log
Alongside each search result the user would like a way to also display the name of the file each result came from.
Details
Repro is with Zed commit c39086b.
This issue was originally surfaced in a community Slack thread. In the user's own words:
Is there a way when searching across a glob pattern for multiple files in a directory such as *.log to have the file the search result came from also listed? For example
zq -i line '"test"' *.log
similar to
grep -rnwio . -e "test"
which would list the file and the containing string. I had a thought that using from might have gotten me there but not the right usage.
@mattnibs acknowledged to the user that we don't currently have a way of doing this and for now recommended using tools at the shell to bridge the gap. For example, using the zed-sample-data, this shows the baseline problem:
$ zq -version
Version: v1.17.0-11-gc39086ba
$ zq -i line '"thinkwithgoogle"' *.log.gz
"1521912845.237311\t144c918fa2aca4461d3535a237d311cb5102c1919096e0fa9b73ab95af4876fc\t3\t08434F2704007BF2\tCN=*.appspot.com,O=Google Inc,L=Mountain View,ST=California,C=US\tCN=Google Internet Authority G3,O=Google Trust Services,C=US\t1520451204.000000\t1527706320.000000\trsaEncryption\tsha256WithRSAEncryption\trsa\t2048\t65537\t-\t*.appspot.com,*.thinkwithgoogle.com,*.withgoogle.com,*.withyoutube.com,appspot.com,thinkwithgoogle.com,withgoogle.com,withyoutube.com\t-\t-\t-\tF\t-\tT\tF"
And here's the recommended approach from @mattnibs working as intended:
$ find . -name "*.log.gz" | xargs -I {} zq -i line '"thinkwithgoogle" | {file:"{}",value:this}' {}
{file:"./x509.log.gz",value:"1521912845.237311\t144c918fa2aca4461d3535a237d311cb5102c1919096e0fa9b73ab95af4876fc\t3\t08434F2704007BF2\tCN=*.appspot.com,O=Google Inc,L=Mountain View,ST=California,C=US\tCN=Google Internet Authority G3,O=Google Trust Services,C=US\t1520451204.000000\t1527706320.000000\trsaEncryption\tsha256WithRSAEncryption\trsa\t2048\t65537\t-\t*.appspot.com,*.thinkwithgoogle.com,*.withgoogle.com,*.withyoutube.com,appspot.com,thinkwithgoogle.com,withgoogle.com,withyoutube.com\t-\t-\t-\tF\t-\tT\tF"}
The user confirmed this solution should be workable for now.
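The same per-file idea can also be wrapped in a small shell function for repeated use. This is only a sketch: zq_with_filename is a hypothetical helper name, not part of Zed, and it relies solely on the zq invocation and record expression shown in the workaround above. It assumes file names contain no double quotes or backslashes, since each name is interpolated directly into the query string.

```shell
# Hypothetical helper (not part of Zed): run the same zq search once per
# file so that the file name can be embedded in each result.
# Assumption: file names contain no double quotes or backslashes.
zq_with_filename() {
  pattern=$1
  shift
  for f in "$@"; do
    # Same query shape as the find/xargs workaround: wrap each matching
    # value in a record that also carries the file name.
    zq -i line "\"$pattern\" | {file:\"$f\",value:this}" "$f"
  done
}

# Example use (the shell expands the glob before the function runs):
#   zq_with_filename thinkwithgoogle *.log.gz
```

Like the find/xargs version, this starts one zq process per file, so it trades some speed for the extra field.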
In terms of how we might address this more directly in the future, @mattnibs offered the following thoughts:
Maybe a solution would be to add globbing to the file source operator much as we do for pool sources then maybe have some flag that lets you put the source name/details on each value produced from the source.
When talking about decorating each value in the source maybe you have a -each flag that accepts a function where the first argument is the this value and the second is an info record describing the source and the result of the function would be the new value. So you could do something like:
func describe(value, info): ( { value, info } )
file -each=describe *.log
Discussing this reminded us all of another issue, https://github.com/brimdata/zui/issues/2931, where a user asked about doing something similar with from * and wanted to see the name of the pool each result came from.