Feature request: is it possible to add a query filter on "topics:"

Open hanskr opened this issue 1 year ago • 16 comments

hanskr avatar May 24 '24 06:05 hanskr

This is quite a GitHub-specific feature request. However, a more generic "repository labels" concept might make sense here. For example, we already support filtering by archived/fork/etc., but that is a static list. We could then expose a topic: filter in the query parser.

  • RawConfig is how we do archived/forked/etc https://sourcegraph.com/github.com/sourcegraph/zoekt@df7a7e7162cf7d7af4d4cdde3701c57950830676/-/blob/query/query.go?L40:6-40:15#tab=references
  • I imagine to encode topics we would store a Set (map[string]struct{}) per repo. Probably doesn't need to live in memory per shard given the low amount of data.
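A minimal sketch of the per-repo set idea from the bullet above, assuming nothing about zoekt's actual types: `LabelSet`, `NewLabelSet`, `Has` and `Sorted` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelSet is a hypothetical per-repository set of labels (e.g. GitHub topics).
// It is not an existing zoekt type.
type LabelSet map[string]struct{}

// NewLabelSet builds a set from a list of labels; duplicates collapse.
func NewLabelSet(labels ...string) LabelSet {
	s := make(LabelSet, len(labels))
	for _, l := range labels {
		s[l] = struct{}{}
	}
	return s
}

// Has reports whether label is in the set.
func (s LabelSet) Has(label string) bool {
	_, ok := s[label]
	return ok
}

// Sorted returns the labels in a stable order, which is convenient
// when the set has to be written to disk or shown to a user.
func (s LabelSet) Sorted() []string {
	out := make([]string, 0, len(s))
	for l := range s {
		out = append(out, l)
	}
	sort.Strings(out)
	return out
}

func main() {
	topics := NewLabelSet("search", "golang", "search")
	fmt.Println(topics.Has("golang"), topics.Has("rust"), topics.Sorted())
}
```

Using `struct{}` as the value type means the map stores no payload per label, which fits the "low amount of data" remark.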

FYI I don't think this would get worked on anytime soon, but happy to guide anyone who is motivated to do it.

keegancsmith avatar May 26 '24 08:05 keegancsmith

A generic labels/tags/topics functionality that could be populated by "something" per type of backend (topics for GH; I am not sure whether other code hosts have similar functionality) sounds like a good approach, and would solve my needs fine.

For now, I have worked around it with a script that uses the GH API to translate topics => repo names and constructs a zoekt query with a repo regex.
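The script itself is not shown in this thread, so the following is only a sketch of the idea. It assumes the topic data has already been fetched from the GitHub API into a map, and that repositories are indexed under names like `github.com/acme/api`; `RepoQueryForTopic` and the sample data are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// RepoQueryForTopic returns a "repo:" query fragment whose regex matches
// exactly the repositories that carry the given topic. It returns an
// empty string when no repository has the topic.
func RepoQueryForTopic(topicsByRepo map[string][]string, topic string) string {
	var repos []string
	for repo, topics := range topicsByRepo {
		for _, t := range topics {
			if t == topic {
				// Escape dots and other regex metacharacters in the name.
				repos = append(repos, regexp.QuoteMeta(repo))
				break
			}
		}
	}
	if len(repos) == 0 {
		return ""
	}
	sort.Strings(repos) // deterministic output despite map iteration order
	return "repo:^(" + strings.Join(repos, "|") + ")$"
}

func main() {
	// Hypothetical data; in practice this would come from the GitHub API.
	topicsByRepo := map[string][]string{
		"github.com/acme/api":  {"backend", "golang"},
		"github.com/acme/web":  {"frontend"},
		"github.com/acme/jobs": {"backend"},
	}
	fmt.Println(RepoQueryForTopic(topicsByRepo, "backend") + " TODO")
}
```

The anchors `^` and `$` keep the regex from matching repositories whose names merely contain one of the selected names.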

hanskr avatar May 27 '24 07:05 hanskr

A generic labels/tags/topics feature would be great.

xavier-calland avatar Jul 24 '24 11:07 xavier-calland

@keegancsmith I can try to look at this feature, but I need details on the different modifications to be made. Can you guide me?

xavier-calland avatar Aug 23 '24 08:08 xavier-calland

@xavier-calland

Here is how I would get started

  1. The labels would probably have to live here. Why? For Sourcegraph we have an optimization that updates to metadata don't cause a reindex (see .meta files in the code). I think repository labels shouldn't cause a full reindex.
  2. Indexing: Pick a code host (probably GitHub) and pipe the data from the code host all the way to the builder, which writes the metadata to disk.
  3. Query language: You have to extend the query language. You could add, e.g., a label: filter. Check out query.go and parse.go for inspiration.
  4. Matchtree: The final piece is to use a match tree to skip the repos that don't have the correct label. Check out RepoSet in matchtree.go or any of the other match trees there for inspiration.

EDIT: 5. UI: It would be nice to show the labels in UI, too.
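As a rough illustration of steps 3 and 4, here is a standalone sketch that pulls `label:` filters out of a query string and then skips repositories lacking those labels. It does not use zoekt's real parser or match trees; `Repo`, `SplitLabelFilters` and `KeepRepos` are hypothetical names, and the whitespace-based tokenizing is a deliberate simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// Repo is a hypothetical stand-in for per-repository metadata.
type Repo struct {
	Name   string
	Labels map[string]struct{}
}

// SplitLabelFilters separates "label:x" tokens from the rest of the query.
// A real implementation would extend the query parser instead.
func SplitLabelFilters(q string) (labels []string, rest string) {
	var other []string
	for _, tok := range strings.Fields(q) {
		if strings.HasPrefix(tok, "label:") && len(tok) > len("label:") {
			labels = append(labels, strings.TrimPrefix(tok, "label:"))
		} else {
			other = append(other, tok)
		}
	}
	return labels, strings.Join(other, " ")
}

// KeepRepos returns the repos that carry every requested label,
// so that the remaining query only runs against those.
func KeepRepos(repos []Repo, labels []string) []Repo {
	var kept []Repo
	for _, r := range repos {
		ok := true
		for _, l := range labels {
			if _, has := r.Labels[l]; !has {
				ok = false
				break
			}
		}
		if ok {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	repos := []Repo{
		{Name: "acme/api", Labels: map[string]struct{}{"backend": {}}},
		{Name: "acme/web", Labels: map[string]struct{}{"frontend": {}}},
	}
	labels, rest := SplitLabelFilters("label:backend func main")
	for _, r := range KeepRepos(repos, labels) {
		fmt.Printf("search %q in %s\n", rest, r.Name)
	}
}
```

Filtering at the repository level before any content matching is the same spirit as the RepoSet suggestion: repos without the label are never searched.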

stefanhengl avatar Sep 10 '24 08:09 stefanhengl

Could you assign me to this issue, please? I could work on similar issues in the future.

gogo2464 avatar Apr 08 '25 21:04 gogo2464

I will use this template https://github.com/sourcegraph/zoekt/pull/370/files

gogo2464 avatar Apr 09 '25 09:04 gogo2464

Excuse me, I have not finished yet.

gogo2464 avatar Apr 16 '25 08:04 gogo2464

I still have not finished: https://github.com/sourcegraph/zoekt/pull/939

@hanskr could be a great feature, would you still enjoy it?

gogo2464 avatar May 02 '25 12:05 gogo2464

I am not in the group. Should I register?

gogo2464 avatar May 04 '25 22:05 gogo2464

Sorry, that Linear issue comment is just something we use at Sourcegraph to help us not lose track of things. All issue tracking for zoekt happens on this GitHub issue tracker.

keegancsmith avatar May 05 '25 04:05 keegancsmith

I am confused. I started to contribute. See my PR at https://github.com/sourcegraph/zoekt/pull/939.

Who is assigned to this issue please?

gogo2464 avatar May 05 '25 13:05 gogo2464

I've assigned it to you to make it clear you are interested in implementing this.

keegancsmith avatar May 05 '25 14:05 keegancsmith

Thank you!

gogo2464 avatar Jun 20 '25 22:06 gogo2464

Hello.

Sorry for the BIG BIG delay!

I am currently working on a research publication in a newspaper. I really hope it will be published.

I need zoekt. I will continue the zoekt PR as soon as I can.

gogo2464 avatar Jul 06 '25 13:07 gogo2464