gptel icon indicating copy to clipboard operation
gptel copied to clipboard

gptel-context: Allow exclusion of gitignored files

Open benthamite opened this issue 9 months ago • 11 comments

Continuing the discussion here, this PR allows the user to exclude, by setting the user option gptel-context-exclude-gitignored to t, gitignored files or directories from the context.

Tagging participants in previous discussion: @hartikainen, @collinarnett, @metachip, @sandinmyjoints.

benthamite avatar Feb 26 '25 14:02 benthamite

(I just noticed that I forgot to set the default value of gptel-context-exclude-gitignored to nil, but now I wonder if we should leave it as t. I don’t have a clear view on this.)

benthamite avatar Feb 26 '25 15:02 benthamite

I would set it to t by default. I tested out the new add directory functionality on a repository with node_modules folder and it froze my emacs adding thousands of files to the context window. Other utilities like RepoMix enable gitignore exclusion by default as well.

collinarnett avatar Feb 26 '25 15:02 collinarnett

Could you move all the git-related functions gptel-context--git-files, gptel-context--skip-file-p, gptel-context--message-git-skipped and gptel-context--build-git-cache into one place in the file? You can add a ;;; git tracking functions header over them for navigation.

Additionally, anywhere you run a file-name function like file-relative-name on a file, please check that the file argument can't be nil. I haven't analyzed all possible cases.

karthink avatar Mar 13 '25 03:03 karthink

locate-dominating-file can be very slow on remote files, to the point of locking up Emacs for a while.

It might be worth using file-remote-p and skipping the gitignore check if it's a remote file.

karthink avatar Mar 13 '25 03:03 karthink

@karthink, I believe I have now addressed all your requested changes. Let me know if you have further comments.

benthamite avatar Mar 13 '25 15:03 benthamite

Late to the party here (and I wouldn't want to derail this feature; I was recently bitten by it), but was the use of project.el (via project-files) considered now that it's included in emacs? That would get you cross-VCS support for free.

vermiculus avatar Mar 20 '25 20:03 vermiculus

Hi @vermiculus. I didn’t consider using project.el because I am not familiar with that package. Let me take a look and I’ll see if I can adapt it.

benthamite avatar Mar 20 '25 20:03 benthamite

@vermiculus: I revised the PR to make it use project.el. The code is simpler now.

It would be great if folks could test this before it is merged: I’m not a programmer and my eyes may fail to catch problems.

benthamite avatar Mar 20 '25 22:03 benthamite

I see some whitespace changes that apparently just change spaces to TABs, most likely since emacs has indent-tabs-mode on by default, and especially if you have aggressive-indent-mode on, it can cause unecessary changes in places you did not even meant to touch. (This is why I submitted #712.) In case you remove those changes, consider doing some squashing as well, this PR has currently 34 commits and I think some of them could be squashed together to make the git history cleaner.

pabl0 avatar Mar 27 '25 10:03 pabl0

I think I'm basically done with the changes here. My only concern is that (member file gptel-context--project-files) may be a bit slow for projects with lots of files. I believe this issue arose after we started using project.el, and happens because (member file gptel-context--project-files) has to linearly search through the entire list of project files to check if a file is present. Claude suggests using a hash table, but I don’t know enough Elisp to implement this suggestion.

If and when @karthink agrees to merge this PR, I'm happy to do some squashing to clean up the commit history.

benthamite avatar Apr 07 '25 14:04 benthamite

(If the performance issue is not addressed, I am now inclined to set the default value of gptel-context-restrict-to-project-files to nil.)

benthamite avatar Apr 07 '25 14:04 benthamite

Hi @karthink, and great work @benthamite on this PR!

This feature is excellent! For me, it addresses a pressing need, as it was the first thing I looked for when I started adding projects to the context.

@karthink, is there an estimated timeline for merging this? I'd be happy to contribute by implementing the hash-table optimisation that was discussed, if that would help move this PR forward towards main.

komikat avatar Jun 11 '25 09:06 komikat

@karthink, is there an estimated timeline for merging this? I'd be happy to contribute by implementing the hash-table optimisation that was discussed, if that would help move this PR forward towards main.

I've just been busy with other features (and the new release) is all. I had some performance concerns about this PR, but they might be fixed now. I need to check.

Is adding entire projects to the context even viable? It seems like a lot, not to mention inefficient. Maybe I'm just behind in my thinking, context windows have been increasing recently.

karthink avatar Jun 12 '25 20:06 karthink

FWIW, I do agree it’s generally inefficient. My workflow for integrating AI with projects has changed since submitting this PR, and now I almost always add selected project files rather than all files in a project. Still, I guess that if one wants to do the latter, one typically wants to exclude gitignored files, so I think this is probably still a useful feature to have (modulo the performance concerns, which I agree are valid).

benthamite avatar Jun 12 '25 23:06 benthamite

Still, I guess that if one wants to do the latter, one typically wants to exclude gitignored files

Agreed, excluding the .gitignored files would be a very useful feature in my opinion

peterbecich avatar Jun 29 '25 20:06 peterbecich

Waiting on this feature. Thank you.

CsBigDataHub avatar Aug 22 '25 13:08 CsBigDataHub

@karthink, shall we make a decision regarding this? If you don’t want to merge it, that’s totally fine by me, but I would prefer to close it so that people are not kept waiting.

benthamite avatar Aug 22 '25 13:08 benthamite

@karthink, shall we make a decision regarding this? If you don’t want to merge it, that’s totally fine by me, but I would prefer to close it so that people are not kept waiting.

I wanted to merge this today, but the cache check is too fragile. Here is an example of the kind of bug you can produce:

  1. Run M-x gptel-context-add-file and navigate to a file from another project. Add it.
  2. Now add a gitignored file from the current project. It will be added too.

karthink avatar Sep 12 '25 02:09 karthink

Changed the caching logic, fixed some bugs and merged. Thanks for the PR @benthamite.

karthink avatar Sep 12 '25 08:09 karthink