emacs-counsel-gtags icon indicating copy to clipboard operation
emacs-counsel-gtags copied to clipboard

global-tags performance issue and proposed fix

Open Ergus opened this issue 3 years ago • 3 comments

Hi Felipe:

Today I had an issue with global-tags--get-dbpath that takes too much time when using tramp.

The issue came out trying to use a command like project-switch-to-buffer that calls project--read-project-buffer that basically calls project-current for all the opened buffers calling project--find-in-directory that calls project-find-functions where global-tags-try-project-root is. So the functions in project-find-functions are expected to be as fast as possible.

I thought to report this as an emacs issue but after a while a found that most of the time was taken by global-tags-try-project-root so that makes me think that probably that function needs an important optimization. The time is taken mostly by global-tags--get-dbpath

There are several options:

  1. Lets say a sort of cache to store pairs like: (default-directory . dbpath) everytime global-tags-try-project-root is invoked for the first time for a different default-directory.

  2. Use a buffer-local-variable for the first time we use the function for a buffer. This approach makes to use the slow external process at least once per buffer... so Maybe 1 is a better and simpler approach.

  3. Use a hash-table for same than 1.

Something as simple as (untested)

(defvar global-tags--dbpath-cache (make-hash-table))

(cl-defun global-tags--get-dbpath (&optional (dir default-directory))
  "Filepath for database from DIR."
  (or (gethash dir global-tags--dbpath-cache)
      (when-let* 
        ==bla bla==
	(when (file-exists-p dbpath)
	  (puthash dir dbpath global-tags--dbpath-cache)
          dbpath))))

Another part of the issue is when when-let does not call puthash... because some of the conditions failed. Because the next time the when-let will be called again In that case the solution is to add a dumb value unconditionally like 'empty by default and use some comparisons against it before calling the when-let.

Ergus avatar Jun 27 '21 20:06 Ergus

(cl-defun global-tags--get-dbpath (&optional (dir default-directory))
  "Filepath for database from DIR."
  (pcase (gethash dir global-tags--dbpath-cache)
    ('empty nil)
    ('nil
     (puthash dir 'empty global-tags--dbpath-cache)
     (when-let* 
       === BLA ===
       (when (file-exists-p dbpath)
	 (puthash dir dbpath global-tags--dbpath-cache))))
    (code code)))

Ergus avatar Jun 27 '21 20:06 Ergus

I'm actually using somewhat the first solution you propose, using cache, at work. I hadn't updated that code, although it's stable now. I'm actually inheriting wrapping global-tags-try-project-root with a function that first checks cached roots before calling any global-tags function. This is a great feature of the extensible design of project.el and xref.el

what I'm doing is using a persist-defvar with a list of already-used "project roots per gtags". Since I don't tend to delete source code at work, I've never had to worry about that list including deleted code. I don't know if it's the same for anyone else, but I guess that a Emacs-session-local var should be enough for everyone.

Should we have a timeout for cached project roots? How much? 24h? Refreshing the cache list can be done in the background using async, so that's not a problem.

FelipeLema avatar Jun 29 '21 22:06 FelipeLema

I'm actually using somewhat the first solution you propose, using cache, at work. I hadn't updated that code, although it's stable now. I'm actually inheriting wrapping global-tags-try-project-root with a function that first checks cached roots before calling any global-tags function. This is a great feature of the extensible design of project.el and xref.el

The second solution is actually better because project-find-functions may have more than one function to try, and it is useful to avoid repeating the search with global-tags--get-dbpath when it failed in this session once.

I actually reported an emacs bug bug#49264 to see if they add some caching features to project.el because the vc backend is also very slow.

what I'm doing is using a persist-defvar with a list of already-used "project roots per gtags". Since I don't tend to delete source code at work, I've never had to worry about that list including deleted code. I don't know if it's the same for anyone else, but I guess that a Emacs-session-local var should be enough for everyone.

Should we have a timeout for cached project roots? How much? 24h? Refreshing the cache list can be done in the background using async, so that's not a problem.

That may be a custom... so 0 means disable cache, a number the time in whatever unit you want and nil disable auto erasing.

Ergus avatar Jun 30 '21 00:06 Ergus