current_buffer_tags is slow when used with large tag files

Open yorickpeterse opened this issue 4 years ago • 10 comments

Description

When using Telescope current_buffer_tags with a large tags file, it can take 5-10 seconds before results are displayed. In my particular case I have a tags file with 282 065 lines. When using FZF I got this to be fast by having ripgrep actually do the parsing, using the following command:

'rg --color=never --no-filename --no-line-number '
        \ . fzf#shellescape(expand('%'))
        \ . ' tags | sort -s -t \t -k 1,1'

While I could probably adopt something similar, it would be nice if the default performance was better.
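For readers without the FZF setup, the effect of that pipeline can be sketched roughly in Python: keep only the tag lines that mention the current file, then stable-sort on the first tab-separated field. This is an illustration, not Telescope's code. The function name `tags_for_file` and the sample lines are hypothetical, and the sketch uses a plain substring match where rg would interpret the path as a regular expression.

```python
import io
from typing import Iterable, List


def tags_for_file(tag_lines: Iterable[str], path: str) -> List[str]:
    """Return the tag lines that mention `path`, stable-sorted by tag name."""
    # Filter while iterating, so non-matching lines are never accumulated.
    matches = [line.rstrip("\n") for line in tag_lines if path in line]
    # sorted() is stable, like `sort -s`; the key is the first tab-separated
    # field. Ordering here is by code point, not by locale collation.
    return sorted(matches, key=lambda line: line.split("\t", 1)[0])


if __name__ == "__main__":
    # Hypothetical three-line tags file.
    sample = io.StringIO(
        'save\tapp/models/user.rb\t/^  def save$/;"\tf\n'
        'name\tapp/models/project.rb\t/^  def name$/;"\tf\n'
        'admin?\tapp/models/user.rb\t/^  def admin?$/;"\tf\n'
    )
    for line in tags_for_file(sample, "app/models/user.rb"):
        print(line.split("\t", 1)[0])
```

The point of the approach is that only the handful of lines belonging to the open file ever reach the picker, instead of all ~280 000.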

Expected Behavior

Tags for the current buffer are displayed in reasonable time.

Actual Behavior

It can take 5-10 seconds before data is displayed.

Details

Reproduce
  1. git clone https://gitlab.com/gitlab-org/gitlab.git && cd gitlab
  2. Create a tags file with the contents of the tags file attached below
  3. Open app/models/user.rb in NeoVim
  4. Run Telescope current_buffer_tags.
  5. Observe this process taking a long time to complete

Environment

NeoVim version:

NVIM v0.5.0
Build type: Release
LuaJIT 2.0.5
Compilation: /usr/bin/cc -D_FORTIFY_SOURCE=2 -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -DNVIM_TS_HAS_SET_MATCH_LIMIT -O2 -DNDEBUG -Wall -Wextra -pedantic -Wno-unused-parameter -Wstrict-prototypes -std=gnu99 -Wshadow -Wconversion -Wmissing-prototypes -Wimplicit-fallthrough -Wvla -fstack-protector-strong -fno-common -fdiagnostics-color=always -DINCLUDE_GENERATED_DECLARATIONS -D_GNU_SOURCE -DNVIM_MSGPACK_HAS_FLOAT32 -DNVIM_UNIBI_HAS_VAR_FROM -DMIN_LOG_LEVEL=3 -I/build/neovim/src/neovim-0.5.0/build/config -I/build/neovim/src/neovim-0.5.0/src -I/usr/include -I/build/neovim/src/neovim-0.5.0/build/src/nvim/auto -I/build/neovim/src/neovim-0.5.0/build/include
Compiled by builduser

Features: +acl +iconv +tui
See ":help feature-compile"

   system vimrc file: "$VIM/sysinit.vim"
  fall-back for $VIM: "/usr/share/nvim"

Run :checkhealth for more info

OS: Arch Linux with kernel 5.13.4 (btw I use Arch Linux)

Telescope version: b742c50

Configuration doesn't matter in this case, as the problem lies in how tag files are processed, not in how Telescope is configured (there's nothing to configure anyway). In addition, once the data is loaded, filtering is fast.

yorickpeterse avatar Aug 03 '21 14:08 yorickpeterse

The big tags file I'm using:

https://gist.github.com/YorickPeterse/3ab120f98d1898fd43a3919d0a44a13c

I can't attach it directly as it's too big :smile:

yorickpeterse avatar Aug 03 '21 14:08 yorickpeterse

Here's a recording that shows the issue:

asciicast

yorickpeterse avatar Aug 03 '21 14:08 yorickpeterse

Looking at the code (https://github.com/nvim-telescope/telescope.nvim/blob/f8caad1d6bd19dbd79945850342b49df41928525/lua/telescope/builtin/files.lua#L547-L592) the first thing I notice is that the entire file is read into memory, then split into lines. That already is likely going to waste memory.

The second issue seems to be that current_buffer_tags filters the massive list afterwards (if I'm understanding the code correctly). So basically first we load close to 300 000 lines into memory, then some time later we filter them out.

A third issue seems to be that the lines split are kept as-is, and that they are parsed later in the attach_mappings function. Given a lot of data in the tags file is probably redundant, some of this work probably should be done earlier.

yorickpeterse avatar Aug 03 '21 14:08 yorickpeterse

Correction: some parsing seems to be done ahead of time in https://github.com/nvim-telescope/telescope.nvim/blob/79644ab67731c7ba956c354bf0545282f34e10cc/lua/telescope/make_entry.lua#L815

I guess not using string.match may speed things up a bit, but I'm not sure what you'd replace it with.
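One possible replacement for pattern matching is a fixed-delimiter split, since a tags line is tab-separated: name, file, then the search address. The sketch below shows the idea in Python rather than Lua; `Tag` and `parse_tag_line` are hypothetical names, and whether this is actually faster than `string.match` inside Telescope is untested here. In Lua the closest analogue would presumably be `string.find` with its `plain` argument set, which is an assumption about how one might port it, not something taken from the plugin.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    name: str
    file: str
    address: str


def parse_tag_line(line: str) -> Optional[Tag]:
    """Split one ctags line into name, file, and address without a regex."""
    if line.startswith("!_TAG_"):
        return None  # metadata header, not a tag
    # Split on the first two tabs only; the address itself may contain tabs.
    parts = line.rstrip("\n").split("\t", 2)
    if len(parts) < 3:
        return None  # malformed line
    name, file, rest = parts
    # Extended-format lines terminate the address with `;"` before extra
    # fields. Assumption: the address does not itself contain `;"`.
    address = rest.split(';"', 1)[0]
    return Tag(name, file, address)


if __name__ == "__main__":
    print(parse_tag_line('save\tapp/models/user.rb\t/^  def save$/;"\tf\n'))
```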

yorickpeterse avatar Aug 03 '21 14:08 yorickpeterse

We had an attempt to load the file asynchronously (similar to find_files) here: https://github.com/nvim-telescope/telescope.nvim/pull/288

A lot of problems would be solved that way. The author just never finished it, and I haven't found the time. Feel free to pick it up, though :) The PR and its conversations have a lot of good ideas on how to solve it (using rg or grep, or using the async vim.loop.fs_read method, which is pure C and therefore very fast).
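Whichever reader is used, reading in chunks means a chunk can end in the middle of a line, so the incomplete tail has to be carried over to the next chunk. A minimal sketch of that bookkeeping follows, written synchronously in Python for clarity. It is not the code from the PR; `iter_lines_chunked` and the 64 KiB default are hypothetical choices. In an async version the loop body would run in the read callback instead.

```python
import io
from typing import BinaryIO, Iterator


def iter_lines_chunked(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield complete lines from `stream`, reading `chunk_size` bytes at a time."""
    pending = b""  # tail of the previous chunk that did not end in a newline
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines = (pending + chunk).split(b"\n")
        pending = lines.pop()  # last piece is incomplete (or empty)
        yield from lines
    if pending:
        yield pending  # final line without a trailing newline


if __name__ == "__main__":
    # A tiny chunk size forces lines to straddle chunk boundaries.
    data = io.BytesIO(b"a\tx\nbb\ty\nccc\tz")
    for line in iter_lines_chunked(data, chunk_size=4):
        print(line)
```

Because lines are yielded as they complete, a filter for the current buffer can discard non-matching lines immediately instead of holding the whole file in memory.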

> A third issue seems to be that the lines split are kept as-is, and that they are parsed later in the attach_mappings function. Given a lot of data in the tags file is probably redundant, some of this work probably should be done earlier.

This is wrong: that parsing only happens for the entry you select, so you can ignore that part. It's just there so we can find the line in the file on selection, because tags don't offer the line number. It's better to do that only for the selected entry rather than for all entries.

If you pick it up, I can help you with the PR :)

Conni2461 avatar Aug 04 '21 21:08 Conni2461

@Conni2461 I'm not sure if I'll have time in the next few weeks, but if so I'll see if I can take a look at the mentioned pull request.

yorickpeterse avatar Aug 04 '21 22:08 yorickpeterse

Hi, I also use a large tags file and have hit this issue. I'm keen to see a fix, but sadly I'm not proficient enough in plugin development to help directly.

astutecat avatar Jun 14 '22 10:06 astutecat

We have merged some performance improvements for it some time ago. Did this improve the situation for you?

Conni2461 avatar Jul 10 '22 19:07 Conni2461

Hi there :) I updated the plugin and the current buffer tags performance is much better :D

Many thanks for your work.

astutecat avatar Jul 18 '22 07:07 astutecat

Anecdotally, it's a lot better for me!

dsummersl avatar Jul 25 '22 16:07 dsummersl