telescope.nvim
current_buffer_tags is slow when used with large tag files
Description
When using Telescope current_buffer_tags with a large tags file, it can take 5-10 seconds before results are displayed. In my particular case I have a tags file with 282 065 lines. When using FZF I got this to be fast by having ripgrep actually do the parsing, using the following command:
rg --color=never --no-filename --no-line-number '
. fzf#shellescape(expand('%'))
. ' tags | sort -s -t \t -k 1,1'
While I could probably adopt something similar, it would be nice if the default performance was better.
Expected Behavior
Tags for the current buffer are displayed in reasonable time.
Actual Behavior
It can take 5-10 seconds before data is displayed.
Details
Reproduce
- git clone https://gitlab.com/gitlab-org/gitlab.git && cd gitlab
- Create a tags file with the contents of the tags file attached below
- Open app/models/user.rb in NeoVim
- Run Telescope current_buffer_tags
- Observe this process taking a long time to complete
Environment
NeoVim version:
NVIM v0.5.0
Build type: Release
LuaJIT 2.0.5
Compilation: /usr/bin/cc -D_FORTIFY_SOURCE=2 -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -DNVIM_TS_HAS_SET_MATCH_LIMIT -O2 -DNDEBUG -Wall -Wextra -pedantic -Wno-unused-parameter -Wstrict-prototypes -std=gnu99 -Wshadow -Wconversion -Wmissing-prototypes -Wimplicit-fallthrough -Wvla -fstack-protector-strong -fno-common -fdiagnostics-color=always -DINCLUDE_GENERATED_DECLARATIONS -D_GNU_SOURCE -DNVIM_MSGPACK_HAS_FLOAT32 -DNVIM_UNIBI_HAS_VAR_FROM -DMIN_LOG_LEVEL=3 -I/build/neovim/src/neovim-0.5.0/build/config -I/build/neovim/src/neovim-0.5.0/src -I/usr/include -I/build/neovim/src/neovim-0.5.0/build/src/nvim/auto -I/build/neovim/src/neovim-0.5.0/build/include
Compiled by builduser
Features: +acl +iconv +tui
See ":help feature-compile"
system vimrc file: "$VIM/sysinit.vim"
fall-back for $VIM: "/usr/share/nvim"
Run :checkhealth for more info
OS: Arch Linux with kernel 5.13.4 (btw I use Arch Linux)
Telescope version: b742c50
Configuration doesn't matter in this case, as it's a problem with how tag files are processed; not how this is configured (there's nothing to configure anyway). In addition, once the data is loaded filtering is fast.
The big tags file I'm using:
https://gist.github.com/YorickPeterse/3ab120f98d1898fd43a3919d0a44a13c
I can't attach it directly as it's too big :smile:
Looking at the code (https://github.com/nvim-telescope/telescope.nvim/blob/f8caad1d6bd19dbd79945850342b49df41928525/lua/telescope/builtin/files.lua#L547-L592) the first thing I notice is that the entire file is read into memory, then split into lines. That already is likely going to waste memory.
The second issue seems to be that current_buffer_tags filters the massive list afterwards (if I'm understanding the code correctly). So basically first we load close to 300 000 lines into memory, then some time later we filter them out.
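To illustrate the alternative of filtering while reading, here is a minimal sketch in plain Lua. It assumes the usual tab-separated ctags layout (tag name, file, ex command), and `tags_for_file` is a hypothetical helper, not Telescope's actual code. Only lines belonging to the wanted file are ever stored.

```lua
-- Hypothetical helper (not part of Telescope): stream a tags file line by
-- line and keep only the entries whose file field equals `wanted_file`.
-- Assumes the ctags layout <tagname>\t<file>\t<ex command>...
local function tags_for_file(tags_path, wanted_file)
  local results = {}
  for line in io.lines(tags_path) do
    -- Skip the "!_TAG_..." metadata lines at the top of the file.
    if line:sub(1, 6) ~= "!_TAG_" then
      -- Capture the second tab-separated field (the file name).
      local file = line:match("^[^\t]*\t([^\t]*)\t")
      if file == wanted_file then
        results[#results + 1] = line
      end
    end
  end
  return results
end
```

With this shape, memory use would scale with the number of matching tags rather than with the size of the whole tags file.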
A third issue seems to be that the lines split are kept as-is, and that they are parsed later in the attach_mappings function. Given a lot of data in the tags file is probably redundant, some of this work probably should be done earlier.
Correction: some parsing seems to be done ahead of time in https://github.com/nvim-telescope/telescope.nvim/blob/79644ab67731c7ba956c354bf0545282f34e10cc/lua/telescope/make_entry.lua#L815
I guess not using string.match may speed things up a bit, but I'm not sure what you'd replace it with.
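One possible replacement is `string.find` with its fourth argument set to `true`, which searches for a plain substring instead of a pattern. The sketch below assumes tab-separated ctags lines; `parse_tag_line` is a hypothetical name, and whether this is actually faster than `string.match` on LuaJIT would need to be measured.

```lua
-- Hypothetical parser: split one tags line into its first three fields using
-- plain-text searches (the 4th argument `true` disables pattern matching).
local function parse_tag_line(line)
  local t1 = line:find("\t", 1, true)
  if not t1 then return nil end
  local t2 = line:find("\t", t1 + 1, true)
  if not t2 then return nil end
  return {
    tag = line:sub(1, t1 - 1),
    file = line:sub(t1 + 1, t2 - 1),
    -- Everything after the second tab, including any extension fields.
    rest = line:sub(t2 + 1),
  }
end
```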
We had an attempt to load the file async (similar to find_files) here: https://github.com/nvim-telescope/telescope.nvim/pull/288
A lot of problems would be solved that way. He just never finished it and I haven't found the time. Feel free to pick it up though :) The PR and the conversations have a lot of good ideas on how to solve it (using rg or grep, or using the async vim.loop.fs_read method, which is pure C so it's super fast).
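One detail any chunked reader has to handle is that a chunk can end in the middle of a line, so the unfinished tail must be carried over to the next chunk. Here is a minimal sketch of that bookkeeping, using blocking `io` reads as a stand-in for async reads; `each_line_chunked` is a hypothetical helper and not the approach taken in the PR.

```lua
-- Hypothetical helper: call on_line(line) for every line of a file that is
-- read in fixed-size chunks. Blocking io reads stand in for async reads.
local function each_line_chunked(path, chunk_size, on_line)
  local f = assert(io.open(path, "rb"))
  local pending = "" -- partial line left over from the previous chunk
  while true do
    local chunk = f:read(chunk_size)
    if not chunk then break end
    local data = pending .. chunk
    local start = 1
    while true do
      local nl = data:find("\n", start, true)
      if not nl then break end
      on_line(data:sub(start, nl - 1))
      start = nl + 1
    end
    -- Keep whatever follows the last newline for the next round.
    pending = data:sub(start)
  end
  f:close()
  -- A final line without a trailing newline is still reported.
  if pending ~= "" then on_line(pending) end
end
```

The `on_line` callback is where per-line filtering could happen, so that non-matching lines are dropped as soon as they are seen.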
"A third issue seems to be that the lines split are kept as-is, and that they are parsed later in the attach_mappings function. Given a lot of data in the tags file is probably redundant, some of this work probably should be done earlier."
This is wrong: that only happens for the entry you select, so you can ignore that part. It's just for us to find the line in the tag file on selection, because tags don't offer the line number. It's better to do that only for the selected entry rather than for all entries.
If you pick it up I can help you with the PR :)
@Conni2461 I'm not sure if I'll have time in the next few weeks, but if so I'll see if I can take a look at the mentioned pull request.
Hi, I'm also a user of a large tags file and have hit this issue also - keen to see a fix, but sadly not proficient enough in plugin development to help directly.
We have merged some performance improvements for it some time ago. Did this improve the situation for you?
Hi there :) I updated the plugin and the current buffer tags performance is much better :D
Many thanks for your work.
Anecdotally, it's a lot better for me!