Thoughts on Tree View performance in large/deep directories

Open DevinCarr opened this issue 5 years ago • 10 comments

Currently if you open lite inside a large directory, it can occasionally hang while it is attempting to recurse through all of the directories to enumerate all of the files. In a large directory, the files will currently not show up at all until all of the files/directories have been enumerated.

One way to fix this would be to expose the project_scan_thread (init.lua:14) functions as overridable, so that plugins can easily replace them with custom logic. I think this fits better with your approach of keeping the logic within lite simple while allowing a plugin to enable more complex behavior.

Another option would be to implement this in the core logic as lazy-loading the directories when clicked (or just incrementally in background) if it can be done to match your vision of simplicity.
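To make the "incrementally in background" idea concrete, here is a minimal sketch of a scan that hands back partial results in batches. It is not lite's actual scanner: make_scanner, batch_size, and the in-memory tree (nested tables standing in for a real directory listing) are all hypothetical names used only for illustration.

```lua
-- Hypothetical sketch: an incremental scan built on a plain Lua coroutine.
-- `tree` is a nested table standing in for a directory: tables are
-- directories, any other value marks a file.
local function make_scanner(tree, batch_size)
  return coroutine.wrap(function()
    local found, since_yield = {}, 0
    local function walk(node, prefix)
      for name, child in pairs(node) do
        local path = prefix .. name
        if type(child) == "table" then
          walk(child, path .. "/")
        else
          found[#found + 1] = path
          since_yield = since_yield + 1
          -- Hand control back to the caller after every batch of files.
          if since_yield >= batch_size then
            since_yield = 0
            coroutine.yield(found, false)
          end
        end
      end
    end
    walk(tree, "")
    return found, true -- second value signals that the scan is complete
  end)
end

-- Usage: each call to step() resumes the scan for one more batch.
local demo = { src = { ["main.lua"] = true, ["util.lua"] = true }, ["README.md"] = true }
local step = make_scanner(demo, 2)
repeat
  local partial, done = step()
  -- a view could redraw here using the partial list in `partial`
until done
```

The caller decides how often to resume the scan, so a large directory costs many small steps instead of one long stall.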

Curious on your thoughts.

DevinCarr avatar May 15 '20 02:05 DevinCarr

Regarding simplicity the current way the treeview operates is what I consider the simplest form: using the existing core.project_files (which is used for fuzzy finding files and everything else). This part specifically I'm not looking to change.

The project scanning is already done incrementally -- are you absolutely sure it's the project scanning which creates the stall in your case? Do you still get it if you delete treeview.lua and run lite on the same project? It may be the treeview iterating all the project files when something's been updated and it's had to invalidate its cached "skip" values. What kind of more complex behaviour do you think a plugin might provide in regard to the project scanning?

rxi avatar May 17 '20 12:05 rxi

Sounds good!

When I remove the treeview.lua file, I still end up with some rendering lag that is noticeable when I try to move the window around. The example folder I am using to test this is C:\Windows\. It could be something else that is contributing to the slight stalling.

A more complicated behavior that I was possibly looking to add via a plugin would be to only scan files that are not ignored by a .gitignore file. To make it complete, I think it would also need the ability to disable the current project_scan_thread that is initiated in init.lua.

DevinCarr avatar May 17 '20 22:05 DevinCarr

> A more complicated behavior that I was possibly looking to add via a plugin would be to only scan files that are not ignored by a .gitignore file. To make it complete, I think it would also need the ability to disable the current project_scan_thread that is initiated in init.lua.

This should now be supported with the addition of the config.ignore_files, the project scan thread uses this to work out what files it should skip -- a .gitignore file could be read and the entries converted into patterns that are added to the config.ignore_files table.
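As a rough illustration of that conversion, the sketch below turns simple .gitignore entries into anchored Lua patterns. The helper names glob_to_pattern and gitignore_to_patterns are hypothetical, and the sketch deliberately handles only a subset of gitignore syntax: comments, negations (!), and entries containing an interior slash are skipped, and character classes such as [abc] are treated literally.

```lua
-- Hypothetical helper: translate one gitignore-style glob into a Lua pattern.
local function glob_to_pattern(glob)
  -- Escape Lua pattern magic characters (everything except * and ?).
  local p = glob:gsub("[%^%$%(%)%%%.%[%]%+%-]", "%%%0")
  p = p:gsub("%*", "[^/]*") -- * matches any run of non-separator characters
  p = p:gsub("%?", "[^/]")  -- ? matches a single non-separator character
  return "^" .. p .. "$"
end

-- Hypothetical helper: convert the text of a .gitignore file into a list of
-- Lua patterns. Unsupported entries are silently skipped.
local function gitignore_to_patterns(text)
  local patterns = {}
  for line in text:gmatch("[^\r\n]+") do
    line = line:match("^%s*(.-)%s*$") -- trim surrounding whitespace
    local skip = line == "" or line:find("^[#!]")
    if not skip then
      line = line:gsub("/$", "") -- "build/" and "build" are treated alike
      -- Entries with an interior slash need the full path, so skip them.
      if not line:find("/", 1, true) then
        patterns[#patterns + 1] = glob_to_pattern(line)
      end
    end
  end
  return patterns
end

local patterns = gitignore_to_patterns("# build output\n*.o\nbuild/\n!keep.o\n")
```

In a plugin, the resulting patterns could then be appended to config.ignore_files, assuming it holds a table of patterns as described above.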

> The example folder I am using to test this is C:\Windows\. It could be something else that is contributing to the slight stalling.

Would you be able to clarify exactly how many files we're talking here? You can press ctrl+shift+p and it will display it in the bottom right corner. Thanks!

rxi avatar May 18 '20 08:05 rxi

> This should now be supported with the addition of the config.ignore_files, the project scan thread uses this to work out what files it should skip -- a .gitignore file could be read and the entries converted into patterns that are added to the config.ignore_files table.

Excellent, that is a great addition that would easily support what I previously mentioned!

> Would you be able to clarify exactly how many files we're talking here? You can press ctrl+shift+p and it will display it in the bottom right corner. Thanks!

[image] 284k files in C:\Windows\!

DevinCarr avatar May 19 '20 02:05 DevinCarr

Would it be possible to pass the full file path to the match pattern in init.lua:38? Without the full file path, it is hard to honor a gitignore path like folder/2/, since the recursion only sees the file/folder name at one level at a time. So for a hierarchy:

folder/
  file.txt
  2/
    file2.txt

It is currently impossible to ignore the full folder pattern: the first iteration passes folder to match_pattern, and the next loop passes 2, but the two are never passed together as folder/2.
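The difference can be sketched with a small walk that tests each entry's path relative to the root rather than its bare name. This is only an illustration of the request, not lite's code: the in-memory tree, is_ignored, and scan are hypothetical stand-ins for the real directory listing and match_pattern.

```lua
-- Hypothetical in-memory stand-in for the hierarchy above:
-- tables are directories, `true` marks a file.
local tree = {
  folder = {
    ["file.txt"] = true,
    ["2"] = { ["file2.txt"] = true },
  },
}

-- Hypothetical stand-in for the ignore check: true if any pattern matches.
local function is_ignored(text, patterns)
  for _, p in ipairs(patterns) do
    if text:find(p) then return true end
  end
  return false
end

-- Walk the tree, testing the path relative to the root at every level.
local function scan(node, prefix, patterns, out)
  local names = {}
  for name in pairs(node) do names[#names + 1] = name end
  table.sort(names) -- deterministic order for the example
  for _, name in ipairs(names) do
    local rel = prefix == "" and name or (prefix .. "/" .. name)
    if not is_ignored(rel, patterns) then
      local child = node[name]
      if type(child) == "table" then
        scan(child, rel, patterns, out)
      else
        out[#out + 1] = rel
      end
    end
  end
  return out
end

-- With relative paths, a pattern for folder/2 can finally match.
local files = scan(tree, "", { "^folder/2$" }, {})
-- files now contains only "folder/file.txt"
```

Building the relative path costs one string concatenation per entry, which is the extra work that grows with depth.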

To be honest, I would understand if you think this change is too complicated, since evaluating the ignore patterns against the full (relative or absolute) path would get slower as the path gets longer in a deep folder structure. Maybe there is a better solution than the full path that is still simple enough to meet the project's goals.

Also, if you think this conversation (plugin support for gitignore) doesn't belong here in the issues, please let me know.

DevinCarr avatar May 19 '20 07:05 DevinCarr

Just wanted to add a little something I found out with the help of rxi. He provided me with a little memory usage script which I used to test some stuff. Here it is:

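local core = require "core"
local style = require "core.style"

-- Keep a reference to the original draw function so it can still be called.
local draw = core.root_view.draw

core.root_view.draw = function(...)
  draw(...)
  -- collectgarbage("count") reports Lua's memory use in KB; convert to MB.
  local str = string.format("%.2f mb", collectgarbage("count") / 1024)
  -- Draw the figure near the top-right corner in magenta.
  renderer.draw_text(style.font, str, core.root_view.size.x - 90, 4, { 255, 0, 255 })
end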

What I found was that lite's memory usage was skyrocketing in deep directories, such as my home directory, easily reaching 3 GB or more. I'm not familiar with the codebase, so I'm not sure if this is just due to the sheer number of files being loaded into core.project_files or if there's some kind of memory leak error in the underlying C code.
I'll be doing more testing.

Tmpod avatar Jun 14 '20 11:06 Tmpod

@Tmpod 1.08 changes the behaviour to avoid cases where one's home directory is set as the current project unintentionally. If your project directory has millions of files, lite will unavoidably need to use memory to keep track of those files -- the issue of CPU usage or potential hanging is something that may be avoided in the future.

> or if there's some kind of memory leak error in the underlying C code.

The printed memory usage in your comment's code only shows the memory used by Lua. As far as I know there are no leaks in the C side of lite, and the address sanitiser doesn't catch anything; the C side of lite typically allocates memory in very few places (font loading and mallocing a temporary buffer in a few places).

rxi avatar Jun 15 '20 10:06 rxi

> 1.08 changes the behaviour to avoid cases where one's home directory is set as the current project unintentionally.

Will update then.

> If your project directory has millions of files, lite will unavoidably need to use memory to keep track of those files

You could opt for a little something that evicts the least recently viewed directories when you reach a certain number of file entries in the project view. It would make less-used directories load slightly slower, but it would probably benefit the user in general.

Tmpod avatar Jun 15 '20 22:06 Tmpod

@Tmpod We need all files present for the sake of fuzzy-finding (ctrl+p), as well as for plugins being able to depend on a complete core.project_files array.

rxi avatar Jun 17 '20 07:06 rxi

I solved this issue by only loading files up to 5 levels deep, while still loading the directories. Then I created a command which changes the project_root to a subdirectory. This way only relevant files are loaded. I also added a max project file variable, similar to what lite-xl does. You can check out the code here and here. Maybe this helps.
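A minimal sketch of that idea follows. It is not the linked code: scan_limited, max_depth, max_files, and the in-memory tree are hypothetical, and the exact semantics (directories are always recorded, files are recorded only down to max_depth, and the scan stops once max_files is reached) are assumptions made for illustration.

```lua
-- Hypothetical sketch: depth-limited scan with a cap on the number of files.
-- `tree` is a nested table: tables are directories, other values are files.
local function scan_limited(tree, max_depth, max_files)
  local files, dirs = {}, {}
  local function walk(node, prefix, depth)
    local names = {}
    for name in pairs(node) do names[#names + 1] = name end
    table.sort(names) -- deterministic order for the example
    for _, name in ipairs(names) do
      local path = prefix .. name
      local child = node[name]
      if type(child) == "table" then
        dirs[#dirs + 1] = path -- directories are kept at any depth
        if not walk(child, path .. "/", depth + 1) then return false end
      elseif depth <= max_depth then
        if #files >= max_files then return false end -- cap reached, stop
        files[#files + 1] = path
      end
    end
    return true
  end
  local complete = walk(tree, "", 1)
  return files, dirs, complete
end

local project = {
  a = { b = { ["deep.txt"] = true }, ["mid.txt"] = true },
  ["top.txt"] = true,
}
-- Files below depth 2 are skipped, but the directory a/b is still listed.
local files, dirs, complete = scan_limited(project, 2, 100)
```

The third return value lets a caller tell the user that the file list was truncated by the cap.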

0xd61 avatar Jul 10 '21 20:07 0xd61