lakeFS
[Bug]: new version of lakectl is very slow
What happened?
What actually happened, including error codes if applies.
Steps to Reproduce:
- Download the latest version of lakectl
- Clone a repo with a decent amount of data (about 100GB in my case)
- Make one small modification and run lakectl local commit. This takes about 15 minutes on a fairly fast machine for me. Before upgrading this took about 20 seconds.
Expected behavior
It should be much faster. 15 minutes for a local commit or a local status is an awfully long time.
I suspect it's related to https://github.com/treeverse/lakeFS/pull/7563. Specifically, I was using an older version of lakectl and lakectl local was reporting diffs on a freshly cloned repo. So I upgraded lakectl and now it's correct, but much, much slower.
lakeFS version
1.16.0 for lakectl
How lakeFS is installed
Cloud hosted
Affected clients
lakectl
Relevant log output
No response
Contact details
No response
I have an idea that may help here. Will try to write a PoC that performs an alphabetical scan of a directory tree efficiently in both time and space.
I'm sure it will take a long time. You can consider storing the state of the current local folder and remote in a specific format, then compare them for differences.
Hi @thungrac ,
You are of course correct that listing everything takes a long time. Unfortunately that is literally the feature: download a repo to a local directory, work on it locally without involving lakeFS in any way, and then re-upload the changes. By definition, this requires scanning all files. 🤷
The issue that I wish to fix is in how we scan. A further complication is that lakeFS is an object store, while the local filesystem is a filesystem. There's some more info about the difference here. But the important difference for our purposes is that object stores are just a flat list of objects sorted by pathname, while filesystems arrange files in directories. So when we scan the local subtree we get files in a different order than when we scan lakeFS. The current implementation re-sorts the whole listing, which is slow and uses lots of memory. I want us to try to remove that sort or reduce its size. That should reduce both runtime and memory consumption dramatically.
If we manage to do it, I promise a tech blog :-)