OpenHands icon indicating copy to clipboard operation
OpenHands copied to clipboard

fix(frontend): rebuild the file explorer; goodbye lag

Open amanape opened this issue 1 year ago • 7 comments

Fixes #1136

Summary

The file explorer has been rebuilt from the ground up in order to resolve the horrendous lag it was causing. It also introduces a few things that have not previously existed. I am aware I may have missed a few edge cases, so let me know if you find anything.

Motivation

It was not possible for me to work properly and it got real annoying real fast. I apologize to those working on a fix

Fixes

  • Faster mounting
  • Faster file selection
  • Faster refresh
  • Easier to read and extend
  • All relevant logic now exist in the relevant components
  • No more warnings thrown by Redux

Features

  • FULLY TESTED, this was built via TDD (and hopefully will be extended via TDD, too)
  • Hides the directory instead of unmounting it when closing, this also saves load time
  • Starts collapsed (also reduces lag)
  • Can mount multiple directories in the root level (instead of only one as previous; not really tested much though)
  • Can scroll vertically and horizontally for better navigation (from my understanding, it was not there before?)

Note on Redux

One major issue I found that was causing heavy delay was something with how the Redux store was being used (inconsistent lifecycle, duplicates between parent and children, multiple renders, etc...). To address this, I moved the local logic into the component rather than outsource it into a Redux store, the rest was moved to the parent (<CodeEditor />), where it should be.

This means that these new components do not depend on the Redux store for retrieval and now have local logic to efficiently handle this, boosting performance 10-fold.

Note on styles

I have copy-pasted the existing styles, except for some parts that were not applicable after the changes. I also chose not to use the <Accordion /> component as it was inconsistent with the collapsable functionality, added complexity (which was the opposite of what the objective of this change was), and it was not being used properly in the first place.

I didn't bother to go beyond the copy-paste effort, so the styles should be more-or-less as before. If there is any issue, feel free to adjust it.

Naming

Probably can be better

Note on testing

Run npm test file-explorer to run the relevant tests

Please

Please, please, please. If you can verify that I have not missed anything important. From my understanding, the file explorer was responsible for: retrieval, explore, selection (to display on the editor), and that is what was changed/improved.

I have relied 90% on the tests I've written to verify everything works as intended. This means you can reference the test files as a form of documentation and see the different cases that are covered.

Extras

  • This probably means we can remove react-accessible-treeview from dependencies too
  • Let me know if you find any bugs or something I missed

amanape avatar Apr 18 '24 20:04 amanape

Hey, awesome work. In https://github.com/OpenDevin/OpenDevin/issues/1136 we've been talking about only loading some folders in the hierarchy but I think this makes that somewhat obsolete.

We wasted a bunch of time getting backend routes setup for that and I started work on the frontend part today maintaining a partial file tree in state but this has a bunch of issues. For example, if the agent writes something to a file, we'd need to check and load the whole hierarchy of that file to be able to select it, which is pretty bad.

I'd argue this is probably the better solution and we should revert our previous changes with loading partial file trees. @rbren and @foragerr what's your take on this?

If we agree on that I can give this a thorough review.

Sparkier avatar Apr 18 '24 23:04 Sparkier

The alternative I was talking about and started working on is this: https://github.com/OpenDevin/OpenDevin/pull/1222

Another option would be to combine the two approaches (only loading part of the tree and your redesigned file tree, but the main problem with partial loading, which is that if an agent writes to a file, we don't know what part of its hierarchy in the file tree is already in state, remains. Hence, we'd have to do somewhat ugly loading of part of the file tree whenever something is written to a file to be able to select it in the tree.

Sparkier avatar Apr 18 '24 23:04 Sparkier

I'd say if this works, we just go with it.

Q: This solves the issue of the front end bogging down on trying to render a really large workspace structure. but we still have the problem of the backend constructing that very large structure in the first place?

foragerr avatar Apr 18 '24 23:04 foragerr

I'd say if this works, we just go with it.

Q: This solves the issue of the front end bogging down on trying to render a really large workspace structure.

but we still have the problem of the backend constructing that very large structure in the first place?

That is correct. However, the question is how big of a problem that is. I've tried setting the root folder of this project as the workspace and while it takes a few seconds to load with this implementation, it's a relatively big repo with a ton of files (e.g. in node_modules). So I don't know how much of an everyday problem that would be.

Sparkier avatar Apr 19 '24 00:04 Sparkier

Neat, let’s run with this.

I’ll roll back the other changes once this ships.

foragerr avatar Apr 19 '24 00:04 foragerr

@Sparkier I'm a little confused on what the motivation behind a partial tree is, or maybe I am just misunderstanding what it actually is. Could you clarify? I may spin up some test cases of the expected behavior and iterate from there. Anyways I do imagine eventually we would want to add features such as highlighting new/modified files by the agent that would require us to manage tree state more in-depth, so it may be related?

amanape avatar Apr 19 '24 08:04 amanape

@amanape The partial tree comes from an approach that @rbren described -

The API should only return 1 level of files at a time e.g. GET /files should return files and directories in WORKSPACE_BASE e.g. GET /files?prefix=foo/bar/ should return files and directories in WORKSPACE_BASE/foo/bar/ The UI should send a new request every time a folder is expanded When a folder is expanded, other folders should be collapsed

The backend API for this is already implemented. @Sparkier is describing the challenges in front end. The agent can randomly modify files anywhere in the filesystem, and knowing which folders to refresh from the backend, and when, doesn't seem to be trivial.

If we continued to pursue this, perhaps the agent (or backend) can post messages when file changes occur - the front end can then appropriately refresh parts of the tree.

However if your implementation gives us adequate user experience, we could avoid dealing with all of this complexity entirely.

foragerr avatar Apr 19 '24 12:04 foragerr

When I noticed we were accessing the workspace through the API, I considered an alternative:

It might be more efficient to load it locally (I assume similar to the backend's process), and then have the backend transmit only the updated or new files and folders to the frontend. With a little attention on the frontend, we can use this data to refresh the relevant nodes. This in turn can also improve the initial loading times and subsequent update speeds.

Keep in mind though I haven't had the opportunity to explore much of the backend implementation, so I may be misunderstanding

amanape avatar Apr 19 '24 14:04 amanape

Tried this locally, seems to work great! Will merge when it's ready

I do think it's still worth having a smarter backend implementation, rather than returning the entire directory contents in one API call. But this gets us past 90% of the problem

rbren avatar Apr 19 '24 15:04 rbren

Tried this locally, seems to work great! Will merge when it's ready

@rbren What have I missed that needs to be implemented before it is ready?

amanape avatar Apr 19 '24 15:04 amanape

When I noticed we were accessing the workspace through the API, I considered an alternative:

It might be more efficient to load it locally (I assume similar to the backend's process), and then have the backend transmit only the updated or new files and folders to the frontend. With a little attention on the frontend, we can use this data to refresh the relevant nodes. This in turn can also improve the initial loading times and subsequent update speeds.

Keep in mind though I haven't had the opportunity to explore much of the backend implementation, so I may be misunderstanding

If I correctly understand this, that's kinda what we've been doing. If there is a file write event, we update only that part of the file tree. However, both on initial load and whenever the user clicks refresh, we reload the whole file tree. I don't know if that's a big deal, though. Anything else would significantly complicate the implementation and I don't know if that's worth it.

Sparkier avatar Apr 19 '24 16:04 Sparkier