Don't load whole file into memory

Open GoldsteinE opened this issue 5 years ago • 7 comments

Is your feature request related to a problem? Please describe.

Opening big files is slow. Other editors solve this problem by loading only the visible part of the file. Ox loads the whole file into memory, so working with big files (such as logs) is hard and uses a lot of RAM.

Describe the solution you'd like

The file should be mmap'ed into memory or read part by part.
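To make the "read part by part" option concrete, here is a minimal sketch using only the Rust standard library. It is not Ox's code: the function name `read_window` and its parameters are made up for illustration, and it assumes the editor already knows the byte offset of the region it wants to show. The mmap route would need an external crate or platform calls, so it is left out here.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper: read at most `len` bytes starting at byte `offset`,
/// without loading the rest of the file into memory.
fn read_window(path: &Path, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // Jump straight to the requested position instead of reading up to it.
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = Vec::with_capacity(len);
    // `take` caps the read at `len` bytes; `read_to_end` stops early at EOF.
    file.take(len as u64).read_to_end(&mut buf)?;
    Ok(buf)
}
```

A viewer built on this would call `read_window` again whenever the visible region moves. Mapping line numbers to byte offsets still needs an index of newline positions, which this sketch does not cover.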

Describe alternatives you've considered

Don't do anything. Working with big files will remain slow.

GoldsteinE avatar Nov 03 '20 10:11 GoldsteinE

One way to test is to generate a garbage file and try loading it:

 base64 /dev/urandom | head -c 1000000000 > file.txt

For inspiration, you might look at https://github.com/arximboldi/ewig. It uses an immutable string structure to load the file into memory.
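As a rough idea of what an immutable string structure buys you, here is a toy rope in plain Rust. This is not how ewig is implemented; the `Rope`, `concat`, and `flatten` names are invented for this sketch, and a real editor would use a balanced tree. The point is only that a new version of the text shares the old nodes instead of copying them.

```rust
use std::rc::Rc;

/// Toy immutable rope: text is stored in leaves, and inner nodes only
/// point at shared children. Illustrative only, no balancing.
enum Rope {
    Leaf(String),
    Node { left: Rc<Rope>, right: Rc<Rope>, len: usize },
}

impl Rope {
    /// Total length in bytes, cached in inner nodes.
    fn len(&self) -> usize {
        match self {
            Rope::Leaf(s) => s.len(),
            Rope::Node { len, .. } => *len,
        }
    }
}

/// Build a new rope from two existing ones without copying their text.
fn concat(left: Rc<Rope>, right: Rc<Rope>) -> Rc<Rope> {
    let len = left.len() + right.len();
    Rc::new(Rope::Node { left, right, len })
}

/// Collect the whole text into one `String` (only needed for display or saving).
fn flatten(rope: &Rope) -> String {
    fn walk(rope: &Rope, out: &mut String) {
        match rope {
            Rope::Leaf(s) => out.push_str(s),
            Rope::Node { left, right, .. } => {
                walk(left, out);
                walk(right, out);
            }
        }
    }
    let mut out = String::with_capacity(rope.len());
    walk(rope, &mut out);
    out
}
```

Appending to a rope with `concat` leaves the previous version untouched and still usable, which is also what makes undo cheap with this kind of structure.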

Ygg01 avatar Nov 03 '20 16:11 Ygg01

Another good way to test is to use a Wikipedia dump (you can get one at https://ftp.acc.umu.se/mirror/wikimedia.org/dumps/enwiki/20201001/), which is well-formed XML (so it can be syntax highlighted) but weighs more than 5 GB.

GoldsteinE avatar Nov 03 '20 16:11 GoldsteinE

Well, loading a garbo file is just stress testing. No highlighting at all. Kind of like a "walk before you run" situation.

Ygg01 avatar Nov 03 '20 16:11 Ygg01

Thanks for all the help guys! I'll look into this and try to implement this in the next few updates. 🙂

curlpipe avatar Nov 03 '20 16:11 curlpipe

I did some profiling, and most of the time is spent allocating, resizing, or reading from the hashmap. Most of these issues could be fixed if the functions that return a hashmap took a mutable reference to one instead.
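A small sketch of that suggestion, for anyone unfamiliar with the pattern. The function below is hypothetical and not taken from Ox's source: it records where a keyword occurs in a line, and instead of returning a fresh `HashMap` on every call it fills one owned by the caller, so the allocation is reused across calls.

```rust
use std::collections::HashMap;

/// Hypothetical helper: map the start byte of every occurrence of `keyword`
/// in `line` to its length, writing into a map the caller owns.
fn find_matches_into(line: &str, keyword: &str, out: &mut HashMap<usize, usize>) {
    // `clear` drops the old entries but keeps the allocated capacity,
    // so repeated calls avoid reallocating and resizing.
    out.clear();
    for (start, matched) in line.match_indices(keyword) {
        out.insert(start, matched.len());
    }
}
```

The caller would create the map once, outside the per-line loop, and pass `&mut map` on each iteration. Whether this removes the hot spot in practice would need to be confirmed by profiling again.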

ghost avatar Jan 24 '21 19:01 ghost

Thank you for the amazing work on this editor, first of all! I think it has immense potential!

Second, regarding the slowness when opening large files: you don't need 10 MB files. A file with 1000 lines of Python makes the editor very, very slow. Try this file: https://raw.githubusercontent.com/aio-libs/aiohttp/master/aiohttp/connector.py

I'm not sure whether the Python parsing definitions are what is slow, but I'm leaving this comment in case it helps.

croqaz avatar Feb 08 '21 14:02 croqaz

"For inspiration, you might look at https://github.com/arximboldi/ewig It uses an immutable string structure to load file into memory."

There is a crate called im that provides many of the data structures used in ewig.

ghost avatar Feb 21 '21 13:02 ghost