
How to improve the speed of opening big files

Open mihan007 opened this issue 5 years ago • 5 comments

I've installed the Sublime package and the server part. It works like a charm for small files (tried it on an nginx config), but when I try rsubl big_log_file.log, nothing happens. I tried a log file of about 12 MB.

What are the limits on file size?

mihan007 avatar Mar 06 '19 07:03 mihan007

Hm, it did open the big log file, but it took more than 5 minutes (I had already posted the issue and forgotten about it by the time Sublime showed it). With scp it takes 2 seconds to download that log file. So let me clarify the issue: how can I improve the speed of opening big files?

mihan007 avatar Mar 06 '19 07:03 mihan007

I guess the reason is that we are parsing the stream line by line. It may be more efficient to parse multiple lines at a time, especially for the file data part.
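
As a rough illustration of that idea (not the plugin's actual code: read_file_payload and CHUNK_SIZE are made-up names, and it assumes the payload length is already known from the protocol's data header), reading the file data in large chunks instead of line by line could look something like this:

import socket

CHUNK_SIZE = 64 * 1024  # illustrative read size, not a real RemoteSubl setting

def read_file_payload(conn, length):
    # conn is a connected socket.socket; read exactly `length` bytes of
    # file data in large chunks rather than one line at a time (sketch only).
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = conn.recv(min(CHUNK_SIZE, remaining))
        if not chunk:
            raise ConnectionError("connection closed before full payload arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)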

randy3k avatar Mar 06 '19 13:03 randy3k

Python bytes/string concatenation copies the entire string each time, which is O(n) in the length of the string. Since this plugin goes line by line and concatenates each line onto a buffer, the number of concatenations is also O(n), so building the buffer for a whole file is O(n²). This is very slow when opening files with 100,000 lines (like binary disassembly, for example).

Compare these examples:

Building up a bytes object by repeated concatenation takes 308 seconds:

FILLER_TEXT = "an arbitrary line of filler text\n"  # stand-in value, not from the original

data = b""
for i in range(100000):
    # each += copies the entire buffer built so far, so the loop is O(n^2)
    data += FILLER_TEXT.encode("utf-8")

Building up a list and joining once at the end takes 0.1 seconds:

data = []
for i in range(100000):
    # appending to a list is amortized O(1)
    data.append(FILLER_TEXT.encode("utf-8"))
# a single join copies each chunk exactly once, so the total work is O(n)
b"".join(data)

This plugin currently uses the first method. I have a basic patch on my fork which uses the second method. I'm not interested in opening a PR right now; I just got very distracted trying to open a file :) But my code can be considered public domain if someone else wants to incorporate this change.

https://github.com/MatthiasPortzel/RemoteSubl/commit/98786ca6b1830afd1000ba5552b4bcea0874ce22

You can't get faster than O(n), of course, but if you wanted to do it right, you would write each chunk into the file as it came in; this would make memory use O(1) instead of O(n).
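
A rough sketch of that idea (hypothetical names, not the plugin's actual code), with the receive loop writing straight to the destination file instead of accumulating chunks in memory:

def stream_payload_to_file(conn, length, path):
    # Write each chunk to disk as it arrives, so at most one chunk
    # is held in memory at a time (sketch only).
    remaining = length
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = conn.recv(min(64 * 1024, remaining))
            if not chunk:
                raise ConnectionError("connection closed before full payload arrived")
            f.write(chunk)
            remaining -= len(chunk)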

MatthiasPortzel avatar Jul 19 '24 01:07 MatthiasPortzel

Thanks @MatthiasPortzel I have merged your changes into master.

randy3k avatar Aug 16 '24 05:08 randy3k

Thanks for your work maintaining this package @randy3k!

MatthiasPortzel avatar Aug 16 '24 11:08 MatthiasPortzel