Long file-open times on large projects

Open mwbuksas opened this issue 9 years ago • 12 comments

I'm using cmake-ide with rtags and have a project that generates a large (159M) compile_commands.json file. This seems to cause a delay of more than a minute when opening a new file. Most of the CPU usage is in cmake-ide--on-cmake-finish, and within it, most of the work is in json-read-file and the delete-dups inside cmake-ide--commands-to-hdr-flags.

Is there anything I can do to mitigate the performance problem?

Failing that, I thought I'd bring the issue to your attention.

mwbuksas avatar Nov 06 '15 17:11 mwbuksas

That's an enormous compile_commands.json. I recently added performance improvements to the JSON reading for my own use case, but this is a lot larger. At that size, I'm assuming it's proprietary? I'll see what I can do, but the fact is that Emacs Lisp isn't that fast.

atilaneves avatar Nov 06 '15 17:11 atilaneves

Understood. The project in question is proprietary. Thanks.

mwbuksas avatar Nov 06 '15 17:11 mwbuksas

I wonder if switching to a binary format would be faster. For example, MongoDB uses BSON (http://bsonspec.org/). There appear to be several Elisp libraries for reading and writing this format:

https://github.com/m2ym/mongo-el/blob/master/bson.el

https://github.com/casualjim/emacs.d/blob/master/elpa/mongo-20120826.14/bson.el

carlos-reyes-123 avatar Nov 11 '15 16:11 carlos-reyes-123

But what CMake supports is the JSON compilation database.

atilaneves avatar Nov 11 '15 17:11 atilaneves

I understand. What I am suggesting is creating a BSON version of the file, something that Elisp can read a lot more quickly. The JSON file would still have to be read once, but opening new files in Emacs after that should be a lot faster.

carlos-reyes-123 avatar Nov 11 '15 21:11 carlos-reyes-123

Interesting

atilaneves avatar Nov 12 '15 18:11 atilaneves

I'd be happy to test new versions when they're ready.

mwbuksas avatar Nov 13 '15 21:11 mwbuksas

As of this commit, cmake-ide no longer tries to parse the JSON compilation database every time a project file is opened. I haven't implemented any sort of binary data saving, since it's fast enough right now for my 400k SLOC work project. It's only slow the first time a file is opened; all subsequent opens should be fast. Can you check with your project? If it's fast enough for you, it should be fast enough for everyone.

atilaneves avatar Jan 15 '16 12:01 atilaneves

Opening files is still very slow here (project isn't that big).

By stripping out useless flags I managed to get the time down from ~4-5 seconds to ~2-3 seconds, but it's still slow enough to be annoying.

Could this file be converted to an intermediate format that can be parsed faster (an elisp literal, for example)? Python's parser runs in 0.008 seconds here, so it could be used to convert the JSON to elisp (as could any other language with a fast JSON parser).
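To illustrate the intermediate-format idea, here is a minimal sketch done entirely in Emacs Lisp rather than Python: parse the JSON once, write the result out as a Lisp literal with `prin1`, and load it later with `read`. The function names `my-cdb-json-to-sexp-file` and `my-cdb-read-sexp-file` are hypothetical and not part of cmake-ide. Whether `read` is actually faster than `json-read-file` on a given Emacs build would need to be measured.

```elisp
;;; -*- lexical-binding: t -*-
(require 'json)

;; Hypothetical helper, not part of cmake-ide.
(defun my-cdb-json-to-sexp-file (json-file sexp-file)
  "Parse JSON-FILE once and write the result to SEXP-FILE as a Lisp literal."
  (let ((data (json-read-file json-file))
        ;; Make sure the printer does not abbreviate long or deep structures.
        (print-length nil)
        (print-level nil))
    (with-temp-file sexp-file
      (prin1 data (current-buffer)))))

;; Hypothetical helper, not part of cmake-ide.
(defun my-cdb-read-sexp-file (sexp-file)
  "Read back the Lisp literal written by `my-cdb-json-to-sexp-file'."
  (with-temp-buffer
    (insert-file-contents sexp-file)
    (goto-char (point-min))
    (read (current-buffer))))
```

The conversion would only need to be repeated when compile_commands.json is regenerated.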

Or could the file be kept in memory with a timestamp and only re-read when the timestamp on disk changes?
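The timestamp idea could look roughly like the sketch below. It keeps the parsed JSON in a hash table together with the file's modification time and re-parses only when that time changes. The names `my-cdb-cache` and `my-cdb-read-cached` are hypothetical, and this is not how cmake-ide itself stores the database.

```elisp
;;; -*- lexical-binding: t -*-
(require 'json)

;; Hypothetical cache, not part of cmake-ide.
(defvar my-cdb-cache (make-hash-table :test #'equal)
  "Map from file name to (MTIME . PARSED-JSON).")

(defun my-cdb-read-cached (file)
  "Return the parsed JSON for FILE, re-reading only when its mtime changed."
  (let ((mtime (nth 5 (file-attributes file))) ; modification time
        (entry (gethash file my-cdb-cache)))
    (if (and entry (equal (car entry) mtime))
        ;; Timestamp unchanged: reuse the in-memory copy.
        (cdr entry)
      ;; First read, or the file changed on disk: parse and remember.
      (let ((data (json-read-file file)))
        (puthash file (cons mtime data) my-cdb-cache)
        data))))
```

With this in place, only the first call after each regeneration of the file pays the parsing cost.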

ideasman42 avatar Sep 10 '17 08:09 ideasman42

There are probably ways around it. The main issue is that elisp isn't that fast and at the time I last tried to tackle this there was no way to write Emacs extensions in C.

atilaneves avatar Sep 12 '17 02:09 atilaneves

Is it possible to cache some more operations? I'm still seeing noticeable slowness (1-2 seconds to open a file) for my project with about 700 source files and a 2.6 MB compile_commands.json. Normally I'm okay with paying the 2 second penalty, but when I start doing large refactoring (e.g. using projectile-replace across a few dozen files), the 2 second penalty starts to add up.

Running the profiler, I see that the time is spent almost entirely in cmake-ide--set-flags-for-file (called from cmake-ide--on-cmake-finished) and in the garbage collector. Here's an example of opening a file (after having already opened other files in the project). cmake-ide--src-buffers has one file and cmake-ide--hdr-buffers is empty:

- command-execute                                                 943  70%
 - call-interactively                                             943  70%
  - funcall-interactively                                         819  61%
   - dired-find-file                                              612  45%
    - find-file                                                   611  45%
     - find-file-noselect                                         603  45%
      - find-file-noselect-1                                      601  44%
       - after-find-file                                          601  44%
        - run-hooks                                               589  44%
         - cmake-ide-maybe-run-cmake                              573  42%
          - cmake-ide--on-cmake-finished                          569  42%
           - mapc                                                 540  40%
            - #<compiled 0x40bc0f65>                              540  40%
             - cmake-ide--set-flags-for-file                      540  40%
              - cmake-ide--commands-to-hdr-flags                  417  31%
               - cmake-ide--delete-dup-hdr-flags                  182  13%
                - cmake-ide--filter                                91   6%
                 - mapcar                                          90   6%
                  - #<compiled 0x41563a67>                         90   6%
                     cmake-ide--dash-i-or-dash-d-p                  5   0%
                - cmake-ide--flags-filtered                        89   6%
                 - cmake-ide--filter                               89   6%
                  - mapcar                                         87   6%
                   - #<compiled 0x4153665b>                        85   6%
                    - #<compiled 0x4146d3f5>                       83   6%
                       cmake-ide--dash-i-or-dash-d-p                  9   0%
                  delete-dups                                       2   0%
               - cmake-ide--args-to-only-flags                    115   8%
                - cmake-ide--filter                               115   8%
                 - mapcar                                          55   4%
                  - #<compiled 0x41522be5>                         55   4%
                   - #<compiled 0x4146d1e5>                        55   4%
                    - cmake-ide--is-src-file                       55   4%
                     - cl-some                                      5   0%
                        #<compiled 0x445449e5>                      2   0%
               - mapcar                                           101   7%
                - cmake-ide--remove-compiler-from-args-string                101   7%
                 - cmake-ide--split-command                       101   7%
                  - split-string-and-unquote                       84   6%
                   - split-string-and-unquote                      65   4%
                    - split-string-and-unquote                     57   4%
                     - split-string-and-unquote                    54   4%
                        split-string                               51   3%
                       split-string                                 3   0%
                      split-string                                  1   0%
                     split-string                                   1   0%
               - cmake-ide--filter                                 19   1%
                + mapcar                                           15   1%
              - cmake-ide--idb-all-commands                        84   6%
               - mapcar                                            82   6%
                - #<compiled 0x4146871f>                           82   6%
                 - cmake-ide--file-params-to-args                  66   4%
                  - cmake-ide--split-command                       64   4%
                   - split-string-and-unquote                      47   3%
                    - split-string-and-unquote                     41   3%
                     + split-string-and-unquote                    29   2%
                      split-string                                  1   0%
                  + mapcar                                          2   0%
                   s-join                                          16   1%
               + cmake-ide--idb-all-objs                            2   0%
              - cmake-ide--set-flags-for-src-file                  38   2%
               - cmake-ide-set-compiler-flags                      37   2%
                - cmake-ide--flags-to-include-paths                 36   2%
                 - mapcar                                          35   2%
                  - #<compiled 0x4146d36d>                         35   2%
                   - cmake-ide--get-build-dir                      35   2%
                    - cmake-ide--locate-project-dir                 33   2%
                     - cmake-ide--locate-cmakelists                 33   2%
                      - cmake-ide--locate-cmakelists-impl                 33   2%
                       + cmake-ide--locate-cmakelists-impl                 21   1%
                       + locate-dominating-file                    12   0%
                 + cmake-ide--to-simple-flags                       1   0%
                + cmake-ide--filter-ac-flags                        1   0%
               + cmake-ide--params-to-src-includes                  1   0%
                cmake-ide--message                                  1   0%
           + cmake-ide--cdb-json-file-to-idb                       15   1%
           + cmake-ide--run-rc                                     14   1%
          + cmake-ide-maybe-start-rdm                               3   0%
          + cmake-ide--need-to-run-cmake                            1   0%
         + vc-refresh-state                                        16   1%
        + normal-mode                                              12   0%
        file-truename                                               1   0%
      + create-file-buffer                                          1   0%
     + switch-to-buffer                                             8   0%
      dired-get-file-for-visit                                      1   0%
   + execute-extended-command                                     158  11%
   + find-file-at-point                                            31   2%
   + next-line                                                     13   0%
   + previous-line                                                  3   0%
   + profiler-report-toggle-entry                                   1   0%
   + dired-next-line                                                1   0%
  + byte-code                                                     124   9%
- ...                                                             343  25%
   Automatic GC                                                   333  24%
 + minibuffer-complete                                             10   0%
+ timer-event-handler                                              24   1%
+ redisplay_internal (C function)                                  22   1%
+ rtags-diagnostics-process-filter                                  5   0%

It looks like a lot of expensive string processing goes into translating the single entry for a file's command line into the appropriate entries. However, within a project, the flags are often repeated across a large number of files. For example, in this project of 700 files, there are 8 different modules with distinct command line args, so there are only 8 unique combinations of command line params.

I'm not an elisp expert, but would it be possible to cache the parsed data structure that was derived from the command line args, and store it in a map keyed by the original command line string (minus the source file name)? Then, when entering cmake-ide--set-flags-for-file, you could check the map to see if the set already exists and use it.
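As a rough sketch of that caching idea: the code below memoizes parsed flags in a hash table keyed by the command string with the source file name removed. All names (`my-flags-cache`, `my-command-key`, `my-parse-flags`, `my-flags-for-file`) are hypothetical, and `my-parse-flags` is only a stand-in for the real, more expensive processing in cmake-ide.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical cache, not part of cmake-ide.
(defvar my-flags-cache (make-hash-table :test #'equal)
  "Map from command string (without the source file) to parsed flags.")

(defun my-command-key (command file)
  "Return COMMAND with every occurrence of FILE removed."
  (replace-regexp-in-string (regexp-quote file) "" command t t))

(defun my-parse-flags (command)
  "Stand-in for the expensive step: keep only -I and -D arguments."
  (let ((case-fold-search nil)
        flags)
    (dolist (arg (split-string-and-unquote command) (nreverse flags))
      (when (string-match-p "\\`-[ID]" arg)
        (push arg flags)))))

(defun my-flags-for-file (command file)
  "Return parsed flags for COMMAND, reusing results for identical keys."
  (let* ((key (my-command-key command file))
         (cached (gethash key my-flags-cache 'miss)))
    (if (not (eq cached 'miss))
        cached
      ;; `puthash' returns the stored value.
      (puthash key (my-parse-flags key) my-flags-cache))))
```

Real compile commands usually also contain a per-file output argument such as `-o path/to/file.o`, so a working key would have to strip that as well; otherwise every file would still get its own cache entry.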

A bonus would be that you wouldn't need to throw this map away even if you generated a new compile_commands.json, since the compilation strings would uniquely determine the rest of the config. In fact, you could share this map across all projects (it is unlikely that entries would be shared between projects, but there would also be no need to keep them separate).

sbroberg avatar Jan 24 '18 17:01 sbroberg

@sbroberg Thanks for the analysis. I'll see what I can do when I have time. In the meantime you can tune Emacs's GC to have fewer collections. I have this in my init.el:

(set 'gc-cons-threshold 100000000)
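If raising the threshold globally is undesirable, one alternative is to raise it only around the expensive work. The macro below is a minimal sketch of that approach; `my-with-relaxed-gc` is a hypothetical name and the value simply mirrors the setting above.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, not part of cmake-ide or Emacs.
(defmacro my-with-relaxed-gc (&rest body)
  "Run BODY with a temporarily raised `gc-cons-threshold'."
  (declare (indent 0))
  ;; `gc-cons-threshold' is a special variable, so this `let' binds it
  ;; dynamically and the previous value is restored when BODY exits.
  `(let ((gc-cons-threshold 100000000))
     ,@body))
```

For example, `(my-with-relaxed-gc (json-read-file "compile_commands.json"))` would parse the file with fewer collections while leaving the global setting untouched.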

atilaneves avatar Jan 25 '18 11:01 atilaneves