irony-mode icon indicating copy to clipboard operation
irony-mode copied to clipboard

Reading large JSON databases is slow

Open Hylen opened this issue 10 years ago • 42 comments

Hi,

I've found that irony-cdb-autosetup-compile-options can be really slow when reading large JSON compilation databases. I noticed this when using irony-mode on llvm itself. Every time I open a new source file there is a ~5s delay.

Based on my attempts at profiling, it seems most of the time is spent in irony-cdb-json--transform-command. Time is also spent in json-read-file (emacs built-in). I am no elisp expert, but as I understand, this only reads and postprocess the database a little. The database file is ~10000 lines, 2 MB, so I don't understand where the 5s come form. Maybe it is the communication to the server? The elisp profiler may not see this time.

Anyway, I suggest moving the database reading to the server. I might also be willing to do this, if you agree. It would be a good way to read up libclang and one of my favorite emacs extensions :)

/Karl

Profiler output:

- ...                                                            3387  92%
 - irony-cdb--autodetect-compile-options                         2942  80%
  - catch                                                        2942  80%
   - let                                                         2942  80%
    - while                                                      2942  80%
     - let                                                       2942  80%
      - funcall                                                  2942  80%
       - irony-cdb-json                                          2942  80%
        - cond                                                   2942  80%
         - irony-cdb-json--get-compile-options                   2942  80%
          - let                                                  2942  80%
           - if                                                  2942  80%
            - progn                                              2942  80%
             - progn                                             2942  80%
              - let                                              2942  80%
               - irony-cdb-json--load-db                         2942  80%
                - delq                                           2942  80%
                 - mapcar                                        2939  80%
                  + irony-cdb-json--transform-compile-command               1958  53%
                  + json-read-file                                981  26%
   Automatic GC                                                   445  12%
+ command-execute                                                 254   6%
+ timer-event-handler                                              10   0%

Hylen avatar Feb 20 '15 14:02 Hylen

Hi,

Yeah, I'm aware of the issue and I have some thoughts on that, I'm still not 100% sure about what's the proper solution.

Based on my attempts at profiling, it seems most of the time is spent in irony-cdb-json--transform-command.

I'm not surprised here, one of the function that may be quite inefficient is irony--split-command-line.

Time is also spent in json-read-file (emacs built-in). I am no elisp expert, but as I understand, this only reads and postprocess the database a little.

I think you understand right. The parsing method is not very efficient but I think it may be difficult to get one that scales well in Elisp (I may be wrong).

The database file is ~10000 lines, 2 MB, so I don't understand where the 5s come form. Maybe it is the communication to the server?

That is not the server, for the compilation database everything happens on the Elisp side.

Anyway, I suggest moving the database reading to the server. I might also be willing to do this, if you agree. It would be a good way to read up libclang and one of my favorite emacs extensions :)

That would be a welcomed change. One thing to take care of is to be able to deduce the compile options if the entry is not directly specified in clang_CompilationDatabase_getCompileCommands(). This logic is what the undocumented function irony-cdb-json--guess-flags does. Let me know if you want me to clear that up.

Right now I'm trying to get my head around about a somewhat clean API for the communication with irony-server, so implementing a new commands do not require to copy/pasta the same code/logic every time. So if you happen to implement an irony-server command, try to make things very simple so I can update it easily once I'm satisfied with the irony-server command.

I'm not sure I'm any clear but I would be glad if you work on this, try to do something simple first and I will comment on it.

Sarcasm avatar Feb 20 '15 22:02 Sarcasm

Hi again,

I skimmed through the elisp and the server code, and I have some basic ideas. As of now, the whole database is read, processed and then only the options corresponding to the current file are filtered out. Maybe the database could be stored in memory?

That would be a welcomed change. One thing to take care of is to be able to deduce the compile options if the entry is not directly specified in clang_CompilationDatabase_getCompileCommands(). This logic is what the undocumented function irony-cdb-json--guess-flags does. Let me know if you want me to clear that up.

I think I kind of understand what happens here. I will look more closely at elisp later. Didn't find the explanation you referred to, but the function is interesting. If I understand correctly, clang_CompilationDatabase_fromDirectory can be used to read the database on the server. I will experiment with it when I got the time.

Right now I'm trying to get my head around about a somewhat clean API for the communication with irony-server, so implementing a new commands do not require to copy/pasta the same code/logic every time. So if you happen to implement an irony-server command, try to make things very simple so I can update it easily once I'm satisfied with the irony-server command.

Understood, I will not touch the server API too much. I guess that I will need to remove passing of the compile options read from the database, but nothing more than that. I will also keep the user defined options in irony-additional-clang-options.

/Karl

Hylen avatar Feb 23 '15 12:02 Hylen

I skimmed through the elisp and the server code, and I have some basic ideas. As of now, the whole database is read, processed and then only the options corresponding to the current file are filtered out. Maybe the database could be stored in memory?

Yes, caching the database should help but I would like to wait before doing it. The parsing should be fast anyway, not only starting from the second time.

If I understand correctly, clang_CompilationDatabase_fromDirectory can be used to read the database on the server.

Yes that's the one we want to use.

Understood, I will not touch the server API too much. I guess that I will need to remove passing of the compile options read from the database, but nothing more than that. I will also keep the user defined options in irony-additional-clang-options.

I don't think that's what we want. It is important that Emacs has access to these flags because they may be used for other things than irony-server's commands.

I think the easiest thing to start would be to have a command in irony-server compile-options FILE BUILD-DIR or something like that, that returns the compile options for the current file (using the same algorithms as the elisp one but on the server side).

Then this function should be called in place of irony-cdb-json--get-compile-options. Right now there is not much support for synchronous call in irony--send-request so that may be another problem for you. The refactoring I'm currently doing on the irony-server side should probably take into account the need for synchronous requests.

There will be room for further improvement but that may help already. Then we can improve on this (e.g: caching the compilation database results if the file modification time didn't change).

Sarcasm avatar Feb 24 '15 23:02 Sarcasm

I don't think that's what we want. It is important that Emacs has access to these flags because they may be used for other things than irony-server's commands.

I think the easiest thing to start would be to have a command in irony-server compile-options FILE BUILD-DIR or something like that, that returns the compile options for the current file (using the same algorithms as the elisp one but on the server side).

Then this function should be called in place of irony-cdb-json--get-compile-options. Right now there is not much support for synchronous call in irony--send-request so that may be another problem for you. The refactoring I'm currently doing on the irony-server side should probably take into account the need for synchronous requests.

I were not aware of this requirement. Then let's for now create a second executable that can be called to read database synchronously. When the server api supports it, it can be moved inside irony-server. This way I will not have to touch anything in the server API, that you are working on.

/Karl

Hylen avatar Feb 25 '15 07:02 Hylen

You can implement the command in the irony-server executable but call irony-server synchronously (e.g: with call-process):

~/.emacs.d/irony/bin/irony-server get-compile-options <build-dir>

Sarcasm avatar Feb 25 '15 11:02 Sarcasm

Is it ok to include boost as a dependency? It might be the easiest way to make routines such as expand-file-name portable on the server side.

I also have some thoughts on the server API. According to the section on filter functions in the elisp manual, the filter functions can read only when emacs waits. If we want to simulate a synchronous call to irony-server, we might use the asynchronous methods currently used in combination with sit-for or sleep-for. This way, the options can be fetched from the sever and the server and need not be passed the options from elisp. We would have to write the .clang_complete stuff on the server as well, but that is a minor problem. What do you think?

Hylen avatar Feb 25 '15 18:02 Hylen

Is it ok to include boost as a dependency?

Nope, I understand the issue but I really want irony to be self-contained, apart from the obvious need to have libclang. There is an alternative solution other than to replicate the full logic of the elisp code into C++. Since it's parsing the JSON and parsing the command line that takes time, it would be possible to "compile" the compilation database into an s-expr, just doing a server implementation of irony-cdb-json--load-db.

I also have some thoughts on the server API. According to the section on filter functions in the elisp manual, the filter functions can read only when emacs waits. If we want to simulate a synchronous call to irony-server, we might use the asynchronous methods currently used in combination with sit-for or sleep-for.

That is not the proper way to do waiting on asynchronous processes. One should use accept-process-output. I actually have a comment on irony--parse-buffer-async which reflect the current thought I have on synchronous methods.

This way, the options can be fetched from the sever and the server and need not be passed the options from elisp. We would have to write the .clang_complete stuff on the server as well, but that is a minor problem. What do you think?

We don't want to do this. We really want Emacs to "master" the command line options so that it can interface with:

  • Emacs' built-in modules, like ff-find-other-file (e.g: irony-cdb could set cc-search-directories)
  • external tools, e.g: call a compiler to generate assembly of the current file/function, like disaster
  • or to interface with some other Clang tools (clang-rename, clang-tidy, clang-modernize, ...).

For now I think it's better to optimize the parsing via irony-server but it shouldn't have too much power and state if possible.

Sarcasm avatar Feb 25 '15 19:02 Sarcasm

Nope, I understand the issue but I really want irony to be self-contained, apart from the obvious need to have libclang. There is an alternative solution other than to replicate the full logic of the elisp code into C++. Since it's parsing the JSON and parsing the command line that takes time, it would be possible to "compile" the compilation database into an s-expr, just doing a server implementation of irony-cdb-json--load-db.

Even if we only do irony-cdb-json--load-db on the server, it still needs a portable equivalent to expand-file-name. Do you suggest we do this from scratch? If we really want it to be portable, I think this could be kind of involved. Or do you suggest using the preprocessor to call different utility functions on different systems, like realpath?

That is not the proper way to do waiting on asynchronous processes. One should use accept-process-output. I actually have a comment on irony--parse-buffer-async which reflect the current thought I have on synchronous methods.

Alright, hadn't seen that one. Then synchronous communication shouldn't be that hard.

We don't want to do this. We really want Emacs to "master" the command line options so that it can interface with:

Emacs' built-in modules, like ff-find-other-file (e.g: irony-cdb could set cc-search-directories) external tools, e.g: call a compiler to generate assembly of the current file/function, like disaster or to interface with some other Clang tools (clang-rename, clang-tidy, clang-modernize, ...). For now I think it's better to optimize the parsing via irony-server but it shouldn't have too much power and state if possible.

Alright, got it.

Hylen avatar Feb 26 '15 13:02 Hylen

Even if we only do irony-cdb-json--load-db on the server, it still needs a portable equivalent to expand-file-name. Do you suggest we do this from scratch? If we really want it to be portable, I think this could be kind of involved. Or do you suggest using the preprocessor to call different utility functions on different systems, like realpath?

I think we could start by just having a function that says whether or not "file" is relative, and if that's so prefix it with the directory. I think we will be pretty good with this. Maybe, if we do only irony-cdb-json--load-db, we can still do the expand-file-name on the Elisp side. I think what is important to have is the command line parsing to be done in C++.

Given:

[
  { "directory": "/home/user/llvm/build",
    "command": "/usr/bin/clang++ -Irelative -DSOMEDEF=\"With spaces, quotes and \\-es.\" -c -o file.o file.cc",
    "file": "file.cc" },
  ...
]

We could return:

("file.cpp" "/home/user/llvm/build" '("/usr/bin/clang++" "-Irelative" "-DSOMEDEF=\"With spaces, quotes and \\-es.\"" "-c" "-o" "file.o" "file.cc"))

Hopefully more work can be done but that's a start.

Alright, hadn't seen that one. Then synchronous communication shouldn't be that hard.

Nope, it's not too difficult, irony used to be synchronous a long time ago but asynchronous was better in the end. And now I think both are useful. :)

Sarcasm avatar Feb 27 '15 00:02 Sarcasm

I think we could start by just having a function that says whether or not "file" is relative, and if that's so prefix it with the directory. I think we will be pretty good with this. Maybe, if we do only irony-cdb-json--load-db, we can still do the expand-file-name on the Elisp side. I think what is important to have is the command line parsing to be done in C++.

What do you mean with command line parsing exactly? irony-cdb-json--transform-compile-command? I think the most important thing is to single out one compile command on the server. I think for instance that the guess-flags logic have to be implemented on the server side.

I have almost completely written the code in irony-cdb-json--get-compile-options, and it should work for UNIX based systems. Only thing left is the guess-flags routine. When I started writing this, I noticed a problem. libclang doesn't have a way of getting the target file name from c CXCompileCommand. I would be glad for workaround advice here. Only solution I see is to implement the parsing too (since I guess including a JSON parsing library isn't an option).

Hylen avatar Feb 27 '15 17:02 Hylen

How do you feel about switching from libclang to libtooling? I think this would solve the problem above and make the code much simpler. Another upside is that it would be safer and nicer to use the C++ API. I guess the downside is that it is less stable.

Hylen avatar Feb 27 '15 19:02 Hylen

I have almost completely written the code in irony-cdb-json--get-compile-options, and it should work for UNIX based systems.

Are you using some kind of syscalls for it to be specific to Linux or is is the path parsing methods that are Linux-specific? I would like to avoid relying on syscall which may be slow on some file system when we should be able to do something quite cheap just with path parsing.

I think for instance that the guess-flags logic have to be implemented on the server side.

Maybe that is a good idea but if it's not slow in Elisp and can be cached easily then it is no big deal. It's better if it's done in C++ but if the overall code is way more complex for no visible benefit that may not be worth the maintenance burden.

When I started writing this, I noticed a problem. libclang doesn't have a way of getting the target file name from c CXCompileCommand. I would be glad for workaround advice here. Only solution I see is to implement the parsing too (since I guess including a JSON parsing library isn't an option).

Damn, that's unfortunate. :-( I was really glad that I didn't have to add a JSON parsing library by seeing the compilation database support but I was wrong apparently. A self-hosted JSON parsing library is an option if libclang doesn't provide this information, something like rapidjson.

How do you feel about switching from libclang to libtooling? I think this would solve the problem above and make the code much simpler. Another upside is that it would be safer and nicer to use the C++ API. I guess the downside is that it is less stable.

I prefer to stay with libclang whose exact purpose is to be used by IDE. I like the fact the API is stable, irony-mode support all kind of versions and I think it is a good thing. I wouldn't be against having another program that uses LibTooling but that would be something optional I think.

Sarcasm avatar Feb 27 '15 23:02 Sarcasm

Are you using some kind of syscalls for it to be specific to Linux or is is the path parsing methods that are Linux-specific? I would like to avoid relying on syscall which may be slow on some file system when we should be able to do something quite cheap just with path parsing.

Just the parsing, and this should be quite easy to fix.

Maybe that is a good idea but if it's not slow in Elisp and can be cached easily then it is no big deal. It's better if it's done in C++ but if the overall code is way more complex for no visible benefit that may not be worth the maintenance burden.

Damn, that's unfortunate. :-( I was really glad that I didn't have to add a JSON parsing library by seeing the compilation database support but I was wrong apparently. A self-hosted JSON parsing library is an option if libclang doesn't provide this information, something like rapidjson.

It would be nice to have it on the server, since it includes a couple of passes over the whole database. But at the same time, I guess the common case is to get the flags directly without guessing. So maybe a first attempt should be something like:

  1. Implement the exact logic on the server (irony-cdb-json--load-db + irony-cdb-json--exact-flags)
  2. If this fails, load the database again in elisp and do the guessing

Using this method, if you have your file in the database, the loading of the commands should be fast. No dependencies need to be added.

I prefer to stay with libclang whose exact purpose is to be used by IDE. I like the fact the API is stable, irony-mode support all kind of versions and I think it is a good thing. I wouldn't be against having another program that uses LibTooling but that would be something optional I think.

I understand this reason. It is up to you which way we take. For now, I focus on the solution above.

Hylen avatar Feb 28 '15 08:02 Hylen

I agree with what you are saying, I think it's a first step in the good direction. :+1:

Sarcasm avatar Feb 28 '15 18:02 Sarcasm

Hi,

Check out the server-database branch in my fork. I've added the the exact flags logic on the server, and made a way of communicating synchronously with irony-server. I still have some stuff I want to do before I send a pull request (documented using TODO comments, and some in my head) but I wanted to here your thoughts before working more on it.

Regards, Karl

Hylen avatar Mar 06 '15 15:03 Hylen

Hi,

Glad to have an early look at the implementation, there are some interesting things.

I have a few comments:

  1. I would expect irony--send-request-sync to return the results, no callback should be needed but please, feel free to let it as is, since the way sending request is implemented will change as soon as I get time to finish it up.
  2. the return value of irony-cdb-json--adjust-compile-options isn't used, it should be even if the function modify its argument by side-effects (maybe it shouldn't).
  3. file-truename on the buffer-file-name isn't necessary, one can use buffer-file-truename directly.
  4. when using accept-process-output, C-g should be allowed to not freeze Emacs if something goes wrong, see https://www.gnu.org/software/emacs/manual/html_node/elisp/Timers.html. FYI, a long time ago, irony used to be blocking and I was using this function to wait for server response: https://github.com/Sarcasm/irony-mode/blob/0111357aff8c30a7fa54393365e822274f0ca739/elisp/irony.el#L371
  5. I prefer the term CompilationDatabase, to Database, for the C++ side.
  6. The command stuff in C++ looks ugly now with readCompileOptions = true in complete and parse, I guess it's a little bit weirdly done. Do not worry about that, it shouldn't be hard to have something less strange in the future.
  7. the irony-server command get-compile-options talks about JSON compilation database. I wouldn't write JSON explicitly in the description as the Clang documentation doesn't limit the functions to JSON but potentially more compilation databases in the future. http://clang.llvm.org/doxygen/group__COMPILATIONDB.html#gae0822a6a54afaea92bad5d3b3256bf26
  8. Do you think you can handle graceful support of older version of Cindex? CXCompilationDatabase is supported since CINDEX_VERSION >= 6 (clang >= 3.2).
  9. Formatting of code is done with Clang-Format, if you cannot use this tool I will point you what are the issues with regard to formatting.
  10. You are calling clang_getCString() without disposing of the returned string. Right now I have a helper function called cxStringToStd that I use sometime but I'm wondering about using unique_ptr with a custom deleter instead. https://github.com/Sarcasm/irony-mode/blob/4a208484fdcdd91ce139c88f32c63fd5bf513855/server/src/Irony.cpp#L24
  11. Please do not call exit in getFlags. irony-server -i should returns the errors it gets so the user can be informed inside Emacs.
  12. I think getExactFlags should return a list of compile options, to not lose information. The irony-cdb compilation databases already support multiple compile options so there is no reason to skip it.

What do you think about a new compilation database instead, one that would use irony-server instead of modifying the existing one?

Feel free to make a pull request next time even if it's no finished, that will allow to comment the code inline.

Even if I have a few remarks I'm really happy that you did this. One thing that would be interesting is to know if the delay you got when loading an LLVM file has disappeared. :)

Sarcasm avatar Mar 08 '15 20:03 Sarcasm

Thanks for the comments, I will have a closer look at them when I have time, and then I'll send a pull request. The delay has dissapeared completely, opening a .cpp file in LLVM feels instantaneous. The problem remains for header files, when the guess logic kicks in. For these files I've measured ~10s delay.

/Karl

Hylen avatar Mar 09 '15 15:03 Hylen

  1. I would expect irony--send-request-sync to return the results, no callback should be needed but please, feel free to let it as is, since the way sending request is implemented will change as soon as I get time to finish it up.

I see your point, and leave it as it is for now.

  1. the return value of irony-cdb-json--adjust-compile-options isn't used, it should be even if the function modify its argument by side-effects (maybe it shouldn't).
  2. file-truename on the buffer-file-name isn't necessary, one can use buffer-file-truename directly.

Ok, changed it.

  1. when using accept-process-output, C-g should be allowed to not freeze Emacs if something goes wrong, see https://www.gnu.org/software/emacs/manual/html_node/elisp/Timers.html. FYI, a long time ago, irony used to be blocking and I was using this function to wait for server response: https://github.com/Sarcasm/irony-mode/blob/0111357aff8c30a7fa54393365e822274f0ca739/elisp/irony.el#L371
  2. I prefer the term CompilationDatabase, to Database, for the C++ side.

I also think CompilationDatabase is a better name, but for clarity I didn't want a name that could be mixed up with the libTooling class. I change to CompilationDatabase for now.

  1. The command stuff in C++ looks ugly now with readCompileOptions = true in complete and parse, I guess it's a little bit weirdly done. Do not worry about that, it shouldn't be hard to have something less strange in the future.

The only thing I changed was adding a boolean value. Previously, the code checked if file had been set, which I think was more unclear.

If I designed the Command class, I would probably go for an abstract base class and use polymorphism instead of the switch and parsing callbacks found in CommandParser::parse. In my opinion, it would be clearer and more flexible.

  1. Formatting of code is done with Clang-Format, if you cannot use this tool I will point you what are the issues with regard to formatting.

I tried to keep pretty close to the LLVM style by hand at first. After reading your comment I ran clang-format on the files I had modified. It changed a lot of formatting in Irony.cpp and main.cpp, and I guess that is not wanted. If you have specific complaints on the formatting, please tell me. I kept the clang-formatting in Database.h and Database.cpp.

  1. You are calling clang_getCString() without disposing of the returned string. Right now I have a helper function called cxStringToStd that I use sometime but I'm wondering about using unique_ptr with a custom deleter instead. https://github.com/Sarcasm/irony-mode/blob/4a208484fdcdd91ce139c88f32c63fd5bf513855/server/src/Irony.cpp#L24

Sloppy of me. Fixed.

  1. Please do not call exit in getFlags. irony-server -i should returns the errors it gets so the user can be informed inside Emacs.
  2. I think getExactFlags should return a list of compile options, to not lose information. The irony-cdb compilation databases already support multiple compile options so there is no reason to skip it.

These two were already on my cleanup list. Fixed.

  1. the irony-server command get-compile-options talks about JSON compilation database. I wouldn't write JSON explicitly in the description as the Clang documentation doesn't limit the functions to JSON but potentially more compilation databases in the future. http://clang.llvm.org/doxygen/group__COMPILATIONDB.html#gae0822a6a54afaea92bad5d3b3256bf26
  2. Do you think you can handle graceful support of older version of Cindex? CXCompilationDatabase is supported since CINDEX_VERSION >= 6 (clang >= 3.2).

What do you think about a new compilation database instead, one that would use irony-server instead of modifying the existing one?

I think you're on to something here. If we do it this way, we can keep the old database for people with libclang < 3.2, and focus on the new compilation database for people with a more current version. Then we might even try libTooling as a way of solving the guess logic (which is quite unbearable as it is now, when opening a header file in a large project). This will be some more work, but I will look over it when I have time.

Feel free to make a pull request next time even if it's no finished, that will allow to comment the code inline.

Okay, here it comes. But perhaps it shouldn't be merged before we've fixed the new database in elisp.

Regards, Karl

Hylen avatar Mar 23 '15 19:03 Hylen

I moved my changes to a new database. Check it out when you got the time.

/Karl

Hylen avatar Apr 03 '15 10:04 Hylen

Great! Will do, sorry for the silence, it's still on my mind. ;)

Sarcasm avatar Apr 03 '15 10:04 Sarcasm

I caught a glimpse and it looks good! I will try to add my remarks tomorrow.

For the header files I'm still not sure what to do. I know some Clang plugins for vim implement a logic as follow "get the flags from the most recently opened file". We can add some kind of configurable method in irony-mode for that.

What we could do for example is to to the guess logic not on the whole JSON compilation database but on all the compile options already available in irony-mode buffers. That is the matter of another issue though.

Sarcasm avatar Apr 03 '15 23:04 Sarcasm

For the header files I'm still not sure what to do. I know some Clang plugins for vim implement a logic as follow "get the flags from the most recently opened file". We can add some kind of configurable method in irony-mode for that.

This is a nice simple idea, that we might try. If the user opens a header file first, I guess it won't work though. Another idea that's been growing in my head is an optional libTooling-based compilation database, mimicking the the guess logic (which I think is nice).

I will have a look at your comments when I have more time, it might be a week or two.

/Karl

Hylen avatar Apr 08 '15 15:04 Hylen

Finally got some time to update according to your comments. Opened a new pull request, #208. Will start looking into another way of dealing with header files next.

Hylen avatar Jun 28 '15 17:06 Hylen

Looks nice, I have a few comments which are mostly nitpicking about the coding style/convention used by irony. If you think it is time consuming to correct I can correct them myself before merging.

Anyway, that's great news!

Feel free to share the result of your brainstorming about the compile options guessing.

Sarcasm avatar Jun 28 '15 18:06 Sarcasm

I've been thinking about the guessing logic and implementation. I think a simple algorithm matching the file name without the extension should work fine for most cases, e.g. Irony.h -> Irony.cpp. We can also provide a command with witch the user can choose compile command from a list of all commands, if the guessing fails.

When it comes to the implementation, the problem is that libclang doesn't provide a way of obtaining the file name. We could use the working directory and compile command to reconstruct all of the file names. Otherwise we could try reading the JSON file a second time, only for obtaining the file names. What do you think?

Hylen avatar Jul 31 '15 14:07 Hylen

I've been thinking about the guessing logic and implementation. I think a simple algorithm matching the file name without the extension should work fine for most cases, e.g. Irony.h -> Irony.cpp.

I think there is a lot of situations where this isn't working. For example, type Error.h on this page: https://github.com/llvm-mirror/llvm/find/8fea35acadea2e55824682a78bef54639e5ecfce

As a side note, having some functionality to guess (and allow the user to customize) a source file's header and vice-versa could be nice. But I guess this is another subject, it can be hard to get it right, simple. Now that I think about it, I think it is already present in Emacs (ff-find-other-file), irony should probably just feed the proper compile options that will help find them. But then, it won't help us in our case.

Something I don't like much with this method is that it won't work with third-party headers and headers with no associated source files, I don't think these are edge cases sadly.

When it comes to the implementation, the problem is that libclang doesn't provide a way of obtaining the file name.

Yep it's too bad, at one point I was considering adding a JSON parser to irony-server.

What do you think?

I wish for something better, I think it is too fragile to rely on a matching source file, I think we would have better luck finding "any" file that include the header, or just have the header's directory in the search path.

Sarcasm avatar Jul 31 '15 15:07 Sarcasm

Okey, argument accepted. But what do you think about the command + working directory => file idea? Should we go with it or implement a second reading of the database file?

If we can manage to get the file names, we will also be able to implement any logic for guessing compilation command. The guessing logic in irony-cdb-json could for instance be reimplemented. The only thing I dislike about this logic is that it doesn't work when headers and cpp files are separated in the source tree. Instead we could rank how similar the paths are from the project root, e.g. src/a/b/c/d.cpp would be a good match for include/a/b/c/d.h.

Hylen avatar Jul 31 '15 16:07 Hylen

Okey, argument accepted. But what do you think about the command + working directory => file idea? Should we go with it or implement a second reading of the database file?

I'm not sure, what do you mean by:

We could use the working directory and compile command to reconstruct all of the file names.

?

The only thing I dislike about this logic is that it doesn't work when headers and cpp files are separated in the source tree. Instead we could rank how similar the paths are from the project root, e.g. src/a/b/c/d.cpp would be a good match for include/a/b/c/d.h.

Yeah I think it is a downer too. I use this at work, LLVM/Clang uses this too.

Sarcasm avatar Jul 31 '15 17:07 Sarcasm

I've uploaded a very early stage of the commit I'm working on, #225.

Hylen avatar Aug 10 '15 18:08 Hylen

I opened a new request #270 for this. If you still decide you don't want anything like this I will break it out as a separate program and maintain it myself.

This version still uses rapidJSON for reading the database. The upstream libclang with the ability of giving the file names as well hasn't been released yet, so I don't think it will be of any use to most people at the moment.

I also looked at the CEDET project you linked. They seem to use the prefix of the path to guess the files, which doesn't give the behaviour we want. They also use json.el to read the database, probably making it as slow as the existing solution in Irony.

I've made an attempt at making the database optional. If boost is not found, it isn't activated at all. Whit this, Irony shouldn't be harder to install than before.

Hylen avatar Dec 30 '15 14:12 Hylen