Control "do-not-find-pdf" argument with a variable

Open • ashiklom opened this issue 3 years ago • 1 comment

For moderately large libraries (mine is ~3 MB) with lots of PDF files, the call to bibtex-completion-find-pdf inside bibtex-completion-prepare-entry becomes a significant bottleneck (see profile under "Details"). It would be nice to be able to disable it with a variable such as bibtex-completion-do-not-find-pdf. An easy fix would be to change these lines:

https://github.com/tmalsburg/helm-bibtex/blob/master/bibtex-completion.el#L824-L826

...to something like this:

           (entry (if (and (not do-not-find-pdf)
                           (not bibtex-completion-do-not-find-pdf)
                           (bibtex-completion-find-pdf entry))
                      (cons (cons "=has-pdf=" bibtex-completion-pdf-symbol) entry)
                    entry))

The default for bibtex-completion-do-not-find-pdf would be nil to preserve existing behavior.
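
A minimal sketch of how such an option could be declared. The variable is the hypothetical one proposed in this issue, not an existing bibtex-completion option, and the docstring wording is an assumption:

```elisp
;; Hypothetical user option proposed in this issue; it is not part of
;; bibtex-completion.
(defcustom bibtex-completion-do-not-find-pdf nil
  "If non-nil, skip the PDF lookup when preparing entries.
The default, nil, preserves the existing behavior."
  :group 'bibtex-completion
  :type 'boolean)
```

With the modified condition above, setting this to t in an init file would skip the bibtex-completion-find-pdf call entirely. The trade-off is that no entry would receive the "=has-pdf=" field.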

            - bibtex-completion-candidates                                                     1713  67%
             - apply                                                                           1713  67%
              - #<compiled -0xa9a41cb876ce6d0>                                                 1713  67%
               - bibtex-completion-parse-bibliography                                           957  37%
                - bibtex-completion-prepare-entry                                               786  30%
                 - bibtex-completion-find-pdf                                                   689  27%
                  - bibtex-completion-find-pdf-in-library                                       689  27%
                   - -first                                                                     681  26%
                      f-file?                                                                   681  26%
                   + bibtex-completion-get-value                                                  3   0%
                   + f-join                                                                       3   0%
                   + s-concat                                                                     2   0%
                 + mapcar                                                                        88   3%
                 + bibtex-completion-remove-duplicated-fields                                     8   0%
                   member-ignore-case                                                             1   0%
                + parsebib-read-entry                                                           168   6%
                  parsebib-find-next-item                                                         1   0%
               + insert-file-contents                                                           576  22%
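
A report like the one above can be produced with Emacs's built-in CPU profiler. The helper below is only a sketch of one way to do it; `my/profile-call` is a hypothetical name, and the usage line assumes bibtex-completion is already loaded:

```elisp
(require 'profiler)

(defun my/profile-call (fn)
  "Call FN with the CPU profiler running, then show the report.
Returns whatever FN returns."
  (profiler-start 'cpu)
  (unwind-protect
      (funcall fn)
    ;; Build the report while the profiler is still running, then stop it.
    (profiler-report)
    (profiler-stop)))

;; Usage (assumes bibtex-completion is loaded and configured):
;; (my/profile-call #'bibtex-completion-candidates)
```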

ashiklom, Jan 08 '21 15:01

Hey, thanks for sending the profiling data. Yes, the code for finding PDFs is in desperate need of a rewrite. Ideally there would be a plug-in infrastructure that people can use to mix and match various ways of locating PDFs according to their needs. This would address a whole bunch of issues here: if you're not referencing PDFs in some particular way, you'd just switch off that plugin and things would become faster as a result. Currently, the code tries all methods for locating PDFs no matter what.

But some algorithms for locating PDFs could also be made more efficient. I just looked at the code for finding PDFs with the name BibTeX-key.pdf and see that it's touching the filesystem much more than necessary. The time spent in f-file? can be cut at least in half. This line is the worst offender, I think. So in sum, I think there are better opportunities to shave off some processing time than introducing new customization variables as a stop-gap solution. I will try to make some improvements next week (but if you'd like to have a shot at it, you'd be most welcome to submit a PR).
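
To make the two ideas concrete, here is a rough sketch, not helm-bibtex's actual code. It combines a list of switchable finder functions with a finder that lists each PDF directory once and answers lookups from a hash table, so no per-entry file check is needed. All `my/` names are hypothetical, and the one-PDF-per-key layout is an assumption:

```elisp
(require 'seq)

(defvar my/pdf-finder-functions '(my/find-pdf-by-key)
  "Hypothetical list of finder functions, tried in order.
Each takes a BibTeX key and returns a list of PDF paths or nil.
Removing a function from this list switches that method off.")

(defvar my/pdf-directories nil
  "Hypothetical list of directories that hold KEY.pdf files.")

(defvar my/pdf-index nil
  "Hash table mapping BibTeX keys to PDF paths, or nil if not built.")

(defun my/build-pdf-index ()
  "Scan `my/pdf-directories' once and index every PDF by base name."
  (let ((index (make-hash-table :test #'equal)))
    (dolist (dir my/pdf-directories)
      (when (file-directory-p dir)
        (dolist (file (directory-files dir t "\\.pdf\\'"))
          (let ((key (file-name-base file)))
            ;; Keep the first match so earlier directories take precedence.
            (unless (gethash key index)
              (puthash key file index))))))
    (setq my/pdf-index index)))

(defun my/find-pdf-by-key (key)
  "Return a list holding the PDF path for KEY, or nil.
Answers from the prebuilt index instead of probing the filesystem."
  (unless my/pdf-index (my/build-pdf-index))
  (let ((file (gethash key my/pdf-index)))
    (and file (list file))))

(defun my/find-pdf (key)
  "Return the result of the first finder that succeeds for KEY."
  (seq-some (lambda (finder) (funcall finder key))
            my/pdf-finder-functions))
```

The index trades freshness for speed: a PDF added after the index is built is not seen until `my/build-pdf-index` runs again, so a real implementation would need some invalidation strategy.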

The 576ms for insert-file-contents is a big chunk but I'm afraid we won't get rid of it. That's just Emacs being slow when reading the bibliography file. (Although it may help to force this buffer to fundamental mode if it's not in that mode yet.)
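
One way to check how much of that time is the raw read is to time it in isolation. A small sketch, with `my/time-bib-read` as a hypothetical helper name; it reads the file into a temporary buffer, which starts out in fundamental mode:

```elisp
(require 'benchmark)

(defun my/time-bib-read (file)
  "Return the seconds taken to read FILE into a temporary buffer.
Temporary buffers start in `fundamental-mode', so no major-mode
setup is included in the measurement."
  (car (benchmark-run 1
         (with-temp-buffer
           (insert-file-contents file)))))

;; Usage (the path is a placeholder):
;; (my/time-bib-read "~/library.bib")
```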

Other than that I highly recommend switching to Emacs' new native-comp branch. On my system, helm-bibtex is about 3 times faster with native compilation.
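
To see whether a given Emacs build supports native compilation, a check along these lines should work on Emacs 28 and later. `my/native-comp-p` is a hypothetical helper; on older versions the `fboundp` guard simply makes it return nil:

```elisp
(defun my/native-comp-p ()
  "Return t if this Emacs can natively compile Lisp, else nil."
  (and (fboundp 'native-comp-available-p)
       (native-comp-available-p)
       t))
```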

tmalsburg, Jan 08 '21 16:01