helm-bibtex icon indicating copy to clipboard operation
helm-bibtex copied to clipboard

do not reverse (and unreverse) the files list

Open malb opened this issue 3 years ago • 3 comments

BibTeX requires files in a particular order (dependencies first) and so does this library (also dependencies first). However, the double reversal makes the latter incompatible with the former.

Consider a bibliographical database spread across two bib files, e.g.

% file-1.bib

@string{eurocryptname =         "EUROCRYPT"}
@string{eurocryptpub =          springer}
@string{eurocrypt17-2 =         "EC17-2"}
@string{eurocrypt17name2 =      eurocryptname # "~2017, Part~II"}
@string{eurocrypt17ed =         "Jean-S{\'{e}}bastien Coron and Jesper Buus Nielsen"}
@string{eurocrypt17addr =       "Paris, France"}
@string{eurocrypt17month =      apr # "~30~--~" # may # "~4,"}

% file-2.bib

@Proceedings{EC17-2,
  title =        eurocrypt17name2,
  editor =       eurocrypt17ed,
  booktitle =    eurocrypt17name2,
  volume =       eurocrypt17vol2,
  address =      eurocrypt17addr,
  month =        eurocrypt17month,
  publisher =    eurocryptpub,
  series =       mylncs,
  year =         2017,
  key =          eurocrypt17key2,
}

@InProceedings{EC:Albrecht17,
  author =       "Martin R. Albrecht",
  title =        "On Dual Lattice Attacks Against Small-Secret {LWE} and Parameter Choices in {HElib} and {SEAL}",
  pages =        "103--129",
  doi =          "10.1007/978-3-319-56614-6_4",
  crossref =     eurocrypt17-2,
}

Both BibTeX and this library require file-1.bib to come before file-2.bib but the double reversal means this library processed files in the opposite order of BibTeX.

malb avatar Dec 20 '21 15:12 malb

Hi Martin. Sorry, but I don't understand which problem this PR is trying to solve. When I used the example data you provided above, things worked as expected.

tmalsburg avatar Jan 10 '22 13:01 tmalsburg

I cannot reproduce it now either. Sorry for the noise!

malb avatar Jan 30 '22 12:01 malb

Sorry, this is hamfisted but it at least shows there is some issue here (and I wasn't dreaming :))

(package-initialize)
(load "~/Downloads/testbed/helm-bibtex/bibtex-completion.el")
(load "~/Downloads/testbed/helm-bibtex/helm-bibtex.el")

;; git clone https://github.com/cryptobib/export_crossref cryptobib

(setq bibtex-completion-bibliography
      '("~/Downloads/testbed/cryptobib/abbrev3.bib"
        "~/Downloads/testbed/cryptobib/crypto_crossref.bib"))

(defun bibtex-completion-candidates ()
  "Read the BibTeX files and return a list of conses, one for each entry.
The first element of these conses is a string containing authors,
editors, title, year, type, and key of the entry.  This string
is used for matching.  The second element is the entry (only the
fields listed above) as an alist."
  (let ((files (nreverse (bibtex-completion-normalize-bibliography 'bibtex)))
        (ht-strings (make-hash-table :test #'equal))
        reparsed-files)

    ;; Open each bibliography file in a temporary buffer,
    ;; check hash of bibliography and mark for reparsing if necessary:

    (cl-loop
     for file in files
     do
     (with-temp-buffer
       (insert-file-contents file)
       (let ((bibliography-hash (secure-hash 'sha256 (current-buffer))))
         (unless (string= (cadr (assoc file bibtex-completion-cache))
                          bibliography-hash)
           ;; Mark file as reparsed.
           ;; This will be useful to resolve cross-references:
           (push file reparsed-files)))))

    ;; TODO This code doesn't belong here.  It's specific to just one
    ;; way of doing notes and should be a module.  Could be run via a
    ;; hook, a defadvice, or perhaps via inotify when the notes file
    ;; changes.  The benefit of the last option is that it needs no
    ;; new interface in core bibtex-completion and runs independently.
    ;; The downside is that we get in trouble if the user changes the
    ;; `bibtex-completion-notes-path'.  We're then tracking an
    ;; incorrect file.

    ;; TODO This code does not respect
    ;; `bibtex-completion-notes-key-pattern'.

    ;; TODO There should be a checksum for the notes file and the keys
    ;; should only be collected if this checksum has changed.

    ;; TODO This code should only be run if we actually reload BibTeX
    ;; files.  No need to do it otherwise.
    (when (and bibtex-completion-notes-path
               (f-file? bibtex-completion-notes-path))
      (require 'org-element)
      (with-temp-buffer
	    (org-mode)     ;;  need this to avoid error in emacs 25.3.1
        (insert-file-contents bibtex-completion-notes-path)
        (setq bibtex-completion-cached-notes-keys
              (let ((tree (org-element-parse-buffer 'headline)))
                (org-element-map tree 'headline
                  (lambda (key) (org-element-property :CUSTOM_ID key)))))))

    ;; reparse if necessary

    (when reparsed-files
      (cl-loop
       for file in files
       do
       (with-temp-buffer
         (insert-file-contents file)
         (let ((bibliography-hash (secure-hash 'sha256 (current-buffer))))
           (if (not (member file reparsed-files))
               (bibtex-completion-update-strings-ht ht-strings
                                                    (cddr (assoc file bibtex-completion-string-cache)))
             (progn
               (message "Parsing bibliography file %s ..." file)
               (bibtex-completion-clear-string-cache (list file))
               (push (-cons* file
                             bibliography-hash
                             (bibtex-completion-parse-strings ht-strings))
                     bibtex-completion-string-cache)

               (bibtex-completion-clear-cache (list file))
               (push (-cons* file
                             bibliography-hash
                             (bibtex-completion-parse-bibliography ht-strings))
                     bibtex-completion-cache))))))
      (setf bibtex-completion-string-hash-table ht-strings))

    ;; If some files were reparsed, resolve cross-references:
    (when reparsed-files
      (message "Resolving cross-references ...")
      (bibtex-completion-resolve-crossrefs files reparsed-files))

    ;; Finally return the list of candidates:
    (message "Done (re)loading bibliography.")
    (nreverse
     (cl-loop
      for file in files
      append (cddr (assoc file bibtex-completion-cache))))))

(bibtex-completion-candidates)
(helm-bibtex)

;; W/O: nreverse
(defun bibtex-completion-candidates ()
  "Read the BibTeX files and return a list of conses, one for each entry.
The first element of these conses is a string containing authors,
editors, title, year, type, and key of the entry.  This string
is used for matching.  The second element is the entry (only the
fields listed above) as an alist."
  (let ((files (bibtex-completion-normalize-bibliography 'bibtex))
        (ht-strings (make-hash-table :test #'equal))
        reparsed-files)

    ;; Open each bibliography file in a temporary buffer,
    ;; check hash of bibliography and mark for reparsing if necessary:

    (cl-loop
     for file in files
     do
     (with-temp-buffer
       (insert-file-contents file)
       (let ((bibliography-hash (secure-hash 'sha256 (current-buffer))))
         (unless (string= (cadr (assoc file bibtex-completion-cache))
                          bibliography-hash)
           ;; Mark file as reparsed.
           ;; This will be useful to resolve cross-references:
           (push file reparsed-files)))))

    ;; TODO This code doesn't belong here.  It's specific to just one
    ;; way of doing notes and should be a module.  Could be run via a
    ;; hook, a defadvice, or perhaps via inotify when the notes file
    ;; changes.  The benefit of the last option is that it needs no
    ;; new interface in core bibtex-completion and runs independently.
    ;; The downside is that we get in trouble if the user changes the
    ;; `bibtex-completion-notes-path'.  We're then tracking an
    ;; incorrect file.

    ;; TODO This code does not respect
    ;; `bibtex-completion-notes-key-pattern'.

    ;; TODO There should be a checksum for the notes file and the keys
    ;; should only be collected if this checksum has changed.

    ;; TODO This code should only be run if we actually reload BibTeX
    ;; files.  No need to do it otherwise.
    (when (and bibtex-completion-notes-path
               (f-file? bibtex-completion-notes-path))
      (require 'org-element)
      (with-temp-buffer
	    (org-mode) ;;  need this to avoid error in emacs 25.3.1
        (insert-file-contents bibtex-completion-notes-path)
        (setq bibtex-completion-cached-notes-keys
              (let ((tree (org-element-parse-buffer 'headline)))
                (org-element-map tree 'headline
                  (lambda (key) (org-element-property :CUSTOM_ID key)))))))

    ;; reparse if necessary

    (when reparsed-files
      (cl-loop
       for file in files
       do
       (with-temp-buffer
         (insert-file-contents file)
         (let ((bibliography-hash (secure-hash 'sha256 (current-buffer))))
           (if (not (member file reparsed-files))
               (bibtex-completion-update-strings-ht ht-strings
                                                    (cddr (assoc file bibtex-completion-string-cache)))
             (progn
               (message "Parsing bibliography file %s ..." file)
               (bibtex-completion-clear-string-cache (list file))
               (push (-cons* file
                             bibliography-hash
                             (bibtex-completion-parse-strings ht-strings))
                     bibtex-completion-string-cache)

               (bibtex-completion-clear-cache (list file))
               (push (-cons* file
                             bibliography-hash
                             (bibtex-completion-parse-bibliography ht-strings))
                     bibtex-completion-cache))))))
      (setf bibtex-completion-string-hash-table ht-strings))

    ;; If some files were reparsed, resolve cross-references:
    (when reparsed-files
      (message "Resolving cross-references ...")
      (bibtex-completion-resolve-crossrefs files reparsed-files))

    ;; Finally return the list of candidates:
    (message "Done (re)loading bibliography.")
    (cl-loop
     for file in files
     append (cddr (assoc file bibtex-completion-cache)))))

(bibtex-completion-clear-cache)
(bibtex-completion-candidates)
(helm-bibtex)

If you then use the two helm prompts to search for e.g. "EC:Albrecht17" you get a missing year first and then you get it.

malb avatar Feb 02 '22 10:02 malb

Sorry for not getting back to this earlier. Too much going on.

I've got to say that I don't remember why we reverse and then unreverse. Your PR just gets rid of this and I assume that solves your specific problem. But are we sure that it doesn't break something else that relies on the reversal?

tmalsburg avatar Oct 11 '22 16:10 tmalsburg

I've been using it since opening the ticket, but I didn't systematically test it, so not sure if that's enough.

malb avatar Oct 11 '22 16:10 malb

So with this PR, strings are resolved as expected but otherwise nothing changes? Sorry for asking but I don't fully understand what's going on here.

tmalsburg avatar Oct 11 '22 16:10 tmalsburg

Yep, that's my impression, all working as expected and strings resolved.

malb avatar Oct 11 '22 16:10 malb

Then let's do it!

tmalsburg avatar Oct 11 '22 18:10 tmalsburg

Thanks for the PR, and sorry again that it took so long.

tmalsburg avatar Oct 11 '22 18:10 tmalsburg

Thank you :)

malb avatar Oct 12 '22 08:10 malb

After merging this PR the order of candidates is reversed. I was under the impression that the presented order shouldn't change. Am I missing something?

I prefer the reverse order for two reasons: 1. This way the latest articles (added at the bottom) appear at the top. 2. It's been this way forever, and I wouldn't want to irritate users with seemingly random changes.

tmalsburg avatar Oct 20 '22 09:10 tmalsburg

Oh! Sorry, I hadn't noticed that (my bib files are massive in size and I always only search in them). At least we now know why you reversed :) Perhaps we can reverse after processing to get this result?

malb avatar Oct 20 '22 10:10 malb

Yes, that's why I reversed. But the code used to reverse and then unreverse. I think your PR removed both and I thought that it should therefore have no effect on the order. But I don't seem to understand my own code years after having written it :)

tmalsburg avatar Oct 20 '22 11:10 tmalsburg

I restored the original order.

tmalsburg avatar Oct 24 '22 09:10 tmalsburg