citar
File parsing: Zotero, Windows paths
I guess this is the right issue to follow up on...
I use Zotero with ZotFile on Windows to manage PDFs in a custom directory. When I try to open a PDF from citar with citar-open-files, it fails with the following (slightly edited) error:
None of the files for ‘paperOfInterest2022’ exist; check ‘citar-library-paths’ and ‘citar-file-parser-functions’: ("C\\:\\\\Users\\\\username\\\\Documents\\\\Biblio\\\\Articles\\\\2022\\\\Author_et_al-Paper_of_interest.pdf")
While I'm not that familiar with Windows pathnames, I worked out by experimenting in pwsh that the escaped drive letter "C\: ..." is not valid.
In the BIB file (generated with Zotero's Better BibTeX plugin) the field is:
@article{articleOfInterest2022,
...
file = {C\:\\Users\\username\\Documents\\Biblio\\Articles\\2022\\articleOfInterst2022.pdf}
}
Originally posted by @allumik in https://github.com/emacs-citar/citar/issues/296#issuecomment-1252336277
I created a new issue for this @allumik. I don't really know much at all about Windows though.
I'm going to ask you to do some research into Emacs and Windows file paths. I don't think this is really a citar issue. For example, I suspect that if you try to open that file path from IELM using something like find-file or file-exists-p, it will fail.
Depending on what you find, however, we might be able to tweak something.
I don't use Windows, so can't test this.
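A minimal sketch of that check, for anyone who wants to try it in IELM. The function name my/file-field-paths-exist is hypothetical, and the two example arguments are just the raw and unescaped forms of the path from the report above. The result depends on the machine, so no expected output is shown.

```elisp
;; Hypothetical helper, not part of citar: report whether the raw
;; (still escaped) and the unescaped form of a path exist on disk.
(defun my/file-field-paths-exist (raw unescaped)
  "Return a list (RAW-EXISTS UNESCAPED-EXISTS) of booleans."
  (list (and (file-exists-p raw) t)
        (and (file-exists-p unescaped) t)))

;; Each backslash is doubled here only because of Lisp string syntax.
(my/file-field-paths-exist
 "C\\:\\\\Users\\\\username\\\\Documents\\\\Biblio\\\\Articles\\\\2022\\\\articleOfInterst2022.pdf"
 "C:\\Users\\username\\Documents\\Biblio\\Articles\\2022\\articleOfInterst2022.pdf")
```

If the first element is nil and the second is t, the escaped path is the problem rather than anything citar does after parsing.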
Hi bdarcus,
I had the same problem, but managed to fix it with the following code.
I replaced the default parser function (citar-file--parser-default) with a function that removes the excess backslashes.
Then I added the new function to the citar-file-parser-functions list.
Note that this is a quick and dirty fix; no doubt there is a more elegant solution.
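;; Use the replacement parser instead of `citar-file--parser-default'.
(setq citar-file-parser-functions
      '(cv/citar-file--parser-default
        citar-file--parser-triplet))

(defun cv/citar-file--parser-default (file-field)
  "Replacement for `citar-file--parser-default' that handles Windows paths.
Unescape \\\\ and \\: in FILE-FIELD before splitting it on semicolons."
  (seq-remove
   #'string-empty-p
   (mapcar
    #'string-trim
    (citar-file--split-escaped-string
     ;; First collapse doubled backslashes, then unescape the drive colon.
     (replace-in-string "\\:" ":"
                        (replace-in-string "\\\\" "\\" file-field))
     ?\;))))

;; Helper: replace every literal occurrence of WHAT in IN with WITH.
(defun replace-in-string (what with in)
  (replace-regexp-in-string (regexp-quote what) with in nil 'literal))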
Edit: to recap, the problem is that Zotero exports the path to the bib file with escape characters included: instead of C:\my\path\to\pdf.pdf it exports it as C\:\\my\\path\\to\\pdf.pdf. When Emacs reads this, it again adds a \ before every : and \, doubling every \... This causes the path to be completely messed up, so citar is unable to find the PDF. Obviously Zotero shouldn't export it that way, but here we are, trying to find a fix...
Ah, I see we have the unescape functionality in citar-file--parser-triplet but not in citar-file--parser-default. I'll write up a PR to handle unescaping for both parsers. Thanks for investigating, @CVanmarcke!
I do wonder why Zotero does that, however, and if someone should report it?
I'm not sure Zotero can do anything else. They have to escape at least closing braces (for BibTeX fields) and separator characters (;), and then also escape the backslashes used to escape those. Perhaps they don't have to escape as many characters as they do, but that doesn't really make a difference for us; we just replace \<char> with <char> regardless.
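As a rough illustration of that last rule, here is a minimal sketch of replacing \<char> with <char> in one pass. The name my/unescape-file-field is hypothetical and this is not citar's actual implementation; it only uses the built-in replace-regexp-in-string.

```elisp
;; Hypothetical helper, not citar's code: drop the backslash from every
;; backslash-escaped character, whatever that character is.
(defun my/unescape-file-field (field)
  "Return FIELD with each \\<char> sequence replaced by <char>."
  ;; The regexp matches one literal backslash followed by any single
  ;; character, and the replacement keeps only that character.
  (replace-regexp-in-string "\\\\\\(.\\)" "\\1" field))

;; Example call on a shortened version of the path from this thread
;; (backslashes are doubled only because of Lisp string syntax):
(my/unescape-file-field "C\\:\\\\Users\\\\username")
```

Because matches are consumed left to right, an escaped backslash (\\) becomes a single backslash and is not reinterpreted as escaping the character after it, so the rule covers \:, \;, \} and \\ alike.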