papers --add fails if in a subdir?

Open boyanpenkov opened this issue 1 year ago • 13 comments

I see

papers add 2013_AdvCIS_Modeling\ and\ simulation\ of\ electrostatically\ gated\ nanochannels.pdf --rename --copy --info         
INFO:papers:bibtex: '/home/boyan/Vazhno/Work/Literature/library.bib'
INFO:papers:filesdir: '/home/boyan/Vazhno/Work/Literature/papers_organized'
INFO:papers:8036 entry files were updated
INFO:papers:pdftotext -f 1 -l 1 2013_AdvCIS_Modeling and simulation of electrostatically gated nanochannels.pdf /tmp/tmppsa0k9ff.txt
INFO:papers:found doi:10.1016/j.cis.2013.06.006
INFO:papers:duplicate :: update key to match existing entry: 2013/2013_pardon_van-der-wijngaart_modeling-and-simulation-of-electrostatically-gated-nanochannels => Pardon2013
Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 1071, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 452, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 432, in add_pdf
    self.insert_entry(entry, update_key=True, **kw)
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 288, in insert_entry
    self.insert_entry_check(entry, update_key=update_key, rename=rename, copy=copy, **checkopt)
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 345, in insert_entry_check
    file = merge_files([candidate, entry], relative_to=self.relative_to)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/duplicate.py", line 290, in merge_files
    check = checksum(f) if os.path.exists(f) else None
            ^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/utils.py", line 81, in checksum
    return hash_bytestr_iter(file_as_blockiter(open(fname, 'rb')), hashlib.sha256())
                                               ^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: '/home/boyan/Vazhno/Work/Literature'

The pdf itself is OK, I think -- this is the right PDF metadata after the add.

papers extract 2013_AdvCIS_Modeling\ and\ simulation\ of\ electrostatically\ gated\ nanochannels.pdf                    
@article{Pardon_2013, title={Modeling and simulation of electrostatically gated nanochannels}, volume={199–200}, ISSN={0001-8686}, url={http://dx.doi.org/10.1016/j.cis.2013.06.006}, DOI={10.1016/j.cis.2013.06.006}, journal={Advances in Colloid and Interface Science}, publisher={Elsevier BV}, author={Pardon, G. and van der Wijngaart, W.}, year={2013}, month=nov, pages={78–94} }

My papers is installed, and the config is:

(python) → working Literature/Stage cat ~/.local/share/config.json                                                                                                   
{
  "absolute_paths": true,
  "backup_files": false,
  "bibtex": "/home/boyan/Vazhno/Work/Literature/library.bib",
  "editor": null,
  "filesdir": "/home/boyan/Vazhno/Work/Literature/papers_organized",
  "git": true,
  "gitdir": "/home/boyan/.local/share",
  "gitlfs": true,
  "keyformat": {
    "author_num": 2,
    "author_sep": "_",
    "template": "{year}/{year}_{author}_{title}",
    "title_length": 100,
    "title_sep": "-",
    "title_word_num": 100,
    "title_word_size": 1
  },
  "local": false,
  "nameformat": {
    "author_num": 2,
    "author_sep": "_",
    "template": "{authorX}_{year}_{title}",
    "title_length": 100,
    "title_sep": "-",
    "title_word_num": 100,
    "title_word_size": 1
  }
}

Note that if I switch to the {journal} tag in the config (by setting "template": "{journal}/{year}{author}{title}"), which should be supported since {journal} is a valid BibTeX field, I get

INFO:papers:bibtex: '/home/boyan/Vazhno/Work/Literature/library.bib'
INFO:papers:filesdir: '/home/boyan/Vazhno/Work/Literature/papers_organized'
INFO:papers:8036 entry files were updated
INFO:papers:pdftotext -f 1 -l 1 2013_AdvCIS_Modeling and simulation of electrostatically gated nanochannels.pdf /tmp/tmp1if0wmzn.txt
INFO:papers:found doi:10.1016/j.cis.2013.06.006
Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 1071, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 452, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 427, in add_pdf
    entry['ID'] = self.generate_key(entry)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 367, in generate_key
    key = self.keyformat(entry)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 108, in __call__
    return self.render(**entry)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 105, in render
    return stringify_entry(entry, **vars(self))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 68, in stringify_entry
    res = template.format(**fields)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'journal'

I can refile this as two issues, but am I calling "add" correctly? The behavior I expect is to have the PDF renamed and moved, and the entry added to the end of library.bib.

boyanpenkov avatar May 18 '24 23:05 boyanpenkov

Hmm, your command looks correct. Which version of papers are you using? Lately I have been using the pr-perfect-undo branch, which I have wanted to merge into master for a while now. I have fixed a few bugs in that branch. Do you mind trying it to see if you still have the issue? Note there are also some tiny differences between "local" and "global" installs. Can you try papers status -v?

perrette avatar May 20 '24 09:05 perrette

Regarding the file-formatting issue, not all fields are available. So far only author, title, year and ID are supported, in various formatting options. Here is an example:

{'author': 'oelsmann_passaro', 'Author': 'Oelsmann_Passaro', 'AUTHOR': 'OELSMANN_PASSARO', 'authorX': 'oelsmann_et_al', 'AuthorX': 'Oelsmann_et_al', 'year': '2021', 'title': 'the-zone-of-influence-matching-sea-level-variability-from-coastal-altimetry-and-tide-gauges-for', 'Title': 'The-Zone-Of-Influence-Matching-Sea-Level-Variability-From-Coastal-Altimetry-And-Tide-Gauges-For', 'ID': 'oelsmann_passaro2021'}

It would be better to have generic filters of the form { key | filter1 | filter2 ...}, for example { author | capitalize | '-'.join }, but that needs a few hours to implement and test. Anyway, I could probably add the journal in a simple manner already, if that's something you'd find useful.
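For illustration only, here is a minimal sketch of how such a filter pipeline could be rendered. It is not code from papers: the `FILTERS` registry, `resolve_filter` and `render` are hypothetical names, and the sketch assumes the author field is available as a list of names.

```python
import re

def _each(fn):
    """Apply fn to a string, or to every item when the value is a list."""
    def wrapper(value):
        if isinstance(value, list):
            return [fn(item) for item in value]
        return fn(value)
    return wrapper

# Hypothetical filter registry (illustrative names, not part of papers).
FILTERS = {
    "lower": _each(str.lower),
    "upper": _each(str.upper),
    "capitalize": _each(str.capitalize),
}

_JOIN = re.compile(r"^'(.*)'\.join$")      # matches filters like '-'.join
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")  # matches { key | f1 | f2 }

def resolve_filter(name):
    """Look up a named filter, or build a join filter from 'sep'.join."""
    match = _JOIN.match(name)
    if match:
        sep = match.group(1)
        return lambda value: sep.join(value)
    return FILTERS[name]

def render(template, fields):
    """Replace every { key | filter ... } placeholder in the template."""
    def substitute(match):
        key, *names = [part.strip() for part in match.group(1).split("|")]
        value = fields[key]
        for name in names:
            value = resolve_filter(name)(value)
        # Lists that were never joined fall back to an underscore separator.
        return value if isinstance(value, str) else "_".join(value)
    return _PLACEHOLDER.sub(substitute, template)

fields = {"author": ["pardon", "van der wijngaart"], "year": "2013"}
print(render("{year}/{ author | capitalize | '-'.join }", fields))
```

One limitation of this sketch: because placeholders are split on `|`, a join separator that itself contains `|` would not parse.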

perrette avatar May 20 '24 09:05 perrette

papers 2.4 -- I'll try your branch now...

boyanpenkov avatar May 26 '24 14:05 boyanpenkov

OK, super -- on your branch, this issue is no longer observed:

papers add 2013_AdvCIS_Modeling\ and\ simulation\ of\ electrostatically\ gated\ nanochannels.pdf
Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 1071, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 452, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 427, in add_pdf
    entry['ID'] = self.generate_key(entry)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 367, in generate_key
    key = self.keyformat(entry)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 108, in __call__
    return self.render(**entry)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 105, in render
    return stringify_entry(entry, **vars(self))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 68, in stringify_entry
    res = template.format(**fields)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'journal'

boyanpenkov avatar May 26 '24 14:05 boyanpenkov

I would indeed find "journal" useful, if it's not too much work; please let me know...

boyanpenkov avatar May 26 '24 14:05 boyanpenkov

A good way to go might be to merge pr-perfect-undo, if it's not too work-in-progress, and then I can get up to speed on that as a whole.

boyanpenkov avatar May 26 '24 14:05 boyanpenkov

@perrette By chance, does the config file change between pr-perfect-undo and 2.4?

I see:

(python) → pr-perfect-undo Repos/papers papers --version                                                 11:01:20
WARNING:papers:Legacy config file found: /home/boyan/.local/share/config.json. Delete to remove this warning:  rm -f '/home/boyan/.local/share/config.json'
2.5.dev25+geeb2892

This still does reproduce the error above when I remove the call to "journal" and just try the add:

Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/__main__.py", line 1195, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/__main__.py", line 534, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/bib.py", line 435, in add_pdf
    self.insert_entry(entry, update_key=True, **kw)
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/bib.py", line 288, in insert_entry
    self.insert_entry_check(entry, update_key=update_key, rename=rename, copy=copy, **checkopt)
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/bib.py", line 345, in insert_entry_check
    file = merge_files([candidate, entry], relative_to=self.relative_to)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/duplicate.py", line 290, in merge_files
    check = checksum(f) if os.path.exists(f) else None
            ^^^^^^^^^^^
  File "/home/boyan/Vazhno/Work/Repos/papers/papers/utils.py", line 109, in checksum
    return hash_bytestr_iter(file_as_blockiter(open(fname, 'rb')), hashlib.sha256())
                                               ^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: '/home/boyan/Vazhno/Work/Literature'

Cheers!

boyanpenkov avatar Jun 01 '24 15:06 boyanpenkov

It has been a while now, but it may change the default location in relation to local/global install (possibly it also does local by default, I can't remember now). This is a feature branch that grew bigger than it should have. It should be merged and be done with. Calling it version 3, I guess. Anyway, to remove the warning, why not follow the recommendation in the message and remove the config file that triggers it?

perrette avatar Jun 01 '24 20:06 perrette

Oh, sorry -- yes, I was not sure of the downstream consequences on the install/not-install state there, since I'm still new to that bit...

boyanpenkov avatar Jun 01 '24 21:06 boyanpenkov

Hey @perrette -- I'm reading through https://github.com/perrette/papers/tree/pr-perfect-undo (specifically papers/filename.py). Do I understand correctly that the available fields are defined solely there? If so, I can try to come back with a PR that adds the journal; do let me know...

Cheers!

boyanpenkov avatar Jun 06 '24 17:06 boyanpenkov

Yes, exactly. The fields are defined in make_template_fields. A PR should include new user-defined arguments and ideally a few tests. Please go ahead, this might actually be a good one, as I think the underlying code as it is now is relatively clear (despite the caveats discussed earlier in this issue).
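As a rough sketch of the idea (not the actual make_template_fields code, whose signature is not shown in this thread), the KeyError above comes from `str.format` being given a field dictionary without a `journal` key. The hypothetical helper `make_fields` below shows one way a journal field could be exposed, with a placeholder fallback that is purely an assumption.

```python
def slug(text, sep="-"):
    """Lowercase the text and join its words with sep."""
    return sep.join(text.lower().split())

def make_fields(entry, missing_journal="unknown-journal"):
    """Hypothetical stand-in for a template-field builder (not the papers API)."""
    fields = {
        "year": str(entry.get("year", "")),
        "title": slug(entry.get("title", "")),
    }
    # Assumed addition: expose the journal, with a fallback for entries
    # (books, preprints, ...) that have no journal field at all.
    journal = entry.get("journal")
    fields["journal"] = slug(journal) if journal else missing_journal
    return fields

entry = {
    "title": "Modeling and simulation of electrostatically gated nanochannels",
    "journal": "Advances in Colloid and Interface Science",
    "year": "2013",
}
template = "{journal}/{year}_{title}"

# Without a 'journal' key, str.format raises KeyError, as in the traceback.
try:
    template.format(year="2013", title="some-title")
except KeyError as err:
    print("missing template field:", err)

print(template.format(**make_fields(entry)))
```

Whether a missing journal should fall back to a placeholder or raise a clearer error is a design choice left open here.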

perrette avatar Jun 06 '24 19:06 perrette

@perrette -- this is extending the conversation in https://github.com/perrette/papers/pull/65

I looked at this, and can confirm the behavior in https://github.com/perrette/papers/tree/pr-perfect-undo I saw:

-- when running papers add thing.pdf (or combinations like papers add ../../thing.pdf)

-- if checking the bibtex yields a duplicate AND that duplicate has a file attribute:

-- then the duplicate's file attribute (even if malformed or wrong or moved) will be used in the merge, which triggers the above bug.

In my case, I had the Pardon2013 paper in my database, with file = {::}, which then pointed papers to the /home/boyan/Vazhno/Work/Literature directory which contains all my PDFs.
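One possible guard, sketched here under the assumption that the crash comes from checksumming any path that exists (the traceback shows `os.path.exists(f)` being true for a directory): only hash regular files. `safe_checksum` is a hypothetical helper, not a function in papers.

```python
import hashlib
import os

def safe_checksum(path, blocksize=65536):
    """Return the SHA-256 hex digest of a regular file, or None otherwise.

    os.path.isfile is False for directories and missing paths, so a
    malformed file attribute that resolves to a directory is skipped
    instead of raising IsADirectoryError.
    """
    if not path or not os.path.isfile(path):
        return None
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in blocks so large PDFs are not loaded into memory at once.
        for block in iter(lambda: handle.read(blocksize), b""):
            digest.update(block)
    return digest.hexdigest()

# A directory yields None rather than an exception.
print(safe_checksum(os.getcwd()))
```

With a guard like this, a duplicate entry whose file attribute is empty or stale would simply contribute no checksum to the merge.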

So, question -- how corner-case'y is this corner-case?

boyanpenkov avatar Jul 03 '24 13:07 boyanpenkov

@perrette hello, hello -- let me know?

boyanpenkov avatar Sep 18 '24 23:09 boyanpenkov

I don't know how up-to-date that issue is. Can you double-check and close if necessary? The latest branch is master.

perrette avatar Feb 10 '25 12:02 perrette

Yes, I will double-check against ^HEAD here when I get a chance.

boyanpenkov avatar Feb 10 '25 12:02 boyanpenkov

I can't reproduce the bug, so I am closing this for now.

perrette avatar Feb 10 '25 17:02 perrette