papis
sample workflow?
Hi folks, I really appreciate the thorough description of the commands/flags. Are there any published sample workflows?
I seem to be having trouble querying the documents that I have added, either in the list command or in the addto command (to select the document I want to add to).
It would be great to have a sample workflow which shows adding a few documents to a library, and then searching/opening for those documents by tags or authors or partial titles etc.
Best, RH
I don't know if this is exactly what you're looking for, but I outlined my process for using papis in this gist.
Regarding having trouble with querying, it could be a number of things. Are you saying that a document is verifiably in the library but is not found by the query? Every so often, for very recently added documents, this happens to me and I run papis --clear-cache to force papis to rebuild the database. It could also be the match-format in your configuration. For example, I think if you're using papis' builtin database that you can only query things based on what's in the match-format. So if you want to query by a particular year, you need to make sure {doc[year]} is in your match-format. I could be wrong on that. However, if you're using the whoosh database, you can query by fields as in papis open project:my-new-paper (i.e., return only documents whose project field contains my-new-paper).
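To make the match-format point concrete, here is a minimal Python sketch of the idea. It is not papis' actual implementation: the function names, the sample document, and the match-format value are all assumptions for illustration. The only point is that a field missing from the format string cannot be hit by a plain query.

```python
import re

# Assumed match-format for illustration only; check your own papis configuration.
MATCH_FORMAT = "{doc[tags]}{doc[author]}{doc[title]}"


def match_string(doc, match_format=MATCH_FORMAT):
    """Build the single string that a plain query is compared against."""
    return match_format.format(doc=doc)


def matches(doc, query, match_format=MATCH_FORMAT):
    """Return True if every word of the query appears in the match string."""
    haystack = match_string(doc, match_format)
    return all(
        re.search(re.escape(word), haystack, re.IGNORECASE)
        for word in query.split()
    )


# Hypothetical document metadata.
doc = {"tags": "physics", "author": "Smith, Jane", "title": "A sample paper", "year": 2020}

print(matches(doc, "smith"))                               # author is in the format
print(matches(doc, "2020"))                                # year is not in the format
print(matches(doc, "2020", MATCH_FORMAT + "{doc[year]}"))  # year added to the format
```

Under these assumptions, the query for 2020 only succeeds once {doc[year]} is part of the format string, which is the behaviour described above.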
As for using the addto command, I almost always just query by first author's last name, e.g., papis addto -f FILE.pdf smith. Then I narrow down to the document I want, interactively, within the picker. I'm using the default picker.
Many thanks, I need to study your config and workflow and see if it solves my problem. I think the problem is actually how I add the documents to my libraries.
Are you doing this on a mac? What packages are needed to get whoosh to work?
Sincerely,
Dr. Ronald D. Haynes
Professor, Department of Mathematics and Statistics Chair, MSc and PhD Scientific Computing Programs Memorial University of Newfoundland
We acknowledge that the lands on which Memorial University’s campuses are situated are in the traditional territories of diverse Indigenous groups, and we acknowledge with respect the diverse histories and cultures of the Beothuk, Mi’kmaq, Innu, and Inuit of this province. On Dec 8, 2020, 11:06 AM -0330, Alexander Von Moll [email protected], wrote:
@avonmoll thanks, that looks awesome. Can I add the gist to the papis README or documentation?
@rhaynes74 have you seen the general papis documentation here?
https://papis.readthedocs.io/en/latest/quick_start.html
I'd be interested in hearing your thoughts about what we can improve in the documentation to make the transition smoother for new users.
I also recently wrote a blog post about getting references from papers using papis, https://alejandrogallo.github.io/blog/get-paper-references.html , maybe this is also helpful.
Hi folks, thanks for the responses. I have read over that documentation. To me the documentation doesn’t make it clear what information needs to be provided when I add a document so that I can then search for it later, and how to form the query for that search.
@alejandrogallo - fine by me! I admit that my workflow is not necessarily generic and does not make use of all of papis' features.
@rhaynes74 - how are you adding documents to your library? Regarding forming a query, this part of the documentation might give some answer.
Edit: Forgot to mention: I'm using the same workflow (and git synchronized paper repository) on mac and linux (Manjaro)
Hi folks, right now just using
papis add filename.ext --set author 'Name' --set title 'Some title'
Maybe we lack a section in the documentation where we make explicit and clear that papis can download documents from different sources (in the papis parlance, these are importers).
Smart mode
So for instance when you do something like
papis add https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.124.171801/
papis does not know where the information is, so it activates the "smart" mode. That is, it looks at the string being added, in this case
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.124.171801/
and it checks if it understands it somehow.
- Papis first checks if this string is an existing file in your file system, which in this case it is not, because we don't have a file with this name in our system.
- Then papis says, OK, maybe it's a doi. It tries to validate it as a doi and fails, so it is not a doi.
- Or maybe it is an arxiv id, which it is not.
- Or maybe it is a url, and it says, well yes, it is one. At this point, it will check if papis knows this journal or this kind of url. We have implemented several journal parsers, for instance there is one for aps.org in papis/downloaders/aps.py.
  - Papis recognises that it knows this kind of url as an aps url, and says, OK, I know how to retrieve information from here because someone implemented it, I'll try to get the information and download a pdf.
  - There is a universal url parser which is called fallback and is in papis/downloaders/fallback.py. This means that even if you're checking out an obscure journal or some random url, the fallback downloader will try to get as much information as it can from the metadata of the website. This works amazingly well thanks to Facebook, Twitter and the like. Yes, you heard right. Since these companies are so important, many web developers want to make sure that the metadata of their webpages is understandable to the big tech companies. This ensures good Search Engine Optimization and therefore the visibility of the journal's content (or a general website's content).
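The order of checks described above can be sketched in a few lines of Python. This is only an illustration, not papis' real code: the regular expressions, the function names, and the one-entry downloader lookup are all assumptions, and the real doi check may involve more than a pattern match.

```python
import os
import re
from urllib.parse import urlparse

# Rough patterns, assumed for illustration; real validation may be stricter.
DOI_RE = re.compile(r"10\.\d{4,9}/\S+")
ARXIV_RE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?|[a-z\-]+(\.[A-Z]{2})?/\d{7}(v\d+)?")


def classify(source):
    """Guess what kind of thing a string is, in the order described above."""
    if os.path.isfile(source):
        return "file"
    if DOI_RE.fullmatch(source):
        return "doi"
    if ARXIV_RE.fullmatch(source):
        return "arxiv"
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "unknown"


def pick_downloader(url):
    """Hypothetical lookup: a journal-specific downloader if known, else fallback."""
    host = urlparse(url).hostname or ""
    if host == "aps.org" or host.endswith(".aps.org"):
        return "aps"
    return "fallback"


url = "https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.124.171801/"
print(classify(url), pick_downloader(url))
print(classify("10.1103/PhysRevLett.124.171801"))
```

The patterns use a full match on purpose, so the url is not mistaken for a doi just because it contains one.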
Explicit mode
As @avonmoll wrote in his workflow, you can also tell papis exactly what the thing you're adding is. You do this with the --from flag. For instance, if you're adding the paper above using a doi, and you're sure this is a doi and don't want papis trying funky combinations to figure out what it is, then you'd do
papis add --from doi 10.1103/PhysRevLett.124.171801
or, if you're telling it it's a url, then
papis add --from url https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.124.171801/
TODO
I think you're right, and maybe this piece of information should be made more explicit in the documentation.
Combined mode
Something I did not mention and that I use all the time is the combined mode, i.e., using your example, I download the pdf (using some kind of resource at your disposal) and I have the doi or url, then I'd do
papis add paper.pdf https://onlinelibrary.wiley.com/doi/abs/10.1002/andp.19163540702
Now, papis will say:
- paper.pdf is an actual file in my file system, so you mean that I should be adding it as a pdf, right? I'll do just that.
- https://onlinelibrary.wiley.com/doi/abs/10.1002/andp.19163540702 is not an existing file, so I guess you want me to run my magic and try to figure out data for the paper.pdf document using the smart mode.
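The reasoning in the combined mode boils down to separating the arguments that are existing files from everything else. A minimal Python sketch of that split follows; the function name is hypothetical and this is not how papis itself is written.

```python
import os
import tempfile


def split_arguments(args):
    """Separate existing files from strings that still need to be identified."""
    files = [a for a in args if os.path.isfile(a)]
    sources = [a for a in args if not os.path.isfile(a)]
    return files, sources


# Create an empty stand-in for paper.pdf so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    pdf = os.path.join(tmp, "paper.pdf")
    open(pdf, "wb").close()
    url = "https://onlinelibrary.wiley.com/doi/abs/10.1002/andp.19163540702"
    files, sources = split_arguments([pdf, url])

print(files)    # the pdf would be attached to the document
print(sources)  # the url would go through the smart mode to find metadata
```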
Just adding a pdf
Notice that there is also an automatic doi and arxiv parser. This means, imagine you just downloaded a paper, let's say https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.123.156401 (I'm using this one since I know you can download it). Then if you download it, save it as paper.pdf, and run
papis add paper.pdf
papis will use the smart mode to try to get the first doi appearing in the text, supposing that this doi belongs to the paper. It will also try to get an arxiv id. Sometimes it finds one, but as you'll see you can just select whatever suits your needs.
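The "first doi appearing in the text" idea can be illustrated with a short Python sketch. It assumes the text has already been extracted from the pdf (that step is left out), and the pattern and function name are my own assumptions rather than what papis uses internally.

```python
import re

# Rough doi pattern, assumed for illustration.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def first_doi(text):
    """Return the first doi-looking string in the text, or None if there is none."""
    match = DOI_RE.search(text)
    if match is None:
        return None
    # Drop trailing punctuation that usually belongs to the sentence, not the doi.
    return match.group(0).rstrip(".,;)")


# Hypothetical text extracted from the first page of a pdf.
sample = "Published in PRL. DOI: 10.1103/PhysRevLett.123.156401. Cites 10.1000/xyz123"
print(first_doi(sample))
```

This also shows why the guess can be wrong: if the first doi in the text belongs to a cited work, it would be picked up instead, which is why being able to select between the importers' results matters.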
I just attached a gif of what this would look like. In this case, I select the information from the pdf2doi importer, that is, the importer that takes a pdf and tries to get a doi from it.
At the end of the gif there is a clash, since I already have this paper in my library and I do not wish to add it twice.
It is maybe a good idea to set the edit, open and confirm options of the papis-add section to True:
https://papis.readthedocs.io/en/latest/configuration.html#papis-add-options
Example document: output.pdf