polar-bookshelf

Bibtex support

Open sotte opened this issue 5 years ago • 11 comments

Polar looks really cool, but I have a huge bibtex library. I would like to 1) import the library and 2) be able to export Polar's library as bibtex.

Any plans on supporting bibtex?

sotte avatar Jan 05 '19 08:01 sotte

Thanks. Could you share your bibtex file? Looking at a real-world bibtex file would help when working out how to import it.

Does it have a path to the PDF file it references?

Polar was designed to hold files, but I think a lot of people just have 'references' and might not have the PDF file.

I guess I could build in support for this but it wasn't initially considered as part of the use case.

burtonator avatar Jan 05 '19 14:01 burtonator

In my current bibtex setup I don't have a nice option to annotate the PDFs for a given bibtex entry. This is Polar's strength. Zotero has some limited capabilities with zotfile. Mendeley is pretty good at annotating, but now encrypts your database so you can't really access it anymore.

Writing an importer (bibtex + pdf) for polar should be fairly easy. But does polar want to support bibtex as representation of the bookshelf? Maybe just a simple export? Should there be an additional bibtex field for each entry?

Here is a small sample of my bibtex file. Nowadays, most entries come from arxiv.org, but not all. I don't have a path to the PDF file, but I use the key as the filename, i.e. the entry @article{zhang19_fixup_initial, ... has the corresponding PDF zhang19_fixup_initial.pdf. The bibtex sample:

@article{zhang19_fixup_initial,
  author =       {Zhang, Hongyi and Dauphin, Yann N. and Ma, Tengyu},
  title =        {Fixup Initialization: Residual Learning Without Normalization},
  journal =      {CoRR},
  year =         2019,
  url =          {http://arxiv.org/abs/1901.09321v1},
  abstract =     {Normalization layers are a staple in state-of-the-art deep
                  neural network architectures. They are widely believed to
                  stabilize training, enable higher learning rate, accelerate
                  convergence and improve generalization, though the reason for
                  their effectiveness is still an active research topic. In this
                  work, we challenge the commonly-held beliefs by showing that
                  none of the perceived benefits is unique to normalization.
                  Specifically, we propose fixed-update initialization (Fixup),
                  an initialization motivated by solving the exploding and
                  vanishing gradient problem at the beginning of training via
                  properly rescaling a standard initialization. We find training
                  residual networks with Fixup to be as stable as training with
                  normalization -- even for networks with 10,000 layers.
                  Furthermore, with proper regularization, Fixup enables
                  residual networks without normalization to achieve
                  state-of-the-art performance in image classification and
                  machine translation.},
  archivePrefix ={arXiv},
  eprint =       {1901.09321},
  primaryClass = {cs.LG},
}
@Article{Hadsell,
  author       = {Hadsell, R. and Chopra, S. and LeCun, Y.},
  title        = {Dimensionality Reduction by Learning an Invariant Mapping},
  doi          = {10.1109/cvpr.2006.100},
  url          = {http://dx.doi.org/10.1109/cvpr.2006.100},
  isbn         = 0769525970,
  journal      = {2006 IEEE Computer Society Conference on Computer Vision and
                  Pattern Recognition - Volume 2 (CVPR’06)},
  publisher    = {IEEE}
}
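The key-as-filename convention described above could be resolved with something like the following. This is only a minimal sketch, not anything Polar provides: the function names, the `papers` directory, and the regular expression are all assumptions for illustration, and the regex only looks for entry headers, so an `@` followed by a brace inside a field value could confuse it.

```python
import re
from pathlib import Path

# Matches an entry header such as "@article{zhang19_fixup_initial," and
# captures the citation key. Hypothetical helper, not a full BibTeX parser.
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def entry_keys(bibtex_text):
    """Return the citation keys in the order they appear."""
    return ENTRY_KEY.findall(bibtex_text)


def pdf_for_key(key, pdf_dir):
    """Assumed convention from this comment: <key>.pdf inside pdf_dir."""
    return Path(pdf_dir) / f"{key}.pdf"


sample = """
@article{zhang19_fixup_initial,
  title = {Fixup Initialization: Residual Learning Without Normalization},
}
@Article{Hadsell,
  title = {Dimensionality Reduction by Learning an Invariant Mapping},
}
"""

for key in entry_keys(sample):
    path = pdf_for_key(key, "papers")  # "papers" is a made-up directory name
    status = "found" if path.exists() else "missing"
    print(f"{key} -> {path} ({status})")
```

An importer built this way would still need a real BibTeX parser for the fields themselves; the sketch only covers pairing an entry with its PDF.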

Let me know if I can help out.

sotte avatar Feb 07 '19 10:02 sotte

I don't know how we're going to find the pdf mapping to the metadata in that entry.

Would you be open to providing a tar.gz of all your PDFs?

My thinking now is that if we can find the DOIs in them, then I can just look up this information via an API. But you're right: ideally we would parse this from an input format. If there's no standard path to the PDF, though, I won't be able to resolve it reliably.
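As a rough illustration of the "find the DOIs" idea, the sketch below scans text that has already been extracted from a PDF (the extraction step itself is assumed and not shown). The pattern is a common heuristic rather than an official DOI grammar, and `find_dois` is a hypothetical name.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix that runs
# until whitespace or a delimiter. Real DOIs can be more varied than this.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>{}]+", re.IGNORECASE)


def find_dois(text):
    """Return the distinct DOI-like strings in text, in order of appearance."""
    seen = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;)")  # drop trailing sentence punctuation
        if doi not in seen:
            seen.append(doi)
    return seen


text = "doi = {10.1109/cvpr.2006.100}, see http://dx.doi.org/10.1109/cvpr.2006.100."
print(find_dois(text))
```

Papers that print no DOI at all would simply return an empty list here, so a fallback (such as the bibtex entry itself) would still be needed.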

burtonator avatar Feb 07 '19 16:02 burtonator

My library is just shy of 4 GB. I don't think it makes sense to provide that :) I'm also not sure that all papers have a proper DOI in them.

I would be happy if I could use a (hypothetical) polar CLI to add one bibtex entry with the corresponding pdf/file:

polar-bookshelf add --bibtexfile entry1.bib --document paper/entry1.pdf

And of course I want to be able to export the current polar shelf as bibtex. Easy, right :)

(I understand that this might be a niche use case and there are more pressing issues. But one can dream ;) )

sotte avatar Feb 07 '19 16:02 sotte

I also think that Polar could be a bibtex library manager (similar to what JabRef, Mendeley or docear try to do).

Below is an example of bibtex as managed by JabRef:

@ARTICLE{Buchner2014stats,
  author = {Buchner, Johannes},
  title = {{A statistical test for Nested Sampling algorithms}},
  journal = {Statistics and Computing},
  year = {2014},
  pages = {1-10},
  month = jul,
  adsnote = {Provided by the SAO/NASA Astrophysics Data System},
  adsurl = {http://adsabs.harvard.edu/abs/2014arXiv1407.5459B},
  archiveprefix = {arXiv},
  doi = {10.1007/s11222-014-9512-y},
  eprint = {1407.5459},
  file = {Published version:Buchner2014stats.pdf:PDF;arXiv v3:Buchner2014stats-eprintv3.pdf:PDF},
  issn = {0960-3174},
  keywords = {Nested sampling; MCMC; Bayesian inference; Evidence; Test; Marginal
        likelihood},
  language = {English},
  owner = {user},
  primaryclass = {stat.CO},
  publisher = {Springer US},
  timestamp = {2014.08.20}
}

@ARTICLE{Buchner2017a,
  author = {{Buchner}, J. and {Bauer}, F.~E.},
  title = {{Galaxy gas as obscurer {\ndash} II. Separating the galaxy-scale
        and nuclear obscurers of active galactic nuclei}},
  journal = {\mnras},
  year = {2017},
  volume = {465},
  pages = {4348-4362},
  month = mar,
  adsnote = {Provided by the SAO/NASA Astrophysics Data System},
  adsurl = {http://adsabs.harvard.edu/abs/2017MNRAS.465.4348B},
  archiveprefix = {arXiv},
  doi = {10.1093/mnras/stw2955},
  eprint = {1610.09380},
  file = {arXiv v1:Buchner2017a-eprintv1.pdf:PDF},
  owner = {user},
  primaryclass = {astro-ph.HE},
  timestamp = {2017.01.14}
}

The file entry shows the nickname, file name in the library folder, and file type. I think you only need the file name. The library folder is stored in the settings of JabRef.
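To make the structure of the file entry concrete, here is a minimal sketch that splits it into its three parts. It assumes the plain `description:path:type` layout shown in the examples above, separated by `;`, and does not handle escaped separators; `LinkedFile` and `parse_file_field` are made-up names.

```python
from typing import NamedTuple


class LinkedFile(NamedTuple):
    description: str  # the nickname, e.g. "Published version"
    path: str         # file name relative to the library folder
    file_type: str    # e.g. "PDF"


def parse_file_field(value):
    """Split 'description:path:type;description:path:type' into records."""
    files = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split from both ends so a colon inside the path is left alone.
        description, _, rest = item.partition(":")
        path, _, file_type = rest.rpartition(":")
        files.append(LinkedFile(description, path, file_type))
    return files


field = "Published version:Buchner2014stats.pdf:PDF;arXiv v3:Buchner2014stats-eprintv3.pdf:PDF"
for linked in parse_file_field(field):
    print(linked.path)
```

Only the `path` part would be needed to locate the document, as noted above; the library folder would have to be supplied separately.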

The eprint entry also lets you point to https://arxiv.org/pdf/1610.09380.pdf to download arXiv preprints.
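A small sketch of that lookup, assuming the entry's fields are already available as a dictionary (the parsing step is not shown) and that the URL follows the pattern quoted above. The function name is hypothetical; field names are lower-cased because the samples in this thread use both `archivePrefix` and `archiveprefix`.

```python
def arxiv_pdf_url(fields):
    """Return the arXiv PDF URL for an entry, or None if it is not an arXiv entry."""
    normalized = {name.lower(): value for name, value in fields.items()}
    if normalized.get("archiveprefix", "").lower() != "arxiv":
        return None
    eprint = normalized.get("eprint", "").strip()
    if not eprint:
        return None
    return f"https://arxiv.org/pdf/{eprint}.pdf"


entry = {"archiveprefix": "arXiv", "eprint": "1610.09380"}
print(arxiv_pdf_url(entry))
```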

The old JabRef had an extension called LocalCopy to follow DOI links and download PDF files from the journal pages.

If Polar could ingest bibtex entries, enrich them by fetching the PDFs, and export them again to a bibtex file (without losing bibtex entries), I would use it for my academic library.

However, I don't know if you want to go in that direction with your program.

JohannesBuchner avatar Apr 23 '19 14:04 JohannesBuchner

@JohannesBuchner There's another bug I can point you to that discusses how we're going to implement bibliographic support.

The idea is that we store the metadata directly in JSON after importing it... you can then re-export to bib if you want.

Any URLs or DOI links can be automatically fetched as part of the import.
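To illustrate the store-as-JSON-then-re-export idea, here is a minimal sketch. The JSON layout (`type`, `key`, `fields`) is an assumption made up for this example, not Polar's actual schema, and the exporter wraps every value in braces, so unbraced macros such as `month = mar` would not survive unchanged.

```python
import json


def to_bibtex(entry):
    """Render one stored entry back into a bibtex record."""
    lines = [f"@{entry['type']}{{{entry['key']},"]
    for name, value in entry["fields"].items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical stored form of an imported entry.
entry = {
    "type": "article",
    "key": "Buchner2017a",
    "fields": {
        "doi": "10.1093/mnras/stw2955",
        "eprint": "1610.09380",
        "owner": "user",
    },
}

stored = json.dumps(entry)     # what would be written to disk
restored = json.loads(stored)  # what would be read back later
print(to_bibtex(restored))
```

Keeping every field in `fields`, including ones the application does not understand (like `owner` here), is what would allow an export without losing bibtex entries.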

I'm hoping it's not too much work. Like maybe a day or two....

burtonator avatar Apr 23 '19 16:04 burtonator

Quick note that pandoc-citeproc converts between the CSL-JSON spec and bibtex (as well as a number of other formats).

stites avatar Feb 05 '20 03:02 stites

  1. Would love this :D
  2. Mendeley (and maybe other tools?) exports .bib files that include extra fields for the local file path, tags, keywords, etc.
  3. With this feature, we'd also need a way to "add" a document as a reference only, not including the content. If I'm not annotating it, I'd love if I could forgo synching it with the cloud :) Sometimes I have quick reads that I'll reference later but won't need for long. It'd be nice to keep them as a bookmark + tags & meta.

munael avatar Feb 23 '21 19:02 munael

Can you guys vote on this issue?

http://feedback.getpolarized.io/feature-requests/p/bibtex-import-support

burtonator avatar Feb 23 '21 22:02 burtonator

Can you guys vote on this issue?

http://feedback.getpolarized.io/feature-requests/p/bibtex-import-support

It says to "login with your Polar account". I logged in on the webapp, but that doesn't propagate to the forums. Logging in with an email on the forums requires a password, but the account I created for Polar doesn't have one.

munael avatar Feb 23 '21 23:02 munael

If you log in to the webapp first, then Canny, it should work. ...

burtonator avatar Feb 24 '21 01:02 burtonator