papermage

library supporting NLP and CV research on scientific papers

Results 25 papermage issues

Hi! First of all, thank you for the white paper and the library. The concept of layers is interesting. It's a smart way to represent a document for different purposes....

Hello, I'm glad to have read your paper, but I ran into a problem while running the code. Running quick_start_demo.ipynb raised this error: OSError: [Errno 22] Invalid argument: 'D:\\\Users\\ure/.torch/iopath_cache\s/ukbw5s673633hsw\\publaynet-tf_ efficientdet_d0.pth.tar?dl=1.lock'

Hi, when I run the second cell, I get the following error: `--------------------------------------------------------------------------- KeyError Traceback (most recent call last) Cell In[2], line 5 2 from papermage.recipes import CoreRecipe 3 fixture_path...

Hi, thanks for this great toolkit! I tried papermage with several PDF files. It works really well with recent papers, but when I tried to parse some papers published...

Rasterizers, DocMetadataExtractors, Parsers, etc. should all emit Documents. We should also have some sort of `Doc.update()` or `merge()` functionality to combine Documents.
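One way the proposed combine step might look is sketched below. `SketchDocument`, its `symbols` and `layers` fields, and the collision rules are all hypothetical stand-ins chosen for illustration; none of this is papermage's actual `Document` class or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a Document; not the papermage class."""

    symbols: Optional[str] = None
    layers: Dict[str, List[Any]] = field(default_factory=dict)

    def merge(self, other: "SketchDocument") -> "SketchDocument":
        """Return a new document holding the layers of both inputs."""
        # Assumed rule: two documents with different text cannot be combined.
        if (
            self.symbols is not None
            and other.symbols is not None
            and self.symbols != other.symbols
        ):
            raise ValueError("cannot merge documents with different symbols")
        # Assumed rule: a layer name may come from only one producer.
        overlap = self.layers.keys() & other.layers.keys()
        if overlap:
            raise ValueError(f"layer name collision: {sorted(overlap)}")
        return SketchDocument(
            symbols=self.symbols if self.symbols is not None else other.symbols,
            layers={**self.layers, **other.layers},
        )


# A parser and a rasterizer each emit their own document, then they are merged.
parsed = SketchDocument(symbols="Hello world", layers={"tokens": [(0, 5), (6, 11)]})
rasterized = SketchDocument(layers={"images": ["page-0.png"]})
merged = parsed.merge(rasterized)
```

Returning a new object rather than mutating in place keeps each producer's output unchanged; an in-place `update()` would be the other obvious design.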

1. To avoid @soldni type checking horrors, let's add something akin to `.get()`
2. We may still want to represent fields as `None` to separate them from `[]` or `Layer(entities=[])`
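The two points above could be sketched as follows, assuming a dict-backed container. `SketchDoc`, `set_field()`, and `get()` are hypothetical names for illustration, not papermage's API. Here `None` means "never computed" and `[]` means "computed, nothing found".

```python
from typing import Any, Dict, List, Optional


class SketchDoc:
    """Hypothetical container; not the papermage Document class."""

    def __init__(self) -> None:
        # A field maps to None (not computed yet) or a list (possibly empty).
        self._fields: Dict[str, Optional[List[Any]]] = {}

    def set_field(self, name: str, value: Optional[List[Any]]) -> None:
        self._fields[name] = value

    def get(
        self, name: str, default: Optional[List[Any]] = None
    ) -> Optional[List[Any]]:
        """Return the field, or `default` if it is missing or stored as None."""
        value = self._fields.get(name)
        return default if value is None else value


doc = SketchDoc()
doc.set_field("tokens", [])        # the tokenizer ran and found nothing
doc.set_field("sentences", None)   # the sentence splitter has not run

tokens = doc.get("tokens")             # [] is preserved, not replaced
sentences = doc.get("sentences")       # None, because it was never computed
figures = doc.get("figures", [])       # missing field falls back to the default
```

Because `get()` never raises on a missing name, callers can check `is None` once instead of guarding every attribute access.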

- https://pypi.org/project/msgspec/
- favor gzip
- round out floats to ints w/ some precision
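The gzip and float-rounding ideas in the list above could be sketched as follows. This sketch uses the standard-library `json` module in place of msgspec to stay dependency-free, and it assumes "round out floats to ints" means storing each float as an integer scaled by a fixed number of decimal places. The function names, the `boxes` payload, and `PRECISION` are hypothetical.

```python
import gzip
import json
from typing import List

PRECISION = 4  # assumed number of decimal places to keep


def floats_to_ints(values: List[float], precision: int = PRECISION) -> List[int]:
    """Scale floats by 10**precision and round to the nearest integer."""
    scale = 10 ** precision
    return [round(v * scale) for v in values]


def ints_to_floats(values: List[int], precision: int = PRECISION) -> List[float]:
    """Undo floats_to_ints, up to the rounding error."""
    scale = 10 ** precision
    return [v / scale for v in values]


def dump_boxes(boxes: List[List[float]], precision: int = PRECISION) -> bytes:
    """Serialize boxes as compact JSON with integer coordinates, then gzip."""
    payload = {
        "precision": precision,
        "boxes": [floats_to_ints(box, precision) for box in boxes],
    }
    text = json.dumps(payload, separators=(",", ":"))
    return gzip.compress(text.encode("utf-8"))


def load_boxes(blob: bytes) -> List[List[float]]:
    """Decompress and convert the integer coordinates back to floats."""
    payload = json.loads(gzip.decompress(blob).decode("utf-8"))
    return [ints_to_floats(box, payload["precision"]) for box in payload["boxes"]]


boxes = [[0.123456789, 0.5, 0.25, 0.0333333]]
blob = dump_boxes(boxes)
restored = load_boxes(blob)
```

Storing the precision inside the payload lets a reader decode files written with a different setting. The round trip is lossy by design: each value comes back within half a unit of the last kept decimal place.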