ramSeraph

Results 12 issues of ramSeraph

**Bug report**

- Description of the bug: Image extraction does not handle the case when the colorspace is a `PdfObjRef`. BMP handling might be broken in some cases.
- Steps to...

type: bug
component: converter
status: accepted

The current imagery blacklist, meant to block Google Maps imagery, also blocks imagery hosted on Google Cloud Storage. The current regex is at https://github.com/openstreetmap/openstreetmap-website/blob/master/config/settings.yml#L89. Sample Google Cloud Storage URL: "https://storage.googleapis.com/bucket_name/obj_name"
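The over-matching can be reproduced with a small sketch. Both patterns below are assumptions for illustration: `BROAD_PATTERN` is a pattern of the general shape that would match any `google` or `googleapis` host (not necessarily the exact regex in `settings.yml`), and `NARROW_PATTERN` is a hypothetical tightening that only matches tile-style paths. The maps URL is an illustrative sample as well.

```python
import re

# Assumption: a broad pattern that matches any host containing
# ".google." or ".googleapis." followed by some path.
BROAD_PATTERN = re.compile(r".*\.google(apis)?\..*/.*")

# Hypothetical narrower pattern: additionally require a tile-style
# path segment ("/vt" or "/kh") right after the host.
NARROW_PATTERN = re.compile(r".*\.google(apis)?\..*/(vt|kh)[?/].*")


def is_blocked(url, pattern):
    """Return True if the URL matches the given blacklist pattern."""
    return pattern.match(url) is not None


storage_url = "https://storage.googleapis.com/bucket_name/obj_name"
maps_url = "https://mt1.google.com/vt/lyrs=s&x=1&y=2&z=3"  # illustrative

for name, pattern in (("broad", BROAD_PATTERN), ("narrow", NARROW_PATTERN)):
    print(name, is_blocked(maps_url, pattern), is_blocked(storage_url, pattern))
```

Under these assumptions, the broad pattern blocks both URLs, while the narrower one blocks only the maps-style URL.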

Both the old and new SOI projection files are in wgs84 and look like they match. This seems off to me.

https://github.com/datameet/maps/blob/master/Survey-of-India-Index-Maps/OSM_UTM_WGS84/OSM_50K_Index.prj

```
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
```

https://github.com/datameet/maps/blob/master/Survey-of-India-Index-Maps/OldSystem_Everest_Polyconic/OldSystem_50K_Index.prj

```
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
```
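A quick way to confirm that the two `.prj` definitions really are identical is to compare them after stripping whitespace. This is only a sketch: the helper names are made up for illustration, and the strings are the two definitions quoted in the issue. It also checks for a `PROJCS` prefix, which is how a WKT 1 definition of a projected coordinate system begins.

```python
def normalize_wkt(wkt):
    """Remove all whitespace so formatting differences do not matter."""
    return "".join(wkt.split())


def is_projected(wkt):
    """A WKT 1 projected CRS starts with PROJCS; a geographic one with GEOGCS."""
    return normalize_wkt(wkt).upper().startswith("PROJCS")


new_prj = (
    'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",'
    'SPHEROID["WGS_1984",6378137.0,298.257223563]],'
    'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'
)
old_prj = (
    'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",'
    'SPHEROID["WGS_1984",6378137.0,298.257223563]],'
    'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'
)

same_definition = normalize_wkt(new_prj) == normalize_wkt(old_prj)
print("identical:", same_definition)
print("projected:", is_projected(new_prj), is_projected(old_prj))
```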

NeedsDetails

`IndexError` thrown when using `split_text=True`

```
Traceback (most recent call last):
  File "/code/seperate_page.py", line 16, in <module>
    tables = camelot.read_pdf(out_filename,
  File "/usr/local/lib/python3.9/site-packages/camelot/io.py", line 113, in read_pdf
    tables = p.parse(
  File "/usr/local/lib/python3.9/site-packages/camelot/handlers.py",...
```

bug

Apart from the graphical debugging tools currently available through `camelot.plot`, it sometimes helps to have direct access to the pdfminer textline objects when debugging things that go wrong. Adding...

**Describe the bug** The `text_in_bbox()` function, which is used to get the textlines that lie in a table area, currently drops textlines in the table area that intersect with other textlines...
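For contrast, a minimal sketch of a bounding-box filter that keeps every textline inside the area, whether or not it overlaps a neighbour, could look like the following. This is not camelot's implementation: `TextLine` is a hypothetical stand-in for pdfminer textline objects (which expose `x0`, `y0`, `x1`, `y1`), and the "center inside the area" rule is an assumed selection criterion.

```python
from typing import List, NamedTuple, Tuple


class TextLine(NamedTuple):
    """Hypothetical stand-in for a pdfminer textline (same coordinate names)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def center_in_bbox(line: TextLine, bbox: BBox) -> bool:
    """True if the midpoint of the textline lies inside the bounding box."""
    bx0, by0, bx1, by1 = bbox
    cx = (line.x0 + line.x1) / 2
    cy = (line.y0 + line.y1) / 2
    return bx0 <= cx <= bx1 and by0 <= cy <= by1


def lines_in_bbox(bbox: BBox, lines: List[TextLine]) -> List[TextLine]:
    """Keep every textline in the area, including ones that overlap each other."""
    return [line for line in lines if center_in_bbox(line, bbox)]


table_area = (0.0, 0.0, 100.0, 100.0)
lines = [
    TextLine("cell A", 10, 10, 40, 20),
    TextLine("cell B", 30, 12, 60, 22),   # overlaps "cell A"
    TextLine("footer", 10, 150, 40, 160),  # outside the table area
]
kept = lines_in_bbox(table_area, lines)
print([line.text for line in kept])
```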

bug

In the absence of a wheel for macOS M1 (I don't see support for this architecture in GitHub Actions yet, actions/virtual-environments#2187), the compilation invoked on `pip install pdf2png` fails at...

State can be serialised to disk and used to skip already handled features. Submitting the PR to see if there is interest in this feature, given the amount of code...
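The general idea can be sketched as follows. This is an illustration under assumptions, not the code in the PR: the JSON file layout, the function names, and the use of plain string feature IDs are all hypothetical.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the set of already handled feature IDs, or an empty set."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def save_state(path, handled):
    """Persist the handled feature IDs as a sorted JSON list."""
    with open(path, "w") as f:
        json.dump(sorted(handled), f)


def process_features(feature_ids, state_path, handler):
    """Run handler on each feature not yet recorded in the state file."""
    handled = load_state(state_path)
    for feature_id in feature_ids:
        if feature_id in handled:
            continue  # skip work done in an earlier run
        handler(feature_id)
        handled.add(feature_id)
        save_state(state_path, handled)  # save after each feature
    return handled


with tempfile.TemporaryDirectory() as tmp:
    state_path = os.path.join(tmp, "state.json")
    first_run, second_run = [], []
    process_features(["a", "b"], state_path, first_run.append)
    process_features(["a", "b", "c"], state_path, second_run.append)

print(first_run, second_run)
```

Saving after every feature trades some I/O for the ability to resume after an interruption; batching the writes would be a reasonable alternative.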

Note to self: seems to be failing on permission issues while cleaning up.

Should follow #10. Also, consider exporting as SQLite.

enhancement
LGD