tabula-java issues

Undesired UI Windows Appearing when using Programmatically

Hello All, Firstly, thank you VERY much for publishing this amazing library! I am working on an integration with Apache Drill which enables users to query PDF files directly using...

cgivre

Arabic Letters are ???

4

Hello, I am trying to extract a table from a PDF that contains Arabic letters, but when I extract the table I get **???** for every Arabic letter. This issue happens only when...

yovrer

PDF with CID/Identity-H Font cannot be extracted properly.

We have been able to extract PDFs with ANSI encoding. However, we have started seeing some PDFs that use Identity-H encoding with a TrueType CID font. Does Tabula support this scenario?

rayleeriver

Extract table with one row

I am trying to extract tables from a PDF with no lines, i.e., I am using the `stream` option. The table stretches over several pages with the header being repeated...

mschoettle

All pages not converted

1

In both the Java-based tabula and its Python wrapper, tabula-py, only the first page is converted even when the all-pages option is given. Currently, to overcome this, I need to...

tathastu871

Feature request

Add an option to specify a custom delimiter as the field separator.

tathastu871

[Feature Request] Specify separate area option for each page

For each page I would like to specify the area to extract the table. `-p 1 -a y1, x1, y2, x2 -p 2 -a Y1, X1, Y2, X2` Could that...

mamtoraah
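
Until something like this exists, one possible workaround is to run the command-line tool once per page, each time with its own area. The sketch below only builds the argument lists and does not run anything. It uses the `-p` and `-a` flags quoted in the request; the jar name `tabula.jar` and the helper `per_page_commands` are hypothetical names chosen for illustration, not part of tabula-java.

```python
from typing import Dict, List, Tuple

# (y1, x1, y2, x2), in the order used in the request above
Area = Tuple[float, float, float, float]


def per_page_commands(
    pdf_path: str,
    areas: Dict[int, Area],
    jar: str = "tabula.jar",  # hypothetical jar name; adjust to your build
) -> List[List[str]]:
    """Build one command-line argument list per page, each with its own area."""
    commands = []
    for page in sorted(areas):
        # Join the four coordinates without spaces, e.g. "50.5,0,100,200"
        area_arg = ",".join(f"{value:g}" for value in areas[page])
        commands.append(
            ["java", "-jar", jar, "-p", str(page), "-a", area_arg, pdf_path]
        )
    return commands


if __name__ == "__main__":
    for cmd in per_page_commands(
        "input.pdf", {1: (50.5, 0, 100, 200), 2: (10, 20, 30, 40)}
    ):
        print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run` and the per-page outputs concatenated. This costs one JVM start per page, which is the overhead the requested feature would avoid.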

"scratch file already closed" message

1

The "scratch file already closed" message when building is related to the premature closing of the document in `Utils.pageConvertToImage()`. Please remove `document.close();`. However, even that isn't really enough. The PDDocument...

THausherr

Bug: empty cells dropped and cells below shifted up when using area option

Issue summary: when the area option is used, empty cells in the table are removed and the cells below are shifted up automatically. If the area option is not used, the output remains...

luke4u

Nurminen Detection Algorithm is not ignoring page headers and footers.

1

Hi Team, I have been working with Tabula and PDFBox for quite some time, and my issue here is that the Nurminen detection algorithm is not ignoring page headers and footers while...

satyaraj479
