Possible bug: parenthesis "(" converted into a minus sign
I am trying to extract a table from a PDF file using tabulizer. It runs fine and does extract tables. However, my table has parentheses around some numbers, for example (20,076), and tabulizer appears to interpret "(" as a minus sign: the extracted table contains -20,076 (a negative number). Can anybody help me understand why it is doing this and what a possible solution would be?
Code here:
library(tabulizer)
location <- "ccc.pdf"
out <- extract_tables(location, output = "csv")
Input File and extracted Table
ExtractedTable.xlsx
That's strange. It's unlikely to be a tabulizer issue; the problem probably lies somewhere upstream, either in Tabula or even in the PDF file itself. What happens if you copy and paste these figures from a PDF viewer? Could you provide an example PDF file?
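One way to narrow this down is to look at the extracted CSV as plain text in R before opening it anywhere else. The snippet below is only a sketch: `csv_path` and the sample rows are hypothetical stand-ins for a file produced by the extraction, not your actual data. If the parentheses are still present at this stage, the sign change is happening later. Some spreadsheet programs, for instance, read a number in parentheses as a negative value in accounting notation.

```r
# Hypothetical stand-in for one CSV produced by the extraction step;
# replace csv_path with the path to your real file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c('Item,Amount', 'Net loss,"(20,076)"', 'Revenue,"15,300"'), csv_path)

# 1. Inspect the raw text exactly as it is stored on disk.
raw_lines <- readLines(csv_path)
print(raw_lines)

# 2. Read every column as character so that nothing is coerced to a number.
tbl <- read.csv(csv_path, colClasses = "character", check.names = FALSE)

# 3. Flag the cells that are still wrapped in parentheses.
has_parens <- grepl("^\\(.*\\)$", tbl$Amount)
print(tbl$Amount[has_parens])
```

If step 1 already shows -20,076 instead of (20,076), the conversion happens during extraction or in the PDF's text layer, and an example PDF would help to reproduce it.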