camelot
Specifying different table areas for different PDF pages
If a PDF has multiple pages, I want to specify a different table area for each page.
In other words, I would like the table_areas argument of camelot.read_pdf() to accept a per-page mapping of table areas, as follows.
table_areas = {
1: [[10, 20, 30, 40], ..],
2: [[80, 100, 90, 120], ..],
4: [[..], ..], ...
}
The dictionary key is the page number.
This would be very useful! Currently, if you have different table areas on different pages, you need to call read_pdf() separately for each table_areas value and then manually combine the data.
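For reference, the workaround described above can be sketched roughly as follows. This is only an illustration, not part of camelot: the helper name `read_tables_per_page` and its `read_pdf` injection parameter are hypothetical, and it assumes `camelot.read_pdf()` accepts `pages` as a string and `table_areas` as a list of strings.

```python
def read_tables_per_page(path, areas_by_page, read_pdf=None, **kwargs):
    """Call read_pdf() once per page and collect the tables in page order.

    areas_by_page maps a page number to the table_areas list for that page.
    This helper is hypothetical; it only wraps repeated read_pdf() calls.
    """
    if read_pdf is None:
        # Imported lazily so the helper can be exercised without camelot.
        import camelot
        read_pdf = camelot.read_pdf

    tables = []
    for page in sorted(areas_by_page):
        found = read_pdf(
            path,
            pages=str(page),
            table_areas=areas_by_page[page],
            **kwargs,
        )
        # Manual combination step: append this page's tables to the result.
        tables.extend(found)
    return tables
```

With camelot installed, each returned item should be a camelot table whose `.df` attribute holds a pandas DataFrame, so the results can be combined with `pandas.concat` if the tables share a layout.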
I think the dictionary suggested by @RyosukeSakaguchi is a good approach, but it would probably still use the current string syntax:
table_areas = {
1: ["10, 20, 30, 40", ..],
2: ["80, 100, 90, 120", ..],
4: ["..", ..], ...
}
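If both spellings were ever to be accepted, the numeric form could be normalized to the string form before being handed to camelot. A minimal sketch, assuming each area is either a string or a sequence of four numbers; the function name `to_area_strings` is hypothetical and not part of camelot:

```python
def to_area_strings(areas_by_page):
    """Convert numeric areas such as [10, 20, 30, 40] to "10,20,30,40".

    Areas that are already strings are passed through unchanged.
    """
    return {
        page: [
            area if isinstance(area, str) else ",".join(str(c) for c in area)
            for area in areas
        ]
        for page, areas in areas_by_page.items()
    }
```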