Optimised and cleaned the code [MRG]
Made some minor changes to the code – mostly making some parts of the code more Pythonic and readable – while trying to understand the implementation of Camelot. Great piece of work, and would love to work on more issues/features in the near future. Here is a list of changes which I have made.
camelot/core.py — Modified the 'for' loops in lines 527-532 to a cleaner and optimised form.
camelot/utils.py — Added "Accept-Encoding": "gzip;q=1.0, deflate;q=0.9, br;q=0.8, compress;q=0.7, *;q=0.1" to headers in line 86
camelot/handlers.py — Modified line 38 as self.filepath = download_url(filepath) if is_url(filepath) else filepath
camelot/parsers/lattice.py — Changed the attribute from 't' to 'table' in the static method _reduce_index() to match the docstring — Changed the attribute from 't' to 'table' in the static method _copy_spanning_text() and the docstring for consistency across the code, and clarity of usage — Code optimisations to if conditionals in lines 190-201 to become more Pythonic (Merged adjacent 'if' conditionals with 'and')
Hey!
As camelot is dead, we try to build a maintained fork at pypdf_table_extraction.
Do you want to open the PR against that branch so that we can merge your improvement?
I would love to contribute to Camelot. Let me know if you need any help.
On Sun, 25 Feb 2024, 16:43 Martin Thoma, @.***> wrote:
Hey!
As camelot is dead https://github.com/camelot-dev/camelot/issues/343, we try to build a maintained fork at pypdf_table_extraction https://github.com/py-pdf/pypdf_table_extraction.
Do you want to open the PR against that branch so that we can merge your improvement?
— Reply to this email directly, view it on GitHub https://github.com/camelot-dev/camelot/pull/280#issuecomment-1962897832, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADMUP5N4WQTVBGKXMURGE3LYVMMGHAVCNFSM5LJHJ7Q2U5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCOJWGI4DSNZYGMZA . You are receiving this because you authored the thread.Message ID: @.***>
If i remembered correctly, this pr introduced some new bugs.
Maybe better, just to only merge from the main pypdf_table_extraction_repo.
Then makr the individual PR's here as included.
WDYT @vinayak-mehta ?
I merged only from the main branch of the pypdf_table_extraction repo, this commit was somehow included from there 🤔
Do you know what the bug was? I can look into adding a fix or reverting this single commit, wdyt?