camelot Optimised and cleaned the code [MRG]

Made some minor changes to the code – mostly making some parts of the code more Pythonic and readable – while trying to understand the implementation of Camelot. Great piece of work, and would love to work on more issues/features in the near future. Here is a list of changes which I have made.

camelot/core.py — Modified the 'for' loops in lines 527-532 to a cleaner and optimised form.

camelot/utils.py — Added "Accept-Encoding": "gzip;q=1.0, deflate;q=0.9, br;q=0.8, compress;q=0.7, *;q=0.1" to headers in line 86

camelot/handlers.py — Modified line 38 as self.filepath = download_url(filepath) if is_url(filepath) else filepath

camelot/parsers/lattice.py — Changed the attribute from 't' to 'table' in the static method _reduce_index() to match the docstring — Changed the attribute from 't' to 'table' in the static method _copy_spanning_text() and the docstring for consistency across the code, and clarity of usage — Code optimisations to if conditionals in lines 190-201 to become more Pythonic (Merged adjacent 'if' conditionals with 'and')

Jan 05 '22 07:01 python3-dev

Hey!

As camelot is dead, we try to build a maintained fork at pypdf_table_extraction.

Do you want to open the PR against that branch so that we can merge your improvement?

Feb 25 '24 11:02 MartinThoma

I would love to contribute to Camelot. Let me know if you need any help.

On Sun, 25 Feb 2024, 16:43 Martin Thoma, @.***> wrote:

Hey!

As camelot is dead https://github.com/camelot-dev/camelot/issues/343, we try to build a maintained fork at pypdf_table_extraction https://github.com/py-pdf/pypdf_table_extraction.

Do you want to open the PR against that branch so that we can merge your improvement?

— Reply to this email directly, view it on GitHub https://github.com/camelot-dev/camelot/pull/280#issuecomment-1962897832, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADMUP5N4WQTVBGKXMURGE3LYVMMGHAVCNFSM5LJHJ7Q2U5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCOJWGI4DSNZYGMZA . You are receiving this because you authored the thread.Message ID: @.***>

Feb 27 '24 07:02 python3-dev

If i remembered correctly, this pr introduced some new bugs. Maybe better, just to only merge from the main pypdf_table_extraction_repo. Then makr the individual PR's here as included. WDYT @vinayak-mehta ?

Dec 27 '24 08:12 bosd

I merged only from the main branch of the pypdf_table_extraction repo, this commit was somehow included from there 🤔

Do you know what the bug was? I can look into adding a fix or reverting this single commit, wdyt?

Dec 27 '24 08:12 vinayak-mehta