We need more maintainers

Open MartinThoma opened this issue 2 years ago • 33 comments

It seems like camelot is dead:

  • Last commit: 2021-07-11 - @dimitern is the only other project owner besides @vinayak-mehta
  • Last PyPI release: 2021-07-11 - @vinayak-mehta is the only owner
  • Several PRs which look ready to be merged, but are still open

Besides the owner there are only 35 other contributors.

https://opencollective.com/camelot might be another way to check if it's dead.

Does anybody know more? Should we try to transfer the project to https://github.com/jazzband ?

MartinThoma avatar Jan 06 '23 13:01 MartinThoma

Or is somebody there who would like to become a maintainer?

MartinThoma avatar Jan 06 '23 13:01 MartinThoma

Hi @MartinThoma, sorry for not being responsive here. I've been busy with some life stuff for some time now and haven't had the mindspace to look into the issues here. I've been wanting to get back into it, and I'll look into them over this weekend.

I also want to stop being a single point of failure here and would love to get help maintaining camelot going forward.

vinayak-mehta avatar Jan 06 '23 14:01 vinayak-mehta

Is there any movement on this topic?

foarsitter avatar Feb 07 '23 15:02 foarsitter

@vinayak-mehta Would you mind if I post this on the Indian FOSS and opendata channels? This tool has been extremely helpful in dealing with the PDF crap Indian Government puts out.

ramSeraph avatar Feb 08 '23 07:02 ramSeraph

Sorry for my impatience, but the show must go on. In order to take camelot to production I created a fork and released it to PyPI as camelot-fork==0.20.0. My intentions are limited and I hope this project finds new maintainers soon.

When the request came to extract tables from PDF files I thought it would be a very tricky job, but camelot does it all. Therefore I want to express my gratitude to all of you who made that possible.

If people are in need of a fix, I'm willing to accept pull requests as long as they have test coverage.

foarsitter avatar Feb 08 '23 11:02 foarsitter

Sorry for being a bit unresponsive since I created this issue. I've pushed a release based on @MartinThoma's last PR: https://pypi.org/project/camelot-py/

@MartinThoma Thank you for the PR, would you like to be added to the github org so that you have push access to the repo?

@foarsitter Are you interested in maintaining the project here instead of the fork? I can add you to the github org too.

vinayak-mehta avatar Feb 26 '23 06:02 vinayak-mehta

Thank you for making a new release :pray:

> would you like to be added to the github org so that you have push access to the repo?

I would probably not be super active as I spend most of my time with pypdf. If that is ok for you, then yes, please add me :-) I could probably go over a couple of the PRs / ~introduce~ update CI so that maintaining the library becomes easier :-)

MartinThoma avatar Feb 26 '23 06:02 MartinThoma

@MartinThoma That would be awesome! Just sent you an invite ✉️

vinayak-mehta avatar Feb 26 '23 07:02 vinayak-mehta

@vinayak-mehta should be awesome

foarsitter avatar Feb 26 '23 09:02 foarsitter

Are there any rules I should follow, e.g.

  1. Reviews: When I make a PR, should I ask you (or somebody else) for a review before merging it? (Here is the first one, btw: https://github.com/camelot-dev/camelot/pull/356 :smile: )
  2. Commit Messages: e.g. something like https://docs.scipy.org/doc/scipy/dev/contributor/development_workflow.html#writing-the-commit-message ?
  3. Merges: Merge / Squash+Merge / Rebase+Merge: I prefer squash+merge, but you seem to do normal merges only. Is squash+merge ok?
  4. Tests: Do you have any hard rules in regards to unit tests, e.g. that every new feature needs to have full test coverage?

MartinThoma avatar Feb 26 '23 09:02 MartinThoma

As I see it:

  1. A review is always good, unless it is something really trivial. Be patient. Reverting releases because we were too eager is something we should avoid.
  2. A commit message should be clear about its contents; which style is applied is less important to me. As I see it, we can generate a changelog based on the titles of the merged pull requests (see my fork: https://github.com/foarsitter/camelot/releases).
  3. If more commit messages are needed because there are changes in various parts of the codebase, then a squash does not seem a good fit to me, so it depends on the PR.
  4. Adding tests afterwards is really hard, even harder when you are not the author of the code. So full coverage is recommended here, in my view. If the addition is trivial, the test should be trivial too, right?
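
To illustrate point 2, here is a minimal sketch of building a changelog section from merged pull request titles. The function name, the version string, and the PR numbers and titles are all hypothetical; this is not how the fork's release notes are actually produced.

```python
from typing import Iterable, Tuple


def changelog_from_titles(version: str, merged_prs: Iterable[Tuple[int, str]]) -> str:
    """Render a Markdown changelog section from (PR number, PR title) pairs."""
    lines = [f"## {version}", ""]
    for number, title in merged_prs:
        # One bullet per merged PR, with surrounding whitespace removed from the title.
        lines.append(f"- {title.strip()} (#{number})")
    return "\n".join(lines)


# Hypothetical data, for illustration only.
merged = [(1, "Fix typo in docs "), (2, "Add test for table parser")]
print(changelog_from_titles("x.y.z", merged))
```

This only works well if PR titles are descriptive, which is one more reason to keep PRs small and focused.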

foarsitter avatar Feb 26 '23 09:02 foarsitter

@foarsitter I just invited you, sorry it took so long

I'm a big fan of the scikit-learn contributing guidelines.

  1. We should go for at least 1 review.
  2. I agree with @foarsitter's point.
  3. I'm gonna lean towards squash + merge because it's easier to revert, just in case we need to do that. It should also lead to small PRs which would be easier to review. If we have PRs with major enhancements that touch various parts of the codebase, then squashing might not make sense.
  4. I agree with @foarsitter's point.

vinayak-mehta avatar Mar 18 '23 08:03 vinayak-mehta

I'd be interested in contributing, particularly to the docs initially.

It seems to me there's a lot of value in this repo, but things seem to have got into a fairly confusing state. Devoting time to it, I think I've figured out most of the misunderstandings I had and it seems like it's worth sharing / updating docs, so that others don't fall into the exact same traps I (and others) did.

Correct me if I'm wrong but I get the sense that what has made things harder overall is that the migration to pdftopng/poppler backend was in progress yet not completed when the maintenance fell away (quite reasonably given world events!)

@foarsitter 's idea of a fork that cuts pdftopng out is interesting, although I would feel more comfortable if it was directly part of the main repo.

How feasible is it to make the "base" install equivalent to the fork (i.e. such that it doesn't install pdftopng as a requirement)? With that, we could introduce a "pdftopng" extras-require option so people can optionally try it, and only once it's deemed to work sufficiently well would it be switched to be what gets delivered with "base" at some later point. Presumably for that to happen there needs to be a bit of maintenance upstream in pdftopng too. If this last paragraph is best discussed in a separate issue, that's fine by me, just say 🙂
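
A rough sketch of what I mean, assuming a setuptools-based build. The base dependency name below is a placeholder and the extra name "pdftopng" is my suggestion, not something that exists in the repo today.

```python
from typing import Dict, List

# Placeholder name for illustration; the real base dependency list is not reproduced here.
BASE_REQUIRES: List[str] = ["example-core-dependency"]

# Hypothetical optional extra: pdftopng is only installed when explicitly requested.
OPTIONAL_EXTRAS: Dict[str, List[str]] = {"pdftopng": ["pdftopng"]}


def build_setup_kwargs() -> dict:
    """Return keyword arguments that could be passed on to setuptools.setup()."""
    return {
        "install_requires": list(BASE_REQUIRES),
        "extras_require": {name: list(pkgs) for name, pkgs in OPTIONAL_EXTRAS.items()},
    }


kwargs = build_setup_kwargs()
print(kwargs["install_requires"])
print(kwargs["extras_require"])
```

With a layout like this, a plain `pip install camelot-py` would skip pdftopng, while `pip install "camelot-py[pdftopng]"` would opt in (again assuming that extra name were adopted).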

nmstoker avatar Mar 28 '23 16:03 nmstoker

@nmstoker I cannot answer that question, but at least I could review/merge PRs with documentation updates :-) so if there are specific learnings you want to share, I would support you :-)

MartinThoma avatar Mar 28 '23 18:03 MartinThoma

Sounds a good start, thanks @MartinThoma !

nmstoker avatar Mar 28 '23 19:03 nmstoker

@nmstoker Looking forward to your learnings!

foarsitter avatar Mar 29 '23 06:03 foarsitter

After using the product for a long time in my developer career, I have just started contributing to Camelot with an initial pull request for #364. I would love to contribute to other projects as well. Thanks

kshitiz305 avatar Apr 24 '23 09:04 kshitiz305

How about Excalibur? That might need some :heart: as well. There is still an open refresh issue on Windows which makes it unusable.

P.s. I'm happy to contribute/maintain a bit on both projects.

bosd avatar Jul 14 '23 11:07 bosd

Looking forward to your contributions @bosd

foarsitter avatar Jul 14 '23 12:07 foarsitter

@vinayak-mehta Have you seen my e-mail?

  1. PyPI permissions: Can you please give me Owner permissions (instead of just Maintainer) via https://pypi.org/manage/project/camelot-py/collaboration/ so that I can take care of https://github.com/camelot-dev/camelot/issues/389 ?
  2. Github permissions: Can you please give me Admin permissions via https://github.com/camelot-dev/camelot-py/settings/access so that I can allow merge-commits for https://github.com/camelot-dev/camelot/pull/353 ?
  3. Project Governance: Would you be OK with the Github organization camelot-dev merging into py-pdf?

MartinThoma avatar Sep 02 '23 11:09 MartinThoma

@MartinThoma How about Owner / Admin permissions for Excalibur?

bosd avatar Sep 02 '23 12:09 bosd

Just wandering in but happy to contribute. 👋🏻

bahoo avatar Sep 08 '23 22:09 bahoo

> @vinayak-mehta Have you seen my e-mail?
>
>   1. PyPI permissions: Can you please give me Owner permissions (instead of just Maintainer) via https://pypi.org/manage/project/camelot-py/collaboration/ so that I can take care of Release to PyPI via Github Action #389 ?
>   2. Github permissions: Can you please give me Admin permissions via https://github.com/camelot-dev/camelot-py/settings/access so that I can allow merge-commits for Release camelot-fork 0.20.1 #353 ?
>   3. Project Governance: Would you be OK with the Github organization camelot-dev merging into py-pdf?

Are these permission issues solved already, @MartinThoma?

Can you please take care of these blockers, @vinayak-mehta?

ZupoLlask avatar Dec 02 '23 15:12 ZupoLlask

No. I still don't have sufficient permissions to bring the project back to life. Camelot is dead.

MartinThoma avatar Dec 02 '23 22:12 MartinThoma

In case this helps others (since we didn't know how much its maintenance is suffering until we tried camelot and ran into various issues), here are a few active PDF processing alternatives in the Python ecosystem:

  • https://github.com/jsvine/pdfplumber
  • https://github.com/chezou/tabula-py
  • https://github.com/py-pdf/pypdf
    • (But no table extraction: https://github.com/py-pdf/pypdf/discussions/1181)
  • https://github.com/pdfminer/pdfminer.six
  • https://github.com/jstockwin/py-pdf-parser

johnthagen avatar Dec 21 '23 17:12 johnthagen

Not sure if people saw it, but in #479 I show some ideas I had with the docs.

With care I think it should be feasible to guide most people around the current difficulties with installation (I've managed setup in Windows and various Linux environments, no access to Mac but guess it's not that different to the Linux steps for the most part)

nmstoker avatar Jan 05 '24 20:01 nmstoker

We need to fork camelot if we want to continue developing it.

I've already talked with the people of py-pdf (website) and they are fine moving it there. But we need two people who would take care of it so that it's not another dead version.

@bosd @foarsitter Would you still be fine with becoming the new maintainers?

Discussion is here: https://github.com/py-pdf/pypdf/discussions/2466

MartinThoma avatar Feb 24 '24 08:02 MartinThoma

@MartinThoma I'm willing to help where I can!

foarsitter avatar Feb 26 '24 07:02 foarsitter

@MartinThoma : Please pull me in. I would like to contribute to the code.

python3-dev avatar Feb 29 '24 07:02 python3-dev

I can fix the PdfFileReader deprecation error, please pull me in.

ammadakram avatar Mar 22 '24 09:03 ammadakram