[WIP] Split dependencies

Open Bobholamovic opened this issue 6 months ago • 1 comments

WIP; DO NOT MERGE!

This PR relies on https://github.com/PaddlePaddle/PaddleX/pull/4177

Some preliminary results:

  • base (with the basic OCR functionality only)
    • Installation command: pip install paddleocr
    • Number of dependencies: 35
    • Total size of dependencies: 401 MB
  • all (with all functionalities)
    • Installation command: pip install "paddleocr[all]"
    • Number of dependencies: 97
    • Total size of dependencies: 758 MB
  • paddleocr 2.10
    • Number of dependencies: 41
    • Total size of dependencies: 728 MB

Bobholamovic avatar Jun 12 '25 08:06 Bobholamovic

Thanks for your contribution!

paddle-bot[bot] avatar Jun 12 '25 08:06 paddle-bot[bot]

I'm really interested in this PR. Can you say when it will be ready? I would love to see this.

timminator avatar Jun 29 '25 22:06 timminator

Since this PR might change the installation method, we're concerned that some users may have questions about it. Therefore, we started two polls: https://github.com/PaddlePaddle/PaddleOCR/discussions/15748 and https://github.com/PaddlePaddle/PaddleOCR/discussions/15795 . So far, these polls have not received many votes, so we're still waiting for more. We are considering adding this support in the next minor release (3.2) if the majority is in favor of the proposal.

Bobholamovic avatar Jun 30 '25 02:06 Bobholamovic

Thanks for your answer. I voted now. To tell the truth - I had never taken a look at the Poll tab before😅. A silly question - does this PR work in its current state? If so, I will try it out in a new venv when I find some time.

timminator avatar Jun 30 '25 07:06 timminator

> Thanks for your answer. I voted now. To tell the truth - I had never taken a look at the Poll tab before😅.

Maybe we should find a place to put these polls so they’re more noticeable. If you have any suggestions, don't hesitate to tell us.

> Does this PR work in its current state? If so, I will try it out in a new venv when I find some time.

Yes, it works. But please note that some additional modifications were made in PaddleX, which means that to get this PR to work you may have to update PaddleX according to https://github.com/PaddlePaddle/PaddleX/pull/4177 .
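
Because the PR depends on matching changes in PaddleX, it can help to confirm which versions of both packages are actually present in the test venv before trying it out. A minimal sketch, assuming the distributions are installed under the names `paddleocr` and `paddlex`; the helper `installed_version` is made up for illustration. Note that a version string alone does not prove that the changes from the linked PaddleX PR are included.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your install differs.
    for name in ("paddleocr", "paddlex"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```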

Bobholamovic avatar Jun 30 '25 07:06 Bobholamovic

Tested it. Works perfectly. Thank you for working on this. I only need the OCR capability so a slimmed down version is ideal for my use case.

> Maybe we should find a place to put these polls so they’re more noticeable. If you have any suggestions, don't hesitate to tell us.

You could open an issue similar to the Non-CJK/English Support Initiative and pin it. There you could briefly explain the two options and post a link to the poll. I think far more people would see it that way.

timminator avatar Jun 30 '25 14:06 timminator