Added GitHub Action

Open MarketingPip opened this issue 3 years ago • 6 comments

Added a GitHub action for this project. You can find the action here.

OCR PDF Action: A GitHub action for turning scanned PDFs into searchable documents

:+1:

MarketingPip avatar Jul 11 '22 09:07 MarketingPip

Is this legitimate use of GitHub Actions?

jbarlow83 avatar Jul 12 '22 09:07 jbarlow83

@jbarlow83 - please define "legitimate use" of GitHub Actions? As far as I know, it is. I see this as a benefit to users: they can use this in a GitHub Actions workflow without needing to install Python etc. on their own computer.

I have also provided an example of a PDF turned into an OCR'd PDF.

Feel free to try on a test PDF file of your own in a GitHub repo.

Hope this helps :+1:

Any more questions etc, feel free to ask!

MarketingPip avatar Jul 12 '22 09:07 MarketingPip

GitHub Actions is intended primarily for CI/CD of software. https://docs.github.com/en/site-policy/github-terms/github-terms-for-additional-products-and-features#actions

any activity that places a burden on our servers, where that burden is disproportionate to the benefits provided to users (for example, don't use Actions as a content delivery network or as part of a serverless application, but a low benefit Action could be ok if it’s also low burden); or

I believe using OCRmyPDF as a GitHub Action would be treating the service as a serverless application, and since OCR is CPU intensive, it would not qualify as a low burden.

I am open to discussion - perhaps there's a practical and justifiable use that I'm not seeing. But I'll need to be convinced that this is not violating the Terms of Service.

jbarlow83 avatar Jul 12 '22 20:07 jbarlow83

As far as I know, there is NO violation of terms of service.

To better understand, though, I'd be happy for you to message the GitHub team and ask whether the action is a violation.

As I said, as far as I know there is none, but I don't want to make any false claims.

MarketingPip avatar Jul 13 '22 00:07 MarketingPip

I will leave it to you to contact GitHub and obtain their approval. Please post their response here.

jbarlow83 avatar Jul 13 '22 01:07 jbarlow83

@jbarlow83 - will do! I obviously would not want to violate the terms or publish any work that does. So I will confirm, but for now I am under the assumption that this is allowed.

MarketingPip avatar Jul 13 '22 02:07 MarketingPip

You can keep this PR open.

Still waiting for a reply from GitHub. Though again, I do NOT believe this violates any TOS.

If that were the case, I would NOT have published it.

MarketingPip avatar Sep 16 '22 20:09 MarketingPip