
Please allow self-signed CA certificates for the S3 source

Open GentleGhostCoder opened this issue 2 years ago • 2 comments

I get SSL / CA certificate errors:

'SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain'

When connecting to a non-AWS S3 endpoint, the boto3 client needs to be given the CA certificate that signed the endpoint's certificate. It would be best if all source connections had an option to set a custom CA certificate path (e.g. /etc/ssl/certs/ca-certificates.crt).

GentleGhostCoder avatar Jun 30 '22 20:06 GentleGhostCoder

This issue is stale because it has been open for 15 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot] avatar Aug 02 '22 02:08 github-actions[bot]

Yes, it is still an issue. I used the following image (version): acryldata/datahub-actions:v0.0.4. I also believe the fix could be just a few lines of code: only the "cafile" / "verify" parameter would have to be passed to the S3 client.

GentleGhostCoder avatar Aug 02 '22 07:08 GentleGhostCoder

Hi @semmjon, could you please add more detail about the issue you are facing, such as the recipe, the DataHub version, and which container is raising the above error?

siddiquebagwan avatar Sep 02 '22 12:09 siddiquebagwan

Hi, it's been a while now, which is why I haven't tried it again with the new version. The test case was the following:

  • Private S3 object storage with a self-signed certificate (in my case Ceph, but I don't think it matters which S3 storage is used)
  • One of the most recent Helm charts on a K3s cluster
  • The error appeared when scheduling the S3 ingestion
  • It probably originates in the boto3 module in the acryldata/datahub-actions container (v0.0.4 at the time, but probably not fixed in newer versions yet)
  • To fix the error, a CA file would have to be passed when creating the S3 client.
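
A minimal sketch of what passing a CA file could look like, assuming plain boto3 and not DataHub's actual S3 source code. The helper names (`build_s3_client_kwargs`, `make_s3_client`) and the endpoint URL are hypothetical; the only real API relied on is the `verify` argument of `boto3.client`, which accepts a path to a CA bundle.

```python
from typing import Any, Dict, Optional


def build_s3_client_kwargs(
    endpoint_url: str,
    ca_bundle: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for boto3.client("s3", ...).

    Hypothetical helper: `ca_bundle` stands in for a recipe option
    that DataHub would have to expose.
    """
    kwargs: Dict[str, Any] = {"endpoint_url": endpoint_url}
    if ca_bundle:
        # boto3 accepts a path to a CA bundle through `verify`.
        kwargs["verify"] = ca_bundle
    return kwargs


def make_s3_client(endpoint_url: str, ca_bundle: Optional[str] = None):
    """Create an S3 client that trusts the given CA bundle."""
    # Imported here so the kwargs helper can be used without boto3 installed.
    import boto3

    return boto3.client("s3", **build_s3_client_kwargs(endpoint_url, ca_bundle))


if __name__ == "__main__":
    # Placeholder endpoint; replace with the private S3 (e.g. Ceph) URL.
    print(
        build_s3_client_kwargs(
            "https://s3.example.internal",
            "/etc/ssl/certs/ca-certificates.crt",
        )
    )
```

When no CA path is given, the helper leaves `verify` unset so boto3 keeps its default certificate verification. As a possible workaround without code changes, botocore also reads a CA bundle path from the `AWS_CA_BUNDLE` environment variable, though I have not tested that inside the datahub-actions container.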

GentleGhostCoder avatar Sep 02 '22 14:09 GentleGhostCoder

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot] avatar Oct 03 '22 02:10 github-actions[bot]