
[Bug]: Local build following readme results in OpenSSL error

Open pcgilday opened this issue 1 year ago • 4 comments

Related Template(s)

Cloud_Datastream_To_BigQuery

What happened?

I am trying to build a template and getting java.lang.NoClassDefFoundError: org/conscrypt/OpenSSLProvider. It's unclear if this is a bug, an issue with the README instructions, or user error on my part (apologies if it's the last one).

I am building the template with the staging templates docs:

mvn clean package -PtemplatesStage \
 -DskipTests \
 -DprojectId="<company-project>" \
 -DbucketName="gs://<company-bucket>" \
 -DstagePrefix="images/$(date +%Y_%m_%d)_01" \
 -DtemplateName="Cloud_Datastream_To_BigQuery" \
 -pl v2/datastream-to-bigquery -am

This appears to build successfully, but nothing is deployed to the bucket I specified, so I used the following command, which I found in the docs.

gcloud dataflow flex-template build gs://<company-bucket>/datastream-to-bigquery.json \
 --image-gcr-path "<artifact-registry>" \
 --sdk-language "JAVA" \
 --flex-template-base-image JAVA11 \
 --metadata-file "metadata.json" \
 --jar "target/datastream-to-bigquery-1.0-SNAPSHOT.jar" \
 --env FLEX_TEMPLATE_JAVA_MAIN_CLASS="com.google.cloud.teleport.v2.templates.DataStreamToBigQuery"

Everything looks good to this point, but when I run the job, I get the OpenSSL error mentioned above.

Beam Version

Newer than 2.46.0

Relevant log output

{
  "insertId": "4800712471473325515:155206:0:41351",
  "jsonPayload": {
    "line": "exec.go:66",
    "message": "java.lang.NoClassDefFoundError: org/conscrypt/OpenSSLProvider"
  },
  "resource": {
    "type": "dataflow_step",
    "labels": {
      "job_name": "datastream-cdc-dataflow-job-f0aa8fb",
      "job_id": "2023-05-08_11_08_15-16393896831592254474",
      "region": "us-east4",
      "project_id": "<company-project>",
      "step_id": ""
    }
  },
  "timestamp": "2023-05-08T18:14:49.846934Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "datastream-cdc-datafl-05081110-tko8-harness-qzj3",
    "compute.googleapis.com/resource_type": "instance",
    "dataflow.googleapis.com/log_type": "system",
    "dataflow.googleapis.com/job_name": "datastream-cdc-dataflow-job-f0aa8fb",
    "dataflow.googleapis.com/job_id": "2023-05-08_11_08_15-16393896831592254474",
    "dataflow.googleapis.com/region": "us-east4",
    "compute.googleapis.com/resource_id": "4800712471473325515"
  },
  "logName": "projects/<company-project>/logs/dataflow.googleapis.com%2Fworker-startup",
  "receiveTimestamp": "2023-05-08T18:14:50.397145129Z"
}

pcgilday • May 08 '23 18:05

You don't need to run gcloud dataflow flex-template build after using the plugin via mvn clean package -PtemplatesStage. You can go straight to running the job with the template that the stage command produced.

There is some non-determinism when using Conscrypt that comes from the shaded JAR. The plugin ensures that Conscrypt is added separately and earlier on the classpath.

bvolpato • May 15 '23 13:05
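
To illustrate the ordering described in the comment above, here is a minimal sketch of putting a separately packaged Conscrypt JAR ahead of the shaded JAR when assembling a classpath string. This is not the Templates plugin's actual implementation. The helper names, the example file names, and the choice to detect Conscrypt by file name are all assumptions made for illustration.

```python
import os


def conscrypt_first(jars):
    """Return the JAR paths with Conscrypt JARs moved to the front.

    Assumption: a Conscrypt JAR can be recognized by "conscrypt" in its
    file name. sorted() is stable, so the relative order of all other
    JARs is preserved.
    """
    def is_conscrypt(path):
        return "conscrypt" in os.path.basename(path).lower()

    # The key is False (sorts first) for Conscrypt JARs, True otherwise.
    return sorted(jars, key=lambda path: not is_conscrypt(path))


def build_classpath(jars, sep=":"):
    """Join the reordered JAR paths into a single classpath string."""
    return sep.join(conscrypt_first(jars))
```

With a classpath built this way, the JVM would search the Conscrypt JAR first, so the classes would no longer depend on what ended up inside the shaded JAR.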

Suggest checking the steps here: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/datastream-to-bigquery/README_Cloud_Datastream_to_BigQuery.md

bvolpato • May 16 '23 17:05

Actually I'm experiencing exactly the same issue using a different template. If I follow the instructions from v2/kafka-to-pubsub README.md, I can create the template, and start running it. But the workers fail with the same error message as above. The generated kafka-to-pubsub-1.0-SNAPSHOT.jar doesn't contain the conscrypt classes.

meken • Oct 04 '23 12:10
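
One way to confirm the observation above (that the built JAR lacks the Conscrypt classes) is to inspect the archive directly, since a JAR is a ZIP file. The sketch below is a hedged example. The function names are hypothetical, and the class name is taken from the NoClassDefFoundError message in the log.

```python
import zipfile

# Class named in the NoClassDefFoundError, in slash-separated form.
MISSING_CLASS = "org/conscrypt/OpenSSLProvider"


def jar_contains_class(jar_path, class_name=MISSING_CLASS):
    """Return True if the JAR has a .class entry for class_name."""
    entry = class_name + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def conscrypt_entries(jar_path):
    """List every archive entry under org/conscrypt/ (empty if none)."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if name.startswith("org/conscrypt/")]
```

For example, calling jar_contains_class("target/kafka-to-pubsub-1.0-SNAPSHOT.jar") would be expected to return False if the JAR is in the state described above. The target/ path is an assumption based on the default Maven output directory.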

@meken Thanks! KafkaToPubsub hasn't been released as a proper template yet, so it isn't annotated for use with the Templates plugin, and the suggestion above doesn't apply.

By default, shading excludes Conscrypt because it contains signatures and doesn't work properly when shaded (see https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/pom.xml#L523-L525). We'll have to package the Conscrypt JAR separately if the template requires it.

We can likely revisit the annotations so that this template uses the plugin, which has this handling built in.

bvolpato • Oct 05 '23 02:10

This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.

github-actions[bot] • May 20 '24 14:05

This issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions[bot] • May 29 '24 02:05