DataflowTemplates
[Bug]: Local build following readme results in OpenSSL error
Related Template(s)
Cloud_Datastream_To_BigQuery
What happened?
I am trying to build a template and am getting java.lang.NoClassDefFoundError: org/conscrypt/OpenSSLProvider. It's unclear if this is a bug, an issue with the README instructions, or user error on my part (apologies if it's the last one).
I am building the template by following the template staging docs:
mvn clean package -PtemplatesStage \
-DskipTests \
-DprojectId="<company-project>" \
-DbucketName="gs://<company-bucket>" \
-DstagePrefix="images/$(date +%Y_%m_%d)_01" \
-DtemplateName="Cloud_Datastream_To_BigQuery" \
-pl v2/datastream-to-bigquery -am
This appears to build successfully, but nothing is deployed to the bucket I specified, so I used the following command, which I found in the docs:
gcloud dataflow flex-template build gs://<company-bucket>/datastream-to-bigquery.json \
--image-gcr-path "<artifact-registry>" \
--sdk-language "JAVA" \
--flex-template-base-image JAVA11 \
--metadata-file "metadata.json" \
--jar "target/datastream-to-bigquery-1.0-SNAPSHOT.jar" \
--env FLEX_TEMPLATE_JAVA_MAIN_CLASS="com.google.cloud.teleport.v2.templates.DataStreamToBigQuery"
Everything looks good to this point, but when I run the job, I get the OpenSSL error mentioned above.
Beam Version
Newer than 2.46.0
Relevant log output
{
"insertId": "4800712471473325515:155206:0:41351",
"jsonPayload": {
"line": "exec.go:66",
"message": "java.lang.NoClassDefFoundError: org/conscrypt/OpenSSLProvider"
},
"resource": {
"type": "dataflow_step",
"labels": {
"job_name": "datastream-cdc-dataflow-job-f0aa8fb",
"job_id": "2023-05-08_11_08_15-16393896831592254474",
"region": "us-east4",
"project_id": "<company-project>",
"step_id": ""
}
},
"timestamp": "2023-05-08T18:14:49.846934Z",
"severity": "INFO",
"labels": {
"compute.googleapis.com/resource_name": "datastream-cdc-datafl-05081110-tko8-harness-qzj3",
"compute.googleapis.com/resource_type": "instance",
"dataflow.googleapis.com/log_type": "system",
"dataflow.googleapis.com/job_name": "datastream-cdc-dataflow-job-f0aa8fb",
"dataflow.googleapis.com/job_id": "2023-05-08_11_08_15-16393896831592254474",
"dataflow.googleapis.com/region": "us-east4",
"compute.googleapis.com/resource_id": "4800712471473325515"
},
"logName": "projects/<company-project>/logs/dataflow.googleapis.com%2Fworker-startup",
"receiveTimestamp": "2023-05-08T18:14:50.397145129Z"
}
You don't need to run gcloud dataflow flex-template build after using the plugin via mvn clean package -PtemplatesStage. You can go straight to running the template produced by the stage command.
There is some non-determinism when using Conscrypt that comes from the shaded JAR. The plugin ensures that Conscrypt is added separately and earlier on the classpath.
Suggest checking the steps here: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/datastream-to-bigquery/README_Cloud_Datastream_to_BigQuery.md
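For anyone who wants to confirm what the classpath looks like before launching a job, here is a minimal diagnostic sketch. It is not part of DataflowTemplates or the plugin; the class name ClasspathProbe and its locate helper are hypothetical. It only reports whether a class can be resolved by the current class loader and which JAR or directory it came from, which may help tell whether Conscrypt is picked up from a separate JAR.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of DataflowTemplates. */
class ClasspathProbe {

  /** Returns where a class was loaded from, or null if it cannot be loaded. */
  static String locate(String className) {
    try {
      // initialize=false resolves the class without running its static initializers.
      Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      CodeSource source = cls.getProtectionDomain().getCodeSource();
      // Classes that ship with the JDK itself have no code source.
      return source == null ? "<bootstrap>" : String.valueOf(source.getLocation());
    } catch (ClassNotFoundException | LinkageError e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // Defaults to the class named in the error from this issue.
    String name = args.length > 0 ? args[0] : "org.conscrypt.OpenSSLProvider";
    String location = locate(name);
    System.out.println(
        location == null ? name + " is NOT on the classpath" : name + " loaded from " + location);
  }
}
```

Run it with the same classpath you intend to give the template's main class. If it reports that the class is not on the classpath, the worker would be expected to fail in the same way.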
Actually I'm experiencing exactly the same issue using a different template. If I follow the instructions from v2/kafka-to-pubsub README.md, I can create the template, and start running it. But the workers fail with the same error message as above. The generated kafka-to-pubsub-1.0-SNAPSHOT.jar doesn't contain the conscrypt classes.
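A quick way to verify that observation on any built JAR is sketched below. The JarClassCheck class is a hypothetical helper, not something from the repository; it uses only java.util.jar.JarFile from the JDK and looks for the entry that corresponds to a fully qualified class name.

```java
import java.io.IOException;
import java.util.jar.JarFile;

/** Hypothetical helper for inspecting a built JAR; not part of DataflowTemplates. */
class JarClassCheck {

  /** Returns true if the JAR at jarPath has a .class entry for className. */
  static boolean containsClass(String jarPath, String className) throws IOException {
    // org.conscrypt.OpenSSLProvider -> org/conscrypt/OpenSSLProvider.class
    String entry = className.replace('.', '/') + ".class";
    try (JarFile jar = new JarFile(jarPath)) {
      return jar.getEntry(entry) != null;
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 1) {
      System.err.println("usage: JarClassCheck <path-to-jar> [class-name]");
      return;
    }
    // Defaults to the class named in the error from this issue.
    String className = args.length > 1 ? args[1] : "org.conscrypt.OpenSSLProvider";
    boolean found = containsClass(args[0], className);
    System.out.println(className + (found ? " is present in " : " is missing from ") + args[0]);
  }
}
```

For example, pointing it at target/kafka-to-pubsub-1.0-SNAPSHOT.jar should report the class as missing if the JAR was built as described above.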
@meken Thanks! KafkaToPubsub wasn't released as a proper template yet, so it wasn't annotated for use with the Templates plugin, and the suggestion above doesn't apply.
By default, shading excludes Conscrypt because it contains signatures and doesn't work properly when shaded (see https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/pom.xml#L523-L525). We'll have to package the Conscrypt JAR separately if the template requires it.
We can likely revisit the annotations to use the plugin, as it has this handling embedded.
This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.