
[BUG] Sending cloud events from flyteadmin not documented properly for GCP and results in errors

Open fg91 opened this issue 2 years ago • 1 comments

Describe the bug

The configuration of flyteadmin's cloud events integration is documented here.

For GCP, for instance, it says:

         cloud_events.yaml: |
           cloudEvents:
             enable: true
             gcp:
               region: us-east-2
             eventsPublisher:
               eventTypes:
               - all # or node, task, workflow
               topicName: my-topic
             type: gcp

When such a config is included in the helm values file, flyteadmin panics with the following error:

{"json":{},"level":"fatal","msg":"caught panic: project id is required [goroutine 1 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x65\ngithub.com/flyteorg/flyteadmin/pkg/rpc/adminservice.NewAdminServer.func1()\n\t/go/src/github.com/flyteorg/flyteadmin/pkg/rpc/adminservice/base.go:74 +0x88\npanic({0x2273960, 0xc0013f1dd0})\n\t/usr/local/go/src/runtime/panic.go:838 +0x207\ngithub.com/flyteorg/flyteadmin/pkg/async/cloudevent.NewCloudEventsPublisher({0x2c04440, 0xc000128000}, {0x1, {0xc00051b4e8, 0x3}, {{0xc00051b470, 0xb}}, {{0x0, 0x0}}, {{0x0, ...}, ...}, ...}, ...)\n\t/go/src/github.com/flyteorg/flyteadmin/pkg/async/cloudevent/factory.go:61 +0x905\ngithub.com/flyteorg/flyteadmin/pkg/rpc/adminservice.NewAdminServer({0x2c04440?, 0xc000128000}, 0xc0006a3b80, {0x2c0cb60, 0xc00058c6c0}, {0x0, 0x0}, {0x0, 0x0}, 0xc000b563c0, ...)\n\t/go/src/github.com/flyteorg/flyteadmin/pkg/rpc/adminservice/base.go:103 +0x845\ngithub.com/flyteorg/flyteadmin/pkg/

When setting the projectId instead of the region as done here, this error disappears.
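
For reference, a minimal sketch of the adjusted GCP config, assuming the same layout as the documented snippet above with `region` swapped for `projectId` (the key name matches what appears in the rendered configmap further down). `my-project` and `my-topic` are placeholders, not real values:

```yaml
cloud_events.yaml: |
  cloudEvents:
    enable: true
    gcp:
      projectId: my-project # placeholder: your GCP project ID (replaces region)
    eventsPublisher:
      eventTypes:
      - all # or node, task, workflow
      topicName: my-topic # placeholder: your Pub/Sub topic name
    type: gcp
```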

However, the rendered cloud_events.yaml in the flyteadmin configmap still contains the AWS config:

❯ k -n flyte get configmaps flyte-admin-base-config -o yaml
apiVersion: v1
data:
  cloud_events.yaml: "cloudEvents: \n  aws:\n    region: us-east-2\n  enable: true\n
    \ eventsPublisher:\n    eventTypes:\n    - all\n    topicName: <my-topic>\n
    \ gcp:\n    projectId: <my-project>\n  type: gcp\n"
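
Unescaped, the string in that configmap reads as follows, which makes the leftover `aws` block next to the `gcp` block easier to spot. This is only a decoded view of the output above; the `<my-topic>` and `<my-project>` placeholders are kept as they appear there:

```yaml
cloudEvents:
  aws:
    region: us-east-2 # not set in my values file, still present in the rendered config
  enable: true
  eventsPublisher:
    eventTypes:
    - all
    topicName: <my-topic>
  gcp:
    projectId: <my-project>
  type: gcp
```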

Despite the unintended AWS config, the published events do reach GCP Pub/Sub and can be pulled in the cloud console.

However, even though the messages can be pulled successfully from the Pub/Sub topic/subscription, flyteadmin logs the following errors for every event:

{"json":{"exec_id":"b862f22a535442bda2e4","node":"n0"},"level":"error","msg":"Failed to publish a message with key [flyteidl.admin.TaskExecutionEventRequest] and message [Context Attributes,\n  specversion: 1.0\n  type: com.flyte.resource.flyteidl.admin.TaskExecutionEventRequest\n   ... ] and error: context canceled","ts":"2023-05-10T11:46:42Z"}
{"json":{"exec_id":"b862f22a535442bda2e4","node":"n0"},"level":"error","msg":"Failed to send message [event:\u003ctask_id:\u003cresource_type:TASK ... ] with error: context canceled","ts":"2023-05-10T11:46:42Z"}

Expected behavior

  • The documentation should tell GCP users to set the projectId, not the region.
  • When configuring GCP, the resulting configmap should not contain AWS config.
  • Flyteadmin should not log publish errors when the published events reach Pub/Sub and can be pulled from there.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • [X] Yes

Have you read the Code of Conduct?

  • [X] Yes

fg91 avatar May 10 '23 11:05 fg91

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏

github-actions[bot] avatar Feb 07 '24 00:02 github-actions[bot]