keda icon indicating copy to clipboard operation
keda copied to clipboard

Prefix for 'goSDK' checkpoint strategy for Azure Event Hub scalter

Open brainslush opened this issue 2 years ago • 17 comments

Proposal

I suggest to add the possibility to define a prefix for the 'goSDK' checkpoint strategy in the trigger specification of the Azure Event Hub scaler.

e.g.

triggers:
- type: azure-eventhub
  metadata:
    connectionFromEnv: EVENTHUB_CONNECTIONSTRING_ENV_NAME
    storageConnectionFromEnv: STORAGE_CONNECTIONSTRING_ENV_NAME
    consumerGroup: $Default
    unprocessedEventThreshold: '64'
    blobContainer: 'name_of_container'
    goSdkPrefix: 'someDir/dapr-dapreventhubname-daprconsumergroup-'
    checkpointStrategy: goSdk

The search path for the checkpoint file would look like: https://myaccount.blob.core.windows.net/name_of_container/someDir/dapr-dapreventhubname-daprconsumergroup-0

Use-Case

Allows usage of Keda together with DAPR and Azure Event Hub. Dapr creates files on the blob storage which are named {blobContainer}/dapr-{topic}-{consumerGroup}-{partitionId} . Keda on the other hand looks for {blobContainer}/{partiotionId}.

Anything else?

Even though I don't know go yet I would try myself on a PR.

brainslush avatar May 08 '22 11:05 brainslush

There are two options here, we can do either or both:

  1. Support a prefix/format, potentially requiring value injection as shown above
  2. Support dapr checkpointing, but then this same problem might come up later on with another product

Thoughts @kedacore/keda-core-contributors?

tomkerkhove avatar May 09 '22 05:05 tomkerkhove

I would go with the value injection approach as it leaves room for customization.

brainslush avatar May 09 '22 06:05 brainslush

That is a good option, but doing both might be good as well given every Dapr end-user would otherwise have to do this customization which is less trivial from a UX standpoint.

tomkerkhove avatar May 09 '22 06:05 tomkerkhove

A suggestion:

triggers:
- type: azure-eventhub
  metadata:
    connectionFromEnv: EVENTHUB_CONNECTIONSTRING_ENV_NAME
    storageConnectionFromEnv: STORAGE_CONNECTIONSTRING_ENV_NAME
    consumerGroup: $Default
    unprocessedEventThreshold: '64'
    blobContainer: 'name_of_container'
    goSdkTemplate: 'dapr'
    checkpointStrategy: goSdk

-> {blobContainer}/dapr-{topic}-{consumerGroup}-{partitionId}

triggers:
- type: azure-eventhub
  metadata:
    connectionFromEnv: EVENTHUB_CONNECTIONSTRING_ENV_NAME
    storageConnectionFromEnv: STORAGE_CONNECTIONSTRING_ENV_NAME
    consumerGroup: $Default
    unprocessedEventThreshold: '64'
    blobContainer: 'name_of_container'
    goSdkTemplate: 'prefix'
    goSdkPrefix: 'someDir/somePrefix-'
    checkpointStrategy: goSdk

-> {blobContainer}/someDir/somePrefix-{partitionId}

brainslush avatar May 09 '22 12:05 brainslush

If we go that route, I wouldn't use goSdk* but something more generic in case we want to do this for other checkpoint strategies as well.

tomkerkhove avatar May 09 '22 12:05 tomkerkhove

How about checkpointStrategyTemplate and chekpointPrefix ?

brainslush avatar May 09 '22 21:05 brainslush

Fine by me. Thoughts @zroubalik & @JorTurFer? I'd add a Dapr strategy as well though.

tomkerkhove avatar May 10 '22 11:05 tomkerkhove

I agree with @tomkerkhove adding a Dapr strategy because it has value itself and for future product integrations, we can support also the checkpointStrategyTemplate/chekpointPrefix

JorTurFer avatar May 10 '22 16:05 JorTurFer

Are you willing to still contribute this @brainslush?

tomkerkhove avatar May 11 '22 06:05 tomkerkhove

I'll try. Haven't done coding in Go before but when I read the code I got the impression that this shouldn't be difficult.

brainslush avatar May 12 '22 11:05 brainslush

Hi,

As i checked the dapr behavior with container apps and the eventhub scaler, a view month ago, it worked well with the goSdk. But the eventhub component was rewritten:

https://github.com/dapr/components-contrib/pull/1292/commits/29e22a0d4308a1de861271a0fba1ec48fe737555#diff-3a38d7491c58638ad54630db869afd8fbd80db683f62dc62f18a9c5b214b0962

which is good, we are waiting for some changes too :-)

So, I'm with @tomkerkhove. from the end-user perspective dapr is more concrete and readable. For example, if you create a container app you don't want to configure things like a template path. This is an implementation detail.

I would focus on some concrete checkpoint strategy implementations and not a generic solution that lifts the complexity into the scaler configuration. Most of the eventhub processor implementations like in c#, Azure Functions, java, javascript, or python switched over to the metadata checkpoint strategy. So the gosdk variant is currently a very specific one.

just my opinion. I think If there is a need for a very generic solution, it could be added later. So i guess the dapr one is more important.

@brainslush, i started last year with the checkpoint strategy variant. If you need help or want that i jump in, feel free to ask!

christle avatar May 30 '22 16:05 christle

I'm currently a bit busy but I still intend to do it. @christle But, in case I get stuck I'll get back to you. I agree that implementing a specific strategy is the best solution. I had a chat with the dapr developers and they assured me that they intend to keep the path and behavior as it is right now.

brainslush avatar May 31 '22 22:05 brainslush

Created https://github.com/kedacore/keda/issues/3141 for Dapr checkpointing

tomkerkhove avatar Jun 08 '22 12:06 tomkerkhove

FYI: I've created a PR on Dapr to bring that topic forward: https://github.com/dapr/components-contrib/pull/1940

christle avatar Aug 06 '22 13:08 christle

Does that mean we will only support the new Dapr version or not? (just to verify)

tomkerkhove avatar Aug 08 '22 05:08 tomkerkhove

For older versions, the goSdk Checkpoint Strategy works like before. But since January 2022, this strategy doesn't work anymore for pubsub because of the naming chance. But still for bindings.

With the small dapr change, for new dapr versions, a new checkpoint strategy "dapr" is needed. For older versions, the goSdk checkpoint strategy works well.

christle avatar Aug 09 '22 08:08 christle

Oh boy, this will be fun for support cases - Thanks for following up and let's make sure we have good docs on this :)

tomkerkhove avatar Aug 09 '22 11:08 tomkerkhove

I think this is already done. We can reopen it if not

JorTurFer avatar Dec 06 '22 15:12 JorTurFer

Related: https://github.com/kedacore/keda/issues/3141

tomkerkhove avatar Dec 09 '22 11:12 tomkerkhove