Support for multiple Kafka clusters within a spec

Open dalelane opened this issue 3 years ago • 18 comments

Is your feature request related to a problem? Please describe.

A Kafka cluster is typically made of a group of Kafka brokers. The brokers act as peers, so a client is able to make an initial connection to any broker in the cluster, and a metadata exchange takes place to inform the client which broker it should connect to.

To keep the cluster highly available to clients, Kafka clients are typically configured with the address of every broker in the Kafka cluster, so that a client can try each broker in turn if the first broker it attempts to connect to is unavailable.

Currently, this is being modelled in AsyncAPI by identifying each broker as a separate server in the list of servers. For the purposes of code generation, an assumption is made that all the Kafka servers listed in the servers section of the spec belong to the same cluster, and so all server URLs can be combined to provide a single bootstrap servers list.

Describe the solution you'd like

I'd like a way to identify multiple separate Kafka clusters within a single AsyncAPI spec.

For example, imagine I have a development/test Kafka cluster composed of three brokers A, B, C and a production Kafka cluster composed of three brokers D, E, F.

It's not safe to put all six broker addresses as six separate server objects in the spec, as I don't have a way to identify the grouping, and current code generation would treat them as a single Kafka cluster with six brokers.
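To make the risk concrete, here is a minimal Python sketch of the behaviour described above. It is not taken from any AsyncAPI generator or template: the `bootstrap_servers` helper, the server names and the host names are all hypothetical, and the spec is written as a plain dict rather than parsed from YAML.

```python
# Hypothetical sketch: mimics a generator that assumes every Kafka server
# listed in the spec belongs to one cluster. All names here are made up.
spec = {
    "servers": {
        "dev-a": {"url": "broker-a:9092", "protocol": "kafka"},
        "dev-b": {"url": "broker-b:9092", "protocol": "kafka"},
        "dev-c": {"url": "broker-c:9092", "protocol": "kafka"},
        "prod-d": {"url": "broker-d:9092", "protocol": "kafka"},
        "prod-e": {"url": "broker-e:9092", "protocol": "kafka"},
        "prod-f": {"url": "broker-f:9092", "protocol": "kafka"},
    }
}


def bootstrap_servers(spec):
    """Join the URL of every Kafka server into one bootstrap list."""
    return ",".join(
        server["url"]
        for server in spec["servers"].values()
        if server.get("protocol") == "kafka"
    )


# Dev and prod brokers end up in the same comma-separated list.
print(bootstrap_servers(spec))
```

Nothing in the server objects tells the helper that A, B, C and D, E, F are different clusters, so the only grouping it can apply is "everything with protocol kafka".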

Additional context

dalelane avatar Nov 25 '20 17:11 dalelane

Welcome to AsyncAPI. Thanks a lot for reporting your first issue.

Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

github-actions[bot] avatar Nov 25 '20 17:11 github-actions[bot]

Copying over the discussion from Slack, as we don't know when it will disappear because of the free plan


Dale Lane Yesterday at 10:51 AM
It's customary in Kafka to provide client applications with a comma-separated list of brokers in the cluster (so the client is able to attempt connection to any of the broker addresses in that list)
Are there any issues with putting a comma-separated list in the url field of a Server object for this purpose?
https://www.asyncapi.com/docs/specifications/2.0.0#fixed-fields-4
(or are there better ways I should represent this? like having each broker as a separate server object?) (edited) 




9 replies

Lukasz Gornicki  22 hours ago
Hey Dale, at the moment url is just for one url, and you should have multiple Server objects. This is how it is for example supported in the java template -> https://github.com/asyncapi/java-spring-template/blob/master/template/src/main/resources/application.yml#L61-L64
I do remember, though, that we had a discussion about this once in the past (but we have free Slack and it is lost) and I'm not sure... @Semen do you remember if we ended up creating an issue to discuss further?
The problem is that putting each bootstrap server in a separate Server object conflicts with our assumption about how servers should be used: that they represent environments (this is how the generator supports it) (edited) 

Lukasz Gornicki  22 hours ago
The problem is that putting each bootstrap server in a separate Server object conflicts with our assumption about how servers should be used: that they represent environments (this is how the generator supports it)
This is something I'm discussing at the moment with @Semen here -> https://github.com/asyncapi/java-spring-template/pull/55 (edited) 

Semen  22 hours ago
Yes, because of this limitation in the spec, spring-Java-template uses all servers from the API marked with protocol: kafka

Dale Lane  22 hours ago
that's really useful - thanks, both

Fran Méndez  22 hours ago
I wonder if it would be a good thing to add to the spec itself. Something like alternativeUrls. Or maybe a good use case for the Kafka Server Binding, which still doesn't exist. It could be called alternativeHosts, since Kafka usually requires hosts instead of urls.

Dale Lane  21 hours ago
That would be helpful for the case where I have two clusters (e.g. a production cluster and a development cluster) each made up of three brokers. Listing all six brokers in the servers section would get confusing.
That may be an edge case though

Fran Méndez  21 hours ago
I don't think it is. It might be an edge case for development but people often have production clusters and staging/test clusters or even a separate cluster for partners.

Fran Méndez  21 hours ago
Feel free to leave your opinion on an issue at github.com/asyncapi/asyncapi. I think this is something we should consider having on the spec, either on the core spec or on the kafka bindings.

Dale Lane  21 hours ago
will do, thanks

derberg avatar Nov 26 '20 08:11 derberg

Strawman suggestion for an approach - we could provide a cluster identifier for each server

asyncapi: '2.0.0'
servers:  
  prod-broker0:
    url: dale-prod-broker-0:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: production
  prod-broker1:
    url: dale-prod-broker-1:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: production
  prod-broker2:
    url: dale-prod-broker-2:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: production
  dev-broker0:
    url: dale-dev-broker-0:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: development
  dev-broker1:
    url: dale-dev-broker-1:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: development
  dev-broker2:
    url: dale-dev-broker-2:9092
    protocol: kafka
    bindings:
      kafka:
        cluster: development
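
As a rough illustration of how tooling could consume this strawman, the Python sketch below groups servers by the proposed `bindings.kafka.cluster` value. The `bootstrap_by_cluster` helper and the `"default"` fallback name are assumptions made for the example, and the `cluster` field itself is only the suggestion above, not part of any published binding.

```python
from collections import defaultdict

# Hypothetical sketch: a subset of the strawman spec above as a plain dict.
servers = {
    "prod-broker0": {
        "url": "dale-prod-broker-0:9092",
        "protocol": "kafka",
        "bindings": {"kafka": {"cluster": "production"}},
    },
    "prod-broker1": {
        "url": "dale-prod-broker-1:9092",
        "protocol": "kafka",
        "bindings": {"kafka": {"cluster": "production"}},
    },
    "dev-broker0": {
        "url": "dale-dev-broker-0:9092",
        "protocol": "kafka",
        "bindings": {"kafka": {"cluster": "development"}},
    },
}


def bootstrap_by_cluster(servers):
    """Build one bootstrap list per value of the proposed cluster field."""
    clusters = defaultdict(list)
    for server in servers.values():
        if server.get("protocol") != "kafka":
            continue
        # Servers without the proposed field fall into an assumed "default" group.
        name = server.get("bindings", {}).get("kafka", {}).get("cluster", "default")
        clusters[name].append(server["url"])
    return {name: ",".join(urls) for name, urls in clusters.items()}


print(bootstrap_by_cluster(servers))
```

With this grouping, a generator could ask for the `production` list or the `development` list instead of merging every Kafka server into one.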

dalelane avatar Nov 26 '20 13:11 dalelane

I'm wondering if we really need the multiple clusters feature. For instance, if you think about how you would deploy an application generated with this AsyncApiSpec in K8s, the multiple cluster feature would not add any benefit. Usually, you have one Async spec for dev and one for prod because they will evolve at a different speed. Am I missing anything here? 🤔

Having said that, I like the feature of passing multiple brokers for one cluster. It's very in line with almost every streaming platform 🙂

fnobilia avatar Dec 05 '20 12:12 fnobilia

I think I'm just used to being able to include both because that's how I use OpenAPI (e.g. https://spec.openapis.org/oas/v3.0.3#server-object-example )

dalelane avatar Dec 05 '20 12:12 dalelane

Usually, I use /doc, or I have a separate service serving just the spec. This setup plus my CI/CD forces me to have only one spec per environment. How do you expose your OpenAPI spec? Can you help me understand your setup?

fnobilia avatar Dec 05 '20 18:12 fnobilia

+1 for this feature request. When developing a new version of the spec, we will typically create a new branch with a new spec version. We do not programmatically change the contract when promoting the spec from dev/test to prod, it is just approved and promoted.

Our Kafka clients need a cluster server list in prod that is separate and independent from the list for the dev/test environment

buyukim avatar Jan 13 '21 21:01 buyukim

Could this be considered a duplicate of https://github.com/asyncapi/spec/issues/244? Not sure. That's why I'm asking.

smoya avatar Mar 19 '21 11:03 smoya

> Could this be considered a duplicate of #244? Not sure. That's why I'm asking.

Yes, I think so - I hadn't seen that issue before

dalelane avatar Mar 19 '21 11:03 dalelane

Isn't that issue more related to using multiple sets of brokers, each with different topics? Seems like a more complicated implementation since there is still only one url per server

buyukim avatar Mar 21 '21 13:03 buyukim

This issue has been automatically marked as stale because it has not had recent activity :sleeping: It will be closed in 60 days if no further activity occurs. To unstale this issue, add a comment with detailed explanation. Thank you for your contributions :heart:

github-actions[bot] avatar Jul 12 '21 00:07 github-actions[bot]

> Yes, I think so - I hadn't seen that issue before

It is not. That one was pointing to another direction.

This one is still valid and I think this should become an actual strawman RFC0.

smoya avatar Oct 05 '21 15:10 smoya

I think the question we might need to answer here is: is this worth adding to the core spec, or should it rather be a Kafka server binding?

Does the "cluster" concept apply to other protocols (it doesn't have to apply to all, but to most)?

smoya avatar Oct 05 '21 16:10 smoya

This is not really specific to Kafka; NATS has the same "problem". Usually, in code you just pass an array or a ;-separated string of URLs to connect to. I wonder if it would make sense to allow multiple connection URLs to be defined?

asyncapi: '2.0.0'
servers:  
  prod-broker:
    url: 
      - dale-prod-broker-0:9092
      - dale-prod-broker-1:9092
      - dale-prod-broker-2:9092
    protocol: kafka
  dev-broker:
    url: 
      - dale-dev-broker-0:9092
      - dale-dev-broker-1:9092
      - dale-dev-broker-2:9092
    protocol: kafka

To me, this is the simplest and least complicated approach to solve this 🤔
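
For what it's worth, tooling could support such a change without breaking existing specs by accepting both shapes. The Python sketch below assumes `url` may be either a single string (as in 2.0.0) or a list (as proposed here); the `connection_urls` helper is hypothetical.

```python
# Hypothetical sketch: accept url as a single string or as a list of strings.
def connection_urls(server):
    """Return the server's connection URLs as a list, whatever shape url has."""
    url = server["url"]
    return [url] if isinstance(url, str) else list(url)


prod_broker = {
    "url": [
        "dale-prod-broker-0:9092",
        "dale-prod-broker-1:9092",
        "dale-prod-broker-2:9092",
    ],
    "protocol": "kafka",
}
single_broker = {"url": "dale-dev-broker-0:9092", "protocol": "kafka"}

# Kafka clients usually take a comma-separated bootstrap list.
print(",".join(connection_urls(prod_broker)))
print(",".join(connection_urls(single_broker)))
```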

jonaslagoni avatar Dec 17 '21 13:12 jonaslagoni

@dalelane do you want to champion this? 🙂 Or can we consider this issue as needs champion? 🤔

jonaslagoni avatar Jan 19 '22 18:01 jonaslagoni

@jonaslagoni sure - I'd be happy to pick this up again

dalelane avatar Jan 23 '22 20:01 dalelane

This issue has been automatically marked as stale because it has not had recent activity :sleeping:

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience :heart:

github-actions[bot] avatar Jul 28 '22 00:07 github-actions[bot]

support for tags in servers being added in https://github.com/asyncapi/spec/issues/465 gives us a mechanism for describing this

I'll leave the issue open for now so I can add an example that demonstrates this
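
In the meantime, a rough Python sketch of how server tags might be used for this. The `cluster:production` / `cluster:development` tag names are only an assumed convention, not something the spec defines, and the `urls_for_tag` helper is hypothetical.

```python
# Hypothetical sketch: servers tagged with an assumed "cluster:<name>" convention.
servers = {
    "prod-broker0": {
        "url": "dale-prod-broker-0:9092",
        "protocol": "kafka",
        "tags": [{"name": "cluster:production"}],
    },
    "prod-broker1": {
        "url": "dale-prod-broker-1:9092",
        "protocol": "kafka",
        "tags": [{"name": "cluster:production"}],
    },
    "dev-broker0": {
        "url": "dale-dev-broker-0:9092",
        "protocol": "kafka",
        "tags": [{"name": "cluster:development"}],
    },
}


def urls_for_tag(servers, tag_name):
    """Collect the URLs of every server carrying the given tag name."""
    return [
        server["url"]
        for server in servers.values()
        if any(tag.get("name") == tag_name for tag in server.get("tags", []))
    ]


print(",".join(urls_for_tag(servers, "cluster:production")))
```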

dalelane avatar Sep 22 '22 08:09 dalelane

> support for tags in servers being added in #465 gives us a mechanism for describing this

I think the link is not correct. Maybe you wanted to refer to https://github.com/asyncapi/spec/pull/809

smoya avatar Sep 22 '22 10:09 smoya

yes, that's right... thanks!

(sorry - that'll teach me to try and multi-task! 🤦‍♂️)

dalelane avatar Sep 22 '22 11:09 dalelane