incubator-kie-kogito-apps
Data Index support to handle the same workflow/process name in different namespaces
Description
In a cloud environment, when a workflow is deployed via a sonataflow CR, the identity of the workflow within a k8s cluster is determined by the combination of its name and namespace. The Data Index, however, has no concept of a workflow namespace.
This can cause a problem when two workflows are deployed on the same cluster in different namespaces. Now that the SonataflowClusterPlatform is supported as a way to create a global Data Index that spans the cluster, we can hit a situation in which two (or more) workflows with the same name are each deployed successfully in their own namespace, but only the most recently deployed workflow definition is available in the Data Index.
As a result, workflows from the other namespaces cannot be viewed, listed, or tracked via the Data Index, and if the Data Index is relied on to expose the endpoint for the workflow, the workflow it points to might not be the intended one.
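To illustrate the collision, here is a minimal sketch in Python. It is not Data Index code: the `definitions` dict, the `register` function, and the service URLs are hypothetical stand-ins for a store that keys workflow definitions by id alone.

```python
# Hypothetical in-memory stand-in for a definition store keyed only by workflow id.
definitions = {}


def register(namespace, workflow_id, service_url):
    # The namespace is recorded as data but is NOT part of the key,
    # which mirrors the behaviour described in this issue.
    definitions[workflow_id] = {"namespace": namespace, "serviceUrl": service_url}


# Two workflows with the same name, deployed to different namespaces.
register("team-a", "greeting", "http://greeting.team-a.svc")
register("team-b", "greeting", "http://greeting.team-b.svc")

# Only one entry survives, and it belongs to the later deployment.
print(len(definitions))
print(definitions["greeting"]["serviceUrl"])
```

Any lookup for `greeting` now resolves to the `team-b` service, even when the caller meant the `team-a` workflow.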
A similar issue is reported to the operator: https://github.com/apache/incubator-kie-kogito-serverless-operator/issues/442
Implementation ideas
No response
@nmirasch @wmedvede @fjtirado it seems like the Data Index only uses the workflow/process id as the key identifier. Is it possible to add another key, or do we have to do a composite key in a single field like <namespace>/<workflow-id>?
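A rough sketch of what the single-field composite key could look like, in Python for brevity. The helper names `make_key` and `split_key` are hypothetical and nothing here is an existing Data Index API. It relies on the fact that k8s namespace names are DNS labels and so cannot contain `/`, which makes splitting on the first `/` unambiguous even if the workflow id itself contains one.

```python
def make_key(namespace, workflow_id):
    # Kubernetes namespace names never contain "/", so reject anything that does.
    if not namespace or "/" in namespace:
        raise ValueError(f"invalid namespace: {namespace!r}")
    return f"{namespace}/{workflow_id}"


def split_key(key):
    # Split on the FIRST "/" only; the remainder is the workflow id as-is.
    namespace, sep, workflow_id = key.partition("/")
    if not sep or not namespace:
        raise ValueError(f"not a composite key: {key!r}")
    return namespace, workflow_id


key = make_key("team-a", "greeting")
print(key)
print(split_key(key))
```

The trade-off is that every consumer of the id has to know about the encoding, whereas a separate key field keeps the workflow id untouched.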
@nmirasch can you do an assessment?
@ricardozanini We should have discussed the relation between k8s namespaces and data-index more thoroughly. In my opinion, the issue is that k8s namespaces exist precisely to avoid conflicts between ids, but since we are using data-index for multiple namespaces, a collision might indeed occur. This makes me think that either we should not be using data-index to group different namespaces, or we should concatenate the namespace into the id (at build time), or we should add a namespace concept (which is abstract enough) to our system.
@ricardozanini @masayag Data index expects each workflow's id to be globally unique in the ecosystem. Given the problem described when deploying two workflows on the same cluster into different namespaces, the first thing that comes to my mind is that some (manual) id management is needed to ensure the uniqueness of those ids if we want Data Index to treat them as different workflows (invoking different endpoints, serviceUrl, ...). If I understood correctly, we want to add some sort of configuration that lets us incorporate that context information into the workflow id. Let me explore different alternatives for adding that 'context' information.
@nmirasch @fjtirado as discussed offline, we should bring this discussion to CNCF as well, to add a group/namespace/context parameter to workflows. That way a given implementation can use a set of attributes, such as ID, version, and group, to identify the workflow in the ecosystem. The same idea has been implemented successfully by many systems, including k8s.
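As a sketch of that idea, identity could be modelled as a set of attributes rather than a single string. The `WorkflowIdentity` class and its field names below are hypothetical, written in Python only for illustration, and do not reflect the spec or any existing implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowIdentity:
    # Hypothetical attribute set; "group" plays the role of a namespace/context.
    group: str
    id: str
    version: str


# frozen=True makes instances hashable, so they can be used directly as keys.
store = {
    WorkflowIdentity("team-a", "greeting", "1.0"): "http://greeting.team-a.svc",
    WorkflowIdentity("team-b", "greeting", "1.0"): "http://greeting.team-b.svc",
}

# Same id and version, different group: both definitions are kept.
print(len(store))
print(store[WorkflowIdentity("team-a", "greeting", "1.0")])
```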
@ricardozanini I have created https://github.com/serverlessworkflow/specification/issues/838 in the spec