Add a new resource to deploy extensions (custom processors) into NiFi

Open ggerla opened this issue 11 months ago • 10 comments

Is your feature request related to a problem?

NiFi allows adding custom processors by copying nar files into /opt/nifi/nifi-current/extensions. Of course, this is not possible if NiFi is deployed in a container/cluster.

Describe the solution you'd like to see

I wrote a script that uses the Kubernetes API to copy the nar file into all pods of the cluster. Of course, the file is lost when a pod restarts. A possible solution is to fetch the file from some resource while the init container runs and copy it into the pod. Another solution is to mount a ConfigMap in the desired folder, but this would limit the nar file size.

Describe alternatives you've considered

No response

Additional context

No response

ggerla avatar Mar 05 '24 17:03 ggerla

This is an interesting idea, but I hesitate to make it a custom resource given the number of ways & places nars can be hosted.

Nars could be hosted via a standard web service, S3, a Maven repository, a database, etc., each of which has its own access semantics. It's easy enough to add an init container that copies the nars to a directory living in a PVC that survives pod restarts, so they only need to be copied once. Does this need a new resource to accomplish?

mh013370 avatar Mar 05 '24 17:03 mh013370

Another thing to consider is that NiFi will eventually support pulling extensions (nars) via NiFi Registry: https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#bundle-persistence-providers

Stateless NiFi already behaves this way: https://github.com/apache/nifi/blob/main/nifi-stateless/nifi-stateless-assembly/README.md

When Stateless NiFi is started, it parses the provided dataflow and determines which bundles/extensions are necessary to run the dataflow. If an extension is not available, or the version referenced by the flow is not available, Stateless may attempt to download the extensions automatically.

This seems like a feature that should be implemented in NiFi rather than in this operator.

mh013370 avatar Mar 05 '24 17:03 mh013370

Thank you for your answer. The two links you posted seem interesting; I will investigate this route. In general, it is not clear to me whether this also solves the second issue I have, which is related to library drivers (e.g. the Postgres JDBC driver).

ggerla avatar Mar 05 '24 18:03 ggerla

I analyzed your suggestion and understood that NiFi Registry is a subproject that must be installed in addition to NiFi. Does this operator also support installing the Registry?

ggerla avatar Mar 06 '24 08:03 ggerla

Nifikop supports installing NiFi and not the NiFi Registry application, but there exist helm charts for that. I want to clarify that automatically pulling extensions via NiFi Registry in core NiFi is not currently supported, but it is meant to eventually.

In the meantime, you can solve this problem by using an init container. For example:

              initContainers:
                - name: pull-extensions
                  image: d3fk/s3cmd:stable
                  imagePullPolicy: Always
                  workingDir: /extensions
                  command:
                    - "/bin/sh"
                    - "-c"
                    - |
                      s3cmd sync s3://my-bucket/my-nar.nar ./
                  volumeMounts:
                    - name: extensions
                      mountPath: /extensions

And in this case, the extensions volume is a PVC so it will survive pod restarts. This is the current recommended way to solve this problem.

mh013370 avatar Mar 06 '24 09:03 mh013370

Thanks. Basically, you wrote exactly what I had in mind: an init container that uses an S3 client to download the nar and copy it into a volume of the NiFi pod. Now let's come back to my original question. The need has two phases:

  1. add a new extension at runtime
  2. remember all extensions added

The init container satisfies the second phase, but to enable the first I need a way to copy the nar file into the S3 bucket and into the volume without restarting the NiFi pods. This is why I asked for a new custom resource.

ggerla avatar Mar 06 '24 10:03 ggerla

Okay, so this is all declarative configuration for a deployment. If you want a new nar in your deployment, you need to add it to an init container. Deployed pods are immutable and so if you want to change them, you must change the deployment configuration.

If I write a custom nar, then part of its release process is to push it to an S3 bucket. I'd then go update my NiFi deployment to pull the new nar via the init container. At that point I'm free to use it in NiFi.

mh013370 avatar Mar 06 '24 11:03 mh013370

Sorry, I'm not sure we are 100% aligned. Each NiFi pod has its own volume attached (as it is now, without changes). In this volume there is a folder, /opt/nifi/nifi-current/extensions. Now suppose we write a new CRD called nifi-extension (just as an example). When this resource is deployed into k8s, the operator downloads the nar from the S3 object store and copies it into the /opt/nifi/nifi-current/extensions path of each NiFi pod. This enables runtime deployment of the nar. Then, after a pod restart, the init container can check the list of nifi-extension resources, download them from the S3 bucket, and copy them into the /opt/nifi/nifi-current/extensions path of its own NiFi pod.

Do you agree?

ggerla avatar Mar 06 '24 12:03 ggerla

You can accomplish this with an initContainer alone. initContainers run on every pod restart. You won't need a custom resource for that.
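
As a rough sketch of that approach, the init container from the earlier example could sync a whole bucket prefix instead of a single file, so that every nar published under it is pulled each time a pod starts. The bucket name and the `extensions/` prefix here are placeholders, the `extensions` volume is assumed to be the PVC that NiFi reads extensions from, and credentials for s3cmd are assumed to be provided separately:

```yaml
initContainers:
  - name: pull-extensions
    image: d3fk/s3cmd:stable
    imagePullPolicy: Always
    workingDir: /extensions
    command:
      - "/bin/sh"
      - "-c"
      - |
        # Hypothetical bucket and prefix. sync fetches every object under
        # the prefix and skips files that are already present and unchanged.
        s3cmd sync s3://my-bucket/extensions/ ./
    volumeMounts:
      # Assumed to be backed by a PVC so downloads survive pod restarts
      - name: extensions
        mountPath: /extensions
```

Under these assumptions, adding a nar would mean uploading it to the bucket and restarting the pods; the init container definition itself would not need to change.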

mh013370 avatar Mar 06 '24 12:03 mh013370

OK, understood. Thanks for your support.

ggerla avatar Mar 06 '24 12:03 ggerla