feat(cnpg-i): support remote plugins

Open leonardoce opened this issue 1 year ago • 1 comments

This patch allows the operator to work with remote CNPG-i plugins, which are expected to be deployed in a separate Pod in the same namespace as the operator.

Plugin discovery is based on a set of annotations and labels that must be added to each Service exposing a plugin.

The plugin developer or distributor is expected to provide a set of credentials that will be used by the operator to authenticate to the plugin via mTLS, and vice-versa.
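For readers unfamiliar with mTLS, here is a minimal sketch of what the client side of such a connection could look like using only the Go standard library. The function name `newPluginTLSConfig` and its parameters are assumptions made for this example; how the operator actually loads and stores the credentials is not described here.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newPluginTLSConfig is a hypothetical helper: it builds a client-side TLS
// configuration that verifies the plugin against serverCAPEM and presents
// the given client certificate, so both peers authenticate each other.
func newPluginTLSConfig(serverCAPEM, clientCertPEM, clientKeyPEM []byte, serverName string) (*tls.Config, error) {
	// Trust only the CA that signed the plugin's server certificate.
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(serverCAPEM) {
		return nil, errors.New("no valid CA certificate found in server CA bundle")
	}

	// The client key pair is what the plugin uses to authenticate the operator.
	clientCert, err := tls.X509KeyPair(clientCertPEM, clientKeyPEM)
	if err != nil {
		return nil, fmt.Errorf("loading client key pair: %w", err)
	}

	return &tls.Config{
		RootCAs:      roots,
		Certificates: []tls.Certificate{clientCert},
		ServerName:   serverName,
		MinVersion:   tls.VersionTLS13,
	}, nil
}

func main() {
	// Placeholder input: real PEM data would come from the credentials
	// supplied by the plugin developer or distributor.
	_, err := newPluginTLSConfig([]byte("not a certificate"), nil, nil, "my-plugin")
	fmt.Println("error:", err)
}
```

The "vice-versa" half would live on the plugin side, where a server could set `ClientAuth: tls.RequireAndVerifyClientCert` and `ClientCAs` in its own `tls.Config` to reject callers that do not present a trusted client certificate.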

leonardoce avatar Jun 27 '24 14:06 leonardoce

:exclamation: By default, the pull request is configured to backport to all release branches.

  • To stop backporting this PR, remove the label: backport-requested :arrow_backward: or add the label 'do not backport'
  • To stop backporting this pr to a certain release branch, remove the specific branch label: release-x.y

github-actions[bot] avatar Jun 27 '24 14:06 github-actions[bot]

Looks related to https://github.com/cloudnative-pg/cnpg-i-hello-world/pull/8

ringerc avatar Jul 02 '24 03:07 ringerc

This looks really handy, and would help solve some deployment and lifecycle issues with plugins.

A few questions:

  • Does the plugin have to be in the same namespace as CNPG, or can CNPG be configured with namespaces it'll do plugin service discovery in?
  • How will CNPG-I know when the plugin backed by the Service has (re)started in order to call the identity service again? Will the plugin need to initiate an outbound registration attempt to CNPG-I when it starts too? Or maintain a streaming gRPC connection for a plugin keepalive channel, so it notices when the connection breaks and knows to re-connect?
  • What is the error handling behaviour expected to be for plugins running as Pods with TLS connection? Are there retries on calls to the plugin? If the plugin is "down" what does the operator do?
  • What's a plugin repository?

ringerc avatar Jul 02 '24 03:07 ringerc

  • Does the plugin have to be in the same namespace as CNPG, or can CNPG be configured with namespaces it'll do plugin service discovery in?

For now they need to be in the same namespace where the operator is deployed. You can even have different versions of the operator installed at the same time (with the same CRD) in different namespaces with different plugins.

  • How will CNPG-I know when the plugin backed by the Service has (re)started in order to call the identity service again? Will the plugin need to initiate an outbound registration attempt to CNPG-I when it starts too? Or maintain a streaming gRPC connection for a plugin keepalive channel, so it notices when the connection breaks and knows to re-connect?

We maintain a connection pool in the so-called "plugin repository" (bad naming, I know). Every time the operator needs a connection to a plugin, it gets one from the pool. The pool tests the connection before returning it.

  • What is the error handling behaviour expected to be for plugins running as Pods with TLS connection? Are there retries on calls to the plugin? If the plugin is "down" what does the operator do?

It will reschedule a reconciliation loop once a connection error is detected. This is done with an exponential backoff by the controller-runtime library.

  • What's a plugin repository?

A bad name for a set of connection pools.

leonardoce avatar Jul 02 '24 07:07 leonardoce

/test tl=4 d=main

mnencia avatar Jul 10 '24 14:07 mnencia

@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/9875654645

github-actions[bot] avatar Jul 10 '24 14:07 github-actions[bot]