provider-gcp Re-using a GCP CloudSQLInstance

There are potentially 2 separate use cases here:

I would like to be able to bind to a SQL instance that already exists.
I would like to be able to use Crossplane to create a SQL instance, and then have it re-used by multiple claims (e.g. I deploy the same app twice).

The objective is just to save time waiting for the resource to be created. I don't want to wait 5 minutes every time I start an application, especially if I am giving a demo, or iterating fast. I am aware that the database might contain data from a previous deployment. Applications are used to dealing with that sort of thing - the database is persistent and has a long lifecycle, and the applications that consume the data are short-lived and liable to change more frequently.

What actually happens currently is

If I point a CloudSQLInstance to an existing resource, Crossplane happily creates a secret that contains a lot of useful information, but does not contain the database password. You can then create a claim for the instance (using MySQLInstance) and another secret gets created and the claim looks successful. So an application that uses the secret and tries to connect thinks it can do it, but fails when it actually needs to create a connection.
Then there is a (possibly separate) problem that the default reclaimPolicy causes a CloudSQLInstance to be marked as Released when a claim is deleted. The result is that you can subsequently claim the resource but the MySQLInstance gets stuck in a state where it is "Managed claim is waiting for managed resource to become bindable". Again, it looks successful, but never actually becomes usable by an application. In this case the secret is never created.

If it isn't possible to re-use instances, or is deemed bad practice (e.g. because the database might leak data into the next claim), maybe it would be better to make it obvious that it has failed? Refuse to accept the claim, for instance.

Feb 03 '20 14:02 dsyer

Another interesting use case - 2 apps want to connect to the same database. Currently they have to be in the same namespace in order to share a secret? Is that the only way that pattern will work?

Feb 03 '20 14:02 dsyer

If I point a CloudSQLInstance to an existing resource, Crossplane happily creates a secret that contains a lot of useful information, but does not contain the database password. You can then create a claim for the instance (using MySQLInstance) and another secret gets created and the claim looks successful. So an application that uses the secret and tries to connect thinks it can do it, but fails when it actually needs to create a connection.

I've tried to reproduce this but failed. I used the following YAML and it created the secret referred to in spec.writeConnectionSecretToRef, and that secret does contain the password. Could you try with that config?

apiVersion: database.gcp.crossplane.io/v1beta1
kind: CloudSQLInstance
metadata:
  name: mycloudsql
spec:
  forProvider:
    databaseVersion: MYSQL_5_6
    region: us-west2
    settings:
      dataDiskSizeGb: 10
      dataDiskType: PD_SSD
      ipConfiguration:
        ipv4Enabled: true
      tier: db-n1-standard-1
  providerRef:
    name: gcp-provider
  reclaimPolicy: Delete
  writeConnectionSecretToRef:
    name: conn-secret-name
    namespace: default

I would like to be able to use Crossplane to create a SQL instance, and then have it re-used by multiple claims (e.g. I deploy the same app twice).

I believe this is a valid concern. I've opened https://github.com/crossplaneio/crossplane/issues/1229 for a different thing, but it would solve your use case too. It'd be great if you could upvote or leave a comment on that issue for prioritization. Meanwhile, you can work around this. For example, you can create a MySQLInstance with a constant name in spec.writeConnectionSecretToRef.name in the namespace where you deploy your apps, so that the name of the secret is constant and you can use it in your app bundle. Then you can mount the same secret in different pods. The caveat is that you can't include the MySQLInstance resource in your app bundle, since that would create a new instance every time you deploy it. You can overcome this by having duplicate MySQLInstances in your app bundles and deploying them via kubectl apply -f, so that it patches the existing one rather than creating a new one.
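A minimal sketch of such a claim, assuming the MySQLInstance claim API of this Crossplane era. The apiVersion, the classSelector labels, the engineVersion value, and the names app-db and app-db-conn are assumptions for illustration and do not come from this thread; adjust them to the resource classes installed in your cluster.

```yaml
# Hypothetical claim with a fixed connection secret name.
apiVersion: database.crossplane.io/v1alpha1   # assumed claim API group/version
kind: MySQLInstance
metadata:
  name: app-db                # hypothetical claim name, kept constant across deploys
  namespace: default
spec:
  classSelector:
    matchLabels:
      example: "true"         # hypothetical label on a CloudSQLInstanceClass
  writeConnectionSecretToRef:
    name: app-db-conn         # constant secret name the app bundle can rely on
  engineVersion: "5.6"
```

Because the claim name and secret name never change, re-applying this manifest with kubectl apply -f should patch the existing claim instead of provisioning a new instance.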

Then there is a (possibly separate) problem that the default reclaimPolicy causes a CloudSQLInstance to be marked as Released when a claim is deleted. The result is that you can subsequently claim the resource but the MySQLInstance gets stuck in a state where it is "Managed claim is waiting for managed resource to become bindable".

I've opened https://github.com/crossplaneio/crossplane-runtime/issues/88 a while ago to address this use-case. It'd be great to have your input on that issue, too.

Feb 04 '20 10:02 muvaf

Another interesting use case - 2 apps want to connect to the same database. Currently they have to be in the same namespace in order to share a secret? Is that the only way that pattern will work?

@dsyer Currently, yes, that's the consumption model. The k8s secret has to live in the same namespace as the app. You should deploy the app either to the namespace where your claim lives or to the namespace you specify in the secret ref of the CloudSQLInstance resource. Both secrets hold the same content.

For the KubernetesCluster claim, we have the KubernetesTarget kind to propagate the kubeconfig secret to different namespaces, but we don't have that for databases yet.

Feb 04 '20 10:02 muvaf

Could you try with that config?

I believe I probably already did, up to a point. Your config has reclaimPolicy=Delete though which doesn’t make sense if it’s an existing resource, does it? Am I missing something, or did you actually try this with an existing instance called “mycloudsql”?

Feb 04 '20 10:02 dsyer

I simply created this resource in a fresh cluster and looked at the content of the secret with kubectl get secret conn-secret-name -n default -o jsonpath='{.data.password}' | base64 -d

Since I created this manually and don't need a claim, reclaimPolicy is not really important. All reclaimPolicy does is decide what happens to the CloudSQLInstance when you delete the MySQLInstance that it's bound to.

Feb 04 '20 10:02 muvaf

That’s not an “existing resource” then is it? It was created by crossplane. And IIRC it does matter what the reclaim policy is, because if it’s Delete then the instance is deleted when you delete the CloudSQLInstance.

Feb 04 '20 10:02 dsyer

@dsyer Just checked the codebase again and you're right. I thought we had changed how reclaimPolicy works, but it seems we just added new functionality. So, right now, reclaimPolicy: Delete causes two things:

Delete the actual resource in GCP when the CloudSQLInstance is deleted.
Delete the CloudSQLInstance when the MySQLInstance that it was bound to is deleted.

I remember discussing making the first bullet above default but it looks like we didn't.

However, do note that you cannot fetch the password if you import an existing CloudSQL resource in GCP into your cluster as CloudSQLInstance. GCP allows you to specify a password during creation only. So, if you try to import an existing resource, Crossplane cannot fetch the password from GCP. This could be the reason that you don't see the password because Crossplane didn't create it initially. I guess that also explains why you see the password when you create a claim, because it creates a new CloudSQLInstance with random suffix, so, the underlying resource is indeed created by Crossplane, hence has a password assigned by Crossplane.

What you could possibly do is inject the password string into the secret referred to in CloudSQLInstance.spec.writeConnectionSecretToRef. Then the claim that you explicitly bind to the CloudSQLInstance through MySQLInstance.spec.resourceRef will propagate your password too, along with the other keys.
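A rough sketch of the explicit binding, assuming the claim API of this Crossplane era. The claim's apiVersion, the claim name existing-db, and the secret name existing-db-conn are hypothetical; the resourceRef points at the CloudSQLInstance named mycloudsql from the earlier config.

```yaml
# Hypothetical claim bound explicitly to an existing managed resource.
apiVersion: database.crossplane.io/v1alpha1   # assumed claim API group/version
kind: MySQLInstance
metadata:
  name: existing-db           # hypothetical claim name
  namespace: default
spec:
  resourceRef:                # explicit binding instead of a resource class
    apiVersion: database.gcp.crossplane.io/v1beta1
    kind: CloudSQLInstance
    name: mycloudsql
  writeConnectionSecretToRef:
    name: existing-db-conn    # hypothetical; receives the propagated keys
```

The password itself would still have to be written by hand under the password key of the managed resource's secret (conn-secret-name in the earlier config) before the claim propagates it.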

Feb 04 '20 11:02 muvaf

What you could possibly do is to inject the password string

I guess I could. I would have to script it on the client side (using kubectl or similar) and it would be fragile and non-declarative. The whole point of this issue was to try and avoid doing that kind of thing. The basic idea that a database lives longer than an app has yet to be addressed in crossplane, I think.

Feb 04 '20 11:02 dsyer

FWIW Google Config Connector has a SqlUser CRD that you can use in this scenario to set the username and password in the SQL instance. I don't know if that's really necessary - it could just go in the MySQLInstance in crossplane.

Feb 05 '20 07:02 dsyer

I see that User is a standalone API object, meaning we will likely have a CRD for that type in Crossplane, too.

In the current phase, we're working towards enabling key scenarios and going deep on the quality of common pieces, so that Crossplane fits the high-level needs of users rather than being just an abstraction on top of the providers' HTTP APIs.

As we upgrade more resources to v1beta1, we'll continue expanding the resources that you can provision via Crossplane.

Feb 05 '20 12:02 muvaf

I don't think we should allow a managed resource like a CloudSQL instance to be reused. That said, there are some bugs pointed out in this issue that need addressing.

Here's some background on why our reclaim policies are how they are: https://github.com/crossplaneio/crossplane-runtime/pull/87#issuecomment-555747586. This is all modelled on persistent volumes, which I believe is an appropriate model for most cloud infrastructure. Under this model you are able to reuse infrastructure, but you do so at the claim level. This is different from reusing infrastructure at the managed resource level in two important ways:

You're logically choosing to reuse the claim that represents the billing database instance, not some anonymous SQL instance that happens to have been the database instance.
You're limited to reusing infrastructure within namespace boundaries, which frequently means team or environment boundaries.

I think there's (understandably) a lot of ambiguity around what a resource claim is. I see it as what it says on the tin; a claim by an entity or entities to use some infrastructure for a particular purpose or purposes. Claims don't have to represent the needs of a single application, and don't have to be tied to the lifecycle of a single application.

That all said, I would like for it to be possible but manual to reuse a managed resource. Persistent volume claims allow this; you set reclaimPolicy=Retain, clean up the database by hand if necessary, delete the managed resource (retaining the external resource), then create a new managed resource that imports the retained external resource. This makes it really hard to accidentally reuse infrastructure.
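As a sketch only, the last step (a new managed resource that imports the retained external resource) might look like this. It assumes that the crossplane.io/external-name annotation is how a managed resource is pointed at an existing external instance; the names billing-db, billing-db-retained, and billing-db-conn are hypothetical, and the forProvider values are copied from the config earlier in this thread.

```yaml
apiVersion: database.gcp.crossplane.io/v1beta1
kind: CloudSQLInstance
metadata:
  name: billing-db            # hypothetical managed resource name
  annotations:
    # Assumption: names the retained Cloud SQL instance to adopt.
    crossplane.io/external-name: billing-db-retained
spec:
  forProvider:
    databaseVersion: MYSQL_5_6
    region: us-west2
    settings:
      dataDiskSizeGb: 10
      tier: db-n1-standard-1
  providerRef:
    name: gcp-provider
  reclaimPolicy: Retain       # keep the external instance if this object is deleted
  writeConnectionSecretToRef:
    name: billing-db-conn     # hypothetical secret name
    namespace: default
```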

We have a few issues preventing this. Namely:

It's not intuitive to create a managed resource that corresponds to an existing external resource.
As pointed out by @dsyer, frequently we can't access a resource's credentials after create time.

If it isn't possible to re-use instances, or is deemed bad practice (e.g. because the database might leak data into the next claim), maybe it would be better to make it obvious that it has failed? Refuse to accept the claim, for instance.

Agreed - this is a bug. Our claim binding reconciler should clearly indicate that binding to a released managed resource is not allowed.

I would like to be able to use Crossplane to create a SQL instance, and then have it re-used by multiple claims (e.g. I deploy the same app twice). The objective is just to save time waiting for the resource to be created.

Could the apps share the database at the resource claim level instead?

Feb 12 '20 19:02 negz

Could the apps share the database at the resource claim level instead?

I'm kind of confused by this question. Probably I'm missing something, and haven't understood the abstraction. Can you provide an example? Can I bind the same database to 2 apps? Or delete an app and re-create it and bind to the same database? Or create a new version of an app and bind to the old database during a transition period? All of those seem like valid choices.

Feb 12 '20 21:02 dsyer

Sure. At the risk of over-communicating, there are a few concepts in play here.

Managed resources, for example a CloudSQLInstance. These may be manually created in advance by an infrastructure operator (i.e. SRE type) for an application operator (i.e. developer type) to later claim and bind. They may also be created on demand by leveraging a resource class that describes how they should be dynamically provisioned to satisfy newly authored resource claims.
Resource claims, for example a MySQLInstance, claim a managed resource for a particular purpose. They indicate that the managed resource they claim is now in use, and should not be allocated to other folks who need infrastructure.

I'm going to assume your applications are modelled as Kubernetes Deployments. So the application -> database (instance) consumption model here is:

The CloudSQLInstance managed resource stores its connection details in a Secret. We do this because the CloudSQLInstance could exist before a MySQLInstance comes along and claims it.
When a MySQLInstance binds to the CloudSQLInstance its Secret is replicated to the writeConnectionSecretToRef in the namespace of the MySQLInstance.
The application Deployment is configured such that its pods mount or otherwise expose the content of the MySQLInstance's connection Secret to its containers.
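A minimal sketch of the last step, assuming the claim's connection Secret is named app-db-conn and carries password and endpoint keys. The Deployment name, image, environment variable names, secret name, and the endpoint key are all hypothetical; only the password key is mentioned elsewhere in this thread.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-one               # hypothetical application
  namespace: default          # must match the namespace of the claim's Secret
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-one
  template:
    metadata:
      labels:
        app: app-one
    spec:
      containers:
        - name: app
          image: example.com/app-one:latest   # hypothetical image
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-db-conn           # the claim's connection Secret
                  key: password
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: app-db-conn
                  key: endpoint               # assumed key name
```

A second Deployment in the same namespace could reference the same Secret in the same way, which gives the many-to-one Deployment-to-MySQLInstance arrangement.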

So if the goal is to have multiple application Deployments use one CloudSQLInstance (either serially or in parallel) there are two options:

Opt for a one-to-one Deployment-to-MySQLInstance relationship, but allow a many-to-one MySQLInstance-to-CloudSQLInstance relationship.
Opt for a many-to-one Deployment-to-MySQLInstance relationship, and maintain the one-to-one MySQLInstance-to-CloudSQLInstance relationship we have today.

Can I bind the same database to 2 apps? Or delete an app and re-create it and bind to the same database? Or create a new version of an app and bind to the old database during a transition period? All of those seem like valid choices.

So in summary, the answer to all of these is "yes", as long as you think of the MySQLInstance as "the database", rather than the underlying CloudSQLInstance.

Does this help at all?

Feb 13 '20 01:02 negz
