
Add more k8s labels to instances managed by container-jfr-operator

Open jiekang opened this issue 5 years ago • 9 comments

For example, the RH Jaeger operator has the following labels on its Jaeger instances:

app=jaeger
app.kubernetes.io/component=all-in-one
app.kubernetes.io/instance=jaeger-all-in-one-inmemory
app.kubernetes.io/managed-by=jaeger-operator
app.kubernetes.io/name=jaeger-all-in-one-inmemory
app.kubernetes.io/part-of=jaeger
pod-template-hash=856b547bf

I think the app.kubernetes.io labels should be added to the container-jfr instances.
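For illustration, a minimal sketch (in Go) of what that recommended label set could look like for container-jfr. The helper name and the label values below are assumptions for discussion, not taken from the operator's code:

```go
// Hypothetical helper: the app.kubernetes.io labels the operator could
// stamp onto every resource it creates for a given ContainerJFR instance.
// The names and values here are illustrative assumptions, not the
// operator's actual conventions.
package labels

func recommendedLabels(instanceName string) map[string]string {
	return map[string]string{
		"app":                          "containerjfr",
		"app.kubernetes.io/name":       "container-jfr",
		"app.kubernetes.io/instance":   instanceName,
		"app.kubernetes.io/component":  "container-jfr",
		"app.kubernetes.io/managed-by": "container-jfr-operator",
		"app.kubernetes.io/part-of":    "container-jfr",
	}
}
```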

jiekang avatar Jan 28 '20 15:01 jiekang

We do have the app.kubernetes.io/name label, which is applied to the ContainerJFR Deployment. Are these labels also on the Jaeger Deployment? Are there any other resources or components associated with the Jaeger operator?

andrewazores avatar Jan 31 '20 21:01 andrewazores

Yes, these labels are applied to the Jaeger Deployment, which is created from an instance of ResourceKind Jaeger. Hopefully I understood the question.

Regarding other resources/components, I'll get back to you in a day on that.

jiekang avatar Jan 31 '20 22:01 jiekang

The Jaeger ResourceKind instance has resources:

Deployment, ReplicaSet, Pod, Service (4), ConfigMap (2), Secret

The above, excluding the Secret, have the labels I listed above with the correct fields.

jiekang avatar Feb 06 '20 14:02 jiekang

In comparison, the ContainerJFR ResourceKind instance has these resources: Pod, Service (4)

We will need to make sure it is connected to all the resources.

jiekang avatar Feb 06 '20 14:02 jiekang

Where is this "ResourceKind instance" from? I'm not sure how I can tell which things are linked together in the way you're talking about.

andrewazores avatar Feb 06 '20 21:02 andrewazores

@andrewazores Hmm, I think my terminology is incorrect or maybe outdated, sorry. These are maybe called instances of the CRDs? E.g. a YAML with Kind: ContainerJFR or Kind: Jaeger.

In the console if you view an installed operator, there should be a tab for "All Instances" and one for each kind of instance.

jiekang avatar Feb 06 '20 21:02 jiekang

Ah, okay I see what you mean. I'm not sure how the console is populating that list that you're seeing. It might be coming from the OperatorHub ClusterServiceVersion, but this is what we currently have in the list of the operator's owned resources:

  customresourcedefinitions:
    owned:
      - name: containerjfrs.rhjmc.redhat.com
        displayName: ContainerJFR
        kind: ContainerJFR
        version: v1alpha1
        description: Container JFR
        resources:
          - version: v1
            kind: Deployment
          - version: v1
            kind: Service
          - version: v1
            kind: ReplicaSet
          - version: v1
            kind: Pod
          - version: v1
            kind: Secret
          - version: v1
            kind: ConfigMap
        specDescriptors: []
        statusDescriptors: []

So the "make sure it is connected to all resources" is a piece that I don't really know how to address, since it looks like it should already be set up for what you're looking for, but I don't know how the UI is determining it.

It may simply be that our CSV is claiming that we own those resources, but we don't actually use any of them in practice. The ContainerJFR CRD represents the Operator's view of what a container-jfr looks like, which basically just means "one container-jfr container and its exporter and command-channel services, a persistent volume, and optionally grafana/jfr-datasource containers and associated services". All of those containers get wrapped up into a Pod, which is managed by a Deployment/ReplicaSet and which all of the services point back to, but the Deployment/ReplicaSet belong to the Operator - the ContainerJFR lives inside of those resources, rather than owning them.
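To make the labels part concrete, here is a rough sketch of where a common label set would have to be set so that the Deployment, the ReplicaSet/Pods it creates, and related resources all carry it. The function name and selector choice are assumptions for illustration, not the operator's current code; Services and ConfigMaps would get the same labels on their own metadata.

```go
// Sketch only: applying one label set to the Deployment metadata and to the
// pod template, so the Deployment, its ReplicaSet, and its Pods all carry
// the same app.kubernetes.io labels. Names here are illustrative.
package resources

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func newDeploymentFor(name, namespace string, labels map[string]string) *appsv1.Deployment {
	// Keep the selector minimal (it is immutable after creation); the full
	// label set goes on the object metadata and the pod template.
	selector := map[string]string{
		"app.kubernetes.io/instance": labels["app.kubernetes.io/instance"],
	}
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      name,
			Namespace: namespace,
			Labels:    labels, // labels on the Deployment itself
		},
		Spec: appsv1.DeploymentSpec{
			Selector: &metav1.LabelSelector{MatchLabels: selector},
			Template: corev1.PodTemplateSpec{
				// Labels here propagate to the ReplicaSet and Pods.
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec:       corev1.PodSpec{ /* containers elided */ },
			},
		},
	}
}
```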

andrewazores avatar Feb 06 '20 21:02 andrewazores

Hmm okay. I guess it might depend precisely on how you describe the structure where the Deployment and ReplicaSet belong to the operator, not the ContainerJFR instance. I wonder if it makes sense to revisit this and have the ContainerJFR own the Deployment and ReplicaSet. How does this interact with the cleanup of the operator, or the operator's resources? E.g. when deleting a ContainerJFR instance, or "uninstalling" the operator.

In any case, this is a separate issue from the k8s labels described here. I'll try to investigate further and open another issue.

jiekang avatar Feb 07 '20 18:02 jiekang

When you delete the ContainerJFR resource instance, the Operator is what sees this and cleans up the associated Deployment/ReplicaSet. If the ContainerJFR itself owned these resources then it would... somehow need to see its own deletion and clean itself up, I think. When you uninstall the operator, since that's what owns the Deployment and ReplicaSet, those are also recursively cleaned up. Interestingly, I think this still leaves a dangling ContainerJFR resource instance around, but without any associated resources (no pod, no services). In practice this just means that the next time you (re)install the operator, you'll get back a ContainerJFR deployment configured the same way as the last one you had, albeit missing any of the archived recordings or other state.

If the ContainerJFR owned its own Deployment/ReplicaSet/Services/Routes then you might uninstall the operator and still have all of those pieces left over and running, I think? It's something we can tinker with (grep for SetControllerReference in the sources), but as you say, it's a separate concern from the labels on those resources.
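For reference, the ownership wiring being described is roughly the usual controller-runtime pattern sketched below; the function and variable names are assumptions, not the operator's code. Which object gets passed as the owner is what determines the cascade behavior discussed above.

```go
// Sketch of the standard controller-runtime ownership pattern: making a CR
// the controller-owner of a Deployment, so deleting the CR lets Kubernetes
// garbage-collect the Deployment (and its ReplicaSet/Pods) automatically.
// The owner passed in must also be a runtime.Object registered in the scheme.
package reconcile

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

func setOwner(cr metav1.Object, deployment *appsv1.Deployment, scheme *runtime.Scheme) error {
	// Adds an ownerReference (controller=true) on the Deployment pointing
	// back at the CR instance.
	return controllerutil.SetControllerReference(cr, deployment, scheme)
}
```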

andrewazores avatar Feb 07 '20 21:02 andrewazores