Add 'package' resource

Open mortent opened this issue 2 years ago • 9 comments

Currently the porch API doesn't expose a package resource. Instead, we expose package revisions, and the grouping of revisions that belong to the same package is only available through the PackageName field on revisions.

We need to determine how the different operations on a package will translate into packagerevisions. For example, if a user mutates a package, presumably that would automatically create a new package revision. We also need to consider whether to add a packageresources resource that mirrors packagerevisionresources but always works on the latest revision.

mortent avatar May 22 '22 22:05 mortent

A clarification question: what exactly do we mean by a 'package' resource? I'm imagining a resource that contains references to all of the package revisions that belong to the package, i.e. something like:

apiVersion: porch.kpt.dev/v1alpha1
kind: Package
metadata:
  namespace: {namespace}
  name: {repository-hashsuffix}
spec:
  repository: {repository}
  packageName: {package name}
status:
  revisions: 
  - {list of PackageRevision objects that belong to this Package}
  latestRevision: {name of the latest packageRevision}

Is this aligned with what you had in mind? If we want porch to automatically group together package revisions that have the same package name, then perhaps that information belongs in the status field?

With that mental model, I'm trying to think more about what you mean here:

For example, if a user mutates a package, presumably that would automatically create a new package revision.

What kind of "mutations" are we anticipating that users will do on a Package resource? The only ones that come to mind for me are adding or removing PackageRevisions, or modifying package resources. It seems like most mutations done to the Package object itself should be handled by porch, and not by the user directly. Is there anything else to account for?

natasha41575 avatar Jul 29 '22 22:07 natasha41575

I'm hoping that we can define operations on the package resource that paper over the underlying package revisions. For example:

  • Creating a new package would mean creating the initial package revision.
  • Getting a package means getting the latest package revision.
  • Deleting a package means deleting the package and all package revisions.

I'm not yet sure if we can define sensible behavior for all the verbs. For example, does it make sense that getting a package returns the last package revision while deleting a package removes all package revisions? I think it seems ok, but I haven't thought enough about it.
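
The three verbs above can be sketched with a small in-memory model. This is only an illustration, not porch code: `Store`, its methods, and the integer `revision` field used to decide which revision is "latest" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PackageRevision:
    package_name: str
    revision: int  # assumption: revisions are ordered by an integer


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the revisions in one repository."""

    revisions: list[PackageRevision] = field(default_factory=list)

    def create_package(self, name: str) -> PackageRevision:
        # Creating a package creates its initial package revision.
        if any(r.package_name == name for r in self.revisions):
            raise ValueError(f"package {name!r} already exists")
        first = PackageRevision(name, 1)
        self.revisions.append(first)
        return first

    def get_package(self, name: str) -> PackageRevision:
        # Getting a package returns its latest package revision.
        matching = [r for r in self.revisions if r.package_name == name]
        if not matching:
            raise KeyError(name)
        return max(matching, key=lambda r: r.revision)

    def delete_package(self, name: str) -> int:
        # Deleting a package deletes every one of its revisions.
        before = len(self.revisions)
        self.revisions = [r for r in self.revisions if r.package_name != name]
        return before - len(self.revisions)
```

The asymmetry raised above is visible here: `get_package` touches a single revision, while `delete_package` removes all of them.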

I don't think we need to include the list of package revisions in the status. Users can get this information directly from the API by listing the package revisions.

mortent avatar Aug 01 '22 16:08 mortent

I am not sure this is the right approach. I can't think of other K8s resources offhand that have this sort of relationship. It makes sense to explore the use cases. But if what we are talking about is primarily bulk operations on top of revisions, appropriate labels may be sufficient (if we implement label selectors à la #3402).
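
As a rough sketch of the label-based alternative, a bulk operation could first select the revisions of one package with an equality-based selector. The label key `example.com/package` and the helper names are hypothetical; the only behavior borrowed from Kubernetes is that every key in the selector must match and that an empty selector matches everything.

```python
def matches(labels: dict[str, str], selector: dict[str, str]) -> bool:
    """Equality-based match: each selector key must be present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(revisions: list[dict], selector: dict[str, str]) -> list[dict]:
    # Return the revisions whose metadata.labels satisfy the selector.
    return [r for r in revisions if matches(r["metadata"].get("labels", {}), selector)]


# Illustrative data only; the names and the label key are made up.
revisions = [
    {"metadata": {"name": "repo-a1", "labels": {"example.com/package": "nginx"}}},
    {"metadata": {"name": "repo-a2", "labels": {"example.com/package": "nginx"}}},
    {"metadata": {"name": "repo-b1", "labels": {"example.com/package": "redis"}}},
]
nginx_revisions = select(revisions, {"example.com/package": "nginx"})
```

A bulk delete would then iterate over `nginx_revisions`, with no separate Package object involved.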

I suspect we should move to CRDs for everything except the actual ResourceList content.

johnbelamaric avatar Aug 01 '22 18:08 johnbelamaric

@johnbelamaric Deployment and ReplicaSet. Job and Pods.

bgrant0607 avatar Aug 03 '22 23:08 bgrant0607

Enumerating all the revisions in one resource is a scalability problem and violates Kubernetes API conventions. Anyone who wants to see the revisions can list the revisions separately.

bgrant0607 avatar Aug 03 '22 23:08 bgrant0607

@johnbelamaric Deployment and ReplicaSet. Job and Pods.

I don't really think of those as "containment" relationships, which is how I see this. Deployments create and manage ReplicaSets, and the attributes of the Deployment control how that is done. Similarly with Jobs and Pods.

I don't see a Package controller that is managing PackageRevision resources. I guess maybe that's a way to implement the tagging. What would a PackageSpec look like? Something "feels" off with this, but I can't put my finger on it without more thought. It feels like a sort of "pseudo" resource. Probably seeing the operations will help.

johnbelamaric avatar Aug 04 '22 00:08 johnbelamaric

Package would create and manage PackageRevisions.

How we would implement deletion is an interesting question. There are multiple options, including asynchronous GC.
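
One of those options, asynchronous GC, might look roughly like the sketch below, loosely modeled on how Kubernetes garbage collection removes dependents whose owner no longer exists. The `owner` field and both function names are assumptions for illustration, not an existing porch mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    name: str
    owner: str  # name of the owning Package (hypothetical field)


def delete_package(packages: set[str], name: str) -> None:
    # The delete call returns right away; revisions are left for the collector.
    packages.discard(name)


def collect_orphans(packages: set[str], revisions: list[Revision]) -> list[Revision]:
    # One GC pass: keep only revisions whose owning package still exists.
    return [r for r in revisions if r.owner in packages]
```

Between the delete call and the next collector pass the revisions are still listable, which is the trade-off compared with deleting everything synchronously.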

The whole reason for aggregated APIs in general was to allow functionality not present in the core K8s apiserver and/or to use different storage than the core etcd instance. The original use case was core metrics. The mechanism was also used to implement CRDs (https://github.com/kubernetes/apiextensions-apiserver). Here we're also using it for alternative storage, to provide some data-plane-like operations, and to implement some functionality synchronously.

Another example of providing a veneer or virtual view on top of other resources is: https://github.com/openshift/kube-projects

bgrant0607 avatar Aug 04 '22 00:08 bgrant0607

Based on other discussion, I think the following makes sense:

  1. Creating a new package will create an empty package with no package revisions or resources. Creating a package revision should automatically create a corresponding package object if one doesn't already exist.

Creating a new package would mean creating the initial package revision.

After discussion with @mortent and @justinsb, it seems like this might not be the best thing to allow as it would require the package to take parameters to tell it where the upstream package is, which could complicate the package resource.

  2. Getting a package means getting the most recent package revision.

  3. Deletion of a package means deleting all the package revisions (so all the branches and tags from the git repository), and also removing the package from the main branch.

IIUC, the package resource should look something like this:

apiVersion: porch.kpt.dev/v1alpha1
kind: Package
metadata:
  namespace: {namespace}
  name: {repository-hashsuffix}
spec:
  repository: {repository}
  packageName: {package name}
status:
  latestRevision: {name of the latest packageRevision}

and, as Morten described above, we should have a packageResources that mirrors packageRevisionResources.
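
To make the creation semantics from point 1 concrete, here is a minimal in-memory sketch. `Registry` and its methods are hypothetical, packages are keyed by (repository, packageName) rather than by the `{repository-hashsuffix}` name, and treating the most recently created revision as `latestRevision` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Package:
    repository: str                        # spec.repository
    package_name: str                      # spec.packageName
    latest_revision: Optional[str] = None  # status.latestRevision


class Registry:
    """Hypothetical in-memory model; keys are (repository, packageName)."""

    def __init__(self) -> None:
        self.packages: dict[tuple[str, str], Package] = {}

    def create_package(self, repository: str, package_name: str) -> Package:
        # A new package starts empty: no revisions, no resources.
        return self.packages.setdefault(
            (repository, package_name), Package(repository, package_name)
        )

    def create_revision(
        self, repository: str, package_name: str, revision_name: str
    ) -> Package:
        # Creating a revision creates the package if it does not exist yet.
        package = self.create_package(repository, package_name)
        package.latest_revision = revision_name  # assumption: newest is latest
        return package
```

Because the status only carries `latest_revision`, the full set of revisions still has to be obtained by listing package revisions separately.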

I'm planning to take an initial stab at implementing this, but welcome more discussion and feedback about what the verbs should mean and what the resource should look like.

Edited to add:

For having deletion go through an approval process, we can do the same thing that we currently do with PackageRevisions, using subresources.

natasha41575 avatar Aug 04 '22 18:08 natasha41575

On creating a new package:

It's a good point that we have at least two ways of creating new packages currently, init and get:

https://github.com/GoogleContainerTools/kpt/blob/main/docs/design-docs/07-package-orchestration.md#package-authoring

bgrant0607 avatar Aug 04 '22 23:08 bgrant0607