pipelines
[feature] Store Pipeline IR in database, not object storage
Feature Area
What feature would you like to see?
Currently the Object Store in KFP is largely used for artifacts, except for one outlier, which is the Pipeline IR.
I agree with the inline comments that this should be stored in the DB just like everything else that's not an artifact.
What is the use case or pain point?
Moving this into the DB removes the API server's dependency on the object store, and will make it easier to support different artifact store implementations in the future without having to worry about the API server.
Is there a workaround currently?
No
Anything else?
There's also archive logging, but this seems to be delegated to the backend engine (currently Argo, but soon Tekton as well). I'm not sure what to do about this one.
Love this idea? Give it a 👍.
Related: https://github.com/kubeflow/pipelines/issues/10510
follow up from Feb 02, 2024 call
@chensun suggests we might actually be storing pipeline ir in both db and object storage
It is not clear whether the object store is still being used for the Pipeline IR. We should confirm whether it is; if it is not, we should remove this from the apiserver and rely on the DB alone.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
bumping to unstale.
I've looked into this. At a decent glance, it does appear that the Pipeline IR stored in Object Storage goes unused*, and I believe we can remove that copy of the definition, since it creates duplicate sources of truth, and just rely on the definition stored in the DB.
A couple other findings:
- I did find one area of code that checks ObjStore for a PipelineVersion if it can't find it in the DB. Since it's a failsafe we can likely leave it, at least temporarily, even though data wouldn't be placed in those 'backup' destinations.
- It does appear that PipelineURI (which points to the pipeline definition location in the object store) needs to remain as it appears to be leveraged for the upload-from-web mechanism.
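
To make the first finding concrete, here is a minimal sketch of a DB-first lookup with the object store kept only as a failsafe. It is written in Python for brevity (the API server itself is Go) and is not the actual apiserver code: `get_pipeline_spec`, `PipelineSpecNotFound`, the dict-backed stores, and the `pipelines/<id>` key layout are all hypothetical names chosen for illustration.

```python
from typing import Dict, Optional


class PipelineSpecNotFound(Exception):
    """Raised when neither store holds the requested pipeline version."""


def get_pipeline_spec(
    version_id: str,
    db: Dict[str, str],
    object_store: Optional[Dict[str, str]] = None,
) -> str:
    """Return the pipeline spec for version_id, preferring the DB.

    Both stores are plain dicts standing in for the real backends.
    """
    # The DB is treated as the single source of truth.
    spec = db.get(version_id)
    if spec:
        return spec

    # Failsafe only: consult the object store if one is configured.
    # The "pipelines/<id>" key layout is an assumption for this sketch.
    if object_store is not None:
        legacy = object_store.get(f"pipelines/{version_id}")
        if legacy:
            return legacy

    raise PipelineSpecNotFound(version_id)
```

With the object store argument optional, the failsafe can be dropped later by simply no longer passing it, without touching the DB path.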
/assign @gmfrasca
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
/remove-lifecycle stale