Tracking: PostgreSQL support in KFP
The request for PostgreSQL support has become the most upvoted issue on the KFP repo: https://github.com/kubeflow/pipelines/issues/7512. This issue tracks the work on that integration.
- [ ] KFP Backend integration
- [x] Define the DB config for PostgreSQL DB info
- [x] Basic connection config
- [ ] Secure connection config: passfile, hostaddr, ssl mode, certificate, etc.
- [x] Flag to switch between the PostgreSQL driver and the MySQL driver
- [x] ~~#9859~~
- [ ] Syntax support of PostgreSQL DB in backend
- [ ] Adopt a different syntax during initialization. (schema difference)
- [ ] Adopt a different syntax during execution. (Modification of data)
- [ ] Explore the dialect difference and develop a control mechanism to enable easy testing on the storage layer.
- [ ] Testing
- [ ] Unit testing for PostgreSQL behavior.
- [ ] Functional testing for PostgreSQL behavior.
- [ ] E2E testing for PostgreSQL behavior.
- [ ] Cache server integration
- [ ] KFP API server integration
- [ ] Manifest support for PostgreSQL
- [x] #9860
- [x] #9861
- [ ] Marketplace KFP (managed CloudSQL instance)
- [ ] MLMD integration
- [ ] Develop PostgreSQL integration on MLMD
- [ ] Investigate https://github.com/google/ml-metadata/issues/194#issuecomment-1975207465
- [ ] Release new version of MLMD
- [x] #9848
- [ ] Manifest support for PostgreSQL in MLMD
- [ ] Standalone KFP
- [ ] Full Kubeflow
- [ ] Marketplace KFP (managed CloudSQL instance)
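To illustrate the kind of dialect difference the "Syntax support" items refer to, here is a minimal Go sketch, not taken from the KFP codebase. It covers two well-known differences: MySQL drivers use `?` placeholders and backtick-quoted identifiers, while PostgreSQL drivers expect `$1, $2, ...` placeholders and double-quoted identifiers (PostgreSQL folds unquoted identifiers to lowercase). The `Dialect` type, its methods, and the table and column names are hypothetical, and the rebind logic is naive: it does not skip `?` characters inside string literals.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Dialect is a hypothetical switch between the two SQL flavors.
type Dialect int

const (
	MySQL Dialect = iota
	PostgreSQL
)

// QuoteIdent quotes an identifier in the dialect's own style:
// backticks for MySQL, double quotes for PostgreSQL.
func (d Dialect) QuoteIdent(name string) string {
	if d == PostgreSQL {
		return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
	}
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

// Rebind rewrites `?` placeholders to `$1, $2, ...` for PostgreSQL and
// leaves the query untouched for MySQL. Naive: it also rewrites any `?`
// that appears inside a string literal.
func (d Dialect) Rebind(query string) string {
	if d != PostgreSQL {
		return query
	}
	var b strings.Builder
	n := 0
	for _, r := range query {
		if r == '?' {
			n++
			b.WriteByte('$')
			b.WriteString(strconv.Itoa(n))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	// Table and column names are illustrative only.
	for _, d := range []Dialect{MySQL, PostgreSQL} {
		q := fmt.Sprintf("SELECT %s FROM %s WHERE %s = ? AND %s = ?",
			d.QuoteIdent("Name"), d.QuoteIdent("pipelines"),
			d.QuoteIdent("UUID"), d.QuoteIdent("Namespace"))
		fmt.Println(d.Rebind(q))
	}
}
```

Keeping such differences behind one small type is one possible way to make the storage layer testable against both databases; an ORM such as GORM handles much of this already, so a helper like this would mainly matter for hand-written SQL.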
cc @chensun
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi! I am Eshaan Aggarwal, an avid Open Source enthusiast from India. I am a web developer proficient in GoLang and PostgreSQL and have recently started learning about Kubernetes. I would love to contribute to this issue and hopefully join Kubeflow as a GSoC '24 mentee. Are there any pre-tests or other beginner-friendly contributions I can make to get acquainted with this project and as a proof of skill?
Hello, I'm Udit. Proficient in Python, Golang, and PostgreSQL, I've recently completed a comprehensive course on Kubernetes. The skills acquired perfectly align with the requirements of this project, making it an ideal platform for me to apply and further enhance my knowledge. I'm enthusiastic about contributing to this GSoC project, especially on the specified issue. Eagerly anticipating the chance to contribute to its development!
Hello @EshaanAgg and @UditNayak, thank you for your interest. I am assuming @rimolive will be your mentor.
As a start, I would recommend learning:
- PostgreSQL syntax
- GORM, a database abstraction (ORM) library for Go: https://gorm.io/index.html
- Setting up your own Kubernetes environment for development.
Then, I would advise approaching the development in the following order:
- Make sure you can bring up KFP in the Kubernetes environment.
- Make sure you can bring up a PostgreSQL instance in the Kubernetes environment and access it manually.
- Make changes in the KFP API server so that it can read the PostgreSQL connection config from a parameter or environment variable. The KFP API server should then use that config to establish a connection with the PostgreSQL instance in the same cluster.
- Make the corresponding GORM changes so that KFP's CRUD (create/read/update/delete) operations execute correctly on PostgreSQL. (This is a good time to write some unit tests or E2E tests.)
- Repeat the steps above for the cache server.
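The connection-config step above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how the KFP API server actually does it: the `DBConfig` type, the `DB_*` environment variable names, and the default values are all hypothetical. It builds a keyword/value connection string for PostgreSQL (quoting values the way libpq-style strings expect) or a `user:password@tcp(host:port)/dbname` string for the MySQL driver, depending on a driver flag.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// DBConfig is a hypothetical holder for connection settings.
type DBConfig struct {
	Driver   string // "mysql" or "postgres"
	Host     string
	Port     string
	User     string
	Password string
	DBName   string
	SSLMode  string // PostgreSQL only, e.g. "disable", "require", "verify-full"
}

// getenv returns the environment variable's value, or fallback if unset or empty.
func getenv(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// LoadDBConfig reads settings from hypothetical DB_* environment variables.
func LoadDBConfig() DBConfig {
	return DBConfig{
		Driver:   getenv("DB_DRIVER", "mysql"),
		Host:     getenv("DB_HOST", "localhost"),
		Port:     getenv("DB_PORT", "5432"),
		User:     getenv("DB_USER", "user"),
		Password: os.Getenv("DB_PASSWORD"),
		DBName:   getenv("DB_NAME", "mlpipeline"),
		SSLMode:  getenv("DB_SSLMODE", "disable"),
	}
}

// quoteValue single-quotes a keyword/value setting, escaping backslashes
// and single quotes with a backslash.
func quoteValue(v string) string {
	return "'" + strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(v) + "'"
}

// DSN builds the connection string for the selected driver.
func (c DBConfig) DSN() (string, error) {
	switch c.Driver {
	case "postgres":
		return fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=%s",
			quoteValue(c.Host), quoteValue(c.Port), quoteValue(c.User),
			quoteValue(c.Password), quoteValue(c.DBName), quoteValue(c.SSLMode)), nil
	case "mysql":
		return fmt.Sprintf("%s:%s@tcp(%s:%s)/%s",
			c.User, c.Password, c.Host, c.Port, c.DBName), nil
	default:
		return "", fmt.Errorf("unsupported driver %q", c.Driver)
	}
}

func main() {
	cfg := LoadDBConfig()
	if _, err := cfg.DSN(); err != nil {
		fmt.Println("config error:", err)
		return
	}
	// Avoid printing the DSN itself, since it contains the password.
	fmt.Println("connection string built for driver:", cfg.Driver)
}
```

The resulting string would then be handed to whichever GORM dialector or `database/sql` driver the backend uses; the secure-connection settings mentioned in the checklist (certificates, passfile, and so on) would add further fields in the same way.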
I believe @rimolive can facilitate more once you dive deep into the project. But feel free to take any task you want to work on and ask questions along the way. Have fun!
In addition to what @zijianjoy said, please join us on our Slack. We have the #gsoc-participants channel to welcome everyone interested in the GSoC projects.
@zijianjoy I am a web developer from India, and I would like to work on this issue. I know Python, Golang, and a bit of Kubernetes. How do I get started on this?
@zijianjoy I'm currently doing ops in Kubeflow and love the idea of switching the database. I will be working on these:
- Make sure you can bring up KFP in the Kubernetes environment.
- Make sure you can bring up a PostgreSQL instance in the Kubernetes environment and access it manually.
- Make changes in the KFP API server so that it can read the PostgreSQL connection config from a parameter or environment variable. The KFP API server should then use that config to establish a connection with the PostgreSQL instance in the same cluster.
- Make the corresponding GORM changes so that KFP's CRUD (create/read/update/delete) operations execute correctly on PostgreSQL. (This is a good time to write some unit tests or E2E tests.)
- Repeat the steps above for the cache server.
@zijianjoy I am a backend developer from India and am comfortable with Python, Golang, Java, Kubernetes, and SQL databases. I would like to contribute to this issue. What are the steps?
Hello @zijianjoy, I'm a junior student from China majoring in computer science and technology. I'm very interested in open source projects and want to contribute to this project while improving my skills. I have joined a lab at my school that works with databases and the cloud, so I frequently work with PostgreSQL and Kubernetes. I'm following the steps you referred to above, but I can't join the #gsoc-participants channel. Will this affect my participation?
Hey guys,
Has everyone set up a local Kubeflow environment?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
/lifecycle frozen
@zijianjoy Is anyone working on this issue?
@sagnik3788 This is part of the Google Summer of Code. You can find details in https://www.kubeflow.org/events/gsoc-2024/#project-9-postgresql-integration-in-kubeflow-pipelines.