website icon indicating copy to clipboard operation
website copied to clipboard

Improve the KFP / User Guides / Core Functions docs

Open thesuperzapper opened this issue 1 year ago • 5 comments

This PR makes significant formatting updates and rewording to the Kubeflow Pipelines / User Guides / Core Functions, and Migrate to Kubeflow Pipelines v2 guides.

For the most part this is a formatting update, but I draw reviwers attention to a content change I made on the Connect the SDK to the API page. I have updated the Python script given as an example for connecting the KFP SDK with Dex. I have used the same reference example as in the deployKF docs. (Note, it will work for all other distributions, including manifests, as long as they use Dex).

The most important changes in this PR are:

  • Re-naming the Core Functions pages to have shorter titles to be more clear
  • Re-ordering the Core Functions pages to better tell a story and be readable from top to bottom
  • Significantly improving the Migrate to Kubeflow Pipelines v2 page which was very hard to read before.

The easiest way to review this PR will be to look at the Netlify preview and review the following pages:

After we merge this, we will need to do the same thing for the "Create Components" and "Data Handling" sections.

thesuperzapper avatar Jul 07 '24 01:07 thesuperzapper

@hbelmiro I would appreciate your review of this.

I believe it significantly improves the readability and usefulness of the "Core Functions" and "Migrate to Kubeflow Pipelines v2" guides.

/assign @hbelmiro

thesuperzapper avatar Jul 07 '24 01:07 thesuperzapper

It would be nice if @rimolive and @diegolovison could also take a look.

hbelmiro avatar Jul 08 '24 17:07 hbelmiro

Hi @thesuperzapper

I left some comments. It will be good if you agree with my suggestion do it in a new commit. Easy for review :) We can squash and merge later.

Tks

diegolovison avatar Jul 08 '24 18:07 diegolovison

I also added some comments but overall lgtm.

rimolive avatar Jul 08 '24 18:07 rimolive

A few people raised concerns about keeping the content which talks about Vertex AI (on the migration guide). Let's discuss this in a follow up issue, the existing page already mentions Vertex AI, and I don't feel comfortable removing it without a larger community discussion that should not block this PR.

Personally, I think it's important to clarify that there are some parts of the Kubeflow Pipelines SDK are only for Vertex AI (and not used by the open source Kubeflow Pipelines). If we as a community disagree with the presence of this code, it should be removed from the SDK, not silently dropped from the documentation.

thesuperzapper avatar Aug 19 '24 19:08 thesuperzapper

I have also fixed an issue which is being reported since the release of 1.9.0 in the Kubeflow Pipelines "Connect the SDK to the API - Outside Cluster" guide:

  • https://github.com/kubeflow/manifests/issues/2844

The issue is simply that Kubeflow 1.9.0 introduced a OIDC "approve grant" screen (which the old script was getting stuck on), see the preview site for the updated code:

  • https://deploy-preview-3795--competent-brattain-de2d6d.netlify.app/docs/components/pipelines/user-guides/core-functions/connect-api/#kubeflow-platform---outside-the-cluster

thesuperzapper avatar Aug 19 '24 19:08 thesuperzapper

@james-jwu or @chensun or @terrytangyuan can you please approve this important update to the KFP docs.

Among other things, it fixes two critical issues impacting users:

  1. Fixes the "connect SDK outside cluster" script, which is no longer working on Kubeflow 1.9.0:
    • https://deploy-preview-3795--competent-brattain-de2d6d.netlify.app/docs/components/pipelines/user-guides/core-functions/connect-api/#kubeflow-platform---outside-the-cluster
  2. Improves the KFP V2 migration guide, as users are struggling to follow it right now:
    • https://deploy-preview-3795--competent-brattain-de2d6d.netlify.app/docs/components/pipelines/user-guides/migration/

thesuperzapper avatar Aug 19 '24 20:08 thesuperzapper

I think the DEX part in comnect-api.md should link to the Kubeflow Platform page in general https://www.kubeflow.org/docs/started/installing-kubeflow/#kubeflow-platform instead of directly to the distributions. Furthermore we should say that below DEX there is oauth2-proxy that supports machine to machine authentication with Kubernetes tokens. For machine accounts you should rely on service accounts and not usernames+passwords. Oauth2-proxy should have a more stable interface and be more platform/distribution agnostic in the future. But for some experimentation Dex might be enough for new users.

CC also @kimwnasptd for platform and @andreyvelich @terrytangyuan from the KSC since it is a large change.

juliusvonkohout avatar Aug 20 '24 05:08 juliusvonkohout

The question is when and how do we want to address the Dex/oauth2-proxy changes in https://github.com/kubeflow/manifests/issues/2844#issuecomment-2298009812

@kimwnasptd @thesuperzapper @kromanow94 I am fine with presenting both in the documentation and adjusting it slightly as proposed in the comment.

juliusvonkohout avatar Aug 20 '24 05:08 juliusvonkohout

@thesuperzapper maybe it makes sense to link to the recently added https://github.com/kubeflow/manifests/blob/master/tests%2Fgh-actions%2Ftest_dex_login.py and adjust it if needed. Because this way we always have it up to date and it is tested regularly. Code on the website can soon deprecate or get out of synchronization.

juliusvonkohout avatar Aug 20 '24 05:08 juliusvonkohout

@juliusvonkohout we need to focus on getting this merged so the Dex script on the website actually works with Kubeflow 1.9.0. All previous users will be using the dex approach.

Can you confirm that the new script under "Kubeflow Platform - Outside the Cluster" works on your test 1.9.0 clusters?


In terms of future updates like including the m2m stuff on this page, let's do that in a follow-up PR, so we don't block the fix to the dex script.

PS: I think it's more user friendly to include all the information directly on the website, so we can add explanations and comments, and not break if you update your test script.

PSS: once we merge this PR, please update your test login script to the new one from this PR, because its currently disabled on your repo (hence why the script on the website is broken).

thesuperzapper avatar Aug 20 '24 16:08 thesuperzapper

@james-jwu or @chensun can you please approve this critical update to the KFP docs?

It fixes two critical issues impacting users:

  1. Fixes the "connect SDK outside cluster" script, which is no longer working on Kubeflow 1.9.0:
    • https://deploy-preview-3795--competent-brattain-de2d6d.netlify.app/docs/components/pipelines/user-guides/core-functions/connect-api/#kubeflow-platform---outside-the-cluster
  2. Improves the KFP V2 migration guide, as users are struggling to follow it right now:
    • https://deploy-preview-3795--competent-brattain-de2d6d.netlify.app/docs/components/pipelines/user-guides/migration/

thesuperzapper avatar Aug 20 '24 22:08 thesuperzapper

/lgtm

rimolive avatar Aug 21 '24 13:08 rimolive

@juliusvonkohout we need to focus on getting this merged so the Dex script on the website actually works with Kubeflow 1.9.0. All previous users will be using the dex approach.

/lgtm with the linking to Kubeflow platform

Can you confirm that the new script under "Kubeflow Platform - Outside the Cluster" works on your test 1.9.0 clusters?

there is positive user feedback in https://github.com/kubeflow/manifests/issues/2844#issuecomment-2299741037 but i did not have the time to test it myself yet

In terms of future updates like including the m2m stuff on this page, let's do that in a follow-up PR, so we don't block the fix to the dex script.

Yes that is fine.

PS: I think it's more user friendly to include all the information directly on the website, so we can add explanations and comments, and not break if you update your test script.

PSS: once we merge this PR, please update your test login script to the new one from this PR, because its currently disabled on your repo (hence why the script on the website is broken).

Please comment in the follow up https://github.com/kubeflow/manifests/pull/2854. Note this does not yet contain your changes here.

juliusvonkohout avatar Aug 21 '24 21:08 juliusvonkohout

Can @andreyvelich or @terrytangyuan please approve?

We need a root approver for this because we are also changing content from the Katib section (to fix broken links).

We have 2 LGTMs already, and given the very important changes (including fixes the KFP "out of cluster" script for 1.9.0), I think we should merge ASAP.

thesuperzapper avatar Aug 22 '24 15:08 thesuperzapper

/cc @connor-mccarthy

andreyvelich avatar Aug 22 '24 15:08 andreyvelich

/lgtm /approve

james-jwu avatar Aug 27 '24 15:08 james-jwu

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: james-jwu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow[bot] avatar Aug 27 '24 15:08 google-oss-prow[bot]