katib: Update LLM HP tuning guide to clarify tunable fields and fix resource section

Open SanthoshToorpu opened this issue 8 months ago • 9 comments

…es config, remove unsupported params

Checklist:

  • [x] You have signed off your commits
  • [x] Ensure you follow best practices from our guide. Contributing.
  • [x] You have included screenshots when changing the website style or adding a new page.

Description of your changes: As per Andrey's recommendation, I made the changes shown in the attached screenshots. This is a minimal change. (Screenshots: image (2), image (1), image.)

Issue

Closes: #2522

Labels

/area katib

/area website


SanthoshToorpu avatar Mar 29 '25 18:03 SanthoshToorpu

Hi @SanthoshToorpu. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow[bot] avatar Mar 29 '25 18:03 google-oss-prow[bot]

@helenxie-bit please look into the changes

SanthoshToorpu avatar Mar 29 '25 18:03 SanthoshToorpu

@andreyvelich As per @helenxie-bit's instructions, and after reading the thread, I fixed 2 of the 3 issues. The one still open is removing the irrelevant params, as you asked.

image

I just downloaded the Docker image, but can you be more specific about what should be removed and what should stay?

SanthoshToorpu avatar Mar 30 '25 08:03 SanthoshToorpu

I think we need to remove the unused parameters in this part: https://www.kubeflow.org/docs/components/katib/user-guides/llm-hp-optimization/#key-parameters-for-llm-hyperparameter-tuning. Can you remove the parameters objective, base_image, and parameters? They are not used when optimizing hyperparameters for LLMs.

helenxie-bit avatar Mar 31 '25 03:03 helenxie-bit

Ref issue: https://github.com/kubeflow/katib/issues/2522

helenxie-bit avatar Mar 31 '25 03:03 helenxie-bit

Okay, so only those three params?

SanthoshToorpu avatar Mar 31 '25 03:03 SanthoshToorpu

Yes, all other parameters may be used when optimizing hyperparameters for LLMs.

helenxie-bit avatar Mar 31 '25 04:03 helenxie-bit

@helenxie-bit: Great! Maybe here it's better to use a different title, since we include S3DatasetParams in this part too:

How about

Dataset and Model Parameter Classes

in legacy trainer docs

SanthoshToorpu avatar Mar 31 '25 05:03 SanthoshToorpu

That sounds good to me.

helenxie-bit avatar Mar 31 '25 17:03 helenxie-bit

Please review

SanthoshToorpu avatar Mar 31 '25 17:03 SanthoshToorpu

Thanks for the contribution! LGTM! Please take a look when you have time, @andreyvelich @mahdikhashan.

helenxie-bit avatar Mar 31 '25 19:03 helenxie-bit

/assign

mahdikhashan avatar Mar 31 '25 20:03 mahdikhashan

@andreyvelich @Electronic-Waste is this ready to be merged?

juliusvonkohout avatar Jun 17 '25 14:06 juliusvonkohout

I think it was pending my review. I had a quick look, and it seems good to me to be merged. As I remember, it is a link to the old doc and a further improvement.

mahdikhashan avatar Jun 17 '25 15:06 mahdikhashan

/lgtm

mahdikhashan avatar Jun 17 '25 15:06 mahdikhashan

/approve then

juliusvonkohout avatar Jun 17 '25 15:06 juliusvonkohout

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: juliusvonkohout

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

google-oss-prow[bot] avatar Jun 17 '25 15:06 google-oss-prow[bot]