pipelines icon indicating copy to clipboard operation
pipelines copied to clipboard

feat(components): add labels to gcp ml engine train/batch_predict jobs. Fixes #5038

Open dkajtoch opened this issue 4 years ago • 15 comments

Description of your changes: Add labels to organize jobs on ml engine

Checklist:

dkajtoch avatar Mar 05 '21 10:03 dkajtoch

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dkajtoch To complete the pull request process, please assign ark-kun after the PR has been reviewed. You can assign the PR to them by writing /assign @ark-kun in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

google-oss-robot avatar Mar 05 '21 10:03 google-oss-robot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Mar 05 '21 10:03 google-cla[bot]

Hi @dkajtoch. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-robot avatar Mar 05 '21 10:03 google-oss-robot

/assign @ark-kun

dkajtoch avatar Mar 05 '21 10:03 dkajtoch

@googlebot I signed it!

dkajtoch avatar Mar 05 '21 11:03 dkajtoch

How did you verify that the updated component code works?

Ark-kun avatar Mar 09 '21 00:03 Ark-kun

/ok-to-test

Ark-kun avatar Mar 09 '21 00:03 Ark-kun

@Ark-kun I changed return type to JsonObject and updated ml engine train test

I build docker image locally and run some poc pipeline with single task to test if labels appear and everything works (I have access to google cloud).

import kfp
from kfp import dsl

task_factory = kfp.components.load_component_from_file("component.yaml")


@dsl.pipeline(
    name='ml-train',
    description='ML train pipeline.'
)
def poc_pipeline():
    task = task_factory(
        project_id='project-id',
        region='europe-west4',
        args=[
            '--preprocess',
            '--model_type', 'classification',
            '--batch_size', '250',
            '--learning_rate', '0.1',
            '--max_steps', '1',
            '--training_data_path',
            'gs://cloud-samples-data/ai-platform/census/algorithms/data/train.csv'
        ],
        job_dir='gs://my-test-bucket',
        job_id_prefix='job-',
        job_id='job-1',
        labels={
            'env': 'dev',
            'user': 'test',
        },
        master_image_uri='gcr.io/cloud-ml-algos/linear_learner_cpu:latest',
    )

dkajtoch avatar Mar 09 '21 14:03 dkajtoch

Interesting. I do not fully understand why this code works. The command-line receives the labels as a JSON-serialized string. The train function has an untyped parameter labels which is treated as a dictionary. Where is the JSON stricg converted to dictionary?

Ark-kun avatar Mar 17 '21 08:03 Ark-kun

@Ark-kun I need to check it once again since I am passing Dict in tests and in pipeline so maybe there is no conversion. BTW in ml engine train yaml we have List as args so do not know why json.dumps is ok here https://github.com/kubeflow/pipelines/blob/master/samples/core/ai_platform/ai_platform.ipynb (I pass list directly and it works)

dkajtoch avatar Mar 17 '21 16:03 dkajtoch

BTW in ml engine train yaml we have List as args so do not know why json.dumps is ok here

When you pass a string, the SDK assumes that you've serialized the argument yourself. Otherwise it will try to serialize the argument based on the input type. This only works for a small number of types: int, float, bool, list (==JsonArray) and dict (==JsonObject).

In the end the command line is always an array of strings when being executed.

I'd expect that the train function receives the labels as a JSON string. But then it's strange that train backend does not complain about the string instead of map.

Ark-kun avatar Mar 19 '21 05:03 Ark-kun

@Ark-kun sorry for such a delay. I did some tests with labels and it seems to work both with Dict and Json string. I added logger to _train.py in order to see what type I have in train(for labels) function. I pass labels to poc_pipeline in two forms:

labels = {"name": "a", "env": "dev"}  # dict
labels = json.dumps({"name": "a", "env": "dev"})  # string

In both cases, type is Dict.

I am running kubeflow pipelines via gcp ai platform version 1.0.4

dkajtoch avatar Mar 30 '21 11:03 dkajtoch

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 28 '21 12:06 stale[bot]

Thank you for verifying.

I did some investigation and it looks like the Fire library is doing this. It tries to parse every argument by adding missing quotes then parsing it as python code. https://github.com/google/python-fire/blob/c1266d0dbb2114514fcf8be62044344b5a51c733/fire/parser.py#L78 Since most valid JSON is also valid python (not always), this usually works.

I do not really like this - we're basically parsing JSON as if it was Python.

Could you please add explicit parsing to train and batch_predict?

import json

...

@decorators.SetParseFns(python_version=str, runtime_version=str, labels=json.loads)
def train(project_id,

import json
from fire import decorators
...
@decorators.SetParseFns(labels=json.loads)
def batch_predict

Ark-kun avatar Jul 11 '21 01:07 Ark-kun

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 02 '22 12:03 stale[bot]