flytekit feat: Update files with respect to common ReplicaSpec refactor

Tracking issue

Resolves: flyteorg/flyte#4408

Why are the changes needed?

https://github.com/flyteorg/flyte/pull/5355 changes protobuf files, so we need to update the corresponding files in flytekit.

What changes were proposed in this pull request?

Update files with respect to common ReplicaSpec refactor.
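
As a rough illustration of what a common ReplicaSpec refactor means for client code, the sketch below factors the fields shared by every replica role into one type that role-specific specs embed. The class and field names used here (`CommonReplicaSpec`, `TensorflowReplicaSpec`, `MpiReplicaSpec`, `replicas`, `image`, `command`) are hypothetical stand-ins chosen for the example, not the flyteidl definitions; flyteorg/flyte#5355 is the authoritative source for the real messages.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CommonReplicaSpec:
    """Hypothetical shared settings that every replica role needs."""

    replicas: int = 1
    image: Optional[str] = None


@dataclass
class TensorflowReplicaSpec:
    """Hypothetical TensorFlow role spec: only the shared settings."""

    common: CommonReplicaSpec = field(default_factory=CommonReplicaSpec)


@dataclass
class MpiReplicaSpec:
    """Hypothetical MPI role spec: shared settings plus a role-specific field."""

    common: CommonReplicaSpec = field(default_factory=CommonReplicaSpec)
    command: List[str] = field(default_factory=list)


# Both plugins now describe replica count and image the same way.
worker = TensorflowReplicaSpec(common=CommonReplicaSpec(replicas=2))
launcher = MpiReplicaSpec(common=CommonReplicaSpec(replicas=1), command=["mpirun"])
print(worker.common.replicas, launcher.common.replicas)
```

The point of the design is that code handling replica counts or images can be written once against the shared type instead of once per plugin.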

How was this patch tested?

Setup process

In `flyte` repo

Check out https://github.com/flyteorg/flyte/pull/5355
make compile
flytectl demo start --dev
kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone?ref=v1.7.0"
POD_NAMESPACE=flyte ./flyte start --config kubeflow.yaml

where `kubeflow.yaml` is:

# This is a sample configuration file for running single-binary Flyte locally against
# a sandbox.
admin:
  # This endpoint is used by flytepropeller to talk to admin
  # and artifacts to talk to admin,
  # and _also_, admin to talk to artifacts
  endpoint: localhost:30080
  insecure: true

catalog-cache:
  endpoint: localhost:8081
  insecure: true
  type: datacatalog

cluster_resources:
  standaloneDeployment: false
  templatePath: $HOME/.flyte/sandbox/cluster-resource-templates

logger:
  show-source: true
  level: 5

propeller:
  create-flyteworkflow-crd: true
  kube-config: $HOME/.flyte/sandbox/kubeconfig
  rawoutput-prefix: s3://my-s3-bucket/data

server:
  kube-config: $HOME/.flyte/sandbox/kubeconfig

webhook:
  certDir: $HOME/.flyte/webhook-certs
  localCert: true
  secretName: flyte-sandbox-webhook-secret
  serviceName: flyte-sandbox-local
  servicePort: 9443

tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - K8S-ARRAY
      #- pytorch
      - tensorflow
      #- mpi
    default-for-task-types:
      - container: container
      - container_array: K8S-ARRAY
      - sidecar: sidecar
      #- pytorch: pytorch
      - tensorflow: tensorflow
      #- mpi: mpi
    fallback-to-container-handler: false

plugins:
  logs:
    kubernetes-enabled: true
    kubernetes-template-uri: http://localhost:30080/kubernetes-dashboard/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
    cloudwatch-enabled: false
    stackdriver-enabled: false
  k8s:
    image-pull-policy: Always
    default-env-vars:
      - FLYTE_AWS_ENDPOINT: http://flyte-sandbox-minio.flyte:9000
      - FLYTE_AWS_ACCESS_KEY_ID: minio
      - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
  k8s-array:
    logs:
      config:
        kubernetes-enabled: true
        kubernetes-template-uri: http://localhost:30080/kubernetes-dashboard/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
        cloudwatch-enabled: false
        stackdriver-enabled: false

database:
  postgres:
    username: postgres
    password: postgres
    host: 127.0.0.1
    port: 30001
    dbname: flyte
    options: "sslmode=disable"
storage:
  type: stow
  stow:
    kind: s3
    config:
      region: us-east-1
      disable_ssl: true
      v2_signing: true
      endpoint: http://localhost:30002
      auth_type: accesskey
      access_key_id: minio
      secret_key: miniostorage
  container: my-s3-bucket

task_resources:
  defaults:
    cpu: 2
    memory: 1Gi
  limits:
    cpu: 4
    memory: 4Gi
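
A quick sanity check on the `tasks.task-plugins` section can be sketched in Python. It assumes, as a working rule rather than documented Flyte behavior, that every plugin named under `default-for-task-types` should also appear under `enabled-plugins`. The values are copied by hand from the config above (the commented-out `pytorch` and `mpi` entries are left out), and `missing_plugins` is a hypothetical helper written for this example.

```python
# Values transcribed from the tasks.task-plugins section of kubeflow.yaml.
enabled_plugins = ["container", "sidecar", "K8S-ARRAY", "tensorflow"]
default_for_task_types = {
    "container": "container",
    "container_array": "K8S-ARRAY",
    "sidecar": "sidecar",
    "tensorflow": "tensorflow",
}


def missing_plugins(enabled, defaults):
    """Return plugins used as a default handler but not enabled (hypothetical check)."""
    enabled_set = set(enabled)
    return sorted(p for p in set(defaults.values()) if p not in enabled_set)


print(missing_plugins(enabled_plugins, default_for_task_types))  # []
```

Re-enabling `pytorch` or `mpi` for a different test would mean uncommenting the entry in both lists, which is the kind of mismatch this check would flag.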

In the parent folder of the `flyte` and `flytekit` repos

Create Dockerfile

FROM python:3.11-slim-bookworm AS builder

WORKDIR /root
ENV PYTHONPATH=/root

# Install build dependencies
RUN apt-get update \
    && apt-get install build-essential git wget -y \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Copy necessary directories
COPY flyte /flyte
COPY flytekit /flytekit

# Install Python packages (Order is important!)
RUN pip install --no-cache-dir /flytekit/plugins/flytekit-kf-tensorflow \
    && pip install --no-cache-dir /flytekit \
    && pip install --no-cache-dir /flyte/flyteidl

Run `docker buildx build -t localhost:30000/flytekit:dev --file Dockerfile --push .`

In an arbitrary folder

Create kubeflow_tf_evaluator.py

from flytekitplugins.kftensorflow import PS, Chief, Evaluator, TfJob, Worker

from flytekit import Resources, task

task_config = TfJob(
    worker=Worker(replicas=2),
    chief=Chief(replicas=1),
    ps=PS(replicas=1),
    evaluator=Evaluator(replicas=1),
)


@task(
    task_config=task_config,
    requests=Resources(cpu="1"),
)
def my_tensorflow_task(x: int, y: str) -> str:
    return f"{x=}, {y=}"

Run `pyflyte run --remote --image localhost:30000/flytekit:dev kubeflow_tf_evaluator.py my_tensorflow_task --x 100 --y acc`

Test backward compatibility

Check out the master branch in the flytekit repo.
From the parent folder of the flyte and flytekit folders, rebuild the Docker image with `docker buildx build -t localhost:30000/flytekit:dev --file Dockerfile --push .`
Run `pyflyte run --remote --image localhost:30000/flytekit:dev kubeflow_tf_evaluator.py my_tensorflow_task --x 100 --y acc`
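
The steps above run an older flytekit against the updated backend. One way such compatibility can be achieved, assuming the backend accepts both a legacy top-level replica count and the new shared block, is a simple fallback when reading the spec. This is a minimal sketch of that idea only; `CommonSpec`, `ReplicaSpec`, and `effective_replicas` are hypothetical names, and the source does not confirm that the backend is implemented this way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommonSpec:
    """Hypothetical shared block written by clients built after the refactor."""

    replicas: int


@dataclass
class ReplicaSpec:
    """Hypothetical message carrying both the legacy field and the new block."""

    replicas: int = 0  # legacy top-level field, the only one older clients set
    common: Optional[CommonSpec] = None


def effective_replicas(spec: ReplicaSpec) -> int:
    """Prefer the shared block when present, otherwise fall back to the legacy field."""
    if spec.common is not None:
        return spec.common.replicas
    return spec.replicas


old_client = ReplicaSpec(replicas=2)  # shape an older client would send
new_client = ReplicaSpec(common=CommonSpec(replicas=2))
print(effective_replicas(old_client), effective_replicas(new_client))
```

Under this assumption both clients resolve to the same replica count, which is the outcome the backward compatibility run is meant to confirm.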

Screenshots

Note that the worker replica count is 2.

Check all the applicable boxes

[x] I updated the documentation accordingly.
[x] All new and existing tests passed.
[x] All commits are signed-off.

Related PRs

https://github.com/flyteorg/flyte/pull/5355

Docs link

May 16 '24 15:05 MortalHappiness

Here is a PR you can use to test the evaluator: https://github.com/flyteorg/flytekit/pull/1870

May 17 '24 04:05 Future-Outlier

Codecov Report

All modified and coverable lines are covered by tests.

Project coverage is 58.31%. Comparing base (69445ff) to head (cd6232a). Report is 4 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #2424       +/-   ##
===========================================
- Coverage   79.24%   58.31%   -20.94%     
===========================================
  Files         196      250       +54     
  Lines       19785    22092     +2307     
  Branches     4008     4006        -2     
===========================================
- Hits        15678    12882     -2796     
- Misses       3407     8700     +5293     
+ Partials      700      510      -190
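
The totals in the diff can be cross-checked with a few lines of Python. The sketch assumes that the line count is hits plus misses plus partials and that the coverage percentage is hits divided by lines; this is an assumption about how the report is derived, not something the report states. The `coverage` helper is hypothetical.

```python
def coverage(hits: int, misses: int, partials: int):
    """Return (total lines, coverage percent), assuming lines = hits + misses + partials."""
    lines = hits + misses + partials
    return lines, round(100 * hits / lines, 2)


base = coverage(hits=15678, misses=3407, partials=700)  # master column
head = coverage(hits=12882, misses=8700, partials=510)  # #2424 column
print(base)  # (19785, 79.24)
print(head)  # (22092, 58.31)
```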


May 26 '24 15:05 codecov[bot]

@MortalHappiness, can you merge master to get rid of the dbt failures?

To fix the CI failures for kf-mpi and kf-tensorflow, you'll need to add a line similar to this to https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-kf-pytorch/dev-requirements.in and create the corresponding dev-requirements.in in https://github.com/flyteorg/flytekit/tree/master/plugins/flytekit-kf-tensorflow.

Jun 12 '24 01:06 eapolinario
