Richard Liu

Results: 11 comments by Richard Liu

I would like to see a more detailed proposal for the migration plan. Specifically: * How do we avoid having two divergent versions of the MpiJob? * Assuming that the...

Thanks for the reply. For the case with 100 workers, suppose that different users created two such clusters in the same Kubernetes cluster. Neither of them has sufficient workers,...

@xyhuang @swiftdiaries Let's try to have this for 0.5. A few things to consider: 1. How should we automate this? I think it makes sense to create a periodic Prow...

Let's split up the work. I can take care of item 1 (set up project, cluster, and Prow workflow).

/cc @johnugeorge /cc @terrytangyuan /cc @jian-he

That is the plan. We can add e2e tests with the TestJob as well.

As a reference, these are the graduation criteria for TFJob 1.0: https://github.com/kubeflow/tf-operator/issues/1076

Can we move the test framework code (test_runner, etc.) into kubeflow/testing? That way we don't need to replicate the code in every repository.