Jeremy Lewi

Results: 242 comments by Jeremy Lewi

In TF one of the most common patterns for distributed training is:

- all workers read the data (e.g. each worker processes a different subset)
- one worker writes the model/checkpoints...
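The pattern above can be sketched in plain Python without TensorFlow. This is only an illustration under stated assumptions: the helper names (`shard_for_worker`, `is_chief`, `run_worker`), the round-robin split, and the choice of worker 0 as the writer are hypothetical and are not taken from the original comment or from any TF API.

```python
from typing import List, Sequence


def shard_for_worker(examples: Sequence[str], worker_index: int, num_workers: int) -> List[str]:
    """Return the subset of examples this worker reads (round-robin split)."""
    if not 0 <= worker_index < num_workers:
        raise ValueError("worker_index must be in [0, num_workers)")
    return list(examples[worker_index::num_workers])


def is_chief(worker_index: int) -> bool:
    """Assumption for this sketch: worker 0 is the single writer."""
    return worker_index == 0


def run_worker(examples: Sequence[str], worker_index: int, num_workers: int,
               saved_checkpoints: List[str]) -> List[str]:
    """Every worker reads its own shard; only the chief records a checkpoint."""
    shard = shard_for_worker(examples, worker_index, num_workers)
    # Training on `shard` would happen here.
    if is_chief(worker_index):
        saved_checkpoints.append(f"checkpoint-from-worker-{worker_index}")
    return shard


# Simulate three workers sharing six input files.
examples = [f"file-{i}" for i in range(6)]
checkpoints: List[str] = []
shards = [run_worker(examples, i, 3, checkpoints) for i in range(3)]
```

Here every worker gets a disjoint slice of the inputs, while `checkpoints` ends up with a single entry because only the designated worker writes.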

Do we need a corresponding issue for a sink?

Did you try adding the user directly by creating appropriate RBAC and ISTIORBAC resources?

Can we have a single copy of the code in kubeflow/examples? We can still have a codelab hosted at https://codelabs.developers.google.com that points to a fixed commit in kubeflow/examples. We need...

Here are some additional thoughts on how to reconcile the code:

* I think we should use https://github.com/kubeflow/examples/blob/master/mnist/model.py and not https://github.com/googlecodelabs/kubeflow-introduction/blob/master/tensorflow-model/MNIST.py
* The former is using tf.estimator
* The latter is...

I think if you want to change the model to use Keras that's fine. Swapping out the model once we have tests should be fine because the tests will ensure...

Serving is now updated in kubeflow/examples/mnist. I'm working on moving over the web-ui and then I think the copy in kubeflow/examples will be up to date and contain everything we...

In 0.3 the TFJob prototypes are now in the examples package: https://github.com/kubeflow/kubeflow/tree/v0.3-branch/kubeflow/examples/prototypes. If you use kfctl to install Kubeflow and create your ksonnet application, the correct packages should be installed.

I was able to work around this by doing the following

1. Create a symbolic link
   ```
   ln -s /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so /usr/lib/libnvinfer_plugin_tensorrt_llm.so.9
   ```
1. Set the LD_LIBRARY_PATH as follows ``` export...

> If a Docker image does not exist, runme will build it

Does this imply i) Docker is running locally? ii) images are stored locally? What if I build my...