Jonathan Shen
TensorFlow has some profiling guides: https://www.tensorflow.org/guide/profiler and https://www.tensorflow.org/guide/gpu_performance_analysis. One important thing to check is whether the training is disk I/O bound. If that turns out to be the case you may...
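One rough way to run the disk I/O check mentioned above, before reaching for the full profiler, is to time the input pipeline on its own and compare that against the time of a full training step. The sketch below is plain Python and makes no TensorFlow calls. The helper names `seconds_per_batch` and `input_bound_fraction` are hypothetical, and `batches` stands in for whatever iterable yields your training batches. If the pipeline prefetches in the background, the ratio is only an upper-bound hint, not a measurement.

```python
import time
from typing import Iterable


def seconds_per_batch(batches: Iterable, num_batches: int = 50) -> float:
    """Average wall-clock seconds to fetch one batch, with no model work."""
    it = iter(batches)
    fetched = 0
    start = time.perf_counter()
    for _ in range(num_batches):
        try:
            next(it)
        except StopIteration:
            break  # the iterable ran out before num_batches were fetched
        fetched += 1
    elapsed = time.perf_counter() - start
    return elapsed / max(fetched, 1)


def input_bound_fraction(fetch_seconds: float, step_seconds: float) -> float:
    """Rough share of one training step spent waiting on input, capped at 1.0."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return min(fetch_seconds / step_seconds, 1.0)
```

A fraction close to 1.0 suggests the step time is dominated by reading data, which is the case where the profiler guides above are worth following up on the input side.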
Yes, that is correct: worker_replicas = number of machines in the cluster, worker_gpus = number of GPUs per machine. Actually, reading #1, I think worker_replicas should be set...
It looks like the problem is that the .so was compiled for Ubuntu but is being loaded on OS X. Did you start the Colab kernel from inside Docker?
You don't have to use Docker, but then it is up to you to get the environment configured correctly. If you're using a Mac and not using Docker we have...
I think so -- there's currently no Windows or OS X pip package. Nobody on the team knows how to build them...
To be quite honest, I don't think we will be adding one in the near future :( The scripts for getting the datasets are at https://github.com/tensorflow/lingvo/tree/master/lingvo/tasks/asr/tools
Hi, even with #7117 (using 30.1.4-edge) I'm seeing the same issue in the injected linkerd-proxy container: ``` Message: time="2022-06-14T23:28:47Z" level=info msg="Found pre-existing key: /var/run/linkerd/identity/end-entity/key.p8" time="2022-06-14T23:28:47Z" level=info msg="Found pre-existing CSR: /var/run/linkerd/identity/end-entity/csr.der"...
So I think the problem above is some incompatibility between int8 and modules_to_save. Using float16 instead of int8 is fine. But actually, it seems that after https://github.com/huggingface/peft/commit/c21afbe868734c0af8bd4577c4c7acdf366b96d1 setting modules_to_save for...
Hmm I'm still seeing this on main (commit 79fd06a72c9d5c6c98a1864b43e417335c580f59)
I seem to be getting some errors, eg. ``` [tool.uv] allow-insecure-host = [ "amazonaws.com", ] ``` ``` warning: Failed to parse `pyproject.toml` during settings discovery: TOML parse error at line...