Bernard Han
Hi there, this is similar to https://github.com/mlcommons/storage/issues/22. I followed all the instructions [here](https://github.com/mlcommons/storage#u-net3d) but still received an error about the directory structure. The full error message with the command is...
We have a CPU-only cluster, n2-standard-32-1024, that has 1024 n2-standard-32 nodes. In principle we should have roughly 1024 * 32 CPU resources, but I'm seeing 1024 nominated...
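For reference, the expected total can be sketched as below. This is only back-of-the-envelope arithmetic, assuming every n2-standard-32 node exposes all 32 of its vCPUs as schedulable; in practice a scheduler may reserve some capacity per node. The function name `expected_vcpus` is hypothetical and used only for illustration.

```python
def expected_vcpus(node_count: int, vcpus_per_node: int) -> int:
    """Total vCPUs a homogeneous cluster would expose if nothing is reserved."""
    return node_count * vcpus_per_node


NODE_COUNT = 1024      # nodes in the cluster, per the description above
VCPUS_PER_NODE = 32    # an n2-standard-32 machine has 32 vCPUs

total = expected_vcpus(NODE_COUNT, VCPUS_PER_NODE)
print(f"Expected schedulable vCPUs (upper bound): {total}")
```

Comparing this upper bound with what the scheduler reports makes it easier to tell whether resources are being counted per node or per CPU.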
[DON'T MERGE] GCS Distributed Training Benchmark Infra + File-parallelism + Range-read Parquet files
# This is created as a draft PR for GCS internal members to comment. This will not be merged to main.

## File-parallelism + Range-read Parquet files

This PR supports...
# This is created as a draft PR for GCS internal members to comment. This will not be merged to main.

## Checkpointing a 64B model through MaxText

- Read...
Lab Note: https://colab.corp.google.com/drive/19ckkIejDDZIolyvn_41RRx7DDt6RebL-?resourcekey=0-im9ySMvZ-zZyqA8c8gP0eg#scrollTo=9zqkB5a0RpD5