Ansh Radhakrishnan

Results: 14 comments by Ansh Radhakrishnan

@Rocamonde did you happen to get around to doing any flakiness benchmarking? I'm going to try and address some of the flaky tests over the next few days, just wanted...

```
Launching training on 8 GPUs.
---------------------------------------------------------------------------
ProcessRaisedException                    Traceback (most recent call last)
[/tmp/ipykernel_1090735/2038238995.py](https://localhost:8080/#) in
      1 args = ("bf16", 42, 64)
----> 2 notebook_launcher(training_loop, args, num_processes=8)

2 frames
[/opt/conda/lib/python3.7/site-packages/accelerate/launchers.py](https://localhost:8080/#) in...
```

Nope, still breaks, strangely enough (same error and stack trace).

It's actually connected to a local runtime which consists of 8 A100s.

Hmm it still seems to be failing on the minimal example - I get the following stack trace:

```
ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most...
```

Nope, same stack trace for both of those cases.