Tobias Ringwald

24 comments by Tobias Ringwald

> We are preserving MAX_NUM_DEVICES in fbcode so I don't expect it will break stuff internally. Let's merge as is!

Great! Thanks for checking.

This is actually a problem even with dark mode addons, as most browsers disable them on local files for security reasons. I have flashbanged myself more than once when building...

Thank you for your bug report. Can you provide a minimal reproducible example that exhibits the problem?

I ran the linked repro on a 128-core/8-GPU machine with `--gpu_num=8` and `--batch_size=16`. At the peak, I observed roughly 400 Python processes, but given 8 GPU processes and...

Unfortunately, I have already terminated that instance. It's kinda expensive to run a machine like that ;)

> @tringwald why did you tag this as "triage review"?

I wasn't sure if this might also need a high prio label. It seems to be a 9x slowdown, according...

Thank you for your detailed bug report. I can reproduce the output tensor on Linux with the given weights. The results are identical for both CPU and CUDA on Linux,...

@r-barnes Thanks for reviewing, I added some type annotations and changed the C++ parameters to `const`.

Thanks @eqy. Those tests in `test_transformers.py` use `torch._fill_mem_eff_dropout_mask_`, which in turn calls a custom CUDA kernel to populate the dropout mask with uniform values before thresholding. I'm not sure why...