Oren
Love WebRPC!
- Support GCP Cloud Storage buckets
:sparkles: Added Eval For Sam Altman Degree (he has an honorary degree from the University of Waterloo)
# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...
### Description The declarative-style example in the Ray Collective docs doesn't work ### Suggestion Update the docs from the removed `declare_collective_group` to the `create_collective_group` example of proper usage https://github.com/ray-project/ray/blob/7f1bacc7dc9caf6d0ec042e39499bbf1d9a7d065/python/ray/util/collective/examples/nccl_allreduce_example_declare_collective_group.py#L7-L33 Error Trace: ``` Traceback (most...
### Description The existing CuPy collectives should work out of the box with JAX arrays, for example through a wrapper around CuPy like in Alpa ### Use case GPU to GPU collective ops...
torch==2.0.1 results in an error on H100s because it is not built for the SM90 architecture. Updated deps to support H100s.
I was getting 104% MFU on an H100, then I realized that the MFU calculation might have been based on the A100's 312 TFLOPS; the H100 is 989 TFLOPS at bfloat16. NVIDIA claims 1,979 TFLOPS...
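A rough sketch of the arithmetic behind that observation, assuming MFU is simply achieved throughput divided by the peak throughput of the device. The helper names (`mfu`, `rescale_mfu`) are hypothetical and not taken from any training codebase; the peak figures are the two quoted above.

```python
# Peak bfloat16 throughput in TFLOPS, using the two figures quoted above.
A100_PEAK_TFLOPS = 312.0
H100_PEAK_TFLOPS = 989.0


def mfu(achieved_tflops: float, peak_tflops: float) -> float:
    """Model FLOPs utilization: achieved throughput over hardware peak."""
    return achieved_tflops / peak_tflops


def rescale_mfu(reported_mfu: float, assumed_peak: float, actual_peak: float) -> float:
    """Convert an MFU computed against the wrong peak to the right one.

    Hypothetical helper: recovers the achieved TFLOPS from the reported
    ratio, then divides by the peak of the device actually in use.
    """
    achieved_tflops = reported_mfu * assumed_peak
    return mfu(achieved_tflops, actual_peak)


# A reported 104% MFU that was computed against the A100 peak,
# re-expressed against the H100 peak.
corrected = rescale_mfu(1.04, A100_PEAK_TFLOPS, H100_PEAK_TFLOPS)
print(f"{corrected:.1%}")
```

Under these assumptions the 104% reading corresponds to roughly a third of the H100 peak, which is why a value above 100% points at the wrong denominator rather than at the hardware.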
### 🐛 Describe the bug When enabling `kineto__tensor_core_insts` or `dram__bytes_read.sum`, the PyTorch profiler outputs this warning and the trace becomes unusable. I have even tried adding the following profiler config...
Is it possible to use a streaming dataset as a distributed key-value store? I have a set of keys (strings like "xyz_123"), each of which corresponds to a NumPy array ideally...