Xin Yao
Issue #10989 Add support for any case where `len(axis) == weights.ndim` for `np.average` Also, support iterable type for axis by calling [normalize_axis_tuple](https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py#L1468) I suggest all functions that accept tuple of...
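The axis handling this excerpt asks for (accept an int or any iterable, resolve negative indices, reject repeats) can be sketched in pure Python. This is only an illustration of the idea, not NumPy's `normalize_axis_tuple`: the name `normalize_axes` is hypothetical, and raising `ValueError` for bad input is an assumption made here.

```python
import operator


def normalize_axes(axis, ndim):
    """Return `axis` as a tuple of non-negative ints valid for `ndim` dimensions.

    Hypothetical helper for illustration; not part of NumPy.
    """
    try:
        # A single integer-like axis becomes a one-element list.
        axes = [operator.index(axis)]
    except TypeError:
        # Otherwise treat `axis` as an iterable of integer-like values.
        axes = [operator.index(a) for a in axis]

    normalized = []
    for a in axes:
        if not -ndim <= a < ndim:
            raise ValueError(f"axis {a} is out of bounds for {ndim} dimensions")
        # Map negative indices onto their non-negative equivalents.
        normalized.append(a % ndim)

    if len(set(normalized)) != len(normalized):
        raise ValueError("repeated axis")
    return tuple(normalized)
```

With this sketch, `normalize_axes(-1, 3)` and `normalize_axes([0, -1], 3)` both return tuples, so a caller can compare `len(axes)` against `weights.ndim` without caring how `axis` was passed.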
I spent some time profiling the GAT example with AMP in https://docs.dgl.ai/en/0.9.x/guide/mixed_precision.html and want to know why we didn't obtain performance gain from FP16. I observed regression in both the...
## Description Add bfloat16 support for CUDA >= 11.0. 1. Add bf16 specializations for supported functions. 2. Change the float type dispatcher from bits to real data types. 3. Make...
A serious bug in [cluster.py](https://github.com/jasonwbw/DensityPeakCluster/blob/master/cluster.py#L193-L200) from line 193 to line 200. First, when assigning points to the cluster they belong to, we should start from the points with higher local...
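The ordering concern in this excerpt can be illustrated with a small sketch. It assumes the standard density-peak assignment rule, in which each non-center point takes the label of its nearest neighbor of higher density, so points have to be visited in order of decreasing density. The function name, its arguments, and the sample data are hypothetical and are not taken from `cluster.py`.

```python
def assign_clusters(density, nearest_higher, centers):
    """Label every point by following its nearest higher-density neighbor.

    density:        list of local densities, one per point
    nearest_higher: index of each point's nearest neighbor with higher
                    density, or -1 for the global density peak
    centers:        dict mapping a center's index to its cluster id
    """
    labels = [-1] * len(density)
    for idx, cluster_id in centers.items():
        labels[idx] = cluster_id

    # Visit points from highest to lowest density so that the neighbor a
    # point depends on has always been labeled before the point itself.
    order = sorted(range(len(density)), key=lambda i: density[i], reverse=True)
    for i in order:
        if labels[i] != -1:
            continue  # cluster centers are already labeled
        parent = nearest_higher[i]
        if parent == -1:
            raise ValueError("the global density peak must be a cluster center")
        labels[i] = labels[parent]
    return labels


density = [5.0, 4.0, 3.0, 4.5, 1.0]
nearest_higher = [-1, 0, 3, 0, 2]
labels = assign_clusters(density, nearest_higher, centers={0: 0, 3: 1})
```

If the same data were visited in index order or in increasing density, some points would be reached before their higher-density neighbor had a label and would copy the placeholder `-1`.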
v2ray-plugin (v1.3.1) exits unexpectedly on macOS Big Sur Beta 10. ``` 2020-10-28 18:14:31 INFO: plugin "v2ray-plugin" enabled 2020-10-28 18:14:31 INFO: using tcp fast open 2020-10-28 18:14:31 INFO: initializing ciphers... aes-256-cfb...
# Description Add a FP8AllToAll layer, which conducts `cast_to_fp8` -> `all_to_all in fp8` -> `cast_from_fp8`. We're getting about 5% end to end performance gain in Mixtral 8x7B and 8x22B training...
# Description Grouped GEMM for fp32/bf16/fp16 via multi-stream cuBLAS. This is for MoE training. I'll add FP8 support and a `GroupedLinear` layer in future PRs. ## Type of change -...
# Description 1. Commit 806448592bb2d6ab867154665d1613f8f88f664d adds `isort` and moves the black and isort configs to `pyproject.toml`. 2. Commit 94385e2c573ea1b7fb4daf1e2be1bb8c8174ddbe adds cancellation on concurrency to CI jobs to save CI resources. ##...
# Description Fix `autocast` deprecation warnings. Starting from PyTorch 2.4, use ```python torch.get_autocast_dtype("cuda") torch.amp.autocast("cuda") ``` instead of device-specific APIs. Closes https://github.com/NVIDIA/TransformerEngine/pull/1167. ## Type of change - [ ] Documentation change...