
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Results 296 apex issues

This PR aims to move the attributes of `DistributedFusedAdam` to the correct device for the v1 state dict. After loading a v1 state dict, tensors in `DistributedFusedAdam.["buckets"]` will be on the CPU device....

https://github.com/NVIDIA/apex/blob/6a40a0ad9ff3d6ebea715cf28faf39792312acbf/apex/transformer/utils.py#L10 When I use FusedAdam with torch 1.8, there is no `all_gather_into_tensor` or `_all_gather_base` in `dir(torch.distributed)`.
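The missing-attribute situation in the report above can be probed before calling into the collective. The sketch below is not apex's code. It only shows one way to check which of the named functions a `torch.distributed`-like module exposes. The helper name `pick_all_gather_name` and the stand-in modules are hypothetical, and the list-based `all_gather` has a different signature from the other two, so it is not a drop-in replacement.

```python
from types import SimpleNamespace

# Candidate attribute names, newest first. The first two are the ones named
# in the report; `all_gather` is the list-based collective, which takes
# different arguments and would need an adapter.
CANDIDATES = ("all_gather_into_tensor", "_all_gather_base", "all_gather")


def pick_all_gather_name(dist_module):
    """Return the first candidate attribute found on dist_module, else None."""
    for name in CANDIDATES:
        if hasattr(dist_module, name):
            return name
    return None


# Hypothetical stand-ins for torch.distributed, for illustration only.
# In real use you would pass the imported torch.distributed module instead.
def _noop(*args, **kwargs):
    return None


old_dist = SimpleNamespace(all_gather=_noop)
mid_dist = SimpleNamespace(all_gather=_noop, _all_gather_base=_noop)
new_dist = SimpleNamespace(
    all_gather=_noop, _all_gather_base=_noop, all_gather_into_tensor=_noop
)

for label, module in (("old", old_dist), ("mid", mid_dist), ("new", new_dist)):
    print(label, pick_all_gather_name(module))
```

Passing the real module, as in `pick_all_gather_name(torch.distributed)`, would show which of these names the installed torch version provides.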

``` Running command python setup.py egg_info Traceback (most recent call last): File "", line 2, in File "", line 34, in File "/home/glm/apex/setup.py", line 4, in from packaging.version import parse,...

bug

**Describe the Bug** **Minimal Steps/Code to Reproduce the Bug** Running the script: "python setup.py install --cpp_ext --cuda_ext" The reported log: "torch.__version__ = 2.1.2+cu121 Compiling cuda extensions with nvcc: NVIDIA (R) Cuda...

bug

**Describe the Bug** Running `pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./` fails with `ModuleNotFoundError: No module named 'packaging'`, even though I have packaging installed. The detailed error looks like this: Traceback...

bug

While I don't think this is a bug in apex code, I think **technically it's a deficiency in the documentation**. Having counted the number of people in the Issues tab/stackoverflow/etc...

bug

Hi, I have some questions about the ASP module: the documentation and the related paper on N:M sparsity say that the matrices are compressed and the metadata are 2-bit. But I...

**Describe the Bug** When installing on HGX-H100 nodes, pip install does not build the CUDA extensions such as amp_C. **Minimal Steps/Code to Reproduce the Bug** `pip install -v --disable-pip-version-check --no-cache-dir...

bug

**Describe the Bug** #1669 adds a `pyproject.toml` file, but the build dependencies are underspecified. The [`setup.py` file depends on `packaging`](https://github.com/NVIDIA/apex/blob/2d8302a6c12e202f7b40b13a43daa95f326fd0ea/setup.py#L4) but this dependency isn't declared in the [build dependencies](https://github.com/NVIDIA/apex/blob/2d8302a6c12e202f7b40b13a43daa95f326fd0ea/pyproject.toml#L2-L5). **Minimal...

bug
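The undeclared build dependency described above is a plausible explanation for the `No module named 'packaging'` reports: when a `pyproject.toml` is present, pip builds in an isolated environment that contains only the declared build requirements, so a package installed in the user's own environment may not be visible to `setup.py`. The sketch below is a generic check rather than apex's code. It reports which modules are not importable from the interpreter that runs it, and the helper names are hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(required):
    """Return the names in `required` that cannot be imported here."""
    return [name for name in required if not is_importable(name)]


# `packaging` is the module named in the reports; `json` is a standard-library
# module included only so the check shows a module that is found.
print(missing_modules(["json", "packaging"]))
```

Run at the top of a build script, a check like this turns a bare `ModuleNotFoundError` into an explicit list of what the build environment lacks.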

torch.__version__ = 2.1.0+cu121 Compiling cuda extensions with nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2023 NVIDIA Corporation Built on Fri_Sep__8_19:17:24_PDT_2023 Cuda compilation tools, release 12.3, V12.3.52 Build cuda_12.3.r12.3/compiler.33281558_0 from...

bug