torchdistx
Torch Distributed Experimental
**What does this PR do? Please describe:** Following the changes in https://github.com/pytorch/pytorch/pull/87855, we want to update references to `torch::TypeError` to `TORCH_CHECK_TYPE`. The changes have already been made to fbcode in [internal...
I am not familiar with builds, but it seems that I cannot install `torchdistx` for any PyTorch version past 1.13 (e.g., if I am developing on top of the current `master`)....
Hi, I just took a quick look at the fake tensor/module APIs. The deferred initialization feature looks really cool to me. I am wondering, is there a way to de-materialize the...
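For reference, a minimal sketch of the deferred-init round trip as described in the torchdistx docs; the question above asks about the inverse step (turning a materialized module back into a fake one), which the public API sketched here does not cover:

```python
import torch
from torchdistx.deferred_init import deferred_init, materialize_module

# Construct the module without allocating real storage: parameters are
# backed by fake tensors that record only metadata and the init ops.
model = deferred_init(torch.nn.Linear, 10, 10)

# materialize_module() allocates real storage and replays the recorded
# initialization, turning the fake parameters into regular tensors.
materialize_module(model)
```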
**Describe the bug:** In `exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)`, the dtypes of `exp_avg_sq` and `grad` differ, which fails for in-place operations. **Describe how to reproduce:** ``` # uses default hyperparameters such as momentum=float32...
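For context, a standalone sketch of the mismatch the report describes; the bfloat16/float32 pairing below is an assumption (a reduced-precision variance state against a full-precision gradient), not the reporter's exact configuration:

```python
import torch

beta2 = 0.999

# Assumed dtypes: reduced-precision optimizer state, full-precision grad.
exp_avg_sq = torch.zeros(4, dtype=torch.bfloat16)
grad = torch.randn(4, dtype=torch.float32)

# In-place ops cannot downcast the promoted result (float32) back into
# the bfloat16 state, so this line raises:
#   RuntimeError: result type Float can't be cast to the desired output
#   type BFloat16
exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
```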
Import sub-packages in `__init__.py` so that they become attributes of the package object. Fixes #66 **Check list:** - [ ] Was this **discussed and approved** via a GitHub issue? (not...
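A minimal sketch of the pattern this PR describes; the exact sub-package list is an assumption based on the modules mentioned elsewhere on this page:

```python
# torchdistx/__init__.py (sketch)
# Importing the sub-packages here binds them as attributes of the
# package object, so `torchdistx.optimizers` resolves after a bare
# `import torchdistx`.
from torchdistx import deferred_init, fake, optimizers  # noqa: F401
```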
**What does this PR do? Please describe:** Adds an automatic check for BFloat16 support to the AnyPrecision optimizer (`self.verify_bfloat_support()`). This happens at optimizer init if any of the relevant states are...
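A sketch of what such an init-time guard might look like; the body below is an assumption rather than the PR's actual code, though `torch.cuda.is_bf16_supported()` is the standard capability query:

```python
import torch

def verify_bfloat_support() -> None:
    # Fail fast at optimizer construction rather than deep inside step()
    # if the device cannot execute BFloat16 kernels.
    if not torch.cuda.is_available():
        raise RuntimeError("BFloat16 optimizer states requested, but CUDA is unavailable")
    if not torch.cuda.is_bf16_supported():
        raise RuntimeError("BFloat16 is not supported on this GPU")
```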
torchdistx sub-packages are not visible while trying to access them:
```
>>> import torchdistx
>>> torchdistx.optimizers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'torchdistx' has...
```
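Until a fix like the `__init__.py` change above lands, importing the sub-package explicitly binds it onto the package object, which works on any version:

```python
import torchdistx.optimizers  # explicit import makes the attribute visible

# or, equivalently:
from torchdistx import optimizers
```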
Enhancement (credit to @rohan-varma): "this can be done in a follow-up PR, but let's maybe consider not defaulting things to torch.bfloat16 eventually. this is because it might be good...
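One way to avoid the hard-coded default the comment objects to is to accept `None` and fall back to each parameter's own dtype; the constructor below is a hypothetical sketch, not the optimizer's actual signature:

```python
import torch

class AnyPrecisionSketch:
    def __init__(self, params, lr=1e-3, momentum_dtype=None):
        # momentum_dtype=None (hypothetical) means "match the parameter's
        # dtype" instead of silently defaulting to torch.bfloat16.
        self.params = list(params)
        self.lr = lr
        self.momentum_dtype = momentum_dtype

    def _state_dtype(self, p: torch.Tensor) -> torch.dtype:
        return self.momentum_dtype if self.momentum_dtype is not None else p.dtype
```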
Problem: if the user runs the AnyPrecision optimizer with Kahan summation and checkpoints the model/optimizer, restarting training may begin with an empty compensation buffer. This is not a blocking problem, but...
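For background, a textbook sketch of a Kahan-compensated parameter update; the compensation buffer accumulates the low-order bits lost to rounding at each step, which is why resuming from a checkpoint with that buffer zeroed silently drops the correction (names here are illustrative, not the optimizer's actual state keys):

```python
import torch

@torch.no_grad()
def kahan_update(param: torch.Tensor, update: torch.Tensor,
                 compensation: torch.Tensor) -> None:
    # Classic Kahan summation: fold the previously lost low-order bits
    # back in, apply the update, then record what was lost this step.
    corrected = update - compensation
    new_param = param + corrected
    compensation.copy_((new_param - param) - corrected)
    param.copy_(new_param)

# For a correct restart, `compensation` must be saved and restored with
# the optimizer state_dict alongside the other per-parameter states.
```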