Stas Bekman
This PR isn't backward compatible. It breaks with pytorch-1.8:

```
E File "/mnt/nvme0/code/huggingface/transformers-master/src/transformers/models/gptj/modeling_gptj.py", line 63, in
E @torch.fx.wrap
E AttributeError: module 'torch' has no attribute 'fx'
```

not sure if...
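For context, `torch.fx` is a submodule, and on some PyTorch versions a bare `import torch` doesn't expose it, so the decorator lookup fails at import time. A minimal sketch of the pattern involved and of the explicit import that the later comments converge on (not the actual `modeling_gptj.py` code):

```python
import torch
import torch.fx  # without this explicit submodule import, `torch.fx` may not be
                 # available on some PyTorch versions and the decorator below raises
                 # AttributeError: module 'torch' has no attribute 'fx'


@torch.fx.wrap  # marks the function as a leaf for torch.fx symbolic tracing
def create_sinusoidal_positions(num_pos, dim):
    # placeholder body -- the real function builds sinusoidal position embeddings
    ...
```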
ok, the deepspeed CI is running pt-1.8 - how do we solve that then?
oh, ok, I guess everything is fine then. thank you for the heads up, @ydshieh
it still fails with pt-1.9.1

1. you need `import torch.fx` (thanks @mrwyattii)
2. it then fails with:

```
E File "/mnt/nvme0/code/huggingface/transformers-master/src/transformers/models/gptj/modeling_gptj.py", line 61, in create_sinusoidal_positions
E return torch.concat((torch.sin(sinusoid_inp), torch.cos(sinusoid_inp)), dim=1)
```
...
and it fails w/o `import torch.fx`:

```
E File "/mnt/nvme0/code/huggingface/transformers-master/examples/pytorch/language-modeling/run_clm.py", line 412, in main
E model = AutoModelForCausalLM.from_pretrained(
E File "/mnt/nvme0/code/huggingface/transformers-master/src/transformers/models/auto/auto_factory.py", line 470, in from_pretrained
E model_class = _get_model_class(config, cls._model_mapping)
```
...
I confirm that it works with `torch.cat`. Perhaps use `torch.concat` but add an alias:

```
# bc for pt
```
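A minimal sketch of what such a backward-compat alias could look like, assuming `torch.concat` is just a newer alias of `torch.cat` (my guess at the shape, not the actual patch):

```python
import torch

# torch.concat exists only on newer PyTorch releases as an alias of torch.cat,
# so on older versions patch the alias in before any code calls torch.concat.
if not hasattr(torch, "concat"):
    torch.concat = torch.cat  # bc for older pt
```

The simpler alternative is to call `torch.cat` directly everywhere, which is what worked in the test above.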
`import torch.fx` is a must - even with pt-1.10 it won't work w/o it.
@njhill, are you on top of fixing this? This is a bit urgent since the Deepspeed CI uses our bleeding edge to test deepspeed's bleeding edge on live CI, and currently...
It's not in the HF Trainer's arsenal of optimizers; if you'd like to make a PR to integrate it, that can be done.
oh, I wrongly assumed that they were saved. Yes, then it makes sense. There will be no miscalculation then, just a very minor loss of intermediary results. I think it's all...