DeepSpeedExamples
Example models using DeepSpeed
Bump joblib from 0.16.0 to 1.2.0 in /MoQ/huggingface-transformers/examples/research_projects/lxmert
Bumps [joblib](https://github.com/joblib/joblib) from 0.16.0 to 1.2.0. Changelog, sourced from joblib's changelog: Release 1.2.0 fixes a security issue where `eval(pre_dispatch)` could potentially run arbitrary code; now only basic numerics are supported....
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.8 to 1.26.5. Release notes, sourced from urllib3's releases: 1.26.5 :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2 (read more in the v2.0 Roadmap). Fixed...
Bump numpy from 1.19.2 to 1.22.0 in /MoQ/huggingface-transformers/examples/research_projects/lxmert
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.22.0. Release notes, sourced from numpy's releases: NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
Bumps [notebook](http://jupyter.org) from 6.1.5 to 6.4.12.
Bumps [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning) from 1.0.4 to 1.6.0. Release notes, sourced from pytorch-lightning's releases: PyTorch Lightning 1.6: Support Intel's Habana Accelerator, New efficient DDP strategy (Bagua), Manual Fault-tolerance, Stability and Reliability. The...
Hello, I have successfully run through the three stages, but I had to make some cuts to the batch size / LoRA training. I don't have a good baseline on...
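One common way to keep results comparable after shrinking the per-GPU batch is to raise gradient accumulation so the effective batch size stays the same. A minimal sketch, assuming the standard DeepSpeed config keys; all numbers are placeholders, not the settings used above:

```python
# Sketch: keep the effective batch size constant while shrinking the per-GPU batch.
# All values below are placeholders for illustration.
world_size = 8                      # number of GPUs
micro_batch_per_gpu = 2             # reduced to fit memory
gradient_accumulation_steps = 8     # raised to compensate

ds_config = {
    "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    # Effective (global) batch size seen by the optimizer:
    "train_batch_size": micro_batch_per_gpu * gradient_accumulation_steps * world_size,
}

print(ds_config["train_batch_size"])  # 128 in this example
```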
A recent branch of peft is about to support multiple LoRA adapters. This implementation seems very well suited to training in the PPO stage. An SFT model can be used as...
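A rough sketch of what multiple LoRA adapters on one frozen base model can look like with peft's multi-adapter API; the model name, adapter names, and LoRA hyperparameters below are assumptions for illustration, not the branch referred to above:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical base model, chosen only for illustration.
base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)

# First adapter, e.g. the one trained during SFT.
model = get_peft_model(base, lora_cfg, adapter_name="sft")

# Second adapter sharing the same frozen base, e.g. for the PPO actor updates.
model.add_adapter("ppo", lora_cfg)

# Switch which adapter is active before a forward pass.
model.set_adapter("ppo")
```

Because both adapters share a single copy of the base weights, the SFT behaviour and the actor being updated can live in one model instance, which is presumably what makes this attractive for the PPO stage.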
I found that the memory usage is very high even when using ZeRO-3 and LoRA, so I was wondering whether pipeline parallelism or tensor parallelism can be supported?
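Short of pipeline or tensor parallelism, memory can often be reduced further by offloading ZeRO-3 parameter and optimizer states to CPU. A minimal sketch of the relevant DeepSpeed config section, with placeholder values:

```python
# Illustrative ZeRO-3 offload settings; batch size and threshold are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        # Parameters smaller than this stay resident on each GPU
        # to avoid gathering many tiny tensors.
        "stage3_param_persistence_threshold": 10000,
    },
}
```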
When I run the step1_supervised_finetuning script, I find that the memory usage of ZeRO-3 is higher than that of ZeRO-1, which seems unreasonable. Is there some other optimization at play here?
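One way to sanity-check the expected per-GPU footprint of model states under ZeRO-3 is DeepSpeed's built-in estimator. A minimal sketch, where the model name is only an example and the GPU counts are placeholders:

```python
from transformers import AutoModelForCausalLM
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live

# Hypothetical model, used only to make the estimate concrete.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

# Prints the estimated per-GPU and per-CPU memory needed for parameters,
# gradients, and optimizer states under ZeRO-3, with and without offload.
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=8, num_nodes=1)
```

The estimate covers model states only; activations, temporarily gathered parameters, and allocator fragmentation are not included, which can account for gaps between the estimate and what is actually observed.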