Bruno Magalhaes
@conglongli a question related to LR scheduling: the [LR scheduler documentation](https://deepspeed.readthedocs.io/en/latest/schedulers.html) says: > if the schedule is supposed to execute at every training step, then the user can pass the...
@conglongli @mrwyattii I added some information to this PR in line with the [new contributions page](https://github.com/microsoft/DeepSpeed/blob/master/CONTRIBUTING.md#new-feature-contribution-guidelines) you sent. The logic for this PR is done, and the example works in...
> great work and waiting for this. Thank you @npuichigo. For the time being, you can use this as in the example (first initialize deepspeed to get the deepspeed...
@colynhn @trianxy @stan-kirdey if not too late: I am building dynamic batch sizes (and corresponding LR scaling) on deepspeed in [PR 5237](https://github.com/microsoft/DeepSpeed/pull/5237), as part of the data analysis module. Stay...
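For readers unfamiliar with the idea, LR scaling for dynamic batch sizes can be sketched as below. This is only a minimal illustration, not the code from the PR: the helper name `scale_lr`, its parameters, and the choice of the two common heuristics (linear and square-root scaling) are assumptions made for the example.

```python
import math


def scale_lr(base_lr: float, base_batch_size: int, batch_size: int,
             method: str = "linear") -> float:
    """Scale a reference learning rate to a different batch size.

    Hypothetical helper for illustration; not part of the DeepSpeed API.
    """
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = batch_size / base_batch_size
    if method == "linear":
        # Linear scaling: LR grows proportionally with the batch size.
        return base_lr * ratio
    if method == "sqrt":
        # Square-root scaling: LR grows with the square root of the ratio.
        return base_lr * math.sqrt(ratio)
    raise ValueError(f"unknown scaling method: {method}")


# Example: batches whose size varies from step to step,
# with a reference LR of 1e-3 tuned for a batch size of 32.
batch_sizes = [16, 32, 64]
lrs = [scale_lr(1e-3, 32, b) for b in batch_sizes]
```

Here each step's learning rate is derived from the size of the batch actually used at that step, so larger batches take proportionally larger steps.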
AFAIK, the Python version of TensorRT-LLM matches the one shipped by the default base image. So far (`docker/Makefile`): ``` BASE_TAG = 12.4.1-devel-ubuntu22.04 ``` And Ubuntu 22.04 ships with python...