
fix opt-350m shard loading issue in AutoTP

Open sywangyi opened this issue 2 years ago • 2 comments

sywangyi avatar May 24 '23 13:05 sywangyi

@delock @tjruwase please help review this PR.

sywangyi avatar May 24 '23 13:05 sywangyi

@tjruwase @jeffra could you assign a reviewer for this PR? It fixes sharded checkpoint loading for OPT with AutoTP and improves the usability of OPT with AutoTP. It is needed when running OPT models on a CPU server with limited memory.

delock avatar Jun 15 '23 01:06 delock

@RezaYazdaniAminabadi can you review this PR? It fixes OPT sharded checkpoint loading for AutoTP. Previously only OPT-125m had sharded checkpoint loading; with this fix, OPT >350m will have sharded checkpoint loading as well.

delock avatar Jul 06 '23 10:07 delock

@RezaYazdaniAminabadi Hi, a quick check on whether this PR is still under consideration. We have verified it on the CPU accelerator and would like to know whether it can be merged into the master branch. Thanks!

delock avatar Jul 18 '23 06:07 delock