DeepSpeed
MoQ problem: 'str' object has no attribute 'size'
I got an error when using MoQ in a ROCm environment on an AMD GPU.
Error:
```
Traceback (most recent call last):
  File "train_deepspeed.py", line 694, in
```

My JSON file is as follows:

```
{
  "train_batch_size": 128,
  "train_micro_batch_size_per_gpu": 4,
  "steps_per_print": 2000,
  "zero_optimization": {
    "stage": 0
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 0.0032,
      "warmup_num_steps": 91
    }
  },
  "optimizer": {
    "type": "Lamb",
    "params": {
      "lr": 0.0032,
      "betas": [0.8, 0.999],
      "eps": 1e-8,
      "weight_decay": 3e-7
    }
  },
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": false
  },
  "quantize_training": {
    "enabled": true,
    "quantize_verbose": true,
    "quantizer_kernel": true,
    "quantize-algo": {
      "q_type": "symmetric"
    },
    "quantize_bits": {
      "start_bits": 16,
      "target_bits": 8
    },
    "quantize_schedule": {
      "quantize_period": 22,
      "schedule_offset": 182
    },
    "quantize_groups": 8
  }
}
```
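For anyone reproducing this, a quick sanity check of the config can rule out unrelated mistakes before digging into MoQ itself. The sketch below is only an illustration and not part of DeepSpeed: the helper names `implied_accum_times_gpus` and `quantize_bits` are made up here, and the embedded config is a trimmed copy of the one above. It relies on the documented relation `train_batch_size = train_micro_batch_size_per_gpu * gradient_accumulation_steps * number of GPUs`.

```python
import json

# Trimmed copy of the config from this issue (only the keys inspected below).
CONFIG_TEXT = """
{
  "train_batch_size": 128,
  "train_micro_batch_size_per_gpu": 4,
  "quantize_training": {
    "enabled": true,
    "quantize_bits": {"start_bits": 16, "target_bits": 8},
    "quantize_schedule": {"quantize_period": 22, "schedule_offset": 182},
    "quantize_groups": 8
  }
}
"""


def implied_accum_times_gpus(cfg):
    """Return gradient_accumulation_steps * number of GPUs implied by the batch sizes.

    Hypothetical helper: raises ValueError if the micro batch size does not
    divide the global batch size evenly.
    """
    total = cfg["train_batch_size"]
    micro = cfg["train_micro_batch_size_per_gpu"]
    if total % micro != 0:
        raise ValueError(
            f"train_batch_size={total} is not a multiple of "
            f"train_micro_batch_size_per_gpu={micro}"
        )
    return total // micro


def quantize_bits(cfg):
    """Return (start_bits, target_bits) from the quantize_training section."""
    bits = cfg["quantize_training"]["quantize_bits"]
    return bits["start_bits"], bits["target_bits"]


config = json.loads(CONFIG_TEXT)
accum_times_gpus = implied_accum_times_gpus(config)
start_bits, target_bits = quantize_bits(config)

if __name__ == "__main__":
    print("gradient_accumulation_steps * GPUs =", accum_times_gpus)
    print("quantization bits:", start_bits, "->", target_bits)
```

With the values above, the product of gradient accumulation steps and GPU count has to be 32, so the launch setup should match that.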
Hi,
We recently refactored the MoQ part (DeepSpeed version >=0.7.0). Please try the newest version and let us know if that works. Here is the new tutorial link: https://www.deepspeed.ai/tutorials/model-compression/
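To confirm which side of that refactor an environment is on, something like the following may help. It is a minimal sketch using only the standard library: the helpers `parse_version`, `has_refactored_moq`, and `installed_deepspeed_version` are hypothetical names for illustration, and the version parsing is deliberately simple (it reads the first three numeric components and ignores local suffixes such as `+abc`).

```python
from importlib import metadata

# Minimum DeepSpeed version mentioned in the reply above.
MIN_VERSION = (0, 7, 0)


def parse_version(text):
    """Turn a string like '0.7.1+abc' into a 3-tuple of ints, e.g. (0, 7, 1)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at the first non-digit, e.g. 'rc1' suffixes
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '0.7'
    return tuple(parts)


def has_refactored_moq(version_text):
    """True if the version is at least 0.7.0."""
    return parse_version(version_text) >= MIN_VERSION


def installed_deepspeed_version():
    """Return the installed DeepSpeed version string, or None if not installed."""
    try:
        return metadata.version("deepspeed")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_deepspeed_version()
    if found is None:
        print("DeepSpeed is not installed in this environment.")
    elif has_refactored_moq(found):
        print(f"DeepSpeed {found}: at or above 0.7.0.")
    else:
        print(f"DeepSpeed {found}: older than 0.7.0, consider upgrading.")
```

If the reported version is older than 0.7.0, upgrading (for example with `pip install --upgrade deepspeed`) and following the linked tutorial is the suggested next step.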
AMD support has been updated as well. Please try a recent DeepSpeed release and re-open the issue if the problem persists.