
binascii.Error: Incorrect padding

Open zhihao-chen opened this issue 2 years ago • 4 comments

Description

Error:

Traceback (most recent call last):
  File "/root/work2/work2/chenzhihao/kefu_dialogue/examples/finetune_chatyuan_by_flagai.py", line 360, in <module>
    main()
  File "/root/work2/work2/chenzhihao/kefu_dialogue/examples/finetune_chatyuan_by_flagai.py", line 352, in main
    trainer.train(model,
  File "/root/anaconda3/envs/flagai/lib/python3.9/site-packages/flagai/trainer.py", line 499, in train
    model, optimizer, _, lr_scheduler = deepspeed.initialize(
  File "/root/anaconda3/envs/flagai/lib/python3.9/site-packages/deepspeed/__init__.py", line 125, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/root/anaconda3/envs/flagai/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 272, in __init__
    self._configure_with_arguments(args, mpu)
  File "/root/anaconda3/envs/flagai/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1010, in _configure_with_arguments
    self._config = DeepSpeedConfig(self.config, mpu)
  File "/root/anaconda3/envs/flagai/lib/python3.9/site-packages/deepspeed/runtime/config.py", line 722, in __init__
    config_decoded = base64.urlsafe_b64decode(config).decode('utf-8')
  File "/root/anaconda3/envs/flagai/lib/python3.9/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/root/anaconda3/envs/flagai/lib/python3.9/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

deepspeed config:

{
  "train_micro_batch_size_per_gpu": 8,
  "gradient_accumulation_steps": 32,
  "steps_per_print": 500,
  "gradient_clipping": 1.0,
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": false,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e7,
    "allgather_bucket_size": 5e7,
    "cpu_offload": true
  },
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": true,
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 0.0004,
      "weight_decay": 0.01,
      "betas": [0.9, 0.98],
      "eps": 1e-6
    }
  },
  "activation_checkpointing": {
    "partition_activations": false,
    "contiguous_memory_optimization": false
  },
  "wall_clock_breakdown": false
}

Alternatives

No response

zhihao-chen avatar Apr 08 '23 08:04 zhihao-chen

Which optimizer are you using?

BAAI-OpenPlatform avatar Apr 12 '23 08:04 BAAI-OpenPlatform

Adam, as specified in deepspeed_config.json: "optimizer": { "type": "Adam", "params": { "lr": 0.0004, "weight_decay": 0.01, "betas": [ 0.9, 0.98 ], "eps": 1e-6 } }

zhihao-chen avatar Apr 12 '23 08:04 zhihao-chen

Which version of deepspeed are you using? Judging from the traceback, deepspeed fails while decoding the config file. https://blog.csdn.net/baidu_19473529/article/details/104529756
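One possible cause, offered as a guess and not as a confirmed diagnosis: the traceback shows the config argument reaching `base64.urlsafe_b64decode`, which would happen if the value passed in is a string that is neither a dict nor a path to an existing file. The sketch below imitates that loading order with a hypothetical helper, `decode_config_arg`; it is not DeepSpeed's actual code. The second helper, `check_config_path` (also hypothetical), checks the path before training starts. The file name `./missing_ds_config.json` is made up for illustration.

```python
import base64
import binascii
import json
import os


def decode_config_arg(config):
    """Hypothetical stand-in for the loading order suggested by the traceback."""
    if isinstance(config, dict):
        return config
    if os.path.exists(config):
        with open(config, "r", encoding="utf-8") as f:
            return json.load(f)
    # Neither a dict nor an existing file: treat the string as base64 text.
    # A plain file path is not valid base64, so this step can raise binascii.Error.
    return json.loads(base64.urlsafe_b64decode(config).decode("utf-8"))


def check_config_path(path):
    """Fail early with a readable message instead of a base64 decoding error."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"DeepSpeed config not found: {os.path.abspath(path)}"
        )
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)  # also confirms that the file is valid JSON


if __name__ == "__main__":
    try:
        decode_config_arg("./missing_ds_config.json")
    except binascii.Error as exc:
        print("base64 fallback failed:", exc)
```

If this is what happens here, passing an absolute path to the config file, or calling something like `check_config_path` before `trainer.train`, would turn the padding error into a clear "file not found" message.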

920232796 avatar Apr 13 '23 02:04 920232796

deepspeed=0.8.3 cuda=10.2 pytorch=1.12.1

zhihao-chen avatar Apr 13 '23 05:04 zhihao-chen