Results 2 issues of liuhatry

# Machine
NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.2

# Software
torch 2.1.1
transformer-engine 1.9.0.dev0+56e0b35

# Run Cmd:
deepspeed --hostfile hostfile --master_addr ${MASTER_IP} pretrain_gpt.py --deepspeed-activation-checkpointing --deepspeed_config=ds_config_gpt_test.json --deepspeed --tensor-model-parallel-size 4...