Gait3D-Benchmark
lib/modeling/models/smplgait.py throws an error when training on a new dataset
Hi Jinkai,
When I try to apply SMPLGait to another dataset, smplgait.py throws the following error during training:
smpls = ipts[1][0]  # [n, s, d]
IndexError: list index out of range
It is also interesting that I used 4 GPUs in the training: 3 of them could detect the ipts[1][0] tensor with size 1, but the fourth one failed to do so. Could I know how I can solve this?
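For reference, this is roughly the check I used to see which rank fails (placed just before "smpls = ipts[1][0]" in SMPLGait.forward; the print and the RuntimeError are my own diagnostics, not repo code):

import torch.distributed as dist

# Log what each DDP rank actually receives, since one of the four
# ranks seems to get a shorter ipts list than the others.
rank = dist.get_rank() if dist.is_initialized() else 0
print(f"[rank {rank}] len(ipts) = {len(ipts)}, "
      f"lens = {[len(x) for x in ipts]}")
if len(ipts) < 2 or len(ipts[1]) == 0:
    raise RuntimeError(f"[rank {rank}] SMPL stream missing from ipts")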
Hi~ Because the framework is based on DDP mode, it is recommended that you use only 1 GPU for debugging. This will make it easier to examine the problem.
Could I know how to modify the code to run with 1 GPU?
Just change the values of CUDA_VISIBLE_DEVICES and --nproc_per_node, like this:
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 lib/main.py --cfgs ./config/smplgait_64pixel.yaml --phase train
Thank you! I have tried that, and the same error appeared. Do you have any guess as to why smpls could not retrieve the tensor information from ipts? I also keep getting this warning:
"/home/zhiyuann/Gait3D-Benchmark/lib/modeling/base_model.py:338: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray. for smpl in smpls_batch]"
Do you think that may contribute to the error?
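For what it's worth, I can reproduce the warning with plain NumPy, independent of the repo (the 85-element length is just an illustrative SMPL feature size):

import numpy as np

# Frames with matching shapes stack into a normal float array.
ok = np.asarray([np.zeros(85), np.zeros(85)])
print(ok.dtype)      # float64

# Frames with mismatched lengths cannot be stacked; without
# dtype=object this is exactly the call that emits
# VisibleDeprecationWarning (newer NumPy raises instead), and the
# result falls back to dtype=object.
ragged = np.asarray([np.zeros(85), np.zeros(84)], dtype=object)
print(ragged.dtype)  # object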
I recommend you start at the data source and go step by step to find out what is causing the missing SMPL data.
Hi Jinkai,
I retraced the error and found that it happens in base_model.py, when the smpls are pretreated with this code:
smpls = [np2var(np.asarray([fra for fra in smpl]), requires_grad=requires_grad).float() for smpl in smpls_batch]
It throws this error:
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
It still throws the error even if I change the dtype to float16, as the trainer_cfg indicates. Do you know what may be causing this?
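A minimal reproduction of that TypeError, showing it is independent of any float16 setting (torch.from_numpy here stands in for whatever np2var wraps, which is an assumption on my side):

import numpy as np
import torch

frames = np.asarray([np.zeros(85), np.zeros(84)], dtype=object)

# torch only accepts numeric/bool dtypes from NumPy, so an object
# array fails before any float16/float32 cast is ever applied.
try:
    torch.from_numpy(frames)
except TypeError as e:
    print(e)  # can't convert np.ndarray of type numpy.object_ ...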
The "enable_float16" in trainer_cfg aims to memory reduction and speed up.
Maybe you can try:
smpls = [np2var(np.asarray([fra for fra in smpl]).astype(float), requires_grad=requires_grad).float() for smpl in smpls_batch]
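One caveat worth noting (this is about NumPy behavior in general, not specific to the repo): the .astype(float) cast only succeeds when the object array still has a regular shape, e.g. numeric SMPL parameters that merely got stored with dtype=object. If the frames are truly ragged, the cast itself raises, and the data has to be fixed where the SMPL files are generated:

import numpy as np

# Regular shape, just stored as dtype=object: the suggested cast works.
uniform = np.zeros((4, 85)).astype(object)
print(uniform.astype(float).dtype)   # float64

# Truly ragged frames: the cast itself fails.
ragged = np.asarray([np.zeros(85), np.zeros(84)], dtype=object)
try:
    ragged.astype(float)
except (TypeError, ValueError) as e:
    print(type(e).__name__, e)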