
Question about Finetune exp

Open doubleHon opened this issue 10 months ago • 4 comments

File "/transformers/tokenization_utils_base.py", line 708, in as_tensor return torch.tensor(value) ValueError: too many dimensions 'str' ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (audio_id in this case) have excessive nesting (inputs type list where type int is expected).

Isn't the JSON file supposed to be read and converted into the corresponding tensors? When I debug, I find that it has not been converted. The error is raised when running ./finetune_toy.sh, and I don't know how to solve it.
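For what it's worth, a minimal standalone sketch (not taken from the LTU repo) that reproduces this exact error: torch.tensor() cannot convert a list of strings, so a string field such as audio_id leaking into the batch triggers "too many dimensions 'str'". The field names and values below are made up for illustration.

import torch

batch = {
    "input_ids": [[101, 2054, 2003], [101, 2129, 2024]],  # nested ints: fine
    "audio_id": ["clip_0001.wav", "clip_0002.wav"],        # strings: fails
}

for key, value in batch.items():
    try:
        tensor = torch.tensor(value)
        print(key, "->", tuple(tensor.shape))
    except (ValueError, TypeError) as err:
        print(key, "-> cannot be tensorized:", err)

So the message about padding/truncation can be a red herring; the underlying problem is often a non-numeric column being passed to the tensor conversion.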

doubleHon avatar Apr 07 '24 09:04 doubleHon

The error is reported when running ./finetune_toy.sh.

Did you change anything in our script (including the data downloading part)?

YuanGongND avatar Apr 07 '24 21:04 YuanGongND

I think I see where the problem is. Thanks for the reminder! Is there any specific difference between "finetune_toy.sh" and "finetune_toy_low_resource.sh"? I can now run "finetune_toy_low_resource", but "finetune_toy" always runs out of memory. We have two 40 GB A100s and four 24 GB 3090s.

doubleHon avatar Apr 08 '24 08:04 doubleHon

low_resource splits the model and places the parts on different GPUs (model parallelism); it helps if you have multiple small GPUs on the same machine, but it is slower than the other script. The toy script places an entire copy of the model on each GPU.
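A minimal sketch of that idea (not the LTU training code; assumes a machine with two CUDA devices): different parts of one model live on different GPUs, and activations are copied between them during the forward pass, which is where the slowdown comes from.

import torch
import torch.nn as nn

class SplitModel(nn.Module):
    def __init__(self):
        super().__init__()
        # first half of the network on GPU 0, second half on GPU 1
        self.part1 = nn.Linear(1024, 1024).to("cuda:0")
        self.part2 = nn.Linear(1024, 10).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # activations move between GPUs here, which is why this is slower
        return self.part2(x.to("cuda:1"))

model = SplitModel()
out = model(torch.randn(8, 1024))
print(out.device)  # cuda:1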

We mention the training device requirement in the readme file. I believe 40 GB and 24 GB GPUs need the low_resource script with our recipe, but it may be possible to run the original one with techniques like 8-bit training.
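One hedged example of what "8-bit training" can mean in practice (not part of the LTU recipe, and the model here is just a stand-in): bitsandbytes' 8-bit Adam keeps optimizer state in 8 bits, which cuts a large chunk of the per-GPU memory. It assumes bitsandbytes is installed and a CUDA GPU is available.

import torch.nn as nn
import bitsandbytes as bnb

model = nn.Linear(1024, 1024).cuda()  # stand-in for the real model
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)
# the training loop then calls optimizer.step() exactly as with a normal torch optimizer

Whether this alone is enough to fit the full recipe on 24 GB cards would need to be verified; it is one option, not a guarantee.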

YuanGongND avatar Apr 09 '24 23:04 YuanGongND

Thanks for your reply and good luck with your work!

doubleHon avatar Apr 10 '24 13:04 doubleHon